Optimising Large Language Models for Vehicle Classification
Type
Master's Thesis
Abstract
This thesis examines the use of Large Language Models optimised for vehicle classification
within the insurance industry. Traditional methods at If P&C Insurance
suffer from inaccuracies and scalability issues due to unstructured text data from
manual inputs and relatively basic techniques. To address this, our study utilises
Parameter-Efficient Fine-Tuning and Low-Rank Adaptation on standardised, manually
labelled vehicle data. Our approach combined careful prompt engineering,
dataset preprocessing, hyperparameter optimisation via the Nondominated Sorting
Genetic Algorithm II, and thorough evaluation across different base models, including
various Llama model sizes, DeepSeek, Mistral, and Phi. Through this integrated
methodology, we achieved a final model accuracy of 96.8% on hold-out data, using
a fine-tuned Llama 3.1 model with 8 billion parameters. Despite the model’s relatively
modest size, targeted adaptation enabled it to outperform larger proprietary
models such as GPT-4o on the specific classification task. Implementation aspects,
including computational needs, cost efficiency, and human-in-the-loop strategies,
are also discussed. Our deployment framework emphasises selective human review
based on model confidence, enabling a sustainable balance between automation and
accuracy. Financial and infrastructure considerations showed that fine-tuned open-source
models could offer significant cost savings compared to API-based solutions.
In conclusion, this research presents a scalable, cost-effective, and high-performing
solution that enhances vehicle classification, leading to improved risk segmentation
and pricing precision in the insurance sector. The results demonstrate that fine-tuned
open-source LLMs, when carefully adapted, can rival and even surpass much
larger commercial models in domain-specific applications, offering a viable path for
modernising traditional insurance workflows.
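
The selective human review mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the `Prediction` type, the `route` function, the class labels, the sample inputs, and the 0.9 threshold are all hypothetical and chosen only to show the idea of routing low-confidence classifications to a person.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    text: str          # raw free-text vehicle description
    label: str         # class predicted by the model (hypothetical labels)
    confidence: float  # model probability for that class, in [0, 1]


def route(predictions, threshold=0.9):
    """Split predictions into an auto-accepted list and a human-review list.

    The threshold is an assumed value; in practice it would be tuned
    against the desired trade-off between automation and accuracy.
    """
    accepted, review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            review.append(p)
    return accepted, review


# Hypothetical example inputs
preds = [
    Prediction("volvo v70 2.4 kombi", "passenger_car", 0.98),
    Prediction("mb sprinter 316 skap", "van", 0.71),
]
accepted, review = route(preds)
```

Raising the threshold sends more cases to manual review and lowers the risk of automated errors; lowering it increases automation at the cost of accuracy.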
Subject/keywords
AI, LLM, PEFT, LoRA, machine learning, supervised learning, unsupervised learning, big data, classification, risk-based pricing.