Optimising Large Language Models for Vehicle Classification

Type

Master's Thesis

Abstract

This thesis examines the use of Large Language Models (LLMs) optimised for vehicle classification within the insurance industry. Traditional methods at If P&C Insurance suffer from inaccuracies and scalability issues due to unstructured text data from manual inputs and relatively basic techniques. To address this, our study applies Parameter-Efficient Fine-Tuning (PEFT) with Low-Rank Adaptation (LoRA) to standardised, manually labelled vehicle data. Our approach combines careful prompt engineering, dataset preprocessing, hyperparameter optimisation via the Nondominated Sorting Genetic Algorithm II (NSGA-II), and thorough evaluation across different base models, including various Llama model sizes, DeepSeek, Mistral, and Phi. With this integrated methodology, we achieved a final model accuracy of 96.8% on hold-out data, using a fine-tuned Llama 3.1 model with 8 billion parameters. Despite the model's relatively modest size, targeted adaptation enabled it to outperform larger proprietary models such as GPT-4o on the specific classification task. Implementation aspects, including computational needs, cost efficiency, and human-in-the-loop strategies, are also discussed. Our deployment framework emphasises selective human review based on model confidence, enabling a sustainable balance between automation and accuracy. Financial and infrastructure considerations showed that fine-tuned open-source models could offer significant cost savings compared to API-based solutions. In conclusion, this research presents a scalable, cost-effective, and high-performing solution that enhances vehicle classification, leading to improved risk segmentation and pricing precision in the insurance sector. The results demonstrate that fine-tuned open-source LLMs, when carefully adapted, can rival and even surpass much larger commercial models in domain-specific applications, offering a viable path for modernising traditional insurance workflows.

Subject/keywords

AI, LLM, PEFT, LoRA, machine learning, supervised learning, unsupervised learning, big data, classification, risk-based pricing.
