Utilizing Large Language Models for Logged Data Analysis in Industrial Contexts - Investigating the Adaptation of Large Language Models for Logged Data Analysis

Le, Eric; Sahni, Himanshu

dc.contributor.author	Le, Eric
dc.contributor.author	Sahni, Himanshu
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.department	Chalmers University of Technology / Department of Computer Science and Engineering	en
dc.contributor.examiner	Johansson, Moa
dc.contributor.supervisor	Haghir Chehreghani, Morteza
dc.date.accessioned	2025-02-11T13:10:53Z
dc.date.available	2025-02-11T13:10:53Z
dc.date.issued	2024
dc.date.submitted
dc.description.abstract	The rapid advancements in large language models (LLMs), such as OpenAI’s GPT series and Meta’s Llama series, hold significant potential to revolutionize data analysis processes in industrial contexts. This thesis explores the adaptation of LLMs for logged data analysis within an industrial context, specifically at Volvo Group. The objective is to customize and integrate an LLM-based system to automate the resource-intensive data analysis process and enhance efficiency. The research involves constructing a multi-component agent system that incorporates prompt engineering techniques, specifically targeting the analysis of MF4 files, a format optimized for efficient storage of time-stamped measurement data. Our methodology includes developing a planner to break down complex analysis tasks into manageable subtasks, a selector to categorize user requests into computation and plotting tasks, and a code interpreter to generate and execute Python code for these tasks. The system’s performance was evaluated using custom-designed quantitative metrics and qualitative assessments from human evaluators to ensure accuracy and efficiency. The LLM-based agent outperformed the current Python-based workflow in terms of efficiency, reducing the need for manual coding and making the analysis process substantially faster and more convenient. The model achieved an accuracy of 93.89% for computations and 90.36% for plots. The reliability of the model was confirmed through rigorous evaluations, and it showed consistent performance with high accuracy scores and positive feedback from the human evaluators. Key findings show that the model handles complex data analysis and generates diverse, tailored plots, highlighting its utility and transformative potential for scalable, effective industrial data processing workflows.
dc.identifier.coursecode	DATX05
dc.identifier.uri	http://hdl.handle.net/20.500.12380/309114
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	Large Language Models (LLMs)
dc.subject	Logged Data Analysis
dc.subject	Automation
dc.subject	Open-Source Models
dc.subject	Prompt Engineering
dc.subject	Agent-Based Analysis
dc.subject	MF4 File Analysis
dc.title	Utilizing Large Language Models for Logged Data Analysis in Industrial Contexts - Investigating the Adaptation of Large Language Models for Logged Data Analysis
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Complex adaptive systems (MPCAS), MSc

Original bundle

Name: CSE 24-79 EL HS.pdf
Size: 4.18 MB
Format: Adobe Portable Document Format

Collections

Master's theses