Utilizing Large Language Models for Logged Data Analysis in Industrial Contexts - Investigating the Adaptation of Large Language Models for Logged Data Analysis

dc.contributor.author: Le, Eric
dc.contributor.author: Sahni, Himanshu
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik (sv)
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (en)
dc.contributor.examiner: Johansson, Moa
dc.contributor.supervisor: Haghir Chehreghani, Morteza
dc.date.accessioned: 2025-02-11T13:10:53Z
dc.date.available: 2025-02-11T13:10:53Z
dc.date.issued: 2024
dc.date.submitted:
dc.description.abstract: The rapid advancements in large language models (LLMs), such as OpenAI’s GPT series and Meta’s Llama series, hold significant potential to revolutionize data analysis processes in industrial contexts. This thesis explores the adaptation of LLMs for logged data analysis within an industrial context, specifically at Volvo Group. The objective is to customize and integrate an LLM-based system to automate the resource-intensive data analysis process and enhance efficiency. The research involves constructing a multi-component agent system that incorporates prompt engineering techniques, specifically targeting the analysis of MF4 files, a format optimized for efficient storage of time-stamped measurement data. Our methodology includes developing a planner to break down complex analysis tasks into manageable subtasks, a selector to categorize user requests into computation and plotting tasks, and a code interpreter to generate and execute Python code for these tasks. The system’s performance was evaluated using custom-designed quantitative metrics and qualitative assessments from human evaluators to ensure accuracy and efficiency. The LLM-based agent outperformed the current Python-based workflow in terms of efficiency, reducing the need for manual coding and making the analysis process substantially faster and more convenient. The model achieved an accuracy of 93.89% for computations and 90.36% for plots. The reliability of the model was confirmed through rigorous evaluations, and it showed consistent performance with high accuracy scores and positive feedback from the human evaluators. Key findings show the model handles complex data analysis and generates diverse, tailored plots, highlighting its utility and transformative potential for scalable, effective industrial data processing workflows.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/309114
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Large Language Models (LLMs)
dc.subject: Logged Data Analysis
dc.subject: Automation
dc.subject: Open-Source Models
dc.subject: Prompt Engineering
dc.subject: Agent-Based Analysis
dc.subject: MF4 File Analysis
dc.title: Utilizing Large Language Models for Logged Data Analysis in Industrial Contexts - Investigating the Adaptation of Large Language Models for Logged Data Analysis
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Complex adaptive systems (MPCAS), MSc

Download

Original bundle

Name: CSE 24-79 EL HS.pdf
Size: 4.18 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission