Utilizing Large Language Models for Logged Data Analysis in Industrial Contexts - Investigating the Adaptation of Large Language Models for Logged Data Analysis
dc.contributor.author | Le, Eric | |
dc.contributor.author | Sahni, Himanshu | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
dc.contributor.examiner | Johansson, Moa | |
dc.contributor.supervisor | Haghir Chehreghani, Morteza | |
dc.date.accessioned | 2025-02-11T13:10:53Z | |
dc.date.available | 2025-02-11T13:10:53Z | |
dc.date.issued | 2024 | |
dc.date.submitted | ||
dc.description.abstract | Rapid advances in large language models (LLMs), such as OpenAI’s GPT series and Meta’s Llama series, hold significant potential to change data analysis processes in industrial contexts. This thesis explores the adaptation of LLMs for logged data analysis at Volvo Group. The objective is to customize and integrate an LLM-based system that automates the resource-intensive data analysis process and improves efficiency. The research involves constructing a multi-component agent system that incorporates prompt engineering techniques and targets the analysis of MF4 files, a format optimized for efficient storage of time-stamped measurement data. Our methodology includes developing a planner to break down complex analysis tasks into manageable subtasks, a selector to categorize user requests into computation and plotting tasks, and a code interpreter to generate and execute Python code for these tasks. The system’s performance was evaluated using custom-designed quantitative metrics and qualitative assessments from human evaluators. The LLM-based agent outperformed the current Python-based workflow in terms of efficiency, reducing the need for manual coding and making the analysis process substantially faster and more convenient. The model achieved an accuracy of 93.89% for computations and 90.36% for plots. These evaluations showed consistent performance, with high accuracy scores and positive feedback from the human evaluators. Key findings show that the model handles complex data analysis and generates diverse, tailored plots, highlighting its utility and potential for scalable, effective industrial data processing workflows. | |
dc.identifier.coursecode | DATX05 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/309114 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Large Language Models (LLMs) | |
dc.subject | Logged Data Analysis | |
dc.subject | Automation | |
dc.subject | Open-Source Models | |
dc.subject | Prompt Engineering | |
dc.subject | Agent-Based Analysis | |
dc.subject | MF4 File Analysis | |
dc.title | Utilizing Large Language Models for Logged Data Analysis in Industrial Contexts - Investigating the Adaptation of Large Language Models for Logged Data Analysis | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Complex adaptive systems (MPCAS), MSc |