Utilizing Large Language Models for Logged Data Analysis in Industrial Contexts - Investigating the Adaptation of Large Language Models for Logged Data Analysis
Date
Authors
Type
Master's Thesis
Abstract
Rapid advances in large language models (LLMs), such as OpenAI’s GPT
series and Meta’s Llama series, have significant potential to change data
analysis processes in industry. This thesis explores the adaptation of LLMs
for logged data analysis in one such industrial context, Volvo Group.
The objective is to customize and integrate an LLM-based system to automate
the resource-intensive data analysis process and enhance efficiency. The research
involves constructing a multi-component agent system that incorporates prompt
engineering techniques, specifically targeting the analysis of MF4 files, a format
optimized for efficient storage of time-stamped measurement data. Our methodology
includes developing a planner to break down complex analysis tasks into manageable
subtasks, a selector to categorize user requests into computation and plotting tasks,
and a code interpreter to generate and execute Python code for these tasks. The
system’s performance was evaluated using custom-designed quantitative metrics and
qualitative assessments from human evaluators to ensure accuracy and efficiency.
The LLM-based agent outperformed the current Python-based workflow in terms
of efficiency, reducing the need for manual coding and making the analysis process
substantially faster and more convenient. The model achieved an accuracy of 93.89%
for computations and 90.36% for plots. The evaluations confirmed the model's
reliability: it performed consistently, with high accuracy scores and
positive feedback from the human evaluators. Key findings
show the model handles complex data analysis and generates diverse, tailored plots,
highlighting its utility and transformative potential for scalable, effective industrial
data processing workflows.
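
The planner, selector, and code interpreter described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: every name here (`Subtask`, `plan`, `select`, `interpret`, `fake_generate_code`, `run`) is hypothetical, simple rule-based functions stand in for the LLM calls, and the signals are an in-memory dictionary rather than data read from an MF4 file (for example with a library such as asammdf).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Subtask:
    description: str
    kind: str = "unclassified"  # set by the selector


def plan(request: str) -> List[Subtask]:
    """Planner stand-in: split a request into subtasks on ' and '.

    Assumption: a real planner would prompt an LLM for the breakdown.
    """
    parts = [p.strip() for p in request.split(" and ")]
    return [Subtask(p) for p in parts if p]


def select(task: Subtask) -> Subtask:
    """Selector stand-in: label a subtask as plotting or computation by keyword."""
    plotting_words = ("plot", "graph", "chart")
    text = task.description.lower()
    task.kind = "plotting" if any(w in text for w in plotting_words) else "computation"
    return task


def fake_generate_code(task: Subtask) -> str:
    """Hypothetical code generator: returns canned Python source per task kind."""
    if task.kind == "plotting":
        # Placeholder for plotting code: collect the points a plot would show.
        return "result = list(zip(signals['time'], signals['speed']))"
    return "result = max(signals['speed'])"


def interpret(task: Subtask, signals: Dict[str, List[float]],
              generate_code: Callable[[Subtask], str]) -> Any:
    """Code interpreter stand-in: obtain source for the subtask and execute it.

    exec() on generated code is unsafe; a real system would need sandboxing.
    """
    namespace: Dict[str, Any] = {"signals": signals}
    exec(generate_code(task), namespace)
    return namespace.get("result")


def run(request: str, signals: Dict[str, List[float]],
        generate_code: Callable[[Subtask], str] = fake_generate_code
        ) -> List[Tuple[str, Any]]:
    """Chain planner, selector, and interpreter over one user request."""
    tasks = [select(t) for t in plan(request)]
    return [(t.kind, interpret(t, signals, generate_code)) for t in tasks]


# Made-up example signals, not data from the thesis.
signals = {"time": [0.0, 1.0, 2.0], "speed": [10.0, 30.0, 20.0]}
for kind, result in run("compute the maximum speed and plot speed over time", signals):
    print(kind, result)
```

Keeping the three stages as separate functions means each one could be replaced by an LLM-backed version without changing the others, which mirrors the multi-component design the abstract describes.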
Keywords
Large Language Models (LLMs), Logged Data Analysis, Automation, Open-Source Models, Prompt Engineering, Agent-Based Analysis, MF4 File Analysis