Predicting the Need for Test Maintenance Using LLM Agents - Applying Test Maintenance Factors to Changes in Production Code to Identify If and Where Test Cases Need to Be Updated
Type
Master's thesis
Abstract
Test maintenance, the act of modifying and updating test cases to ensure they keep
up with the changes made in the production code, is a necessary but time-consuming
and effort-intensive activity. One way to alleviate these efforts is by automating parts
of the test maintenance process; however, setting up and maintaining automation
tools can be time-consuming as well. Generative AI and Large Language Models
(LLMs) offer new avenues for automation and lessening the test maintenance problem.
One of these is through LLM agents, sophisticated AI systems that reason,
plan, and use tools to achieve their goals.
This thesis was conducted as an exploratory case study at Ericsson and investigated
how generative AI can help ease test maintenance, specifically how LLM agents can
be used to predict test maintenance. The thesis had three phases: Identifying factors
that trigger test maintenance; exploring the capabilities of generative AI and
how it might be used to help with test maintenance; and, using the results from
the two previous phases, building a prototype that predicts whether, and if so
where, test maintenance is needed based on changes to the production code. We
identified 40 factors that, when changed in production code, create a need for
test maintenance,
and successfully demonstrated how they can be used as triggers in a setup with LLM
agents. Of the four setups evaluated, we found that using
multiple LLM agents coordinated by a planning agent, and giving these access to
both production code and natural language summaries of test cases, worked best.
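The best-performing setup can be illustrated with a minimal sketch. All class and agent names below are hypothetical assumptions for illustration, not the thesis's implementation: a planning agent dispatches a production-code change to worker agents, each of which matches the changed maintenance factors against natural-language test summaries it has access to. A real setup would back each agent with an LLM and tool calls rather than simple lookups.

```python
from dataclasses import dataclass

@dataclass
class CodeChange:
    file: str
    changed_factors: list          # e.g. ["method signature", "return type"]

@dataclass
class WorkerAgent:
    name: str
    watched_factors: set           # maintenance factors this agent handles
    test_summaries: dict           # factor -> test cases its summaries link to

    def predict(self, change: CodeChange) -> list:
        """Flag test cases whose summaries mention a changed factor."""
        hits = self.watched_factors & set(change.changed_factors)
        return sorted({t for f in hits for t in self.test_summaries.get(f, [])})

class PlanningAgent:
    """Dispatches a change to every worker and merges non-empty predictions."""
    def __init__(self, workers: list):
        self.workers = workers

    def predict_maintenance(self, change: CodeChange) -> dict:
        return {w.name: tests
                for w in self.workers
                if (tests := w.predict(change))}

# Usage: a change to a method signature in a (hypothetical) auth.py
planner = PlanningAgent([
    WorkerAgent("signature-agent", {"method signature"},
                {"method signature": ["test_login_accepts_token"]}),
    WorkerAgent("return-type-agent", {"return type"},
                {"return type": ["test_login_returns_session"]}),
])
change = CodeChange("auth.py", ["method signature"])
print(planner.predict_maintenance(change))
# → {'signature-agent': ['test_login_accepts_token']}
```

The design choice mirrored here is the coordination pattern the thesis reports as working best: a planner that fans a change out to specialized agents, each grounded in both the production code and test-case summaries, then aggregates their predictions.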
We also, through a thorough literature review, identified test maintenance
actions that LLMs can take or assist with. These demonstrate both the
possibilities and the current limitations of LLMs when it comes to test
maintenance, and the results highlight that, although LLM studies within
software engineering have largely focused on code generation, the capabilities
of LLMs are much broader. This study provides examples of how LLM agents can
be applied more widely.
Subject/keywords
Software engineering (SE), test maintenance, large language model (LLM), LLM agent