Predicting the Need for Test Maintenance Using LLM Agents - Applying Test Maintenance Factors to Changes in Production Code to Identify If and Where Test Cases Need to Be Updated

Type

Master's Thesis

Abstract

Test maintenance, the act of modifying and updating test cases so that they keep up with changes in the production code, is a necessary but time-consuming and effort-intensive activity. One way to alleviate this effort is to automate parts of the test maintenance process; however, setting up and maintaining automation tools can be time-consuming as well. Generative AI and Large Language Models (LLMs) offer new avenues for automation and for lessening the test maintenance problem. One such avenue is LLM agents: AI systems that reason, plan, and use tools to achieve their goals. This thesis was conducted as an exploratory case study at Ericsson and investigated how generative AI can ease test maintenance, specifically how LLM agents can be used to predict test maintenance needs. The work had three phases: identifying factors that trigger test maintenance; exploring the capabilities of generative AI and how they might be applied to test maintenance; and, building on the results of the first two phases, constructing a prototype that predicts whether, and if so where, test maintenance is needed based on changes to the production code. We identified 40 factors that, when changed in production code, cause a need for test maintenance, and successfully demonstrated how they can be used as triggers in a setup with LLM agents. Of the four setups evaluated, the best-performing one used multiple LLM agents coordinated by a planning agent, with access to both the production code and natural language summaries of the test cases. Through a thorough literature review, we also identified test maintenance actions that LLMs can take or assist with. These results demonstrate both the possibilities and the current limitations of LLMs for test maintenance, and highlight that, although LLM studies in software engineering have largely focused on code generation, the capabilities of LLMs are much broader. This study provides examples of how LLM agents can be applied across the wider test maintenance process.
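The best-performing setup described above can be illustrated with a minimal sketch: a planning agent routes a production-code diff and natural language test summaries to specialist agents, each responsible for one maintenance-trigger factor. All names, the two example factors, and the keyword heuristic standing in for a real LLM call are illustrative assumptions, not the thesis's actual implementation.

```python
# Illustrative sketch only: the factor list, agent names, and the keyword
# heuristic (a stand-in for a real LLM call) are assumptions for exposition.
from dataclasses import dataclass


@dataclass
class Finding:
    agent: str
    factor: str
    needs_maintenance: bool


class SpecialistAgent:
    """Checks one test-maintenance factor against a code change.

    A real setup would send the diff and test summary to an LLM;
    here a keyword match stands in for that call.
    """

    def __init__(self, name: str, factor: str, keywords: list[str]):
        self.name, self.factor, self.keywords = name, factor, keywords

    def analyze(self, diff: str, test_summary: str) -> Finding:
        hit = any(k in diff for k in self.keywords)
        return Finding(self.name, self.factor, hit)


class PlanningAgent:
    """Coordinates the specialists and merges their findings."""

    def __init__(self, agents: list[SpecialistAgent]):
        self.agents = agents

    def predict(self, diff: str, test_summary: str) -> list[Finding]:
        findings = [a.analyze(diff, test_summary) for a in self.agents]
        return [f for f in findings if f.needs_maintenance]


agents = [
    SpecialistAgent("signature-agent", "changed method signature", ["def "]),
    SpecialistAgent("return-agent", "changed return value", ["return"]),
]
planner = PlanningAgent(agents)

diff = "def price(amount, vat):\n    return amount * (1 + vat)"
summary = "Tests verify price() with a default VAT rate."
for f in planner.predict(diff, summary):
    print(f"{f.agent}: test update likely ({f.factor})")
```

The design choice mirrored here is that prediction, not repair, is the goal: each specialist only flags where maintenance is likely needed, leaving the actual test update to a developer or a downstream tool.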

Subject / keywords

Software engineering (SE), test maintenance, large language model (LLM), LLM agent
