Improving Defect Localization by Classifying the Affected Asset using Machine Learning
Master's thesis
Software engineering and technology (MPSOF), MSc
Today’s market demands that complex, large-scale software be developed and delivered at an increasing pace. Greater software complexity raises the cost of maintenance, which on average accounts for 60 percent of software costs. Corrective maintenance, which includes receiving a defect report, diagnosing the described defect, and removing it, accounts for 21 percent of maintenance costs. A vital part of resolving a defect is defect localization: finding the exact location of the defect in the system. The defect report, in particular its asset attribute, helps the assigned entity limit the search space when investigating the exact location of the defect. However, research has shown that reporters often initially assign incorrect values to such attributes. This thesis evaluated the use of machine learning to classify the source asset of a given defect report at a telecom company. Following design science research, two iterations were conducted. The first iteration evaluated classification models for classifying the source asset after a defect report has been submitted. Training an SVM with features constructed from both the categorical and textual attributes of the defect reports achieved an accuracy of 58.52%. The second iteration evaluated classification models for providing the reporter with recommendations of likely assets. Using recommendations from an SVM trained with features from both the categorical and textual attributes of the defect reports significantly increased precision.
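The difference between the two iterations can be illustrated with a minimal sketch. It assumes a classifier (for example, an SVM's per-class decision scores) has already produced one score per asset for each defect report. The asset names, the scores, and the helper functions `recommend_assets` and `hit_rate_at_k` are hypothetical and are not taken from the thesis; the hit rate computed here is only an illustrative measure and may differ from the precision metric the thesis reports.

```python
from typing import Dict, List


def recommend_assets(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k asset labels with the highest classifier scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


def hit_rate_at_k(all_scores: List[Dict[str, float]],
                  true_assets: List[str], k: int) -> float:
    """Fraction of reports whose true asset appears among the top-k recommendations."""
    hits = sum(
        1
        for scores, truth in zip(all_scores, true_assets)
        if truth in recommend_assets(scores, k)
    )
    return hits / len(true_assets)


# Hypothetical per-asset scores for three defect reports (illustrative only).
report_scores = [
    {"radio": 0.9, "core": 0.2, "transport": -0.4},
    {"radio": 0.1, "core": 0.3, "transport": 0.25},
    {"radio": -0.2, "core": 0.5, "transport": 0.6},
]
true_assets = ["radio", "transport", "radio"]

# k=1 corresponds to a single predicted asset (first iteration);
# k>1 corresponds to a short list of recommendations (second iteration).
for k in (1, 2, 3):
    print(k, hit_rate_at_k(report_scores, true_assets, k))
```

With k = 1 the classifier must pick the single correct asset, whereas a longer recommendation list only needs to contain it, which is why a recommendation setting can reach a higher hit rate from the same underlying model.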
Computer and Information Science