Learning efficient software fault localization via genetic programming
Master's thesis
The high cost of debugging computer software has motivated the development of semi-automatic fault localization techniques. Such techniques assist developers in locating faulty code by ranking program statements according to their likelihood of being faulty. The ranking is produced by automated analysis of test coverage or execution profile data. A variety of fault localization techniques using different types of ranking functions have been proposed in the past. In this paper, we present a new fault localization technique in which genetic programming is used to find a highly effective ranking function. First, we divide frequently appearing fault types into four subsets. We then identify potentially useful execution profiles and use genetic programming to search for a new, improved ranking function for each fault type individually. Finally, we merge the ranking lists produced by the four ranking functions into a final aggregated ranking list, which the developer uses to search for faulty code. We evaluated the efficiency of our technique using execution profile data from two programs, GCC and SPACE, and compared it to the efficiency of two existing fault localization techniques. The results show that our approach is highly effective: more than 90% of the faults can be located by examining the top 20% of the statements in the ranking list. The improvement in efficiency is about 10% compared to Lightweight fault localization and about 20% compared to the Tarantula technique.
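To make the idea of a ranking function concrete, the following is a minimal sketch of the Tarantula baseline mentioned above, together with the "fraction of statements examined" measure used to judge a ranking. It is not the ranking function found by genetic programming in this work. The function names (`tarantula`, `rank_statements`, `fraction_examined`) and the toy coverage counts are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple


def tarantula(passed_cov: int, failed_cov: int,
              total_passed: int, total_failed: int) -> float:
    """Tarantula suspiciousness of one statement.

    passed_cov / failed_cov: number of passing / failing tests that
    execute the statement. Returns a value in [0, 1]; higher means
    more suspicious.
    """
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def rank_statements(coverage: Dict[str, Tuple[int, int]],
                    total_passed: int, total_failed: int) -> List[str]:
    """Order statements from most to least suspicious.

    Ties are broken by statement name so the result is deterministic.
    """
    scores = {
        stmt: tarantula(p, f, total_passed, total_failed)
        for stmt, (p, f) in coverage.items()
    }
    return sorted(scores, key=lambda s: (-scores[s], s))


def fraction_examined(ranking: List[str], faulty: str) -> float:
    """Share of the ranking a developer reads before reaching the fault."""
    return (ranking.index(faulty) + 1) / len(ranking)


# Hypothetical data: statement -> (passing tests, failing tests) covering it.
coverage = {"s1": (10, 0), "s2": (5, 4), "s3": (1, 4), "s4": (8, 1)}
ranking = rank_statements(coverage, total_passed=10, total_failed=4)
```

In this toy example `s3` is executed by every failing test and almost no passing test, so it is ranked first. A learned ranking function would replace the body of `tarantula` with an evolved expression over the same kind of execution profile counts.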
Information & Communication Technology, Systems engineering