Using machine learning and natural language processing to automatically extract information from software documentation
Type
Master's thesis
Abstract
Engineers face many challenges when it comes to using and maintaining software
documentation. OD3 is a vision for the future of software documentation that
proposes that documentation should be generated based on user queries. There are
many steps that need to be taken to create such a system. This research takes one
of those necessary steps by investigating the categories of software knowledge that
are contained in software documentation, automatically classifying sentences from
software documentation into those categories, and exploring methods to identify
sentence relations. This analysis was conducted on the documentation of a single case. A
system, Software Documentation Supporter (SDS), was then built to explore and
evaluate the results. The aim of the SDS is to support the user when navigating
through long software documentation. In the system, the user can choose from a
list of questions, and the software knowledge extracted from the documentation is
used to answer those questions. The results were evaluated using a quantitative
and a qualitative approach. As the sample size of the evaluation was small, the
quantitative results did not show a significant difference in the time it took users
to solve tasks using the SDS, compared to using only the documentation. The
qualitative results showed that participants felt that the SDS supported them
and helped them navigate the documentation. However, it was also clear that
improvements are needed in both the method and the design of
the system.
Subject/keywords
software, documentation, architecture, requirement, natural language processing, classification, clustering