Applying software engineering and machine learning practices to manage machine learning complexity

Type
Master's thesis
Published
2022
Authors
Hillström, Sara
Mejborn, Johan
Abstract
Software engineering (SE) and machine learning (ML) are both fairly well-established areas within engineering. The field of software engineering for machine learning (SE4ML) addresses how software engineering practices can be applied to software that contains ML. Complexity is a term with many definitions, and how it is defined and handled differs between traditional software engineering and machine learning. In this thesis, complexity is defined as a measure of the resources expended by another system when interacting with a piece of software. If the interacting system is another machine, we call it resource cost; if the interacting system is people (performing tasks such as debugging and testing), we call it software complexity. This thesis was conducted in close collaboration with a partner company and aims to contribute to SE4ML by providing a framework that offers guidance on how software complexity and resource cost may be addressed in different parts of the ML development process. The framework is also intended to provide insights into possible trade-offs between software complexity and resource cost. To validate the framework, interviews were held with practitioners and representatives from academia, and the framework was applied to an existing problem at the partner company. The latter was done by tweaking an existing ML model and developing two other models for comparison. The validation interviews and the application to an existing ML model confirmed that the framework is useful for practitioners. There are trade-offs between some of the activities that form the framework, referred to as artifacts. Practitioners therefore need, to some extent, to balance conflicting artifacts in order to optimize the trade-off between resource cost and software complexity for the specific use case at hand.
Subject/keywords
software engineering, machine learning, complexity, framework