Development of a Query Language for Improved Versioning Support for Machine-Learning-Based Systems
Type
Master's thesis
Published
2022
Author
Tran, Erik
Abstract
This thesis presents the development of a query language for improved versioning support
for machine-learning-based systems, focusing on the perspective of software
engineers. The motivation is to bridge the worlds of software engineers and data
scientists, who have to work together effectively. Existing tools that support the
management of machine learning assets are described, and the reasons why they
are not fit for software engineers are explained. A design science approach
is applied in this thesis, using methods such as requirement elicitation, artifact feature
elicitation, and artifact feature prioritization. Requirements were
formed through independent research and evaluated in four interviews. Features
were implemented based on the requirements and evaluated in another
four interviews. The artifact feature prioritization method includes the construction of
a traceability matrix. The final evaluation results indicate that the population that
would hypothetically use this query language in its current state consists of users who are
less experienced in the management of machine learning assets. Future work
related to scalability and data analysability is discussed in the
report. The query language could reach more advanced users if more advanced
features were implemented, e.g. features that support data analysability or features
that support other models, so that it is not restricted to Scikit-learn classes,
which are currently the only classes that can be created using the query
language.
Subject/keywords
software engineering, query language, machine learning, asset, versioning