Empirical Analysis of Hidden Technical Debt Patterns in Machine Learning Software

Type

Master's thesis

Abstract

Developing, deploying, and maintaining Machine Learning (ML) based software products is costly, yet these costs are usually neglected. Sculley et al. [10] explained the maintainability challenges of ML software under the framework of "Hidden Technical Debt" (HTD), by analogy with technical debt in traditional software. HTD patterns arise from a group of ML software practices and activities that make future improvements to an ML system difficult and leave many errors unhandled in the long term; they are therefore considered main causes of increased maintenance cost. Moreover, some of these patterns keep expanding unnoticed, which is why they are called hidden. ML systems are especially prone to accumulating technical debt because they have ML-specific issues at the system level in addition to all the problems of regular code. The aim of this thesis is to empirically analyze which HTD patterns emerge, and how, during the early development phase of ML software, namely the prototyping phase. For this purpose, we conducted a case study analyzing ML models that will go into production and then be integrated into the software system owned by Västtrafik, the public transportation agency in western Sweden. To investigate the generalizability of our case study findings, we conducted a workshop with practitioners consisting of data scientists and software engineers. In our case study, we were able to detect 12 of 25 HTD patterns, one of which was observed only to a limited extent. The patterns observed during prototyping are mainly underutilized data dependencies (e.g., correlated and bundled features) and ML code smells (e.g., glue code, pipeline jungles, dead experimental code paths). We also observed, to some extent, entanglement, configuration debt, abstraction debt, prediction bias, multiple language smell, data testing debt, and cultural debt.
Of the 13 HTD patterns that went undetected, 12 can only be detected after deployment of ML software. The remaining undetected pattern is "Plain Old Data Type Smell", since we did not implement the ML algorithms from scratch but instead used existing ML libraries provided by an online cloud solution. Our workshop results indicate that the majority of our findings are applicable to other ML application domains. Practitioners also agreed that prototypes built by data scientists are not ideal in terms of software engineering (SE) practices. Hence, developers need to refactor the prototypes to prepare them for the production stage.

Subject/keywords

Technical Debt, Hidden Technical Debt (HTD) Patterns, Machine Learning (ML), Software Engineering (SE), Maintainability, Feature Engineering
