Debloating Machine Learning Systems
Type
Master's thesis
Abstract
The size and complexity of software systems tend to grow over time. As a side-effect,
this increase can potentially lead to the accumulation of unused code, also known
as bloat. In this study, we assess the prevalence of bloat in Machine Learning (ML)
systems, give an overview of a selection of existing debloating tools and study their
applicability to workloads in this field. To assess the tools, we run a number
of experiments on five different ML models written using the PyTorch library.
The debloating target is a Docker image containing the ML library and other
dependencies required besides the model itself and the dataset. Cimplifier is the
only tool we tested that was able to generate working images. While the literature in
the field of debloating suggests a possible reduction in metrics such as memory usage
or power consumption, our testing only shows a reduction in storage size. Most of
the removed files are parts of the Nvidia CUDA toolkit and the Intel Math Kernel
Library. To summarize, Cimplifier gives promising results when it comes to storage
reductions (around 50%) but is unable to impact other metrics such as GPU usage,
power consumption or workload runtime.
Subject/keywords
Computer, science, computer science, machine learning, bloat, debloating, project, thesis