Debloating Machine Learning Systems
Date
Authors
Type
Master's thesis
Programme
Model builders
Abstract
The size and complexity of software systems tend to grow over time. As a side-effect,
this increase can potentially lead to the accumulation of unused code, also known
as bloat. In this study, we assess the prevalence of bloat in Machine Learning (ML)
systems, give an overview of a selection of existing debloating tools and study their
applicability to workloads in this field. In order to assess the tools, we run a number
of experiments on five different ML models written using the PyTorch library. The debloating target is a Docker image containing the ML library and other
dependencies required besides the model itself and the dataset. Cimplifier is the
only tool we tested that was able to generate working images. While the literature in
the field of debloating suggests a possible reduction in metrics such as memory usage
or power consumption, our testing only shows a reduction in storage size. Most of
the removed files are parts of the Nvidia CUDA toolkit and the Intel Math Kernel
Library. To summarize, Cimplifier gives promising results when it comes to storage
reductions (around 50%) but is unable to impact other metrics such as GPU usage,
power consumption or workload runtime.
Description
Keywords
Computer, science, computer science, machine learning, bloat, debloating, project, thesis
