Debloating Machine Learning Systems

Type

Master's thesis

Abstract

The size and complexity of software systems tend to grow over time. As a side effect, this growth can lead to the accumulation of unused code, also known as bloat. In this study, we assess the prevalence of bloat in Machine Learning (ML) systems, give an overview of a selection of existing debloating tools, and study their applicability to workloads in this field. To assess the tools, we run a number of experiments on five different ML models written using the PyTorch library. The debloating target is a Docker image containing the ML library and the other dependencies required besides the model itself and the dataset. Cimplifier is the only tool we tested that was able to generate working images. While the literature in the field of debloating suggests a possible reduction in metrics such as memory usage or power consumption, our testing only shows a reduction in storage size. Most of the removed files are parts of the Nvidia CUDA toolkit and the Intel Math Kernel Library. To summarize, Cimplifier gives promising results for storage reduction (around 50%) but does not affect other metrics such as GPU usage, power consumption, or workload runtime.

Keywords

Computer, science, computer science, machine learning, bloat, debloating, project, thesis
