Debloating Machine Learning Systems

Type

Master's thesis

Abstract

The size and complexity of software systems tend to grow over time. As a side effect, this growth can lead to the accumulation of unused code, also known as bloat. In this study, we assess the prevalence of bloat in Machine Learning (ML) systems, give an overview of a selection of existing debloating tools, and study their applicability to workloads in this field. To assess the tools, we run a number of experiments on five different ML models written using the PyTorch library. The debloating target is a Docker image containing the ML library and other dependencies required besides the model itself and the dataset. Cimplifier is the only tool we tested that was able to generate working images. While the literature in the field of debloating suggests a possible reduction in metrics such as memory usage or power consumption, our testing only shows a reduction in storage size. Most of the removed files are parts of the Nvidia CUDA toolkit and the Intel Math Kernel Library. To summarize, Cimplifier gives promising results for storage reduction (around 50%) but has no measurable impact on other metrics such as GPU usage, power consumption, or workload runtime.

Subject/keywords

Computer, science, computer science, machine learning, bloat, debloating, project, thesis
