Security Analysis of Code Bloat in Machine Learning Systems
Type
Master's thesis
Programme
Model builders
Abstract
Code bloat is a significant issue in modern software systems as they continue to
increase in size and complexity. Furthermore, with the widespread adoption of containerized
applications, there is an abundance of unneeded packages that suffer from
a wide range of vulnerabilities. In this thesis, we analyze the prevalence of security
vulnerabilities in containers used for Machine Learning (ML) systems. We consider
two popular ML frameworks, PyTorch and TensorFlow. Using
container scanning tools, we observed over 100 Common Vulnerabilities and Exposures
(CVEs) in the tested containers. Our experiments show that debloating with
Cimplifier reduces image sizes by up to 49% and the number of
vulnerabilities by at least 87%. The majority of the removed CVEs can be attributed
to the removal of bloat specific to redundant parts of the containers’ installed OS
packages. A smaller portion of the CVEs detected in the Python packages were
removed by Cimplifier.
Keywords
Security, Debloating, Vulnerability Scanning, Machine Learning Systems, Containers, Docker
