Automation and Orchestration for Machine Learning Pipelines: A study of Machine Learning Scaling: Exploring Micro-service architecture with Kubernetes
Type
Master's Thesis
Program
Complex adaptive systems (MPCAS), MSc
Published
2024
Authors
Melberg, Filip; Kostara, Vasiliki
Abstract
Although Machine Learning (ML) has been around for many decades, its popularity has grown tremendously in recent years. Today's requirements show a great
need for the development and management of ML projects beyond algorithms and
coding. The aim of this thesis is to investigate how a minimal team of engineers can
create and maintain an ML pipeline. To this end, we explore how a Machine
Learning Operations (MLOps) pipeline could be created using containerization
and container orchestration of micro-services. After relevant research, the result is
a minimal, on-premises Kubernetes cluster set up on physical servers and Virtual
Machines (VMs) running the Ubuntu Operating System (OS). The cluster consists
of a master and two worker nodes, which are used for two main ML frameworks.
Populating the cluster with more nodes is straightforward, which makes scaling a
simple task. Additionally, a locally shared folder on the network is mounted in
the cluster as external storage, and the cluster is configured to access either a
local or a cloud-provided container registry. Once the cluster is set up and running, an application is launched to train the YOLOv5 model on a custom dataset.
Later, Distributed Data Parallel (DDP) training is performed on the cluster using
PyTorch, TorchX, PyTorch Lightning and Volcano.
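
To make the idea behind Distributed Data Parallel training concrete, the following is a minimal sketch and not the thesis's implementation. It simulates data-parallel training in plain Python: each worker computes a gradient on its own shard of the data, and the gradients are averaged before every update. The function names (shard_indices, local_gradient, all_reduce_mean, ddp_step), the toy linear model and the data are all hypothetical and chosen for illustration; real DDP training in PyTorch runs one process per worker and performs the averaging through collective communication.

```python
# Pure-Python simulation of data-parallel gradient averaging.
# Assumption: a toy model y = w * x with squared-error loss; no PyTorch involved.

def shard_indices(num_samples, world_size, rank):
    """Assign every world_size-th sample to a worker, starting at its rank."""
    return list(range(rank, num_samples, world_size))

def local_gradient(w, xs, ys, indices):
    """Mean gradient of 0.5 * (w * x - y)**2 with respect to w over one shard."""
    return sum((w * xs[i] - ys[i]) * xs[i] for i in indices) / len(indices)

def all_reduce_mean(values):
    """Stand-in for the all-reduce step: average the per-worker gradients."""
    return sum(values) / len(values)

def ddp_step(w, xs, ys, world_size, lr):
    """One synchronous update: local gradients, averaging, shared weight update."""
    grads = [
        local_gradient(w, xs, ys, shard_indices(len(xs), world_size, rank))
        for rank in range(world_size)
    ]
    return w - lr * all_reduce_mean(grads)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated from y = 2 * x

w = 0.0
for _ in range(50):
    # Two simulated workers, mirroring a cluster with two worker nodes.
    w = ddp_step(w, xs, ys, world_size=2, lr=0.05)

print(round(w, 3))
```

Because the two shards have equal size, the averaged gradient matches the full-batch gradient, so the simulated workers follow the same optimization path as single-node training while each processes only half of the data per step.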
Subject/keywords
DevOps, Docker, Kubernetes, Micro-service, ML, MLOps, PyTorch, YOLO