Automation and Orchestration for Machine Learning Pipelines A study of Machine Learning Scaling: Exploring Micro-service architecture with Kubernetes

dc.contributor.authorMelberg, Filip, VASILIKI KOSTARA
dc.contributor.departmentChalmers tekniska högskola / Institutionen för fysiksv
dc.contributor.departmentChalmers University of Technology / Department of Physicsen
dc.contributor.examinerVolpe, Giovanni
dc.contributor.supervisorVolpe, Giovanni
dc.date.accessioned2024-06-12T11:44:08Z
dc.date.available2024-06-12T11:44:08Z
dc.date.issued2024
dc.date.submitted
dc.description.abstractlthough Machine Learning (ML) has been around for many decades, its popu larity has grown tremendously in recent years. Today’s requirements show a great need for the development and management of ML projects beyond algorithms and coding. The aim of this thesis is to investigate how a minimal team of engineers can create and maintain a ML pipeline. To this end, we will explore how a Machine Learning Operations (MLOps) pipeline could be created using containerization and container orchestration of micro-services. After relevant research, the result is a minimal, on-premises Kubernetes cluster set up on physical servers and Virtual Machines (VMs) running the Ubuntu Operating System (OS). The cluster consists of a master and two worker nodes, which are used for two main ML frameworks. Populating the cluster with more nodes is straightforward, which makes scaling a simple task. Additionally, a locally shared folder on the network is mounted in the cluster as an external storage and the cluster is configured to access either a local or a cloud-provided container registry. Once the cluster is set up and run ning, an application is launched to train the YOLOv5 model on a custom dataset. Later, Distributed Data Parallel (DDP) training is performed on the cluster using PyTorch, TorchX, PyTorch Lightning and Volcano.
dc.identifier.coursecodeTIFX05
dc.identifier.urihttp://hdl.handle.net/20.500.12380/307806
dc.language.isoeng
dc.setspec.uppsokPhysicsChemistryMaths
dc.subjectDevOps, Docker, Kubernetes, Micro-service, ML, MLOps, PyTorch, YOLO
dc.titleAutomation and Orchestration for Machine Learning Pipelines A study of Machine Learning Scaling: Exploring Micro-service architecture with Kubernetes
dc.type.degreeExamensarbete för masterexamensv
dc.type.degreeMaster's Thesisen
dc.type.uppsokH
local.programmeComplex adaptive systems (MPCAS), MSc
Ladda ner
Original bundle
Visar 1 - 1 av 1
Hämtar...
Bild (thumbnail)
Namn:
Master_Thesis_Report_Filip_Melberg.pdf
Storlek:
1.94 MB
Format:
Adobe Portable Document Format
Beskrivning:
License bundle
Visar 1 - 1 av 1
Hämtar...
Bild (thumbnail)
Namn:
license.txt
Storlek:
2.35 KB
Format:
Item-specific license agreed upon to submission
Beskrivning: