Exploring Optimized CPU-Inference for Latency-Critical Machine Learning Tasks

Type

Master's Thesis


Abstract

In recent years, machine learning has become increasingly prevalent across a wide range of applications spanning multiple industries. For some of these applications, low latency is critical, which may limit the types of hardware that can be used. Graphics Processing Units (GPUs) have long been the go-to hardware for machine learning tasks, often outperforming alternatives like Central Processing Units (CPUs), but they are not practical in all situations. We explore CPUs, leveraging modern optimization techniques like pruning and quantization, as a competitive alternative to GPUs with comparable predictive performance. This thesis provides a comparison of the two hardware types on a real-time, latency-critical vision task. On the GPU side, TensorRT in combination with quantization is used to achieve state-of-the-art inference performance on the hardware. On the CPU side, the model is optimized using SparseML to introduce unstructured sparsity and quantization. This optimized model is then used by the DeepSparse runtime engine for optimized inference. Our findings show that the CPU approach can outperform the GPU hardware in certain situations. This suggests that CPU hardware could potentially be used in applications previously limited to GPUs.
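The two CPU-side optimizations the abstract names, unstructured pruning and quantization, can be illustrated in miniature. The following is a hedged sketch in plain Python, not the thesis code and not the SparseML API: magnitude pruning zeroes the smallest-magnitude fraction of a weight list, and symmetric INT8 quantization maps floats onto integers in [-127, 127] with a single scale factor.

```python
def magnitude_prune(weights, sparsity):
    """Unstructured magnitude pruning: zero out the `sparsity` fraction
    of weights with the smallest absolute value."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])  # indices of the k smallest-magnitude weights
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(values):
    """Symmetric INT8 quantization: one scale maps the float range
    onto integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid scale == 0
    return [round(v / scale) for v in values], scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantized]


# Prune half the weights, then quantize the sparse result.
weights = [0.5, -0.02, 1.5, 0.003, -0.8, 0.01]
pruned = magnitude_prune(weights, 0.5)
quantized, scale = quantize_int8(pruned)
restored = dequantize(quantized, scale)
```

In a sparse-aware runtime such as DeepSparse, the zeroed weights let whole multiply-accumulates be skipped, while the INT8 representation shrinks memory traffic and enables wider integer SIMD, which is where the CPU speedups come from.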

Beskrivning

Keywords

machine learning, neural network, model compression, pruning, quantization, optimization, CPU, GPU, Neural Magic, NVIDIA
