Hybrid Compression: Exploiting Model and Data Compression for Deep Neural Network Workloads

dc.contributor.author: Xie, Yunyao
dc.contributor.author: Li, Naicheng
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik (sv)
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (en)
dc.contributor.examiner: Pericas, Miquel
dc.contributor.supervisor: Petersen Moura Trancoso, Pedro
dc.date.accessioned: 2022-11-30T08:50:13Z
dc.date.available: 2022-11-30T08:50:13Z
dc.date.issued: 2022
dc.date.submitted: 2020
dc.description.abstract: Nowadays, various kinds of deep neural networks (DNNs) show human-level capabilities in their domains, but these networks usually have millions or billions of parameters and a significant computation cost. To bring artificial intelligence into people's daily lives, deploying DNNs efficiently on various hardware (e.g., resource-limited edge devices) has become a popular topic. In this thesis, we explore model compression and data compression, and then combine them into a hybrid compression scheme that reduces the computation and memory costs of DNNs and speeds up model inference. Starting from models with 30% of their floating-point operations (FLOPs) pruned, hybrid compression yields compression ratios of 5.15 and 5.57, speedups of 2.66 and 3.93, and accuracy losses of 2.38% and 2.64% for MobileNetV1 and ResNet50, respectively. Starting from models with 50% of their FLOPs pruned, it achieves compression ratios of 6.29 and 7.53, speedups of 2.74 and 4.21, and accuracy drops of 7.71% and 9.27% for MobileNetV1 and ResNet50. To verify the effectiveness of hybrid compression, we evaluate the inference speed of the compressed models on two edge devices, the Nvidia Jetson Nano and the Nvidia Jetson Xavier NX. We find that the gains of hybrid compression are hardware dependent, with the Xavier NX showing greater benefits than the Nano. Finally, we give users recommendations on how to apply hybrid compression from different perspectives.
dc.identifier.coursecode: DATX05
dc.identifier.uri: https://odr.chalmers.se/handle/20.500.12380/305842
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Deep Neural Networks
dc.subject: CNNs
dc.subject: Model Compression
dc.subject: Data Compression
dc.subject: Model Inference
dc.title: Hybrid Compression: Exploiting Model and Data Compression for Deep Neural Network Workloads
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Computer systems and networks (MPCSN), MSc
local.programme: High-performance computer systems (MPHPC), MSc
Original bundle:
Name: CSE 22-66 Xie Li.pdf
Size: 3.09 MB
Format: Adobe Portable Document Format

License bundle:
Name: license.txt
Size: 1.64 KB
Format: Item-specific license agreed upon to submission