Hybrid Compression: Exploiting Model and Data Compression for Deep Neural Network Workloads
Master's Thesis
Computer systems and networks (MPCSN), MSc, High-performance Computer Systems (MPHPC), MSc
Nowadays, various kinds of deep neural networks (DNNs) show human-level capabilities in their domains, but these networks usually have millions or billions of parameters and a significant computation cost. To bring artificial intelligence into people's daily lives, deploying DNNs efficiently on various hardware (e.g., resource-limited edge devices) has become a popular research topic. In this thesis, we explore model compression and data compression, and combine them into a hybrid compression scheme that reduces the computation and memory costs of DNNs and speeds up model inference. Starting from models with 30% of their floating-point operations (FLOPs) pruned, hybrid compression achieves compression ratios of 5.15 and 5.57, speedups of 2.66 and 3.93, and accuracy losses of 2.38% and 2.64% for MobileNetV1 and ResNet50, respectively. Starting from models with 50% of their FLOPs pruned, it achieves compression ratios of 6.29 and 7.53, speedups of 2.74 and 4.21, and accuracy drops of 7.71% and 9.27% for MobileNetV1 and ResNet50. To verify the effectiveness of hybrid compression, we evaluate the inference speed of the compressed models on two edge devices, the Nvidia Jetson Nano and the Nvidia Jetson Xavier NX. We find that the gains of hybrid compression are hardware-dependent, with the Xavier NX showing larger benefits than the Nano. Finally, we give users recommendations on how to apply hybrid compression from different perspectives.
Deep Neural Networks, CNNs, Model Compression, Data Compression, Model Inference