Hybrid Compression: Exploiting Model and Data Compression for Deep Neural Network Workloads

dc.contributor.author: Xie, Yunyao
dc.contributor.author: Li, Naicheng
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik (sv)
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (en)
dc.contributor.examiner: Pericas, Miquel
dc.contributor.supervisor: Petersen Moura Trancoso, Pedro
dc.date.accessioned: 2022-11-30T08:50:13Z
dc.date.available: 2022-11-30T08:50:13Z
dc.date.issued: 2022
dc.date.submitted: 2020
dc.description.abstract: Nowadays, various kinds of deep neural networks (DNNs) show human-level capabilities in their domains, but these networks usually have millions or billions of parameters and a significant computation cost. To bring artificial intelligence into people's daily lives, deploying DNNs efficiently on various hardware (e.g., resource-limited edge devices) has become a popular topic. In this thesis, we explore model compression and data compression, and then combine them into a hybrid compression scheme that reduces the computation and memory costs of DNNs and speeds up model inference. Starting from models with 30% of their floating-point operations (FLOPs) pruned, hybrid compression yields compression ratios of 5.15 and 5.57, speedups of 2.66 and 3.93, and accuracy losses of 2.38% and 2.64% for MobileNetV1 and ResNet50, respectively. Starting from models with 50% of their FLOPs pruned, it achieves compression ratios of 6.29 and 7.53, speedups of 2.74 and 4.21, and accuracy drops of 7.71% and 9.27% for MobileNetV1 and ResNet50. To verify the effectiveness of hybrid compression, we evaluate the inference speed of the compressed models on two edge devices, the Nvidia Jetson Nano and the Nvidia Jetson Xavier NX. We find that the gains of hybrid compression are hardware dependent, with the Xavier NX showing greater benefits than the Nano. Finally, we give users recommendations on how to apply hybrid compression from different perspectives.
dc.identifier.coursecode: DATX05
dc.identifier.uri: https://odr.chalmers.se/handle/20.500.12380/305842
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Deep Neural Networks
dc.subject: CNNs
dc.subject: Model Compression
dc.subject: Data Compression
dc.subject: Model Inference
dc.title: Hybrid Compression: Exploiting Model and Data Compression for Deep Neural Network Workloads
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Computer systems and networks (MPCSN), MSc
local.programme: High-performance computer systems (MPHPC), MSc
Original bundle:
Name: CSE 22-66 Xie Li.pdf
Size: 3.09 MB
Format: Adobe Portable Document Format

License bundle:
Name: license.txt
Size: 1.64 KB
Format: Item-specific license agreed upon to submission