Low latency video analytics system with multi-exit neural networks

Type
Master's Thesis
Program
Computer systems and networks (MPCSN), MSc
Data science and AI (MPDSC), MSc
Published
2022
Authors
HARINDRAN, NEETHU
POOJARY, BHARATH
Abstract
Computer vision-based control systems have become increasingly powerful and promising in tackling real-world problems. This can be attributed to the use of deep learning methods in these systems, with state-of-the-art performance sometimes exceeding that of humans in tasks that require subjective decision making. This has resulted in increased interest in these systems from Swedish industry, including Volvo. One example is the Volvo GPSS system, where semantic segmentation is used to make real-time decisions based on pixel-level classification of a monitored area. However, such systems frequently face a trade-off between latency and accuracy. This is primarily because deep neural network models for vision systems use an increasing number of layers, and every input passes through all of them, so resource utilization is the same regardless of input complexity. In this thesis, we develop an approach that employs an input-adaptive multi-exit strategy to exploit the latency benefits of dynamic processing based on input complexity. The proposed approach aims to reduce average inference time, as simple input samples take an early exit and only complex samples need the additional computation offered by all the model layers. The open-source CityScapes dataset and the Volvo dataset were used in a number of multi-exit semantic segmentation experiments, with the HRNet architecture chosen as the backbone. The thesis studies three novel exit strategies, based on reinforcement learning, auxiliary models, and the fast Fourier transform. Of all the methods examined, the reinforcement learning-based exit strategy displayed the best performance advantages, with accuracy on par with unbranched HRNet and a significant decrease in latency and computation.
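As a rough illustration of the input-adaptive idea described in the abstract, the following minimal Python sketch runs a model stage by stage and stops at the first exit whose prediction looks confident enough. The confidence-threshold rule used here is a common simple baseline assumed purely for illustration; it is not one of the three exit strategies studied in the thesis. All names (`multi_exit_infer`, `mean_confidence`, `sharpen`, `binary_head`) and the toy two-class "pixels" are hypothetical and stand in for a real backbone such as HRNet.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types: a stage transforms features, an exit head turns
# features into per-pixel class probabilities.
Stage = Callable[[List[float]], List[float]]
ExitHead = Callable[[List[float]], List[List[float]]]


def mean_confidence(pixel_probs: Sequence[Sequence[float]]) -> float:
    """Average of the highest class probability over all pixels."""
    return sum(max(p) for p in pixel_probs) / len(pixel_probs)


def multi_exit_infer(
    x: List[float],
    stages: Sequence[Stage],
    heads: Sequence[ExitHead],
    threshold: float = 0.9,
) -> Tuple[List[int], int]:
    """Run stages in order; return (per-pixel labels, index of exit used)."""
    features = x
    for i, (stage, head) in enumerate(zip(stages, heads)):
        features = stage(features)
        probs = head(features)
        is_last = i == len(stages) - 1
        # Assumed exit rule: leave early once the exit head is confident.
        if is_last or mean_confidence(probs) >= threshold:
            labels = [max(range(len(p)), key=p.__getitem__) for p in probs]
            return labels, i
    raise ValueError("at least one stage is required")


def sharpen(features: List[float]) -> List[float]:
    """Toy stage: push each score away from 0.5, clamped to [0, 1]."""
    return [min(1.0, max(0.0, 0.5 + 2.0 * (f - 0.5))) for f in features]


def binary_head(features: List[float]) -> List[List[float]]:
    """Toy exit head: treat each score as P(class 1) for one pixel."""
    return [[1.0 - f, f] for f in features]


stages = [sharpen, sharpen, sharpen]
heads = [binary_head, binary_head, binary_head]

easy_input = [0.05, 0.95, 0.9]   # scores already far from 0.5
hard_input = [0.45, 0.55, 0.6]   # ambiguous scores near 0.5

easy_labels, easy_exit = multi_exit_infer(easy_input, stages, heads)
hard_labels, hard_exit = multi_exit_infer(hard_input, stages, heads)
print("easy input left at exit", easy_exit)
print("hard input left at exit", hard_exit)
```

In this toy setup the unambiguous input leaves at the first exit, while the ambiguous one runs through every stage, which is the source of the lower average inference time the abstract refers to.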
Subject/keywords
Multi-exit Neural Networks, Input Adaptive Inference, Semantic Segmentation, Inference Optimization