Distilled Few-Shot Object Detection for Truck Equipment: Distilling a few-shot state-of-the-art model for truck equipment detection utilizing stable diffusion for data augmentation
Type
Master's Thesis
Abstract
Few-shot object detection in specialized domains such as truck equipment is challenging because data is scarce and no models are tailored to the domain. This study addresses these issues by developing an efficient, streamlined pipeline for identifying truck equipment in constrained settings. We employ knowledge distillation, transferring knowledge from a teacher model, the recent CD-ViTO architecture, to a lightweight YOLOv11 student model. To augment the limited training data, we generate synthetic images with Stable Diffusion 3.5 using prompt engineering. The study evaluates the impact of this approach on accuracy and inference speed. The results show that knowledge distillation with generated data significantly boosts the student models’ accuracy, particularly in low-shot settings such as 1- and 5-shot, with one configuration outperforming the teacher. Critically, the distilled YOLOv11 student models achieve drastically lower inference times (an 83-fold speedup), enabling real-time application. This pipeline demonstrates a successful strategy for building performant object detection systems in environments with limited data.
Subject/keywords
Few-Shot Object Detection, Knowledge Distillation, Machine Learning, Stable Diffusion, Image Generation, Truck Equipment, Data Augmentation, Real-Time Object Detection, Cross-Domain Vision Transformer, You Only Look Once