Distilled Few-Shot Object Detection for Truck Equipment: Distilling a few-shot state-of-the-art model for truck equipment detection utilizing stable diffusion for data augmentation

dc.contributor.author: Borg, Andrei
dc.contributor.author: Johansson, Marcus
dc.contributor.department: Chalmers tekniska högskola / Institutionen för elektroteknik (sv)
dc.contributor.examiner: Kahl, Fredrik
dc.contributor.supervisor: Björklund, Anders
dc.date.accessioned: 2025-06-12T14:06:04Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: Few-shot object detection in specialized domains like truck equipment faces significant challenges due to the limited availability of data and the lack of customized models. This study addresses these issues by developing an efficient, streamlined pipeline for identifying truck equipment in constrained situations. We employ knowledge distillation, transferring knowledge from a teacher, the recent CD-ViTO architecture, to a lightweight YOLOv11 student model. We use Stable Diffusion 3.5 for synthetic image generation with prompt engineering to augment the limited training data. The study evaluates the impact of this approach on accuracy and inference speed. The results show that knowledge distillation with generated data significantly boosts the student models’ accuracy, particularly in low-shot settings such as 1- and 5-shot, with one configuration outperforming the teacher. Critically, the distilled YOLOv11 student models achieve drastically reduced inference times (83 times speed increase), enabling real-time application. This pipeline demonstrates a successful strategy for creating performant object detection systems in environments with limited data.
dc.identifier.coursecode: EENX30
dc.identifier.uri: http://hdl.handle.net/20.500.12380/309415
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Few-Shot Object Detection
dc.subject: Knowledge Distillation
dc.subject: Machine Learning
dc.subject: Stable Diffusion
dc.subject: Image Generation
dc.subject: Truck Equipment
dc.subject: Data Augmentation
dc.subject: Real-Time Object Detection
dc.subject: Cross-Domain Vision Transformer
dc.subject: You Only Look Once
dc.title: Distilled Few-Shot Object Detection for Truck Equipment: Distilling a few-shot state-of-the-art model for truck equipment detection utilizing stable diffusion for data augmentation
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc
local.programme: Data science and AI (MPDSC), MSc

Download

Original bundle

Name: E2_Master_s_Thesis.pdf
Size: 35.96 MB
Format: Adobe Portable Document Format
