Synthetic Data Generation Techniques for Automotive Machine Learning

Fredriksson, Jonny; Durgé, Rasmus

Download

Masters_thesis_Jonny_Fredriksson_Rasmus_Durge_2023.pdf (7.51 MB)

Published

2023

Authors

Fredriksson, Jonny

Durgé, Rasmus

Type

Master's Thesis

Program

Complex adaptive systems (MPCAS), MSc
Engineering mathematics and computational science (MPENM), MSc

Abstract

Seat belts drastically reduce the risk of injury or death, provided they are worn correctly. This thesis stems from Volvo Cars' aspiration to address this risk using the growing potential of machine learning. It builds on an earlier thesis at Volvo Cars, in which a semantic segmentation model was developed to identify and segment the seat belt in an image of a car occupant. Applying this segmentation approach requires the tedious and costly process of collecting and annotating data. The thesis explores the use of synthetic data, i.e., data generated by software and annotated in silico, as both a substitute for and a complement to previously collected real-world data. Specifically, it explores different methods for generating and applying synthetic data, and which aspects improve its quality with respect to the prediction accuracy of the segmentation model. Prediction accuracy is measured as the mean intersection over union (IoU) over a test set of real-world images. Several segmentation models with different architectures are evaluated to find the best-performing network. The thesis also explores domain randomization, which aims to narrow the domain gap between synthetic and real data; multiple label annotations, to investigate whether identifying other objects improves segmentation of the seat belt; and guided backpropagation, to explain predictions made by the segmentation model.

The results show that, although the choice of network architecture has a relatively small effect on performance, the top-performing network combines a Unet++ decoder with a ResNet 34 encoder. The thesis also suggests that when real-world data is scarce, introducing synthetic data can improve prediction accuracy, both by training the model on a mix of real and synthetic data and by pre-training the model on synthetic data before training it on real data. The results further suggest that when the model is trained to also identify objects that often interact with the seat belt, e.g., the occupant's shirt, its prediction accuracy on the seat belt can improve. The thesis identifies ways to make synthetic data more appropriate for training the seat belt segmentation model and demonstrates that there is considerable potential to further develop the application of synthetic data. One obvious approach would be to use a more powerful graphics engine, making the synthetic data even more realistic.
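To illustrate the evaluation metric named in the abstract, the following is a minimal sketch of mean IoU over a set of binary seat belt masks. It is not taken from the thesis: the function names `iou` and `mean_iou`, the nested-list mask representation, and the convention that two empty masks score 1.0 are all assumptions made for this example, and the thesis may average differently (for instance per class).

```python
from itertools import chain
from typing import Sequence

# Hypothetical representation: a 2D binary mask, 1 = seat belt, 0 = background.
Mask = Sequence[Sequence[int]]


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks with the same shape."""
    p = list(chain.from_iterable(pred))
    t = list(chain.from_iterable(target))
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    union = sum(1 for a, b in zip(p, t) if a or b)
    # Assumption: if neither mask contains the seat belt, count it as a match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average per-image IoU over a test set."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("need equally many, and at least one, masks")
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Two toy 2x2 images: the first prediction over-segments by one pixel,
# the second matches its ground truth exactly.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
print(mean_iou(preds, targets))
```

In the first toy image the masks share one pixel out of two in their union, so its IoU is 0.5. The second is a perfect match, which gives a mean of 0.75.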

Subject/keywords

Seat belt, car occupant, semantic segmentation, neural networks, augmentations, synthetic data, domain gap

URI

http://hdl.handle.net/20.500.12380/306404

Collections

Master's theses
