Towards Unknown Traffic Driving Pattern Discovery with Active Learning
Type
Master's thesis
Published
2021
Authors
Jarl, Sanna
Wennerblom, Julia
Abstract
The promise of autonomous vehicles is frequently discussed, and the traffic landscape
is expected to change drastically with autonomous driving (AD) technology. Rigorous testing is therefore essential for reliance on and trust in the system. The vast amounts of
labelled data needed for testing are not trivial to obtain. One potential approach
to this issue is Active Learning as its purpose is to produce a robust data set with
minimal human interaction. The aim of this project is to examine the effectiveness
of active learning for annotation of scenario data collected by a Volvo Cars Corporation (VCC) vehicle. Active learning trains a classifier on a small initial annotated
data set and uses it to determine which unlabelled data points need annotation by
a human. The classifier is then retrained with the updated annotated set until the
budget of queries is spent. In this study, active learning is performed on the latent space produced by multivariate Time Series t-Distributed Stochastic Neighbor
Embedding (mTSNE), Recurrent Autoencoder (RAE) and Variational Recurrent
Autoencoder (VRAE). Investigations are made into which embedding, classifier and
query strategy are most suitable for the task of performing active learning on VCC’s
trajectory data. A study is also performed on the impact of different degrees of class
imbalance in the data. Area Under the Curve (AUC) and F1 score with respect to
the number of queried points are used as measures of performance. In many cases, active
learning has proven an effective tool. We can conclude that the mTSNE embedding
with the Support Vector Machines algorithm (SVM) as a classifier outperforms the
other models, with both high AUC and F1 score in addition to a low run time and
high stability. Entropy querying is observed to be the most suitable query method. The
separability of the mTSNE-generated latent space provides a less complex model,
although the mTSNE transformation itself is computationally very heavy. RAE also
performs well, though combined with a Neural Network (NN) it struggles with detecting the smaller class as the class imbalance increases. VRAE proves to be a
suboptimal choice of embedding, since it performs worse than the other two. We
conclude that for mTSNE, 50 queries are sufficient to reach a high AUC and F1 score
for most class imbalances, and for RAE, that number is 125. The potential of active
learning to act as an unknown class detector was also investigated using RAE and
VRAE embedded data. Cut-in was regarded as the unknown class, and performance
was measured in terms of the number of queried cut-ins. The results show that for a
budget of up to 200 queries, RAE with an SVM classifier queries the most cut-ins,
while for larger budgets VRAE with SVM queries the most cut-ins.
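
The active learning loop with entropy querying described in the abstract can be sketched as follows. This is a minimal illustration and not the thesis implementation: the two-dimensional synthetic data stands in for an embedded latent space, the `CentroidClassifier` is a hypothetical toy stand-in for the SVM and NN classifiers, and the names `active_learning`, `oracle` and `make_data` are assumptions introduced here.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class CentroidClassifier:
    """Toy stand-in classifier (assumption, not the thesis model):
    softmax over negative squared distances to the class centroids."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = []
        for c in self.classes_:
            members = [x for x, label in zip(X, y) if label == c]
            dim = len(members[0])
            self.centroids_.append(
                [sum(m[d] for m in members) / len(members) for d in range(dim)]
            )
        return self

    def predict_proba(self, x):
        logits = [-sum((a - b) ** 2 for a, b in zip(x, c)) for c in self.centroids_]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


def active_learning(X, oracle, initial_idx, budget):
    """Pool-based active learning with entropy querying.

    `oracle(i)` plays the human annotator and returns the label of point i.
    Returns the final classifier and the queried indices in query order.
    """
    labels = {i: oracle(i) for i in initial_idx}
    pool = [i for i in range(len(X)) if i not in labels]
    clf = CentroidClassifier()
    queried = []
    for _ in range(budget):
        idx = list(labels)
        clf.fit([X[i] for i in idx], [labels[i] for i in idx])
        if not pool:
            break
        # Query the pool point whose predicted class distribution is most uncertain.
        pick = max(pool, key=lambda i: entropy(clf.predict_proba(X[i])))
        pool.remove(pick)
        labels[pick] = oracle(pick)
        queried.append(pick)
    # Retrain once more so the last queried label is also used.
    idx = list(labels)
    clf.fit([X[i] for i in idx], [labels[i] for i in idx])
    return clf, queried


def make_data(n_per_class=100, seed=0):
    """Two synthetic Gaussian classes standing in for embedded trajectories."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in enumerate([(-2.0, 0.0), (2.0, 0.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)))
            y.append(label)
    return X, y


X, y = make_data()
# One initial annotated point per class, then a budget of 20 queries.
clf, queried = active_learning(X, lambda i: y[i], initial_idx=[0, 100], budget=20)
```

In a setup like the one the abstract describes, `X` would hold the mTSNE, RAE or VRAE embeddings, the oracle would be the human annotator, and AUC and F1 score would be recorded after each retraining step.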
Subject/keywords
Active Learning, Unknown Detection, Annotation, Time Series Analysis, mTSNE, SVM, Neural Network, query