Large-Scale Transformer-Based Multi-Target Tracking
Type
Master's Thesis
Program
Engineering mathematics and computational science (MPENM), MSc
Published
2024
Author
Spjuth, Oliver
Abstract
In military surveillance, radar-based tracking of objects is essential. The growing use
of small-scale drones, as seen in the Russia-Ukraine war, necessitates tracking at low
speeds. At these speeds, birds are also detected and the number of false detections
increases, making the already complex Multi-Target Tracking (MTT) problem more
challenging. Recent advances in machine learning, particularly the transformer
architecture, present new opportunities to address these challenges, making it valuable
to explore their application in air surveillance contexts.
Although transformers have shown promise in related fields such as automotive
radar, adapting them to air surveillance presents specific hurdles. These include
managing the quadratic scaling of attention as the number of detections increases,
ensuring accurate state estimation across large continuous areas, and simultaneously
estimating a large number of targets.
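The quadratic scaling mentioned above can be illustrated with a small back-of-the-envelope calculation. This is only a sketch: the function names, the single attention head, the 4-byte (float32) scores, and the non-overlapping window of 500 detections are assumptions chosen for illustration, not values taken from the thesis.

```python
def attention_score_bytes(num_tokens: int, num_heads: int = 1,
                          bytes_per_score: int = 4) -> int:
    """Memory for one full attention-score matrix per head (n x n scores)."""
    return num_heads * num_tokens * num_tokens * bytes_per_score


def windowed_attention_score_bytes(num_tokens: int, window_size: int,
                                   num_heads: int = 1,
                                   bytes_per_score: int = 4) -> int:
    """Memory when tokens are split into non-overlapping windows (an assumed
    simplification of 'local contexts') and attention is computed per window."""
    full_windows, remainder = divmod(num_tokens, window_size)
    total = full_windows * attention_score_bytes(window_size, num_heads, bytes_per_score)
    # A final, smaller window holds any leftover tokens.
    total += attention_score_bytes(remainder, num_heads, bytes_per_score)
    return total


# Hypothetical sizes: 100,000 detections, windows of 500 detections.
full = attention_score_bytes(100_000)                    # grows as n^2
local = windowed_attention_score_bytes(100_000, 500)     # grows as n * window
```

Under these assumed sizes, the full score matrix needs 40 GB, while the windowed variant needs 0.2 GB, because the cost grows with the number of detections times the window size rather than with the square of the number of detections.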
To address these challenges, a four-module pipeline was developed. The first module
reduced attention complexity by generating local contexts of detections for parallel
processing. This was followed by a transformer-encoder-based false detection filter
(FDF) designed to eliminate false detections. Next, the original problem was partitioned into
independent subproblems using a graph-based clustering approach. One suggested
implementation utilized the attention scores from the FDF to construct edges between
detections (nodes). The Leiden algorithm, a community detection algorithm,
was then applied to identify clusters of related detections. These clusters were
subsequently processed in parallel by the final transformer-based MTT module.
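The partitioning idea described above, attention scores as edge weights between detections followed by clustering, can be sketched in a few lines. Note the substitution: to keep the example dependency-free, it uses thresholded connected components (via union-find) as a simple stand-in for the Leiden algorithm. The function name `cluster_by_attention`, the symmetrization by averaging, and the threshold value are all assumptions made for illustration.

```python
from typing import Dict, List


def cluster_by_attention(scores: List[List[float]], threshold: float) -> List[List[int]]:
    """Group detections whose symmetrized attention score reaches `threshold`.

    Stand-in for community detection: connected components, NOT Leiden.
    """
    n = len(scores)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # Attention is directional; average both directions (an assumption).
            weight = 0.5 * (scores[i][j] + scores[j][i])
            if weight >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two components

    clusters: Dict[int, List[int]] = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical 4-detection attention matrix (rows attend to columns).
example_scores = [
    [0.6, 0.4, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.7],
    [0.0, 0.1, 0.6, 0.3],
]
example_clusters = cluster_by_attention(example_scores, threshold=0.3)
```

With this toy matrix, detections 0 and 1 form one subproblem and detections 2 and 3 another. A community detection method such as Leiden would additionally be able to split weakly connected components, which plain connected components cannot do.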
This approach significantly reduced the initial memory demands of attention from
approximately 320 GB to 1.6 GB while maintaining performance across the pipeline.
The false detection filter achieved a balanced accuracy and F1 score of 99%, effectively
reducing the problem complexity. The attention-score-based partitioning
method accurately identified subproblems that were predominantly optimal (single-target)
or near-optimal.
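For reference, the two filter metrics named above have standard definitions in terms of confusion-matrix counts. The sketch below uses hypothetical counts; which class is treated as positive (true versus false detections) is an assumption here and is not stated in the abstract.

```python
def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of recall on the positive class and recall on the negative class."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (recall + specificity)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in count form."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts for illustration only (not results from the thesis).
example_bacc = balanced_accuracy(tp=8, fp=1, tn=9, fn=2)  # (0.8 + 0.9) / 2
example_f1 = f1_score(tp=8, fp=1, fn=2)                   # 16 / 19
```

Balanced accuracy is the more informative of the two when the classes are imbalanced, since it weights the recall of each class equally regardless of how many examples each class has.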
When evaluated using MTT metrics, the pipeline employing the attention-score-based
partitioning method demonstrated promising results, with few missed or false
detections and a total inference time of approximately 0.5 seconds for over 100,000
detections. The system scaled effectively with increased complexity and adapted
well to varying conditions.
Subject/keywords
transformer, multiple target tracking, data association, leiden algorithm, clustering, attention, radar tracking, radar, graph clustering