Multi-Object Tracking Based on Multi-Sensor Multi-Bernoulli Densities Fusion
Master's thesis in Systems, Control and Mechatronics
JINGKAI ZHOU, QINGHUAN LIU
Department of Electrical Engineering, Division of Signal Processing and Biomedical Engineering
Chalmers University of Technology, Gothenburg, Sweden 2023
© JINGKAI ZHOU, QINGHUAN LIU 2023.
Supervisors: Yuxuan Xia, Department of Electrical Engineering; Daniel Johansson, Felix Gustavsson, Alexander Greger, Volvo Trucks
Examiner: Lennart Svensson, Department of Electrical Engineering
Master's Thesis 2023, Department of Electrical Engineering, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden. Telephone +46 31 772 1000. Typeset in LaTeX, template by Kyriaki Antoniadou-Plytaria.

Abstract
Advanced Driver Assistance Systems (ADAS) utilize multiple sensors to help drivers deal with complicated traffic situations. In practice, we need multi-sensor algorithms that can accurately estimate object states, such as positions and velocities, while operating under a limited computational budget. In this thesis, we propose a multi-sensor fusion algorithm using multi-Bernoulli densities to obtain the fused multi-object densities and estimate the object trajectories.
The density of each object is obtained via single-object state fusion, using methods such as covariance intersection and safe fusion. First, we perform data association to associate the densities from different sensors that correspond to the same object. Then, based on the most likely association, the multi-object fusion problem can be decomposed into several single-object state fusions. Once the fused multi-object densities are obtained, object trajectories, if desired, can be constructed by backward simulation without object label information. The efficacy of this work has been validated using simulation data as well as real data collected by a Volvo testing truck. The results show that our proposed algorithm associates data accurately, in the sense of fewer missed and false detections, achieves accurate multi-sensor multi-object fusion, and has an acceptable response time.

Keywords: Multi-object tracking, Data association, Sensor fusion, Trajectory estimation.

Acknowledgements
We would like to thank our supervisor, Yuxuan Xia, at Chalmers University of Technology, for his valuable advice and comprehensive discussions with us during the thesis. We would also like to thank our supervisors, Daniel Johansson, Felix Gustavsson and Alexander Greger, at Volvo Trucks. They have given us important empirical advice on data processing and have processed large amounts of complex data, without which our thesis would have taken much longer. We would like to thank Shenting Xie and Sixuan Pei for being our opponents at our final presentation and giving us valuable feedback on our thesis. Finally, we would like to express our sincere gratitude to our examiner Lennart Svensson for his suggestions, which have contributed to the rigour of this report.
Jingkai Zhou & Qinghuan Liu, Gothenburg, June 2023

List of Acronyms
Below is the list of acronyms that have been used throughout this thesis, in alphabetical order:

AA Arithmetic Average
ADAS Advanced Driver Assistance Systems
CC Cross Covariance
CF Centralized Fusion
CI Covariance Intersection
CT Coordinate Turn
CV Constant Velocity
DBSCAN Density-Based Spatial Clustering of Applications with Noise
EKF Extended Kalman Filter
FoVs Fields of View
GOSPA Generalized Optimal Sub-Pattern Assignment Metric
KF Kalman Filter
KLD Kullback-Leibler Divergence
MAP Maximum a Posteriori
MB Multi-Bernoulli
MBM Multi-Bernoulli Mixture
MCMC Markov Chain Monte Carlo
MOT Multi-Object Tracking
NLL Negative Log-Likelihood
PDF Probability Density Function
PMB Poisson Multi-Bernoulli
PMBM Poisson Multi-Bernoulli Mixture
PPP Poisson Point Process
RFSs Random Finite Sets
RTS Rauch-Tung-Striebel
SF Safe Fusion
SOT Single-Object Tracking
SVD Singular Value Decomposition
T2TF Track-to-Track Fusion

Nomenclature
Below is the nomenclature of indices, sets, parameters, and variables that have been used throughout this thesis.
Indices
i, j Indices for objects, trajectories, hypotheses and weight matrices
m, n, u, v Indices for entries in a matrix
k Time instance

Sets
X Set of trajectories
π Multi-trajectory density
γ Assignment between estimates and ground truth
Γ Set of all possible assignments between estimates and ground truth

Parameters
A Dynamic matrix
Q Dynamic noise matrix
H Measurement matrix
R Measurement noise matrix
c Maximum allowable localization error in GOSPA
α Factor for the cardinality mismatch penalty in GOSPA
p Factor for the outlier penalty in GOSPA

Variables
x Object state
P Object state's uncertainty
q Object dynamic noise
y Measurement of object's state
e Measurement noise
K Kalman gain
C, C′ Cost matrices for solving the 2D assignment problem
a Activator of each entry in the cost matrix (can only be 1 or 0)
w Weight of the local hypothesis
r Existence probability of the local hypothesis
X Single object trajectory
p Single-trajectory density of the local hypothesis

Contents
List of Acronyms ix
Nomenclature xi
List of Figures xv
List of Tables xvii
1 Introduction 1
1.1 Background 1
1.2 Related Work 2
1.2.1 T2TF 2
1.2.2 Data Association Techniques 2
1.2.3 Deep Learning-based Approaches 3
1.3 Aim 3
1.4 Scope and Limitations 3
1.5 Thesis Outline 4
2 Theory 5
2.1 State Space Modelling 5
2.1.1 Object State Representation 5
2.1.2 Motion Model 5
2.1.3 Measurement Model 6
2.2 Bayesian Estimation 6
2.2.1 Prediction, Update, and Smoothing 7
2.2.2 Kalman Filter 8
2.2.3 Backward Simulation for Smoothing 9
2.3 Random Finite Sets 9
2.3.1 Poisson Point Process 10
2.3.2 Bernoulli RFS 10
2.3.3 Multi-Bernoulli and Multi-Bernoulli Mixture 10
2.3.4 Poisson Multi-Bernoulli Mixture 11
2.3.5 PMBM filter and PMB filter 11
2.4 Set of Trajectories 11
3 Methods 13
3.1 Sensory Information 14
3.2 Multi-object Fusion 14
3.2.1 Pre-processing 14
3.2.2 Data Association 16
3.3 Single-object Density Fusion Algorithms 18
3.3.1 Covariance Intersection 18
3.3.2 Safe Fusion 19
3.3.3 Arithmetic Average Fusion 20
3.3.4 Cross-covariance Fusion 21
3.4 Multi-object Trajectory Estimation 23
3.4.1 Backward Simulation for Sets of Trajectories 23
3.4.2 Gaussian Implementation for Backward Simulation 25
3.5 Evaluation 26
3.5.1 GOSPA 27
3.5.2 NLL 27
4 Simulation Results 29
4.1 Simulation Setting 29
4.1.1 Sensor FoVs Setup 29
4.1.2 Ground Truth Creation 29
4.1.3 Object Estimate Densities Creation 31
4.2 MOT Simulation Evaluation 31
4.2.1 Simulation Result with Different Number of Objects 31
4.2.2 Simulation Result with Different Overlapping Areas 32
4.2.3 Simulation Result with Different Motion Models 33
4.2.4 Simulation Result with a Combination of Low-quality Sensors and High-quality Sensors 34
4.3 Trajectory Estimation 34
5 Experimental Results 37
5.1 Real-world Sensor Data 37
5.2 Experimental Evaluation 37
5.2.1 Fusion Result Visualization 37
5.2.2 GOSPA Evaluation Result 38
5.2.3 NLL Evaluation Result 38
5.2.4 Computational Complexity 40
5.3 Trajectory Estimation 40
6 Discussion 43
6.1 Simulation and Experimental Comparison 43
6.2 Choice of Single-object Fusion Method 43
6.3 Simplification in Real-world Applications 44
6.4 Future Work 45
7 Conclusion 47
Bibliography 49

List of Figures
2.1 A Bayesian network with motion and measurement processes that are conditionally dependent. 7
2.2 Relationship between state's prediction, update, and smoothing in time sequence. 7
3.1 A diagram of the fusion structure containing clustering, data association, single-object fusion and trajectory estimation, where each circle represents a single-object density. 14
3.2 The sensor layout. There are six sensors on the testing truck, and the circular segments represent the FoVs of the sensors. FLR: front-looking radar, FLC: front-looking camera, LF: left-front-looking radar, LR: left-rear-looking radar, RF: right-front-looking radar, RR: right-rear-looking radar. 15
3.3 An example of the fusion map, where each circle represents an object density and each double arrow indicates a mapping between one pair of object densities from two different sensors. 16
3.4 Illustration of the CI fusion method. 18
3.5 Illustration of the SF result. 20
3.6 Illustration of the AA result. 21
3.7 Illustration of the fusion result with two different correlation coefficients. 22
3.8 An example of backward simulation results (dashed lines). There are 4 trajectories at time k+1 and 5 objects at time k, where trajectory 1 ends at time k; trajectories 2–4 are updated by the objects at time k; trajectories 5 and 6 initialize two new trajectories; and all trajectories starting after time k+1 remain unaltered. 24
4.1 The sensor layout contains nine different blocks: four blocks at the corners covered by only one sensor, one block at the centre covered by all four sensors, and the remaining four blocks covered by two sensors. 30
4.2 Examples of randomly generated trajectories in four different simulation settings. 32
4.3 Two different settings for the sensors' FoV overlapping areas. 33
4.4 Trajectory estimations of 5 objects, where the green dots represent the fusion results, the blue line segments denote the trajectory estimates, and the ground truth trajectories are marked as red line segments. 35
4.5 Trajectory estimations of 5 objects with random birth and random death. 36
4.6 Trajectory estimations of 5 objects, with random birth and death and the objects placed closer together. 36
5.1 The scenario in the real-world testing data, where a car overtakes the testing truck (blue rectangle) on the road. 38
5.2 The fusion results from four different single-object fusion algorithms. 39
5.3 GOSPA values over 61 time instances for four different single-object fusion algorithms. 40
5.4 NLL values over 61 time instances for four different single-object fusion algorithms. 41
5.5 The green dots are the fusion results, the blue line segments are the trajectory estimates, and the ground truth trajectories are marked as red line segments. 42
6.1 Inaccurate sensor densities at two specific time instances. 44

List of Tables
4.1 GOSPA and NLL values of four implementations with 5 objects. 31
4.2 GOSPA and NLL values of four implementations with 20 objects. 31
4.3 GOSPA and NLL values of four implementations in a scenario with a smaller overlapping area. 33
4.4 GOSPA and NLL values of four implementations in a scenario with a larger overlapping area. 32
4.5 GOSPA and NLL values of four implementations with CT model and 5 objects. 33
4.6 GOSPA and NLL values of four implementations with CT model and 10 objects. 34
4.7 GOSPA and NLL values of four implementations with CT model and 15 objects. 33
4.8 GOSPA and NLL values of four implementations with CT model and 20 objects. 34
4.9 GOSPA and NLL values of four implementations with a combination of low-quality and high-quality sensors. 34
5.1 Average GOSPA value of four single-object fusion algorithms. 38
5.2 Average NLL value of four single-object fusion methods. 38
5.3 Average time consumption in the simulations. 40
6.1 GOSPA and NLL values if camera data are ignored. 45

1 Introduction
This chapter provides an introduction to the thesis. Following that, the aim and objectives are stated to clarify the specific problem that the thesis seeks to address and the approach to be taken. Additionally, a review of related work is conducted, discussing the strengths and weaknesses of existing approaches and establishing the relevance of this thesis in relation to them. Furthermore, a comprehensive explanation of the limitations and scope of this thesis is provided. Lastly, the final section offers an outline of the thesis, providing a structured overview of its subsequent sections and content.
1.1 Background
Advanced Driver Assistance Systems (ADAS) are technologies that help drivers control and navigate their vehicles more safely and comfortably. ADAS use sensors such as cameras and radars to monitor the surroundings, detect hazards, and provide timely warnings or automated actions. ADAS features include adaptive cruise control, lane departure warning, forward collision warning, and driver monitoring systems. Within these features, the fusion of multi-object densities for multi-object tracking (MOT) is the key technique.

The vehicle's perceptual capabilities can be enhanced by using multiple sensors, enabling a more advanced perception system. In the fusion of multi-object densities, two architectures are commonly employed: Centralized Fusion (CF) and Track-to-Track Fusion (T2TF) [1]. The CF architecture often faces computational limitations, as it requires the central processor to handle all raw data. In contrast, the T2TF architecture integrates track estimates, typically in the form of object densities with ID information, obtained from processing each sensor's detections instead of directly incorporating raw detections [2].

In the fusion of multi-object densities, it is important to avoid double-counting the information from different sensors. Additionally, effectively handling densities originating from different fields of view (FoVs) poses a significant challenge. These requirements give rise to a data association problem, which can be approached by solving a 2D assignment problem. By finding the optimal assignment with the minimum total assignment cost, an object detected by one sensor can be mapped to its corresponding object detected by another sensor. This assignment is commonly referred to as a "fusion map" [3]. According to the best fusion map, the assigned densities from different sensors can be combined into a fused object density.
Several widely employed methods for single-object fusion exist, such as cross-covariance, covariance intersection, and covariance union [4]. These methods facilitate the integration of information from different sensors, resulting in a consolidated representation of the object.

In the context of MOT, it is insufficient to solely fuse the (unlabeled) object densities within individual time instances. Estimating these objects' trajectories over a specific time interval is also necessary. In single-sensor tracking scenarios, trajectories are typically inferred based on object labels. However, this approach encounters challenges when dealing with multiple sensors, as the same object may be assigned different labels by different sensors, a problem commonly known as label mismatch. To overcome this issue, it is crucial to develop a trajectory estimation method capable of handling sets of unlabeled multi-object densities. Our objective is to construct a flexible and less restrictive fusion framework for the MOT problem.

1.2 Related Work
This section covers several widely used methods related to data association and multi-object density fusion.

1.2.1 T2TF
T2TF has emerged as a promising approach for improving the robustness and accuracy of multi-sensor MOT systems. T2TF techniques aim to fuse information from multiple object tracks to enhance object association and tracking performance. T2TF relies on single-object fusion methods, several of which can be implemented, e.g. covariance intersection [5] [6] [7], safe fusion [8] [9], arithmetic average fusion [10] [11] and cross-covariance fusion [7] [12]. In [7], covariance intersection and cross-covariance fusion are compared. The simulation results show that cross-covariance fusion returned the best fusion quality w.r.t. root mean square error, and its runtime is also the lowest.
In [9], safe fusion is compared to covariance intersection, and the results show that safe fusion obtains estimates with better consistency.

1.2.2 Data Association Techniques
Data association techniques play a crucial role in T2TF. Most data association problems are treated as 2D linear assignment problems, where the best assignment can be found using, e.g., the Hungarian algorithm [13]. The main differences between these assignment problems lie in the formation of the cost matrix. In addition, as seen in [14], the association problem can also incorporate information on object labels. Moreover, some clustering methods can also be implemented for data association. These techniques all aim to associate the data using the information within single time instances. Using historical information, the Multisensor Track-To-Track Association method [15] associates tracks based on the concept of "historical distance". The historical distance between two tracks incorporates past estimates to make the track association more reasonable. The algorithm clusters the tracks that correspond to the same object by finding the minimal historical distance in an iterative fashion.

1.2.3 Deep Learning-based Approaches
Deep learning-based approaches have also been applied to T2TF in MOT [16]. These methods leverage the power of deep neural networks to learn feature representations and perform track fusion. In the survey [17], track associations are determined based on similarity measures or distance metrics. Deep learning-based T2TF methods have shown promising results in handling complex MOT scenarios and achieving state-of-the-art performance. Aiming at the fusion of multiple trajectory densities using deep learning, a transformer-based fusion method is introduced to output a set of fused global object densities [18].
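To make the 2D assignment formulation from Section 1.2.2 concrete, the sketch below solves a small assignment problem by exhaustive search in Python. The cost values and the brute-force solver are illustrative only; in practice the Hungarian algorithm [13] solves the same problem in polynomial time, and the cost entries would come from a statistical distance between track densities.

```python
from itertools import permutations

def best_fusion_map(cost):
    """Exhaustively solve a small 2D assignment problem.

    cost[i][j] is the association cost between track i of sensor A and
    track j of sensor B (e.g. a Mahalanobis-type distance). Returns
    (assignment, total_cost) where assignment[i] = j. Brute force is
    only for illustration; the Hungarian algorithm scales polynomially.
    """
    n = len(cost)
    best, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best, best_cost = perm, c
    return list(best), best_cost

# Three tracks per sensor; low costs mark the likely same-object pairs.
cost = [[0.2, 4.0, 6.0],
        [5.0, 0.4, 7.0],
        [6.0, 8.0, 0.1]]
assignment, total = best_fusion_map(cost)
# assignment pairs each sensor-A track with its minimum-cost partner
```

The resulting `assignment` is exactly the "fusion map" of [3]: a one-to-one mapping between the track lists of two sensors that minimizes the total association cost.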
1.3 Aim
The aim of this thesis is to develop a fusion structure for multi-sensor multi-Bernoulli densities that enables the tracking of multiple objects. The structure incorporates techniques such as data association, single-object density fusion, and object trajectory estimation. Firstly, the fusion structure needs to accurately and effectively estimate the densities of multiple objects in the surrounding environment. These densities, describing the moving objects within (different) sensor FoVs, are then fused together under the correct data association. Secondly, we evaluate four single-object fusion methods using appropriate evaluation metrics and select the most effective one for subsequent trajectory estimation tasks. Finally, utilizing the fused multi-object densities, we employ backward simulation to estimate the smoothed trajectories of the objects.

1.4 Scope and Limitations
This thesis is subject to certain constraints, which may affect its outcomes. Firstly, the raw data from the sensors are inaccessible, limiting the project to the domain of T2TF. Secondly, the trajectory estimation can only be performed partially online: at any time of interest within a time interval, trajectory estimation requires that the fused multi-object densities in this time interval are all accessible. Moreover, only moving objects are considered when processing the sensor and reference data, while stationary objects are excluded through pre-processing. Additionally, the proposed structure operates under the assumption of time synchronization of the sensor data, achieved during an inaccessible pre-processing step.

1.5 Thesis Outline
The remainder of the thesis is structured in the following manner. Chapter 2 introduces the relevant theories employed in our thesis, including Bayesian estimation theory and random finite set theory.
Chapter 3 details the methodologies employed in our fusion structure, encompassing the pre-processing step, data association methods, single-object fusion techniques, trajectory estimation methods, and evaluation methods. Chapter 4 outlines the simulation setups and evaluation results under different scenarios. Chapter 5 presents the results obtained from real-world reference data. Subsequently, Chapter 6 discusses the results of the simulation and real-world testing. Finally, Chapter 7 presents the conclusion.

2 Theory
This chapter presents the background theory, including the state space representation, Bayesian estimation, and the object density representations related to MOT algorithms.

2.1 State Space Modelling
State space representation is a mathematical modelling technique used in control theory, physics, engineering, and many other fields to describe the behaviour of a system over time. In this thesis, an object's behaviour is represented by a state vector that describes the object's current state and how this object evolves over time.

2.1.1 Object State Representation
In this thesis, we would like to estimate multiple objects' positions and velocities in two dimensions under Cartesian coordinates. Therefore, the state $x_k$ of an object at time $k$ is represented by a Gaussian density with mean vector and covariance matrix

$$\mu_k = \begin{bmatrix} x & \dot{x} & y & \dot{y} \end{bmatrix}^\top, \qquad P_k = \mathrm{diag}\left(\begin{bmatrix} \sigma_x & \sigma_{\dot{x}} & \sigma_y & \sigma_{\dot{y}} \end{bmatrix}\right), \tag{2.1}$$

where $\mu_k$ represents the mean of this state density, and $P_k$ represents its uncertainty. In addition, we denote the set of objects existing at time $k$ as $\mathbf{x}_k$.

2.1.2 Motion Model
Given the state of an object at time instance $k$, the object state at the next time instance $k+1$ can be predicted using a motion model. In general, this motion model can be represented as a (possibly nonlinear) function

$$x_{k+1} = f\left(x_k, u_k, q_k\right), \tag{2.2}$$

where $f$ is the dynamic function, $u_k$ is the control input at time $k$, and $q_k$ represents the noise input.
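As a concrete, hedged illustration of the motion model (2.2), the sketch below propagates a 2D constant-velocity (CV) object with state $[x, \dot{x}, y, \dot{y}]$ over time in Python. The time step `dt` and the process-noise level `sigma_q` are illustrative values, not parameters taken from the thesis.

```python
import random

def cv_predict(state, dt=0.1, sigma_q=0.05):
    """One step of the motion model (2.2) for a 2D constant-velocity
    object with state [x, vx, y, vy]. sigma_q is an illustrative
    process-noise standard deviation acting on the velocities."""
    x, vx, y, vy = state
    vx += random.gauss(0.0, sigma_q)   # noise input q_k on velocity
    vy += random.gauss(0.0, sigma_q)
    return [x + dt * vx, vx, y + dt * vy, vy]

random.seed(0)
traj = [[0.0, 1.0, 0.0, 0.5]]          # start at origin, v = (1, 0.5) m/s
for _ in range(50):
    traj.append(cv_predict(traj[-1]))
# after ~5 s the object has drifted roughly 5 m in x and 2.5 m in y
```

The same function, with a different `f`, would implement the coordinate-turn (CT) model used later in the simulations.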
For linear and Gaussian motion models, (2.2) can be represented in the matrix form

$$x_{k+1} = A_k x_k + B_k u_k + q_k, \qquad q_k \sim \mathcal{N}\left(0, Q_k\right), \tag{2.3}$$

where $A_k$ is the dynamic matrix at time $k$, $B_k$ is the input matrix, and $Q_k$ denotes the noise matrix.

2.1.3 Measurement Model
A measurement model is a representation of the state of the system being observed, which is used to estimate the state of the system based on the sensor measurements obtained from the environment. The measurement model at time $k$ is generally given as

$$y_k = h\left(x_k, e_k\right), \tag{2.4}$$

where $h$ represents the function that transforms states into measurements, and $e_k$ represents the measurement noise. Under the assumption of linearity and Gaussian distributions, the matrix representation of the measurement process is

$$y_k = H x_k + e_k, \qquad e_k \sim \mathcal{N}\left(0, R_k\right), \tag{2.5}$$

where $H$ is the measurement matrix, which maps the object state space into the measurement space, and $R_k$ is the measurement noise matrix.

2.2 Bayesian Estimation
In object tracking, Bayesian estimation is a common approach used to estimate the state of a moving object based on noisy sensor measurements. To apply Bayesian estimation in MOT, the probabilistic models of the object's motion should first be developed, i.e. the motion and measurement densities introduced in Subsections 2.1.2 and 2.1.3. It is assumed that we have a prior distribution over the object state, which captures the knowledge about its probable location and motion before observing new measurements. As new measurements come in, the object's prior distribution is updated using Bayes' rule. This results in a posterior distribution that describes the updated probability of the object's state incorporating the measurement information. This posterior updating process can be represented as a Bayesian network. In Figure 2.1, we make two assumptions. First, the current state at time $k$ is only dependent on the previous state at time $k-1$.
That is, the object moves according to a Markov process. Second, the current measurement is only dependent on the current state. Therefore, $p(x_k|x_{1:k-1})$ and $p(y_k|x_{1:k})$ can be, respectively, simplified as

$$p\left(x_k|x_{1:k-1}\right) = p\left(x_k|x_{k-1}\right), \tag{2.6}$$

and

$$p\left(y_k|x_{1:k}\right) = p\left(y_k|x_k\right). \tag{2.7}$$

Figure 2.1: A Bayesian network with motion and measurement processes that are conditionally dependent.
Figure 2.2: Relationship between state's prediction, update, and smoothing in time sequence.

2.2.1 Prediction, Update, and Smoothing
The Bayes filter recursively computes the object filtering density by performing prediction and update. After obtaining all the object filtering estimates in a given time interval, Bayesian smoothing uses the motion model to further improve the accuracy of the object state estimates at earlier time steps. Figure 2.2 shows the time-relation diagram of prediction, update, and smoothing.

In the prediction step, the object state is predicted given the previous posterior density $p(x_{k-1}|y_{1:k-1})$ at time $k-1$ using the Chapman-Kolmogorov equation. The explicit expression is

$$p\left(x_k|y_{1:k-1}\right) = \int p\left(x_{k-1}, x_k|y_{1:k-1}\right) \mathrm{d}x_{k-1} = \int p\left(x_k|x_{k-1}, y_{1:k-1}\right) p\left(x_{k-1}|y_{1:k-1}\right) \mathrm{d}x_{k-1} = \int p\left(x_k|x_{k-1}\right) p\left(x_{k-1}|y_{1:k-1}\right) \mathrm{d}x_{k-1}, \tag{2.8}$$

where $p(x_k|x_{k-1})$ describes the motion model (2.3).

After prediction, the object state is updated with the measurement $y_k$ received at time $k$ using Bayes' rule. The measurement update equation is

$$p\left(x_k|y_{1:k}\right) = p\left(x_k|y_k, y_{1:k-1}\right) = \frac{p\left(y_k|x_k, y_{1:k-1}\right) p\left(x_k|y_{1:k-1}\right)}{p\left(y_k|y_{1:k-1}\right)} \propto p\left(y_k|x_k\right) p\left(x_k|y_{1:k-1}\right), \tag{2.9, 2.10}$$

where $p(y_k|x_k)$ is the measurement likelihood parameterized by (2.5), $p(y_k|y_{1:k-1})$ is the normalizing constant, and $p(x_k|y_{1:k-1})$ is the predicted density obtained from (2.8).

After performing recursive prediction and update using all the measurements in the time interval $1:T$, smoothing seeks to find the density $p(x_k|y_{1:T})$ at time $k$, where $k \leq T-1$.
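The prediction step (2.8) and update step (2.10) can be illustrated with a one-dimensional grid-based Bayes filter in Python; the grid spacing, noise levels, and measurement value below are illustrative assumptions, not parameters from the thesis.

```python
import math

def gaussian(x, mu, sigma):
    """Gaussian density N(x; mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

grid = [i * 0.1 for i in range(-50, 51)]        # 1D position grid
prior = [gaussian(x, 0.0, 1.0) for x in grid]   # p(x_{k-1} | y_{1:k-1})

# Prediction (2.8): Chapman-Kolmogorov as a discrete sum, with a
# random-walk motion model p(x_k | x_{k-1}) = N(x_k; x_{k-1}, 0.5^2).
pred = [sum(gaussian(xk, xk1, 0.5) * p for xk1, p in zip(grid, prior))
        for xk in grid]

# Update (2.10): multiply by the likelihood p(y_k | x_k) and normalize.
y = 0.8                                         # illustrative measurement
post = [gaussian(y, xk, 0.3) * p for xk, p in zip(grid, pred)]
norm = sum(post)
post = [p / norm for p in post]

mode = grid[max(range(len(grid)), key=lambda i: post[i])]
# the posterior mode moves from the prior mean 0 toward the measurement y
```

This is exactly the recursion the Kalman filter computes in closed form when the models are linear and Gaussian.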
Some common approaches to smoothing are RTS smoothing, two-filter smoothing and backward simulation. In this thesis, we perform backward simulation by recursively drawing samples from the backward kernel. By the Markov property, given $x_{k+1}$, $x_k$ is independent of the measurement sequence $y_{k+1:T}$. As a consequence, the backward smoothing kernel $p(x_k|x_{k+1}, y_{1:T})$ can be written as $p(x_k|x_{k+1}, y_{1:k})$, which can be further expressed as

$$p\left(x_k|x_{k+1}, y_{1:k}\right) = \frac{p\left(x_k, x_{k+1}|y_{1:k}\right)}{p\left(x_{k+1}|y_{1:k}\right)} = \frac{p\left(x_{k+1}|x_k, y_{1:k}\right) p\left(x_k|y_{1:k}\right)}{p\left(x_{k+1}|y_{1:k}\right)} = \frac{p\left(x_{k+1}|x_k\right) p\left(x_k|y_{1:k}\right)}{p\left(x_{k+1}|y_{1:k}\right)}. \tag{2.11}$$

2.2.2 Kalman Filter
The Kalman filter (KF) provides the Bayesian optimal filtering solution for linear and Gaussian models. In Kalman filtering, both the predicted and filtering densities are Gaussian, parameterized by mean $\hat{x}_{k|k'}$ and covariance $P_{k|k'}$, where $k' = k-1$ for prediction and $k' = k$ for update. The prediction step is given by

$$\hat{x}_{k|k-1} = A \hat{x}_{k-1|k-1}, \tag{2.12}$$
$$P_{k|k-1} = A P_{k-1|k-1} A^\top + Q. \tag{2.13}$$

The update step is given as

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k v_k, \tag{2.14}$$
$$P_{k|k} = P_{k|k-1} - K_k S_k K_k^\top, \tag{2.15}$$
$$K_k = P_{k|k-1} H_k^\top S_k^{-1}, \tag{2.16}$$
$$v_k = y_k - H_k \hat{x}_{k|k-1}, \tag{2.17}$$
$$S_k = H_k P_{k|k-1} H_k^\top + R_k, \tag{2.18}$$

where $K_k$ is the Kalman gain, $v_k$ is the innovation, and $S_k$ is the innovation covariance.

2.2.3 Backward Simulation for Smoothing
In a time interval $1:T$, smoothing makes use of the measurement information in the entire time interval. One approach to performing smoothing is backward simulation, which employs a forward-backward idea. First, the object filtering densities are obtained in the forward filtering direction. Then a particle representation of the smoothed object densities is obtained by sequentially drawing samples from a backward smoothing kernel. In general, the joint posterior density $p(x_{1:T}|y_{1:T})$ can be factorized as

$$p\left(x_{1:T}|y_{1:T}\right) = \left(\prod_{k=1}^{T-1} p\left(x_k|x_{k+1}, y_{1:k}\right)\right) p\left(x_T|y_{1:T}\right), \tag{2.19}$$

where $p(x_k|x_{k+1}, y_{1:k})$ is the backward kernel.
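For a scalar linear-Gaussian model, both the forward Kalman recursion (2.12)-(2.18) and the backward kernel in (2.19) have closed forms, so the factorization can be sketched end-to-end in a few lines of Python. The model parameters and measurement sequence below are illustrative assumptions.

```python
import random

# Scalar linear-Gaussian model: x_{k+1} = a x_k + q_k, y_k = x_k + e_k.
a, q, r = 1.0, 0.1, 0.5          # illustrative parameters (H = 1)
random.seed(1)

# Forward Kalman filtering: store filtered means and variances.
xs, Ps = [], []
x, P = 0.0, 1.0                  # prior
ys = [0.2, 0.5, 0.4, 0.9, 1.1]   # illustrative measurements
for y in ys:
    xp, Pp = a * x, a * P * a + q            # predict, (2.12)-(2.13)
    K = Pp / (Pp + r)                        # Kalman gain, scalar H = 1
    x, P = xp + K * (y - xp), Pp - K * Pp    # update, (2.14)-(2.15)
    xs.append(x); Ps.append(P)

# Backward simulation: draw one smoothed sample path from the
# factorization (2.19), sampling the backward kernel from time T-1 to 1.
sample = [random.gauss(xs[-1], Ps[-1] ** 0.5)]
for k in range(len(ys) - 2, -1, -1):
    E = q + a * Ps[k] * a                           # predicted variance
    mu = xs[k] + Ps[k] * a / E * (sample[0] - a * xs[k])
    M = Ps[k] - Ps[k] * a / E * a * Ps[k]
    sample.insert(0, random.gauss(mu, M ** 0.5))
# `sample` is one trajectory drawn from the joint smoothing density
```

Repeating the backward pass gives many sample paths, whose spread approximates the joint smoothing distribution; the Gaussian matrix form of this kernel is given next.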
After initializing with the density $p(x_T|y_{1:T})$ at the last time instance $T$, backward simulation proceeds by drawing samples of $x_k$ from the kernel $p(x_k|x_{k+1}, y_{1:k})$ successively from time $T-1$ to time $1$. For the linear Gaussian dynamic model, the backward simulation kernel can be written in matrix form as

$$p\left(x_k|x_{k+1}, y_{1:k}\right) = \mathcal{N}\left(x_k; \mu_k, M_k\right) \tag{2.20}$$

with mean vector and covariance matrix

$$\mu_k = \hat{x}_{k|k} + P_{k|k} A^\top P_{k+1|k}^{-1}\left(x_{k+1} - A \hat{x}_{k|k}\right), \tag{2.21}$$
$$M_k = P_{k|k} - P_{k|k} A^\top P_{k+1|k}^{-1} A P_{k|k}, \tag{2.22}$$
$$P_{k+1|k} = Q + A P_{k|k} A^\top, \tag{2.23}$$

where $\hat{x}_{k|k}$ and $P_{k|k}$, $k \in 1 \ldots T$, are obtained from forward filtering.

2.3 Random Finite Sets
Random finite sets (RFSs) are random variables used to represent sets with a finite number of elements. The elements of an RFS are unordered and non-repeated, e.g. $\mathbf{x}_1 = \{a, b\}$ is identical to $\mathbf{x}_2 = \{b, a\}$, and one cannot write $\mathbf{x}_3 = \{a, b, a\}$. In addition, set operations like intersection $\cap$, union $\cup$, and cardinality $|\cdot|$ are all applicable to RFSs. In MOT, both the number of objects and their states may change over time and need to be estimated, making RFSs an appealing tool for building mathematical models. Two building blocks for RFS-based MOT methods are the Poisson point process (PPP) and the Bernoulli RFS.

2.3.1 Poisson Point Process
For a PPP, the cardinality follows a Poisson distribution, meaning that events are randomly distributed and occur independently. The multi-object density of a PPP $\mathbf{x} = \{x_1, \ldots, x_n\}$ is [19]

$$f(\mathbf{x}) = e^{-\lambda} \prod_{i=1}^{n} \lambda f\left(x_i\right), \tag{2.24}$$

parameterized by the Poisson rate $\lambda$ and the single-object density $f(x_i)$. Note that since the elements of a PPP are i.i.d., their single-object densities are the same. In MOT, PPPs are often used to model newborn objects and clutter measurements.
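A realization of the PPP in (2.24) can be sampled in two stages: draw the cardinality from a Poisson distribution with rate $\lambda$, then draw that many i.i.d. elements from the single-object density. The sketch below does this in Python; the rate, the uniform spatial density, and the surveillance-area size are illustrative assumptions.

```python
import math
import random

def sample_poisson(lam):
    """Draw n ~ Poisson(lam) with Knuth's multiplication method."""
    L, n, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return n
        n += 1

def sample_ppp(lam, f_sample):
    """Draw one PPP realization per (2.24): Poisson cardinality,
    then i.i.d. elements drawn by the callable f_sample()."""
    return [f_sample() for _ in range(sample_poisson(lam))]

random.seed(2)
# e.g. newborn objects spread uniformly over a 100 m x 100 m area
births = sample_ppp(3.0, lambda: (random.uniform(0, 100),
                                  random.uniform(0, 100)))
# `births` contains a random number (mean 3) of 2D positions
```

The same sampler with a different rate and spatial density models clutter measurements.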
2.3.2 Bernoulli RFS
A Bernoulli RFS $\mathbf{x}$ has density [19]
$$f(\mathbf{x}) = \begin{cases} 1 - r, & \mathbf{x} = \emptyset \\ r f(x), & \mathbf{x} = \{x\} \\ 0, & |\mathbf{x}| \geq 2, \end{cases} \quad (2.25)$$
where $r \in [0, 1]$ represents the existence probability of an object and $f(x)$ is its single-object density.

2.3.3 Multi-Bernoulli and Multi-Bernoulli Mixture
A multi-Bernoulli (MB) is a disjoint union of independent Bernoulli RFSs. Its density can be written as
$$f(\mathbf{x}) = \sum_{\uplus_{i\in I}\mathbf{x}^i = \mathbf{x}}\;\prod_{i\in I} f^i(\mathbf{x}^i), \quad (2.26)$$
where each Bernoulli RFS is indexed by $i \in I$, and $\uplus$ denotes the disjoint union. Multi-Bernoulli mixture (MBM) RFSs are used to model the set of detected objects under data association uncertainty, where each mixture component represents a different data association hypothesis. The multi-object density of an MBM is written as [20]
$$f(\mathbf{x}) = \sum_{j\in J} w^j \sum_{\uplus_{i\in I^j}\mathbf{x}^i = \mathbf{x}}\;\prod_{i\in I^j} f^{j,i}(\mathbf{x}^i), \quad (2.27)$$
where $w^j$ is the weight of the MB indexed by $j \in J$, satisfying $\sum_{j\in J} w^j = 1$.

2.3.4 Poisson Multi-Bernoulli Mixture
A Poisson multi-Bernoulli mixture (PMBM) is a disjoint union of a PPP and an MBM [20], and its multi-object density can be expressed as
$$f(\mathbf{x}) = \sum_{\mathbf{x}^u \uplus \mathbf{x}^d = \mathbf{x}} f^u(\mathbf{x}^u)\, f^d(\mathbf{x}^d), \quad (2.28)$$
where $\mathbf{x}^u$ represents the PPP of undetected objects with density (2.24), and $\mathbf{x}^d$ represents the MBM of detected objects with density (2.27). If there is only a single MB component in the MBM (i.e. $|J| = 1$), the PMBM reduces to a PMB.

2.3.5 PMBM Filter and PMB Filter
The PMBM provides a closed-form solution for multi-object filtering with the standard multi-object models with PPP birth [21] [22]. The resulting filter is the PMBM filter, and the PMB filter is an efficient approximation of it [22]. In the PMBM density representation, the PPP describes potential objects that are hypothesized to exist but have never been detected, and the MBM describes objects that have been detected at least once.

2.4 Set of Trajectories
A trajectory is a sequence of object states at consecutive time steps.
A single trajectory can be represented as
$$X = (t, x^{1:\nu}), \quad (2.29)$$
where $t$ is the trajectory start time and $\nu$ is the length of the trajectory. The single-trajectory density is denoted $\pi(t, x^{1:\nu})$, where $\pi(\cdot)$ is a real-valued function that integrates to one. Similar to the representation of RFSs, the set of trajectories in a time interval $\alpha:\gamma$ can be represented as
$$\mathbf{X}_{\alpha:\gamma} = \{X_1, X_2, \dots, X_n\}. \quad (2.30)$$
A PPP on $\mathbf{X}_{\alpha:\gamma}$ has density
$$\pi(\mathbf{X}_{\alpha:\gamma}) = e^{-\langle\lambda,1\rangle}\left[\lambda(\cdot)\right]^{\mathbf{X}_{\alpha:\gamma}}, \quad (2.31)$$
where $\lambda(\cdot)$ is the Poisson intensity and $\langle f,g\rangle$ denotes the inner product $\int f(x)g(x)\,dx$. A Bernoulli process on $\mathbf{X}_{\alpha:\gamma}$ has density
$$\pi(\mathbf{X}_{\alpha:\gamma}) = \begin{cases} 1 - r, & \mathbf{X}_{\alpha:\gamma} = \emptyset \\ r\,p(X), & \mathbf{X}_{\alpha:\gamma} = \{X\} \\ 0, & \text{otherwise.} \end{cases} \quad (2.32)$$
Similarly, an MB process on $\mathbf{X}_{\alpha:\gamma}$ has a density of the same form as (2.26).

3 Methods
This chapter presents the methods employed in our proposed multi-object fusion algorithm, including data association and single-object fusion methods; trajectory estimation is then performed based on the fused multi-object densities.

The focus of this thesis lies on tracking movable objects around a testing truck equipped with one camera and five radars, a practical challenge at Volvo Trucks. Each sensor individually processes the raw object detections and provides track-level information. The objective is to effectively leverage the information from several sensors to achieve MOT. In the context of T2TF, this requires precise multi-object densities and accurate estimation of object trajectories.

There are several challenges related to this problem. The first is the data association problem: it is important to associate single-object densities that represent the same object from different sensors. In addition, we observed that in practice the tracking system may report more than one track estimate for objects moving across the edges of different FoVs.
Therefore, a clustering algorithm is needed as a pre-processing step to group closely spaced track estimates for each sensor. Furthermore, in order to enhance the adaptability of this multi-object fusion technique, different single-object fusion methods can be applied to different working scenarios. However, this introduces the challenge of determining a suitable evaluation tool for assigning each method to its specialized scenario.

Currently, most multi-object fusion methods focus only on fusing multi-object densities at individual time instances, without providing trajectory information. To construct trajectories in single-sensor tracking, one may make use of the labelling information and connect object state estimates that share the same unique label. This simple labelling strategy may work well in some cases, but it is not reliable due to track fragmentation and track switches in challenging scenarios. Furthermore, in multi-sensor MOT, the same object may be assigned different labels by different sensors, a problem commonly known as label mismatch. Thus, the labelling information can be misleading and may give rise to unreasonable trajectory estimates. To construct trajectories in the context of multi-sensor fusion, we use backward simulation for sets of trajectories [23], which does not require any labelling information. To address the mentioned problems, we develop our fusion structure as illustrated in Figure 3.1.

Figure 3.1: A diagram of the fusion structure, containing clustering, data association, single-object fusion and trajectory estimation, where each circle represents a single-object density.

3.1 Sensory Information
There are six sensors mounted on the testing truck: five radars and one camera. The approximate layout of the sensors is shown in Figure 3.2. The sensors' feedback is highly integrated, including objects' positions, velocities, accelerations, headings, and their corresponding covariances in Cartesian coordinates.
The object dynamic states with respect to the environment, e.g. movable or stationary, are also included in the feedback. The objects' positions, velocities and their corresponding covariances can be used to form the object state representation shown in Subsection 2.1.1.

3.2 Multi-object Fusion
3.2.1 Pre-processing
The output from each sensor captures objects' existence uncertainties via existence probabilities, so the multi-object density can be regarded as a multi-Bernoulli. In the real data, we observe that the existence probabilities are all comparatively high: most are above 0.9, and around 90% of them are above 0.98. Through empirical observation, we found that object estimates with existence probability below 0.9 typically represent false detections. Therefore, we discard likely false detections by removing object estimates with existence probability below 0.9 before further processing.

Figure 3.2: The sensor layout. There are six sensors on the testing truck, and the circular segments represent the FoVs of the sensors. FLR: front-looking radar, FLC: front-looking camera, LF: left-front-looking radar, LR: left-rear-looking radar, RF: right-front-looking radar, RR: right-rear-looking radar.

During the sensing process, it is common for an object to be detected more than once when it crosses the edges of different FoVs. To address this, a pre-processing step is employed to cluster sensor data using DBSCAN [24] prior to data association. Unlike many other clustering methods, DBSCAN does not require the number of clusters as an input parameter. DBSCAN identifies clusters based on the minimum number of points $\tau$ required to form a dense region and the maximum distance $\epsilon$ between two points for them to be considered part of the same cluster. DBSCAN clusters objects by assigning input objects to core objects, border objects, and noise objects.
DBSCAN starts by randomly selecting an object from the set of unassigned input objects and finding all objects within a distance $\epsilon$. The selected object is marked as a core object if the number of found objects exceeds or equals $\tau$. Next, DBSCAN expands the cluster by iterating over all objects within $\epsilon$ of the core object and repeating the process until no more objects can be added. Objects that are not classified as core objects but lie within $\epsilon$ of a core object are marked as border objects and assigned to the cluster. Finally, objects not assigned to any cluster are marked as noise. After assigning all the objects, the process stops.

The characteristics of the data, such as the object densities, influence the choice of the parameters $\tau$ and $\epsilon$. In this thesis, $\tau$ is set to 1, meaning each detection may correspond to a single object. The parameter $\epsilon$ has been tuned according to the testing scenario.

Figure 3.3: An example of the fusion map, where each circle represents an object density and each double arrow indicates a mapping between one pair of object densities from two different sensors.

3.2.2 Data Association
Since the sensors report objects' positions with covariances along with their existence probabilities, we can treat the object estimates from a sensor as a multi-Bernoulli distribution. Then, the association between the object densities from two different sensors is needed for single-object fusion. The data association problem can be addressed by introducing a fusion map indicating the densities' mappings between two sensors, as shown in Figure 3.3. To build the fusion map, the similarity between the Bernoulli densities from two sensors is measured in terms of the Kullback-Leibler divergence (KLD) [25]. The best fusion map between two sensors can be found by minimizing the sum of the KLDs between pairs of Bernoulli densities, which can be formulated as a 2D assignment problem.
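The clustering procedure described in Subsection 3.2.1 can be sketched as follows (our own minimal illustration, not the implementation used in the thesis):

```python
import numpy as np

def dbscan(points, eps, tau):
    """Minimal DBSCAN: returns one label per point,
    0, 1, ... for clusters and -1 for noise."""
    n = len(points)
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        neigh = list(np.flatnonzero(dist[i] <= eps))
        if len(neigh) < tau:
            continue                    # not a core object; stays noise
        labels[i] = cluster
        while neigh:                    # expand the cluster
            j = neigh.pop()
            if labels[j] == -1:
                labels[j] = cluster     # core or border object
            if not visited[j]:
                visited[j] = True
                jn = np.flatnonzero(dist[j] <= eps)
                if len(jn) >= tau:      # j is also a core object
                    neigh.extend(jn)
        cluster += 1
    return labels
```

With $\tau = 2$ and $\epsilon = 1$, two tight pairs of points form two clusters and a far-away point is labelled noise.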
For the fusion of more than two sensors, an overall fusion map can be found by performing this pairwise association sequentially. The KLD between two distributions $p(x)$ and $q(x)$ is defined as
$$D_{KL}(p\|q) = \int p(x)\log\frac{p(x)}{q(x)}\,dx. \quad (3.1)$$
The KLD is 0 if the two single-object densities are identical. If $p(\cdot)$ and $q(\cdot)$ are Bernoulli RFS densities with $k$-dimensional multivariate normal single-object densities, the KLD between $p(\cdot)$ and $q(\cdot)$ is given by [26]
$$D_{KL}(p\|q) = (1-r_p)\log\frac{1-r_p}{1-r_q} + r_p\log\frac{r_p}{r_q} + \frac{r_p}{2}\left[\mathrm{tr}(P_q^{-1}P_p) - \log\frac{|P_p|}{|P_q|} - k + (x_p - x_q)^T P_q^{-1}(x_p - x_q)\right], \quad (3.2)$$
where the Bernoulli density $p(\cdot)$ is parameterized by existence probability $r_p$, mean $x_p$, and covariance $P_p$, and the Bernoulli density $q(\cdot)$ by existence probability $r_q$, mean $x_q$, and covariance $P_q$. The notation $|\cdot|$ denotes the matrix determinant and $\mathrm{tr}(\cdot)$ the trace.

Since the KLD is asymmetric, the symmetrized version
$$J(p, q) = \frac{1}{2}\left(D_{KL}(p\|q) + D_{KL}(q\|p)\right) \quad (3.3)$$
is used to measure the distance between two densities. The cost matrix $C$ is constructed by calculating the symmetrized KLD between every pair of object densities. Assume the cost matrix $C$ is of size $n \times m$, i.e. there are $n$ single-object densities in sensor 1 and $m$ single-object densities in sensor 2. Three padding matrices $D^1_{n\times n}$, $D^2_{m\times m}$ and $0_{m\times n}$ are added to make the cost matrix square. Let $C'$ be the augmented cost matrix of size $(m+n)\times(m+n)$:
$$C' = \begin{bmatrix} C_{n\times m} & D^1_{n\times n} \\ D^2_{m\times m} & 0_{m\times n} \end{bmatrix}, \quad (3.4)$$
where $D^1$ and $D^2$ are square matrices whose diagonal elements equal a threshold, denoting the maximum allowable assignment cost [27], while all other entries are set to $+\infty$. The threshold prevents the association of densities that are too far apart.
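The pairwise costs (3.2)-(3.3) can be computed as in the following sketch (function names are ours):

```python
import numpy as np

def bernoulli_gauss_kld(rp, xp, Pp, rq, xq, Pq):
    """KLD (3.2) between two Bernoulli densities with Gaussian
    single-object densities (r, mean, covariance)."""
    k = len(xp)
    e = xp - xq
    Pq_inv = np.linalg.inv(Pq)
    kld = (1 - rp) * np.log((1 - rp) / (1 - rq)) + rp * np.log(rp / rq)
    kld += 0.5 * rp * (np.trace(Pq_inv @ Pp)
                       - np.log(np.linalg.det(Pp) / np.linalg.det(Pq))
                       - k + e @ Pq_inv @ e)
    return kld

def symmetrized_kld(p, q):
    # J(p, q) in (3.3); p and q are (r, mean, cov) tuples
    return 0.5 * (bernoulli_gauss_kld(*p, *q) + bernoulli_gauss_kld(*q, *p))
```

As expected, the distance between a density and itself is zero, and it grows as the means, covariances, or existence probabilities diverge.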
Then, the best fusion map between sensor 1 and sensor 2 can be found by solving the following optimal assignment problem:
$$\arg\min_{a}\;\sum_{u=1}^{m+n}\sum_{v=1}^{m+n} a_{u,v}\,C'_{u,v}\quad\text{subject to}\quad \sum_{v=1}^{m+n} a_{u,v} = 1\;\forall u,\quad \sum_{u=1}^{m+n} a_{u,v} = 1\;\forall v,\quad a_{u,v}\in\{0,1\}, \quad (3.5)$$
where $C'_{u,v}$ is the entry at the $u$-th row and $v$-th column of the augmented cost matrix, and $a_{u,v}$ indicates whether the $u$-th row and the $v$-th column of $C'$ are associated. Note that only the entries within block $C$ indicate an association between the $u$-th single-object density in sensor 1 and the $v$-th single-object density in sensor 2. An association within block $D^1$ marks an unassigned single-object density in sensor 1, and similarly, an association within block $D^2$ marks an unassigned single-object density in sensor 2. The unassigned single-object densities may associate with single-object densities from other sensors. If a single-object density remains unassigned after the whole process, it is regarded as an individual object that has been detected by only one sensor. To solve this 2D assignment problem, we can use, e.g., Munkres' algorithm [28] [29]. Finally, the data association among the six sensors can be found by sequentially solving this assignment problem five times in a predetermined order.

Figure 3.4: An illustration of the CI fusion method.

3.3 Single-object Density Fusion Algorithms
Four single-object density fusion methods are used to merge single-object densities: covariance intersection, safe fusion, arithmetic average, and cross-covariance fusion. To help understand the differences between these four methods, we compare their fusion results on a toy example: the fusion of two Gaussian distributions with the same mean but different covariances.
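Returning to the assignment problem (3.5): as an illustration, the augmented cost matrix (3.4) can be built and solved with SciPy's `linear_sum_assignment` (a Jonker-Volgenant-type solver) in place of Munkres' algorithm. The costs below are made-up symmetrized KLDs, and a large constant stands in for $+\infty$:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

C = np.array([[0.2, 5.0],
              [4.0, 0.3]])            # pairwise symmetrized KLDs (toy values)
thr = 1.0                             # maximum allowable assignment cost
n, m = C.shape
INF = 1e9                             # stand-in for +infinity
D1 = np.full((n, n), INF); np.fill_diagonal(D1, thr)
D2 = np.full((m, m), INF); np.fill_diagonal(D2, thr)
C_aug = np.block([[C, D1], [D2, np.zeros((m, n))]])   # eq. (3.4)
rows, cols = linear_sum_assignment(C_aug)
# Only entries inside the C block are real sensor-1/sensor-2 associations.
pairs = [(u, v) for u, v in zip(rows, cols) if u < n and v < m]
```

Here the two low-cost diagonal entries of `C` beat the threshold entries, so both densities are associated across the two sensors.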
3.3.1 Covariance Intersection
Covariance intersection (CI) [5] [6] works for fusing two Gaussian distributions. Given two Gaussian distributions with means $\mu_1$ and $\mu_2$ and covariances $P_1$ and $P_2$, respectively, the fused estimate $\hat{\mu}$, $\hat{P}$ is formed from the convex combination
$$\hat{P}^{-1} = \omega P_1^{-1} + (1-\omega)P_2^{-1}, \quad (3.6)$$
$$\hat{P}^{-1}\hat{\mu} = \omega P_1^{-1}\mu_1 + (1-\omega)P_2^{-1}\mu_2, \quad (3.7)$$
where $\omega \in [0,1]$. Figure 3.4 compares the fusion result with the original distributions. We can see that the fused covariance ellipse crosses the four intersection points of the two covariance ellipses.

As an extension, CI can also handle the merging of more than two Gaussian distributions. Given $N$ single-object Gaussian densities with means $\mu_1, \dots, \mu_N$ and covariances $P_1, \dots, P_N$, the mean $\hat{\mu}$ and covariance $\hat{P}$ of the fused Gaussian density can be computed as [6]
$$\hat{P}^{-1} = \sum_i \omega_i P_i^{-1}, \quad (3.8a)$$
$$\hat{P}^{-1}\hat{\mu} = \sum_i \omega_i P_i^{-1}\mu_i, \quad (3.8b)$$
where $\omega_i$ is the weight of the $i$-th Gaussian density, satisfying $\sum_{i=1}^{N}\omega_i = 1$. The fused existence probability $\hat{r}$ can be obtained as
$$C = \frac{|2\pi\hat{P}|^{\frac{1}{2}}}{\prod_i |2\pi P_i|^{\frac{\omega_i}{2}}}\, e^{\frac{1}{2}\left[\hat{\mu}^T\hat{P}^{-1}\hat{\mu} - \sum_i \omega_i\mu_i^T P_i^{-1}\mu_i\right]}, \quad (3.9)$$
$$\hat{r} = \frac{C\prod_i r_i^{\omega_i}}{\prod_i (1-r_i)^{\omega_i} + C\prod_i r_i^{\omega_i}}, \quad (3.10)$$
where $r_i$ is the existence probability of the $i$-th Bernoulli density.

3.3.2 Safe Fusion
An intuitive way to fuse two Gaussian densities is to pick the density with the smaller uncertainty; this is the general idea of safe fusion (SF) [8] [9]. The detailed algorithm for fusing two multivariate Gaussian distributions with means $\mu_1$ and $\mu_2$ and covariances $P_1$ and $P_2$ is as follows:
1. Find the information matrices (inverses of the covariance matrices) $I_1 = P_1^{-1}$ and $I_2 = P_2^{-1}$.
2. Singular value decomposition (SVD): $I_1 = U_1 D_1 U_1^T$.
3. SVD: $D_1^{-1/2}U_1^T I_2 U_1 D_1^{-1/2} = U_2 D_2 U_2^T$.
4. Transformation matrix: $T = U_2^T D_1^{1/2} U_1^T$.
5. State transformation: $\bar{\mu}_1 = T\mu_1$ and $\bar{\mu}_2 = T\mu_2$. The covariances of the transformed distributions are $\bar{P}_1 = I$ and $\bar{P}_2 = D_2^{-1}$, respectively.
6.
For each component $i = 1, 2, \dots, k$, let
$$\bar{\mu}^i = \begin{cases}\bar{\mu}_1^i & \text{if } D_2^{ii} < 1 \\ \bar{\mu}_2^i & \text{if } D_2^{ii} \geq 1,\end{cases} \quad (3.11a)$$
$$D^{ii} = \begin{cases}1 & \text{if } D_2^{ii} < 1 \\ D_2^{ii} & \text{if } D_2^{ii} \geq 1,\end{cases} \quad (3.11b)$$
where $D$ is a diagonal matrix and the superscript $ii$ denotes the $i$-th element on the diagonal.
7. Invert the state transformation to get the estimate
$$\hat{\mu} = T^{-1}\bar{\mu}, \quad (3.11c)$$
$$\hat{P} = T^{-1}D^{-1}T^{-T}. \quad (3.11d)$$

Figure 3.5: An illustration of the SF result.

We proceed to explain these steps. The SVD in step 2 transforms the first covariance ellipsoid into a unit circle. The second SVD in step 3 rotates the coordinate system so that the second covariance ellipsoid is aligned with the coordinate axes. In step 6, the variables with the smaller variance are chosen. In Figure 3.5, we can see that the fusion result is the combination of the variables with the smaller variance.

When there are more than two single-object densities, fusion can be done sequentially, which is order dependent. When fusing the Gaussian densities of Bernoulli components, the fusion order can be determined by ranking the object existence probabilities in descending order. For the fusion of existence probabilities, we use the same method as in CI fusion, referring to (3.10).

3.3.3 Arithmetic Average Fusion
Arithmetic average (AA) fusion [10] [11] computes the arithmetic average over a finite number of single-object densities. Given two Gaussian distributions with means $\mu_1$ and $\mu_2$, covariances $P_1$ and $P_2$, and weights $\omega_1$ and $\omega_2$, respectively,
the fused density parameters $\hat{\mu}$, $\hat{P}$ and $\hat{r}$ are formed from the arithmetic average as follows:
$$\hat{\mu} = \omega_1\mu_1 + \omega_2\mu_2, \quad (3.12a)$$
$$\hat{P} = \omega_1\left(P_1 + (\hat{\mu}-\mu_1)(\hat{\mu}-\mu_1)^T\right) + \omega_2\left(P_2 + (\hat{\mu}-\mu_2)(\hat{\mu}-\mu_2)^T\right), \quad (3.12b)$$
$$\hat{r} = \omega_1 r_1 + \omega_2 r_2. \quad (3.12c)$$

Figure 3.6: An illustration of the AA result.

Figure 3.6 illustrates a fusion example of AA, where it can be seen that the fusion result has a larger covariance than each individual density. Given $N$ distributions $f_1(x), \dots, f_N(x)$ with weights $\omega_1, \dots, \omega_N$, AA fusion of more than two densities is
$$f_{AA}(x) = \sum_i \omega_i f_i(x), \quad (3.13)$$
where $\sum_i \omega_i = 1$. For more than two Gaussian distributions, the mean, covariance and existence probability of the fused density can be calculated as
$$\hat{\mu} = \sum_i \omega_i\mu_i, \quad (3.14a)$$
$$\hat{P} = \sum_i \omega_i\left(P_i + (\hat{\mu}-\mu_i)(\hat{\mu}-\mu_i)^T\right), \quad (3.14b)$$
$$\hat{r} = \sum_i \omega_i r_i. \quad (3.14c)$$

3.3.4 Cross-covariance Fusion
The cross-covariance (CC) fusion method uses the cross-correlation between object densities [4] [12].

Figure 3.7: Illustration of the fusion results with two different correlation coefficients: (a) $\rho = 0.01$, (b) $\rho = 0.99$.

Given two Gaussian distributions with means $\mu_1$ and $\mu_2$ and covariances $P_1$ and $P_2$, the fused mean $\hat{\mu}$ is calculated as
$$\hat{\mu} = \mu_1 + \chi(\mu_2 - \mu_1), \quad (3.15)$$
where
$$\chi = (P_1 - P_{12})U_{12}^{-1}, \quad (3.16)$$
$$U_{12} = P_1 + P_2 - P_{12} - P_{12}^T, \quad (3.17)$$
and $P_{12}$ is the cross-covariance matrix. Note that the CC between two object densities from two sensors can be accurately calculated only if we have full knowledge of how each sensor works, which is not always the case. An alternative is to approximate the CC matrix using the entry-wise product.
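As a minimal sketch (function names and the toy inputs are ours), the fusion rules introduced in this section can be implemented as follows; CI follows (3.6)-(3.8), SF follows steps 1-7, AA follows (3.14), and CC uses (3.15)-(3.19) with the entry-wise approximation of $P_{12}$ described in the text:

```python
import numpy as np

def ci_fuse(means, covs, weights):
    # Covariance intersection, eqs. (3.8a)-(3.8b); weights sum to one.
    P_hat = np.linalg.inv(sum(w * np.linalg.inv(P) for w, P in zip(weights, covs)))
    mu_hat = P_hat @ sum(w * np.linalg.inv(P) @ m
                         for w, m, P in zip(weights, means, covs))
    return mu_hat, P_hat

def aa_fuse(means, covs, weights):
    # Arithmetic average, eqs. (3.14a)-(3.14b).
    mu = sum(w * m for w, m in zip(weights, means))
    P = sum(w * (Pi + np.outer(mu - m, mu - m))
            for w, m, Pi in zip(weights, means, covs))
    return mu, P

def sf_fuse(mu1, P1, mu2, P2):
    # Safe fusion, steps 1-7 of Subsection 3.3.2.
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    U1, d1, _ = np.linalg.svd(I1)                                   # step 2
    M = np.diag(d1 ** -0.5) @ U1.T @ I2 @ U1 @ np.diag(d1 ** -0.5)
    U2, d2, _ = np.linalg.svd(M)                                    # step 3
    T = U2.T @ np.diag(d1 ** 0.5) @ U1.T                            # step 4
    m1, m2 = T @ mu1, T @ mu2                                       # step 5
    mu_bar = np.where(d2 < 1.0, m1, m2)                             # step 6
    d = np.maximum(d2, 1.0)
    Ti = np.linalg.inv(T)                                           # step 7
    return Ti @ mu_bar, Ti @ np.diag(1.0 / d) @ Ti.T

def cc_fuse(mu1, P1, mu2, P2, rho=0.4):
    # Cross-covariance fusion, eqs. (3.15)-(3.19), with the
    # entry-wise approximation of P12 (valid for non-negative entries).
    P12 = rho * np.sqrt(P1 * P2)
    U12 = P1 + P2 - P12 - P12.T
    chi = (P1 - P12) @ np.linalg.inv(U12)
    return mu1 + chi @ (mu2 - mu1), P1 - chi @ (P1 - P12).T
```

For SF with $P_1 = \mathrm{diag}(1, 4)$ and $P_2 = \mathrm{diag}(4, 1)$, the fused density keeps the $x$ component of density 1 and the $y$ component of density 2, as described in step 6.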
The CC of two covariance matrices $P_1$ and $P_2$ can be approximated as [4]
$$P_{12} \approx \rho\sqrt{P_1 \cdot P_2}, \quad (3.18)$$
where $\rho$ is the correlation coefficient and the product and square root are taken entry-wise. Figure 3.7 shows the fusion results for $\rho = 0.01$ and $\rho = 0.99$. The fused covariance ellipse lies within the covered area, and a larger correlation coefficient leads to a smaller fused covariance. For scenarios where the states are 2D Cartesian position vectors, $\rho \approx 0.4$ is a good approximation [4]. The covariance matrix $\hat{P}$ is calculated as
$$\hat{P} = P_1 - (P_1 - P_{12})U_{12}^{-1}(P_1 - P_{12})^T. \quad (3.19)$$
Supposing there are $N$ densities, the matrix $P_c$, which contains each density's covariance and their cross-covariances, can be expressed as
$$P_c = \begin{bmatrix} P_1 & P_{12} & \cdots & P_{1N} \\ P_{21} & P_2 & \cdots & P_{2N} \\ \vdots & & \ddots & \vdots \\ P_{N1} & P_{N2} & \cdots & P_N \end{bmatrix}, \quad (3.20)$$
where $P_1, \dots, P_N$ on the diagonal are the covariances of the object densities, and the off-diagonal blocks are the cross-covariances between pairs of densities. The fused Gaussian density has mean and covariance [30]
$$\hat{x} = (\mathcal{I}^T P_c^{-1}\mathcal{I})^{-1}\mathcal{I}^T P_c^{-1} X, \quad (3.21a)$$
$$\hat{P} = (\mathcal{I}^T P_c^{-1}\mathcal{I})^{-1}, \quad (3.21b)$$
where $\mathcal{I} = [I\; I\; \cdots\; I]^T$ is composed of $N$ identity matrices $I$ of size $k \times k$, $k$ is the object state dimension, and $X$ is a $kN \times 1$ vector stacking all objects' means. The fused existence probability can be found similarly by noting that a Bernoulli variable with existence probability $r$ has mean $r$ and variance $P_r = r(1-r)$. Taking the correlation factor $\rho$ to be 0 for the existence probabilities, the cross-covariances between them vanish, and (3.20) reduces to
$$P_{r_c} = \mathrm{diag}(P_{r_1}, \dots, P_{r_N}), \quad (3.22)$$
where $P_{r_i}$ is the variance of each existence probability. The fused existence probability is then
$$\hat{r} = (\mathcal{I}^T P_{r_c}^{-1}\mathcal{I})^{-1}\mathcal{I}^T P_{r_c}^{-1} X_r, \quad (3.23)$$
where $X_r$ is an $N \times 1$ vector of all existence probabilities.

3.4 Multi-object Trajectory Estimation
The object information from the sensors contains object labels, which can be used to form object trajectories.
However, if two or more associated objects carry different labels, it is unclear how to determine the label of the fused object density. One approach is to reassign object labels based on the best fusion map [14]. However, this method may face difficulties in challenging scenarios where initially well-separated objects move close together and then separate again. In such cases, the track labelling procedure may fail and produce unrealistic trajectory estimates. A more appealing approach to multi-object trajectory estimation is backward simulation [23] [31], which works directly on unlabelled multi-object densities.

Based on the fused multi-object densities at each time step, trajectory estimation acts as a particle smoother. As discussed in Section 2.2.3, multi-object trajectory estimation using backward simulation consists of a forward filtering process and a backward smoothing process. The fused multi-object densities can be viewed as the output of the forward filtering process. For backward smoothing, we sequentially draw samples of sets of trajectories from the backward kernel, backwards in time. This is elaborated in the following subsections.

3.4.1 Backward Simulation for Sets of Trajectories

Figure 3.8: An example of backward simulation results (dashed lines). There are 4 trajectories existing at time $k+1$ and 5 objects at time $k$, where trajectory 1 is ended at time $k$; trajectories 2-4 are updated by the objects at time $k$; trajectories 5 and 6 initialize two new trajectories; and all trajectories starting after time $k+1$ remain unaltered.

The objective is to obtain the multi-trajectory posterior density $\pi_{1:T|T}(\mathbf{X}_{1:T})$ from the multi-Bernoulli densities using backward simulation. As mentioned in Section 2.2.3,
backward simulation is a sequential process from the last time instance $T$ to the initial time instance. In each step, we consider four possible local hypotheses for a set of trajectories. Assume that we have a set of trajectories at time $k+1$ and an MB set of object densities at time $k$. The first local hypothesis is that a subset of the trajectories is updated by a subset of the object set, meaning that these trajectories are extended to time $k$; the second is that a subset of the trajectories is ended at time $k+1$; the third is that a subset of the object set initializes a new set of trajectories at time $k$; and the last is that a subset of the trajectories remains unaltered, meaning that these trajectories start at or after time $k+1$. Figure 3.8 shows an example of the four possible outcomes when performing backward simulation from time $k+1$ to time $k$.

A global hypothesis is a collection of local hypotheses, explaining the association of each trajectory and object density. Each local hypothesis can be represented as $a_i$, and it can be one of the four local hypotheses mentioned above. Therefore, a global hypothesis can be represented as a set $h = \{a_1, a_2, \dots, a_n\}$. By initializing the trajectories at the last time instance $T$ (by sampling the MB density at time $T$) and sequentially drawing samples of trajectories from the backward simulation kernel (2.19) down to time 1, we obtain a sample of the set of trajectories in the time interval $1:T$. If we run the complete backward simulation $S$ times, we obtain a particle representation of the posterior multi-trajectory density in the time interval $1:T$.

3.4.2 Gaussian Implementation for Backward Simulation
This subsection shows the Gaussian implementation of backward simulation. The following assumptions are made:
• Each object has a constant survival probability $p^S$;
• The object state dynamics are linear Gaussian, i.e.
$p(x_{k+1}|x_k) = \mathcal{N}(x_{k+1}; Ax_k, Q)$ (see Section 2.2);
• The $i$-th Bernoulli component at time $k$ has existence probability $r^i_{k|k}$ and a Gaussian single-object density with mean $x^i_{k|k}$ and covariance $P^i_{k|k}$;
• The set of newborn objects at each time step is a PPP with intensity
$$\lambda^B_k(x) = \sum_{i=1}^{N^b_k} w^{b,i}_k\,\mathcal{N}\left(x; x^{b,i}_k, P^{b,i}_k\right), \quad (3.24)$$
where $N^b_k$ is the number of Gaussian components and each component has weight $w^{b,i}_k$.

The different local hypotheses can be represented as follows:
• The hypothesis corresponding to the case that the trajectory ends at time step $k+1$ is given by
$$w^{i,1}_{k:T|T} = 1 - r^i_{k|k} + r^i_{k|k}\left(1 - p^S\right), \quad (3.25a)$$
$$r^{i,1}_{k:T|T} = \frac{r^i_{k|k}\left(1 - p^S\right)}{w^{i,1}_{k:T|T}}, \quad (3.25b)$$
$$p^{i,1}_{k:T|T}\left(t, x^{1:\nu}\right) = \delta_k[t]\,\delta_1[\nu]\,\mathcal{N}\left(x^1; x^i_{k|k}, P^i_{k|k}\right), \quad (3.25c)$$
where $\delta_k[\cdot]$ represents the Kronecker delta function centred at $k$.
• The hypothesis corresponding to the case that the Bernoulli component $i \in \{1, \dots, n_{k|k}\}$ is updated by trajectory $Y^j = \left(t^j, y^{1:\nu^j}\right)$, $j \in \{1, \dots, m\}$, is given by
$$w^{i,j+1}_{k:T|T} = r^i_{k|k}\,p^S\,\mathcal{N}\left(y^1; Ax^i_{k|k}, P^i_{k+1|k}\right), \quad (3.26a)$$
$$r^{i,j+1}_{k:T|T} = 1, \quad (3.26b)$$
$$p^{i,j+1}_{k:T|T}\left(t, x^{1:\nu}\right) = \delta_k[t]\,\delta_{\nu^j+1}[\nu]\,\delta_{y^{1:\nu^j}}\left(x^{2:\nu}\right)\,\mathcal{N}\left(x^1; x^i_{k|T}, P^i_{k|T}\right), \quad (3.26c)$$
$$x^i_{k|T} = x^i_{k|k} + G^i\left(y^1 - Ax^i_{k|k}\right), \quad (3.26d)$$
$$P^i_{k|T} = P^i_{k|k} - G^i A P^i_{k|k}, \quad (3.26e)$$
$$G^i = P^i_{k|k}A^T\left(P^i_{k+1|k}\right)^{-1}, \quad (3.26f)$$
$$P^i_{k+1|k} = AP^i_{k|k}A^T + Q. \quad (3.26g)$$
• The hypothesis of the $i$-th trajectory Bernoulli component, $i \in \{n_{k|k}+1, \dots, n_{k|k}+m\}$, corresponding to the case that the object with trajectory $Y^j = \left(t^j, y^{1:\nu^j}\right)$ was first detected at time step $k+1$, is given by
$$w^{i,2}_{k:T|T} = \sum_{i=1}^{N^b_{k+1}} w^{b,i}_{k+1}\,\mathcal{N}\left(y^1; x^{b,i}_{k+1}, P^{b,i}_{k+1}\right), \quad (3.27a)$$
$$r^{i,2}_{k:T|T} = 1, \quad (3.27b)$$
$$p^{i,2}_{k:T|T}\left(t, x^{1:\nu}\right) = \delta_{\left(t^j, y^{1:\nu^j}\right)}\left(t, x^{1:\nu}\right). \quad (3.27c)$$
• The hypothesis of the $i$-th trajectory Bernoulli component, $i \in \{n_{k|k}+1, \dots, n_{k|k}+m\}$, corresponding to the case that the object with trajectory $Y^j = \left(t^j, y^{1:\nu^j}\right)$ remains unaltered, has $w^{i,1}_{k:T|T} = 1$, $r^{i,1}_{k:T|T} = 1$ and $p^{i,1}_{k:T|T}(X) = \delta_{Y^j}(X)$.
In principle, we would need to enumerate every possible global hypothesis to draw samples, which can be computationally expensive. One possible solution is to keep only the global hypotheses with non-negligible weights [23]. This can be achieved by solving a ranked assignment problem using Murty's algorithm [32] [33]. More specifically, our goal is to sample an association that assigns the $m$ trajectories at time $k+1$ to the $n_{k|k}$ Bernoulli densities at time $k$. The assignment problem is formulated by constructing a cost matrix,
$$C = -\log\begin{bmatrix} W_1 & W_2 \end{bmatrix}, \quad (3.28a)$$
$$W_1^{(j,i)} = \frac{w^{i,j+1}_{k:T|T}}{w^{i,1}_{k:T|T}}, \quad (3.28b)$$
$$W_2 = \mathrm{diag}\left(w^{n_{k|k}+1,2}_{k:T|T}, \dots, w^{n_{k|k}+m,2}_{k:T|T}\right), \quad (3.28c)$$
where the $(j,i)$-th entry of $W_1 \in \mathbb{R}^{m\times n_{k|k}}$ represents the cost of assigning the $j$-th trajectory to the $i$-th Bernoulli component, and $W_2 \in \mathbb{R}^{m\times m}$ is a diagonal matrix whose diagonal entries represent new trajectories being initialized by the object states at time $k$. Then, by running Murty's algorithm, we obtain the $M$-best assignments with non-negligible weights and draw a sample from these $M$-best assignments. In addition to pruning negligible global hypotheses, we also compare the squared Mahalanobis distance (SMD) between the first state $y^1$ of a trajectory and the density predicted from $\mathcal{N}\left(x; x^i_{k|k}, P^i_{k|k}\right)$,
$$\mathrm{SMD}\left(y^1; x^i_{k|k}, P^i_{k|k}\right) = \left(y^1 - Ax^i_{k|k}\right)^T\left(AP^i_{k|k}A^T + Q\right)^{-1}\left(y^1 - Ax^i_{k|k}\right), \quad (3.29)$$
to a threshold $\sigma$. If this distance is larger than $\sigma$, we set the corresponding weight to 0, i.e. the corresponding entry in the cost matrix (3.28a) to $+\infty$.

3.5 Evaluation
To evaluate the performance of MOT algorithms, we need to measure the similarity between the estimated set of object states and the ground truth. Intuitively, an MOT performance measure should capture localization errors for properly detected objects, the cost of missed detections, and the cost of false detections. In this thesis, two performance measures are used to evaluate the MOT algorithms: the GOSPA metric [34] and the NLL [35].
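The gating test (3.29) amounts to a few lines (a sketch; names are ours):

```python
import numpy as np

def smd_gate(y1, x, P, A, Q, sigma):
    """Squared Mahalanobis distance gate (3.29): True if the
    trajectory's first state y1 is compatible with the prediction."""
    e = y1 - A @ x
    S = A @ P @ A.T + Q          # predicted covariance
    return float(e @ np.linalg.inv(S) @ e) <= sigma
```

A state on the predicted mean passes the gate; a state far outside the predicted covariance does not.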
3.5.1 GOSPA
Let $X$ be the set of true object states and $Y$ the set of object estimates. Let $d(x, y)$ denote the Euclidean distance between $x$ and $y$, and let $d^{(c)}(x, y) = \min(c, d(x, y))$ be the cut-off distance. Also, let $\Pi_n$ be the set of all permutations of $\{1, \dots, n\}$ for any $n \in \mathbb{N}$, where $\pi \in \Pi_n$ is a sequence $(\pi(1), \dots, \pi(n))$. For $|X| \leq |Y|$, the GOSPA metric is defined as
$$d_p^{(c,\alpha)}(X, Y) \triangleq \left(\min_{\pi\in\Pi_{|Y|}}\sum_{i=1}^{|X|} d^{(c)}\left(x_i, y_{\pi(i)}\right)^p + \frac{c^p}{\alpha}\left(|Y| - |X|\right)\right)^{\frac{1}{p}}. \quad (3.30)$$
There are three parameters in GOSPA: $c$ is the maximum allowable localization error, $p$ penalizes outliers, and $\alpha$ is a factor for the cardinality mismatch penalty, which is suggested to be set to 2 for evaluating MOT performance in [34]. Given a ground truth $x \in X$ and an object estimate $y \in Y$ with $d(x, y) \geq c$, the cut-off distance between $x$ and $y$ is $c$; the penalty for assigning $x$ to $y$ is then $c^p$, compared with $c^p/\alpha$ for each of them when left unassigned. If we set $\alpha = 2$, the cost of two unassigned objects equals the cost of associating them at distance $c$. Computing the GOSPA metric ($\alpha = 2$) can be formulated as a 2D assignment problem:
$$d_p^{(c,2)}(X, Y) = \min_{\gamma\in\Gamma}\left(\sum_{(i,j)\in\gamma} d(x_i, y_j)^p + \frac{c^p}{2}\left(|X| + |Y| - 2|\gamma|\right)\right)^{\frac{1}{p}}, \quad (3.31)$$
where $\gamma$ is an assignment set between the estimates and the ground truth, and $\Gamma$ is the set of all possible assignment sets. Note that $|\gamma|$ is the number of properly assigned objects, and that $|X| - |\gamma|$ and $|Y| - |\gamma|$ represent the numbers of missed and false objects, respectively. Consequently, GOSPA allows the total estimation error to be decomposed into localization errors for properly detected objects and errors for missed and false detections.

3.5.2 NLL
The GOSPA metric does not take uncertainty information into account when evaluating MOT performance, and it has two hyper-parameters, $c$ and $p$, that need to be chosen.
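Before turning to the NLL, the decomposed GOSPA computation (3.31) can be sketched as follows (a minimal illustration using SciPy's assignment solver; names are ours):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def gospa(X, Y, c=10.0, p=2):
    """GOSPA with alpha = 2, eq. (3.31). X, Y: arrays of object
    states, one state per row."""
    nx, ny = len(X), len(Y)
    if nx == 0 or ny == 0:
        return (c**p / 2 * (nx + ny)) ** (1 / p)
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(np.minimum(D, c) ** p)
    cost, matched = 0.0, 0
    for i, j in zip(rows, cols):
        if D[i, j] < c:                 # properly detected object
            cost += D[i, j] ** p
            matched += 1
    cost += c**p / 2 * (nx + ny - 2 * matched)   # missed + false
    return cost ** (1 / p)
```

A perfect estimate gives 0, while each missed or false object contributes $c^p/2$ to the cost before the final $p$-th root.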
In contrast, the NLL also evaluates the object state uncertainty estimates and has no hyper-parameters. Given the true set of object states $Y$ and the multi-object posterior density $f(\cdot)$, the NLL is defined as
$$\mathrm{NLL}(Y, f) = -\log f(Y). \quad (3.32)$$
Note that our posterior density is an MB. To assess performance using the NLL measure, we introduce a PPP component whose intensity is a Gaussian distribution centred at the origin with a large covariance. The NLL evaluates PMB densities in three parts: matched objects, missed objects and false detections. Let the PPP intensity function be $\lambda(\cdot)$, and let $r_i$ and $p_i(\cdot)$ represent the existence probability and single-object density of the $i$-th Bernoulli component, respectively. The following expression can be used to approximately calculate the NLL [35]:
$$\mathrm{NLL}(Y, f_{PMB}) \approx \min_{\gamma\in\Gamma}\;\underbrace{-\sum_{(i,j)\in\gamma}\log\left(r_i\,p_i(y_j)\right)}_{\text{Localization}}\;\underbrace{-\sum_{i\in\mathcal{F}(\gamma)}\log\left(1 - r_i\right)}_{\text{False detections}}\;+\;\underbrace{\int\lambda(y')\,dy' - \sum_{j\in\mathcal{M}(\gamma)}\log\lambda(y_j)}_{\text{Missed objects}}, \quad (3.33)$$
where $\gamma$ represents a possible assignment between the estimates and the ground truth, $\mathcal{M}(\gamma)$ denotes the ground truths not assigned to any estimate, and $\mathcal{F}(\gamma)$ denotes the estimates not assigned to any ground truth. The localization part measures how well the Bernoulli components explain the matched ground truths. The false detection part captures the existence uncertainty of unmatched object densities. Finally, the missed object part evaluates how well the PPP component explains the unmatched ground truths.

The NLL of a PMB can be interpreted as the sum of the likelihoods of all possible assignments between the ground truths and either the PPP component or one of the Bernoulli components of the estimates. It is generally intractable to compute the exact NLL due to the large number of possible assignments. However, usually only a few of these assignment likelihoods contribute most of the exact sum.
This means that the NLL can be approximated by finding the best assignment between the ground truths and the estimates and omitting all other possible assignments. Finding the assignment γ with the minimum cost can be formulated as a 2D assignment problem, where the corresponding cost matrix is calculated as

C_{i,j} = \begin{cases} -\log\big(r_i \, p_i(y_j) / (1 - r_i)\big), & \text{if } i \le m, \\ -\log \lambda(y_j), & \text{if } i = j + m, \\ \infty, & \text{otherwise}, \end{cases} (3.34)

where C_{i,j} is the (i, j)-th entry of the cost matrix C and m is the number of estimates.

4 Simulation Results

In this chapter, simulations are set up based on knowledge of the testing trucks. Then the simulation settings and result comparisons are presented.

4.1 Simulation Setting

In this section, we introduce the simulation settings for MOT evaluation. Four sensors are placed in a 150 m × 150 m area. Each sensor's feedback is an MB density representing the set of objects within its FoV. At each time step, we perform multi-object density fusion to fuse the densities from the sensors. Then, we apply backward simulation to the fused multi-Bernoulli density at each time step to obtain trajectory estimates.

4.1.1 Sensor FoVs Setup

We set the average overlapping area for the four sensors to reflect the real truck configuration shown in Figure 3.2, where some areas can be surveilled by multiple sensors and some by only a single sensor. The sensor layout for the simulation is shown in Figure 4.1.

4.1.2 Ground Truth Creation

We assume that the objects' movements follow a constant velocity (CV) model. The following steps generate the ground truth:

1. Specify the initial position and velocity of each object. The object state at time k can be written as

x_k = [x \;\; \dot{x} \;\; y \;\; \dot{y}]^T. (4.1)

2. Determine the time interval and simulation duration. For convenience, we set the time interval to T_s = 1 and the duration to 20 time steps.

3. Generate the object densities. The state of the object at time k can be sequentially generated using the motion model (2.3).
During the simulation, we set the dynamic matrix A_k as time-invariant,

A = I_2 \otimes \begin{bmatrix} 1 & T_s \\ 0 & 1 \end{bmatrix}, (4.2)

where I_2 is the identity matrix of size 2, ⊗ denotes the Kronecker product, and T_s represents the time interval of this discrete process. We set the input matrix B_k to zero. The process noise is q ∼ N(0, Q) with

Q = \sigma_q^2 \, I_2 \otimes \begin{bmatrix} T_s^3/3 & T_s^2/2 \\ T_s^2/2 & T_s \end{bmatrix},

and we set σ_q = 0.3.

4. The generated object densities at consecutive time steps can be viewed as a set of sequences. Given a sequence of object densities, at each time k we sample a random number from a uniform distribution and compare it with p = 0.92; if the sample is larger than p, we remove that single-object density. In this way, we introduce random birth and death into the sequence of object densities, which makes the ground truth more similar to a real-world setting.

Figure 4.1: The sensor layout contains nine different blocks: four blocks at the corners covered by only one sensor, one block at the centre covered by all four sensors, and the remaining four blocks covered by two sensors.

In addition to the CV model, we implement the coordinated turn (CT) motion model to create ground truth with nonlinear object motions. With the object state x_k = [x \;\; y \;\; v \;\; \theta \;\; \omega]^T, where x and y represent the position, v is the velocity, θ is the heading direction, and ω is the turn rate, we set the dynamic function f(x_k, u_k, q_k) in Section 2.2 as time-invariant and omit the input u_k. The dynamic function is

f(x, q) = \begin{bmatrix} x + T_s v \cos\theta \\ y + T_s v \sin\theta \\ v \\ \theta + T_s \omega \\ \omega \end{bmatrix} + q, (4.3)

where q ∼ N(0, Q) is the process noise in the CT model with

Q = \mathrm{diag}\big(0, \; 0, \; T_s \sigma_v^2, \; 0, \; T_s \sigma_\omega^2\big).

4.1.3 Object Estimate Densities Creation

To create the object estimate densities for the simulation, we assume that the objects can be estimated around the ground truth.
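The CV ground-truth generation of Section 4.1.2 (steps 1–4 above) can be sketched in Python as follows. This is an illustrative re-implementation under our own naming (the thesis experiments were done in MATLAB); the process noise is drawn through a hand-derived Cholesky factor of one 2×2 block of Q:

```python
import math
import random

def simulate_cv_track(x0, Ts=1.0, steps=20, sigma_q=0.3, p_survive=0.92,
                      seed=None):
    """Generate one ground-truth track under the CV model of Eq. (4.2).

    State is [x, vx, y, vy]; A = I2 kron [[1, Ts], [0, 1]], zero input, and
    q ~ N(0, Q) with Q = sigma_q^2 * I2 kron [[Ts^3/3, Ts^2/2], [Ts^2/2, Ts]].
    At each step the object survives with probability p_survive, which
    introduces the random death described in step 4.
    """
    rng = random.Random(seed)
    # Cholesky factor of one 2x2 (position, velocity) block of Q
    l11 = sigma_q * math.sqrt(Ts ** 3 / 3)
    l21 = sigma_q * (Ts ** 2 / 2) / math.sqrt(Ts ** 3 / 3)
    l22 = sigma_q * math.sqrt(Ts) / 2
    x = list(x0)
    track = [tuple(x)]
    for _ in range(steps):
        if rng.random() > p_survive:        # random death
            break
        for k in (0, 2):                    # x-axis block, then y-axis block
            a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            x[k] = x[k] + Ts * x[k + 1] + l11 * a
            x[k + 1] = x[k + 1] + l21 * a + l22 * b
        track.append(tuple(x))
    return track
```

With sigma_q = 0 and p_survive = 1 the track reduces to exact straight-line motion, which is a convenient sanity check of the transition matrix.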
This can be achieved by drawing a sample from a Gaussian distribution whose mean is located at the ground truth; the covariance of this Gaussian can be viewed as a measure of the estimation accuracy. The covariance attached to each estimate is generated by sampling from a Wishart distribution, which allows the generation of random covariance matrices that capture the desired real-world statistical properties. This is achieved by setting the scale parameter of the Wishart distribution based on knowledge of the real-world sensors' covariances and by keeping the degrees of freedom at a moderate level. Figure 4.2 shows four examples of the simulated tracks.

4.2 MOT Simulation Evaluation

The multi-object multi-sensor fusion algorithm described in Section 3.2 is integrated with the four different single-object fusion methods described in Section 3.3, resulting in four different implementations. We evaluate their performance using GOSPA and NLL in scenarios with different sensor FoVs and different numbers of objects.

4.2.1 Simulation Results with Different Numbers of Objects

First, we keep the sensors' FoVs fixed as shown in Figure 4.1 and conduct two simulations with 5 and 20 objects, respectively, where each object has a random initial position and velocity. The evaluation results are obtained by averaging over 100,000 Monte Carlo simulations. The results for 5 and 20 objects are shown in Table 4.1 and Table 4.2, respectively.

Table 4.1: GOSPA and NLL values of four implementations with 5 objects.

        CI                SF                 AA                CC
GOSPA   0.9817 ± 0.1007   1.2108 ± 0.0869    0.9817 ± 0.1007   0.9692 ± 0.1051
NLL     8.4053 ± 2.6080   15.1347 ± 2.4953   8.1641 ± 2.5828   11.7661 ± 2.5077

Table 4.2: GOSPA and NLL values of four implementations with 20 objects.

        CI                 SF                 AA                 CC
GOSPA   1.9722 ± 0.1141    2.4153 ± 0.1040    1.9722 ± 0.1141    1.9502 ± 0.1172
NLL     31.9725 ± 4.8912   57.4180 ± 4.7032   31.0877 ± 4.8450   44.7085 ± 4.7162

(a) Randomly generated 5 trajectories using the CV model.
(b) Randomly generated 20 trajectories using the CV model. (c) Randomly generated 5 trajectories using the CT model. (d) Randomly generated 20 trajectories using the CT model.

Figure 4.2: Examples of randomly generated trajectories in four different simulation settings.

4.2.2 Simulation Results with Different Overlapping Areas

We also evaluate the robustness of the multi-object fusion algorithm under two additional sensor FoV configurations, and in both cases we set the number of trajectories to 5. The evaluation results are obtained by averaging over 100,000 Monte Carlo simulations. The results of the three cases with small, medium, and large overlapping areas are shown in Tables 4.3, 4.1, and 4.4, respectively.

Table 4.3: GOSPA and NLL values of four implementations in a scenario with a smaller overlapping area.

        CI                 SF                 AA                 CC
GOSPA   1.0870 ± 0.0962    1.2097 ± 0.08340   1.0870 ± 0.09623   1.0836 ± 0.09787
NLL     11.4350 ± 2.6708   15.1213 ± 2.4486   11.2689 ± 2.6710   13.4188 ± 2.4831

Table 4.4: GOSPA and NLL values of four implementations in a scenario with a larger overlapping area.

        CI                SF                 AA                CC
GOSPA   0.7213 ± 0.0728   1.2185 ± 0.09671   0.7213 ± 0.0728   0.6190 ± 0.0699
NLL     1.8620 ± 1.1840   15.1644 ± 2.5903   1.9628 ± 0.9246   5.9861 ± 2.0167

(a) Small overlapping area (b) Large overlapping area

Figure 4.3: Two different settings for the sensors' FoV overlapping areas.

4.2.3 Simulation Results with a Different Motion Model

We also evaluate the fusion performance when the object trajectories are generated using the CT model. We choose the sensor layout of Figure 4.1 and consider four scenarios with 5, 10, 15, and 20 objects. The evaluation results are obtained by averaging over 100,000 Monte Carlo simulations. The results of the simulation with 5 trajectories are shown in Table 4.5.

Table 4.5: GOSPA and NLL values of four implementations with the CT model and 5 objects.
        CI                SF                 AA                CC
GOSPA   1.2486 ± 0.2346   1.4512 ± 0.2082    1.2486 ± 0.2346   1.2369 ± 0.2379
NLL     8.0111 ± 5.0117   13.5588 ± 5.1780   7.8215 ± 4.9827   10.7542 ± 5.0881

Table 4.6: GOSPA and NLL values of four implementations with the CT model and 10 objects.

        CI                 SF                 AA                 CC
GOSPA   1.8606 ± 0.2830    2.1285 ± 0.2521    1.8606 ± 0.2830    1.8461 ± 0.2863
NLL     15.2951 ± 6.3094   25.1431 ± 6.7127   14.9541 ± 6.2712   20.1593 ± 6.5033

Table 4.7: GOSPA and NLL values of four implementations with the CT model and 15 objects.

        CI                 SF                 AA                 CC
GOSPA   2.3764 ± 0.3143    2.6880 ± 0.2778    2.3764 ± 0.3143    2.3600 ± 0.3173
NLL     22.7095 ± 7.3046   36.1679 ± 8.2424   22.2723 ± 7.2524   29.3310 ± 7.6685

Table 4.8: GOSPA and NLL values of four implementations with the CT model and 20 objects.

        CI                 SF                 AA                 CC
GOSPA   2.8250 ± 0.3354    3.1676 ± 0.2999    2.8250 ± 0.3354    2.8078 ± 0.3385
NLL     30.7134 ± 8.7723   47.0626 ± 9.8334   30.1637 ± 8.6975   38.8617 ± 9.1875

4.2.4 Simulation Results with a Combination of Low-quality and High-quality Sensors

In addition, we evaluate the performance of the multi-object multi-sensor fusion algorithm in a scenario where the sensors have different qualities. In particular, we manually assign sensors 1 and 4 large measurement covariances, representing greater uncertainty about their object estimates. The simulation is conducted with the CV model and 5 trajectories, and the results are obtained by averaging over 100,000 Monte Carlo simulations. The results are shown in Table 4.9.

Table 4.9: GOSPA and NLL values of four implementations with a combination of low-quality and high-quality sensors.

        CI                 SF                  AA                  CC
GOSPA   2.6378 ± 0.8274    2.3573 ± 0.7316     3.1566 ± 0.7226     2.4653 ± 0.8887
NLL     10.2655 ± 4.0854   16.0120 ± 11.1145   15.0920 ± 12.2601   14.5136 ± 3.1087

4.3 Trajectory Estimation

Trajectory estimates are obtained by applying backward simulation, described in Section 3.4, to the fused multi-Bernoulli multi-object densities.
Figure 4.4 shows an example of trajectory estimation, where 5 objects move from the bottom-left to the top-right of the surveillance area. In this first scenario, we create the ground truth without random birth and death. It can be seen from the results that, in general, the trajectory estimates of different objects are correctly separated even when objects move close to each other; see the bottom-left corner of Figure 4.4. Figure 4.5 and Figure 4.6 show two more complex examples of trajectory estimation, where we bring back random birth and death. In addition, in Figure 4.6, we manually place the objects closer together to make the scenario more challenging.

Figure 4.4: Trajectory estimation of 5 objects, where the green dots represent the fusion results, the blue line segments denote the trajectory estimates, and the ground truth trajectories are marked as red line segments.

Figure 4.5: Trajectory estimation of 5 objects with random birth and random death.

Figure 4.6: Trajectory estimation of 5 objects with random birth and death, where the objects are placed closer together.

5 Experimental Results

In this chapter, we evaluate the four different implementations using real-world data. In the scenario considered, a Volvo V60 car overtakes a moving testing truck, cutting in from its left rear to its left front. The fusion results obtained from the four implementations are compared to the ground truth, and the fusion performance is evaluated using GOSPA and NLL. In addition, the running times of the algorithms are compared.
5.1 Real-world Sensor Data

This real-world evaluation is based on the Volvo reference system, which provides multiple real-world datasets and ground truth with both multi-object and single-object data. However, due to time constraints and imperfect reference data, we can only evaluate our fusion algorithm in a single-object scenario. In practice, we clip a representative scenario in which a Volvo V60 car overtakes the ego truck. As shown in Figure 5.1a, the car crosses the FoVs of four different sensors during the manoeuvre, which makes the scenario representative for testing the developed fusion algorithm. In addition, the experimental evaluation, as shown in Figure 5.1b, only uses valid object densities, i.e. densities of objects that are moving and have high existence confidence.

Usually, a sensor is configured to return the position of the middle of the closest side of the object, so the object's size can be approximated from the measured length and width of its closest sides. In the reference system, however, the data we get is a point attached to the back of the car. Thus, a position transformation is needed to compensate for this bias.

It is also worth mentioning the time synchronization between the sensor data and the ground truth. In our testing case, the ground truth is updated more frequently than the sensor data. Therefore, it is always possible to find a corresponding ground truth for any sensor data at any given time.

5.2 Experimental Evaluation

5.2.1 Fusion Result Visualization

Figure 5.2 shows the multi-object fusion results obtained using the four different single-object fusion methods.

(a) Ground truth (b) Valid feedback from the sensors

Figure 5.1: The scenario in the real-world testing data, where a car overtakes the testing truck (blue rectangle) on the road.

5.2.2 GOSPA Evaluation Result

For this experimental evaluation, we set the parameters of GOSPA to p = 2, c = 8, and α = 2.
The average GOSPA values of the four single-object fusion methods are shown in Table 5.1.

Table 5.1: Average GOSPA values of the four single-object fusion algorithms.

        CI       SF       AA       CC
GOSPA   0.8127   0.8630   0.8737   0.8351

To illustrate the comparison at different time steps, Figure 5.3 and Figure 5.4 show the GOSPA and NLL values during the overtaking process.

5.2.3 NLL Evaluation Result

The average NLL values of the four single-object fusion methods are given in Table 5.2.

Table 5.2: Average NLL values of the four single-object fusion methods.

      CI          SF          AA          CC
NLL   3.5433505   4.8674531   3.5174546   4.3920736

(a) Covariance intersection (b) Safe fusion (c) Arithmetic average fusion (d) Cross-covariance fusion

Figure 5.2: The fusion results from the four different single-object fusion algorithms.

Figure 5.3: GOSPA values over 61 time instances for the four different single-object fusion algorithms.

5.2.4 Computational Complexity

To evaluate the computational complexity, we record the running time of each implementation.¹ The runtime is measured by running the entire multi-object fusion process (including preprocessing, data association, and the single-object density fusion algorithms) 10,000 times and taking the average. The results are shown in Table 5.3.

Table 5.3: Average time consumption in the simulations.

           CI        SF        AA        CC
time (s)   0.02915   0.03062   0.02811   0.02954

5.3 Trajectory Estimation

The trajectory can be estimated after obtaining the fusion results on the real data. Figure 5.5 illustrates the trajectory result.

¹The implementation is carried out in MATLAB 2020a on an Intel® Core™ i7-11850H processor with 32 GB DDR4 memory.

Figure 5.4: NLL values over 61 time instances for the four different single-object fusion algorithms.
Figure 5.5: The green dots are the fusion results, the blue line segments are the trajectory estimates, and the ground truth trajectories are marked as red line segments.

6 Discussion

6.1 Simulation and Experimental Comparison

The performance in simulation surpasses that of the real-world experiment significantly. This discrepancy can be attributed to the configuration of the simulation, which inherently influences its outcomes; the simulation falls short of capturing the complete intricacy of the real world.

After analyzing the real-world data, we observed that low-quality sensor output led to imprecise fusion results. This problem becomes evident when examining specific time instances, as depicted in Figure 6.1. Under extreme working conditions, e.g. in Figure 6.1a, only one sensor reports a density (the mean of this density is marked as a blue point), and this feedback is approximately one meter away from the ground truth. This is caused by the estimate compensation: we translate all densities so that their mean points are located at the rear-middle point of the object, but without good knowledge of the object's size, the compensated position can drift away from the ground truth. In Figure 6.1b, two densities are reported by the sensors, one from the camera and one from the front-looking radar. However, due to the unpredictable time difference, both densities drift away from the ground truth even with good compensation.

A noticeable disparity in trajectory estimation can be observed between Figure 4.4 and Figure 5.5. In the simulation, the error levels of the longitudinal and lateral velocities are set close to each other, resulting in velocity estimates accurate enough for trajectory estimation.
In the real-world scenario, however, only the longitudinal velocity can be deemed reliable, while the reliability of the lateral velocity is compromised by the limitations of the radars. Consequently, in Figure 5.5, the discrepancy between the estimated and ground truth trajectories is acceptable longitudinally, but the lateral estimation is not satisfactory. With improved sensor quality, the trajectory estimation is expected to become more accurate.

6.2 Choice of Single-object Fusion Method

It is hard to draw a concrete conclusion on which single-object fusion method significantly outperforms the others; the selection depends on the sensor quality. In cases where one sensor consistently provides tracks of significantly higher quality than the others, i.e. accurate detections with small covariances, the SF algorithm is a wise choice (see Subsection 4.2.4). Under the configuration of this thesis, however, the sensor qualities are at the same level and the tracks are fused around the ground truth with averaging weights. As a consequence, AA and CI fusion yield better results.

(a) Inaccurate densities returned from the left-rear radar (b) Inaccurate densities mainly returned in front of the ego truck

Figure 6.1: Inaccurate sensor densities at two specific time instances.

6.3 Simplification in Real-world Applications

In some situations, the covariance estimate is less important, and AA fusion can then be a good choice. In both the experiments and the simulations, AA required less computational time while producing acceptable results. AA fusion can compute the fused mean estimate directly as a weighted sum, whereas the other algorithms require more complex matrix calculations, including SVD or inversion of the covariance matrices, which is more computationally expensive than an arithmetic average.
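As an illustration of this simplification (a minimal sketch with our own function name, assuming uniform weights when none are given), the AA fused mean is just a weighted sum of the sensor means:

```python
def aa_fused_mean(means, weights=None):
    """Arithmetic-average fusion of mean estimates (sketch).

    means: one mean vector (tuple or list) per sensor.
    weights: fusion weights summing to 1; uniform if omitted.
    No covariance inversion or SVD is needed, which is the source of the
    computational saving discussed above.
    """
    n = len(means)
    w = weights if weights is not None else [1.0 / n] * n
    dim = len(means[0])
    return tuple(sum(w[s] * means[s][d] for s in range(n)) for d in range(dim))
```

For two sensors reporting (0, 0) and (2, 4) with uniform weights, the fused mean is simply (1, 2).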
In conclusion, arithmetic average fusion can be a good choice for less complex applications.

Another observed problem is that the camera's position estimates are inaccurate. Table 6.1 shows the results when the camera's feedback is excluded. Compared to Tables 5.1 and 5.2, the GOSPA values in Table 6.1 become lower after ignoring the camera's feedback. The covariance intersection algorithm performs better with more accurate sensor feedback.

Table 6.1: GOSPA and NLL values when the camera data are ignored.

        CI       SF       AA       CC
GOSPA   0.7920   0.8544   0.8242   0.8333
NLL     3.5796   4.8482   3.6265   4.3793

6.4 Future Work

Due to time limitations, we could not fully leverage all the available information, such as object classification and bounding boxes. One possible direction for future work is to incorporate object classification information and perform data association based on classification relationships. It would also be interesting to enhance the object tracking by introducing another dimension that indicates the spatial location of tracks on the object, which can be achieved through the utilization of bounding boxes.

Furthermore, our data association formulation was limited to individual time instances. It may be possible to establish historical relationships between adjacent time instances, which would help to build data associations that cannot be correctly resolved from spatial information alone. Such historical relationships may be established using a graph-based data association approach [36].

7 Conclusion

In this thesis, we propose a multi-sensor multi-Bernoulli densities fusion algorithm with implications for both theoretical understanding and practical applications in the field of ADAS. The results show that objects can be tracked accurately in the given experimental environment.
Additionally, this fusion algorithm is robust and can be implemented with multiple sensor setups. The trajectories estimated from the fused multi-object densities are accurate compared to the ground truth. As an extension, the four single-object fusion methods make this multi-sensor multi-Bernoulli densities fusion algorithm flexible enough to be adapted to different types of sensors.

Bibliography

[1] C.-Y. Chong, S. Mori, W. Barker, and K.-C. Chang, "Architectures and algorithms for track association and fusion," IEEE Aerospace and Electronic Systems Magazine, vol. 15, no. 1, pp. 5–13, Jan. 2000, issn: 1557-959X. doi: 10.1109/62.821657.

[2] C. Knill, A. Scheel, and K. Dietmayer, "A direct scattering model for tracking vehicles with high-resolution radars," in 2016 IEEE Intelligent Vehicles Symposium (IV), Jun. 2016, pp. 298–303. doi: 10.1109/IVS.2016.7535401.

[3] B. Wang, W. Yi, R. Hoseinnezhad, S. Li, L. Kong, and X. Yang, "Distributed fusion with multi-Bernoulli filter based on generalized covariance intersection," IEEE Transactions on Signal Processing, vol. 65, no. 1, pp. 242–255, Jan. 2017, issn: 1941-0476. doi: 10.1109/TSP.2016.2617825.

[4] S. Matzka and R. Altendorfer, "A comparison of track-to-track fusion algorithms for automotive sensor fusion," in vol. 35, Mar. 25, 2009, pp. 69–81, isbn: 978-3-540-89858-0. doi: 10.1007/978-3-540-89859-7_6.

[5] S. Julier and J. Uhlmann, "A non-divergent estimation algorithm in the presence of unknown correlations," in Proceedings of the 1997 American Control Conference (Cat. No.97CH36041), vol. 4, Jun. 1997, pp. 2369–2373. doi: 10.1109/ACC.1997.609105.

[6] S. J. Julier and J. K. Uhlmann, "General decentralized data fusion with covariance intersection," in Handbook of Multisensor Data Fusion, 2nd ed., CRC Press, 2009, isbn: 978-1-315-21948-6.

[7] K.
Yang, Y. Bar-Shalom, and P. Willett, "Track-to-track fusion with cross-covariances from radar and IR/EO sensor," in 2019 22nd International Conference on Information Fusion (FUSION), Jul. 2019, pp. 1–5. doi: 10.23919/FUSION43075.2019.9011439.

[8] F. Gustafsson, "Safe fusion of dependent estimates," in Statistical Sensor Fusion, 2:2, pp. 32–36, isbn: 978-91-44-07732-4.

[9] J. Nygards, V. Deleskog, and G. Hendeby, "Safe fusion compared to established distributed fusion methods," in 2016 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), Baden-Baden, Germany: IEEE, Sep. 2016, pp. 265–271, isbn: 978-1-4673-9708-7. doi: 10.1109/MFI.2016.7849499.

[10] T. Li, X. Wang, Y. Liang, and Q. Pan, "On arithmetic average fusion and its application for distributed multi-Bernoulli multitarget tracking," IEEE Transactions on Signal Processing, vol. 68, pp. 2883–2896, 2020, issn: 1941-0476. doi: 10.1109/TSP.2020.2985643.

[11] T. Li, Y. Xin, Y. Song, E. Song, and H. Fan, "Some statistic and information-theoretic results on arithmetic average density fusion," Dec. 31, 2021. arXiv: 2110.01440 [cs, math, stat].

[12] Y. Bar-Shalom, "On the track-to-track correlation problem," IEEE Transactions on Automatic Control, vol. 26, no. 2, pp. 571–572, Apr.
1981, issn: 1558-2523. doi: 10.1109/TAC.1981.1102635.

[13] H. W. Kuhn, "The Hungarian method for the assignment problem," Naval Research Logistics Quarterly, vol. 2, no. 1, pp. 83–97, 1955, issn: 1931-9193. doi: 10.1002/nav.3800020109.

[14] L. Gao, G. Battistelli, and L. Chisci, "Fusion of labeled RFS densities with minimum information loss," IEEE Transactions on Signal Processing, vol. 68, pp. 5855–5868, 2020, issn: 1941-0476. doi: 10.1109/TSP.2020.3028496.

[15] A. Houenou, P. Bonnifait, V. Cherfaoui, and J.-F. Boissou, "A track-to-track association method for automotive perception systems," in 2012 IEEE Intelligent Vehicles Symposium, Alcalá de Henares, Madrid, Spain: IEEE, Jun. 2012, pp. 704–710, isbn: 978-1-4673-2118-1. doi: 10.1109/IVS.2012.6232261.

[16] W. Luo, J. Xing, A. Milan, X. Zhang, W. Liu, and T.-K. Kim, "Multiple object tracking: A literature review," Artificial Intelligence, vol. 293, p. 103448, Apr. 1, 2021, issn: 0004-3702. doi: 10.1016/j.artint.2020.103448.

[17] Y. Xu, X. Zhou, S. Chen, and F. Li, "Deep learning for multiple object tracking: A survey," IET Computer Vision, vol. 13, no. 4, pp. 355–368, 2019, issn: 1751-9640. doi: 10.1049/iet-cvi.2018.5598.

[18] L. Li, C. Dai, Y. Xia, and L. Svensson, "Deep fusion of multi-object densities using transformer," in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun. 2023, pp.
1–5. doi: 10.1109/ICASSP49357.2023.10096214.

[19] R. P. S. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion. Artech House, Aug. 1, 2014, 1167 pp., isbn: 978-1-60807-798-4.

[20] K. Granström, M. Fatemi, and L. Svensson, "Poisson multi-Bernoulli mixture conjugate prior for multiple extended target filtering," IEEE Transactions on Aerospace and Electronic Systems, vol. 56, no. 1, pp. 208–225, Feb. 2020, issn: 1557-9603. doi: 10.1109/TAES.2019.2920220.

[21] J. L. Williams, "Marginal multi-Bernoulli filters: RFS derivation of MHT, JIPDA, and association-based MeMBer," IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 3, pp. 1664–1687, 2015.

[22] Á. F. García-Fernández, J. L. Williams, K. Granström, and L. Svensson, "Poisson multi-Bernoulli mixture filter: Direct derivation and implementation," IEEE Transactions on Aerospace and Electronic Systems, vol. 54, no. 4, pp. 1883–1901, 2018.

[23] Y. Xia, L. Svensson, Á. F. García-Fernández, J. L. Williams, D. Svensson, and K.
Granström, "Multiple object trajectory estimation using backward simulation," IEEE Transactions on Signal Processing, vol. 70, pp. 3249–3263, 2022, issn: 1941-0476. doi: 10.1109/TSP.2022.3184794.

[24] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise," in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, ser. KDD'96, Portland, Oregon: AAAI Press, Aug. 2, 1996, pp. 226–231.

[25] S. Kullback and R. A. Leibler, "On information and sufficiency," The Annals of Mathematical Statistics, vol. 22, no. 1, pp. 79–86, 1951, issn: 0003-4851.

[26] M. Fontana, Á. F. García-Fernández, and S. Maskell, "Bernoulli merging for the Poisson multi-Bernoulli mixture filter," in 2020 IEEE 23rd International Conference on Information Fusion (FUSION), Rustenburg, South Africa: IEEE, Jul. 2020, pp. 1–8, isbn: 978-0-578-64709-8. doi: 10.23919/FUSION45008.2020.9190443.

[27] L. Gao, G. Battistelli, and L. Chisci, "Fusion of labeled RFS densities with different fields of view," IEEE Transactions on Aerospace and Electronic Systems, vol. 58, no. 6, pp. 5908–5924, 2022.

[28] J. Munkres, "Algorithms for the assignment and transportation problems," Journal of the Society for Industrial and Applied Mathematics, vol. 5, no. 1, pp. 32–38, Mar. 1957, issn: 0368-4245. doi: 10.1137/0105003.

[29] F. Bourgeois and J.-C. Lassalle, "An extension of the Munkres algorithm for the assignment problem to rectangular matrices," Communications of the ACM, vol. 14, no. 12, pp.
802–804, Dec. 1, 1971, issn: 0001-0782. doi: 10.1145/362919.362945.

[30] H. Chen, T. Kirubarajan, and Y. Bar-Shalom, "Performance limits of track-to-track fusion versus centralized estimation: Theory and application [sensor fusion]," IEEE Transactions on Aerospace and Electronic Systems, vol. 39, no. 2, pp. 386–400, Apr. 2003,