Interactive Change Point Detection Approaches in Time-Series
Type
Master's thesis
Abstract
Change point detection is becoming increasingly important as datasets
grow in size, since unsupervised detection algorithms can help users
process the data. This matters in particular for datasets that contain
multiple phases, which need to be separated before they can be compared.
To detect change points, a number of unsupervised algorithms based on
different principles have been developed. One approach is to define an
optimisation problem and minimise a cost function together with a penalty
function. Another approach uses Bayesian statistics to predict the probability
of a specific point being a change point. This study examines
how the algorithms are affected by features in the data, and the possibility
of incorporating user feedback and a priori knowledge about the data.
The optimisation and Bayesian approaches for offline change point detection
are studied and applied to simulated datasets as well as a real-world
multi-phase dataset. In the optimisation approach, the choice of
the cost function affects the predictions made by the algorithm. Extending
existing studies, a new type of cost function using Tikhonov
regularisation is introduced. The Bayesian approach calculates the posterior
distribution for the probability of each time step being a change point.
It uses a priori knowledge about the distance between consecutive change
points and a likelihood function with information about the segments.
A comparison of the two approaches in terms of accuracy forms
the foundation of this work. The study finds that the performance
of the change point detection algorithms is affected by the features in
the data. The approaches have previously been studied separately, and
the novelty lies in comparing the predictions made by the two approaches
in a specific setting, consisting of simulated datasets and a real-world
example.
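
To make the optimisation approach concrete, the following is a minimal
sketch of penalised change point detection, not the implementation used in
the thesis. It assumes a squared-error (L2) segment cost, a constant penalty
per change point and an exhaustive dynamic-programming search. The names
segment_cost and detect_change_points are hypothetical, and the Tikhonov
regularised cost introduced in the thesis is not reproduced here.

```python
import numpy as np


def segment_cost(cumsum, cumsum_sq, start, end):
    """Sum of squared deviations from the mean on y[start:end] (L2 cost)."""
    n = end - start
    total = cumsum[end] - cumsum[start]
    total_sq = cumsum_sq[end] - cumsum_sq[start]
    return total_sq - total * total / n


def detect_change_points(y, penalty):
    """Minimise the summed segment costs plus `penalty` per change point.

    Returns the sorted indices at which a new segment starts.
    Assumed setup: L2 cost and a constant penalty, searched exhaustively.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums let every segment cost be evaluated in constant time.
    cumsum = np.concatenate(([0.0], np.cumsum(y)))
    cumsum_sq = np.concatenate(([0.0], np.cumsum(y ** 2)))

    # best[t] is the minimal penalised cost of segmenting y[:t].
    best = np.full(n + 1, np.inf)
    best[0] = -penalty  # the first segment does not count as a change
    last = np.zeros(n + 1, dtype=int)

    for end in range(1, n + 1):
        for start in range(end):
            cost = segment_cost(cumsum, cumsum_sq, start, end)
            total = best[start] + cost + penalty
            if total < best[end]:
                best[end] = total
                last[end] = start

    # Walk back through the stored segment starts to recover the change points.
    change_points = []
    idx = n
    while idx > 0:
        idx = int(last[idx])
        if idx > 0:
            change_points.append(idx)
    return sorted(change_points)


# Noise-free two-phase signal with a mean shift at index 50.
signal = np.concatenate([np.zeros(50), 5.0 * np.ones(50)])
print(detect_change_points(signal, penalty=10.0))  # prints [50]
```

A larger penalty yields fewer predicted change points and a smaller one
yields more, which is one place where a priori knowledge or user feedback
could enter such a scheme.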
Based on the comparison of various change point detection algorithms,
several directions for future research are discussed. A potential extension
is to apply the concepts studied for offline algorithms to the corresponding
online algorithms. Other cost functions can also be explored
further, with emphasis on modified versions of the regularised cost functions
presented in this work.
Subject/keywords
Change point detection, unsupervised machine learning, optimisation, Bayesian statistics, Tikhonov regularisation.