Machine learning for big sequence data: Wavelet-compressed Hidden Markov Models
Date
Authors
Type
Master's thesis
Programme
Model builders
Abstract
Hidden Markov models (HMMs) are among the most important machine learning methods
for the statistical analysis of sequential data, but they struggle when applied to
big data. Their relative inefficiency has been addressed several times through
compression techniques, applied either to the data or to the computation. This thesis
explores the former, applying a data compression technique based on wavelets and
then adapting the main HMM algorithms from the literature: the forward, Viterbi and
Baum-Welch algorithms, used to solve the evaluation, decoding and training problems,
respectively. The testing phase shows that this new technique generally yields equal
or better results and obtains extremely high speedups in the training problem, in
some cases making it thousands of times faster; this makes it easy to train an HMM
on big data with a commodity laptop.
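For readers unfamiliar with the evaluation problem mentioned in the abstract, the following is a minimal sketch of the textbook forward algorithm for a discrete HMM. It is the standard uncompressed version, not the wavelet-compressed variant developed in the thesis. The function name, the argument layout (nested lists of probabilities) and the per-step rescaling used to avoid underflow are assumptions made for illustration only.

```python
import math


def forward_log_likelihood(initial, transition, emission, observations):
    """Return log P(observations | model) for a discrete HMM.

    initial[i]        : probability of starting in state i
    transition[i][j]  : probability of moving from state i to state j
    emission[i][k]    : probability of emitting symbol k in state i
    observations      : non-empty list of symbol indices
    (This layout is an illustrative assumption, not taken from the thesis.)
    """
    n_states = len(initial)
    # Forward variables for the first observation.
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    log_likelihood = 0.0
    for t, symbol in enumerate(observations):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[i] * transition[i][j] for i in range(n_states))
                * emission[j][symbol]
                for j in range(n_states)
            ]
        # Rescale so alpha sums to 1; the log of the scale factors adds up
        # to the log-likelihood of the whole sequence.
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under the model
        log_likelihood += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_likelihood
```

Each step costs time quadratic in the number of states, and there is one step per observation, which is why very long sequences become expensive and why reducing the effective sequence length through data compression can pay off.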
Keywords
machine, learning, sequence, wavelet, compression, hidden, markov, models, viterbi, training
