Machine learning for big sequence data: Wavelet-compressed Hidden Markov Models

Bello, Luca

Download

CSE 20-67 Bello.pdf (3.42 MB)

Published

2020

Author

Bello, Luca

Type

Master's thesis

Abstract

Hidden Markov models (HMMs) are among the most important machine learning methods for the statistical analysis of sequential data, but they struggle when applied to big data. Their relative inefficiency has been addressed several times through compression techniques, applied either to the data or to the computation. This thesis explores the former, applying a wavelet-based data compression technique and adapting the main HMM algorithms from the literature accordingly: the forward, Viterbi and Baum-Welch algorithms, used to solve the evaluation, decoding and training problems respectively. Testing shows that the new technique generally yields equal or better results and achieves very large speedups on the training problem, in some cases running thousands of times faster; this makes it easy to train an HMM on big data on a commodity laptop.
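For readers unfamiliar with the evaluation problem named in the abstract, the following is a minimal sketch of the standard, uncompressed forward algorithm for a discrete HMM. It is not the wavelet-compressed variant developed in the thesis; the function name `forward_likelihood`, the parameter layout, and the toy model in the usage note are all assumptions made for illustration.

```python
def forward_likelihood(init, trans, emit, observations):
    """Return P(observations | model) for a discrete HMM.

    Hypothetical parameter layout (an assumption of this sketch):
      init[i]     = P(first state is i)
      trans[i][j] = P(next state is j | current state is i)
      emit[i][k]  = P(observing symbol k | state is i)
    `observations` is a non-empty sequence of symbol indices.
    """
    n_states = len(init)
    # alpha[i] = P(o_1..o_t, state_t = i), initialised for t = 1.
    alpha = [init[i] * emit[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        # Recursion: sum over all predecessor states, then emit `obs`.
        alpha = [
            emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    # Marginalise over the final state.
    return sum(alpha)
```

With a toy two-state model, `forward_likelihood([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]], [0, 1])` evaluates the probability of observing symbol 0 followed by symbol 1. The cost is proportional to the sequence length times the square of the number of states, which is why shortening the input through data compression can pay off on long sequences. This sketch does no rescaling, so on long sequences the probabilities would underflow; practical implementations work with scaled or log-domain values.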

Subject/keywords

machine, learning, sequence, wavelet, compression, hidden, markov, models, viterbi, training

URI

https://hdl.handle.net/20.500.12380/301733

Collections

Master's theses
