Nonparametric Evolutionary Short Text Topic Modeling

dc.contributor.author: Ejbyfeldt, Emil
dc.contributor.department (sv): Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers)
dc.contributor.department (en): Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers)
dc.description.abstract: With the advent of social media, more information is published and more discussions happen in the form of short text. Tools that detect new topics and changes in existing topics are needed to help people understand and explore the vast amount of information available. Many current approaches do not handle short text well, and some require the number of topics to be specified beforehand. This thesis introduces a way of extending Dirichlet Process Mixture Models to handle temporal data and derives a collapsed Gibbs sampling algorithm for inference in the model. The data is divided into epochs, and data within an epoch is exchangeable. The number of clusters in each epoch is unbounded, and the model can recover the birth, death and split of clusters. Topic modeling is done by assuming that each short text belongs to a single topic. The model is evaluated on short text data to show its ability to discover topic evolution and the appearance of new topics. We also show that the model has better stability and less overfitting than previous solutions with the same abilities.
dc.subject (sv): Data- och informationsvetenskap
dc.subject (en): Computer and Information Science
dc.title: Nonparametric Evolutionary Short Text Topic Modeling
dc.type.degree (sv): Examensarbete för masterexamen
dc.type.degree (en): Master Thesis
local.programme: Complex adaptive systems (MPCAS), MSc