Twitter Topic Modeling

Type

Master's thesis

Abstract

Following social media discussions related to real-life events has been a topic of great interest. There is no general method for deciding whether social media discussions reflect the dynamics of the events or lead a life of their own. Existing methods for analyzing social media discussions rely on extensive manual work by domain experts and do not generalize well to discussions in languages other than English or to different kinds of events. Combining the domain expert's knowledge with data-driven approaches can lead to models that are applicable to different domains and at the same time capable of handling large amounts of social media data. In this research, we modeled the Twitter discussions about the Swedish party leader debate held in October 2013. We constructed a semiautomatic model based on Term Frequency-Inverse Document Frequency (TF-IDF) to identify and measure the debate topics on Twitter. To discover other discussions, we used Latent Dirichlet Allocation (LDA), an unsupervised learning algorithm. We evaluated the models manually with the help of a domain expert and compared the Twitter discussions to the topics the politicians talked about in the debate. The correlation between the Twitter discussions and the debate topics corresponds to the results of ongoing political science research. The political science domain expert Linn Sandberg from the Department of Political Science, University of Gothenburg, contributed to the research by defining the research question and evaluating the models.
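
As a rough illustration of the TF-IDF weighting named in the abstract, the sketch below scores a few tokenized tweets against a hand-picked term list. It is not the model built in the thesis: the toy tweets, the seed terms, and the function names `tf_idf` and `score_topic` are hypothetical, and the weighting shown (relative term frequency times the log of inverse document frequency) is only one common variant.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def score_topic(doc_weights, seed_terms):
    """Sum the TF-IDF weights of the seed terms found in one document."""
    return sum(doc_weights.get(term, 0.0) for term in seed_terms)


# Hypothetical toy data standing in for tokenized tweets.
tweets = [
    "school grades school".split(),
    "jobs tax".split(),
    "school jobs".split(),
]
# Hypothetical seed terms, as a domain expert might supply for one topic.
school_terms = {"school", "grades"}

weights = tf_idf(tweets)
school_scores = [score_topic(w, school_terms) for w in weights]
```

In this sketch the expert-supplied term list is the manual part and the weighting is the automatic part, which is one plausible reading of "semiautomatic"; the thesis itself should be consulted for the actual procedure.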

Keywords

Computer and Information Science
