Large scale news article clustering

Master's thesis

Type: Master's thesis
Title: Large scale news article clustering
Authors: Yregård, Love; Lönnberg, Marcus
Abstract: In this thesis we examined different approaches to clustering news articles so that two articles covering the same information belong to the same cluster. We examined existing algorithms and pre-processing steps and also developed our own. Our requirements were that the algorithm should handle a very large number of articles, produce high-quality clusters, and do so quickly. We arrived at an algorithm that was reasonably fast and produced clusters of high quality. We also developed two optimization methods to speed up the clustering algorithms further. These methods greatly improved the runtime performance of two of the algorithms without significantly affecting cluster quality.
Keywords: Information & Communication Technology; Computer and Information Science
Issue Date: 2013
Publisher: Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers)
Collection: Master Theses
