Automatic extractive single document summarization: An unsupervised approach

Bengtsson, Jonatan; Skeppstedt, Christoffer

dc.contributor.author	Bengtsson, Jonatan
dc.contributor.author	Skeppstedt, Christoffer
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers)	sv
dc.contributor.department	Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers)	en
dc.date.accessioned	2019-07-03T13:07:46Z
dc.date.available	2019-07-03T13:07:46Z
dc.date.issued	2013
dc.description.abstract	This thesis describes the implementation and evaluation of a system for automatic, extractive single document summarization. Three different unsupervised algorithms for sentence relevance ranking are evaluated to form the basis of this system. The first is the well-established, graph-based TextRank, the second is based on K-means clustering, and the third on one-class support vector machines (SVM). Furthermore, several variations of the original approaches are evaluated. These algorithms are, in themselves, language independent, but language-dependent text preprocessing is needed to use them in this setting. Evaluations of the system, using the de facto standard ROUGE evaluation toolkit, show that TextRank obtains the best score. The K-means approach gives competitive results, beating the predefined baselines on the main test corpus. The one-class SVM yields the worst performance of the three, but still manages to beat one of the two baselines. The system is evaluated for both English and Swedish; however, the main evaluation is done on short news articles in English. In our opinion, this system, together with domain-specific boosting, provides adequate results for the corpora tested.
dc.identifier.uri	https://hdl.handle.net/20.500.12380/174136
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	Data- och informationsvetenskap
dc.subject	Computer and Information Science
dc.title	Automatic extractive single document summarization: An unsupervised approach
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master Thesis	en
dc.type.uppsok	H

Original bundle

Name: 174136.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format
Description: Fulltext

Collections

Master's theses (Examensarbeten för masterexamen)