Text summarization using transfer learning: Extractive and abstractive summarization using BERT and GPT-2 on news and podcast data

Master's thesis

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12380/300416
Download file(s):
File: CSE 19-83 ODR Risne Siltova.pdf (2.66 MB, Adobe PDF)
Type: Master's thesis
Title: Text summarization using transfer learning: Extractive and abstractive summarization using BERT and GPT-2 on news and podcast data
Authors: RISNE, VICTOR
SIITOVA, ADÉLE
Abstract: A summary of a long document lets readers grasp its main points without needing to read the whole text. This thesis automates text summarization with two approaches: extractive and abstractive. The former combines submodular functions with the language representation model BERT, while the latter uses the language model GPT-2. We work with two datasets: CNN/DailyMail, a benchmark news article dataset, and Podcast, a dataset of podcast episode transcripts. The results obtained with GPT-2 on the CNN/DailyMail dataset are competitive with the state of the art. Besides the quantitative evaluation, we perform a qualitative investigation in the form of a human evaluation, along with an inspection of the trained model, which shows that it learns reasonable abstractions.
Keywords: transformer;BERT;GPT-2;text summarization;natural language processing
Issue Date: 2019
Publisher: Chalmers tekniska högskola / Institutionen för data och informationsvetenskap
URI: https://hdl.handle.net/20.500.12380/300416
Collection: Master Theses (Examensarbeten för masterexamen)



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.