Summarization of news articles

dc.contributor.author: Beronius, Oscar
dc.contributor.department: Chalmers University of Technology / Department of Physics
dc.contributor.examiner: Granath, Mats
dc.contributor.supervisor: Nilsson Hansen, Johan
dc.contributor.supervisor: Burke, David
dc.date.accessioned: 2019-08-05T10:34:18Z
dc.date.available: 2019-08-05T10:34:18Z
dc.date.issued: 2019
dc.date.submitted: 2019
dc.description.abstract: In this work, two neural summarization models, Seq2seq and the Transformer, were implemented in several variations and evaluated on the task of abstractively summarizing news articles. Seq2seq yielded poor results, likely because it was not flexible enough to fit the data set. The Transformer yielded promising results, and the quality of its output was found to depend heavily on the quality of the input data, indicating that the implementation may be sound but that its performance is bottlenecked by the data set. For future work, specifically in developing a summarizer for clusters of documents, a recommended approach would be to combine an abstractive summarizer such as the Transformer with extractive methods. In such a case, the Transformer could be further improved by pre-training it on word embeddings such as Google BERT, or by training it on additional data sets such as CNN/Daily Mail. Finally, the evaluation metric used, ROUGE, was found to be incomplete for the given task, and it is therefore advised to explore additional evaluation metrics for summarization models.
dc.identifier.coursecode: TIFX05
dc.identifier.uri: https://hdl.handle.net/20.500.12380/300074
dc.language.iso: eng
dc.setspec.uppsok: PhysicsChemistryMaths
dc.subject: Summarization
dc.subject: Summary
dc.subject: NLP
dc.subject: Transformer
dc.subject: Attention
dc.subject: Articles
dc.subject: Long
dc.subject: Multiple
dc.subject: Seq2seq
dc.subject: Abstractive
dc.title: Summarization of news articles
dc.type.degree: Master's thesis
dc.type.uppsok: H
local.programme: Complex adaptive systems (MPCAS), MSc
Download
Original bundle
Name: Master_Thesis_Oscar_Beronius.pdf
Size: 1.99 MB
Format: Adobe Portable Document Format