Text analysis for email multi label classification

Harsha Kadam, Sanjit; Paniskaki, Kyriaki

dc.contributor.author	Harsha Kadam, Sanjit
dc.contributor.author	Paniskaki, Kyriaki
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.examiner	Dubhashi, Devdatt
dc.contributor.supervisor	Naili, Marwa
dc.date.accessioned	2020-07-08T11:24:36Z
dc.date.available	2020-07-08T11:24:36Z
dc.date.issued	2020	sv
dc.date.submitted	2020
dc.description.abstract	This master’s thesis studies multi-label text classification on a small data set of short bilingual (English and Swedish) texts, namely emails. The data set contains 5800 emails distributed among 107 classes, and most of the emails contain both languages. Several models are applied to the task: Support Vector Machines (SVM), Gated Recurrent Units (GRU), Convolutional Neural Networks (CNN), Quasi-Recurrent Neural Networks (QRNN), and Transformers. The experiments show that, in terms of weighted-average F1 score, the SVM outperforms the other models with a score of 0.96, followed by the CNN with 0.89 and the QRNN with 0.80.	sv
dc.identifier.coursecode	DATX05	sv
dc.identifier.uri	https://hdl.handle.net/20.500.12380/301402
dc.language.iso	eng	sv
dc.setspec.uppsok	Technology
dc.subject	natural language processing	sv
dc.subject	machine learning	sv
dc.subject	multi label text classification	sv
dc.subject	deep neural networks	sv
dc.subject	bilingual texts	sv
dc.subject	emails	sv
dc.subject	short texts	sv
dc.title	Text analysis for email multi label classification	sv
dc.type.degree	Master's thesis (Examensarbete för masterexamen)	sv
dc.type.uppsok	H

Original bundle

Name: CSE 20-52 Kadam.pdf
Size: 2.03 MB
Format: Adobe Portable Document Format

Collections

Master's theses (Examensarbeten för masterexamen)