Deep Active Learning for Swedish Named Entity Recognition: An empiric evaluation of active learning algorithms for Named Entity Recognition

Hagatulah, Nadim; Arvidsson, Kalle

dc.contributor.author	Hagatulah, Nadim
dc.contributor.author	Arvidsson, Kalle
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.examiner	Angelov, Krasimir
dc.contributor.supervisor	Johansson, Richard
dc.contributor.supervisor	Wolff, Petter
dc.date.accessioned	2021-07-02T08:10:04Z
dc.date.available	2021-07-02T08:10:04Z
dc.date.issued	2021	sv
dc.date.submitted	2021
dc.description.abstract	Named entity recognition holds promise for numerous practical applications involving text data, such as keyword extraction and automated anonymization. However, successfully training a machine learning model for named entity recognition is challenging because of the amount of annotated data required, especially when the language involved is not globally common, as is the case for Swedish. In such cases, using a deep pre-trained model such as BERT in conjunction with active learning may be preferred. To obtain some insight into the implementation of such an approach, this thesis serves as an empirical study of various active learning strategies when used in conjunction with BERT-based named entity recognition. The performance of different active learning algorithms and the effect of acquisition size on the performance of active learning are the main focus of this study. In conclusion, after comparing and evaluating 17 different active learning methods, the study's empirical results demonstrate entropy sampling to be the best-performing active learning algorithm for named entity recognition of Swedish texts, and the choice of acquisition size to have a practically negligible effect on performance.	sv
dc.identifier.uri	https://hdl.handle.net/20.500.12380/302940
dc.language.iso	eng	sv
dc.setspec.uppsok	Technology
dc.subject	Active Learning	sv
dc.subject	Deep Learning	sv
dc.subject	Transformer	sv
dc.subject	BERT	sv
dc.subject	NLP	sv
dc.subject	Named Entity Recognition	sv
dc.subject	Diversity-Based Sampling	sv
dc.subject	Uncertainty-Based Sampling	sv
dc.subject	Pool-Based Sampling	sv
dc.subject	Cumulative Training	sv
dc.title	Deep Active Learning for Swedish Named Entity Recognition: An empiric evaluation of active learning algorithms for Named Entity Recognition	sv
dc.type.degree	Master's thesis	sv
dc.type.uppsok	H
local.programme	Computer science – algorithms, languages and logic (MPALG), MSc
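
The abstract names entropy sampling as the best-performing strategy and treats acquisition size as a tunable parameter. This record does not include the thesis's implementation, so the following is only a minimal, generic sketch of pool-based entropy sampling. The function names, the toy probability values, and the choice to average token entropies per sentence are all assumptions made for illustration.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one token's predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sentence_score(token_probs):
    """Mean token entropy of a sentence (averaging is an assumed aggregation)."""
    if not token_probs:
        return 0.0
    return sum(token_entropy(p) for p in token_probs) / len(token_probs)


def entropy_sampling(pool, acquisition_size):
    """Return indices of the `acquisition_size` most uncertain pool sentences."""
    ranked = sorted(
        range(len(pool)),
        key=lambda i: sentence_score(pool[i]),
        reverse=True,
    )
    return ranked[:acquisition_size]


# Hypothetical unlabeled pool: each sentence is a list of per-token label
# distributions, as a NER model's softmax output might provide.
pool = [
    [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],  # model is confident
    [[0.34, 0.33, 0.33], [0.50, 0.25, 0.25]],  # model is uncertain
    [[0.70, 0.20, 0.10]],                      # somewhere in between
]

# Sentences chosen for annotation in one active learning round.
selected = entropy_sampling(pool, acquisition_size=2)
```

In an active learning loop, the selected sentences would be annotated, added to the training set, and the model retrained before the next round; `acquisition_size` controls how many sentences are labeled per round.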

Original bundle

Name: CSE 21-75 Hagatulah Arvidsson.pdf
Size: 2.76 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.51 KB
Format: Item-specific license agreed upon to submission

Collections

Master's theses