Domain Adapted Language Models

dc.contributor.author: Jansson, Erik
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik
dc.contributor.examiner: Haghir Chehreghani, Morteza
dc.contributor.supervisor: Johansson, Richard
dc.date.accessioned: 2019-10-03T13:52:43Z
dc.date.available: 2019-10-03T13:52:43Z
dc.date.issued: 2019
dc.date.submitted: 2019
dc.description.abstract: BERT is a recent neural network model that has proven itself a massive leap forward in natural language processing. Because training this massive model from scratch is so expensive, a pretrained BERT instance has been released as a high-performing starting point for further training on downstream tasks. The pretrained model was trained on general English text and may not be optimal for applications in specialist language domains. This study examines adapting the pretrained BERT model to the specialist language domain of legal text, with classification as the downstream task of interest. The study finds that domain adaptation is most beneficial when faced with small task-specific datasets, where performance can approach that of a model pretrained from scratch on legal text data. The study further presents practical guidelines for applying BERT in specialist language domains.
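As a minimal sketch of the workflow the abstract describes (domain-adaptive pretraining on unlabeled legal text, followed by fine-tuning on a labeled classification task), the following uses the Hugging Face Transformers and Datasets libraries; the thesis itself does not specify this tooling, and the file name "legal_corpus.txt", the checkpoint directory "bert-legal", and all hyperparameters are hypothetical placeholders:

```python
# Sketch only: continue BERT's masked-language-model pretraining on
# in-domain text, then reuse the adapted encoder for classification.
from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    BertForMaskedLM,
    BertForSequenceClassification,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Step 1: domain-adaptive pretraining on an unlabeled in-domain corpus
# ("legal_corpus.txt" is a placeholder, one document per line).
corpus = load_dataset("text", data_files={"train": "legal_corpus.txt"})
corpus = corpus.map(tokenize, batched=True, remove_columns=["text"])

mlm_model = BertForMaskedLM.from_pretrained("bert-base-uncased")
mlm_trainer = Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="bert-legal", num_train_epochs=1),
    train_dataset=corpus["train"],
    # Randomly masks 15% of tokens, the standard BERT MLM objective.
    data_collator=DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm_probability=0.15
    ),
)
mlm_trainer.train()
mlm_trainer.save_model("bert-legal")  # domain-adapted checkpoint

# Step 2: fine-tune the adapted encoder on the labeled downstream task.
# Loading from the checkpoint keeps the adapted encoder weights and
# attaches a freshly initialized classification head.
clf_model = BertForSequenceClassification.from_pretrained(
    "bert-legal", num_labels=2
)
# ... fine-tune clf_model with another Trainer on the task-specific
# labeled dataset ...
```

The point of the two-stage setup, per the abstract, is that the cheap intermediate MLM stage can recover much of the benefit of pretraining from scratch on legal text, especially when the labeled task dataset is small.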
dc.identifier.coursecode: DATX05
dc.identifier.uri: https://hdl.handle.net/20.500.12380/300390
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: natural language processing
dc.subject: BERT
dc.subject: transformer
dc.subject: domain adaptation
dc.subject: language model
dc.subject: classification
dc.title: Domain Adapted Language Models
dc.type.degree: Master's thesis (Examensarbete för masterexamen)
dc.type.uppsok: H
Original bundle: CSE 19-89 Jansson.pdf (24.12 MB, Adobe Portable Document Format)