Domain Adapted Language Models

Jansson, Erik

Download

CSE 19-89 Jansson.pdf (24.12 MB)

Published

2019

Author

Jansson, Erik

Type

Master's thesis

Abstract

BERT is a recent neural network model that has proven itself a massive leap forward in natural language processing. Due to the tedious training required by this massive model, a pretrained BERT instance has been released as a high-performing starting point for further training on downstream tasks. The pretrained model has been trained on general English text and may not be optimal for applications in specialist language domains. This study examines adapting the pretrained BERT model to the specialist language domain of legal text, with classification as the downstream task of interest. The study finds that domain adaptation is most beneficial when the task-specific dataset is small, where performance can approach that of a model pretrained from scratch on legal text data. The study further presents practical guidelines for applying BERT in specialist language domains.

Subject/keywords

natural language processing, BERT, transformer, domain adaptation, language model, classification

URI

https://hdl.handle.net/20.500.12380/300390

Collections

Master's theses
