
Catastrophic Forgetting in Language Models

dc.contributor.author: Peng, Tiantian
dc.contributor.author: Tayefeh, Shakila
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering
dc.contributor.examiner: Johansson, Richard
dc.contributor.supervisor: Naili, Marwa
dc.date.accessioned: 2026-01-16T07:22:37Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: Catastrophic forgetting remains a persistent challenge in the continual learning paradigm of neural networks, particularly in the context of pre-trained language models. This thesis investigates the phenomenon of catastrophic forgetting in large language models (LLMs), with a focus on BERT, through a series of benchmark evaluations. Specifically, we explore the effects of fine-tuning BERT on a vision-and-language dataset and subsequently evaluate its performance on GLUE and SuperGLUE tasks to assess the retention of previously learned knowledge. A brute-force approach was employed in an attempt to mitigate forgetting, involving standard fine-tuning without regularization or memory replay mechanisms. Contrary to expectations, empirical results demonstrate that the fine-tuned models exhibit degraded performance on benchmark tasks compared to the original pre-trained models, highlighting the severity of catastrophic forgetting. These findings emphasize the need for more sophisticated mitigation strategies and contribute to a deeper understanding of transfer learning limitations in current NLP systems. (A code sketch of this evaluation protocol follows the metadata record below.)
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/310896
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Catastrophic Forgetting
dc.subject: Continual Learning
dc.subject: BERT
dc.subject: Fine-Tuning
dc.subject: Transfer Learning
dc.subject: GLUE
dc.subject: SuperGLUE
dc.subject: Natural Language Processing (NLP)
dc.title: Catastrophic Forgetting in Language Models
dc.type.degree: Master's Thesis
dc.type.uppsok: H
local.programme: Data science and AI (MPDSC), MSc
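
The evaluation protocol described in the abstract can be sketched compactly. Below is a minimal illustration in Python with Hugging Face Transformers and Datasets, assuming SST-2 as the GLUE probe task; the local checkpoint path "./bert-after-vl-finetune" (standing in for BERT after vision-and-language fine-tuning) and all hyperparameters are illustrative assumptions, not the thesis's exact configuration.

import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

def sst2_accuracy(checkpoint):
    # Attach a fresh classification head to the given encoder, fit it on
    # SST-2, and return validation-set accuracy.
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=2)
    data = load_dataset("glue", "sst2").map(
        lambda batch: tokenizer(batch["sentence"], truncation=True,
                                padding="max_length", max_length=128),
        batched=True,
    )

    def accuracy(eval_pred):
        logits, labels = eval_pred
        return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="probe", num_train_epochs=1,
                               per_device_train_batch_size=32),
        train_dataset=data["train"],
        eval_dataset=data["validation"],
        compute_metrics=accuracy,
    )
    trainer.train()
    return trainer.evaluate()["eval_accuracy"]

# Score the original pre-trained encoder, then the encoder fine-tuned on the
# vision-and-language dataset (hypothetical local path); the gap between the
# two scores is the forgetting signal the thesis reports.
baseline = sst2_accuracy("bert-base-uncased")
after_ft = sst2_accuracy("./bert-after-vl-finetune")
print(f"SST-2 validation accuracy: {baseline:.3f} -> {after_ft:.3f}")

In the thesis itself, a full sweep over the GLUE and SuperGLUE task suites replaces the single SST-2 probe, but the before/after comparison is the same.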

Download

Original bundle
Name: CSE 25-133 TP ST.pdf
Size: 6.7 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission