
Catastrophic Forgetting in Language Models

dc.contributor.author: Peng, Tiantian
dc.contributor.author: Tayefeh, Shakila
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering
dc.contributor.examiner: Johansson, Richard
dc.contributor.supervisor: Naili, Marwa
dc.date.accessioned: 2026-01-16T07:22:37Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: Catastrophic forgetting remains a persistent challenge in the continual learning paradigm of neural networks, particularly in the context of pre-trained language models. This thesis investigates the phenomenon of catastrophic forgetting in large language models (LLMs), with a focus on BERT, through a series of benchmark evaluations. Specifically, we explore the effects of fine-tuning BERT on a vision-and-language dataset and subsequently evaluate its performance on GLUE and SuperGLUE tasks to assess the retention of previously learned knowledge. A brute-force approach was employed in an attempt to mitigate forgetting, involving standard fine-tuning without regularization or memory replay mechanisms. Contrary to expectations, empirical results demonstrate that the fine-tuned models exhibit degraded performance on benchmark tasks compared to the original pre-trained models, highlighting the severity of catastrophic forgetting. These findings emphasize the need for more sophisticated mitigation strategies and contribute to a deeper understanding of transfer learning limitations in current NLP systems. (A code sketch of this evaluation protocol follows the metadata record below.)
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/310896
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Catastrophic Forgetting
dc.subject: Continual Learning
dc.subject: BERT
dc.subject: Fine-Tuning
dc.subject: Transfer Learning
dc.subject: GLUE
dc.subject: SuperGLUE
dc.subject: Natural Language Processing (NLP)
dc.title: Catastrophic Forgetting in Language Models
dc.type.degree: Master's Thesis
dc.type.uppsok: H
local.programme: Data science and AI (MPDSC), MSc
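
The evaluation protocol described in the abstract can be sketched compactly. Below is a minimal illustration in Python with Hugging Face Transformers and Datasets, assuming SST-2 as the GLUE probe task; the local checkpoint path "./bert-after-vl-finetune" (standing in for BERT after vision-and-language fine-tuning) and all hyperparameters are illustrative assumptions, not the thesis's exact configuration.

import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

def sst2_accuracy(checkpoint):
    # Attach a fresh classification head to the given encoder, fit it on
    # SST-2, and return validation-set accuracy.
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=2)
    data = load_dataset("glue", "sst2").map(
        lambda batch: tokenizer(batch["sentence"], truncation=True,
                                padding="max_length", max_length=128),
        batched=True,
    )

    def accuracy(eval_pred):
        logits, labels = eval_pred
        return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="probe", num_train_epochs=1,
                               per_device_train_batch_size=32),
        train_dataset=data["train"],
        eval_dataset=data["validation"],
        compute_metrics=accuracy,
    )
    trainer.train()
    return trainer.evaluate()["eval_accuracy"]

# Score the original pre-trained encoder, then the encoder fine-tuned on the
# vision-and-language dataset (hypothetical local path); the gap between the
# two scores is the forgetting signal the thesis reports.
baseline = sst2_accuracy("bert-base-uncased")
after_ft = sst2_accuracy("./bert-after-vl-finetune")
print(f"SST-2 validation accuracy: {baseline:.3f} -> {after_ft:.3f}")

In the thesis itself, a full sweep over the GLUE and SuperGLUE task suites replaces the single SST-2 probe, but the before/after comparison is the same.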

Download

Original bundle
Name: CSE 25-133 TP ST.pdf
Size: 6.7 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission