Catastrophic Forgetting in Language Models
| Field | Value | Language |
|---|---|---|
| dc.contributor.author | Peng, Tiantian | |
| dc.contributor.author | Tayefeh, Shakila | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
| dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
| dc.contributor.examiner | Johansson, Richard | |
| dc.contributor.supervisor | Naili, Marwa | |
| dc.date.accessioned | 2026-01-16T07:22:37Z | |
| dc.date.issued | 2025 | |
| dc.date.submitted | | |
| dc.description.abstract | Catastrophic forgetting remains a persistent challenge in the continual-learning paradigm of neural networks, particularly for pre-trained language models. This thesis investigates catastrophic forgetting in large language models (LLMs), with a focus on BERT, through a series of benchmark evaluations. Specifically, we fine-tune BERT on a vision-and-language dataset and subsequently evaluate its performance on GLUE and SuperGLUE tasks to assess the retention of previously learned knowledge. A brute-force approach, standard fine-tuning without regularization or memory-replay mechanisms, was employed in an attempt to mitigate forgetting. Contrary to expectations, empirical results show that the fine-tuned models perform worse on the benchmark tasks than the original pre-trained models, highlighting the severity of catastrophic forgetting. These findings underline the need for more sophisticated mitigation strategies and contribute to a deeper understanding of the limits of transfer learning in current NLP systems. | |
| dc.identifier.coursecode | DATX05 | |
| dc.identifier.uri | http://hdl.handle.net/20.500.12380/310896 | |
| dc.language.iso | eng | |
| dc.setspec.uppsok | Technology | |
| dc.subject | Catastrophic Forgetting | |
| dc.subject | Continual Learning | |
| dc.subject | BERT | |
| dc.subject | Fine-Tuning | |
| dc.subject | Transfer Learning | |
| dc.subject | GLUE | |
| dc.subject | SuperGLUE | |
| dc.subject | Natural Language Processing (NLP) | |
| dc.title | Catastrophic Forgetting in Language Models | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Data science and AI (MPDSC), MSc |
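
The abstract above outlines a two-step protocol: fine-tune BERT on an intermediate (vision-and-language) task, then measure what survives by training and scoring on GLUE/SuperGLUE tasks. The sketch below is a minimal illustration of that protocol, not the thesis code: it assumes the Hugging Face `transformers` and `datasets` libraries, uses SST-2 as a stand-in GLUE task, and `path/to/vl-finetuned-bert` is a hypothetical checkpoint saved after the intermediate fine-tuning step.

```python
# Minimal sketch of the fine-tune-then-benchmark protocol described in the abstract.
# Assumptions (not from the thesis): Hugging Face `transformers`/`datasets`,
# SST-2 as the GLUE probe task, and a hypothetical intermediate checkpoint path.
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)


def glue_score(checkpoint: str) -> float:
    """Fine-tune `checkpoint` on SST-2 and return validation accuracy."""
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint,
                                                               num_labels=2)

    # Tokenize without padding; the Trainer pads each batch dynamically
    # because a tokenizer is passed in below.
    data = load_dataset("glue", "sst2")
    data = data.map(lambda b: tokenizer(b["sentence"], truncation=True),
                    batched=True)

    def accuracy(eval_pred):
        logits, labels = eval_pred
        return {"accuracy": float((np.argmax(logits, -1) == labels).mean())}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1,
                               per_device_train_batch_size=32),
        train_dataset=data["train"],
        eval_dataset=data["validation"],
        tokenizer=tokenizer,
        compute_metrics=accuracy,
    )
    trainer.train()
    return trainer.evaluate()["eval_accuracy"]


# Compare the original checkpoint with the intermediately fine-tuned one;
# a drop for the latter is the catastrophic-forgetting signal.
baseline = glue_score("bert-base-uncased")
after_vl = glue_score("path/to/vl-finetuned-bert")  # hypothetical checkpoint
print(f"SST-2 accuracy: pretrained={baseline:.3f}, after VL step={after_vl:.3f}")
```

Running `glue_score` on both checkpoints and comparing the two accuracies yields the degradation the abstract reports; looping the same function over the remaining GLUE and SuperGLUE tasks (with the appropriate text fields and label counts per task) follows the same pattern.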
