Informed regularization: Aiding the identification of spurious correlations

Möller, Fredrik

dc.contributor.author	Möller, Fredrik
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.examiner	Johansson, Richard
dc.contributor.supervisor	Johansson, Richard
dc.date.accessioned	2021-03-08T07:12:59Z
dc.date.available	2021-03-08T07:12:59Z
dc.date.issued	2021	sv
dc.date.submitted	2020
dc.description.abstract	Today, end-to-end neural networks with deep and complex architectures are common tools in natural language processing. With these methods, it has become harder to identify which inputs contributed most to a model’s classification. This leads to the problem of models overfitting on features that a developer cannot directly identify. To open up the black box of complex deep learning natural language processing systems, this study investigates what information can be extracted from the data used to train a model and how the model’s inputs are weighted during prediction. The thesis presents methods that can aid in identifying differences between the population a developer intends to model with a data set and the correlations a model draws from the true content of the data. Three novel methods are presented that can aid a developer in identifying spurious correlations; with them, it was possible to present information regarding a spurious correlation between two pre-selected keywords and a model’s classification. It was also shown that identifying and reducing spurious correlations is difficult. Results showed that, after the spurious correlation associated with the selected keyword was reduced, the model formed another correlation that could also be considered spurious.	sv
dc.identifier.coursecode	MPSYS	sv
dc.identifier.uri	https://hdl.handle.net/20.500.12380/302254
dc.language.iso	eng	sv
dc.setspec.uppsok	Technology
dc.subject	NLP	sv
dc.subject	Explainability	sv
dc.subject	Regularization	sv
dc.subject	Layer-wise relevance propagation	sv
dc.subject	TF-IDF	sv
dc.subject	NCOF	sv
dc.title	Informed regularization: Aiding the identification of spurious correlations	sv
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.uppsok	H

Original bundle

Name: CSE 21-06 Möller.pdf
Size: 1.88 MB
Format: Adobe Portable Document Format

Collections

Examensarbeten för masterexamen