Learning to mitigate reliance on features with missing values in interpretable prediction models

dc.contributor.author: Duan, Tianyi
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik (sv)
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (en)
dc.contributor.examiner: Olsson, Simon
dc.contributor.supervisor: D. Johansson, Fredrik
dc.contributor.supervisor: Stempfle, Lena
dc.date.accessioned: 2025-04-30T10:00:19Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: In healthcare, datasets commonly contain features with missing observations. In supervised learning, predictions made from such datasets are often heavily influenced by these missing values. This thesis modifies two established machine learning algorithms to introduce two novel models: the Least Absolute Shrinkage and Selection Operator Mitigating Reliance (LASSOMR) and the Decision Tree Mitigating Reliance (DTMR). Both models are designed to reduce dependency on features with missing values at prediction time. They achieve this by penalizing features that have missing values, which decreases the model's reliance on them. Experiments on a synthetic dataset and a real-world dataset show that DTMR and LASSOMR assign larger penalties to features with larger missing ratios. As a result, the coefficient values of these features shrink, which serves the goal of relying less on features with missing values. Additionally, the models are evaluated against baseline methods on real-world datasets with missing values; the results confirm that they perform comparably while effectively mitigating reliance on features with missing values.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/309298
dc.language.iso: eng
dc.relation.ispartofseries: CSE 24-135
dc.setspec.uppsok: Technology
dc.subject: Machine Learning, Supervised Learning, Missing Values, Healthcare
dc.title: Learning to mitigate reliance on features with missing values in interpretable prediction models
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc
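
The abstract describes penalizing features more heavily the more often they are missing. The following is a minimal sketch of that idea for a lasso-style linear model, not the thesis's actual LASSOMR formulation. It assumes a per-feature L1 weight of `1 + gamma * m_j`, where `m_j` is the missing ratio of feature `j`, and it zero-imputes missing entries. The function names, the weighting formula, the `gamma` parameter, and the synthetic data are all hypothetical choices made for illustration.

```python
import random

def missing_ratios(X):
    """Fraction of missing (None) entries per feature column."""
    n, d = len(X), len(X[0])
    return [sum(row[j] is None for row in X) / n for j in range(d)]

def soft_threshold(z, t):
    """Soft-thresholding operator used by L1-penalized coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_weighted_lasso(X, y, lam=0.1, gamma=5.0, n_iter=200):
    """Coordinate descent for the objective
        (1/2n) * ||y - X b||^2 + lam * sum_j (1 + gamma * m_j) * |b_j|
    where m_j is the missing ratio of feature j (assumed weighting).
    Missing entries are zero-imputed, an assumption made for simplicity."""
    n, d = len(X), len(X[0])
    m = missing_ratios(X)
    Xf = [[0.0 if v is None else float(v) for v in row] for row in X]
    b = [0.0] * d
    resid = [float(v) for v in y]  # residual y - X b, with b = 0 initially
    col_sq = [sum(Xf[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # feature is entirely missing or constant zero
            # Correlation of feature j with the partial residual.
            rho = sum(Xf[i][j] * (resid[i] + Xf[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * (1 + gamma * m[j])) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xf[i][j] * delta
                b[j] = new_bj
    return b, m

# Synthetic data: two equally informative features, the second missing in half the rows.
random.seed(0)
n = 200
X, y = [], []
for i in range(n):
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    y.append(x1 + x2 + random.gauss(0, 0.1))
    X.append([x1, None if i % 2 == 0 else x2])

b_plain, ratios = fit_weighted_lasso(X, y, gamma=0.0)  # ordinary lasso (uniform penalty)
b_mr, _ = fit_weighted_lasso(X, y, gamma=5.0)          # missingness-weighted penalty
print("missing ratios:", ratios)
print("plain lasso:   ", b_plain)
print("weighted lasso:", b_mr)
```

Setting `gamma=0.0` recovers an ordinary lasso, so comparing the two fits shows how the extra weight shrinks the coefficient of the frequently missing feature while leaving the fully observed feature largely unaffected.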

Download

Original bundle

Name: CSE 24-135 TD.pdf
Size: 2.46 MB
Format: Adobe Portable Document Format
