Deep Learning Models for Data Integration and Surrogate Models for Interpretable Predictions with Applications in Integromics and Recommender Systems

Edwardsson, Per; Liew, Oskar

dc.contributor.author	Edwardsson, Per
dc.contributor.author	Liew, Oskar
dc.contributor.department	Chalmers tekniska högskola / Institutionen för matematiska vetenskaper	sv
dc.contributor.examiner	Kristiansson, Erik
dc.contributor.supervisor	Jörnsten, Rebecka
dc.contributor.supervisor	Held, Felix
dc.date.accessioned	2020-07-02T11:17:21Z
dc.date.available	2020-07-02T11:17:21Z
dc.date.issued	2020	sv
dc.date.submitted	2020
dc.description.abstract	Many tasks require the simultaneous analysis of multiple heterogeneous data sets, a problem known as integrative data analysis. In the past, most data integration methods assumed that the latent representations shared between the data sets were linear. Recently, Deep Collective Matrix Factorization (dCMF) was proposed as a matrix completion algorithm that can utilize auxiliary data sources without making any assumptions about the data, by modelling non-linearities using deep learning. In this thesis, we examine the performance and versatility of dCMF and propose dCMF-LIME, a framework for interpreting the model's predictions based on Local Interpretable Model-agnostic Explanations (LIME). The explanations give variable importance measures for an individual prediction and can be used to build trust in a model or to troubleshoot it. We also propose a method for unsupervised data translation, the Data Translation Network (DTN), which learns to transform data from one data set to another by first encoding the data into a shared latent domain and then reconstructing any of the learned data sets from that latent domain. dCMF outperformed our baseline methods on simulated data and on a recommendation task, but it performed poorly on our gene-disease association test, where it was outperformed by all other methods. DTN achieved the third-best performance in the same test and shows promise for future work.	sv
dc.identifier.coursecode	MVEX03	sv
dc.identifier.uri	https://hdl.handle.net/20.500.12380/301197
dc.language.iso	eng	sv
dc.setspec.uppsok	PhysicsChemistryMaths
dc.subject	Integrative data analysis, Deep learning, dCMF, CMF, Integromics	sv
dc.title	Deep Learning Models for Data Integration and Surrogate Models for Interpretable Predictions with Applications in Integromics and Recommender Systems	sv
dc.type.degree	Master's thesis	sv
dc.type.uppsok	H

Original bundle

Name: Master Thesis Oskar Liew Per Edwardsson.pdf
Size: 1.33 MB
Format: Adobe Portable Document Format

Collections

Master's theses