Cross-tissue variance analysis of gene sets

Type

Master's Thesis

Abstract

Gene set enrichment analysis is used to investigate differences in the expression of genetic pathways in transcriptomic data. Gene set scoring methods such as GSVA and singscore are used in this analysis to assess the enrichment of groups of genes of interest, called gene sets. GSVA and singscore produce a score of how strongly a gene set is expressed relative to a reference expression, a reference that is not always accessible. In this work we use variance decomposition to investigate whether singscore and GSVA can be used to create a baseline for RNA-seq data that lacks control samples, and we apply a variational autoencoder (VAE) to predict gene set scores across tissues. To this end, variance decomposition was performed on GTEx to assess the dataset's suitability as a baseline, and a VAE was trained on GTEx with the aim of predicting gene set scores across tissues. Our results show that a reference dataset is of limited use as a baseline for RNA-seq data. The results are not conclusive enough to warrant use in applications that require the precision needed in pharmaceutical research. The VAE-based prediction performs poorly at predicting expression across tissues, and other machine learning methods should be investigated for this application.

Keywords

RNA-seq, GSVA, Transcriptomics, Bioinformatics, Variational autoencoder, GTEx, Variance decomposition
