Gene Regulatory Networks Inference using Bidirectional Encoder Representations from Transformers

Type

Examensarbete för masterexamen
Master's Thesis

Abstract

Alzheimer's disease is characterised by complex molecular mechanisms that are only partially understood. This thesis applies a recently introduced BERT-based framework to single-cell RNA sequencing data from the newly released ROSMAP dataset, aiming to uncover potential drivers of disease onset and progression as well as candidate therapeutic targets. A pretrained transformer model (Geneformer) is fine-tuned to classify major cell types and to attempt patient classification. Geneformer achieves robust classification of major cell types, identifying molecular signals within a restricted gene set, but does not generalise effectively to patient classification. Performance is compared on reduced and full datasets to examine resource trade-offs. Molecular markers and candidate genes identified through in silico perturbation analysis are presented. Future work may integrate updated versions of Geneformer with expanded gene inclusion and a deeper architecture. This approach contributes insights into the determinants of Alzheimer's disease.

Keywords

Data science, machine learning, bioinformatics, transformers, genomics, project, thesis
