Gene Regulatory Networks Inference using Bidirectional Encoder Representations from Transformers

Type

Master's Thesis

Abstract

Alzheimer's disease is characterised by complex molecular mechanisms that are only partially understood. This thesis applies a recent BERT-based framework to single-cell RNA sequencing data from the newly released ROSMAP dataset, aiming to uncover potential drivers of disease onset and progression as well as potential therapeutic targets. A pretrained transformer model (Geneformer) is fine-tuned to classify major cell types and to attempt patient classification. Geneformer achieves robust classification of major cell types, identifying molecular signals within a restricted gene set, but does not generalise effectively to patient classification. Performance is compared on reduced and full datasets to examine resource trade-offs. Molecular markers and candidate genes identified through in silico perturbation analysis are presented. Future work may integrate updated versions of Geneformer with expanded gene inclusion and a deeper architecture. This approach contributes insights into the determinants of Alzheimer's disease.

Subject/keywords

Data science, machine learning, bioinformatics, transformers, genomics, project, thesis
