Learning Meaningful Representations of Cells

Loading...
Thumbnail Image

Date

Type

Examensarbete för masterexamen
Master's Thesis

Model builders

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Batch effects are a significant concern in single-cell RNA sequencing (scRNA-Seq) data analysis, where variations in the data can be attributed to factors unrelated to cell types. This can make downstream analysis a challenging task. In this study, a neural network model is designed utilizing contrastive learning and a novel loss func tion for learning an generalizable embedding space from scRNA-Seq data. When benchmarked against multiple established methods for scRNA-Seq integration, the model outperforms existing methods in learning a generalizable embedding space on multiple datasets. A downstream application that was investigated for the embedding space was cell type annotation. When compared against multiple well established cell type classifiers, the model in this study displayed a performance competitive with top performing methods across multiple metrics, such as accuracy, balanced accuracy, and F1 score. These findings aim to quantify the “meaningfulness” of the embedding space learned by the model, and highlight the potential applications of these learned cellular representations. The model is currently being structured into an open-source Python package, simplifying and streamlining its usage.

Description

Keywords

scRNA-Seq, Deep learning, Contrastive learning, Bioinformatics, Cell type annotation, Novel cell type detection, Cell type representations, Machine learning, AI, Transformer

Citation

Architect

Location

Type of building

Build Year

Model type

Scale

Material / technology

Index

Endorsement

Review

Supplemented By

Referenced By