Learning Meaningful Representations of Cells
Date
Authors
Type
Master's Thesis
Programme
Model builders
Abstract
Batch effects are a significant concern in single-cell RNA sequencing (scRNA-Seq) data analysis: variation in the data can be attributed to factors unrelated to cell type, which makes downstream analysis challenging. In this study, a neural network model is designed that uses contrastive learning and a novel loss function to learn a generalizable embedding space from scRNA-Seq data. When benchmarked against multiple established methods for scRNA-Seq integration, the model outperforms them in learning a generalizable embedding space on multiple datasets. Cell type annotation was investigated as a downstream application of the embedding space. Compared against multiple well-established cell type classifiers, the model performed competitively with the top-performing methods across multiple metrics, including accuracy, balanced accuracy, and F1 score. These findings help quantify the “meaningfulness” of the embedding space learned by the model and highlight the potential applications of the learned cellular representations. The model is currently being structured into an open-source Python package to simplify and streamline its use.
Keywords
scRNA-Seq, Deep learning, Contrastive learning, Bioinformatics, Cell type annotation, Novel cell type detection, Cell type representations, Machine learning, AI, Transformer
