Deconvolution methods for quantification of copy number variations in liquid biopsy sequencing
Type
Master's Thesis
Program
Engineering mathematics and computational science (MPENM), MSc
Published
2024
Authors
Eriksson, Lotta
Hallin, Linnea
Abstract
Copy number variations, prevalent in cancers, are genomic alterations that result in
losses or gains of entire genomic regions. Such alterations can be assessed with inexpensive
low-pass whole-genome sequencing of liquid biopsies. This approach is promising
for tracking the evolution of a cancer in real time, because its low cost and noninvasive
nature enable frequent sampling. The DNA sequenced from liquid biopsies
is a mixture of cancer-specific DNA and DNA from healthy cells, the latter without
these alterations. Therefore, liquid biopsies can, for example, help monitor the proportion
of an emerging cancer subtype in the tumor. Lakatos et al. [10] introduced methods
for estimating the cancer proportion and the proportion of the most dominant cancer
subtype in the sample, termed purity estimation and subclonal tracking, respectively. However, our
ability to track cancer evolution is hindered by the low signal in such samples, caused
by contamination with healthy DNA and by measurement noise. We thus aim to develop
methods for denoising and deconvolution of the underlying copy number profile of the
tumor, to enhance the signal in liquid biopsy sequencing measurements.
In this work, we evaluate two frameworks for deconvoluting such samples: a denoising
autoencoder and Bayesian change point detection. We compare these methods to rolling
median-based segmentation, using the mean squared error of the reconstructed copy
number profile and the F1-score. We demonstrate that both deconvolution methods
outperform the rolling median in low-purity and noisy regions. We then implement
our methods for purity estimation and subclonal tracking, based on the methods by
Lakatos et al. and using the denoised data obtained from the previous step. In general,
we find that Bayesian change point detection outperforms the other methods, is suitable
for denoising liquid biopsy samples, and can be used for subclonal tracking. Using our
full updated pipeline, we can improve the estimation of purity and subclonal ratio values,
especially in low-purity and low-quality samples.
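To make the setting concrete, the following minimal sketch simulates a tumor copy number profile diluted by healthy DNA, applies a rolling median, and scores the result with the mean squared error. It is an illustration only and not the thesis implementation: the linear mixing model with a diploid healthy background, the Gaussian noise, the window size, and all function names (`mix_with_healthy`, `rolling_median`, `mse`) are assumptions introduced here, and the sketch smooths the signal without performing the segmentation step.

```python
import numpy as np

def mix_with_healthy(tumor_cn, purity, normal_cn=2.0):
    """Expected copy number when a fraction `purity` of the DNA is tumor-derived.

    Assumes a simple linear mixture with a diploid healthy background.
    """
    tumor_cn = np.asarray(tumor_cn, dtype=float)
    return purity * tumor_cn + (1.0 - purity) * normal_cn

def rolling_median(signal, window=5):
    """Centered rolling median with edge padding; `window` must be odd."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    signal = np.asarray(signal, dtype=float)
    half = window // 2
    padded = np.pad(signal, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)

def mse(a, b):
    """Mean squared error between two profiles of equal length."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

# Hypothetical tumor profile: neutral, gain, loss, neutral (50 bins each).
rng = np.random.default_rng(0)
tumor_cn = np.repeat([2, 3, 1, 2], 50)
purity = 0.2  # low-purity sample: only 20% of the DNA is tumor-derived

expected = mix_with_healthy(tumor_cn, purity)
noisy = expected + rng.normal(0.0, 0.1, size=expected.size)
smoothed = rolling_median(noisy, window=11)

print("MSE noisy:   ", mse(noisy, expected))
print("MSE smoothed:", mse(smoothed, expected))
```

At a purity of 0.2, a single-copy gain shifts the expected signal by only 0.2 copies, which is why the noise level matters so much in low-purity samples.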
Subject/keywords
Denoising autoencoder, Bayesian change point detection, cumulative segmented regression, genomics, copy-number variations, liquid biopsy sequencing