Learning the Fragment Size Distribution in Liquid Biopsy Sequencing
Published
Author
Type
Master's Thesis
Abstract
Cancer affects thousands of patients every year. Traditionally, cancer has been
analyzed through tissue biopsies obtained during surgery. However, thanks to improved
sequencing methods, liquid biopsies have become more common, as they are convenient
and make it possible to monitor cancer evolution in real time. The
aim of this project was to use computational methods to analyze fragment length distributions
from liquid biopsies by identifying characteristics related to cancer and then
filtering the data accordingly. We carried out the project using a combination of machine
learning and statistical learning. The machine learning models were trained on
two labels: one based on purity and one generated through a minimalistic
cell death model. We found characteristic information linked to cancer by
evaluating the models based on feature importance. The project resulted in one
label that was adequate for use, which led to several models outperforming relevant
baselines. As the models were somewhat flawed due to compromised results and
insufficient data, no filtering could be performed with the guarantee of removing only
healthy data. However, we still managed to find characteristic features because the
results were consistent across the models.
Subject/keywords
Cancer, liquid biopsies, ovarian cancer, statistical learning, machine learning, statistics, Python, chromosome, necrosis