Taxonomic Classification of Bacteria in Shotgun Metagenomic Samples
Type
Master's Thesis
Program
Engineering mathematics and computational science (MPENM), MSc
Published
2023
Author
Gold Rodal, Iris
Abstract
The aim was to investigate taxonomic classification and removal of human host
DNA in the context of a bioinformatic analysis pipeline for pathogen screening.
The investigation was carried out using simulated short-read shotgun
metagenomic samples. A majority of the DNA of human origin could
be separated from bacterial DNA using the k-mer-based read classifier Kraken 2 with a
custom-built, human-only reference database. Effects on taxonomic classification
performance were surveyed for variations in sample composition, parameter settings
of the taxonomic classifier, and reference database composition. Maintaining both
high precision and high recall for species-level taxonomic classification of metagenomic
samples was challenging with limited computational resources. A one-size-fits-all approach
to taxonomic classification of any shotgun metagenomic sample would be
near impossible with the tested k-mer-based classifiers (Kraken 2 and Bracken).
Instead, specialized pipeline tracks optimized for different expected ranges of species,
sequencing depths, and abundance distributions could be a solution.
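
As a rough illustration of the species-level precision and recall mentioned above, the sketch below computes both from presence/absence sets of species names. The function name and the example species lists are hypothetical and not taken from the thesis, which may define these metrics differently (for example per read or weighted by abundance).

```python
def species_precision_recall(predicted, expected):
    """Presence/absence precision and recall over species names.

    predicted: species reported by a classifier (hypothetical input)
    expected:  species known to be in the simulated sample
    """
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    # Precision: fraction of reported species that are truly present.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of truly present species that were reported.
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical example data, for illustration only.
expected = {
    "Escherichia coli",
    "Staphylococcus aureus",
    "Klebsiella pneumoniae",
}
predicted = {
    "Escherichia coli",
    "Staphylococcus aureus",
    "Shigella flexneri",
    "Escherichia fergusonii",
}

precision, recall = species_precision_recall(predicted, expected)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case two of the four reported species are correct and one of the three true species is missed, which shows how the two metrics can diverge when closely related species are reported as false positives.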
Subject/keywords
Metagenome taxonomic classification, shotgun metagenomics, WGS, Kraken 2, Bracken, taxonomic classifier, host removal.