Taxonomic Classification of Metagenomic Short Reads

Master's thesis

Type: Master's thesis
Title: Taxonomic Classification of Metagenomic Short Reads
Authors: Wikström, Matilda
Abstract: Hospital-acquired infections are a major problem in modern healthcare, and they are becoming more difficult to treat because of increasing antibiotic resistance. Limiting the spread of serious bacterial infections requires fast diagnosis and treatment. The advent of next-generation sequencing has drastically reduced sequencing costs, making it feasible to analyze metagenomic samples taken directly from patients. This thesis evaluates three metagenomic analysis tools with regard to species identification and abundance estimation on simulated metagenomic short reads originating from 15 different species. Each tool showed its own strengths and weaknesses, but a notable weakness was the classification of reads belonging to the Streptococcus mitis group and the Mycobacterium tuberculosis complex. To improve the classification of reads from Streptococcus and Mycobacterium, we implemented a feed-forward neural network. For Streptococcus species we obtained an accuracy of 95%, while our models failed to exceed 31% accuracy for Mycobacterium species. One cause of this difference is that the pairwise BLAST identity within the species groups is around 95% for Streptococcus and 99% for Mycobacterium.
Keywords: Metagenomics, Machine Learning, Neural Network, Taxonomic Classification
Issue Date: 2020
Publisher: Chalmers tekniska högskola / Institutionen för matematiska vetenskaper
Collection: Examensarbeten för masterexamen // Master Theses
