Deep learning for prediction of antibiotic resistance genes

Master's thesis

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12380/302732
Bibliographical item details
Type: Master's thesis (Examensarbete för masterexamen)
Title: Deep learning for prediction of antibiotic resistance genes
Authors: Salomonsson, Erika
Abstract: Antibiotic resistance is a serious public health challenge because it reduces the ability to prevent and treat bacterial infections with antibiotics. Bacteria that are resistant to antibiotics carry resistance genes that are shared between cells. This flow of resistance genes is one of the main reasons behind the rapid global increase of antibiotic-resistant bacteria. Gathering information about existing resistance genes is essential, both to counter this flow and to understand which resistance genes might appear in future clinical settings. The aim of this master's thesis is to investigate whether the transformer, a relatively new deep learning model, can predict genes that confer resistance to the aminoglycoside class of antibiotics, and whether it can distinguish between five different resistance gene classes. An advantage of transformers is that they rely on attention mechanisms, which can detect global and complex dependencies in DNA structures that help characterize resistance genes. To reach this aim, the architecture and parameters of the transformer model are explored and evaluated to find the model yielding the best performance. The optimal model is then used to make predictions on a real dataset. We obtained a transformer model that could predict resistance genes with a sensitivity of 0.989 and a specificity of 0.999. Using the same model, around 0.237% of the real data were predicted as resistant. When the model was used to distinguish between resistance gene classes, the sensitivity varied between classes, with a lowest value of 0.263 and a highest value of 0.823. For all classes the specificity was higher than 0.970. One conclusion is that the performance of the transformer model depends to a great extent on the nature of the input data. The larger and more diverse the dataset, the more dependencies in the DNA structure can be captured, which implies better performance. With proper datasets, the transformer model can make classifications with very good performance.
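The abstract reports results as sensitivity and specificity. As a reminder of how these two metrics are usually defined, sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The Python sketch below computes both from true and predicted labels. It is only an illustration and not the evaluation code of the thesis: the function name, the label strings, and the toy data are hypothetical, and treating each resistance gene class one-vs-rest in the multi-class case is an assumption.

```python
import math


def sensitivity_specificity(y_true, y_pred, positive):
    """Return (sensitivity, specificity), treating `positive` as the positive class.

    All other labels are pooled as negatives (one-vs-rest), which is an
    assumption made for this sketch.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive example correctly detected
            else:
                fn += 1  # positive example missed
        else:
            if pred == positive:
                fp += 1  # negative example wrongly flagged
            else:
                tn += 1  # negative example correctly rejected
    # Return NaN when a metric is undefined (no positives or no negatives).
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Hypothetical toy labels, not data from the thesis.
y_true = ["resistant", "resistant", "other", "other", "other"]
y_pred = ["resistant", "other", "other", "other", "resistant"]
sens, spec = sensitivity_specificity(y_true, y_pred, positive="resistant")
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

For a multi-class problem, the same function can be called once per class label to obtain per-class values of the kind quoted in the abstract.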
Keywords: antibiotic resistance, resistance genes, deep learning, transformer model, predictions, sensitivity, specificity
Issue Date: 2021
Publisher: Chalmers tekniska högskola / Institutionen för matematiska vetenskaper
URI: https://hdl.handle.net/20.500.12380/302732
Collection: Examensarbeten för masterexamen // Master Theses


