What is a successful antibiotic resistance gene? A conceptual model and machine learning predictions

dc.contributor.author: Einarsson, Elinor
dc.contributor.author: Torell, Stina
dc.contributor.department: Chalmers tekniska högskola / Institutionen för matematiska vetenskaper (sv)
dc.contributor.examiner: Kristiansson, Erik
dc.contributor.supervisor: Lund, David
dc.date.accessioned: 2024-02-05T13:39:40Z
dc.date.available: 2024-02-05T13:39:40Z
dc.date.issued: 2024
dc.date.submitted: 2023
dc.description.abstract: Antibiotic resistance is a global public health threat that makes bacterial infections more difficult to treat. The spread of antibiotic resistance genes (ARGs) is predominantly driven by horizontal gene transfer (HGT), which enables bacteria to share genetic information directly between cells. The ability of an ARG to spread is influenced by a range of factors, and identifying the characteristics that enable rapid dissemination of antibiotic resistance has become a popular field of research. Such knowledge facilitates the identification of ARGs that can disseminate rapidly and allows proactive measures against their dissemination to be implemented. Bioinformatics tools were used to study the prevalence of 4775 known ARGs in 867 318 bacterial genomes. A conceptual model describing the success of an ARG was developed, containing four measures of dissemination: across taxonomic barriers, across different GC environments, geographically, and to pathogenic bacteria. By taking a top-down approach that studies the success of a gene, the thesis complements research on the factors that characterize successful and rapid HGT. The conceptual model produced a success score for each ARG that reflected its overall performance across the four components. Among the ARGs found to be highly successful, the most common class was multidrug resistance, followed by aminoglycoside, β-lactam, and MLS antibiotic resistance. Furthermore, the success score, together with information about the genes, was used to investigate whether the success of an ARG can be predicted with machine learning, using a Random Forest algorithm for binary classification. The model was built to evaluate predictive performance using decreasing numbers of observations of each gene. As expected, the predictive performance of the model improved as the number of observations increased.
Based on only one observation, it was possible to predict the class of each gene with an average sensitivity of ~70% at 90% specificity, and with 250 observations a sensitivity of 98% could be attained. Sequence-related features such as gene length and codon usage were important when only a few observations of a gene were used, but as the number of observations grew, non-sequence-related features, such as the number of countries and pathogens a gene was found in, became more relevant. The thesis also includes a meta-analysis exploring the managerial and policy implications of antibiotic resistance; one finding is that policies facilitating the use of machine learning are important to implement. This study can serve as a starting point for modelling antibiotic resistance gene success, with the aim of helping to identify emerging ARGs that could become future threats.
dc.identifier.coursecode: MVEX03
dc.identifier.uri: http://hdl.handle.net/20.500.12380/307559
dc.language.iso: eng
dc.setspec.uppsok: PhysicsChemistryMaths
dc.subject: Antibiotic Resistance, Bioinformatics, Horizontal Gene Transfer, Successful ARGs, Machine Learning, Random Forest, Managerial implications
dc.title: What is a successful antibiotic resistance gene? A conceptual model and machine learning predictions
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Biotechnology (MPBIO), MSc
File: Master thesis_Elinor_Einarsson_Stina_Torell_2024.pdf (4.02 MB, Adobe Portable Document Format)
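Note on the reported metric: the abstract quotes sensitivity at a fixed 90% specificity. The sketch below is one way such a figure can be computed from classifier scores. It is not taken from the thesis; the function name, the label encoding (1 = successful gene, 0 = not successful), and the "score >= threshold" decision rule are assumptions made for illustration.

```python
from typing import Sequence


def sensitivity_at_specificity(
    labels: Sequence[int],
    scores: Sequence[float],
    target_specificity: float = 0.90,
) -> float:
    """Return the highest sensitivity reachable with specificity >= target.

    labels: 1 = positive class (assumed: "successful" gene), 0 = negative.
    scores: classifier output, where a higher value means "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    best = 0.0
    # Candidate thresholds: every observed score, plus +inf (nothing predicted positive).
    for threshold in sorted(set(scores)) + [float("inf")]:
        # Assumed decision rule: predict positive when score >= threshold.
        true_negatives = sum(1 for s in negatives if s < threshold)
        specificity = true_negatives / len(negatives)
        if specificity >= target_specificity:
            true_positives = sum(1 for s in positives if s >= threshold)
            best = max(best, true_positives / len(positives))
    return best
```

In practice the scores would come from a trained classifier (for example, the predicted class probabilities of a random forest), and the metric would be averaged over repeated train/test splits.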