Compatibility patterns in antibiotic resistant genes in pathogens

Type
Master's thesis
Program
Biotechnology (MPBIO), MSc
Published
2020
Author
Lindbom, Agnes
Abstract
Antibiotic resistance is increasing globally and poses a substantial threat to public health, leading to higher costs and longer hospital stays. Infections caused by antibiotic-resistant bacteria are harder to treat, and without action there is a risk that common infections will eventually become life-threatening again. Antibiotic resistance can be acquired through mutations in existing genes or through resistance genes transferred from other bacteria by horizontal gene transfer. Why some genes are transferred may be explained by the compatibility of the gene, but little is currently known about what makes a gene compatible, and thus able to be transferred and give rise to new antibiotic-resistant bacteria. The aim of this project is to investigate whether there are compatibility patterns in antibiotic resistance genes in the pathogens Escherichia coli and Klebsiella pneumoniae. This was done through three analyses: a k-mer analysis, a frequency analysis, and an analysis of the regions surrounding the genes. The project also included predictive models, built to assess the predictive ability of logistic regression. Data were collected from NCBI and ResFinder, and core genomes of the two species were used. Using BLAST, the antibiotic resistance genes were divided into groups according to whether or not they were present in each species. The k-mer analysis compared k-mer distributions for several k-mer lengths using three measures (squared Euclidean distance, absolute maximum value and maximum value) and gave similar results for both species. For shorter k-mer lengths, both species showed differences in the medians and p-values between the antibiotic resistance genes and the core genome genes; as the k-mer length increased, fewer differences were seen.
In the frequency analysis, the genes were merged into gene groups, and the values from the k-mer analysis were compared against the number of hits in the BLAST results. Many values clustered around the median: the gene groups with the most hits lay close to the median, whereas values far from the median had few hits. No clear conclusion could be drawn about the different antibiotic classes. In the analysis of the regions surrounding the genes, sequences of 100 bp upstream and downstream of each gene were extracted, and the genes were grouped according to whether they were present in both species or in only one. Differences were seen between the groups unique to each species, while the difference between the shared groups was surprisingly small. At the k-mer level, the k-mers that differed most between the species showed no clear correlation, although they could potentially be related to the higher GC content of K. pneumoniae. Predictive models were created with logistic regression for all genes and for three different antibiotic classes. The model for all genes included the length and 21 of the 64 possible k-mers. It performed better than random classification, with a best true positive rate of 72% and a false positive rate of 11%. The other models included fewer k-mers, and most of the classifiers performed better than random classification. In this project, analyses have been developed, and their results reveal compatibility patterns in the antibiotic resistance genes, both in the gene sequences and in the regions surrounding the genes. The gene sequences also made it possible to predict gene compatibility with a predictive model.
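To illustrate the squared Euclidean distance measure mentioned in the abstract, here is a minimal Python sketch. The abstract does not say how the k-mer distributions were computed or normalised, so this assumes relative frequencies over the alphabet A, C, G, T and skips k-mers containing other symbols. The function names `kmer_distribution` and `squared_euclidean` are hypothetical and are not taken from the thesis. Note that 4^3 = 64, so the "64 possible k-mers" of the predictive model are consistent with k = 3.

```python
from collections import Counter
from itertools import product


def kmer_distribution(seq, k):
    """Relative frequency of every possible DNA k-mer in seq.

    Assumption: overlapping k-mers are counted, and k-mers containing
    symbols other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def squared_euclidean(p, q):
    """Squared Euclidean distance between two k-mer distributions."""
    return sum((p[m] - q[m]) ** 2 for m in p)


# Hypothetical usage: compare a resistance gene with a core genome gene.
gene = kmer_distribution("ATGGCGTTAGCCGATTAA", 3)
core = kmer_distribution("ATGAAACTGCTGGGTTAA", 3)
distance = squared_euclidean(gene, core)
```

A smaller distance means the two sequences have more similar k-mer compositions. How such distances were summarised per gene or per group in the thesis is not stated in the abstract.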
Subject/keywords
antibiotic resistance, pathogens, horizontal gene transfer, gene compatibility, predictive model, logistic regression, bioinformatics