Master's theses

Recently published

Showing 1 - 5 of 378
  • Item
    What is a successful antibiotic resistance gene? A conceptual model and machine learning predictions
    (2024) Einarsson, Elinor; Torell, Stina; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Kristiansson, Erik; Lund, David
    Antibiotic resistance is a global public health threat that makes bacterial infections more difficult to treat. The spread of antibiotic resistance genes (ARGs) is predominantly driven by horizontal gene transfer (HGT), which enables bacteria to share genetic information directly between cells. The ability of an ARG to spread is influenced by a range of factors and has become a popular field of research that aims to find characteristics enabling rapid dissemination of antibiotic resistance. Such knowledge facilitates the identification of ARGs able to disseminate rapidly and allows proactive measures against dissemination to be implemented. Bioinformatics tools were used to study the prevalence of 4775 known ARGs in 867 318 bacterial genomes. A conceptual model describing the success of an ARG was developed, containing four measures of dissemination: across taxonomic barriers, across different GC environments, geographical dissemination, and dissemination to pathogenic bacteria. By using a top-down approach that studies the success of a gene, the thesis complements research on factors that characterize successful and rapid HGT. The conceptual model resulted in a success score for each ARG that reflected its overall performance in the four components. Among the ARGs found to be highly successful, the most common class was multidrug resistance, followed by aminoglycoside, β-lactam, and MLS antibiotic resistance. Furthermore, the success score, together with information about the genes, was used to investigate whether the success of an ARG can be predicted with machine learning, using a random forest algorithm for binary classification. The model was built to evaluate the predictive performance using decreasing numbers of observations of each gene. As expected, the predictive performance of the model improved as the number of observations increased. Based on only one observation, it was possible to predict the class of each gene with an average sensitivity of ~70% at 90% specificity, and with 250 observations a sensitivity of 98% could be attained. Sequence-related features such as gene length and codon usage were important when only a few observations of a gene were used, but as the number of observations grew, non-sequence-related features, such as the number of countries and pathogens a gene was found in, became more relevant. A meta-analysis also explores the managerial and policy implications of antibiotic resistance; its findings include that policies facilitating machine learning are important to implement. This study can be used as a starting point in the modelling of antibiotic resistance gene success, aiming to help identify emerging ARGs that could become future threats.
  • Item
    Taxonomic Classification of Bacteria in Shotgun Metagenomic Samples
    (2023) Gold Rodal, Iris; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Kristiansson, Erik; Aspelin, Oscar
    The aim was to investigate taxonomic classification and removal of human host DNA in the context of a bioinformatic analysis pipeline for pathogen screening. The examination was carried out using simulated short-read shotgun metagenomic samples. It was found that the majority of DNA of human origin could be separated from bacterial DNA using the k-mer based read classifier Kraken 2 with a custom-built, human-only reference database. Effects on taxonomic classification performance were surveyed for variations in sample composition, parameter settings of the taxonomic classifier, and reference database composition. Maintaining both high precision and high recall for species-level taxonomic classification of metagenomic samples was challenging with limited computational resources. A one-size-fits-all approach to taxonomic classification of any shotgun metagenomic sample would be nearly impossible with the tested k-mer based classifiers (Kraken 2 and Bracken); instead, specialized pipeline tracks optimized for different expected ranges of species, sequencing depths, and abundance distributions could be a solution.
  • Item
    Road condition classification from CCTV images using machine learning
    (2023) Askbom, Ludvig; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Persson, Daniel; Persson, Daniel
    Understanding and categorizing road conditions is crucial for driver safety and road maintenance. This research explores practical approaches to classify road conditions using images from CCTV stations. Two classification challenges are addressed: distinguishing between snowy and non-snowy conditions and between snowy, wet, and dry conditions. The thesis evaluates various machine learning methods for road condition classification on multiple CCTV stations, including established and novel approaches. Established methods involve feature extraction through texture analysis and fine-tuning convolutional neural networks and vision transformers. Novel contributions include training an image segmentation model for road segmentation and utilizing persistent homology for feature extraction. Notably, this thesis sets itself apart by separating data into training and test sets based on CCTV stations. This is important to evaluate the methods’ and models’ abilities to generalize to new CCTV stations. The best-performing model, a fine-tuned vision transformer, achieved accuracies of 87.9% and 75.3% for classifying snow/no snow and snow/wet/dry, respectively. These results underscore the complexity of the classification problem and highlight the effectiveness of deep learning models for large-scale road condition classification based on images.
  • Item
    Static solutions of the Einstein-Dirac system for an increasing number of particles behave as solutions of the Einstein-Vlasov system
    (2023) Blomqvist, Joakim; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Andréasson, Håkan; Andréasson, Håkan
    In this thesis we study static solutions to the spherically symmetric Einstein-Dirac system. This system couples Einstein’s theory of general relativity to Dirac’s relativistic description of quantum mechanics. The goal was to study the transition from a quantum mechanical description to a classical description by comparing properties of the solutions to the Einstein-Dirac system with solutions of the Einstein-Vlasov system as the number of particles of the former system increases. In 1999, Finster et al. [10] found static solutions to the Einstein-Dirac system for the first time, in the case of two fermions with opposite spins. Recently this study has been extended to a larger number of particles by Leith et al. [14]. In particular, they construct highly relativistic solutions. The structure of the solutions is strikingly similar to the structure of highly relativistic solutions of the Einstein-Vlasov system. In both cases multi-peak solutions are obtained, and moreover, the maximum compactness of the solutions is very similar. The compactness is measured by the quantity m/r, where m is the mass and r the areal radius, and in both cases the maximum value appears to be 4/9. Furthermore, in quantum mechanics the pressure may be negative, whereas classically it is non-negative. We find that already for 16 particles the pressure is non-negative and thus behaves classically. In order to compare the solutions, we need to construct solutions to the Einstein-Dirac system numerically in the case of a large number of particles. This requires a delicate procedure with significant numerical precision as the number of particles in the system grows.
  • Item
    Cross-tissue variance analysis of gene sets
    (2023) Thune, Oskar; Kööhler, Mauritz; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Kristiansson, Erik; Angermann, Bastian
    Gene set enrichment is used to investigate differences in gene expression for genetic pathways in transcriptomic data. Gene set scoring methods like GSVA and singscore are used in gene set enrichment analysis to assess the enrichment of groups of genes of interest, called gene sets. GSVA and singscore produce a score of how strongly a gene set is expressed relative to a reference expression, a reference that is not always accessible. In this work we apply variance decomposition to investigate the use of singscore and GSVA to create a baseline for RNA-seq data that lacks control samples, and we apply a VAE to predict gene set scores across tissues. To this end, variance decomposition was done on GTEx to assess the dataset’s use as a baseline, and a VAE was trained on GTEx with the aim of predicting gene set scores across tissues. Our results show that a reference dataset is of limited use as a baseline for RNA-seq data. The results are not conclusive enough to warrant usage in applications with the precision needed in pharmaceutical research. The VAE-based prediction shows lacklustre results in predicting expression across tissues, and other machine learning methods should be investigated for this application.