Predicting multiple chemical contexts using multi-label classification and predictors

dc.contributor.author: Lahti, Gustav
dc.contributor.author: Mårdh, Agnes
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik
dc.contributor.examiner: Dubhashi, Devdatt
dc.contributor.supervisor: Olsson, Simon
dc.date.accessioned: 2021-06-23T12:45:28Z
dc.date.available: 2021-06-23T12:45:28Z
dc.date.issued: 2021
dc.date.submitted: 2020
dc.description.abstract: Drug discovery is a time- and resource-intensive process, and machine learning is one way of speeding it up. One important task is choosing suitable conditions (solvents, catalysts, etc.) for a reaction so as to optimize the amount of product. The purpose of this thesis was to investigate ways to improve condition prediction. Here, condition prediction is limited to chemical contexts, that is, sets of conditions, and to the Buchwald-Hartwig reaction class, which is common in drug discovery. First, we evaluate two multi-label classification approaches, a neural network and a binary relevance model, for predicting several possible chemical contexts for a reaction. Second, we present a model for condition prediction for a chemical library used in parallel synthesis. Last, Venn-ABERS predictors were added on top of these models to evaluate the impact of model calibration on these tasks; however, calibrating the scores with Venn-ABERS predictors did not improve our results. All models show potential for improving condition prediction. We consider both models for the multi-label classification task to perform well, and both performed better than the naive models. The novel model for condition prediction for chemical libraries also showed good results and outperformed naive classifiers.
dc.identifier.coursecode: MPALG
dc.identifier.uri: https://hdl.handle.net/20.500.12380/302704
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: reaction prediction
dc.subject: condition prediction
dc.subject: cheminformatics
dc.subject: machine learning
dc.subject: drug development
dc.subject: multi-label classification
dc.subject: predictor
dc.subject: model calibration
dc.title: Predicting multiple chemical contexts using multi-label classification and predictors
dc.type.degree: Master's thesis
dc.type.uppsok: H
Download
Original bundle
Name: CSE 21-53 Mårdh Lahti ODR 302704.pdf
Size: 2.67 MB
Format: Adobe Portable Document Format