Predicting multiple chemical contexts using multi-label classification and predictors


Date

Type

Master's thesis

Programme

Model builders

Abstract

Drug discovery is a time- and resource-intensive process, and machine learning is one way to speed it up. One important task is choosing suitable conditions (solvents, catalysts, etc.) for a reaction so as to optimize the amount of product obtained. The purpose of this thesis was to investigate ways to improve condition prediction. The work is limited to predicting chemical contexts, that is, sets of conditions, for the Buchwald-Hartwig reaction class, which is common in drug discovery. First, we evaluate two multi-label classification approaches, a neural network and a binary relevance model, for predicting several possible chemical contexts for a reaction. Second, we present a model for condition prediction for a chemical library used in parallel synthesis. Last, Venn-ABERS predictors were added on top of these models to evaluate the impact of model calibration on these tasks; calibrating the scores in this way did not improve our results. All models show potential for improving condition prediction. We consider both multi-label classification models to be well-performing, and both performed better than the naive models. The novel model for condition prediction for chemical libraries also showed good results and outperformed naive classifiers.
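
To make the binary relevance idea concrete, the following is a minimal sketch and not the thesis implementation. Binary relevance trains one independent binary classifier per label, so a reaction can be assigned several chemical contexts at once. The base learner (a small logistic regression), the two-feature toy inputs, and the two context labels are all assumptions made for illustration only.

```python
import math


def _sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LogisticUnit:
    """Tiny binary logistic regression trained by batch gradient descent.

    Hypothetical base learner; any binary classifier could take its place.
    """

    def __init__(self, n_features, lr=0.5, epochs=500):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
        self.epochs = epochs

    def predict_proba(self, x):
        return _sigmoid(sum(wj * xj for wj, xj in zip(self.w, x)) + self.b)

    def fit(self, X, y):
        n = len(X)
        for _ in range(self.epochs):
            grad_w = [0.0] * len(self.w)
            grad_b = 0.0
            for xi, yi in zip(X, y):
                err = self.predict_proba(xi) - yi  # gradient of the log loss
                for j, xij in enumerate(xi):
                    grad_w[j] += err * xij
                grad_b += err
            self.w = [wj - self.lr * g / n for wj, g in zip(self.w, grad_w)]
            self.b -= self.lr * grad_b / n
        return self


class BinaryRelevance:
    """One independent binary classifier per label (chemical context)."""

    def __init__(self, n_features, n_labels):
        self.units = [LogisticUnit(n_features) for _ in range(n_labels)]

    def fit(self, X, Y):
        # Column k of Y holds the 0/1 targets for label k.
        for k, unit in enumerate(self.units):
            unit.fit(X, [row[k] for row in Y])
        return self

    def predict_proba(self, x):
        return [unit.predict_proba(x) for unit in self.units]

    def predict(self, x, threshold=0.5):
        return [int(p >= threshold) for p in self.predict_proba(x)]


# Toy data (assumed): two binary reaction features, two candidate contexts.
# A row of Y may contain several 1s, which is what makes the task multi-label.
X = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]

model = BinaryRelevance(n_features=2, n_labels=2).fit(X, Y)
scores = model.predict_proba([1, 0])  # one score per context
labels = model.predict([1, 0])        # thresholded 0/1 decision per context
```

Because each label is scored independently, the per-label scores are not guaranteed to be well-calibrated probabilities. A calibration step such as the Venn-ABERS predictors mentioned above would be applied on top of scores like these.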

Description

Keywords

reaction prediction, condition prediction, cheminformatics, machine learning, drug development, multi-label classification, predictor, model calibration
