Discovering Novel Chemical Reactions
Type
Master's thesis
Programme
Complex adaptive systems (MPCAS), MSc
Published
2021
Authors
Rydholm, Emma
Svensson, Emma
Abstract
Accurately predicting chemical reactions can facilitate the search for optimal synthesis routes in a chemical reaction network and, as a consequence, expedite the lengthy
drug discovery process. As an effort in this direction, this work explores
AstraZeneca’s chemical knowledge graph through two complementary analyses. In the first
part, graph-theoretic statistics are employed to gain insights into
the chemical reaction graph at AstraZeneca. Significant differences are observed between this internal reaction graph and the one based on the public dataset of United
States patents, as well as other reaction graphs discussed in the literature. In the second part, a
link prediction model is applied to and evaluated on AstraZeneca’s chemical reaction
graph in order to suggest new potential chemical reactions. To
accomplish this task, an existing link prediction model is adapted and trained. The
test results are then compared to heuristic baselines, showing that the proposed
implementation substantially exceeds what can be achieved with heuristic methods.
One of the contributions of this research is a comparison between different ways to
sample the ground truth class of non-existing links for training and evaluation. The
choice of method for this task is shown to have an impact on the final predictions.
Finally, a set of promising predicted reactions is suggested and is currently under
further investigation at AstraZeneca.
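The abstract mentions two ideas that a small example may make more concrete: scoring candidate links with a heuristic baseline, and sampling non-existing links as the negative class. The sketch below is illustrative only and is not the method used in the thesis. It assumes a tiny undirected toy graph with made-up node names, uses a common-neighbours count as one example of a heuristic, and shows two hypothetical negative-sampling strategies (uniform sampling over all non-linked pairs, and corrupting one endpoint of each existing link).

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Heuristic link score: number of neighbours shared by u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def sample_negatives_uniform(adj, k, rng):
    """Draw up to k node pairs uniformly from all pairs that are not linked."""
    nodes = sorted(adj)
    candidates = [(u, v) for u, v in combinations(nodes, 2) if v not in adj[u]]
    return rng.sample(candidates, min(k, len(candidates)))


def sample_negatives_corrupted(edges, adj, rng):
    """For each existing link (u, v), swap v for a random node not linked to u."""
    nodes = sorted(adj)
    negatives = []
    for u, _ in edges:
        options = [w for w in nodes if w != u and w not in adj[u]]
        if options:
            negatives.append((u, rng.choice(options)))
    return negatives


# Toy graph with hypothetical node names (not real chemical data).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
adj = build_adjacency(edges)
rng = random.Random(0)  # fixed seed so the sampling is repeatable

uniform_neg = sample_negatives_uniform(adj, 3, rng)
corrupted_neg = sample_negatives_corrupted(edges, adj, rng)

score_ab = common_neighbors(adj, "A", "B")  # A and B share the neighbour C
score_ae = common_neighbors(adj, "A", "E")  # A and E share no neighbour
```

The two samplers draw negatives from different distributions: the uniform one treats every non-linked pair alike, while the corrupting one keeps the source nodes of real links and therefore follows the degree distribution of the graph more closely. A difference of this kind is one plausible reason why the choice of sampling method could affect what a trained model predicts.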
Subject/keywords
Link Prediction, Graph Neural Networks, Knowledge Graph, Chemical Reaction Graph, Graph Analysis, Synthesis Prediction, Drug Discovery