Sampling a Subset of Chemical Space with GNN-Based Generative Models
Master's thesis
In recent years, deep neural network models have been used in drug discovery for de novo molecular design. One relatively new area of deep learning that has seen some use in drug discovery is graph neural networks (GNNs). This thesis evaluates six GNN models for molecular graph generation. The evaluation is based on a benchmark introduced by Arús-Pous et al., which measures how well models sample a subset of chemical space. The models are also compared to existing recurrent neural network (RNN) models, which use string representations of molecules. The best-performing GNN models achieve scores comparable to those of the RNN models, although the RNN models score higher. Even though the GNN models score slightly lower on two of the training sets, they still show great potential for future use and merit further research. In addition, a data loading scheme for PyTorch is introduced, which increases training speed by loading training data from disk efficiently.
machine learning, deep learning, graph neural networks, message passing neural network, de novo molecular design, graph generation