Molecular Optimization using Deep Learning: Extensions of the Transformer for Molecular Optimization

Type
Master's thesis
Published
2020
Authors
Forsberg, Marcus
Mattsson, Felix
Abstract
In recent years, developments in deep learning have provided new approaches to molecular optimization. Molecular optimization aims to find molecules that are structurally similar to a given starting molecule and that show specified improvements in one or more molecular properties. By representing molecules as SMILES, an approach that encodes molecules as strings of tokens, molecular optimization can be framed as a machine translation problem in which starting molecules are translated into molecules with optimized properties. Previous work has shown that the Transformer, known from natural language processing, can be applied successfully to molecular optimization [1, 2]. This thesis covers two extensions of the Transformer model developed in [1]: curriculum learning and the Core-Fixed formulation. In curriculum learning, training is structured as a sequence of tasks (a curriculum) of increasing difficulty. The curriculum can be determined either while training a model (machine-based) or manually (human heuristic-based). The thesis explores various approaches to human heuristic-based curriculum learning. For the other extension, the Core-Fixed formulation, the thesis reformulates the input and output of the original model [1] so that the input to the translation model specifies which part of the molecule should be kept fixed (the core) and which part should be exchanged (the R-group) to optimize the properties of the complete molecule. The results show advantages in both training time and molecule generation performance when using the Core-Fixed formulation. For curriculum learning, the results do not indicate a clear improvement, and the thesis suggests looking into more sophisticated curriculum learning approaches.
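To make the two ideas above more concrete, the following is a minimal Python sketch, not the thesis implementation. It shows how a SMILES string can be split into tokens and how training pairs could be grouped into curriculum stages of increasing difficulty. The regular expression, the function names `tokenize` and `build_curriculum`, and the use of target length as the difficulty heuristic are all assumptions made for illustration; the abstract does not state which heuristics the thesis uses.

```python
import re

# Simplified SMILES token pattern (an assumption, not the thesis tokenizer):
# bracket atoms, two-letter halogens, two-digit ring closures, single
# letters, digits, and common bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+()/\\.@:~*$]"
)


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Guard against characters the simplified pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def build_curriculum(pairs, n_stages, difficulty):
    """Sort (source, target) pairs by a difficulty score and split them
    into at most n_stages consecutive stages, easiest first."""
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    if not pairs:
        return []
    ranked = sorted(pairs, key=difficulty)
    stage_size = -(-len(ranked) // n_stages)  # ceiling division
    return [ranked[i:i + stage_size] for i in range(0, len(ranked), stage_size)]


def target_length(pair):
    """Hypothetical difficulty heuristic: number of tokens in the target."""
    return len(tokenize(pair[1]))


if __name__ == "__main__":
    pairs = [("CCO", "CCN"), ("c1ccccc1", "c1ccccc1Cl"), ("C", "CO")]
    for stage in build_curriculum(pairs, n_stages=2, difficulty=target_length):
        print(stage)
```

In this sketch a model would be trained on the stages in order, starting with the easiest. Any other score, such as a measure of how much the target differs from the source, could be passed as `difficulty` instead.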
Subject/keywords
Molecular Optimization, Matched Molecular Pairs, Transformer, AD-MET, Master's Thesis