Developing a one-to-many generation LLM for diverse, accurate and efficient retrosynthesis
dc.contributor.author | Li, Junyong | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
dc.contributor.examiner | Engkvist, Ola | |
dc.contributor.supervisor | Johansson, Richard | |
dc.date.accessioned | 2025-01-13T10:58:57Z | |
dc.date.available | 2025-01-13T10:58:57Z | |
dc.date.issued | 2024 | |
dc.date.submitted | ||
dc.description.abstract | One of the most common applications of deep learning in cheminformatics is retrosynthesis, the task of predicting reactants given a chemical product. Since the transformer architecture was introduced, it has been widely used for retrosynthesis. Chemformer is a transformer-based model that was first pre-trained on SMILES representations of chemical molecules and can then be fine-tuned for retrosynthesis, achieving state-of-the-art performance on this task. The retrosynthesis task requires multiple predictions of reactants. Chemformer uses beam search or multinomial search to obtain these multiple predictions, which limits the diversity, accuracy and efficiency of the model. In this project, the sphere projection strategy, a one-to-many generation strategy, was applied to Chemformer to enable it to generate multiple predictions. Sphere projection achieves one-to-many generation by introducing variations of the encoder's source embedding and combining those variations with a single-prediction sampler, such as greedy search or multinomial search (multinomial size = 1). Comparing the modified Chemformer with sphere projection to the baseline Chemformer showed that the strategy improves diversity, accuracy and efficiency by 197%, 7% and 4% respectively relative to beam search, and by 101%, 2% and 17% respectively relative to multinomial search. | |
dc.identifier.coursecode | DATX05 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/309079 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Retrosynthesis | |
dc.subject | LLM | |
dc.subject | large language model | |
dc.subject | one-to-many generation | |
dc.subject | machine learning | |
dc.subject | deep learning | |
dc.subject | transformer | |
dc.subject | diversity | |
dc.subject | accuracy | |
dc.subject | efficiency | |
dc.title | Developing a one-to-many generation LLM for diverse, accurate and efficient retrosynthesis | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Computer systems and networks (MPCSN), MSc |