Developing a one-to-many generation LLM for diverse, accurate and efficient retrosynthesis
dc.contributor.author | Li, Junyong | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
dc.contributor.examiner | Engkvist, Ola | |
dc.contributor.supervisor | Johansson, Richard | |
dc.date.accessioned | 2025-01-13T10:58:57Z | |
dc.date.available | 2025-01-13T10:58:57Z | |
dc.date.issued | 2024 | |
dc.date.submitted | ||
dc.description.abstract | One of the most common applications of deep learning in cheminformatics is retrosynthesis, the task of predicting reactants given a chemical product. Since the transformer architecture was introduced, it has been widely used for retrosynthesis. Chemformer is a transformer-based model that was first pre-trained on SMILES representations of chemical molecules and can then be fine-tuned for retrosynthesis, achieving state-of-the-art performance on this task. The retrosynthesis task requires multiple predictions of reactants. Chemformer uses beam search or multinomial search to obtain these multiple predictions, which limits the diversity, accuracy and efficiency of the model. In this project, the sphere projection strategy, a one-to-many generation strategy, was applied to Chemformer to enable it to generate multiple predictions. Sphere projection achieves one-to-many generation by introducing variations of the encoder's source embedding and combining those variations with a single-prediction sampler, such as greedy search or multinomial search (multinomial size = 1). Comparing the modified Chemformer with sphere projection to the baseline Chemformer showed that the strategy improves diversity, accuracy and efficiency by 197%, 7% and 4% respectively relative to beam search, and by 101%, 2% and 17% respectively relative to multinomial search. | |
dc.identifier.coursecode | DATX05 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/309079 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Retrosynthesis | |
dc.subject | LLM | |
dc.subject | large language model | |
dc.subject | one-to-many generation | |
dc.subject | machine learning | |
dc.subject | deep learning | |
dc.subject | transformer | |
dc.subject | diversity | |
dc.subject | accuracy | |
dc.subject | efficiency | |
dc.title | Developing a one-to-many generation LLM for diverse, accurate and efficient retrosynthesis | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Computer systems and networks (MPCSN), MSc |