Developing a one-to-many generation LLM for diverse, accurate and efficient retrosynthesis

dc.contributor.author: Li, Junyong
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik (sv)
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (en)
dc.contributor.examiner: Engkvist, Ola
dc.contributor.supervisor: Johansson, Richard
dc.date.accessioned: 2025-01-13T10:58:57Z
dc.date.available: 2025-01-13T10:58:57Z
dc.date.issued: 2024
dc.date.submitted:
dc.description.abstract: One of the most common applications of deep learning in cheminformatics is retrosynthesis, the task of predicting reactants given a chemical product. Since the transformer was introduced, it has been widely used for retrosynthesis. Chemformer is a transformer-based model that was first pre-trained on SMILES strings of chemical molecules and can be fine-tuned for retrosynthesis. The model achieves state-of-the-art performance on this task. The retrosynthesis task expects multiple reactant predictions. Chemformer uses beam search or multinomial search to obtain multiple predictions, which results in a lack of diversity, accuracy and efficiency. In this project, the sphere projection strategy, a one-to-many generation strategy, was applied to Chemformer to enable it to generate multiple predictions. Sphere projection achieves one-to-many generation by introducing variations of the encoder's source embedding and combining those variations with a single-prediction sampler, such as greedy search or multinomial search (multinomial size = 1). A comparison of the modified Chemformer with the sphere projection strategy against the baseline Chemformer showed that the strategy can improve diversity, accuracy and efficiency by 197%, 7% and 4% respectively for beam search, and by 101%, 2% and 17% respectively for multinomial search.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/309079
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Retrosynthesis
dc.subject: LLM
dc.subject: large-language model
dc.subject: one-to-many generation
dc.subject: machine learning
dc.subject: deep learning
dc.subject: transformer
dc.subject: diversity, accuracy
dc.subject: efficiency
dc.title: Developing a one-to-many generation LLM for diverse, accurate and efficient retrosynthesis
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Computer systems and networks (MPCSN), MSc
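
The abstract describes sphere projection only at a high level: variations of the encoder's source embedding are combined with a single-prediction sampler. The following is a minimal sketch of that general idea, not the thesis's implementation. Everything in it is an assumption made for illustration: the names `sphere_variants`, `greedy_decode` and `one_to_many` are hypothetical, the variations are built by adding Gaussian noise and rescaling each token vector back to its original norm (one possible reading of "sphere"), and a fixed projection matrix stands in for a real decoder.

```python
import numpy as np


def sphere_variants(source_emb, num_variants, noise_scale=0.1, seed=0):
    """Return perturbed copies of `source_emb` with shape (num_variants, seq_len, d_model).

    Assumption: each token vector is nudged by Gaussian noise and then rescaled
    to its original L2 norm, so every variant stays on the same sphere.
    """
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(source_emb, axis=-1, keepdims=True)
    noise = rng.normal(size=(num_variants,) + source_emb.shape)
    perturbed = source_emb[None, :, :] + noise_scale * norms * noise
    new_norms = np.linalg.norm(perturbed, axis=-1, keepdims=True)
    # Guard against division by zero for all-zero token vectors.
    return perturbed / np.maximum(new_norms, 1e-12) * norms


def greedy_decode(emb, output_proj):
    """Toy single-prediction sampler: pick the highest-scoring token per position."""
    return np.argmax(emb @ output_proj, axis=-1)


def one_to_many(source_emb, output_proj, num_variants, noise_scale=0.1, seed=0):
    """Produce one greedy prediction per embedding variant."""
    variants = sphere_variants(source_emb, num_variants, noise_scale, seed)
    return [greedy_decode(v, output_proj) for v in variants]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    emb = rng.normal(size=(5, 8))     # hypothetical encoder output: 5 tokens, d_model = 8
    proj = rng.normal(size=(8, 20))   # hypothetical decoder stand-in: vocabulary of 20
    for prediction in one_to_many(emb, proj, num_variants=4):
        print(prediction)
```

In this sketch each variant is decoded once, so the number of predictions equals the number of variants; a real system would feed each variant to the model's autoregressive decoder instead of a single matrix product.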
Download
Original bundle
Name: CSE 24-78 JL.pdf
Size: 1.19 MB
Format: Adobe Portable Document Format