Active Learning for Surrogate Models to Augment AI-Driven Molecular Design

JOSEFSON, CHRISTIAN; NYMAN, CLARA

Download

CSE 22-107 Josefsson Nyman.pdf (6.01 MB)

Published

2022

Authors

JOSEFSON, CHRISTIAN

NYMAN, CLARA

Type

Master's thesis

Abstract

This project investigated whether an active learning (AL) framework can reduce the computational cost of AI-driven molecular design without negatively impacting accuracy. The surrogate models Random Forest (RF) and Support Vector Regression (SVR) were tested together with the acquisition functions (AF) Random, Thompson Sampling (TS), Tanimoto Similarity, Expected Improvement (EI), Probability of Improvement (PI), Upper Confidence Bound (UCB) and ε-Greedy. Of these, the combination of RF and Random acquisition was found to perform best with regard to error, measured as root mean square error (RMSE), and time consumption, measured as runtime per epoch. SVR had a slightly lower error but took substantially longer. Depending on the choice of AF, one run using RF took approximately 2-17.5 hours, while one run using SVR took approximately 100-175 hours. Four tuning parameters were introduced to see whether they could further optimize the framework. A longer retrain interval and a smaller acquisition batch were found to shorten the time consumption without significantly impacting accuracy. To summarise, an RF model with the Random AF, a 5-epoch initial pooling, no warm-up phase, a retrain interval of 20 and an acquisition batch size of 20 was selected to mitigate computational costs while keeping the error stable.
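For readers unfamiliar with the acquisition functions named in the abstract, the following is a minimal sketch of the textbook forms of PI, EI, UCB and one possible ε-Greedy variant. It is not taken from the thesis. It assumes a maximization convention (higher predicted score is better) and a surrogate that returns a mean `mu` and an uncertainty `sigma` per candidate; the function names and the parameters `xi`, `beta` and `epsilon` are illustrative assumptions.

```python
import math
import random


def _norm_pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    # Standard normal cumulative distribution, via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu, sigma, best, xi=0.0):
    # P(f(x) > best + xi) under a Gaussian prediction N(mu, sigma^2).
    if sigma <= 0.0:
        return 1.0 if mu - best - xi > 0.0 else 0.0
    return _norm_cdf((mu - best - xi) / sigma)


def expected_improvement(mu, sigma, best, xi=0.0):
    # E[max(f(x) - best - xi, 0)] under a Gaussian prediction.
    delta = mu - best - xi
    if sigma <= 0.0:
        return max(delta, 0.0)
    z = delta / sigma
    return delta * _norm_cdf(z) + sigma * _norm_pdf(z)


def upper_confidence_bound(mu, sigma, beta=2.0):
    # Optimistic estimate: predicted mean plus beta standard deviations.
    return mu + beta * sigma


def select_batch(scores, batch_size):
    # Indices of the batch_size highest-scoring candidates.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:batch_size]


def epsilon_greedy_batch(mu, batch_size, epsilon=0.1, rng=None):
    # Assumed variant: with probability epsilon the whole batch is drawn
    # uniformly at random, otherwise the top predicted means are taken.
    rng = rng if rng is not None else random
    if rng.random() < epsilon:
        return rng.sample(range(len(mu)), batch_size)
    return select_batch(mu, batch_size)
```

With `epsilon=1.0` the last function reduces to purely random acquisition. If lower scores are better, as is common for docking scores, the signs of `mu` and `best` would need to be flipped before calling these functions.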

Subject/keywords

active learning, bayesian optimization, de novo design, molecular design, drug discovery, surrogate model, machine learning, molecular docking

URI

https://hdl.handle.net/20.500.12380/305713

Collections

Master's theses
