Active Learning and Predictive Modeling Using Uncertainty Quantification

dc.contributor.author: Blomgren, Carl
dc.contributor.author: Gummesson Svensson, Hampus
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik
dc.contributor.examiner: Kemp, Graham
dc.contributor.supervisor: Yu, Yinan
dc.contributor.supervisor: Johansson, Simon
dc.date.accessioned: 2020-09-18T14:08:32Z
dc.date.available: 2020-09-18T14:08:32Z
dc.date.issued: 2020
dc.date.submitted: 2020
dc.description.abstract: A shortcoming of current state-of-the-art machine learning algorithms in drug discovery is that they provide only a point estimate. In drug discovery, where data comes from costly and time-consuming experiments, models need to indicate the uncertainty of their outputs; otherwise, they might be used erroneously. To obtain uncertainty from the models, this thesis uses Bayesian statistical models. The objective of the thesis is twofold: (1) Investigate the use of uncertainty in active learning (AL) for predicting the observed yields of chemical reactions with different reaction conditions and reactants. Uncertainty-based AL methods were compared with methods based on design of experiments. The predictions were made using the Bayesian probabilistic matrix factorization model Macau. (2) Investigate how the induced uncertainty affects the performance of Bayesian neural networks used to predict reaction conditions. The uncertainty was used to evaluate how reliable the obtained predictions are. The networks were based on variational Bayesian methods, and Bayes by Backprop and MC dropout were compared on a severely imbalanced data set. We found that using uncertainty in active learning gives better performance with respect to absolute error and variance once a sufficient number of data points have been added to the training set. Using uncertainty also seems to yield a significantly different training set compared to randomly selected points. Bayes by Backprop achieves accuracy comparable to MC dropout; however, it struggles to predict the minority classes. This in turn affects the uncertainty estimates on the minority classes, which could indicate that MC dropout is more certain than Bayes by Backprop. To conclude, the introduction of uncertainty quantification seems to provide some valuable information to synthesis prediction models. However, future research on the quality of the uncertainty is needed to use the induced uncertainty to its full extent.
dc.identifier.coursecode: DATX05
dc.identifier.uri: https://hdl.handle.net/20.500.12380/301740
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: machine learning
dc.subject: uncertainty quantification
dc.subject: Bayesian probabilistic matrix factorization
dc.subject: Bayesian neural networks
dc.subject: Bayesian statistics
dc.subject: variational inference
dc.subject: active learning
dc.subject: drug discovery
dc.subject: synthesis prediction
dc.title: Active Learning and Predictive Modeling Using Uncertainty Quantification
dc.type.degree: Master's thesis (Examensarbete för masterexamen)
dc.type.uppsok: H
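
The abstract describes selecting new training points by model uncertainty rather than at random. As a rough illustration only, the following Python sketch ranks candidate reactions by the variance of sampled predictions and picks the most uncertain ones. It is not the thesis's Macau-based method: the variance acquisition score, the function names, the candidate ids, and the yield values are all assumptions made up for this example.

```python
import statistics


def predictive_summary(samples):
    """Return (mean, variance) of sampled predictions for one candidate."""
    return statistics.fmean(samples), statistics.pvariance(samples)


def select_most_uncertain(candidates, k):
    """Return the k candidate ids with the largest predictive variance.

    `candidates` maps a hypothetical candidate id to a list of predictions,
    e.g. draws from a Bayesian model's posterior predictive distribution.
    """
    ranked = sorted(
        candidates,
        key=lambda cid: predictive_summary(candidates[cid])[1],
        reverse=True,
    )
    return ranked[:k]


# Made-up sampled yield predictions (percent) for three unlabeled reactions.
pool = {
    "reaction_a": [50.0, 51.0, 49.0, 50.0],
    "reaction_b": [20.0, 60.0, 40.0, 80.0],
    "reaction_c": [70.0, 75.0, 65.0, 70.0],
}

# The two candidates the model is least sure about would be labeled next.
chosen = select_most_uncertain(pool, k=2)
```

In an active learning loop, the chosen candidates would be measured experimentally, added to the training set, and the model refitted before the next selection round.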
Original bundle
Name: CSE 20-55 Blomgren Gummesson Svensson.pdf
Size: 15.62 MB
Format: Adobe Portable Document Format
Description: Active Learning and Predictive Modeling Using Uncertainty Quantification