On nonlinear machine learning methodology for dose-response data in drug discovery

Master's thesis

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12380/300963
Type: Master's thesis
Title: On nonlinear machine learning methodology for dose-response data in drug discovery
Authors: Granbom, Klara
Abstract: This thesis investigates novel approaches to applying nonlinear methodology to dose-response data in drug discovery. Such methodology could create insight and value within the field and save resources such as time and the use of animals in experiments. Methods for dimensionality reduction and visualization are investigated, as are methods for classifying compounds into clinical classes based on therapeutic usage. The thesis builds partly on previous research in which linear methods, based on partial least squares and principal component analysis, have been used for dimensionality reduction in drug discovery. Using results from the linear methods as a benchmark, the thesis investigates two nonlinear methods for dimensionality reduction: kernel partial least squares and t-distributed stochastic neighbor embedding. Classification of compounds is investigated using the linear method multinomial logistic regression and the nonlinear methods random forest and multi-layer perceptron networks. Compared with the linear methodology, the nonlinear methods for dimensionality reduction do not reveal any distinctly new patterns or clusters, although some results are promising to build upon in further methodology development. The best-performing classification method gives results that correspond to well-known effects for 70.6% of the compounds evaluated. In addition, the classifications of 11.8% of the compounds indicate potentially unknown effects, which are considered interesting and could be a springboard for further analysis and innovation. This classification methodology can therefore create insight and potentially high value.
Keywords: drug discovery, machine learning, classification, multi-layer perceptron, random forest, dimensionality reduction, partial least squares, kernel partial least squares, t-SNE
Issue Date: 2020
Publisher: Chalmers tekniska högskola / Institutionen för matematiska vetenskaper
URI: https://hdl.handle.net/20.500.12380/300963
Collection: Master Theses
