On nonlinear machine learning methodology for dose-response data in drug discovery

Granbom, Klara

Download

Klara_Granbom_Master_Thesis.pdf (2.12 MB)

Type

Master's thesis

Published

2020

Author

Granbom, Klara

Abstract

This thesis investigates novel approaches to applying nonlinear methods to dose-response data in drug discovery. Such methods could generate insight and value within the field, saving resources such as time and reducing the use of animals in experiments. Methods for dimensionality reduction and visualization are investigated, as well as methods for classifying compounds into clinical classes based on therapeutic usage. The thesis builds partly upon previous research in which linear methods, based on partial least squares and principal component analysis, have been used for dimensionality reduction in drug discovery. Using results from these linear methods as a benchmark, the thesis investigates the nonlinear methods kernel partial least squares and t-distributed stochastic neighbor embedding for dimensionality reduction. Classification of compounds is investigated using the linear method multinomial logistic regression and the nonlinear methods random forest and multi-layer perceptron networks. The nonlinear dimensionality reduction methods do not reveal any distinctly new patterns or clusters compared with the linear methods. However, some results are promising to build upon in further methodology development. The best performing classification method gives results corresponding to well-known effects for 70.6% of the compounds evaluated. Moreover, the classifications of 11.8% of the compounds indicate potentially unknown effects, which are considered interesting and could be a springboard for further analysis and innovation. This classification methodology can therefore create insight and potentially high value.

Subject/keywords

drug discovery, machine learning, classification, multi-layer perceptron, random forest, dimensionality reduction, partial least squares, kernel partial least squares, t-SNE

URI

https://hdl.handle.net/20.500.12380/300963

Collection

Master's theses
