Machine Learning for Predicting Targeted Protein Degradation

Published

Author

Type

Master's Thesis

Abstract

PROteolysis TArgeting Chimeras (PROTACs) are an emerging, high-potential therapeutic technology. PROTACs leverage the ubiquitination and proteasome processes within a cell to degrade a Protein Of Interest (POI). Designing new PROTAC molecules is challenging, however, because assessing their degradation efficacy, for instance via laboratory assays, often demands considerable expertise, cost, and time. Machine Learning (ML) and Deep Learning (DL) technologies are revolutionizing many scientific fields, including the drug development pipeline. In this thesis, we present a data collection and curation strategy, together with several candidate DL models, for predicting the degradation efficacy of PROTAC molecules. To train and evaluate our system, we propose a curated version of open-source datasets from the literature. Relevant features such as pDC50, Dmax, E3 ligase type, POI amino acid sequence, and experimental cell type are organized and parsed via a Named Entity Recognition system based on a BERT model. The curated datasets were used to develop three candidate DL models, each designed to leverage a different PROTAC representation: molecular fingerprints, molecular graphs, and tokenized SMILES. The proposed models are evaluated against an XGBoost baseline and the state-of-the-art (SOTA) model for predicting PROTAC degradation activity. Overall, our best DL models achieved a validation accuracy of 80.26% versus the SOTA model's 77.95%, and a validation Receiver Operating Characteristic Area Under the Curve (ROC AUC) of 0.849 versus the SOTA model's 0.847.

Description

Subject/keywords

Deep learning, Chemoinformatics, PROTAC, Drug design

Citation
