Predictive AI for Hepatic Safety: A dual analysis of CYP450 time-dependent inhibition and trapping assays using supervised learning models
Type
Master's Thesis
Abstract
This work explores the development and evaluation of machine learning models for predicting toxicity-related endpoints, focusing on time-dependent inhibition of cytochrome P450 enzymes and reactivity in trapping assays (glutathione, potassium cyanide, and methoxylamine). A variety of modeling strategies were assessed, including decision trees and Chemprop neural networks in both single-task and multitask configurations. Model performance was estimated using temporally split datasets to better reflect real-world prediction scenarios. While tree-based models consistently delivered more stable and balanced results, Chemprop models were more sensitive to class imbalance, data partitioning, and representation. Attempts to mitigate these issues using data resampling techniques, additional molecular descriptors, and scaffold-based data reduction led to only limited improvements. Further analysis of feature distributions and chemical space connectivity highlighted key challenges, such as weak class separation in descriptor values and structural isolation of test compounds, especially under temporal splits. In the case of trapping assays, multitask learning failed to improve generalization, likely due to the biological heterogeneity of the endpoints. Overall, the results indicate that data limitations are the primary bottleneck. Enhancing chemical diversity, improving feature representations, and tailoring models to specific endpoint properties appear critical for achieving more robust predictions in toxicity modeling.
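As an illustration of the temporal splitting mentioned in the abstract, the following is a minimal sketch, not the thesis's actual procedure. It assumes each compound record carries a measurement date; the field names (`smiles`, `date`, `label`), the helper `temporal_split`, the 40% hold-out fraction, and the example records are all hypothetical placeholders rather than data or settings from the work.

```python
from datetime import date


def temporal_split(records, test_fraction=0.2):
    """Hold out the most recent fraction of records as the test set.

    Hypothetical helper: sorts by the assumed 'date' field so that every
    training record is no newer than any test record.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    ordered = sorted(records, key=lambda r: r["date"])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Placeholder records (arbitrary molecules and labels, not thesis data).
records = [
    {"smiles": "CCO", "date": date(2019, 3, 1), "label": 0},
    {"smiles": "c1ccccc1", "date": date(2021, 7, 15), "label": 1},
    {"smiles": "CC(=O)O", "date": date(2018, 1, 20), "label": 0},
    {"smiles": "CCN", "date": date(2022, 2, 5), "label": 0},
    {"smiles": "CCCl", "date": date(2020, 11, 30), "label": 1},
]

train, test = temporal_split(records, test_fraction=0.4)
print([r["smiles"] for r in train])
print([r["smiles"] for r in test])
```

Compared with a random split, this arrangement evaluates a model on compounds measured after everything it was trained on, which is closer to prospective use but can leave test compounds structurally distant from the training set.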
Subject/keywords
CYP450 inhibition, Trapping Assays, Machine Learning, Decision Trees, Chemprop, Toxicology Prediction, Imbalanced Data
