AI-Based Toxicity Prediction as an Alternative to Animal Testing
Published
Author
Type
Master's Thesis
Model builders
Journal title
ISSN
Volume title
Publisher
Abstract
In recent years, the use of chemicals in our environment has increased significantly
due to growing demand and consumption. Consequently, large-scale chemical
regulation based on toxicological assays has been implemented to prevent
exposure-related consequences for nature and human health. Historically,
animal-based assays have been used for this purpose. However, there is now an
increasing demand to replace these animal-based assessment methods with
computer-based alternatives. Previous computer-based models have proven
unreliable and inaccurate, leading to a decrease in interest. Therefore, there is
a pressing need to develop new computer-based models for toxicity assessment.
Here, the introduction of deep learning models, particularly the transformer
architecture, has the potential to revolutionise the field. Deep neural networks
have demonstrated the ability to handle complex and high-dimensional problems,
surpassing older modelling techniques. Moreover, as the transformer has shown
promise in handling chemical structure information, there is growing interest in
its use in environmental toxicity assessment. The aim of this project was
therefore to explore the potential of transformer-based deep neural network
models for toxicity assessment.
For this project, a subset of rat and mouse in vivo toxicity assay data associated
with EC50 and LOEC measurements, as well as different administration routes,
was utilised. Three data sets were analysed, each distinguished by its hazard:
acute toxicity, carcinogenicity, or reproductive toxicity. The first type of
model, the single-DNN model, was created for each data set separately.
Subsequently, these models were expanded to the multiple-DNN model, able to
handle all three data sets simultaneously. For all models, a pre-trained RoBERTa
transformer was utilised to interpret canonicalised SMILES representations of
chemical structures, and performance was evaluated through repeated 10-fold
cross-validation. Principal Component Analysis demonstrated that the transformer
could identify patterns in chemical structures related to toxicity. Moreover, the
study found that the single-DNN model outperformed the multiple-DNN model in all
trials, likely due to the latter's increased complexity. All models exhibited
leniency towards chemicals with low measured concentrations, and to mitigate this
problem, a more stringent loss for lower concentrations was suggested. Overall,
this project demonstrated the potential and effectiveness of transformer-based
computer models for toxicity assessment, showcasing the versatility of this
technology for addressing a broad range of toxic hazards.
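The abstract does not specify the form of the more stringent loss for lower
concentrations. One possible reading, given here only as a minimal sketch, is a
squared error on log10 concentrations in which samples with a low measured
concentration receive a larger weight. The function name
`weighted_squared_error`, the parameters `pivot` and `alpha`, and the example
values are all hypothetical and are not taken from the thesis.

```python
def weighted_squared_error(log_true, log_pred, pivot=0.0, alpha=1.0):
    """Mean squared error on log10 concentrations, weighted towards low values.

    Assumption (not from the thesis): a sample whose measured log10
    concentration lies below `pivot` gets weight 1 + alpha * (pivot - value),
    so errors on more toxic chemicals are penalised more heavily.
    """
    if len(log_true) != len(log_pred):
        raise ValueError("inputs must have the same length")
    if not log_true:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for t, p in zip(log_true, log_pred):
        weight = 1.0 + alpha * max(0.0, pivot - t)
        total += weight * (t - p) ** 2
    return total / len(log_true)


# Made-up log10 concentrations, for illustration only.
log_true = [-2.0, 0.0, 2.0]   # measured
log_pred = [-1.0, 1.0, 3.0]   # predicted, each off by one log unit

plain = weighted_squared_error(log_true, log_pred, alpha=0.0)     # unweighted
weighted = weighted_squared_error(log_true, log_pred, alpha=1.0)  # low values count more
```

With `alpha=0.0` the function reduces to an ordinary mean squared error, so the
weighting can be tuned gradually. Here the same one-log-unit error costs more
for the sample measured at the lowest concentration than for the other two.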
Description
Subject/keywords
Environmental risk assessment, SMILES, RoBERTa, deep learning, artificial intelligence, transformer, toxicity