Explainable AI for the Transformer Model Used on Chemical Language
Type
Master's thesis
Published
2022
Authors
Bükk, Caroline
Hoang, Linda
Abstract
One of the main challenges in drug discovery is to find new molecules with desirable
properties. In recent years, using deep learning models to change the properties
of a molecule has shown promising results. This task is done by letting the model
transform the original molecule, and is often referred to as molecular optimization.
A problem with using deep learning models is that it is difficult to understand what
the model bases its decisions on. In our project, understanding what the model bases
its decisions on could provide valuable feedback to drug designers and chemists. It could
both extend their understanding of suitable transformations in different scenarios
and provide insight into how the model could be improved.
In this thesis, we have focused on explaining the Transformer model when it is used to
perform molecular optimization. As the molecules in this task are expressed in a
chemical language, the problem can be viewed as a machine translation problem.
The predicted molecule then corresponds to the translation of the input molecule
and the desirable property changes. To explain the model, we considered a set of
assumptions about what the model would focus on. The assumptions were inspired by
the chemists' intuition regarding what should influence the transformation the most.
The attention weights of the cross-attention layer were then analysed to test whether these
assumptions held. To determine whether a contribution to the transformation
could be considered important, we used relative comparisons between different parts
of the input and output.
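As a rough illustration of the kind of analysis described above, the sketch below shows how cross-attention weights can be read out of a single attention layer and how one group of input positions can be compared, in relative terms, against the rest of the input. This is a minimal sketch, not the thesis code: the dimensions, sequence lengths and the positions of the property-change tokens are made-up placeholders.

import torch
import torch.nn as nn

# Hypothetical sketch (not the thesis code): retrieving cross-attention weights
# from one attention layer, where the decoder states of the predicted molecule
# attend to the encoder states of the input molecule and property-change tokens.
embed_dim, num_heads = 256, 8
src_len, tgt_len = 60, 55  # illustrative sequence lengths

cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

encoder_states = torch.randn(1, src_len, embed_dim)  # input tokens (molecule + properties)
decoder_states = torch.randn(1, tgt_len, embed_dim)  # output tokens (predicted molecule)

# need_weights=True returns the attention weights averaged over heads,
# with shape (batch, tgt_len, src_len).
_, attn_weights = cross_attn(decoder_states, encoder_states, encoder_states,
                             need_weights=True)

# Relative comparison: for each output token, compare the attention mass placed
# on one group of input positions (here, hypothetical property-change tokens)
# against the mass placed on the rest of the input.
property_positions = [0, 1, 2]
group_mass = attn_weights[0, :, property_positions].sum(dim=-1)
rest_mass = 1.0 - group_mass
print((group_mass / rest_mass).mean())  # > 1 means the group dominates on average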
We found that, in some regards, the chemists' intuition agreed with our comparisons
of the attention weights. However, in some cases, the absolute values of the attention
weights on the important parts were still very low. For future work, we suggest
formulating additional assumptions based on the chemists' intuition, together with experiments to test
them. We also suggest using the explainability technique integrated gradients, which
could be applied in a similar way and used to verify our results.
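For reference, the following is a minimal sketch of integrated gradients applied to token embeddings of a toy model; the model, shapes, baseline and target index are all assumptions made for illustration and do not come from the thesis. The resulting per-token attribution scores could then be compared against the per-token attention mass.

import torch
import torch.nn as nn

# Minimal sketch of integrated gradients (Sundararajan et al., 2017) on a toy
# model over token embeddings; everything below is hypothetical and only
# illustrates how the technique could be applied.
def integrated_gradients(model, inputs, baseline, target_index, steps=50):
    # Interpolate between the baseline and the input, accumulating gradients
    # of the target score along the straight-line path.
    total_grads = torch.zeros_like(inputs)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (inputs - baseline)).requires_grad_(True)
        score = model(point)[0, target_index]
        grad, = torch.autograd.grad(score, point)
        total_grads += grad
    # Average gradient times the input-baseline difference gives the attributions.
    return (inputs - baseline) * total_grads / steps

embed_dim, seq_len, vocab_size = 32, 10, 100
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(seq_len * embed_dim, vocab_size))

token_embeddings = torch.randn(1, seq_len, embed_dim)
baseline = torch.zeros_like(token_embeddings)  # e.g. an all-zeros (padding) baseline
attributions = integrated_gradients(toy_model, token_embeddings, baseline,
                                    target_index=42)
# Per-token attribution scores, one per input token.
print(attributions.sum(dim=-1))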
Subject/keywords
Explainable AI, attention weights, transformer, NLP, molecular optimization, machine translation, machine learning