Finding Influential Examples in Deep Learning Models
Published
Author
Type
Master's thesis
Abstract
Machine learning models are powerful but not without errors, and the complexity
of large models makes it hard for a human to intuitively understand the cause of
an error. This thesis approaches the task of explaining predictions made by deep
learning models by studying the importance of specific examples in the training data,
referred to as influence. In practice, the embedding representation of the training
data, defined as the output from an arbitrary layer in the model, is compared to
the influence on a prediction. Two models are investigated: a Logistic Regression
model and a Convolutional Neural Network. The aim of this thesis is thus to identify
influential examples in deep learning models in a computationally efficient way, by
studying the relation between the representation of the data in a network and its
influence. The main results include comparisons of various distance metrics in the
embedding representation of the images with the influence of those images. Similar
examples are shown to be clustered close together, and for a correctly classified
test example, the training examples close to it exhibit high influence. Training
examples far away from their class centroid in the embedding space also show high
influence.
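
The distance quantities mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `embedding_distances`, the choice of Euclidean and cosine distance, and the use of the class mean as centroid are all assumptions made for illustration. The embeddings are assumed to be already extracted from some layer of the model as a NumPy array with one row per training example.

```python
import numpy as np


def embedding_distances(train_emb, train_labels, test_emb):
    """Hypothetical helper: distances in embedding space for one test example.

    train_emb    : (n, d) array, one embedding per training example
    train_labels : (n,) array of class labels
    test_emb     : (d,) array, embedding of the test example
    Returns Euclidean distance to the test example, cosine distance to the
    test example, and distance to the centroid of each example's own class.
    """
    train_emb = np.asarray(train_emb, dtype=float)
    train_labels = np.asarray(train_labels)
    test_emb = np.asarray(test_emb, dtype=float)

    # Euclidean distance from every training embedding to the test embedding.
    euclid = np.linalg.norm(train_emb - test_emb, axis=1)

    # Cosine distance = 1 - cosine similarity (guarded against zero norms).
    norms = np.linalg.norm(train_emb, axis=1) * np.linalg.norm(test_emb)
    cosine = 1.0 - (train_emb @ test_emb) / np.maximum(norms, 1e-12)

    # Distance from each training example to the mean embedding of its class.
    centroid_dist = np.empty(len(train_emb))
    for c in np.unique(train_labels):
        mask = train_labels == c
        centroid = train_emb[mask].mean(axis=0)
        centroid_dist[mask] = np.linalg.norm(train_emb[mask] - centroid, axis=1)

    return euclid, cosine, centroid_dist
```

Under these assumptions, training examples with a small `euclid` or `cosine` value are candidates for high influence on a correctly classified test example, and those with a large `centroid_dist` are candidates for high influence in general. Comparing these rankings with separately computed influence values is the kind of analysis the abstract describes.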
Subject/keywords
influence, convolutional, network, embedding, features, similarity