Finding Influential Examples in Deep Learning Models
Date
Authors
Type
Master's thesis
Abstract
Machine learning models are powerful but not free of errors, and the complexity
of large models makes it hard for a human to intuitively understand the cause of
an error. This thesis approaches the task of explaining predictions made by deep
learning models by studying the importance of specific examples in the training data,
referred to as influence. In practice, the embedding representation of the training
data, defined as the output from an arbitrary layer in the model, is compared to
the influence on a prediction. Two models are investigated: a logistic regression
model and a convolutional neural network. The aim of this thesis is thus to identify
influential examples in deep learning models in a computationally efficient way, by
studying the relation between the representation of the data in a network and its
influence. The main results include comparisons of various distance metrics
in the embedding representation of the images with the influence of those images.
Similar examples are shown to cluster close together, and training examples close
to a test example exhibit high influence when that test example is correctly
classified. Training examples far from their class centroid in the embedding space
also show high influence.
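The two distance quantities mentioned in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and uses made-up two-dimensional embeddings; the abstract does not state which distance metrics or layers the thesis uses, and the function names here are hypothetical. Influence itself is not computed, only the embedding-space distances it would be compared against.

```python
import numpy as np


def distance_to_test(train_emb, test_emb):
    """Euclidean distance from each training embedding to one test embedding."""
    return np.linalg.norm(train_emb - test_emb, axis=1)


def distance_to_class_centroid(train_emb, labels):
    """Euclidean distance from each training embedding to its own class centroid."""
    dists = np.empty(len(train_emb))
    for c in np.unique(labels):
        mask = labels == c
        centroid = train_emb[mask].mean(axis=0)
        dists[mask] = np.linalg.norm(train_emb[mask] - centroid, axis=1)
    return dists


# Toy embeddings, invented for illustration (not data from the thesis).
train_emb = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [10.0, 4.0]])
labels = np.array([0, 0, 1, 1])
test_emb = np.array([0.0, 0.0])

d_test = distance_to_test(train_emb, test_emb)
d_centroid = distance_to_class_centroid(train_emb, labels)

# Training examples ordered from nearest to farthest from the test example.
nearest_first = np.argsort(d_test)
```

Under these assumptions, ranking training examples by `d_test` or `d_centroid` requires only a forward pass to obtain embeddings, which is the kind of inexpensive proxy the abstract contrasts with computing influence directly.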
Keywords
influence, convolutional, network, embedding, features, similarity