Bayesian Network Fisher Kernel for Categorical Feature Spaces
Date
Authors
Type
Master's thesis
Programme
Model builders
Abstract
Similarity measures between categorical feature vectors are non-intuitive and difficult to compute, since there is no definitive way of representing the distance between two such vectors. The Fisher kernel computes similarities through an underlying statistical model, which circumvents the problem of defining distances between categorical vectors directly. A promising probabilistic model for this purpose is the Bayesian network, which can capture local dependencies between variables. In this thesis, the Fisher kernel based on discrete Bayesian networks is explored in a categorical setting. This new similarity measure between categorical vectors is evaluated primarily on the task of clustering. In addition, Bayesian networks are evaluated on the task of imputation in order to address the possibility of incomplete datasets. By breaking down the structure of the Bayesian network into basic segments, the connection between the network structure and the resulting Fisher similarities is investigated. The Fisher kernel was found to have great potential, provided that a suitable network structure is used. However, this structure did not necessarily coincide with the structures learnt using conventional structure learning methods for Bayesian networks.
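As a rough illustration of the idea described in the abstract, the sketch below computes a Fisher kernel for a toy discrete Bayesian network. It is not the implementation used in the thesis. The network (A -> B), its probability tables, and the function names `fisher_score` and `fisher_kernel` are hypothetical, and the Fisher information matrix is replaced by the identity, a common simplification that is assumed here rather than taken from the thesis.

```python
import numpy as np

# Hypothetical toy network A -> B with binary A and ternary B.
# The table values are made up for illustration only.
p_a = np.array([0.6, 0.4])                    # P(A)
p_b_given_a = np.array([[0.7, 0.2, 0.1],      # P(B | A=0)
                        [0.1, 0.3, 0.6]])     # P(B | A=1)


def fisher_score(x):
    """Gradient of log P(a, b) with respect to every table entry.

    Only the entries actually used by the observation (a, b) get a
    non-zero gradient, equal to 1 / (that entry's probability).
    """
    a, b = x
    grad_a = np.zeros_like(p_a)
    grad_b = np.zeros_like(p_b_given_a)
    grad_a[a] = 1.0 / p_a[a]
    grad_b[a, b] = 1.0 / p_b_given_a[a, b]
    return np.concatenate([grad_a, grad_b.ravel()])


def fisher_kernel(x, y):
    """Inner product of Fisher scores.

    Assumption: the Fisher information matrix is approximated by the
    identity, so the kernel reduces to a plain dot product.
    """
    return float(fisher_score(x) @ fisher_score(y))


if __name__ == "__main__":
    print(fisher_kernel((0, 0), (0, 1)))  # same value of A, different B
    print(fisher_kernel((0, 0), (1, 2)))  # no shared table entries
```

In this simplified form, two vectors are similar only to the extent that they use the same table entries, and agreement on a low-probability entry is weighted more heavily than agreement on a common one. This also shows why the choice of network structure matters: the structure determines which parent and child configurations are compared.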
Keywords
Bayesian network, Fisher kernel, kernel, clustering, imputation, categorical, similarity, machine learning
