Bayesian Network Fisher Kernel for Categorical Feature Spaces
Type
Master's thesis
Abstract
Similarity measures between categorical feature vectors are non-intuitive and difficult to compute, since there is no definitive way to represent the distance between two such vectors. The Fisher kernel computes similarities through an underlying statistical model, which circumvents the problem of computing distances between categorical vectors directly. A promising probabilistic model is the Bayesian network, which can capture local dependencies between variables. In this thesis, the Fisher kernel based on discrete Bayesian networks is explored in a categorical setting. This new similarity measure between categorical vectors is primarily evaluated on the task of clustering. In addition, Bayesian networks are evaluated on the task of imputation in order to address the possibility of incomplete datasets. By breaking down the structure of the Bayesian network into basic segments, the connection between the network structure and the resulting Fisher similarities was investigated. The Fisher kernel was found to have great potential, provided that a suitable network structure was used. However, this structure did not necessarily coincide with the structures learnt by conventional structure-learning methods for Bayesian networks.
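To make the idea of a Fisher kernel on a discrete Bayesian network concrete, the following is a minimal sketch and not the implementation used in the thesis. It assumes a hypothetical two-node network A -> B with integer-coded categories, Laplace-smoothed parameter estimates, scores taken with respect to the raw conditional probability table entries, and a diagonal approximation of the Fisher information. All function and variable names are illustrative.

```python
import numpy as np

def fit_cpts(data, n_a, n_b, alpha=1.0):
    """Estimate P(A) and P(B | A) from rows (a, b) with Laplace smoothing (assumed)."""
    counts_a = np.full(n_a, alpha)
    counts_ab = np.full((n_a, n_b), alpha)
    for a, b in data:
        counts_a[a] += 1
        counts_ab[a, b] += 1
    theta_a = counts_a / counts_a.sum()
    theta_b_given_a = counts_ab / counts_ab.sum(axis=1, keepdims=True)
    return theta_a, theta_b_given_a

def fisher_score(row, theta_a, theta_b_given_a):
    """Gradient of log p(a, b) with respect to every table entry."""
    a, b = row
    grad_a = np.zeros_like(theta_a)
    grad_a[a] = 1.0 / theta_a[a]
    grad_ab = np.zeros_like(theta_b_given_a)
    grad_ab[a, b] = 1.0 / theta_b_given_a[a, b]
    return np.concatenate([grad_a, grad_ab.ravel()])

def fisher_kernel(data, theta_a, theta_b_given_a):
    """Kernel matrix K[i, j] = sum_k U_i[k] * U_j[k] / I_kk (diagonal approximation)."""
    scores = np.array([fisher_score(r, theta_a, theta_b_given_a) for r in data])
    # Diagonal of the model's Fisher information for each table entry:
    # 1 / P(a) for the root, P(a) / P(b | a) for the child.
    info_diag = np.concatenate([
        1.0 / theta_a,
        (theta_a[:, None] / theta_b_given_a).ravel(),
    ])
    return (scores / info_diag) @ scores.T

# Toy data: each row is one record (a, b).
data = np.array([[0, 0], [0, 0], [0, 1], [1, 1], [1, 1], [1, 0]])
theta_a, theta_b_given_a = fit_cpts(data, n_a=2, n_b=2)
K = fisher_kernel(data, theta_a, theta_b_given_a)
```

Under these assumptions, two records that agree only on the parent A receive a partial similarity of 1 / P(a), identical records receive an additional 1 / P(a, b), and records that differ on A receive zero. This illustrates how the similarities depend on the chosen network structure.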
Subject/keywords
Bayesian network, Fisher kernel, kernel, clustering, imputation, categorical, similarity, machine learning