Graph Classification with Differential Privacy

Publicerad

Typ

Examensarbete för masterexamen
Master Thesis

Modellbyggare

Tidskriftstitel

ISSN

Volymtitel

Utgivare

Sammanfattning

With increasing usage of online services such as email, social networks and online shopping more data than ever before is being gathered. Many types of data for which structure, flow and relationships are important may naturally be represented by graphs. Services may want to use machine learning algorithms to gain insight about their customers or users, and to successfully use the concepts of machine learning they must take the structure and properties of graph data into account. Seemingly innocuous data could potentially be used to infer sensitive information about individuals and should be kept private, the algorithms and techniques used must therefore take privacy into consideration. In this thesis we consider the combination of machine learning and privacy by bringing together the concepts of support vector machines and differential privacy. We examine the classification of graphs by means of kernel methods and present a framework for constructing private representations of two well known graph kernels, the random walk kernel and graphlet kernels. Furthermore, to allow for classification of large graphs we present a novel sampling scheme for approximation of subgraph counts. We evaluate both sampling and classification using four real world datasets consisting of social, road and protein networks.

Beskrivning

Ämne/nyckelord

Data- och informationsvetenskap, Computer and Information Science

Citation

Arkitekt (konstruktör)

Geografisk plats

Byggnad (typ)

Byggår

Modelltyp

Skala

Teknik / material

Index

item.page.endorsement

item.page.review

item.page.supplemented

item.page.referenced