DANF: Approximate Neighborhood Function on Large Dynamic Graphs Continuously finding changes in node centrality
Examensarbete för masterexamen
Computer science – algorithms, languages and logic (MPALG), MSc
HANSEN, JOHAN NILSSON
The neighborhood function measures node centrality in graphs by measuring how many nodes a given node can reach in a certain number of steps. The neighborhood function can for example be used to find central nodes or the degree of separation. The state-of-the-art algorithm, called HyperANF (Hyper Approximate Neighborhood Function), can calculate an approximate neighborhood function for graphs with billions of nodes within hours using a standard workstation [P. Boldi, M. Rosa, and S. Vigna, “Hyperanf: Approximating the neighbourhood function of very large graphs on a budget,” CoRR, vol. abs/1011.5599, 2010]. However, it only supports static graphs. If the neighborhood function should be calculated on a dynamic graph, the algorithm has to be re-run at any change in the graph. We develop a novel algorithm called Dynamic Approximate Neighborhood Function (DANF) which extends HyperANF to support dynamic graphs. In our algorithm, all relevant nodes are updated when new edges are added to the graph. This allows a constantly updated neighborhood function for all nodes in large graphs. DANF will be used on a real-time data stream supplied by the company Meltwater, where about 2 million news articles are received per day. Rapidly changing nodes and trends are detected by tracking the nodes whose centrality change by an insertion. This is used to monitor which subjects are getting more or less popular.
Data- och informationsvetenskap , Computer and Information Science