Scalable Katz Ranking Computation in Large Static and Dynamic Graphs

Authors: Alexander van der Grinten, Elisabetta Bergamini, Oded Green, David A. Bader, Henning Meyerhenke

Abstract: Network analysis defines a number of centrality measures to identify the most central nodes in a network. Fast computation of those measures is a major challenge in algorithmic network analysis. Aside from closeness and betweenness, Katz centrality is one of the established centrality measures. In this paper, we consider the problem of computing rankings for Katz centrality. In particular, we propose upper and lower bounds on the Katz score of a given node. While previous approaches relied on numerical approximation or heuristics to compute Katz centrality rankings, we construct an algorithm that iteratively improves those upper and lower bounds until a correct Katz ranking is obtained. We extend our algorithm to dynamic graphs while maintaining its correctness guarantees. Experiments demonstrate that our static graph algorithm outperforms both numerical approaches and heuristics with speedups between 1.5x and 3.5x, depending on the desired quality guarantees. Our dynamic graph algorithm improves upon the static algorithm for update batches of fewer than 10000 edges. We provide efficient parallel CPU and GPU implementations of our algorithms that enable near real-time Katz centrality computation for graphs with hundreds of millions of nodes in fractions of seconds.

Submitted 10 July, 2018; originally announced July 2018.

Comments: Published at ESA'18

arXiv:1710.01144

Scaling up Group Closeness Maximization

Authors: Elisabetta Bergamini, Tanya Gonser, Henning Meyerhenke

Abstract: Closeness is a widely-used centrality measure in social network analysis. For a node it indicates the reciprocal of the average shortest-path distance to the other nodes of the network. While the identification of the k nodes with highest closeness received significant attention, many applications are actually interested in finding a group of nodes that is central as a whole. For this problem, only recently a greedy algorithm has been proposed [Chen et al., ADC 2016]. The approximation factor of (1 - 1/e) proposed by Chen et al. for this algorithm does not hold, though, as we show in this version of our paper. Since their implementation of the greedy algorithm was still too slow for large networks, Chen et al. also proposed a heuristic without approximation guarantee. In the present paper we develop new techniques to speed up the greedy algorithm. Compared to the previous implementation, our approach is orders of magnitude faster and, compared to the heuristic proposed by Chen et al., we always find a solution with better quality in a comparable running time in our experiments. Our method Greedy++ allows us to estimate the group with maximum closeness on networks with up to hundreds of millions of edges in minutes or at most a few hours. The greedy approach by [Chen et al., ADC 2016] would take several days already on networks with hundreds of thousands of edges. Our experiments show that the solution found by Greedy++ is actually very close to the optimum (...)
Note: This paper version fixes the issue of relying on the presumed (but incorrect) submodularity of group closeness. While this has implications for the theoretical assessment of the greedy algorithm, our algorithm variant and its implementation remain unaffected. The reason is that Greedy++ relies (among others) on the supermodularity of farness, which does hold.

Submitted 15 May, 2019; v1 submitted 3 October, 2017; originally announced October 2017.

Comments: A previous version of this paper appeared in the Proc. of 20th SIAM Workshop on Algorithm Engineering and Experiments (ALENEX 2018)
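
For orientation, the plain greedy strategy for group closeness can be sketched as follows. This is a naive baseline written for illustration, not Greedy++ and not the implementation of Chen et al.; it assumes a connected, unweighted, undirected graph given as an adjacency dict, and the function names are hypothetical. Group farness is taken to be the sum of distances from the group to all other nodes.

```python
from collections import deque


def group_farness(adj, group):
    """Sum of shortest-path distances from the group to every node (multi-source BFS)."""
    dist = {s: 0 for s in group}
    queue = deque(group)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(dist.values())


def greedy_group(adj, k):
    """Pick k nodes one at a time, each time minimizing the resulting group farness."""
    group = []
    for _ in range(k):
        candidates = [v for v in adj if v not in group]
        # Ties are broken by node id so the result is deterministic.
        best = min(candidates, key=lambda v: (group_farness(adj, group + [v]), v))
        group.append(best)
    return group


# Path graph 0-1-2-3-4-5-6.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
chosen = greedy_group(path, 2)
```

On this path the greedy choice starts with the middle node and ends with farness 8, while the group {1, 5} reaches farness 6, which shows that the greedy result need not be optimal. Each greedy step here runs one BFS per candidate, which is the cost that faster methods aim to avoid.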

arXiv:1710.01143

Computing Top-k Closeness Centrality in Fully-dynamic Graphs

Authors: Patrick Bisenius, Elisabetta Bergamini, Eugenio Angriman, Henning Meyerhenke

Abstract: Closeness is a widely-studied centrality measure. Since it requires all pairwise distances, computing closeness for all nodes is infeasible for large real-world networks. However, for many applications, it is only necessary to find the k most central nodes and not all closeness values. Prior work has shown that computing the top-k nodes with highest closeness can be done much faster than computing closeness for all nodes in real-world networks. However, for networks that evolve over time, no dynamic top-k closeness algorithm exists that improves on static recomputation. In this paper, we present several techniques that allow us to efficiently compute the k nodes with highest (harmonic) closeness after an edge insertion or an edge deletion. Our algorithms use information obtained during earlier computations to omit unnecessary work. However, they do not require asymptotically more memory than the static algorithms (i.e., linear in the number of nodes). We propose separate algorithms for complex networks (which exhibit the small-world property) and networks with large diameter such as street networks, and we compare them against static recomputation on a variety of real-world networks. On many instances, our dynamic algorithms are two orders of magnitude faster than recomputation; on some large graphs, we even reach average speedups between $10^3$ and $10^4$.

Submitted 3 October, 2017; originally announced October 2017.

Comments: Accepted for publication at the 20th SIAM Workshop on Algorithm Engineering and Experiments (ALENEX 2018)

arXiv:1704.08592

Faster Betweenness Centrality Updates in Evolving Networks

Authors: Elisabetta Bergamini, Henning Meyerhenke, Mark Ortmann, Arie Slobbe

Abstract: Finding central nodes is a fundamental problem in network analysis. Betweenness centrality is a well-known measure which quantifies the importance of a node based on the fraction of shortest paths going through it. Due to the dynamic nature of many of today's networks, algorithms that quickly update centrality scores have become a necessity. For betweenness, several dynamic algorithms have been proposed over the years, targeting different update types (incremental- and decremental-only, fully-dynamic). In this paper we introduce a new dynamic algorithm for updating betweenness centrality after an edge insertion or an edge weight decrease. Our method is a combination of two independent contributions: a faster algorithm for updating pairwise distances as well as the number of shortest paths, and a faster algorithm for updating dependencies. Whereas the worst-case running time of our algorithm is the same as recomputation, our techniques considerably reduce the number of operations performed by existing dynamic betweenness algorithms.

Submitted 27 April, 2017; originally announced April 2017.

Comments: Accepted at the 16th International Symposium on Experimental Algorithms (SEA 2017)

arXiv:1704.01077

Computing top-k Closeness Centrality Faster in Unweighted Graphs

Authors: Elisabetta Bergamini, Michele Borassi, Pierluigi Crescenzi, Andrea Marino, Henning Meyerhenke

Abstract: Given a connected graph $G=(V,E)$, the closeness centrality of a vertex $v$ is defined as $\frac{n-1}{\sum_{w \in V} d(v,w)}$. This measure is widely used in the analysis of real-world complex networks, and the problem of selecting the $k$ most central vertices has been deeply analysed in the last decade. However, this problem is computationally not easy, especially for large networks: in the first part of the paper, we prove that it is not solvable in time $\mathcal{O}(|E|^{2-\epsilon})$ on directed graphs, for any constant $\epsilon>0$, under reasonable complexity assumptions. Furthermore, we propose a new algorithm for selecting the $k$ most central nodes in a graph: we experimentally show that this algorithm significantly improves on both the textbook algorithm, which is based on computing the distance between all pairs of vertices, and the state of the art. For example, we are able to compute the top $k$ nodes in a few dozen seconds in real-world networks with millions of nodes and edges. Finally, as a case study, we compute the $10$ most central actors in the IMDB collaboration network, where two actors are linked if they played together in a movie, and in the Wikipedia citation network, which contains a directed edge from a page $p$ to a page $q$ if $p$ contains a link to $q$.

Submitted 27 April, 2017; v1 submitted 4 April, 2017; originally announced April 2017.
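
The closeness definition in this abstract, $\frac{n-1}{\sum_{w \in V} d(v,w)}$, corresponds directly to the textbook approach of one BFS per vertex. The following minimal sketch shows that baseline, not the faster algorithm proposed in the paper; it assumes a connected, unweighted graph given as an adjacency dict, and the function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj, v):
    """(n - 1) divided by the sum of distances from v; assumes a connected graph."""
    dist = bfs_distances(adj, v)
    return (len(adj) - 1) / sum(dist.values())


def top_k_closeness(adj, k):
    """Textbook top-k: compute closeness for all vertices, then sort."""
    scores = {v: closeness(adj, v) for v in adj}
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]


# Path graph 0-1-2-3-4: the middle vertex has the highest closeness.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
top3 = top_k_closeness(path, 3)
```

This baseline performs one BFS per vertex, so its cost grows with the product of vertex and edge counts, which is what makes top-$k$ selection expensive on large networks.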

arXiv:1702.05284

Improving the betweenness centrality of a node by adding links

Authors: Elisabetta Bergamini, Pierluigi Crescenzi, Gianlorenzo D'Angelo, Henning Meyerhenke, Lorenzo Severini, Yllka Velaj

Abstract: Betweenness is a well-known centrality measure that ranks the nodes according to their participation in the shortest paths of a network. In several scenarios, having a high betweenness can have a positive impact on the node itself. Hence, in this paper we consider the problem of determining how much a vertex can increase its centrality by creating a limited number of new edges incident to it. In particular, we study the problem of maximizing the betweenness score of a given node -- Maximum Betweenness Improvement (MBI) -- and that of maximizing the ranking of a given node -- Maximum Ranking Improvement (MRI). We show that MBI cannot be approximated in polynomial-time within a factor $(1-\frac{1}{2e})$ and that MRI does not admit any polynomial-time constant factor approximation algorithm, both unless $P=NP$. We then propose a simple greedy approximation algorithm for MBI with an almost tight approximation ratio and we test its performance on several real-world networks. We experimentally show that our algorithm substantially increases both the betweenness score and the ranking of a given node and that it outperforms several competitive baselines. To speed up the computation of our greedy algorithm, we also propose a new dynamic algorithm for updating the betweenness of one node after an edge insertion, which might be of independent interest. Using the dynamic algorithm, we are now able to compute an approximation of MBI on networks with up to $10^5$ edges in most cases in a matter of seconds or a few minutes.

Submitted 1 August, 2018; v1 submitted 17 February, 2017; originally announced February 2017.

Comments: Accepted to ACM Journal of Experimental Algorithmics (JEA)

arXiv:1607.02955

Estimating Current-Flow Closeness Centrality with a Multigrid Laplacian Solver

Authors: Elisabetta Bergamini, Michael Wegner, Dimitar Lukarski, Henning Meyerhenke

Abstract: Matrices associated with graphs, such as the Laplacian, lead to numerous interesting graph problems expressed as linear systems. One field where Laplacian linear systems play a role is network analysis, e.g., for certain centrality measures that indicate if a node (or an edge) is important in the network. One such centrality measure is current-flow closeness. To allow network analysis workflows to profit from a fast Laplacian solver, we provide an implementation of the LAMG multigrid solver in the NetworKit package, facilitating the computation of current-flow closeness values or related quantities. Our main contribution consists of two algorithms that significantly accelerate the current-flow computation for one node or a reasonably small node subset. One sampling-based algorithm provides an unbiased estimation of the related electrical farness, the other one is based on the Johnson-Lindenstrauss transform. Our inexact algorithms lead to very accurate results in practice. Thanks to them one is now able to compute an estimation of current-flow closeness of one node on networks with tens of millions of nodes and edges within seconds or a few minutes. From a network analytical point of view, our experiments indicate that current-flow closeness can discriminate among different nodes significantly better than traditional shortest-path closeness and is also considerably more resistant to noise -- we thus show that two known drawbacks of shortest-path closeness are alleviated by the current-flow variant.

Submitted 6 November, 2020; v1 submitted 11 July, 2016; originally announced July 2016.

Comments: Conference version published in Proceedings of SIAM CSC 2016

arXiv:1510.07971

Approximating Betweenness Centrality in Fully-dynamic Networks

Authors: Elisabetta Bergamini, Henning Meyerhenke

Abstract: Betweenness is a well-known centrality measure that ranks the nodes of a network according to their participation in shortest paths. Since an exact computation is prohibitive in large networks, several approximation algorithms have been proposed. Besides that, recent years have seen the publication of dynamic algorithms for efficient recomputation of betweenness in networks that change over time. In this paper we propose the first betweenness centrality approximation algorithms with a provable guarantee on the maximum approximation error for dynamic networks. Several new intermediate algorithmic results contribute to the respective approximation algorithms: (i) new upper bounds on the vertex diameter, (ii) the first fully-dynamic algorithm for updating an approximation of the vertex diameter in undirected graphs, and (iii) an algorithm with lower time complexity for updating single-source shortest paths in unweighted graphs after a batch of edge actions. Using approximation, our algorithms are the first to make in-memory computation of betweenness in dynamic networks with millions of edges feasible. Our experiments show that our algorithms can achieve substantial speedups compared to recomputation, up to several orders of magnitude. Moreover, the approximation accuracy is usually significantly better than the theoretical guarantee in terms of absolute error. More importantly, for reasonably small approximation error thresholds, the rank of nodes is well preserved, in particular for nodes with high betweenness.

Submitted 27 October, 2015; originally announced October 2015.

Comments: arXiv admin note: substantial text overlap with arXiv:1504.07091, arXiv:1409.6241

arXiv:1504.07091

Fully-dynamic Approximation of Betweenness Centrality

Authors: Elisabetta Bergamini, Henning Meyerhenke

Abstract: Betweenness is a well-known centrality measure that ranks the nodes of a network according to their participation in shortest paths. Since an exact computation is prohibitive in large networks, several approximation algorithms have been proposed. Besides that, recent years have seen the publication of dynamic algorithms for efficient recomputation of betweenness in evolving networks. In previous work we proposed the first semi-dynamic algorithms that recompute an approximation of betweenness in connected graphs after batches of edge insertions. In this paper we propose the first fully-dynamic approximation algorithms (for weighted and unweighted undirected graphs that need not be connected) with a provable guarantee on the maximum approximation error. The transfer to fully-dynamic and disconnected graphs implies additional algorithmic problems that could be of independent interest. In particular, we propose a new upper bound on the vertex diameter for weighted undirected graphs. For both weighted and unweighted graphs, we also propose the first fully-dynamic algorithms that keep track of such an upper bound. In addition, we extend our former algorithm for semi-dynamic BFS to batches of both edge insertions and deletions. Using approximation, our algorithms are the first to make in-memory computation of betweenness in fully-dynamic networks with millions of edges feasible. Our experiments show that they can achieve substantial speedups compared to recomputation, up to several orders of magnitude.

Submitted 3 July, 2015; v1 submitted 27 April, 2015; originally announced April 2015.

arXiv:1409.6241

Approximating Betweenness Centrality in Large Evolving Networks

Authors: Elisabetta Bergamini, Henning Meyerhenke, Christian L. Staudt

Abstract: Betweenness centrality ranks the importance of nodes by their participation in all shortest paths of the network. Therefore computing exact betweenness values is impractical in large networks. For static networks, approximation based on randomly sampled paths has been shown to be significantly faster in practice. However, for dynamic networks, no approximation algorithm for betweenness centrality is known that improves on static recomputation. We address this deficit by proposing two incremental approximation algorithms (for weighted and unweighted connected graphs) which provide a provable guarantee on the absolute approximation error. Processing batches of edge insertions, our algorithms yield significant speedups up to a factor of $10^4$ compared to restarting the approximation. This is enabled by investing memory to store and efficiently update shortest paths. As a building block, we also propose an asymptotically faster algorithm for updating the SSSP problem in unweighted graphs. Our experimental study shows that our algorithms are the first to make in-memory computation of a betweenness ranking practical for million-edge semi-dynamic networks. Moreover, our results show that the accuracy is even better than the theoretical guarantees in terms of absolute errors and the rank of nodes is well preserved, in particular for those with high betweenness.

Submitted 22 September, 2014; originally announced September 2014.

Comments: 17 pages

Showing 1–10 of 10 results for author: Bergamini, E