Search | arXiv e-print repository

doi 10.1103/PhysRevE.86.066104

PageRank and rank-reversal dependence on the damping factor

Authors: Seung-Woo Son, Claire Christensen, Peter Grassberger, Maya Paczuski

Abstract: PageRank (PR) is an algorithm originally developed by Google to evaluate the importance of web pages. Considering how deeply rooted Google's PR algorithm is to gathering relevant information or to the success of modern businesses, the question of rank-stability and choice of the damping factor (a parameter in the algorithm) is clearly important. We investigate PR as a function of the damping factor d on a network obtained from a domain of the World Wide Web, finding that rank-reversal happens frequently over a broad range of PR (and of d). We use three different correlation measures, Pearson, Spearman, and Kendall, to study rank-reversal as d changes, and show that the correlation of PR vectors drops rapidly as d changes from its frequently cited value, $d_0=0.85$. Rank-reversal is also observed by measuring the Spearman and Kendall rank correlation, which evaluate relative ranks rather than absolute PR. Rank-reversal happens not only in directed networks containing rank-sinks but also in a single strongly connected component, which by definition does not contain any sinks. We relate rank-reversals to rank-pockets and bottlenecks in the directed network structure. For the network studied, the relative rank is more stable by our measures around $d=0.65$ than at $d=d_0$.

Submitted 23 January, 2012; originally announced January 2012.

Comments: 14 pages, 9 figures

Journal ref: Phys. Rev. E 86, 066104 (2012)
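
The abstract above compares PageRank vectors computed at different damping factors d using rank correlations. The following is a minimal sketch of that kind of comparison, not the authors' code: it assumes the standard power-iteration form of PageRank with dangling-node rank spread uniformly, a hand-written Kendall tau-a, and a hypothetical five-node toy graph (`graph`) invented purely for illustration.

```python
from itertools import combinations


def pagerank(out_links, d=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on a dict {node: [successors]}.

    Assumption: every successor also appears as a key of out_links.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    pr = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(pr[v] for v in nodes if not out_links[v])
        new = {v: (1.0 - d) / n + d * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new[w] += d * pr[v] / len(out_links[v])
        if sum(abs(new[v] - pr[v]) for v in nodes) < tol:
            return new
        pr = new
    return pr


def kendall_tau(x, y):
    """Kendall tau-a between two equal-length score lists (tied pairs add 0)."""
    pairs = list(combinations(range(len(x)), 2))
    s = 0
    for i, j in pairs:
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    return s / len(pairs)


# Hypothetical toy graph; "e" is a dangling node.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"], "e": []}

pr_085 = pagerank(graph, d=0.85)
pr_065 = pagerank(graph, d=0.65)
order = sorted(graph)
tau = kendall_tau([pr_085[v] for v in order], [pr_065[v] for v in order])
print(f"Kendall tau between d=0.85 and d=0.65: {tau:.3f}")
```

A tau below 1 would indicate that at least one pair of nodes swaps order between the two damping factors; on a graph this small the value says nothing about the web-scale results reported in the paper.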

arXiv:1201.1507

doi 10.1103/PhysRevE.86.046104

Sampling properties of directed networks

Authors: Seung-Woo Son, Claire Christensen, Golnoosh Bizhani, David V. Foster, Peter Grassberger, Maya Paczuski

Abstract: For many real-world networks only a small "sampled" version of the original network may be investigated; those results are then used to draw conclusions about the actual system. Variants of breadth-first search (BFS) sampling, which are based on epidemic processes, are widely used. Although it is well established that BFS sampling fails, in most cases, to capture the IN-component(s) of directed networks, a description of the effects of BFS sampling on other topological properties are all but absent from the literature. To systematically study the effects of sampling biases on directed networks, we compare BFS sampling to random sampling on complete large-scale directed networks. We present new results and a thorough analysis of the topological properties of seven different complete directed networks (prior to sampling), including three versions of Wikipedia, three different sources of sampled World Wide Web data, and an Internet-based social network. We detail the differences that sampling method and coverage can make to the structural properties of sampled versions of these seven networks. Most notably, we find that sampling method and coverage affect both the bow-tie structure, as well as the number and structure of strongly connected components in sampled networks. In addition, at low sampling coverage (i.e. less than 40%), the values of average degree, variance of out-degree, degree auto-correlation, and link reciprocity are overestimated by 30% or more in BFS-sampled networks, and only attain values within 10% of the corresponding values in the complete networks when sampling coverage is in excess of 65%. These results may cause us to rethink what we know about the structure, function, and evolution of real-world directed networks.

Submitted 13 October, 2012; v1 submitted 6 January, 2012; originally announced January 2012.

Comments: 21 pages, 11 figures

Journal ref: Phys. Rev. E 86, 046104 (2012)

arXiv:1109.4631

doi 10.1103/PhysRevE.84.066111

Random Sequential Renormalization and Agglomerative Percolation in Networks: Application to Erdős-Rényi and Scale-free Graphs

Authors: Golnoosh Bizhani, Peter Grassberger, Maya Paczuski

Abstract: We study the statistical behavior under random sequential renormalization (RSR) of several network models including Erdős-Rényi (ER) graphs, scale-free networks and an annealed model (AM) related to ER graphs. In RSR the network is locally coarse grained by choosing at each renormalization step a node at random and joining it to all its neighbors. Compared to previous (quasi-)parallel renormalization methods [C. Song et al.], RSR allows a more fine-grained analysis of the renormalization group (RG) flow, and unravels new features, that were not discussed in the previous analyses. In particular we find that all networks exhibit a second order transition in their RG flow. This phase transition is associated with the emergence of a giant hub and can be viewed as a new variant of percolation, called agglomerative percolation. We claim that this transition exists also in previous graph renormalization schemes and explains some of the scaling laws seen there. For critical trees it happens as N/N0 -> 0 in the limit of large systems (where N0 is the initial size of the graph and N its size at a given RSR step). In contrast, it happens at finite N/N0 in sparse ER graphs and in the annealed model, while it happens for N/N0 -> 1 on scale-free networks. Critical exponents seem to depend on the type of the graph but not on the average degree and obey usual scaling relations for percolation phenomena. For the annealed model they agree with the exponents obtained from a mean-field theory. At late times, the networks exhibit a star-like structure in agreement with the results of Radicchi et al. While degree distributions are of main interest when regarding the scheme as network renormalization, mass distributions (which are more relevant when considering 'supernodes' as clusters) are much easier to study using the fast Newman-Ziff algorithm for percolation, allowing us to obtain very high statistics.

Submitted 12 December, 2011; v1 submitted 21 September, 2011; originally announced September 2011.

Journal ref: Phys. Rev. E 84, 066111 (2011)
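
The abstract above describes a single RSR step as choosing a node at random and joining it to all its neighbors. One possible reading of that step is sketched below, under stated assumptions: the graph is undirected and stored as a dict of neighbor sets, the chosen node and its neighbors collapse into one supernode that inherits their outside links, and the function name `rsr_step` and the ten-node path graph are hypothetical. This is an illustration of the idea, not the authors' implementation.

```python
import random


def rsr_step(adj, rng):
    """One RSR step (as interpreted here): absorb all neighbours of a random node.

    adj is an undirected graph {node: set(neighbours)} and is modified in place.
    """
    target = rng.choice(sorted(adj))
    absorbed = set(adj[target])
    merged = absorbed | {target}
    # The supernode keeps every link that left the merged group.
    new_neigh = set()
    for v in merged:
        new_neigh |= adj[v]
    new_neigh -= merged
    for v in absorbed:
        del adj[v]
    adj[target] = new_neigh
    # Re-point outside nodes from the absorbed nodes to the supernode.
    for u in new_neigh:
        adj[u] -= absorbed
        adj[u].add(target)
    return target


# Hypothetical example: a path graph on 10 nodes, renormalized to one node.
adj = {i: set() for i in range(10)}
for i in range(9):
    adj[i].add(i + 1)
    adj[i + 1].add(i)

rng = random.Random(0)
sizes = [len(adj)]
while len(adj) > 1:
    rsr_step(adj, rng)
    sizes.append(len(adj))
print(sizes)  # network size N after each step, starting from N0 = 10
```

Tracking `sizes` against the initial size gives the ratio N/N0 that the abstract uses to locate the transition; a connected graph is assumed so that every step removes at least one node.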

arXiv:1012.2384

doi 10.1103/PhysRevE.84.066117

Clustering Drives Assortativity and Community Structure in Ensembles of Networks

Authors: David V. Foster, Jacob G. Foster, Peter Grassberger, Maya Paczuski

Abstract: Clustering, assortativity, and communities are key features of complex networks. We probe dependencies between these attributes and find that ensembles with strong clustering display both high assortativity by degree and prominent community structure, while ensembles with high assortativity are much less biased towards clustering or community structure. Further, clustered networks can amplify small homophilic bias for trait assortativity. This marked asymmetry suggests that transitivity, rather than homophily, drives the standard nonsocial/social network dichotomy.

Submitted 5 January, 2011; v1 submitted 10 December, 2010; originally announced December 2010.

Comments: 4 pages, 4 figures

arXiv:1009.3955

doi 10.1103/PhysRevE.83.036110

Random Sequential Renormalization of Networks I: Application to Critical Trees

Authors: Golnoosh Bizhani, Vishal Sood, Maya Paczuski, Peter Grassberger

Abstract: We introduce the concept of Random Sequential Renormalization (RSR) for arbitrary networks. RSR is a graph renormalization procedure that locally aggregates nodes to produce a coarse grained network. It is analogous to the (quasi-)parallel renormalization schemes introduced by C. Song et al. (Nature 433, 392 (2005)) and studied more recently by F. Radicchi et al. (Phys. Rev. Lett. 101, 148701 (2008)), but much simpler and easier to implement. In this first paper we apply RSR to critical trees and derive analytical results consistent with numerical simulations. Critical trees exhibit three regimes in their evolution under RSR: (i) An initial regime $N_0^{\nu} \lesssim N < N_0$, where $N$ is the number of nodes at some step in the renormalization and $N_0$ is the initial size. RSR in this regime is described by a mean field theory and fluctuations from one realization to another are small. The exponent $\nu=1/2$ is derived using random walk arguments. The degree distribution becomes broader under successive renormalization -- reaching a power law, $p_k \sim 1/k^{\gamma}$ with $\gamma=2$ and a variance that diverges as $N_0^{1/2}$ at the end of this regime. Both of these results are derived based on a scaling theory. (ii) An intermediate regime for $N_0^{1/4} \lesssim N \lesssim N_0^{1/2}$, in which hubs develop, and fluctuations between different realizations of the RSR are large. Crossover functions exhibiting finite size scaling, in the critical region $N \sim N_0^{1/2} \to \infty$, connect the behaviors in the first two regimes. (iii) The last regime, for $1 \ll N \lesssim N_0^{1/4}$, is characterized by the appearance of star configurations with a central hub surrounded by many leaves. The distribution of sizes where stars first form is found numerically to be a power law up to a cutoff that scales as $N_0^{\nu_{\rm star}}$ with $\nu_{\rm star} \approx 1/4$.

Submitted 23 March, 2011; v1 submitted 20 September, 2010; originally announced September 2010.

Journal ref: Phys. Rev. E 83, 036110 (2011)

arXiv:0908.4288

doi 10.1073/pnas.0912671107

Edge direction and the structure of networks

Authors: Jacob G. Foster, David V. Foster, Peter Grassberger, Maya Paczuski

Abstract: Directed networks are ubiquitous and are necessary to represent complex systems with asymmetric interactions---from food webs to the World Wide Web. Despite the importance of edge direction for detecting local and community structure, it has been disregarded in studying a basic type of global diversity in networks: the tendency of nodes with similar numbers of edges to connect. This tendency, called assortativity, affects crucial structural and dynamic properties of real-world networks, such as error tolerance or epidemic spreading. Here we demonstrate that edge direction has profound effects on assortativity. We define a set of four directed assortativity measures and assign statistical significance by comparison to randomized networks. We apply these measures to three network classes---online/social networks, food webs, and word-adjacency networks. Our measures (i) reveal patterns common to each class, (ii) separate networks that have been previously classified together, and (iii) expose limitations of several existing theoretical models. We reject the standard classification of directed networks as purely assortative or disassortative. Many display a class-specific mixture, likely reflecting functional or historical constraints, contingencies, and forces guiding the system's evolution.

Submitted 7 November, 2010; v1 submitted 28 August, 2009; originally announced August 2009.

Comments: 13 pages, 6 figures, 3 tables

Journal ref: Proceedings of the National Academy of Sciences of the United States of America 2010, Vol. 107, No. 24

arXiv:cs/0412027

Correlated dynamics in human printing behavior

Authors: Uli Harder, Maya Paczuski

Abstract: Arrival times of requests to print in a student laboratory were analyzed. Inter-arrival times between subsequent requests follow a universal scaling law relating time intervals and the size of the request, indicating a scale invariant dynamics with respect to the size. The cumulative distribution of file sizes is well-described by a modified power law often seen in non-equilibrium critical systems. For each user, waiting times between their individual requests show long range dependence and are broadly distributed from seconds to weeks. All results are incompatible with Poisson models, and may provide evidence of critical dynamics associated with voluntary thought processes in the brain.

Submitted 7 December, 2004; originally announced December 2004.

Comments: 4 pages, 4 figures

ACM Class: D.4.8

arXiv:cs/0410005

A dynamical model of a GRID market

Authors: Uli Harder, Peter Harrison, Maya Paczuski, Tejas Shah

Abstract: We discuss potential market mechanisms for the GRID. A complete dynamical model of a GRID market is defined with three types of agents. Providers, middlemen and users exchange universal GRID computing units (GCUs) at varying prices. Providers and middlemen have strategies aimed at maximizing profit while users are 'satisficing' agents, and only change their behavior if the service they receive is sufficiently poor or overpriced. Preliminary results from a multi-agent numerical simulation of the market model shows that the distribution of price changes has a power law tail.

Submitted 2 October, 2004; originally announced October 2004.

Comments: 4 pages, 3 figures

Showing 1–8 of 8 results for author: Paczuski, M