-
A Comparative Analysis of Wealth Index Predictions in Africa between three Multi-Source Inference Models
Authors:
Márton Karsai,
János Kertész,
Lisette Espín-Noboa
Abstract:
Poverty map inference is a critical area of research, with growing interest in both traditional and modern techniques, ranging from regression models to convolutional neural networks applied to tabular data, images, and networks. Despite extensive focus on the validation of training phases, the scrutiny of final predictions remains limited. Here, we compare the Relative Wealth Index (RWI) inferred by Chi et al. (2022) with the International Wealth Index (IWI) inferred by Lee and Braithwaite (2022) and Espín-Noboa et al. (2023) across six Sub-Saharan African countries. Our analysis focuses on identifying trends and discrepancies in wealth predictions over time. Our results show that the predictions by Chi et al. and Espín-Noboa et al. align with general GDP trends, with differences expected due to the distinct time-frames of the training sets. However, predictions by Lee and Braithwaite diverge significantly, indicating potential issues with the validity of the model. These discrepancies highlight the need for policymakers and stakeholders in Africa to rigorously audit models that predict wealth, especially those used for decision-making on the ground. These and other techniques require continuous verification and refinement to enhance their reliability and ensure that poverty alleviation strategies are well-founded.
Submitted 4 September, 2024; v1 submitted 2 August, 2024;
originally announced August 2024.
-
Homophilic organization of egocentric communities in ICT services
Authors:
Chandreyee Roy,
Hang-Hyun Jo,
János Kertész,
Kimmo Kaski,
János Török
Abstract:
Members of a society can be characterized by a large number of features, such as gender, age, ethnicity, religion, social status, and shared activities. One of the main tie-forming factors between individuals in human societies is homophily, the tendency of being attracted to similar others. Homophily has been mainly studied with a focus on one of the features, and little is known about the roles of similarities of different origins in the formation of communities. To close this gap, we analyze three datasets from Information and Communications Technology (ICT) services, namely, two online social networks and a network deduced from mobile phone calls, in all of which metadata about individual features are available. We identify communities within egocentric networks and surprisingly find that the larger the community is, the more overlap is found between features of its members and the ego. We interpret this finding in terms of the effort needed to manage the communities: greater diversity requires more effort, so maintaining a large, diverse group may exceed the capacity of its members. As the ego reaches out to her alters on an ICT service, we observe that the first alter in each community tends to have a higher feature overlap with the ego than the rest. Moreover, the feature overlap of the ego with all her alters displays non-monotonic behavior as a function of the ego's degree. We propose a simple mechanism of how people add links in their egocentric networks of alters that reproduces all the empirical observations and shows the reason behind the non-monotonic tendency of the egocentric feature overlap as a function of the ego's degree.
Submitted 5 May, 2024;
originally announced May 2024.
-
Robustness of Decentralised Learning to Nodes and Data Disruption
Authors:
Luigi Palmieri,
Chiara Boldrini,
Lorenzo Valerio,
Andrea Passarella,
Marco Conti,
János Kertész
Abstract:
In the vibrant landscape of AI research, decentralised learning is gaining momentum. Decentralised learning allows individual nodes to keep data locally where they are generated and to share knowledge extracted from local data among themselves through an interactive process of collaborative refinement. This paradigm supports scenarios where data cannot leave local nodes due to privacy or sovereignty reasons or real-time constraints imposing proximity of models to locations where inference has to be carried out. The distributed nature of decentralised learning implies significant new research challenges with respect to centralised learning. Among them, in this paper, we focus on robustness issues. Specifically, we study the effect of nodes' disruption on the collective learning process. Assuming a given percentage of "central" nodes disappear from the network, we focus on different cases, characterised by (i) different distributions of data across nodes and (ii) different times when disruption occurs with respect to the start of the collaborative learning task. Through these configurations, we are able to show the non-trivial interplay between the properties of the network connecting nodes, the persistence of knowledge acquired collectively before disruption or lack thereof, and the effect of data availability pre- and post-disruption. Our results show that decentralised learning processes are remarkably robust to network disruption. As long as even minimum amounts of data remain available somewhere in the network, the learning process is able to recover from disruptions and achieve significant classification accuracy. This clearly varies depending on the remaining connectivity after disruption, but we show that even nodes that remain completely isolated can retain significant knowledge acquired before the disruption.
Submitted 3 May, 2024;
originally announced May 2024.
-
Milgram's experiment in the knowledge space: Individual navigation strategies
Authors:
Manran Zhu,
János Kertész
Abstract:
The data deluge characteristic of our times has led to information overload, posing a significant challenge to effectively finding our way through the digital landscape. Addressing this issue requires an in-depth understanding of how we navigate through the abundance of information. Previous research has discovered multiple patterns in how individuals navigate in the geographic, social, and information spaces, yet individual differences in strategies for navigation in the knowledge space have remained largely unexplored. To bridge the gap, we conducted an online experiment where participants played a navigation game on Wikipedia and completed questionnaires about their personal information. Utilizing a graph embedding trained on the English Wikipedia, our study identified distinctive strategies that participants adopt: when the target is a famous person, participants typically use the geographical and occupational information of the target to navigate, reminiscent of hub-driven and proximity-driven approaches, respectively. We discovered that many participants playing the same game exhibit a "wisdom of the crowd" effect: the set of strategies provides a good estimate for the information landscape around the target, indicating that the individual differences complement each other.
Submitted 9 April, 2024;
originally announced April 2024.
-
Initialisation and Topology Effects in Decentralised Federated Learning
Authors:
Arash Badie-Modiri,
Chiara Boldrini,
Lorenzo Valerio,
János Kertész,
Márton Karsai
Abstract:
Fully decentralised federated learning enables collaborative training of individual machine learning models on distributed devices on a communication network while keeping the training data localised. This approach enhances data privacy and eliminates both the single point of failure and the necessity for central coordination. Our research highlights that the effectiveness of decentralised federated learning is significantly influenced by the network topology of connected devices. We propose a strategy for uncoordinated initialisation of the artificial neural networks, which leverages the distribution of eigenvector centralities of the nodes of the underlying communication network, leading to a radically improved training efficiency. Additionally, our study explores the scaling behaviour and choice of environmental parameters under our proposed initialisation strategy. This work paves the way for more efficient and scalable artificial neural network training in a distributed and uncoordinated environment, offering a deeper understanding of the intertwining roles of network structure and learning dynamics.
Submitted 22 May, 2024; v1 submitted 23 March, 2024;
originally announced March 2024.
-
Coordination-free Decentralised Federated Learning on Complex Networks: Overcoming Heterogeneity
Authors:
Lorenzo Valerio,
Chiara Boldrini,
Andrea Passarella,
János Kertész,
Márton Karsai,
Gerardo Iñiguez
Abstract:
Federated Learning (FL) is a well-known framework for successfully performing a learning task in an edge computing scenario where the devices involved have limited resources and incomplete data representation. The basic assumption of FL is that the devices communicate directly or indirectly with a parameter server that centrally coordinates the whole process, overcoming several challenges associated with it. However, in highly pervasive edge scenarios, the presence of a central controller that oversees the process cannot always be guaranteed, and the interactions (i.e., the connectivity graph) between devices might not be predetermined, resulting in a complex network structure. Moreover, the heterogeneity of data and devices further complicates the learning process. This poses new challenges from a learning standpoint that we address by proposing a communication-efficient Decentralised Federated Learning (DFL) algorithm able to cope with them. Our solution allows devices communicating only with their direct neighbours to train an accurate model, overcoming the heterogeneity induced by data and different training histories. Our results show that the resulting local models generalise better than those trained with competing approaches, and do so in a more communication-efficient way.
Submitted 7 December, 2023;
originally announced December 2023.
-
In A Society of Strangers, Kin Is Still Key: Identified Family Relations In Large-Scale Mobile Phone Data
Authors:
Tamás Dávid-Barrett,
Sebastian Diaz,
Carlos Rodriguez-Sickert,
Isabel Behncke,
Anna Rotkirch,
János Kertész,
Loreto Bravo
Abstract:
Mobile call networks have been widely used to investigate communication patterns and the network of interactions of humans at the societal scale. Yet, more detailed analysis is often hindered by having no information about the nature of the relationships, even if some metadata about the individuals are available. Using a unique, large mobile phone database with information about individual surnames in a population in which people inherit two surnames, one from their father and one from their mother, we are able to differentiate among close kin relationship types. Here we focus on the difference between the most frequently called alters depending on whether they are family relationships or not. We find support in the data for two hypotheses: (1) phone calls between family members are more frequent and last longer than phone calls between non-kin, and (2) the phone call pattern between family members shows a higher variation depending on the stage of life-course compared to non-family members. We give an interpretation of these findings within the framework of evolutionary anthropology: kinship matters even when demographic processes, such as low fertility, urbanisation and migration, reduce the access to family members. Furthermore, our results provide tools for distinguishing between different kinds of kin relationships from mobile call data, when information about names is unavailable.
Submitted 7 July, 2023;
originally announced July 2023.
-
Human-AI Coevolution
Authors:
Dino Pedreschi,
Luca Pappalardo,
Emanuele Ferragina,
Ricardo Baeza-Yates,
Albert-Laszlo Barabasi,
Frank Dignum,
Virginia Dignum,
Tina Eliassi-Rad,
Fosca Giannotti,
Janos Kertesz,
Alistair Knott,
Yannis Ioannidis,
Paul Lukowicz,
Andrea Passarella,
Alex Sandy Pentland,
John Shawe-Taylor,
Alessandro Vespignani
Abstract:
Human-AI coevolution, defined as a process in which humans and AI algorithms continuously influence each other, increasingly characterises our society, but is understudied in artificial intelligence and complexity science literature. Recommender systems and assistants play a prominent role in human-AI coevolution, as they permeate many facets of daily life and influence human choices on online platforms. The interaction between users and AI results in a potentially endless feedback loop, wherein users' choices generate data to train AI models, which, in turn, shape subsequent user preferences. This human-AI feedback loop has peculiar characteristics compared to traditional human-machine interaction and gives rise to complex and often "unintended" social outcomes. This paper introduces Coevolution AI as the cornerstone for a new field of study at the intersection between AI and complexity science focused on the theoretical, empirical, and mathematical investigation of the human-AI feedback loop. In doing so, we: (i) outline the pros and cons of existing methodologies and highlight shortcomings and potential ways for capturing feedback loop mechanisms; (ii) propose a reflection at the intersection between complexity science, AI and society; (iii) provide real-world examples for different human-AI ecosystems; and (iv) illustrate challenges to the creation of such a field of study, conceptualising them at increasing levels of abstraction, i.e., technical, epistemological, legal and socio-political.
Submitted 3 May, 2024; v1 submitted 23 June, 2023;
originally announced June 2023.
-
Individual differences in knowledge network navigation
Authors:
Manran Zhu,
Taha Yasseri,
János Kertész
Abstract:
With the rapid accumulation of online information, efficient web navigation has grown vital yet challenging. To create an easily navigable cyberspace catering to diverse demographics, understanding how people navigate differently is paramount. While previous research has unveiled individual differences in spatial navigation, studies of such differences in knowledge space navigation remain sparse. To bridge this gap, we conducted an online experiment where participants played a navigation game on Wikipedia and completed personal information questionnaires. Our analysis shows that age negatively affects knowledge space navigation performance, while multilingualism enhances it. Under time pressure, participants' performance improves across trials and males outperform females, an effect not observed in games without time pressure. In our experiment, successful route-finding is usually not related to abilities of innovative exploration of routes. Our results underline the importance of age, multilingualism and time constraints in knowledge space navigation.
Submitted 19 March, 2024; v1 submitted 17 March, 2023;
originally announced March 2023.
-
Universal patterns in egocentric communication networks
Authors:
Gerardo Iñiguez,
Sara Heydari,
János Kertész,
Jari Saramäki
Abstract:
Tie strengths in social networks are heterogeneous, with strong and weak ties playing different roles at both the network and the individual level. Egocentric networks, networks of relationships around a focal individual, exhibit a small number of strong ties and a larger number of weaker ties, a pattern that is evident in electronic communication records, such as mobile phone calls. Mobile phone data has also revealed persistent individual differences within this pattern. However, the generality and the driving mechanisms of this tie strength heterogeneity remain unclear. Here, we study tie strengths in egocentric networks across multiple datasets containing records of interactions between millions of people over time periods ranging from months to years. Our findings reveal a remarkable universality in the distribution of tie strengths and their individual-level variation across different modes of communication, even in channels that may not reflect offline social relationships. With the help of an analytically tractable model of egocentric network evolution, we show that the observed universality can be attributed to the competition between cumulative advantage and random choice, two general mechanisms of tie reinforcement whose balance determines the amount of heterogeneity in tie strengths. Our results provide new insights into the driving mechanisms of tie strength heterogeneity in social networks and have implications for the understanding of social network structure and individual behavior.
Submitted 27 February, 2023;
originally announced February 2023.
-
Interpreting wealth distribution via poverty map inference using multimodal data
Authors:
Lisette Espín-Noboa,
János Kertész,
Márton Karsai
Abstract:
Poverty maps are essential tools for governments and NGOs to track socioeconomic changes and adequately allocate infrastructure and services in places in need. Sensor and online crowd-sourced data combined with machine learning methods have provided a recent breakthrough in poverty map inference. However, these methods do not capture local wealth fluctuations, and are not optimized to produce accountable results that guarantee accurate predictions to all sub-populations. Here, we propose a pipeline of machine learning models to infer the mean and standard deviation of wealth across multiple geographically clustered populated places, and illustrate their performance in Sierra Leone and Uganda. These models leverage seven independent and freely available feature sources based on satellite images, and metadata collected via online crowd-sourcing and social media. Our models show that combined metadata features are the best predictors of wealth in rural areas, outperforming image-based models, which are the best for predicting the highest wealth quintiles. Our results recover the local mean and variation of wealth, and correctly capture the positive yet non-monotonous correlation between them. We further demonstrate the capabilities and limitations of model transfer across countries and the effects of data recency and other biases. Our methodology provides open tools to build towards more transparent and interpretable models to help governments and NGOs to make informed decisions based on data availability, urbanization level, and poverty thresholds.
Submitted 6 April, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Competition for popularity and interventions on a Chinese microblogging site
Authors:
Hao Cui,
János Kertész
Abstract:
Microblogging sites are important vehicles for users to obtain information and shape public opinion; thus, they are arenas of continuous competition for popularity. The most popular topics are usually indicated on ranking lists. In this study, we investigate the public attention dynamics through the Hot Search List (HSL) of the Chinese microblog Sina Weibo, where trending hashtags are ranked based on a multi-dimensional search volume index. We characterize the rank dynamics by the time spent by hashtags on the list, the time of the day they appear there, the rank diversity, and by the ranking trajectories. We show how the circadian rhythm affects the popularity of hashtags, and observe categories of their rank trajectories by a machine learning clustering algorithm. By analyzing patterns of ranking dynamics using various measures, we identify anomalies that are likely to result from the platform provider's intervention into the ranking, including the anchoring of hashtags to certain ranks on the HSL. We propose a simple model of ranking that explains the mechanism of this anchoring effect. We found an over-representation of hashtags related to international politics at 3 out of 4 anchoring ranks on the HSL, indicating possible manipulations of public opinion.
Submitted 30 November, 2022; v1 submitted 22 August, 2022;
originally announced August 2022.
-
Human-AI ecosystem with abrupt changes as a function of the composition
Authors:
Pierluigi Contucci,
János Kertész,
Godwin Osabutey
Abstract:
The progressive advent of artificial intelligence machines may represent either an opportunity or a threat. In order to have an idea of what is coming, we propose a model that simulates a Human-AI ecosystem. In particular, we consider systems where agents present biases, peer-to-peer interactions, and also three-body interactions that are crucial and describe two humans interacting with an artificial agent and two artificial intelligence agents interacting with a human. We focus our analysis on how the relative fraction of artificial intelligence agents affects that ecosystem. We find evidence that, for suitable values of the interaction parameters, arbitrarily small changes in that fraction may trigger dramatic changes for the system, which can be either in one of the two polarised states or in an undecided state.
Submitted 7 April, 2022;
originally announced April 2022.
-
"Born in Rome" or "Sleeping Beauty": Emergence of hashtag popularity on the Chinese microblog Sina Weibo
Authors:
Hao Cui,
János Kertész
Abstract:
To understand the emergence of hashtag popularity in online social networking complex systems, we study the largest Chinese microblogging site Sina Weibo, which has a Hot Search List (HSL) showing in real time the ranking of the 50 most popular hashtags based on search activity. We investigate the prehistory of successful hashtags from 17 July 2020 to 17 September 2020 by mapping out the related interaction network preceding the selection to HSL. We have found that the circadian activity pattern has an impact on the time needed to get to the HSL. When analyzing this time we distinguish two extreme categories: a) "Born in Rome", which means hashtags are mostly first created by super-hubs or reach super-hubs at an early stage during their propagation and thus gain immediate wide attention from the broad public, and b) "Sleeping Beauty", meaning the hashtags gain little attention at the beginning and reach system-wide popularity after a considerable time lag. The evolution of the repost networks of successful hashtags before getting to the HSL shows two types of growth patterns: "smooth" and "stepwise". The former is usually dominated by a super-hub and the latter results from consecutive waves of contributions of smaller hubs. The repost networks of unsuccessful hashtags exhibit a simple evolution pattern.
Submitted 9 November, 2022; v1 submitted 28 March, 2022;
originally announced March 2022.
-
Opinion dynamics in social networks: From models to data
Authors:
Antonio F. Peralta,
János Kertész,
Gerardo Iñiguez
Abstract:
Opinions are an integral part of how we perceive the world and each other. They shape collective action, playing a role in democratic processes, the evolution of norms, and cultural change. For decades, researchers in the social and natural sciences have tried to describe how shifting individual perspectives and social exchange lead to archetypal states of public opinion like consensus and polarization. Here we review some of the many contributions to the field, focusing both on idealized models of opinion dynamics, and attempts at validating them with observational data and controlled sociological experiments. By further closing the gap between models and data, these efforts may help us understand how to face current challenges that require the agreement of large groups of people in complex scenarios, such as economic inequality, climate change, and the ongoing fracture of the sociopolitical landscape.
Submitted 19 December, 2022; v1 submitted 4 January, 2022;
originally announced January 2022.
-
Deep learning based parameter search for an agent based social network model
Authors:
Yohsuke Murase,
Hang-Hyun Jo,
János Török,
János Kertész,
Kimmo Kaski
Abstract:
Interactions between humans give rise to complex social networks that are characterized by heterogeneous degree distribution, weight-topology relation, overlapping community structure, and dynamics of links. Understanding such networks is a primary goal of science because they serve as the scaffold for many emergent social phenomena, from disease spreading to political movements. An appropriate tool for studying them is agent-based modeling, in which nodes, representing persons, make decisions about creating and deleting links, thus yielding various macroscopic behavioral patterns. Here we focus on studying a generalization of the weighted social network model, one of the most fundamental agent-based models for describing the formation of social ties and social networks. This Generalized Weighted Social Network (GWSN) model incorporates triadic closure, homophilic interactions, and various link termination mechanisms, which have been studied separately in previous works. Accordingly, the GWSN model has an increased number of input parameters and its behavior becomes excessively complex, making it challenging to characterize. We have executed massive simulations with a supercomputer and used the results as the training data for deep neural networks to conduct regression analysis for predicting the properties of the generated networks from the input parameters. The obtained regression model was also used for global sensitivity analysis to identify which parameters are influential or insignificant. We believe that this methodology is applicable to a large class of complex network models, thus opening the way for more realistic quantitative agent-based modeling.
△ Less
Submitted 14 July, 2021;
originally announced July 2021.
-
Ecology in the digital world of Wikipedia
Authors:
Fumiko Ogushi,
János Kertész,
Kimmo Kaski,
Takashi Shimada
Abstract:
Wikipedia, a paradigmatic example of online knowledge space is organized in a collaborative, bottom-up way with voluntary contributions, yet it maintains a level of reliability comparable to that of traditional encyclopedias. The lack of selected professional writers and editors makes the judgement about quality and trustworthiness of the articles a real challenge. Here we show that a self-consistent metrics for the network defined by the edit records captures well the character of editors' activity and the articles' level of complexity. Using our metrics, one can better identify the human-labeled high-quality articles, e.g., "featured" ones, and differentiate them from the popular and controversial articles. Furthermore, the dynamics of the editor-article system is also well captured by the metrics, revealing the evolutionary pathways of articles and diverse roles of editors. We demonstrate that the collective effort of the editors indeed drives to the direction of article improvement.
Submitted 21 May, 2021;
originally announced May 2021.
-
The effect of algorithmic bias and network structure on coexistence, consensus, and polarization of opinions
Authors:
Antonio F. Peralta,
Matteo Neri,
János Kertész,
Gerardo Iñiguez
Abstract:
Individuals of modern societies share ideas and participate in collective processes within a pervasive, variable, and mostly hidden ecosystem of content filtering technologies that determine what information we see online. Despite the impact of these algorithms on daily life and society, little is known about their effect on information transfer and opinion formation. It is thus unclear to what extent algorithmic bias has a harmful influence on collective decision-making, such as a tendency to polarize debate. Here we introduce a general theoretical framework to systematically link models of opinion dynamics, social network structure, and content filtering. We showcase the flexibility of our framework by exploring a family of binary-state opinion dynamics models where information exchange lies in a spectrum from pairwise to group interactions. All models show an opinion polarization regime driven by algorithmic bias and modular network structure. The role of content filtering is, however, surprisingly nuanced; for pairwise interactions it leads to polarization, while for group interactions it promotes coexistence of opinions. This allows us to pinpoint which social interactions are robust against algorithmic bias, and which ones are susceptible to bias-enhanced opinion polarization. Our framework gives theoretical ground for the development of heuristics to tackle harmful effects of online bias, such as information bottlenecks, echo chambers, and opinion radicalization.
Submitted 27 October, 2022; v1 submitted 17 May, 2021;
originally announced May 2021.
-
Quantifying firm-level economic systemic risk from nation-wide supply networks
Authors:
Christian Diem,
András Borsos,
Tobias Reisch,
János Kertész,
Stefan Thurner
Abstract:
Crises like COVID-19 or the Japanese earthquake in 2011 exposed the fragility of corporate supply networks. The production of goods and services is a highly interdependent process and can be severely impacted by the default of critical suppliers or customers. While knowing the impact of individual companies on national economies is a prerequisite for efficient risk management, the quantitative assessment of the involved economic systemic risks (ESR) is hitherto practically non-existent, mainly because of a lack of fine-grained data in combination with coherent methods. Based on a unique value added tax dataset we derive the detailed production network of an entire country and present a novel approach for computing the ESR of all individual firms. We demonstrate that a tiny fraction (0.035%) of companies has extraordinarily high systemic risk impacting about 23% of the national economic production should any of them default. Firm size alone cannot explain the ESR of individual companies; their position in the production networks does matter substantially. If companies are ranked according to their economic systemic risk index (ESRI), firms with a rank above a characteristic value have very similar ESRI values, while for the rest the rank distribution of ESRI decays slowly as a power-law; 99.8% of all companies have an impact on less than 1% of the economy. We show that the assessment of ESR is impossible with aggregate data as used in traditional Input-Output Economics. We discuss how simple policies of introducing supply chain redundancies can reduce ESR of some extremely risky companies.
Submitted 15 April, 2021;
originally announced April 2021.
-
Complexity science approach to economic crime
Authors:
János Kertész,
Johannes Wachs
Abstract:
In this comment we discuss how complexity science and network science are particularly useful for identifying and describing the hidden traces of economic misbehaviour such as fraud and corruption.
Submitted 27 August, 2020;
originally announced August 2020.
-
Attention dynamics on the Chinese social media Sina Weibo during the COVID-19 pandemic
Authors:
Hao Cui,
János Kertész
Abstract:
Understanding attention dynamics on social media during pandemics could help governments minimize the effects. We focus on how COVID-19 has influenced the attention dynamics on the biggest Chinese microblogging website Sina Weibo during the first four months of the pandemic. We study the real-time Hot Search List (HSL), which provides the ranking of the most popular 50 hashtags based on the amount of Sina Weibo searches. We show how the specific events, measures and developments during the epidemic affected the emergence of different kinds of hashtags and the ranking on the HSL. A significant increase of COVID-19 related hashtags started to occur on HSL around January 20, 2020, when the transmission of the disease between humans was announced. Then very rapidly a situation was reached where COVID-related hashtags occupied 30-70% of the HSL, however, with changing content. We give an analysis of how the hashtag topics changed during the investigated time span and conclude that there are three periods separated by February 12 and March 12. In period 1, we see strong topical correlations and clustering of hashtags; in period 2, the correlations are weakened, without clustering pattern; in period 3, we see a potential of clustering while not as strong as in period 1. We further explore the dynamics of HSL by measuring the ranking dynamics and the lifetimes of hashtags on the list. This way we can obtain information about the decay of attention, which is important for decisions about the temporal placement of governmental measures to achieve permanent awareness. Furthermore, our observations indicate abnormally higher rank diversity in the top 15 ranks on HSL due to the COVID-19 related hashtags, revealing the possibility of algorithmic intervention from the platform provider.
Submitted 25 February, 2021; v1 submitted 10 August, 2020;
originally announced August 2020.
-
Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Authors:
Mirco Nanni,
Gennady Andrienko,
Albert-László Barabási,
Chiara Boldrini,
Francesco Bonchi,
Ciro Cattuto,
Francesca Chiaromonte,
Giovanni Comandé,
Marco Conti,
Mark Coté,
Frank Dignum,
Virginia Dignum,
Josep Domingo-Ferrer,
Paolo Ferragina,
Fosca Giannotti,
Riccardo Guidotti,
Dirk Helbing,
Kimmo Kaski,
Janos Kertesz,
Sune Lehmann,
Bruno Lepri,
Paul Lukowicz,
Stan Matwin,
David Megías Jiménez,
Anna Monreale
, et al. (14 additional authors not shown)
Abstract:
The rapid dynamics of COVID-19 call for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in phase 2 of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large-scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively, voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy-preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion, and in turn this enables both contact tracing and the early detection of outbreak hotspots on a more finely granulated geographic scale. Our recommendation is two-fold. First, we recommend extending existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allowing the user to share spatio-temporal aggregates, if and when they want and for specific aims, with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to the collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
Submitted 16 April, 2020; v1 submitted 10 April, 2020;
originally announced April 2020.
-
Inequality is rising where social network segregation interacts with urban topology
Authors:
Gergő Tóth,
Johannes Wachs,
Riccardo Di Clemente,
Ákos Jakobi,
Bence Ságvári,
János Kertész,
Balázs Lengyel
Abstract:
Social networks amplify inequalities due to fundamental mechanisms of social tie formation such as homophily and triadic closure. These forces sharpen social segregation reflected in network fragmentation. Yet, little is known about what structural factors facilitate fragmentation. In this paper we use big data from a widely-used online social network to demonstrate that there is a significant relationship between social network fragmentation and income inequality in cities and towns. We find that the organization of the physical urban space has a stronger relationship with fragmentation than unequal access to education, political segregation, or the presence of ethnic and religious minorities. Fragmentation of social networks is significantly higher in towns in which residential neighborhoods are divided by physical barriers such as rivers and railroads and are relatively distant from the center of town. Towns in which amenities are spatially concentrated are also typically more socially segregated. These relationships suggest how urban planning may be a useful point of intervention to mitigate inequalities in the long run.
Submitted 25 September, 2019;
originally announced September 2019.
-
Corruption Risk in Contracting Markets: A Network Science Perspective
Authors:
Johannes Wachs,
Mihály Fazekas,
János Kertész
Abstract:
We use methods from network science to analyze corruption risk in a large administrative dataset of over 4 million public procurement contracts from European Union member states covering the years 2008-2016. By mapping procurement markets as bipartite networks of issuers and winners of contracts we can visualize and describe the distribution of corruption risk. We study the structure of these networks in each member state, identify their cores and find that highly centralized markets tend to have higher corruption risk. In all EU countries we analyze, corruption risk is significantly clustered. However, these risks are sometimes more prevalent in the core and sometimes in the periphery of the market, depending on the country. This suggests that the same level of corruption risk may have entirely different distributions. Our framework is both diagnostic and prescriptive: it roots out where corruption is likely to be prevalent in different markets and suggests that different anti-corruption policies are needed in different countries.
Submitted 18 September, 2019;
originally announced September 2019.
-
A network approach to cartel detection in public auction markets
Authors:
Johannes Wachs,
János Kertész
Abstract:
Competing firms can increase profits by setting prices collectively, imposing significant costs on consumers. Such groups of firms are known as cartels and because this behavior is illegal, their operations are secretive and difficult to detect. Cartels face a significant internal obstacle: members have short-run incentives to cheat. Here we present a network-based framework to detect potential cartels in bidding markets based on the idea that the chance a group of firms can overcome this obstacle and sustain cooperation depends on the patterns of its interactions. We create a network of firms based on their co-bidding behavior, detect interacting groups, and measure their cohesion and exclusivity, two group-level features of their collective behavior. Applied to a market for school milk, our method detects a known cartel and calculates that it has high cohesion and exclusivity. In a comprehensive set of nearly 150,000 public contracts awarded by the Republic of Georgia from 2011 to 2016, detected groups with high cohesion and exclusivity are significantly more likely to display traditional markers of cartel behavior. We replicate this relationship between group topology and the emergence of cooperation in a simulation model. Our framework offers a scalable, unsupervised method to find groups of firms in bidding markets ideally positioned to form lasting cartels.
Submitted 20 June, 2019;
originally announced June 2019.
-
Sampling networks by nodal attributes
Authors:
Yohsuke Murase,
Hang-Hyun Jo,
János Török,
János Kertész,
Kimmo Kaski
Abstract:
In a social network individuals or nodes connect to other nodes by choosing one of the channels of communication at a time to re-establish the existing social links. Since available data sets are usually restricted to a limited number of channels or layers, these autonomous decision making processes by the nodes constitute the sampling of a multiplex network leading to just one (though very important) example of sampling bias caused by the behavior of the nodes. We develop a general setting to get insight and understand the class of network sampling models, where the probability of sampling a link in the original network depends on the attributes $h$ of its adjacent nodes. Assuming that the nodal attributes are independently drawn from an arbitrary distribution $ρ(h)$ and that the sampling probability $r(h_i , h_j)$ for a link $ij$ of nodal attributes $h_i$ and $h_j$ is also arbitrary, we derive exact analytic expressions of the sampled network for such network characteristics as the degree distribution, degree correlation, and clustering spectrum. The properties of the sampled network turn out to be sums of quantities for the original network topology weighted by the factors stemming from the sampling. Based on our analysis, we find that the sampled network may have sampling-induced network properties that are absent in the original network, which implies the potential risk of a naive generalization of the results of the sample to the entire original network. We also consider the case, when neighboring nodes have correlated attributes to show how to generalize our formalism for such sampling bias and we get good agreement between the analytic results and the numerical simulations.
Submitted 22 May, 2019; v1 submitted 12 February, 2019;
originally announced February 2019.
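
The attribute-dependent sampling described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the Erdős-Rényi original network, the uniform choice for $ρ(h)$, and the product form $r(h_i, h_j) = h_i h_j$ are all assumptions made only for illustration, and the function names are hypothetical.

```python
import random
from itertools import combinations

def sample_by_attributes(n=300, p_edge=0.05, seed=0):
    """Build a random 'original' network and sample its links by nodal attributes."""
    rng = random.Random(seed)
    # Nodal attributes h_i drawn independently from rho(h); uniform on [0, 1) is an assumption.
    h = [rng.random() for _ in range(n)]
    # Original network: Erdos-Renyi random graph (an assumption, any topology would do).
    original = [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p_edge]

    def r(hi, hj):
        # Sampling probability of a link; the product form is an illustrative assumption.
        return hi * hj

    # Keep each original link ij independently with probability r(h_i, h_j).
    sampled = [(i, j) for i, j in original if rng.random() < r(h[i], h[j])]
    return h, original, sampled

def degrees(n, edges):
    """Degree of every node for an undirected edge list."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

if __name__ == "__main__":
    h, original, sampled = sample_by_attributes()
    n = len(h)
    print("mean degree, original:", sum(degrees(n, original)) / n)
    print("mean degree, sampled: ", sum(degrees(n, sampled)) / n)
```

Comparing degree statistics of `original` and `sampled` for different choices of `r` gives a quick numerical feel for the sampling-induced properties the abstract warns about.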
-
Reentrant phase transitions in threshold driven contagion on multiplex networks
Authors:
Samuel Unicomb,
Gerardo Iñiguez,
János Kertész,
Márton Karsai
Abstract:
Models of threshold driven contagion explain the cascading spread of information, behavior, systemic risk, and epidemics on social, financial and biological networks. At odds with empirical observation, these models predict that single-layer unweighted networks become resistant to global cascades after reaching sufficient connectivity. We investigate threshold driven contagion on weight heterogeneous multiplex networks and show that they can remain susceptible to global cascades at any level of connectivity, and with increasing edge density pass through alternating phases of stability and instability in the form of reentrant phase transitions of contagion. Our results provide a novel theoretical explanation for the observation of large scale contagion in highly connected but heterogeneous networks.
Submitted 28 May, 2019; v1 submitted 24 January, 2019;
originally announced January 2019.
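
For orientation, the single-layer unweighted baseline that this abstract contrasts against can be sketched as a simple threshold cascade. This is a simplified illustration, not the weighted multiplex model of the paper: the Erdős-Rényi graph, the single uniform threshold `phi`, the single random seed node, and the function name are assumptions.

```python
import random

def threshold_cascade(n=500, mean_degree=4.0, phi=0.18, seed=0):
    """Return the final adopted fraction of a threshold cascade on a random graph."""
    rng = random.Random(seed)
    p = mean_degree / (n - 1)
    # Single-layer, unweighted Erdos-Renyi graph stored as adjacency lists.
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                nbrs[i].append(j)
                nbrs[j].append(i)

    adopted = [False] * n
    start = rng.randrange(n)          # one randomly chosen initial adopter
    adopted[start] = True
    frontier = [start]
    while frontier:
        newly_adopted = []
        for u in frontier:
            for v in nbrs[u]:
                if adopted[v]:
                    continue
                # v adopts once the fraction of its adopted neighbours reaches phi.
                frac = sum(adopted[w] for w in nbrs[v]) / len(nbrs[v])
                if frac >= phi:
                    adopted[v] = True
                    newly_adopted.append(v)
        frontier = newly_adopted
    return sum(adopted) / n

if __name__ == "__main__":
    for z in (1.0, 4.0, 12.0):
        print(z, threshold_cascade(mean_degree=z))
```

Sweeping `mean_degree` in such a sketch is one way to explore how connectivity affects cascade size in the baseline setting; reproducing the reentrant transitions reported in the abstract would require the weight-heterogeneous multiplex structure, which is not modelled here.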
-
Social capital predicts corruption risk in towns
Authors:
Johannes Wachs,
Taha Yasseri,
Balázs Lengyel,
János Kertész
Abstract:
Corruption is a social plague: gains accrue to small groups, while its costs are borne by everyone. Significant variation in its level between and within countries suggests a relationship between social structure and the prevalence of corruption, yet large-scale empirical studies thereof have been missing due to a lack of data. In this paper we relate the structural characteristics of the social capital of towns to corruption in their local governments. Using datasets from Hungary, we quantify corruption risk by suppressed competition and lack of transparency in the town's awarded public contracts. We characterize social capital using social network data from a popular online platform. Controlling for social, economic, and political factors, we find that settlements with fragmented social networks, indicating an excess of bonding social capital, have higher corruption risk, and towns with more diverse external connectivity, suggesting a surplus of bridging social capital, are less exposed to corruption. We interpret fragmentation as fostering in-group favoritism and conformity, which increase corruption, while diversity facilitates impartiality in public life and stifles corruption.
Submitted 12 October, 2018;
originally announced October 2018.
-
Structural transition in social networks: The role of homophily
Authors:
Yohsuke Murase,
Hang-Hyun Jo,
János Török,
János Kertész,
Kimmo Kaski
Abstract:
We introduce a model for the formation of social networks, which takes into account the homophily or the tendency of individuals to associate and bond with similar others, and the mechanisms of global and local attachment as well as tie reinforcement due to social interactions between people. We generalize the weighted social network model such that the nodes or individuals have $F$ features and each feature can have $q$ different values. Here the tendency for the tie formation between two individuals due to the overlap in their features represents homophily. We find a phase transition as a function of $F$ or $q$, resulting in a phase diagram. For fixed $q$ and as a function of $F$ the system shows two phases separated at $F_c$. For $F{<}F_c$ large, homogeneous, and well separated communities can be identified within which the features match almost perfectly (segregated phase). When $F$ becomes larger than $F_c$, the nodes start to belong to several communities and within a community the features match only partially (overlapping phase). Several quantities reflect this transition, including the average degree, clustering coefficient, feature overlap, and the number of communities per node. We also make an attempt to interpret these results in terms of observations on social behavior of humans.
Submitted 26 March, 2019; v1 submitted 15 August, 2018;
originally announced August 2018.
-
Status maximization as a source of fairness in a networked dictator game
Authors:
Jan E. Snellman,
Gerardo Iñiguez,
János Kertész,
R. A. Barrio,
Kimmo K. Kaski
Abstract:
Human behavioural patterns exhibit selfish or competitive, as well as selfless or altruistic tendencies, both of which have demonstrable effects on human social and economic activity. In behavioural economics, such effects have traditionally been illustrated experimentally via simple games like the dictator and ultimatum games. Experiments with these games suggest that, beyond rational economic thinking, human decision-making processes are influenced by social preferences, such as an inclination to fairness. In this study we suggest that the apparent gap between competitive and altruistic human tendencies can be bridged by assuming that people are primarily maximising their status, i.e., a utility function different from simple profit maximisation. To this end we analyse a simple agent-based model, where individuals play the repeated dictator game in a social network they can modify. As model parameters we consider the living costs and the rate at which agents forget infractions by others. We find that individual strategies used in the game vary greatly, from selfish to selfless, and that both of the above parameters determine when individuals form complex and cohesive social networks.
Submitted 14 June, 2018;
originally announced June 2018.
-
The role of geography in the complex diffusion of innovations
Authors:
Balázs Lengyel,
Eszter Bokányi,
Riccardo Di Clemente,
János Kertész,
Marta C. González
Abstract:
The urban-rural divide is increasing in modern societies, calling for geographical extensions of social influence modelling. Improved understanding of innovation diffusion across locations and through social connections can provide us with new insights into the spread of information, technological progress and economic development. In this work, we analyze the spatial adoption dynamics of iWiW, an Online Social Network (OSN) in Hungary, and uncover empirical features of spatial adoption in social networks. During its entire life cycle from 2002 to 2012, iWiW reached up to 300 million friendship ties of 3 million users. We find that the number of adopters as a function of town population follows a scaling law that reveals a strongly concentrated early adoption in large towns and a less concentrated late adoption. We also discover a strengthening distance decay of spread over the life cycle, indicating a high fraction of distant diffusion in early stages but the dominance of local diffusion in late stages. The spreading process is modelled within the Bass diffusion framework, which enables us to compare the differential equation version with an agent-based version of the model run on the empirical network. Although both models can capture the macro trend of adoption, they have limited capacity to describe the observed trends of urban scaling and distance decay. We find, however, that incorporating adoption thresholds, defined by the fraction of social connections that adopt a technology before the individual adopts, improves the network model fit to the urban scaling of early adopters. Controlling for the threshold distribution enables us to eliminate the bias induced by local network structure on predicting local adoption peaks. Finally, we show that geographical features such as distance from the innovation origin and town size influence the prediction of adoption peaks at local scales.
Submitted 27 August, 2020; v1 submitted 4 April, 2018;
originally announced April 2018.
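
As a point of reference for the differential equation version mentioned in this abstract, the standard Bass model, dF/dt = (p + qF)(1 - F), can be integrated in a few lines. The sketch below is not the authors' implementation: the parameter values and the function names are illustrative assumptions, and the forward Euler scheme is simply the most direct choice.

```python
import math

def bass_curve(p=0.03, q=0.38, dt=0.01, t_max=30.0):
    """Forward Euler integration of the Bass equation dF/dt = (p + q*F) * (1 - F).

    F is the cumulative adopted fraction, p the innovation (external) rate and
    q the imitation (social) rate. Parameter values are illustrative only.
    """
    steps = int(round(t_max / dt))
    F = 0.0
    curve = [F]
    for _ in range(steps):
        F += dt * (p + q * F) * (1.0 - F)
        curve.append(F)
    return curve

def bass_exact(t, p=0.03, q=0.38):
    """Closed-form solution of the same equation with F(0) = 0."""
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)

if __name__ == "__main__":
    curve = bass_curve()
    print("numerical F(30):", curve[-1])
    print("exact     F(30):", bass_exact(30.0))
```

An agent-based counterpart would replace the global fraction F in the imitation term with the adopted fraction among each individual's network neighbours, which is where local network structure and adoption thresholds enter.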
-
Algorithmic bias amplifies opinion polarization: A bounded confidence model
Authors:
Alina Sîrbu,
Dino Pedreschi,
Fosca Giannotti,
János Kertész
Abstract:
The flow of information reaching us via the online media platforms is optimized not by the information content or relevance but by popularity and proximity to the target. This is typically performed in order to maximise platform usage. As a side effect, this introduces an algorithmic bias that is believed to enhance polarization of the societal debate. To study this phenomenon, we modify the well-known continuous opinion dynamics model of bounded confidence in order to account for the algorithmic bias and investigate its consequences. In the simplest version of the original model the pairs of discussion participants are chosen at random and their opinions get closer to each other if they are within a fixed tolerance level. We modify the selection rule of the discussion partners: there is an enhanced probability to choose individuals whose opinions are already close to each other, thus mimicking the behavior of online media which suggest interaction with similar peers. As a result we observe: a) an increased tendency towards polarization, which emerges also in conditions where the original model would predict convergence, and b) a dramatic slowing down of the speed at which the convergence at the asymptotic state is reached, which makes the system highly unstable. Polarization is augmented by a fragmented initial population.
Submitted 6 March, 2018;
originally announced March 2018.
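
The modified partner-selection rule described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the power-law weighting of partners by opinion distance with exponent `gamma`, the small distance floor, and the function name are choices made here; `gamma = 0` reduces to uniformly random partners as in the original bounded confidence model.

```python
import random

def biased_bounded_confidence(n=200, eps=0.3, mu=0.5, gamma=1.0,
                              steps=20000, seed=0):
    """Bounded confidence opinion dynamics with biased partner selection."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]   # initial opinions, uniform on [0, 1)
    for _ in range(steps):
        i = rng.randrange(n)
        # Partner j is drawn with weight |x_i - x_j|^(-gamma): closer opinions
        # are more likely to meet. The functional form is an assumption.
        weights = [
            0.0 if j == i else max(abs(x[i] - x[j]), 1e-9) ** (-gamma)
            for j in range(n)
        ]
        j = rng.choices(range(n), weights=weights, k=1)[0]
        # Opinions move closer only if they lie within the tolerance eps.
        if abs(x[i] - x[j]) < eps:
            xi, xj = x[i], x[j]
            x[i] = xi + mu * (xj - xi)
            x[j] = xj + mu * (xi - xj)
    return x

if __name__ == "__main__":
    for g in (0.0, 1.0, 2.0):
        final = biased_bounded_confidence(gamma=g)
        clusters = sorted({round(v, 1) for v in final})
        print("gamma =", g, "approximate opinion clusters:", clusters)
```

Running the sketch for increasing `gamma` at fixed `eps` is one way to explore the effects reported in the abstract; the crude rounding used to count clusters is only for a quick look at the output.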
-
Peer relations with mobile phone data: Best friends and family formation
Authors:
Tamas David-Barrett,
Anna Rotkirch,
Asim Ghosh,
Kunal Bhattacharya,
Daniel Monsivais,
Isabel Behncke,
Janos Kertesz,
Kimmo Kaski
Abstract:
Earlier attempts to investigate the changes of the role of friendship in different life stages have failed due to lack of data. We close this gap by using a large data set of mobile phone calls from a European country in 2007 to study how people's call patterns to their close social contacts are associated with the age and gender of the callers. We hypothesize that (i) communication with peers, defined as callers of similar age, will be most important during the period of family formation and that (ii) the importance of best friends, defined as same-sex callers of exactly the same age, will be stronger for women than for men. Results show that the frequency of phone calls with same-sex peers in this population is relatively stable through life for both men and women. In line with the first hypothesis, there was a significant increase in the length of the phone calls for callers between the ages of 30 and 40 years. Partly in line with the second hypothesis, the increase in phone calls turned out to be particularly pronounced among females, although there were only minor gender differences in call frequencies. Furthermore, women tended to have long phone conversations with their same-age female friend, and also with somewhat older peers. In sum, we provide evidence from big data for the adult life stages at which peers are most important, and suggest that best friends appear to have a niche of their own in human sociality.
Submitted 25 August, 2017;
originally announced August 2017.
-
Service adoption spreading in online social networks
Authors:
Gerardo Iñiguez,
Zhongyuan Ruan,
Kimmo Kaski,
János Kertész,
Márton Karsai
Abstract:
The collective behaviour of people adopting an innovation, product or online service is commonly interpreted as a spreading phenomenon throughout the fabric of society. This process is arguably driven by social influence, social learning and by external effects like media. Observations of such processes date back to the seminal studies by Rogers and Bass, and their mathematical modelling has taken two directions: One paradigm, called simple contagion, identifies adoption spreading with an epidemic process. The other one, named complex contagion, is concerned with behavioural thresholds and successfully explains the emergence of large cascades of adoption resulting in a rapid spreading often seen in empirical data. The observation of real world adoption processes has become easier lately due to the availability of large digital social network and behavioural datasets. This has allowed simultaneous study of network structures and dynamics of online service adoption, shedding light on the mechanisms and external effects that influence the temporal evolution of behavioural or innovation adoption. These advancements have induced the development of more realistic models of social spreading phenomena, which in turn have provided remarkably good predictions of various empirical adoption processes. In this chapter we review recent data-driven studies addressing real-world service adoption processes. Our studies provide the first detailed empirical evidence of a heterogeneous threshold distribution in adoption. We also describe the modelling of such phenomena with formal methods and data-driven simulations. Our objective is to understand the effects of identified social mechanisms on service adoption spreading, and to provide potential new directions and open questions for future research.
Submitted 29 June, 2017;
originally announced June 2017.
-
Universality and scaling laws in the cascading failure model with healing
Authors:
Marcell Stippinger,
János Kertész
Abstract:
Cascading failures may lead to dramatic collapse in interdependent networks, where the breakdown takes place as a discontinuity of the order parameter. In the cascading failure (CF) model with healing there is a control parameter which at some value suppresses the discontinuity of the order parameter. However, up to this value of the healing parameter the breakdown is a hybrid transition, meaning that, besides this first order character, the transition shows scaling too. In this paper we investigate the question of universality related to the scaling behavior. Recently we showed that the hybrid phase transition in the original CF model has two sets of exponents describing respectively the order parameter and the cascade statistics, which are connected by a scaling law. In the CF model with healing we measure these exponents as a function of the healing parameter. We find two universality classes: In the wide range below the critical healing value the exponents agree with those of the original model, while above this value the model displays trivial scaling meaning that fluctuations follow the central limit theorem.
Submitted 22 December, 2017; v1 submitted 27 May, 2017;
originally announced May 2017.
-
Stylized facts in social networks: Community-based static modeling
Authors:
Hang-Hyun Jo,
Yohsuke Murase,
János Török,
János Kertész,
Kimmo Kaski
Abstract:
The past analyses of datasets of social networks have enabled us to make empirical findings of a number of aspects of human society, which are commonly featured as stylized facts of social networks, such as broad distributions of network quantities, existence of communities, assortative mixing, and intensity-topology correlations. Since the understanding of the structure of these complex social networks is far from complete, for deeper insight into human society more comprehensive datasets and modeling of the stylized facts are needed. Although the existing dynamical and static models can generate some stylized facts, here we take an alternative approach by devising a community-based static model with heterogeneous community sizes and larger communities having smaller link density and weight. With these few assumptions we are able to generate realistic social networks that show most stylized facts for a wide range of parameters, as demonstrated numerically and analytically. Since our community-based static model is simple to implement and easily scalable, it can be used as a reference system, benchmark, or testbed for further applications.
Submitted 8 August, 2017; v1 submitted 11 November, 2016;
originally announced November 2016.
-
Multiplex Modeling of the Society
Authors:
Janos Kertesz,
Janos Torok,
Yohsuke Murase,
Hang-Hyun Jo,
Kimmo Kaski
Abstract:
The society has a multi-layered structure, where the layers represent the different contexts. To model this structure we begin with a single-layer weighted social network (WSN) model showing the Granovetterian structure. We find that when merging such WSN models, a sufficient amount of inter-layer correlation is needed to maintain the relationship between topology and link weights, while these correlations destroy the enhancement in the community overlap due to multiple layers. To resolve this, we devise a geographic multi-layer WSN model, where the indirect inter-layer correlations due to the geographic constraints of individuals enhance the overlaps between the communities and, at the same time, the Granovetterian structure is preserved. Furthermore, the network of social interactions can be considered as a multiplex from another point of view too: each layer corresponds to one communication channel and the aggregate of all of them constitutes the entire social network. However, usually one has information only about one of the channels, which should be considered as a sample of the whole. Here we show by simulations and analytical methods that this sampling may lead to bias. For example, while it is expected that the degree distribution of the whole social network has a maximum at a value larger than one, we get, with reasonable assumptions about the sampling process, a monotonically decreasing distribution as observed in empirical studies of single channel data. We analyse the far-reaching consequences of our findings.
Submitted 27 September, 2016;
originally announced September 2016.
-
Empirical study of the role of the topology in spreading on communication networks
Authors:
Alexey N. Medvedev,
Janos Kertesz
Abstract:
Topological aspects, like community structure, and temporal activity patterns, like burstiness, have been shown to severely influence the speed of spreading in temporal networks. We study the influence of the topology on the susceptible-infected (SI) spreading on time-stamped communication networks, as obtained from a dataset of mobile phone records. We consider city-level networks with intra- and inter-city connections. The networks using only intra-city links are usually sparse, and the spreading on them depends mainly on the average degree. The inter-city links serve as bridges in spreading, considerably speeding up the process. We also demonstrate the effect in model simulations.
Submitted 6 July, 2016;
originally announced July 2016.
-
Local cascades induced global contagion: How heterogeneous thresholds, exogenous effects, and unconcerned behaviour govern online adoption spreading
Authors:
Márton Karsai,
Gerardo Iñiguez,
Riivo Kikas,
Kimmo Kaski,
János Kertész
Abstract:
Adoption of innovations, products or online services is commonly interpreted as a spreading process driven to a large extent by social influence and conditioned by the needs and capacities of individuals. To model this process one usually introduces behavioural threshold mechanisms, which can give rise to the evolution of global cascades if the system satisfies a set of conditions. However, these models do not address temporal aspects of the emerging cascades, which in real systems may evolve through various pathways ranging from slow to rapid patterns. Here we fill this gap through the analysis and modelling of product adoption in the world's largest voice over internet service, the social network of Skype. We provide empirical evidence about the heterogeneous distribution of fractional behavioural thresholds, which appears to be independent of the degree of adopting egos. We show that the structure of real-world adoption clusters is radically different from previous theoretical expectations, since vulnerable adoptions, induced by a single adopting neighbour, appear to be important only locally, while spontaneous adopters arriving at a constant rate and the involvement of unconcerned individuals govern the global emergence of social spreading.
Submitted 29 January, 2016;
originally announced January 2016.
-
Communication with family and friends across the life course
Authors:
Tamas David-Barrett,
Janos Kertesz,
Anna Rotkirch,
Asim Ghosh,
Kunal Bhattacharya,
Daniel Monsivais,
Kimmo Kaski
Abstract:
Each stage of the human life course is characterized by a distinctive pattern of social relations. We study how the intensity and importance of the closest social contacts vary across the life course, using a large database of mobile communication from a European country. We first determine the most likely social relationship type from these mobile phone records by relating the age and gender of the caller and recipient to the frequency, length, and direction of calls. We then show how communication patterns between parents and children, romantic partners, and friends vary across the six main stages of the adult family life course. Young adulthood is dominated by a gradual shift of call activity from parents to close friends, and then to a romantic partner, culminating in the period of early family formation during which the focus is on the romantic partner. During middle adulthood call patterns suggest a high dependence on the parents of the ego, who presumably often provide alloparental care, while at this stage female same-gender friendship also peaks. During post-reproductive adulthood, individuals, and especially women, balance close social contacts among three generations. The age of grandparenthood brings the children entering adulthood and family formation into focus, and is associated with a realignment of close social contacts, especially among women, while old age is dominated by dependence on their children.
Submitted 30 December, 2015;
originally announced December 2015.
-
What does Big Data tell? Sampling the social network by communication channels
Authors:
János Török,
Yohsuke Murase,
Hang-Hyun Jo,
János Kertész,
Kimmo Kaski
Abstract:
Big Data has become the primary source of understanding the structure and dynamics of the society at large scale. The network of social interactions can be considered as a multiplex, where each layer corresponds to one communication channel and the aggregate of all of them constitutes the entire social network. However, usually one has information only about one of the channels or even a part of it, which should be considered as a subset or sample of the whole. Here we introduce a model based on a natural bilateral communication channel selection mechanism, which for one channel leads to consistent changes in the network properties. For example, while it is expected that the degree distribution of the whole social network has a maximum at a value larger than one, we get a monotonically decreasing distribution as observed in empirical studies of single channel data. We also find that assortativity may occur or get strengthened due to the sampling method. We analyze the far-reaching consequences of our findings.
Submitted 28 October, 2016; v1 submitted 27 November, 2015;
originally announced November 2015.
-
Kinetics of Social Contagion
Authors:
Zhongyuan Ruan,
Gerardo Iniguez,
Marton Karsai,
Janos Kertesz
Abstract:
Diffusion of information, behavioral patterns or innovations follows diverse pathways depending on a number of conditions, including the structure of the underlying social network, the sensitivity to peer pressure and the influence of media. Here we study analytically and by simulations a general model that incorporates a threshold mechanism capturing sensitivity to peer pressure, the effect of 'immune' nodes who never adopt, and a perpetual flow of external information. While any constant, non-zero rate of dynamically-introduced spontaneous adopters leads to global spreading, the kinetics by which the asymptotic state is approached shows rich behavior. In particular we find that, as a function of the immune node density, there is a transition from fast to slow spreading governed by entirely different mechanisms. This transition happens below the percolation threshold of network fragmentation, and has its origin in the competition between cascading behavior induced by adopters and blocking due to immune nodes. This change is accompanied by a percolation transition of the induced clusters.
Submitted 30 October, 2015; v1 submitted 31 May, 2015;
originally announced June 2015.
-
Modeling the role of relationship fading and breakup in social network formation
Authors:
Yohsuke Murase,
Hang-Hyun Jo,
János Török,
János Kertész,
Kimmo Kaski
Abstract:
In social networks of human individuals, social relationships do not necessarily last forever as they can either fade gradually with time, resulting in link aging, or terminate abruptly, causing link deletion, as even old friendships may cease. In this paper, we study a social network formation model where we introduce several ways in which link termination takes place. If we adopt link aging, we get a more modular structure with more homogeneously distributed link weights within communities than when link deletion is used. By investigating distributions and relations of various network characteristics, we find that the empirical findings are better reproduced with the link deletion model. This indicates that link deletion plays a more prominent role in organizing social networks than link aging.
Submitted 22 June, 2015; v1 submitted 4 May, 2015;
originally announced May 2015.
-
Geographies of an online social network
Authors:
Balázs Lengyel,
Attila Varga,
Bence Ságvári,
Ákos Jakobi,
János Kertész
Abstract:
How is online social media activity structured in the geographical space? Recent studies have shown that in spite of earlier visions about the "death of distance", physical proximity is still a major factor in social tie formation and maintenance in virtual social networks. Yet it is unclear what the characteristics of the distance dependence in online social networks are. In order to explore this issue the complete network of the former major Hungarian online social network is analyzed. We find that the distance dependence is weaker for the online social network ties than what was found earlier for phone communication networks. For a further analysis we introduced a coarser granularity: We identified the settlements with the nodes of a network and assigned two kinds of weights to the links between them. When the weights are proportional to the number of contacts we observed weakly formed, but spatially based modules resembling the borders of macro-regions, the highest level of regional administration in the country. If the weights are defined relative to an uncorrelated null model, the next level of administrative regions, the counties, is reflected.
Submitted 13 August, 2015; v1 submitted 26 March, 2015;
originally announced March 2015.
-
Multilayer weighted social network model
Authors:
Yohsuke Murase,
János Török,
Hang-Hyun Jo,
Kimmo Kaski,
János Kertész
Abstract:
Recent empirical studies using large-scale data sets have validated the Granovetter hypothesis on the structure of the society in that there are strongly wired communities connected by weak ties. However, as interaction between individuals takes place in diverse contexts, these communities turn out to be overlapping. This implies that the society has a multilayered structure, where the layers represent the different contexts. To model this structure we begin with a single-layer weighted social network (WSN) model showing the Granovetterian structure. We find that when merging such WSN models, a sufficient amount of interlayer correlation is needed to maintain the relationship between topology and link weights, while these correlations destroy the enhancement in the community overlap due to multiple layers. To resolve this, we devise a geographic multilayer WSN model, where the indirect interlayer correlations due to the geographic constraints of individuals enhance the overlaps between the communities and, at the same time, the Granovetterian structure is preserved.
Submitted 10 November, 2014; v1 submitted 6 August, 2014;
originally announced August 2014.
-
Complex contagion process in spreading of online innovation
Authors:
Márton Karsai,
Gerardo Iñiguez,
Kimmo Kaski,
János Kertész
Abstract:
Diffusion of innovation can be interpreted as a social spreading phenomenon governed by the impact of media and social interactions. Although these mechanisms have been identified by quantitative theories, their role and relative importance are not entirely understood, since empirical verification has so far been hindered by the lack of appropriate data. Here we analyse a dataset recording the spreading dynamics of the world's largest Voice over Internet Protocol service to empirically support the assumptions behind models of social contagion. We show that the rate of spontaneous service adoption is constant, the probability of adoption via social influence is linearly proportional to the fraction of adopting neighbours, and the rate of service termination is time-invariant and independent of the behaviour of peers. By implementing the detected diffusion mechanisms into a dynamical agent-based model, we are able to emulate the adoption dynamics of the service in several countries worldwide. This approach enables us to make medium-term predictions of service adoption and disclose dependencies between the dynamics of innovation spreading and the socioeconomic development of a country.
Submitted 23 October, 2014; v1 submitted 27 May, 2014;
originally announced May 2014.
-
Transmission of cultural traits in layered ego-centric networks
Authors:
Vasyl Palchykov,
Kimmo Kaski,
Janos Kertész
Abstract:
Although a number of models have been developed to investigate the emergence of culture and evolutionary phases in social systems, one important aspect has not yet been sufficiently emphasized. This is the structure of the underlying network of social relations serving as channels in transmitting cultural traits, which is expected to play a crucial role in the evolutionary processes in social systems. In this paper we contribute to the understanding of the role of the network structure by developing a model based on a layered ego-centric network structure, inspired by the social brain hypothesis, to study the transmission of cultural traits and their evolution in social networks. For this model we first find analytical results in the spirit of the mean-field approximation and then, to validate the results, we compare them with the results of extensive numerical simulations.
Submitted 19 November, 2014; v1 submitted 23 May, 2014;
originally announced May 2014.
-
Statistically validated mobile communication networks: Evolution of motifs in European and Chinese data
Authors:
Ming-Xia Li,
Vasyl Palchykov,
Zhi-Qiang Jiang,
Kimmo Kaski,
Janos Kertész,
Salvatore Miccichè,
Michele Tumminello,
Wei-Xing Zhou,
Rosario N. Mantegna
Abstract:
Big data open up unprecedented opportunities to investigate complex systems including the society. In particular, communication data serve as major sources for computational social sciences but they have to be cleaned and filtered as they may contain spurious information due to recording errors as well as interactions, like commercial and marketing activities, not directly related to the social network. The network constructed from communication data can only be considered as a proxy for the network of social relationships. Here we apply a systematic method, based on multiple hypothesis testing, to statistically validate the links and then construct the corresponding Bonferroni network, generalized to the directed case. We study two large datasets of mobile phone records, one from Europe and the other from China. For both datasets we compare the raw data networks with the corresponding Bonferroni networks and point out significant differences in the structures and in the basic network measures. We show evidence that the Bonferroni network provides a better proxy for the network of social interactions than the original one. By using the filtered networks we investigated the statistics and temporal evolution of small directed 3-motifs and conclude that closed communication triads have a formation time-scale, which is quite fast and typically intraday. We also find that open communication triads preferentially evolve to other open triads with a higher fraction of reciprocated calls. These stylized facts were observed for both datasets.
Submitted 15 March, 2014;
originally announced March 2014.
-
Modeling Social Dynamics in a Collaborative Environment
Authors:
Gerardo Iñiguez,
János Török,
Taha Yasseri,
Kimmo Kaski,
János Kertész
Abstract:
Wikipedia is a prime example of today's value production in a collaborative environment. Using this example, we model the emergence, persistence and resolution of severe conflicts during collaboration by coupling opinion formation with article editing in a bounded confidence dynamics. The complex social behavior involved in editing articles is implemented as a minimal model with two basic elements: (i) individuals interact directly to share information and convince each other, and (ii) they edit a common medium to establish their own opinions. The opinions of the editors and the opinion represented by the article are each characterised by a scalar variable. When the pool of editors is fixed, three regimes can be distinguished: (a) a stable mainstream article opinion is continuously contested by editors with extremist views and there is slow convergence towards consensus, (b) the article oscillates between editors with extremist views, reaching consensus relatively fast at one of the extremes, and (c) the extremist editors are converted very fast to the mainstream opinion and the article has an erratic evolution. When editors are renewed with a certain rate, a dynamical transition occurs between different kinds of edit wars, which qualitatively reflect the dynamics of conflicts as observed in real Wikipedia data.
Submitted 14 June, 2014; v1 submitted 14 March, 2014;
originally announced March 2014.
-
Enhancing resilience of interdependent networks by healing
Authors:
Marcell Stippinger,
János Kertész
Abstract:
Interdependent networks are characterized by two kinds of interactions: The usual connectivity links within each network and the dependency links coupling nodes of different networks. Due to the latter links such networks are known to suffer from cascading failures and catastrophic breakdowns. When modeling these phenomena, usually one assumes that a fraction of nodes gets damaged in one of the networks, which is followed possibly by a cascade of failures. In real life the initiating failures do not occur at once and effort is made to replace the ties eliminated due to the failing nodes. Here we study a dynamic extension of the model of interdependent networks and introduce the possibility of link formation with a probability w, called healing, to bridge non-functioning nodes and enhance network resilience. A single random node is removed, which may initiate an avalanche. After each removal step healing sets in, resulting in a new topology. Then a new node fails and the process continues until the giant component disappears either in a catastrophic breakdown or in a smooth transition. Simulation results are presented for square lattices as starting networks under random attacks of constant intensity. We find that the shift in the position of the breakdown has a power-law scaling as a function of the healing probability with an exponent close to 1. Below a critical healing probability, catastrophic cascades form and the average degree of surviving nodes decreases monotonically, while above this value there are no macroscopic cascades and the average degree first increases and decreases only at the very late stage of the process. These findings facilitate the planning of interventions in crisis situations by describing the efficiency of healing efforts needed to suppress cascading failures.
Submitted 4 September, 2014; v1 submitted 6 December, 2013;
originally announced December 2013.