-
Under manipulations, are some AI models harder to audit?
Authors:
Augustin Godinot,
Gilles Tredan,
Erwan Le Merrer,
Camilla Penzo,
Francois Taïani
Abstract:
Auditors need robust methods to assess the compliance of web platforms with the law. However, since they hardly ever have access to the algorithm, implementation, or training data used by a platform, the problem is harder than a simple metric estimation. Within the recent framework of manipulation-proof auditing, we study in this paper the feasibility of robust audits in realistic settings, in whi…
▽ More
Auditors need robust methods to assess the compliance of web platforms with the law. However, since they hardly ever have access to the algorithm, implementation, or training data used by a platform, the problem is harder than a simple metric estimation. Within the recent framework of manipulation-proof auditing, we study in this paper the feasibility of robust audits in realistic settings, in which models exhibit large capacities. We first prove a constraining result: if a web platform uses models that may fit any data, no audit strategy -- whether active or not -- can outperform random sampling when estimating properties such as demographic parity. To better understand the conditions under which state-of-the-art auditing techniques may remain competitive, we then relate the manipulability of audits to the capacity of the targeted models, using the Rademacher complexity. We empirically validate these results on popular models of increasing capacities, thus confirming experimentally that large-capacity models, which are commonly used in practice, are particularly hard to audit robustly. These results refine the limits of the auditing problem, and open up enticing questions on the connection between model capacity and the ability of platforms to manipulate audit attempts.
△ Less
Submitted 14 February, 2024;
originally announced February 2024.
-
Fairness Auditing with Multi-Agent Collaboration
Authors:
Martijn de Vos,
Akash Dhasade,
Jade Garcia Bourrée,
Anne-Marie Kermarrec,
Erwan Le Merrer,
Benoit Rottembourg,
Gilles Tredan
Abstract:
Existing work in fairness auditing assumes that each audit is performed independently. In this paper, we consider multiple agents working together, each auditing the same platform for different tasks. Agents have two levers: their collaboration strategy, with or without coordination beforehand, and their strategy for sampling appropriate data points. We theoretically compare the interplay of these…
▽ More
Existing work in fairness auditing assumes that each audit is performed independently. In this paper, we consider multiple agents working together, each auditing the same platform for different tasks. Agents have two levers: their collaboration strategy, with or without coordination beforehand, and their strategy for sampling appropriate data points. We theoretically compare the interplay of these levers. Our main findings are that (i) collaboration is generally beneficial for accurate audits, (ii) basic sampling methods often prove to be effective, and (iii) counter-intuitively, extensive coordination on queries often deteriorates audits accuracy as the number of agents increases. Experiments on three large datasets confirm our theoretical results. Our findings motivate collaboration during fairness audits of platforms that use ML models for decision-making.
△ Less
Submitted 16 August, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Modeling Rabbit-Holes on YouTube
Authors:
Erwan Le Merrer,
Gilles Tredan,
Ali Yesilkanat
Abstract:
Numerous discussions have advocated the presence of a so called rabbit-hole (RH) phenomenon on social media, interested in advanced personalization to their users. This phenomenon is loosely understood as a collapse of mainstream recommendations, in favor of ultra personalized ones that lock users into narrow and specialized feeds. Yet quantitative studies are often ignoring personalization, are o…
▽ More
Numerous discussions have advocated the presence of a so called rabbit-hole (RH) phenomenon on social media, interested in advanced personalization to their users. This phenomenon is loosely understood as a collapse of mainstream recommendations, in favor of ultra personalized ones that lock users into narrow and specialized feeds. Yet quantitative studies are often ignoring personalization, are of limited scale, and rely on manual tagging to track this collapse. This precludes a precise understanding of the phenomenon based on reproducible observations, and thus the continuous audits of platforms. In this paper, we first tackle the scale issue by proposing a user-sided bot-centric approach that enables large scale data collection, through autoplay walks on recommendations. We then propose a simple theory that explains the appearance of these RHs. While this theory is a simplifying viewpoint on a complex and planet-wide phenomenon, it carries multiple advantages: it can be analytically modeled, and provides a general yet rigorous definition of RHs. We define them as an interplay between i) user interaction with personalization and ii) the attraction strength of certain video categories, which cause users to quickly step apart of mainstream recommendations made to fresh user profiles. We illustrate these concepts by highlighting some RHs found after collecting more than 16 million personalized recommendations on YouTube. A final validation step compares our automatically-identified RHs against manually-identified RHs from a previous research work. Together, those results pave the way for large scale and automated audits of the RH effect in recommendation systems.
△ Less
Submitted 19 July, 2023;
originally announced July 2023.
-
On the relevance of APIs facing fairwashed audits
Authors:
Jade Garcia Bourrée,
Erwan Le Merrer,
Gilles Tredan,
Benoît Rottembourg
Abstract:
Recent legislation required AI platforms to provide APIs for regulators to assess their compliance with the law. Research has nevertheless shown that platforms can manipulate their API answers through fairwashing. Facing this threat for reliable auditing, this paper studies the benefits of the joint use of platform scraping and of APIs. In this setup, we elaborate on the use of scraping to detect…
▽ More
Recent legislation required AI platforms to provide APIs for regulators to assess their compliance with the law. Research has nevertheless shown that platforms can manipulate their API answers through fairwashing. Facing this threat for reliable auditing, this paper studies the benefits of the joint use of platform scraping and of APIs. In this setup, we elaborate on the use of scraping to detect manipulated answers: since fairwashing only manipulates API answers, exploiting scraps may reveal a manipulation. To abstract the wide range of specific API-scrap situations, we introduce a notion of proxy that captures the consistency an auditor might expect between both data sources. If the regulator has a good proxy of the consistency, then she can easily detect manipulation and even bypass the API to conduct her audit. On the other hand, without a good proxy, relying on the API is necessary, and the auditor cannot defend against fairwashing.
We then simulate practical scenarios in which the auditor may mostly rely on the API to conveniently conduct the audit task, while maintaining her chances to detect a potential manipulation. To highlight the tension between the audit task and the API fairwashing detection task, we identify Pareto-optimal strategies in a practical audit scenario.
We believe this research sets the stage for reliable audits in practical and manipulation-prone setups.
△ Less
Submitted 23 May, 2023;
originally announced May 2023.
-
On the Price of Locality in Static Fast Rerouting
Authors:
Klaus-Tycho Foerster,
Juho Hirvonen,
Yvonne-Anne Pignolet,
Stefan Schmid,
Gilles Tredan
Abstract:
Modern communication networks feature fully decentralized flow rerouting mechanisms which allow them to quickly react to link failures. This paper revisits the fundamental algorithmic problem underlying such local fast rerouting mechanisms. Is it possible to achieve perfect resilience, i.e., to define local routing tables which preserve connectivity as long as the underlying network is still conne…
▽ More
Modern communication networks feature fully decentralized flow rerouting mechanisms which allow them to quickly react to link failures. This paper revisits the fundamental algorithmic problem underlying such local fast rerouting mechanisms. Is it possible to achieve perfect resilience, i.e., to define local routing tables which preserve connectivity as long as the underlying network is still connected? Feigenbaum et al. (ACM PODC'12) and Foerster et al. (SIAM APOCS'21) showed that, unfortunately, it is impossible in general.
This paper charts a more complete landscape of the feasibility of perfect resilience. We first show a perhaps surprisingly large price of locality in static fast rerouting mechanisms: even when source and destination remain connected by a linear number of link-disjoint paths after link failures, local rerouting algorithms cannot find any of them which leads to a disconnection on the routing level. This motivates us to study resilience in graphs which exclude certain dense minors, such as cliques or a complete bipartite graphs, and in particular, provide characterizations of the possibility of perfect resilience in different routing models. We provide further insights into the price of locality by showing impossibility results for few failures and investigate perfect resilience on Topology Zoo networks.
△ Less
Submitted 7 April, 2022;
originally announced April 2022.
-
Algorithmic audits of algorithms, and the law
Authors:
Erwan Le Merrer,
Ronan Pons,
Gilles Trédan
Abstract:
Algorithmic decision making is now widespread, ranging from health care allocation to more common actions such as recommendation or information ranking. The aim to audit these algorithms has grown alongside. In this paper, we focus on external audits that are conducted by interacting with the user side of the target algorithm, hence considered as a black box. Yet, the legal framework in which thes…
▽ More
Algorithmic decision making is now widespread, ranging from health care allocation to more common actions such as recommendation or information ranking. The aim to audit these algorithms has grown alongside. In this paper, we focus on external audits that are conducted by interacting with the user side of the target algorithm, hence considered as a black box. Yet, the legal framework in which these audits take place is mostly ambiguous to researchers developing them: on the one hand, the legal value of the audit outcome is uncertain; on the other hand the auditors' rights and obligations are unclear. The contribution of this paper is to articulate two canonical audit forms to law, to shed light on these aspects: 1) the first audit form (we coin the Bobby audit form) checks a predicate against the algorithm, while the second (Sherlock) is more loose and opens up to multiple investigations. We find that: Bobby audits are more amenable to prosecution, yet are delicate as operating on real user data. This can lead to reject by a court (notion of admissibility). Sherlock audits craft data for their operation, most notably to build surrogates of the audited algorithm. It is mostly used for acts for whistleblowing, as even if accepted as a proof, the evidential value will be low in practice. 2) these two forms require the prior respect of a proper right to audit, granted by law or by the platform being audited; otherwise the auditor will be also prone to prosecutions regardless of the audit outcome. This article thus highlights the relation of current audits with law, in order to structure the growing field of algorithm auditing.
△ Less
Submitted 15 February, 2022;
originally announced March 2022.
-
Setting the Record Straighter on Shadow Banning
Authors:
Erwan Le Merrer,
Benoit Morgan,
Gilles Trédan
Abstract:
Shadow banning consists for an online social network in limiting the visibility of some of its users, without them being aware of it. Twitter declares that it does not use such a practice, sometimes arguing about the occurrence of "bugs" to justify restrictions on some users. This paper is the first to address the plausibility or not of shadow banning on a major online platform, by adopting both a…
▽ More
Shadow banning consists for an online social network in limiting the visibility of some of its users, without them being aware of it. Twitter declares that it does not use such a practice, sometimes arguing about the occurrence of "bugs" to justify restrictions on some users. This paper is the first to address the plausibility or not of shadow banning on a major online platform, by adopting both a statistical and a graph topological approach. We first conduct an extensive data collection and analysis campaign, gathering occurrences of visibility limitations on user profiles (we crawl more than 2.5 million of them). In such a black-box observation setup, we highlight the salient user profile features that may explain a banning practice (using machine learning predictors). We then pose two hypotheses for the phenomenon: i) limitations are bugs, as claimed by Twitter, and ii) shadow banning propagates as an epidemic on user-interactions ego-graphs. We show that hypothesis i) is statistically unlikely with regards to the data we collected. We then show some interesting correlation with hypothesis ii), suggesting that the interaction topology is a good indicator of the presence of groups of shadow banned users on the service.
△ Less
Submitted 13 April, 2021; v1 submitted 9 December, 2020;
originally announced December 2020.
-
On the Feasibility of Perfect Resilience with Local Fast Failover
Authors:
Klaus-Tycho Foerster,
Juho Hirvonen,
Yvonne-Anne Pignolet,
Stefan Schmid,
Gilles Tredan
Abstract:
In order to provide a high resilience and to react quickly to link failures, modern computer networks support fully decentralized flow rerouting, also known as local fast failover. In a nutshell, the task of a local fast failover algorithm is to pre-define fast failover rules for each node using locally available information only. These rules determine for each incoming link from which a packet ma…
▽ More
In order to provide a high resilience and to react quickly to link failures, modern computer networks support fully decentralized flow rerouting, also known as local fast failover. In a nutshell, the task of a local fast failover algorithm is to pre-define fast failover rules for each node using locally available information only. These rules determine for each incoming link from which a packet may arrive and the set of local link failures (i.e., the failed links incident to a node), on which outgoing link a packet should be forwarded. Ideally, such a local fast failover algorithm provides a perfect resilience deterministically: a packet emitted from any source can reach any target, as long as the underlying network remains connected. Feigenbaum et al. (ACM PODC 2012) and also Chiesa et al. (IEEE/ACM Trans. Netw. 2017) showed that it is not always possible to provide perfect resilience. Interestingly, not much more is known currently about the feasibility of perfect resilience.
This paper revisits perfect resilience with local fast failover, both in a model where the source can and cannot be used for forwarding decisions. We first derive several fairly general impossibility results: By establishing a connection between graph minors and resilience, we prove that it is impossible to achieve perfect resilience on any non-planar graph; furthermore, while planarity is necessary, it is also not sufficient for perfect resilience. On the positive side, we show that graph families closed under link subdivision allow for simple and efficient failover algorithms which simply skip failed links. We demonstrate this technique by deriving perfect resilience for outerplanar graphs and related scenarios, as well as for scenarios where the source and target are topologically close after failures.
△ Less
Submitted 3 November, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.
-
The Bouncer Problem: Challenges to Remote Explainability
Authors:
Erwan Le Merrer,
Gilles Tredan
Abstract:
The concept of explainability is envisioned to satisfy society's demands for transparency on machine learning decisions. The concept is simple: like humans, algorithms should explain the rationale behind their decisions so that their fairness can be assessed. While this approach is promising in a local context (e.g. to explain a model during debugging at training time), we argue that this reasonin…
▽ More
The concept of explainability is envisioned to satisfy society's demands for transparency on machine learning decisions. The concept is simple: like humans, algorithms should explain the rationale behind their decisions so that their fairness can be assessed. While this approach is promising in a local context (e.g. to explain a model during debugging at training time), we argue that this reasoning cannot simply be transposed in a remote context, where a trained model by a service provider is only accessible through its API. This is problematic as it constitutes precisely the target use-case requiring transparency from a societal perspective. Through an analogy with a club bouncer (which may provide untruthful explanations upon customer reject), we show that providing explanations cannot prevent a remote service from lying about the true reasons leading to its decisions. More precisely, we prove the impossibility of remote explainability for single explanations, by constructing an attack on explanations that hides discriminatory features to the querying user. We provide an example implementation of this attack. We then show that the probability that an observer spots the attack, using several explanations for attempting to find incoherences, is low in practical settings. This undermines the very concept of remote explainability in general.
△ Less
Submitted 23 March, 2020; v1 submitted 3 October, 2019;
originally announced October 2019.
-
TamperNN: Efficient Tampering Detection of Deployed Neural Nets
Authors:
Erwan Le Merrer,
Gilles Tredan
Abstract:
Neural networks are powering the deployment of embedded devices and Internet of Things. Applications range from personal assistants to critical ones such as self-driving cars. It has been shown recently that models obtained from neural nets can be trojaned ; an attacker can then trigger an arbitrary model behavior facing crafted inputs. This has a critical impact on the security and reliability of…
▽ More
Neural networks are powering the deployment of embedded devices and Internet of Things. Applications range from personal assistants to critical ones such as self-driving cars. It has been shown recently that models obtained from neural nets can be trojaned ; an attacker can then trigger an arbitrary model behavior facing crafted inputs. This has a critical impact on the security and reliability of those deployed devices. We introduce novel algorithms to detect the tampering with deployed models, classifiers in particular. In the remote interaction setup we consider, the proposed strategy is to identify markers of the model input space that are likely to change class if the model is attacked, allowing a user to detect a possible tampering. This setup makes our proposal compatible with a wide range of scenarios, such as embedded models, or models exposed through prediction APIs. We experiment those tampering detection algorithms on the canonical MNIST dataset, over three different types of neural nets, and facing five different attacks (trojaning, quantization, fine-tuning, compression and watermarking). We then validate over five large models (VGG16, VGG19, ResNet, MobileNet, DenseNet) with a state of the art dataset (VGGFace2), and report results demonstrating the possibility of an efficient detection of model tampering.
△ Less
Submitted 5 September, 2019; v1 submitted 1 March, 2019;
originally announced March 2019.
-
Adversarial Frontier Stitching for Remote Neural Network Watermarking
Authors:
Erwan Le Merrer,
Patrick Perez,
Gilles Trédan
Abstract:
The state of the art performance of deep learning models comes at a high cost for companies and institutions, due to the tedious data collection and the heavy processing requirements. Recently, [35, 22] proposed to watermark convolutional neural networks for image classification, by embedding information into their weights. While this is a clear progress towards model protection, this technique so…
▽ More
The state of the art performance of deep learning models comes at a high cost for companies and institutions, due to the tedious data collection and the heavy processing requirements. Recently, [35, 22] proposed to watermark convolutional neural networks for image classification, by embedding information into their weights. While this is a clear progress towards model protection, this technique solely allows for extracting the watermark from a network that one accesses locally and entirely.
Instead, we aim at allowing the extraction of the watermark from a neural network (or any other machine learning model) that is operated remotely, and available through a service API. To this end, we propose to mark the model's action itself, tweaking slightly its decision frontiers so that a set of specific queries convey the desired information. In the present paper, we formally introduce the problem and propose a novel zero-bit watermarking algorithm that makes use of adversarial model examples. While limiting the loss of performance of the protected model, this algorithm allows subsequent extraction of the watermark using only few queries. We experimented the approach on three neural networks designed for image classification, in the context of MNIST digit recognition task.
△ Less
Submitted 7 August, 2019; v1 submitted 6 November, 2017;
originally announced November 2017.
-
The topological face of recommendation: models and application to bias detection
Authors:
Erwan Le Merrer,
Gilles Trédan
Abstract:
Recommendation plays a key role in e-commerce and in the entertainment industry. We propose to consider successive recommendations to users under the form of graphs of recommendations. We give models for this representation. Motivated by the growing interest for algorithmic transparency, we then propose a first application for those graphs, that is the potential detection of introduced recommendat…
▽ More
Recommendation plays a key role in e-commerce and in the entertainment industry. We propose to consider successive recommendations to users under the form of graphs of recommendations. We give models for this representation. Motivated by the growing interest for algorithmic transparency, we then propose a first application for those graphs, that is the potential detection of introduced recommendation bias by the service provider. This application relies on the analysis of the topology of the extracted graph for a given user; we propose a notion of recommendation coherence with regards to the topological proximity of recommended items (under the measure of items' k-closest neighbors, reminding the "small-world" model by Watts & Stroggatz). We finally illustrate this approach on a model and on Youtube crawls, targeting the prediction of "Recommended for you" links (i.e., biased or not by Youtube).
△ Less
Submitted 28 April, 2017;
originally announced April 2017.
-
Fast Abstracts and Student Forum Proceedings - EDCC 2016 - 12th European Dependable Computing Conference
Authors:
Gilles Tredan,
Hans Peter Schwefel
Abstract:
Fast Abstracts are short presentations of work in progress or opinion pieces and aim to serve as a rapid and flexible mechanism to (i) Report on current work that may or may not be complete; (ii) Introduce new ideas to the community; (iii) State positions on controversial issues or open problems. On the other hand, the goal of the Student Forum is to encourage students to attend EDCC and present t…
▽ More
Fast Abstracts are short presentations of work in progress or opinion pieces and aim to serve as a rapid and flexible mechanism to (i) Report on current work that may or may not be complete; (ii) Introduce new ideas to the community; (iii) State positions on controversial issues or open problems. On the other hand, the goal of the Student Forum is to encourage students to attend EDCC and present their work, exchange ideas with researchers and practitioners, and get early feedback on their research efforts.
△ Less
Submitted 5 September, 2016;
originally announced September 2016.
-
Uncovering Influence Cookbooks : Reverse Engineering the Topological Impact in Peer Ranking Services
Authors:
Erwan Le Merrer,
Gilles Trédan
Abstract:
Ensuring the early detection of important social network users is a challenging task. Some peer ranking services are now well established, such as PeerIndex, Klout, or Kred. Their function is to rank users according to their influence. This notion of influence is however abstract, and the algorithms achieving this ranking are opaque. Following the rising demand for a more transparent web, we explo…
▽ More
Ensuring the early detection of important social network users is a challenging task. Some peer ranking services are now well established, such as PeerIndex, Klout, or Kred. Their function is to rank users according to their influence. This notion of influence is however abstract, and the algorithms achieving this ranking are opaque. Following the rising demand for a more transparent web, we explore the problem of gaining knowledge by reverse engineering such peer ranking services, with regards to the social network topology they get as an input. Since these services exploit the online activity of users (and therefore their connectivity in social networks), we provide a precise evaluation of how topological metrics of the social network impact the final user ranking. Our approach is the following : we first model the ranking service as a black-box with which we interact by creating user profiles and by performing operations on them. Through those profiles, we trigger some slight topological modifications. By monitoring the impact of these modifications on the rankings of those profiles, we infer the weight of each topological metric in the black-box, thus reversing the service influence cookbook.
△ Less
Submitted 7 October, 2016; v1 submitted 26 August, 2016;
originally announced August 2016.
-
The Many Faces of Graph Dynamics
Authors:
Yvonne Anne Pignolet,
Matthieu Roy,
Stefan Schmid,
Gilles Tredan
Abstract:
The topological structure of complex networks has fascinated researchers for several decades, resulting in the discovery of many universal properties and reoccurring characteristics of different kinds of networks. However, much less is known today about the network dynamics: indeed, complex networks in reality are not static, but rather dynamically evolve over time.
Our paper is motivated by the…
▽ More
The topological structure of complex networks has fascinated researchers for several decades, resulting in the discovery of many universal properties and reoccurring characteristics of different kinds of networks. However, much less is known today about the network dynamics: indeed, complex networks in reality are not static, but rather dynamically evolve over time.
Our paper is motivated by the empirical observation that network evolution patterns seem far from random, but exhibit structure. Moreover, the specific patterns appear to depend on the network type, contradicting the existence of a "one fits it all" model. However, we still lack observables to quantify these intuitions, as well as metrics to compare graph evolutions. Such observables and metrics are needed for extrapolating or predicting evolutions, as well as for interpolating graph evolutions.
To explore the many faces of graph dynamics and to quantify temporal changes, this paper suggests to build upon the concept of centrality, a measure of node importance in a network. In particular, we introduce the notion of centrality distance, a natural similarity measure for two graphs which depends on a given centrality, characterizing the graph type. Intuitively, centrality distances reflect the extent to which (non-anonymous) node roles are different or, in case of dynamic graphs, have changed over time, between two graphs.
We evaluate the centrality distance approach for five evolutionary models and seven real-world social and physical networks. Our results empirically show the usefulness of centrality distances for characterizing graph dynamics compared to a null-model of random evolution, and highlight the differences between the considered scenarios. Interestingly, our approach allows us to compare the dynamics of very different networks, in terms of scale and evolution speed.
△ Less
Submitted 12 October, 2016; v1 submitted 5 August, 2016;
originally announced August 2016.
-
The Many Faces of Graph Dynamics
Authors:
Yvonne Anne Pignolet,
Matthieu Roy,
Stefan Schmid,
Gilles Tredan
Abstract:
The topological structure of complex networks has fascinated researchers for several decades, resulting in the discovery of many universal properties and reoccurring characteristics of different kinds of networks. However, much less is known today about the network dynamics: indeed, complex networks in reality are not static, but rather dynamically evolve over time.
Our paper is motivated by the…
▽ More
The topological structure of complex networks has fascinated researchers for several decades, resulting in the discovery of many universal properties and reoccurring characteristics of different kinds of networks. However, much less is known today about the network dynamics: indeed, complex networks in reality are not static, but rather dynamically evolve over time.
Our paper is motivated by the empirical observation that network evolution patterns seem far from random, but exhibit structure. Moreover, the specific patterns appear to depend on the network type, contradicting the existence of a "one fits it all" model. However, we still lack observables to quantify these intuitions, as well as metrics to compare graph evolutions. Such observables and metrics are needed for extrapolating or predicting evolutions, as well as for interpolating graph evolutions.
To explore the many faces of graph dynamics and to quantify temporal changes, this paper suggests to build upon the concept of centrality, a measure of node importance in a network. In particular, we introduce the notion of centrality distance, a natural similarity measure for two graphs which depends on a given centrality, characterizing the graph type. Intuitively, centrality distances reflect the extent to which (non-anonymous) node roles are different or, in case of dynamic graphs, have changed over time, between two graphs.
We evaluate the centrality distance approach for five evolutionary models and seven real-world social and physical networks. Our results empirically show the usefulness of centrality distances for characterizing graph dynamics compared to a null-model of random evolution, and highlight the differences between the considered scenarios. Interestingly, our approach allows us to compare the dynamics of very different networks, in terms of scale and evolution speed.
△ Less
Submitted 1 March, 2017; v1 submitted 4 June, 2015;
originally announced June 2015.
-
Modeling and Measuring Graph Similarity: The Case for Centrality Distance
Authors:
Matthieu Roy,
Stefan Schmid,
Gilles Trédan
Abstract:
The study of the topological structure of complex networks has fascinated researchers for several decades, and today we have a fairly good understanding of the types and reoccurring characteristics of many different complex networks. However, surprisingly little is known today about models to compare complex graphs, and quantitatively measure their similarity. This paper proposes a natural similar…
▽ More
The study of the topological structure of complex networks has fascinated researchers for several decades, and today we have a fairly good understanding of the types and reoccurring characteristics of many different complex networks. However, surprisingly little is known today about models to compare complex graphs, and quantitatively measure their similarity. This paper proposes a natural similarity measure for complex networks: centrality distance, the difference between two graphs with respect to a given node centrality. Centrality distances allow to take into account the specific roles of the different nodes in the network, and have many interesting applications. As a case study, we consider the closeness centrality in more detail, and show that closeness centrality distance can be used to effectively distinguish between randomly generated and actual evolutionary paths of two dynamic social networks.
△ Less
Submitted 20 June, 2014;
originally announced June 2014.
-
Request Complexity of VNet Topology Extraction: Dictionary-Based Attacks
Authors:
Yvonne-Anne Pignolet,
Stefan Schmid,
Gilles Tredan
Abstract:
The network virtualization paradigm envisions an Internet where arbitrary virtual networks (VNets) can be specified and embedded over a shared substrate (e.g., the physical infrastructure). As VNets can be requested at short notice and for a desired time period only, the paradigm enables a flexible service deployment and an efficient resource utilization.
This paper investigates the security imp…
▽ More
The network virtualization paradigm envisions an Internet where arbitrary virtual networks (VNets) can be specified and embedded over a shared substrate (e.g., the physical infrastructure). As VNets can be requested at short notice and for a desired time period only, the paradigm enables a flexible service deployment and an efficient resource utilization.
This paper investigates the security implications of such an architecture. We consider a simple model where an attacker seeks to extract secret information about the substrate topology, by issuing repeated VNet embedding requests. We present a general framework that exploits basic properties of the VNet embedding relation to infer the entire topology. Our framework is based on a graph motif dictionary applicable for various graph classes. Moreover, we provide upper bounds on the request complexity, the number of requests needed by the attacker to succeed.
△ Less
Submitted 21 April, 2013;
originally announced April 2013.
-
Revisiting Content Availability in Distributed Online Social Networks
Authors:
Doris Schiöberg,
Fabian Schneider,
Gilles Tredan,
Steve Uhlig,
Anja Feldmann
Abstract:
Online Social Networks (OSN) are among the most popular applications in today's Internet. Decentralized online social networks (DOSNs), a special class of OSNs, promise better privacy and autonomy than traditional centralized OSNs. However, ensuring availability of content when the content owner is not online remains a major challenge. In this paper, we rely on the structure of the social graphs u…
▽ More
Online Social Networks (OSN) are among the most popular applications in today's Internet. Decentralized online social networks (DOSNs), a special class of OSNs, promise better privacy and autonomy than traditional centralized OSNs. However, ensuring availability of content when the content owner is not online remains a major challenge. In this paper, we rely on the structure of the social graphs underlying DOSN for replication. In particular, we propose that friends, who are anyhow interested in the content, are used to replicate the users content. We study the availability of such natural replication schemes via both theoretical analysis as well as simulations based on data from OSN users. We find that the availability of the content increases drastically when compared to the online time of the user, e. g., by a factor of more than 2 for 90% of the users. Thus, with these simple schemes we provide a baseline for any more complicated content replication scheme.
△ Less
Submitted 4 October, 2012;
originally announced October 2012.
-
Misleading Stars: What Cannot Be Measured in the Internet?
Authors:
Yvonne Anne Pignolet,
Stefan Schmid,
Gilles Tredan
Abstract:
Traceroute measurements are one of our main instruments to shed light onto the structure and properties of today's complex networks such as the Internet. This paper studies the feasibility and infeasibility of inferring the network topology given traceroute data from a worst-case perspective, i.e., without any probabilistic assumptions on, e.g., the nodes' degree distribution. We attend to a scena…
▽ More
Traceroute measurements are one of our main instruments to shed light onto the structure and properties of today's complex networks such as the Internet. This paper studies the feasibility and infeasibility of inferring the network topology given traceroute data from a worst-case perspective, i.e., without any probabilistic assumptions on, e.g., the nodes' degree distribution. We attend to a scenario where some of the routers are anonymous, and propose two fundamental axioms that model two basic assumptions on the traceroute data: (1) each trace corresponds to a real path in the network, and (2) the routing paths are at most a factor 1/alpha off the shortest paths, for some parameter alpha in (0,1]. In contrast to existing literature that focuses on the cardinality of the set of (often only minimal) inferrable topologies, we argue that a large number of possible topologies alone is often unproblematic, as long as the networks have a similar structure. We hence seek to characterize the set of topologies inferred with our axioms. We introduce the notion of star graphs whose colorings capture the differences among inferred topologies; it also allows us to construct inferred topologies explicitly. We find that in general, inferrable topologies can differ significantly in many important aspects, such as the nodes' distances or the number of triangles. These negative results are complemented by a discussion of a scenario where the trace set is best possible, i.e., "complete". It turns out that while some properties such as the node degrees are still hard to measure, a complete trace set can help to determine global properties such as the connectivity.
△ Less
Submitted 19 June, 2011; v1 submitted 26 May, 2011;
originally announced May 2011.
-
A generic trust framework for large-scale open systems using machine learning
Authors:
Xin Liu,
Gilles Tredan,
Anwitaman Datta
Abstract:
In many large scale distributed systems and on the web, agents need to interact with other unknown agents to carry out some tasks or transactions. The ability to reason about and assess the potential risks in carrying out such transactions is essential for providing a safe and reliable environment. A traditional approach to reason about the trustworthiness of a transaction is to determine the trus…
▽ More
In many large scale distributed systems and on the web, agents need to interact with other unknown agents to carry out some tasks or transactions. The ability to reason about and assess the potential risks in carrying out such transactions is essential for providing a safe and reliable environment. A traditional approach to reason about the trustworthiness of a transaction is to determine the trustworthiness of the specific agent involved, derived from the history of its behavior. As a departure from such traditional trust models, we propose a generic, machine learning approach based trust framework where an agent uses its own previous transactions (with other agents) to build a knowledge base, and utilize this to assess the trustworthiness of a transaction based on associated features, which are capable of distinguishing successful transactions from unsuccessful ones. These features are harnessed using appropriate machine learning algorithms to extract relationships between the potential transaction and previous transactions. The trace driven experiments using real auction dataset show that this approach provides good accuracy and is highly efficient compared to other trust mechanisms, especially when historical information of the specific agent is rare, incomplete or inaccurate.
△ Less
Submitted 1 March, 2011;
originally announced March 2011.