Search | arXiv e-print repository

Identifying Nonstationary Causal Structures with High-Order Markov Switching Models

Authors: Carles Balsells-Rodas, Yixin Wang, Pedro A. M. Mediano, Yingzhen Li

Abstract: Causal discovery in time series is a rapidly evolving field with a wide variety of applications in other areas such as climate science and neuroscience. Traditional approaches assume a stationary causal graph, which can be adapted to nonstationary time series with time-dependent effects or heterogeneous noise. In this work we address nonstationarity via regime-dependent causal structures. We first establish identifiability for high-order Markov Switching Models, which provide the foundations for identifiable regime-dependent causal discovery. Our empirical studies demonstrate the scalability of our proposed approach for high-order regime-dependent structure estimation, and we illustrate its applicability on brain activity data.

Submitted 25 June, 2024; originally announced June 2024.

Comments: CI4TS Workshop @UAI2024

arXiv:2404.07140 [pdf, ps, other]

Characterising directed and undirected metrics of high-order interdependence

Authors: Fernando E. Rosas, Pedro A. M. Mediano, Michael Gastpar

Abstract: Systems of interest for theoretical or experimental work often exhibit high-order interactions, corresponding to statistical interdependencies in groups of variables that cannot be reduced to dependencies in subsets of them. While still under active development, the framework of partial information decomposition (PID) has emerged as the dominant approach to conceptualise and calculate high-order interdependencies. PID approaches can be grouped in two types: directed approaches that divide variables into sources and targets, and undirected approaches that treat all variables equally. Directed and undirected approaches are usually employed to investigate different scenarios, and hence little is known about how these two types of approaches may relate to each other, or if their corresponding quantities are linked in some way. In this paper we investigate the relationship between the redundancy-synergy index (RSI) and the O-information, which are practical metrics of directed and undirected high-order interdependencies, respectively. Our results reveal tight links between these two quantities, and provide interpretations of them in terms of likelihood ratios in a hypothesis testing setting, as well as in terms of projections in information geometry.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 6 pages, 1 figure

arXiv:2308.05664 [pdf, other]

Information decomposition reveals hidden high-order contributions to temporal irreversibility

Authors: Andrea I Luppi, Fernando E. Rosas, Gustavo Deco, Morten L. Kringelbach, Pedro A. M. Mediano

Abstract: Temporal irreversibility, often referred to as the arrow of time, is a fundamental concept in statistical mechanics. Markers of irreversibility also provide a powerful characterisation of information processing in biological systems. However, current approaches tend to describe temporal irreversibility in terms of a single scalar quantity, without disentangling the underlying dynamics that contribute to irreversibility. Here we propose a broadly applicable information-theoretic framework to characterise the arrow of time in multivariate time series, which yields qualitatively different types of irreversible information dynamics. This multidimensional characterisation reveals previously unreported high-order modes of irreversibility, and establishes a formal connection between recent heuristic markers of temporal irreversibility and metrics of information processing. We demonstrate the prevalence of high-order irreversibility in the hyperactive regime of a biophysical model of brain dynamics, showing that our framework is both theoretically principled and empirically useful. This work challenges the view of the arrow of time as a monolithic entity, enhancing both our theoretical understanding of irreversibility and our ability to detect it in practical applications.

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2306.01645 [pdf, other]

Quantifying synergy and redundancy in multiplex networks

Authors: Andrea I. Luppi, Eckehard Olbrich, Conor Finn, Laura E. Suárez, Fernando E. Rosas, Pedro A. M. Mediano, Jürgen Jost

Abstract: Understanding how different networks relate to each other is key for obtaining a greater insight into complex systems. Here, we introduce an intuitive yet powerful framework to characterise the relationship between two networks comprising the same nodes. We showcase our framework by decomposing the shortest paths between nodes as being contributed uniquely by one or the other source network, or redundantly by either, or synergistically by the two together. Our approach takes into account the networks' full topology, and it also provides insights at multiple levels of resolution: from global statistics to individual paths of different length. We show that this approach is widely applicable, from brains to the London public transport system. In humans and across 123 other mammalian species, we demonstrate that reliance on unique contributions by long-range white matter fibers is a conserved feature of mammalian structural brain networks. Across species, we also find that efficient communication relies on significantly greater synergy between long-range and short-range fibers than expected by chance, and significantly less redundancy. Our framework may find applications to help decide how to trade off different desiderata when designing network systems, or to evaluate their relative presence in existing systems, whether biological or artificial.

Submitted 8 August, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

arXiv:2306.00904 [pdf, other]

Interaction Measures, Partition Lattices and Kernel Tests for High-Order Interactions

Authors: Zhaolu Liu, Robert L. Peach, Pedro A. M. Mediano, Mauricio Barahona

Abstract: Models that rely solely on pairwise relationships often fail to capture the complete statistical structure of the complex multivariate data found in diverse domains, such as socio-economic, ecological, or biomedical systems. Non-trivial dependencies between groups of more than two variables can play a significant role in the analysis and modelling of such systems, yet extracting such high-order interactions from data remains challenging. Here, we introduce a hierarchy of $d$-order ($d \geq 2$) interaction measures, increasingly inclusive of possible factorisations of the joint probability distribution, and define non-parametric, kernel-based tests to establish systematically the statistical significance of $d$-order interactions. We also establish mathematical links with lattice theory, which elucidate the derivation of the interaction measures and their composite permutation tests; clarify the connection of simplicial complexes with kernel matrix centring; and provide a means to enhance computational efficiency. We illustrate our results numerically with validations on synthetic data, and through an application to neuroimaging data.

Submitted 7 November, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: 22 pages, 9 figures

arXiv:2305.13454 [pdf, other]

Dynamical noise can enhance high-order statistical structure in complex systems

Authors: Patricio Orio, Pedro A. M. Mediano, Fernando E. Rosas

Abstract: Recent research has provided a wealth of evidence highlighting the pivotal role of high-order interdependencies in supporting the information-processing capabilities of distributed complex systems. These findings may suggest that high-order interdependencies constitute a powerful resource that is, however, challenging to harness and can be readily disrupted. In this paper we contest this perspective by demonstrating that high-order interdependencies can not only exhibit robustness to stochastic perturbations, but can in fact be enhanced by them. Using elementary cellular automata as a general testbed, our results unveil the capacity of dynamical noise to enhance the statistical regularities between agents and, intriguingly, even alter the prevailing character of their interdependencies. Furthermore, our results show that these effects are related to the high-order structure of the local rules, which affect the system's susceptibility to noise and characteristic time-scales. These results deepen our understanding of how high-order interdependencies may spontaneously emerge within distributed systems interacting with stochastic environments, thus providing an initial step towards elucidating their origin and function in complex systems like the human brain.

Submitted 22 May, 2023; originally announced May 2023.

Comments: 8 pages, 4 figures, 2 tables

arXiv:2305.07554 [pdf, ps, other]

A Logarithmic Decomposition for Information

Authors: Keenan J. A. Down, Pedro A. M. Mediano

Abstract: The Shannon entropy of a random variable $X$ has much behaviour analogous to a signed measure. Previous work has concretized this connection by defining a signed measure $μ$ on an abstract information space $\tilde{X}$, which is taken to represent the information that $X$ contains. This construction is sufficient to derive many measure-theoretical counterparts to information quantities such as the mutual information $I(X; Y) = μ(\tilde{X} \cap \tilde{Y})$, the joint entropy $H(X,Y) = μ(\tilde{X} \cup \tilde{Y})$, and the conditional entropy $H(X|Y) = μ(\tilde{X}\, \setminus \, \tilde{Y})$. We demonstrate that there exists a much finer decomposition with intuitive properties which we call the logarithmic decomposition (LD). We show that this signed measure space has the useful property that its logarithmic atoms are easily characterised with negative or positive entropy, while also being coherent with Yeung's $I$-measure. We present the usability of our approach by re-examining the Gács-Körner common information from this new geometric perspective and characterising it in terms of our logarithmic atoms. We then highlight that our geometric refinement can account for an entire class of information quantities, which we call logarithmically decomposable quantities.

Submitted 12 May, 2023; originally announced May 2023.

Comments: 9 pages, 4 figures. Submitted to the 2023 IEEE International Symposium on Information Theory

MSC Class: 94A17 ACM Class: E.4

arXiv:2210.02996 [pdf, other]

Synergistic information supports modality integration and flexible learning in neural networks solving multiple tasks

Authors: Alexandra M. Proca, Fernando E. Rosas, Andrea I. Luppi, Daniel Bor, Matthew Crosby, Pedro A. M. Mediano

Abstract: Striking progress has recently been made in understanding human cognition by analyzing how its neuronal underpinnings are engaged in different modes of information processing. Specifically, neural information can be decomposed into synergistic, redundant, and unique features, with synergistic components being particularly aligned with complex cognition. However, two fundamental questions remain unanswered: (a) precisely how and why a cognitive system can become highly synergistic; and (b) how these informational states map onto artificial neural networks in various learning modes. To address these questions, here we employ an information-decomposition framework to investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks in both supervised and reinforcement learning settings. Our results show that synergy increases as neural networks learn multiple diverse tasks. Furthermore, performance in tasks requiring integration of multiple information sources critically relies on synergistic neurons. Finally, randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness. Overall, our results suggest that while redundant information is required for robustness to perturbations in the learning process, synergistic information is used to combine information from multiple modalities -- and more generally for flexible and efficient learning. These findings open the door to new ways of investigating how and why learning systems employ specific information-processing strategies, and support the principle that the capacity for general-purpose learning critically relies on the system's information dynamics.

Submitted 6 October, 2022; originally announced October 2022.

Comments: 33 pages, 15 figures

arXiv:2203.12041 [pdf, other]

doi 10.1038/s41567-022-01548-5

Disentangling high-order mechanisms and high-order behaviours in complex systems

Authors: Fernando E. Rosas, Pedro A. M. Mediano, Andrea I. Luppi, Thomas F. Varley, Joseph T. Lizier, Sebastiano Stramaglia, Henrik J. Jensen, Daniele Marinazzo

Abstract: Battiston et al. (arXiv:2110.06023) provide a comprehensive overview of how investigations of complex systems should take into account interactions between more than two elements, which can be modelled by hypergraphs and studied via topological data analysis. Following a separate line of enquiry, a broad literature has developed information-theoretic tools to characterize high-order interdependencies from observed data. While these could seem to be competing approaches aiming to address the same question, in this correspondence we clarify that this is not the case, and that a complete account of higher-order phenomena needs to embrace both.

Submitted 21 March, 2022; originally announced March 2022.

Journal ref: Nature Physics (2022)

arXiv:2010.07382 [pdf, ps, other]

Learning, compression, and leakage: Minimising classification error via meta-universal compression principles

Authors: Fernando E. Rosas, Pedro A. M. Mediano, Michael Gastpar

Abstract: Learning and compression are driven by the common aim of identifying and exploiting statistical regularities in data, which opens the door for fertile collaboration between these areas. A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding, which provides strong guarantees for compression of small datasets - in contrast with more popular estimators whose guarantees hold only in the asymptotic limit. Here we consider an NML-based decision strategy for supervised classification problems, and show that it attains heuristic PAC learning when applied to a wide variety of models. Furthermore, we show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric to quantify the potential of data leakage in privacy-sensitive scenarios.

Submitted 31 January, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: 8 pages, no figures

arXiv:2008.12568 [pdf, other]

Causal blankets: Theory and algorithmic framework

Authors: Fernando E. Rosas, Pedro A. M. Mediano, Martin Biehl, Shamil Chandaria, Daniel Polani

Abstract: We introduce a novel framework to identify perception-action loops (PALOs) directly from data based on the principles of computational mechanics. Our approach is based on the notion of causal blanket, which captures sensory and active variables as dynamical sufficient statistics -- i.e. as the "differences that make a difference." Moreover, our theory provides a broadly applicable procedure to construct PALOs that requires neither a steady-state nor Markovian dynamics. Using our theory, we show that every bipartite stochastic process has a causal blanket, but the extent to which this leads to an effective PALO formulation varies depending on the integrated information of the bipartition.

Submitted 29 September, 2020; v1 submitted 28 August, 2020; originally announced August 2020.

arXiv:2006.04176 [pdf, ps, other]

Deep active inference agents using Monte-Carlo methods

Authors: Zafeirios Fountas, Noor Sajid, Pedro A. M. Mediano, Karl Friston

Abstract: Active inference is a Bayesian framework for understanding biological intelligence. The underlying theory brings together perception and action under one single imperative: minimizing free energy. However, despite its theoretical utility in explaining intelligence, computational implementations have been restricted to low-dimensional and idealized situations. In this paper, we present a neural architecture for building deep active inference agents operating in complex, continuous state-spaces using multiple forms of Monte-Carlo (MC) sampling. For this, we introduce a number of techniques, novel to active inference. These include: i) selecting free-energy-optimal policies via MC tree search, ii) approximating this optimal policy distribution via a feed-forward 'habitual' network, iii) predicting future parameter belief updates using MC dropouts and, finally, iv) optimizing state transition precision (a high-end form of attention). Our approach enables agents to learn environmental dynamics efficiently, while maintaining task performance, in relation to reward-based counterparts. We illustrate this in a new toy environment, based on the dSprites data-set, and demonstrate that active inference agents automatically create disentangled representations that are apt for modeling state transitions. In a more complex Animal-AI environment, our agents (using the same neural architecture) are able to simulate future state transitions and actions (i.e., plan), to evince reward-directed navigation - despite temporary suspension of visual input. These results show that deep active inference - equipped with MC methods - provides a flexible framework to develop biologically-inspired intelligent agents, with applications in both machine learning and cognitive science.

Submitted 22 October, 2020; v1 submitted 7 June, 2020; originally announced June 2020.

Comments: To appear in NeurIPS 2020

arXiv:2005.01322 [pdf, other]

Building Proactive Voice Assistants: When and How (not) to Interact

Authors: O. Miksik, I. Munasinghe, J. Asensio-Cubero, S. Reddy Bethi, S-T. Huang, S. Zylfo, X. Liu, T. Nica, A. Mitrocsak, S. Mezza, R. Beard, R. Shi, R. Ng, P. Mediano, Z. Fountas, S-H. Lee, J. Medvesek, H. Zhuang, Y. Rogers, P. Swietojanski

Abstract: Voice assistants have recently achieved remarkable commercial success. However, the current generation of these devices is typically capable of only reactive interactions. In other words, interactions have to be initiated by the user, which somewhat limits their usability and user experience. We propose that the next generation of such devices should be able to proactively provide the right information in the right way at the right time, without being prompted by the user. However, achieving this is not straightforward, since there is the danger it could interrupt what the user is doing too much, resulting in it being distracting or even annoying. Furthermore, it could unwittingly reveal sensitive/private information to third parties. In this report, we discuss the challenges of developing proactively initiated interactions, and suggest a framework for when it is appropriate for the device to intervene. To validate our design assumptions, we describe, firstly, how we built a functioning prototype and, secondly, a user study that was conducted to assess users' reactions and reflections when in the presence of a proactive voice assistant. This pre-print summarises the state, ideas and progress towards a proactive device as of autumn 2018.

Submitted 4 May, 2020; originally announced May 2020.

Comments: 17 pages, technical report

arXiv:2001.10387 [pdf, other]

doi 10.1088/1751-8121/abb723

An operational information decomposition via synergistic disclosure

Authors: Fernando Rosas, Pedro Mediano, Borzoo Rassouli, Adam Barrett

Abstract: Multivariate information decompositions hold promise to yield insight into complex systems, and stand out for their ability to identify synergistic phenomena. However, the adoption of these approaches has been hindered by there being multiple possible decompositions, and no precise guidance for preferring one over the others. At the heart of this disagreement lies the absence of a clear operational interpretation of what synergistic information is. Here we fill this gap by proposing a new information decomposition based on a novel operationalisation of informational synergy, which leverages recent developments in the literature of data privacy. Our decomposition is defined for any number of information sources, and its atoms can be calculated using elementary optimisation techniques. The decomposition provides a natural coarse-graining that scales gracefully with the system's size, and is applicable in a wide range of scenarios of practical interest.

Submitted 13 March, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

Comments: 14 pages, 8 figures

arXiv:1902.11239 [pdf, other]

doi 10.1103/PhysRevE.100.032305

Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information

Authors: Fernando Rosas, Pedro A. M. Mediano, Michael Gastpar, Henrik J. Jensen

Abstract: This article introduces a model-agnostic approach to study statistical synergy, a form of emergence in which patterns at large scales are not traceable from lower scales. Our framework leverages various multivariate extensions of Shannon's mutual information, and introduces the O-information as a metric capable of characterising synergy- and redundancy-dominated systems. We develop key analytical properties of the O-information, and study how it relates to other metrics of high-order interactions from the statistical mechanics and neuroscience literature. Finally, as a proof of concept, we use the proposed framework to explore the relevance of statistical synergy in Baroque music scores.

Submitted 28 February, 2019; originally announced February 2019.

Journal ref: Phys. Rev. E 100, 032305 (2019)

arXiv:1902.06828 [pdf, other]

doi 10.1162/netn_a_00092

Large-scale directed network inference with multivariate transfer entropy and hierarchical statistical testing

Authors: Leonardo Novelli, Patricia Wollstadt, Pedro Mediano, Michael Wibral, Joseph T. Lizier

Abstract: Network inference algorithms are valuable tools for the study of large-scale neuroimaging datasets. Multivariate transfer entropy is well suited for this task, being a model-free measure that captures nonlinear and lagged dependencies between time series to infer a minimal directed network model. Greedy algorithms have been proposed to efficiently deal with high-dimensional datasets while avoiding redundant inferences and capturing synergistic effects. However, multiple statistical comparisons may inflate the false positive rate and are computationally demanding, which limited the size of previous validation studies. The algorithm we present, as implemented in the IDTxl open-source software, addresses these challenges by employing hierarchical statistical tests to control the family-wise error rate and to allow for efficient parallelisation. The method was validated on synthetic datasets involving random networks of increasing size (up to 100 nodes), for both linear and nonlinear dynamics. The performance increased with the length of the time series, reaching consistently high precision, recall, and specificity (>98% on average) for 10000 time samples. Varying the statistical significance threshold showed a more favourable precision-recall trade-off for longer time series. Both the network size and the sample size are one order of magnitude larger than previously demonstrated, showing feasibility for typical EEG and MEG experiments.

Submitted 30 July, 2019; v1 submitted 18 February, 2019; originally announced February 2019.

Journal ref: Network Neuroscience 2019 3:3, 827-847

arXiv:1809.11044 [pdf, other]

Relational Forward Models for Multi-Agent Learning

Authors: Andrea Tacchetti, H. Francis Song, Pedro A. M. Mediano, Vinicius Zambaldi, Neil C. Rabinowitz, Thore Graepel, Matthew Botvinick, Peter W. Battaglia

Abstract: The behavioral dynamics of multi-agent systems have a rich and orderly structure, which can be leveraged to understand these systems, and to improve how artificial agents learn to operate in them. Here we introduce Relational Forward Models (RFM) for multi-agent learning, networks that can learn to make accurate predictions of agents' future behavior in multi-agent environments. Because these models operate on the discrete entities and relations present in the environment, they produce interpretable intermediate representations which offer insights into what drives agents' behavior, and what events mediate the intensity and valence of social interactions. Furthermore, we show that embedding RFM modules inside agents results in faster learning systems compared to non-augmented baselines. As more and more of the autonomous systems we develop and interact with become multi-agent in nature, developing richer analysis tools for characterizing how and why agents make decisions is increasingly necessary. Moreover, developing artificial agents that quickly and safely learn to coordinate with one another, and with humans in shared environments, is crucial.

Submitted 28 September, 2018; originally announced September 2018.

arXiv:1807.10459 [pdf]

doi 10.21105/joss.01081

IDTxl: The Information Dynamics Toolkit xl: a Python package for the efficient analysis of multivariate information dynamics in networks

Authors: Patricia Wollstadt, Joseph T. Lizier, Raul Vicente, Conor Finn, Mario Martínez-Zarzuela, Pedro Mediano, Leonardo Novelli, Michael Wibral

Abstract: The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory. IDTxl provides functionality to estimate the following measures: 1) for network inference: multivariate transfer entropy (TE)/Granger causality (GC), multivariate mutual information (MI), bivariate TE/GC, and bivariate MI; 2) for analysis of node dynamics: active information storage (AIS) and partial information decomposition (PID). IDTxl implements estimators for discrete and continuous data with parallel computing engines for both GPU and CPU platforms. Written for Python 3.4.3+.

Submitted 19 February, 2019; v1 submitted 27 July, 2018; originally announced July 2018.

Comments: 4 pages

Journal ref: Journal of Open Source Software, 4(34), 1081

arXiv:1707.01446 [pdf, other]

Spectral Modes of Network Dynamics Reveal Increased Informational Complexity Near Criticality

Authors: Xerxes D. Arsiwalla, Pedro A. M. Mediano, Paul F. M. J. Verschure

Abstract: What does the informational complexity of dynamical networked systems tell us about intrinsic mechanisms and functions of these complex systems? Recent complexity measures such as integrated information have sought to operationalize this problem taking a whole-versus-parts perspective, wherein one explicitly computes the amount of information generated by a network as a whole over and above that generated by the sum of its parts during state transitions. While several numerical schemes for estimating network integrated information exist, it is instructive to pursue an analytic approach that computes integrated information as a function of network weights. Our formulation of integrated information uses a Kullback-Leibler divergence between the multivariate distribution on the set of network states and the corresponding factorized distribution over its parts. Implementing stochastic Gaussian dynamics, we perform computations for several prototypical network topologies. Our findings show increased informational complexity near criticality, which remains consistent across network topologies. Spectral decomposition of the system's dynamics reveals how informational complexity is governed by eigenmodes of both the network's covariance and adjacency matrices. We find that as the dynamics of the system approach criticality, high integrated information is exclusively driven by the eigenmode corresponding to the leading eigenvalue of the covariance matrix, while sub-leading modes get suppressed. The implication of this result is that it might be favorable for complex dynamical networked systems such as the human brain or communication systems to operate near criticality so that efficient information integration might be achieved.

Submitted 5 July, 2017; originally announced July 2017.

arXiv:1611.02648 [pdf, other]

Deep Unsupervised Clustering with Gaussian Mixture Variational Autoencoders

Authors: Nat Dilokthanakul, Pedro A. M. Mediano, Marta Garnelo, Matthew C. H. Lee, Hugh Salimbeni, Kai Arulkumaran, Murray Shanahan

Abstract: We study a variant of the variational autoencoder model (VAE) with a Gaussian mixture as a prior distribution, with the goal of performing unsupervised clustering through deep generative models. We observe that the known problem of over-regularisation that has been shown to arise in regular VAEs also manifests itself in our model and leads to cluster degeneracy. We show that a heuristic called minimum information constraint, which has been shown to mitigate this effect in VAEs, can also be applied to improve unsupervised clustering performance with our model. Furthermore, we analyse the effect of this heuristic and provide an intuition of the various processes with the help of visualizations. Finally, we demonstrate the performance of our model on synthetic data, MNIST and SVHN, showing that the obtained clusters are distinct and interpretable, and that they achieve unsupervised clustering performance competitive with state-of-the-art results.

Submitted 13 January, 2017; v1 submitted 8 November, 2016; originally announced November 2016.

Comments: 12 pages, 6 figures, Under review as a conference paper at ICLR 2017
