-
From pixels to planning: scale-free active inference
Authors:
Karl Friston,
Conor Heins,
Tim Verbelen,
Lancelot Da Costa,
Tommaso Salvatori,
Dimitrije Markovic,
Alexander Tschantz,
Magnus Koudahl,
Christopher Buckley,
Thomas Parr
Abstract:
This paper describes a discrete state-space model -- and accompanying methods -- for generative modelling. This model generalises partially observed Markov decision processes to include paths as latent variables, rendering it suitable for active inference and learning in a dynamic setting. Specifically, we consider deep or hierarchical forms using the renormalisation group. The ensuing renormalisi…
▽ More
This paper describes a discrete state-space model -- and accompanying methods -- for generative modelling. This model generalises partially observed Markov decision processes to include paths as latent variables, rendering it suitable for active inference and learning in a dynamic setting. Specifically, we consider deep or hierarchical forms using the renormalisation group. The ensuing renormalising generative models (RGM) can be regarded as discrete homologues of deep convolutional neural networks or continuous state-space models in generalised coordinates of motion. By construction, these scale-invariant models can be used to learn compositionality over space and time, furnishing models of paths or orbits; i.e., events of increasing temporal depth and itinerancy. This technical note illustrates the automatic discovery, learning and deployment of RGMs using a series of applications. We start with image classification and then consider the compression and generation of movies and music. Finally, we apply the same variational principles to the learning of Atari-like games.
△ Less
Submitted 27 July, 2024;
originally announced July 2024.
-
Active Inference and Intentional Behaviour
Authors:
Karl J. Friston,
Tommaso Salvatori,
Takuya Isomura,
Alexander Tschantz,
Alex Kiefer,
Tim Verbelen,
Magnus Koudahl,
Aswin Paul,
Thomas Parr,
Adeel Razi,
Brett Kagan,
Christopher L. Buckley,
Maxwell J. D. Ramstead
Abstract:
Recent advances in theoretical biology suggest that basal cognition and sentient behaviour are emergent properties of in vitro cell cultures and neuronal networks, respectively. Such neuronal networks spontaneously learn structured behaviours in the absence of reward or reinforcement. In this paper, we characterise this kind of self-organisation through the lens of the free energy principle, i.e.,…
▽ More
Recent advances in theoretical biology suggest that basal cognition and sentient behaviour are emergent properties of in vitro cell cultures and neuronal networks, respectively. Such neuronal networks spontaneously learn structured behaviours in the absence of reward or reinforcement. In this paper, we characterise this kind of self-organisation through the lens of the free energy principle, i.e., as self-evidencing. We do this by first discussing the definitions of reactive and sentient behaviour in the setting of active inference, which describes the behaviour of agents that model the consequences of their actions. We then introduce a formal account of intentional behaviour, that describes agents as driven by a preferred endpoint or goal in latent state-spaces. We then investigate these forms of (reactive, sentient, and intentional) behaviour using simulations. First, we simulate the aforementioned in vitro experiments, in which neuronal cultures spontaneously learn to play Pong, by implementing nested, free energy minimising processes. The simulations are then used to deconstruct the ensuing predictive behaviour, leading to the distinction between merely reactive, sentient, and intentional behaviour, with the latter formalised in terms of inductive planning. This distinction is further studied using simple machine learning benchmarks (navigation in a grid world and the Tower of Hanoi problem), that show how quickly and efficiently adaptive behaviour emerges under an inductive form of active inference.
△ Less
Submitted 16 December, 2023; v1 submitted 6 December, 2023;
originally announced December 2023.
-
Supervised structure learning
Authors:
Karl J. Friston,
Lancelot Da Costa,
Alexander Tschantz,
Alex Kiefer,
Tommaso Salvatori,
Victorita Neacsu,
Magnus Koudahl,
Conor Heins,
Noor Sajid,
Dimitrije Markovic,
Thomas Parr,
Tim Verbelen,
Christopher L Buckley
Abstract:
This paper concerns structure learning or discovery of discrete generative models. It focuses on Bayesian model selection and the assimilation of training data or content, with a special emphasis on the order in which data are ingested. A key move - in the ensuing schemes - is to place priors on the selection of models, based upon expected free energy. In this setting, expected free energy reduces…
▽ More
This paper concerns structure learning or discovery of discrete generative models. It focuses on Bayesian model selection and the assimilation of training data or content, with a special emphasis on the order in which data are ingested. A key move - in the ensuing schemes - is to place priors on the selection of models, based upon expected free energy. In this setting, expected free energy reduces to a constrained mutual information, where the constraints inherit from priors over outcomes (i.e., preferred outcomes). The resulting scheme is first used to perform image classification on the MNIST dataset to illustrate the basic idea, and then tested on a more challenging problem of discovering models with dynamics, using a simple sprite-based visual disentanglement paradigm and the Tower of Hanoi (cf., blocks world) problem. In these examples, generative models are constructed autodidactically to recover (i.e., disentangle) the factorial structure of latent states - and their characteristic paths or dynamics.
△ Less
Submitted 16 November, 2023;
originally announced November 2023.
-
Reclaiming saliency: rhythmic precision-modulated action and perception
Authors:
Ajith Anil Meera,
Filip Novicky,
Thomas Parr,
Karl Friston,
Pablo Lanillos,
Noor Sajid
Abstract:
Computational models of visual attention in artificial intelligence and robotics have been inspired by the concept of a saliency map. These models account for the mutual information between the (current) visual information and its estimated causes. However, they fail to consider the circular causality between perception and action. In other words, they do not consider where to sample next, given c…
▽ More
Computational models of visual attention in artificial intelligence and robotics have been inspired by the concept of a saliency map. These models account for the mutual information between the (current) visual information and its estimated causes. However, they fail to consider the circular causality between perception and action. In other words, they do not consider where to sample next, given current beliefs. Here, we reclaim salience as an active inference process that relies on two basic principles: uncertainty minimisation and rhythmic scheduling. For this, we make a distinction between attention and salience. Briefly, we associate attention with precision control, i.e., the confidence with which beliefs can be updated given sampled sensory data, and salience with uncertainty minimisation that underwrites the selection of future sensory data. Using this, we propose a new account of attention based on rhythmic precision-modulation and discuss its potential in robotics, providing numerical experiments that showcase advantages of precision-modulation for state and noise estimation, system identification and action selection for informative path planning.
△ Less
Submitted 23 March, 2022;
originally announced March 2022.
-
Active inference, Bayesian optimal design, and expected utility
Authors:
Noor Sajid,
Lancelot Da Costa,
Thomas Parr,
Karl Friston
Abstract:
Active inference, a corollary of the free energy principle, is a formal way of describing the behavior of certain kinds of random dynamical systems that have the appearance of sentience. In this chapter, we describe how active inference combines Bayesian decision theory and optimal Bayesian design principles under a single imperative to minimize expected free energy. It is this aspect of active in…
▽ More
Active inference, a corollary of the free energy principle, is a formal way of describing the behavior of certain kinds of random dynamical systems that have the appearance of sentience. In this chapter, we describe how active inference combines Bayesian decision theory and optimal Bayesian design principles under a single imperative to minimize expected free energy. It is this aspect of active inference that allows for the natural emergence of information-seeking behavior. When removing prior outcomes preferences from expected free energy, active inference reduces to optimal Bayesian design, i.e., information gain maximization. Conversely, active inference reduces to Bayesian decision theory in the absence of ambiguity and relative risk, i.e., expected utility maximization. Using these limiting cases, we illustrate how behaviors differ when agents select actions that optimize expected utility, expected information gain, and expected free energy. Our T-maze simulations show optimizing expected free energy produces goal-directed information-seeking behavior while optimizing expected utility induces purely exploitive behavior and maximizing information gain engenders intrinsically motivated behavior.
△ Less
Submitted 21 September, 2021;
originally announced October 2021.
-
Bayesian brains and the Rényi divergence
Authors:
Noor Sajid,
Francesco Faccio,
Lancelot Da Costa,
Thomas Parr,
Jürgen Schmidhuber,
Karl Friston
Abstract:
Under the Bayesian brain hypothesis, behavioural variations can be attributed to different priors over generative model parameters. This provides a formal explanation for why individuals exhibit inconsistent behavioural preferences when confronted with similar choices. For example, greedy preferences are a consequence of confident (or precise) beliefs over certain outcomes. Here, we offer an alter…
▽ More
Under the Bayesian brain hypothesis, behavioural variations can be attributed to different priors over generative model parameters. This provides a formal explanation for why individuals exhibit inconsistent behavioural preferences when confronted with similar choices. For example, greedy preferences are a consequence of confident (or precise) beliefs over certain outcomes. Here, we offer an alternative account of behavioural variability using Rényi divergences and their associated variational bounds. Rényi bounds are analogous to the variational free energy (or evidence lower bound) and can be derived under the same assumptions. Importantly, these bounds provide a formal way to establish behavioural differences through an $α$ parameter, given fixed priors. This rests on changes in $α$ that alter the bound (on a continuous scale), inducing different posterior estimates and consequent variations in behaviour. Thus, it looks as if individuals have different priors, and have reached different conclusions. More specifically, $α\to 0^{+}$ optimisation leads to mass-covering variational estimates and increased variability in choice behaviour. Furthermore, $α\to + \infty$ optimisation leads to mass-seeking variational posteriors and greedy preferences. We exemplify this formulation through simulations of the multi-armed bandit task. We note that these $α$ parameterisations may be especially relevant, i.e., shape preferences, when the true posterior is not in the same family of distributions as the assumed (simpler) approximate density, which may be the case in many real-world scenarios. The ensuing departure from vanilla variational inference provides a potentially useful explanation for differences in behavioural preferences of biological (or artificial) agents under the assumption that the brain performs variational Bayesian inference.
△ Less
Submitted 12 July, 2021;
originally announced July 2021.
-
Reward Maximisation through Discrete Active Inference
Authors:
Lancelot Da Costa,
Noor Sajid,
Thomas Parr,
Karl Friston,
Ryan Smith
Abstract:
Active inference is a probabilistic framework for modelling the behaviour of biological and artificial agents, which derives from the principle of minimising free energy. In recent years, this framework has successfully been applied to a variety of situations where the goal was to maximise reward, offering comparable and sometimes superior performance to alternative approaches. In this paper, we c…
▽ More
Active inference is a probabilistic framework for modelling the behaviour of biological and artificial agents, which derives from the principle of minimising free energy. In recent years, this framework has successfully been applied to a variety of situations where the goal was to maximise reward, offering comparable and sometimes superior performance to alternative approaches. In this paper, we clarify the connection between reward maximisation and active inference by demonstrating how and when active inference agents perform actions that are optimal for maximising reward. Precisely, we show the conditions under which active inference produces the optimal solution to the Bellman equation--a formulation that underlies several approaches to model-based reinforcement learning and control. On partially observed Markov decision processes, the standard active inference scheme can produce Bellman optimal actions for planning horizons of 1, but not beyond. In contrast, a recently developed recursive active inference scheme (sophisticated inference) can produce Bellman optimal actions on any finite temporal horizon. We append the analysis with a discussion of the broader relationship between active inference and reinforcement learning.
△ Less
Submitted 11 July, 2022; v1 submitted 17 September, 2020;
originally announced September 2020.
-
Action and Perception as Divergence Minimization
Authors:
Danijar Hafner,
Pedro A. Ortega,
Jimmy Ba,
Thomas Parr,
Karl Friston,
Nicolas Heess
Abstract:
To learn directed behaviors in complex environments, intelligent agents need to optimize objective functions. Various objectives are known for designing artificial agents, including task rewards and intrinsic motivation. However, it is unclear how the known objectives relate to each other, which objectives remain yet to be discovered, and which objectives better describe the behavior of humans. We…
▽ More
To learn directed behaviors in complex environments, intelligent agents need to optimize objective functions. Various objectives are known for designing artificial agents, including task rewards and intrinsic motivation. However, it is unclear how the known objectives relate to each other, which objectives remain yet to be discovered, and which objectives better describe the behavior of humans. We introduce the Action Perception Divergence (APD), an approach for categorizing the space of possible objective functions for embodied agents. We show a spectrum that reaches from narrow to general objectives. While the narrow objectives correspond to domain-specific rewards as typical in reinforcement learning, the general objectives maximize information with the environment through latent variable models of input sequences. Intuitively, these agents use perception to align their beliefs with the world and use actions to align the world with their beliefs. They infer representations that are informative of past inputs, explore future inputs that are informative of their representations, and select actions or skills that maximally influence future inputs. This explains a wide range of unsupervised objectives from a single principle, including representation learning, information gain, empowerment, and skill discovery. Our findings suggest leveraging powerful world models for unsupervised exploration as a path toward highly adaptive agents that seek out large niches in their environments, rendering task rewards optional.
△ Less
Submitted 12 February, 2022; v1 submitted 3 September, 2020;
originally announced September 2020.
-
Nonparametric Feature Impact and Importance
Authors:
Terence Parr,
James D. Wilson,
Jeff Hamrick
Abstract:
Practitioners use feature importance to rank and eliminate weak predictors during model development in an effort to simplify models and improve generality. Unfortunately, they also routinely conflate such feature importance measures with feature impact, the isolated effect of an explanatory variable on the response variable. This can lead to real-world consequences when importance is inappropriate…
▽ More
Practitioners use feature importance to rank and eliminate weak predictors during model development in an effort to simplify models and improve generality. Unfortunately, they also routinely conflate such feature importance measures with feature impact, the isolated effect of an explanatory variable on the response variable. This can lead to real-world consequences when importance is inappropriately interpreted as impact for business or medical insight purposes. The dominant approach for computing importances is through interrogation of a fitted model, which works well for feature selection, but gives distorted measures of feature impact. The same method applied to the same data set can yield different feature importances, depending on the model, leading us to conclude that impact should be computed directly from the data. While there are nonparametric feature selection algorithms, they typically provide feature rankings, rather than measures of impact or importance. They also typically focus on single-variable associations with the response. In this paper, we give mathematical definitions of feature impact and importance, derived from partial dependence curves, that operate directly on the data. To assess quality, we show that features ranked by these definitions are competitive with existing feature selection techniques using three real data sets for predictive tasks.
△ Less
Submitted 8 June, 2020;
originally announced June 2020.
-
Sophisticated Inference
Authors:
Karl Friston,
Lancelot Da Costa,
Danijar Hafner,
Casper Hesp,
Thomas Parr
Abstract:
Active inference offers a first principle account of sentient behaviour, from which special and important cases can be derived, e.g., reinforcement learning, active learning, Bayes optimal inference, Bayes optimal design, etc. Active inference resolves the exploitation-exploration dilemma in relation to prior preferences, by placing information gain on the same footing as reward or value. In brief…
▽ More
Active inference offers a first principle account of sentient behaviour, from which special and important cases can be derived, e.g., reinforcement learning, active learning, Bayes optimal inference, Bayes optimal design, etc. Active inference resolves the exploitation-exploration dilemma in relation to prior preferences, by placing information gain on the same footing as reward or value. In brief, active inference replaces value functions with functionals of (Bayesian) beliefs, in the form of an expected (variational) free energy. In this paper, we consider a sophisticated kind of active inference, using a recursive form of expected free energy. Sophistication describes the degree to which an agent has beliefs about beliefs. We consider agents with beliefs about the counterfactual consequences of action for states of affairs and beliefs about those latent states. In other words, we move from simply considering beliefs about 'what would happen if I did that' to 'what would I believe about what would happen if I did that'. The recursive form of the free energy functional effectively implements a deep tree search over actions and outcomes in the future. Crucially, this search is over sequences of belief states, as opposed to states per se. We illustrate the competence of this scheme, using numerical simulations of deep decision problems.
△ Less
Submitted 7 June, 2020;
originally announced June 2020.
-
Active inference: demystified and compared
Authors:
Noor Sajid,
Philip J. Ball,
Thomas Parr,
Karl J. Friston
Abstract:
Active inference is a first principle account of how autonomous agents operate in dynamic, non-stationary environments. This problem is also considered in reinforcement learning (RL), but limited work exists on comparing the two approaches on the same discrete-state environments. In this paper, we provide: 1) an accessible overview of the discrete-state formulation of active inference, highlightin…
▽ More
Active inference is a first principle account of how autonomous agents operate in dynamic, non-stationary environments. This problem is also considered in reinforcement learning (RL), but limited work exists on comparing the two approaches on the same discrete-state environments. In this paper, we provide: 1) an accessible overview of the discrete-state formulation of active inference, highlighting natural behaviors in active inference that are generally engineered in RL; 2) an explicit discrete-state comparison between active inference and RL on an OpenAI gym baseline. We begin by providing a condensed overview of the active inference literature, in particular viewing the various natural behaviors of active inference agents through the lens of RL. We show that by operating in a pure belief-based setting, active inference agents can carry out epistemic exploration, and account for uncertainty about their environment in a Bayes-optimal fashion. Furthermore, we show that the reliance on an explicit reward signal in RL is removed in active inference, where reward can simply be treated as another observation; even in the total absence of rewards, agent behaviors are learned through preference learning. We make these properties explicit by showing two scenarios in which active inference agents can infer behaviors in reward-free environments compared to both Q-learning and Bayesian model-based RL agents; by placing zero prior preferences over rewards and by learning the prior preferences over the observations corresponding to reward. We conclude by noting that this formalism can be applied to more complex settings if appropriate generative models can be formulated. In short, we aim to demystify the behavior of active inference agents by presenting an accessible discrete state-space and time formulation, and demonstrate these behaviors in a OpenAI gym environment, alongside RL agents.
△ Less
Submitted 30 October, 2020; v1 submitted 24 September, 2019;
originally announced September 2019.
-
Technical Report: Partial Dependence through Stratification
Authors:
Terence Parr,
James D. Wilson
Abstract:
Partial dependence curves (FPD) introduced by Friedman, are an important model interpretation tool, but are often not accessible to business analysts and scientists who typically lack the skills to choose, tune, and assess machine learning models. It is also common for the same partial dependence algorithm on the same data to give meaningfully different curves for different models, which calls int…
▽ More
Partial dependence curves (FPD) introduced by Friedman, are an important model interpretation tool, but are often not accessible to business analysts and scientists who typically lack the skills to choose, tune, and assess machine learning models. It is also common for the same partial dependence algorithm on the same data to give meaningfully different curves for different models, which calls into question their precision. Expertise is required to distinguish between model artifacts and true relationships in the data.
In this paper, we contribute methods for computing partial dependence curves, for both numerical (StratPD) and categorical explanatory variables (CatStratPD), that work directly from training data rather than predictions of a model. Our methods provide a direct estimate of partial dependence, and rely on approximating the partial derivative of an unknown regression function without first fitting a model and then approximating its partial derivative. We investigate settings where contemporary partial dependence methods---including FPD, ALE, and SHAP methods---give biased results. Furthermore, we demonstrate that our approach works correctly on synthetic and plausibly on real data sets. Our goal is not to argue that model-based techniques are not useful. Rather, we hope to open a new line of inquiry into nonparametric partial dependence.
△ Less
Submitted 24 April, 2020; v1 submitted 15 July, 2019;
originally announced July 2019.
-
The Matrix Calculus You Need For Deep Learning
Authors:
Terence Parr,
Jeremy Howard
Abstract:
This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather,…
▽ More
This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks, and wish to deepen their understanding of the underlying math. Don't worry if you get stuck at some point along the way---just go back and reread the previous section, and try writing down and working through some examples. And if you're still stuck, we're happy to answer your questions in the Theory category at forums.fast.ai. Note: There is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here. See related articles at http://explained.ai
△ Less
Submitted 2 July, 2018; v1 submitted 5 February, 2018;
originally announced February 2018.
-
Technical Report: Towards a Universal Code Formatter through Machine Learning
Authors:
Terence Parr,
Jurgin Vinju
Abstract:
There are many declarative frameworks that allow us to implement code formatters relatively easily for any specific language, but constructing them is cumbersome. The first problem is that "everybody" wants to format their code differently, leading to either many formatter variants or a ridiculous number of configuration options. Second, the size of each implementation scales with a language's gra…
▽ More
There are many declarative frameworks that allow us to implement code formatters relatively easily for any specific language, but constructing them is cumbersome. The first problem is that "everybody" wants to format their code differently, leading to either many formatter variants or a ridiculous number of configuration options. Second, the size of each implementation scales with a language's grammar size, leading to hundreds of rules.
In this paper, we solve the formatter construction problem using a novel approach, one that automatically derives formatters for any given language without intervention from a language expert. We introduce a code formatter called CodeBuff that uses machine learning to abstract formatting rules from a representative corpus, using a carefully designed feature set. Our experiments on Java, SQL, and ANTLR grammars show that CodeBuff is efficient, has excellent accuracy, and is grammar invariant for a given language. It also generalizes to a 4th language tested during manuscript preparation.
△ Less
Submitted 28 June, 2016;
originally announced June 2016.