Showing 1–17 of 17 results for author: Sahani, M

Searching in archive cs.
  1. arXiv:2309.03710  [pdf, other]

    cs.LG

    A State Representation for Diminishing Rewards

    Authors: Ted Moskovitz, Samo Hromadka, Ahmed Touati, Diana Borsa, Maneesh Sahani

    Abstract: A common setting in multitask reinforcement learning (RL) demands that an agent rapidly adapt to various stationary reward functions randomly sampled from a fixed distribution. In such situations, the successor representation (SR) is a popular framework which supports rapid policy evaluation by decoupling a policy's expected discounted, cumulative state occupancies from a specific reward function.…

    Submitted 7 September, 2023; originally announced September 2023.

  2. arXiv:2306.13472  [pdf, other]

    stat.ML cs.LG

    Prediction under Latent Subgroup Shifts with High-Dimensional Observations

    Authors: William I. Walker, Arthur Gretton, Maneesh Sahani

    Abstract: We introduce a new approach to prediction in graphical models with latent-shift adaptation, i.e., where source and target environments differ in the distribution of an unobserved confounding latent variable. Previous work has shown that as long as "concept" and "proxy" variables with appropriate dependence are observed in the source environment, the latent-associated distributional changes can be…

    Submitted 23 June, 2023; originally announced June 2023.

  3. arXiv:2305.15277  [pdf, other]

    cs.LG

    Successor-Predecessor Intrinsic Exploration

    Authors: Changmin Yu, Neil Burgess, Maneesh Sahani, Samuel J. Gershman

    Abstract: Exploration is essential in reinforcement learning, particularly in environments where external rewards are sparse. Here we focus on exploration with intrinsic rewards, where the agent transiently augments the external rewards with self-generated intrinsic rewards. Although the study of intrinsic rewards has a long history, existing methods focus on composing the intrinsic reward based on measures…

    Submitted 25 January, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

  4. arXiv:2209.05661  [pdf, other]

    cs.LG stat.ML

    Unsupervised representation learning with recognition-parametrised probabilistic models

    Authors: William I. Walker, Hugo Soulat, Changmin Yu, Maneesh Sahani

    Abstract: We introduce a new approach to probabilistic unsupervised learning based on the recognition-parametrised model (RPM): a normalised semi-parametric hypothesis class for joint distributions over observed and latent variables. Under the key assumption that observations are conditionally independent given latents, the RPM combines parametric prior and observation-conditioned latent distributions with…

    Submitted 20 April, 2023; v1 submitted 12 September, 2022; originally announced September 2022.

  5. arXiv:2209.05212  [pdf, other]

    cs.LG q-bio.NC

    Structured Recognition for Generative Models with Explaining Away

    Authors: Changmin Yu, Hugo Soulat, Neil Burgess, Maneesh Sahani

    Abstract: A key goal of unsupervised learning is to go beyond density estimation and sample generation to reveal the structure inherent within observed data. Such structure can be expressed in the pattern of interactions between explanatory latent variables captured through a probabilistic graphical model. Although the learning of structured graphical models has a long history, much recent work in unsupervi…

    Submitted 10 November, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

  6. arXiv:2207.08258  [pdf, other]

    cs.LG

    Minimum Description Length Control

    Authors: Ted Moskovitz, Ta-Chu Kao, Maneesh Sahani, Matthew M. Botvinick

    Abstract: We propose a novel framework for multitask reinforcement learning based on the minimum description length (MDL) principle. In this approach, which we term MDL-control (MDL-C), the agent learns the common structure among the tasks with which it is faced and then distills it into a simpler representation which facilitates faster convergence and generalization to new tasks. In doing so, MDL-C natural…

    Submitted 24 July, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

  7. arXiv:2112.02027  [pdf, other]

    q-bio.NC cs.LG

    Divergent representations of ethological visual inputs emerge from supervised, unsupervised, and reinforcement learning

    Authors: Grace W. Lindsay, Josh Merel, Tom Mrsic-Flogel, Maneesh Sahani

    Abstract: Artificial neural systems trained using reinforcement, supervised, and unsupervised learning all acquire internal representations of high dimensional input. To what extent these representations depend on the different learning objectives is largely unknown. Here we compare the representations learned by eight different convolutional neural networks, each with identical ResNet architectures and tra…

    Submitted 8 February, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

    Comments: 23 total pages, 9 main figures, 8 Supplementary figures

  8. arXiv:2109.13863  [pdf, other]

    cs.LG cs.AI

    A First-Occupancy Representation for Reinforcement Learning

    Authors: Ted Moskovitz, Spencer R. Wilson, Maneesh Sahani

    Abstract: Both animals and artificial agents benefit from state representations that support rapid transfer of learning across tasks and which enable them to efficiently traverse their environments to reach rewarding states. The successor representation (SR), which measures the expected cumulative, discounted state occupancy under a fixed policy, enables efficient transfer to different reward structures in…

    Submitted 6 November, 2021; v1 submitted 28 September, 2021; originally announced September 2021.

  9. arXiv:2002.09737  [pdf, other]

    stat.ML cs.LG

    Amortised Learning by Wake-Sleep

    Authors: Li K. Wenliang, Theodore Moskovitz, Heishiro Kanagawa, Maneesh Sahani

    Abstract: Models that employ latent variables to capture structure in observed data lie at the heart of many current unsupervised learning algorithms, but exact maximum-likelihood learning for powerful and flexible latent-variable models is almost always intractable. Thus, state-of-the-art approaches either abandon the maximum-likelihood framework entirely, or else rely on a variety of variational approxima…

    Submitted 15 August, 2020; v1 submitted 22 February, 2020; originally announced February 2020.

  10. arXiv:1906.09480  [pdf, other]

    stat.ML cs.LG cs.NE q-bio.NC

    A neurally plausible model learns successor representations in partially observable environments

    Authors: Eszter Vertes, Maneesh Sahani

    Abstract: Animals need to devise strategies to maximize returns while interacting with their environment based on incoming noisy sensory observations. Task-relevant states, such as the agent's location within an environment or the presence of a predator, are often not directly observable but must be inferred using available sensory information. Successor representations (SR) have been proposed as a middle-g…

    Submitted 22 June, 2019; originally announced June 2019.

  11. arXiv:1906.00232  [pdf, ps, other]

    cs.LG econ.EM math.FA math.ST stat.ML

    Kernel Instrumental Variable Regression

    Authors: Rahul Singh, Maneesh Sahani, Arthur Gretton

    Abstract: Instrumental variable (IV) regression is a strategy for learning causal relationships in observational data. If measurements of input X and output Y are confounded, the causal relationship can nonetheless be identified if an instrumental variable Z is available that influences X directly, but is conditionally independent of Y given X and the unmeasured confounder. The classic two-stage least squar…

    Submitted 15 July, 2020; v1 submitted 1 June, 2019; originally announced June 2019.

    Comments: 41 pages, 11 figures. Advances in Neural Information Processing Systems. 2019

  12. arXiv:1902.04420  [pdf, other]

    stat.ML cs.LG math.DS

    Learning interpretable continuous-time models of latent stochastic dynamical systems

    Authors: Lea Duncker, Gergo Bohner, Julien Boussard, Maneesh Sahani

    Abstract: We develop an approach to learn an interpretable semi-parametric model of a latent continuous-time stochastic dynamical system, assuming noisy high-dimensional outputs sampled at uneven times. The dynamics are described by a nonlinear stochastic differential equation (SDE) driven by a Wiener process, with a drift evolution function drawn from a Gaussian process (GP) conditioned on a set of learnt…

    Submitted 12 February, 2019; originally announced February 2019.

  13. arXiv:1807.01486  [pdf, other]

    math.DS cs.LG physics.data-an

    Empirical fixed point bifurcation analysis

    Authors: Gergo Bohner, Maneesh Sahani

    Abstract: In a common experimental setting, the behaviour of a noisy dynamical system is monitored in response to manipulations of one or more control parameters. Here, we introduce a structured model to describe parametric changes in qualitative system behaviour via stochastic bifurcation analysis. In particular, we describe an extension of Gaussian Process models of transition maps, in which the learned m…

    Submitted 4 July, 2018; originally announced July 2018.

    Comments: Submitted to ICML2018 on 9 February 2018

  14. arXiv:1805.11051  [pdf, other]

    stat.ML cs.LG

    Flexible and accurate inference and learning for deep generative models

    Authors: Eszter Vertes, Maneesh Sahani

    Abstract: We introduce a new approach to learning in hierarchical latent-variable generative models called the "distributed distributional code Helmholtz machine", which emphasises flexibility and accuracy in the inferential process. In common with the original Helmholtz machine and later variational autoencoder algorithms (but unlike adversarial methods) our approach learns an explicit inference or "recogn…

    Submitted 28 May, 2018; originally announced May 2018.

  15. arXiv:1711.00695  [pdf, other]

    cs.LG stat.ML

    A Universal Marginalizer for Amortized Inference in Generative Models

    Authors: Laura Douglas, Iliyan Zarov, Konstantinos Gourgoulias, Chris Lucas, Chris Hart, Adam Baker, Maneesh Sahani, Yura Perov, Saurabh Johri

    Abstract: We consider the problem of inference in a causal generative model where the set of available observations differs between data instances. We show how combining samples drawn from the graphical model with an appropriate masking function makes it possible to train a single neural network to approximate all the corresponding conditional marginal distributions and thus amortize the cost of inference.…

    Submitted 2 November, 2017; originally announced November 2017.

    Comments: Submitted to the NIPS 2017 Workshop on Advances in Approximate Bayesian Inference

  16. arXiv:1301.5650  [pdf, other]

    stat.ML cs.LG

    Regularization and nonlinearities for neural language models: when are they needed?

    Authors: Marius Pachitariu, Maneesh Sahani

    Abstract: Neural language models (LMs) based on recurrent neural networks (RNN) are some of the most successful word and character-level LMs. Why do they work so well, in particular better than linear neural LMs? Possible explanations are that RNNs have an implicitly better regularization or that RNNs have a higher capacity for storing patterns due to their nonlinearities or both. Here we argue for the firs…

    Submitted 20 June, 2013; v1 submitted 23 January, 2013; originally announced January 2013.

    Comments: Added new experiments on large datasets and on the Microsoft Research Sentence Completion Challenge

  17. arXiv:1206.6468  [pdf]

    cs.LG cs.SD stat.ML

    Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation

    Authors: Gautham Mysore, Maneesh Sahani

    Abstract: The past decade has seen substantial work on the use of non-negative matrix factorization and its probabilistic counterparts for audio source separation. Although able to capture audio spectral structure well, these models neglect the non-stationarity and temporal dynamics that are important properties of audio. The recently proposed non-negative factorial hidden Markov model (N-FHMM) introduces a…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)