Search | arXiv e-print repository

Proving Theorems using Incremental Learning and Hindsight Experience Replay

Authors: Eser Aygün, Laurent Orseau, Ankit Anand, Xavier Glorot, Vlad Firoiu, Lei M. Zhang, Doina Precup, Shibl Mourad

Abstract: Traditional automated theorem provers for first-order logic depend on speed-optimized search and many handcrafted heuristics that are designed to work best over a wide range of domains. Machine learning approaches in the literature either depend on these traditional provers to bootstrap themselves or fall short of reaching comparable performance. In this paper, we propose a general incremental learning algorithm for training domain-specific provers for first-order logic without equality, based only on a basic given-clause algorithm, but using a learned clause-scoring function. Clauses are represented as graphs and presented to transformer networks with spectral features. To address the sparsity and the initial lack of training data as well as the lack of a natural curriculum, we adapt hindsight experience replay to theorem proving, so as to be able to learn even when no proof can be found. We show that provers trained this way can match and sometimes surpass state-of-the-art traditional provers on the TPTP dataset in terms of both quantity and quality of the proofs.

Submitted 20 December, 2021; originally announced December 2021.

Comments: 16 pages, 2 figures

ACM Class: I.2.3

arXiv:2106.13105

The Option Keyboard: Combining Skills in Reinforcement Learning

Authors: André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan Hunt, Shibl Mourad, David Silver, Doina Precup

Abstract: The ability to combine known skills to create new ones may be crucial in the solution of complex reinforcement learning problems that unfold over extended periods. We argue that a robust way of combining skills is to define and manipulate them in the space of pseudo-rewards (or "cumulants"). Based on this premise, we propose a framework for combining skills using the formalism of options. We show that every deterministic option can be unambiguously represented as a cumulant defined in an extended domain. Building on this insight and on previous results on transfer learning, we show how to approximate options whose cumulants are linear combinations of the cumulants of known options. This means that, once we have learned options associated with a set of cumulants, we can instantaneously synthesise options induced by any linear combination of them, without any learning involved. We describe how this framework provides a hierarchical interface to the environment whose abstract actions correspond to combinations of basic skills. We demonstrate the practical benefits of our approach in a resource management problem and a navigation task involving a quadrupedal simulated robot.

Submitted 24 June, 2021; originally announced June 2021.

Comments: Published at NeurIPS 2019

arXiv:2103.03798

Training a First-Order Theorem Prover from Synthetic Data

Authors: Vlad Firoiu, Eser Aygun, Ankit Anand, Zafarali Ahmed, Xavier Glorot, Laurent Orseau, Lei Zhang, Doina Precup, Shibl Mourad

Abstract: A major challenge in applying machine learning to automated theorem proving is the scarcity of training data, which is a key ingredient in training successful deep learning models. To tackle this problem, we propose an approach that relies on training purely with synthetically generated theorems, without any human data aside from axioms. We use these theorems to train a neurally-guided saturation-based prover. Our neural prover outperforms the state-of-the-art E-prover on this synthetic data in both time and search steps, and shows significant transfer to the unseen human-written theorems from the TPTP library, where it solves 72% of first-order problems without equality.

Submitted 6 April, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

arXiv:2006.11259

Learning to Prove from Synthetic Theorems

Authors: Eser Aygün, Zafarali Ahmed, Ankit Anand, Vlad Firoiu, Xavier Glorot, Laurent Orseau, Doina Precup, Shibl Mourad

Abstract: A major challenge in applying machine learning to automated theorem proving is the scarcity of training data, which is a key ingredient in training successful deep learning models. To tackle this problem, we propose an approach that relies on training with synthetic theorems, generated from a set of axioms. We show that such theorems can be used to train an automated prover and that the learned prover transfers successfully to human-generated theorems. We demonstrate that a prover trained exclusively on synthetic theorems can solve a substantial fraction of problems in TPTP, a benchmark dataset that is used to compare state-of-the-art heuristic provers. Our approach outperforms a model trained on human-generated problems in most axiom sets, thereby showing the promise of using synthetic data for this task.

Submitted 19 June, 2020; originally announced June 2020.

Comments: 17 pages, 6 figures, submitted to NeurIPS 2020

ACM Class: I.2.3

arXiv:2004.01097

Learning to cooperate: Emergent communication in multi-agent navigation

Authors: Ivana Kajić, Eser Aygün, Doina Precup

Abstract: Emergent communication in artificial agents has been studied to understand language evolution, as well as to develop artificial systems that learn to communicate with humans. We show that agents performing a cooperative navigation task in various gridworld environments learn an interpretable communication protocol that enables them to efficiently, and in many cases, optimally, solve the task. An analysis of the agents' policies reveals that emergent signals spatially cluster the state space, with signals referring to specific locations and spatial directions such as "left", "up", or "upper left room". Using populations of agents, we show that the emergent protocol has basic compositional structure, thus exhibiting a core property of natural language.

Submitted 30 June, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

Comments: Accepted to CogSci 2020

arXiv:1107.3457

doi 10.1088/1742-6596/319/1/012007

Spectral renormalization group theory on networks

Authors: Eser Aygun, Ayse Erzan

Abstract: Discrete amorphous materials are best described in terms of arbitrary networks which can be embedded in three-dimensional space. Investigating the thermodynamic equilibrium as well as non-equilibrium behavior of such materials around second-order phase transitions calls for special techniques. We set up a renormalization group scheme by expanding an arbitrary scalar field living on the nodes of an arbitrary network, in terms of the eigenvectors of the normalized graph Laplacian. The renormalization transformation involves, as usual, the integration over the more "rapidly varying" components of the field, corresponding to eigenvectors with larger eigenvalues, and then rescaling. The critical exponents depend on the particular graph through the spectral density of the eigenvalues.

Submitted 18 July, 2011; originally announced July 2011.

Comments: 17 pages, 3 figures, presented at the Continuum Models and Discrete Systems (CMDS-12), 21-25 Feb 2011, Saha Institute of Nuclear Physics, Kolkata, India

Journal ref: Journal of Physics: Conference Series 319 (2011) 012007

arXiv:nlin/0609042

doi 10.1209/0295-5075/78/60007

A Formal Treatment of Generalized Preferential Attachment and its Empirical Validation

Authors: Amac Herdagdelen, Eser Aygun, Haluk Bingol

Abstract: Generalized preferential attachment is defined as the tendency of a vertex to acquire new links in the future with respect to a particular vertex property. Understanding which properties influence link acquisition tendency (LAT) gives us predictive power to estimate the future growth of the network and insight into the actual dynamics governing complex networks. In this study, we explore the effect of age and degree on LAT by analyzing data collected from a new complex-network growth dataset. We found that LAT and degree of a vertex are linearly correlated, in accordance with previous studies. Interestingly, the relation between LAT and age of a vertex is found to be in conflict with the known models of network growth. We identified three different periods in the network's lifetime where the relation between age and LAT is strongly positive, almost stationary, and negative, respectively.

Submitted 16 July, 2007; v1 submitted 15 September, 2006; originally announced September 2006.

Journal ref: EPL 78 No 6 (June 2007) 60007

Showing 1–7 of 7 results for author: Aygun, E