Search | arXiv e-print repository

arXiv:1904.01557 [pdf, other]

Analysing Mathematical Reasoning Abilities of Neural Models

Authors: David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli

Abstract: Mathematical reasoning---a core ability within human intelligence---presents some unique challenges as a domain: we do not come to understand and solve mathematical problems primarily on the back of experience and evidence, but on the basis of inferring, learning, and exploiting laws, axioms, and symbol manipulation rules. In this paper, we present a new challenge for the evaluation (and eventually the design) of neural architectures and similar systems, developing a task suite of mathematics problems involving sequential questions and answers in a free-form textual input/output format. The structured nature of the mathematics domain, covering arithmetic, algebra, probability and calculus, enables the construction of training and test splits designed to clearly illuminate the capabilities and failure-modes of different architectures, as well as evaluate their ability to compose and relate knowledge and learned processes. Having described the data generation process and its potential future expansions, we conduct a comprehensive analysis of models from two broad classes of the most powerful sequence-to-sequence architectures and find notable differences in their ability to resolve mathematical problems and generalize their knowledge.

Submitted 2 April, 2019; originally announced April 2019.

arXiv:1903.11907 [pdf, other]

Meta-Learning surrogate models for sequential decision making

Authors: Alexandre Galashov, Jonathan Schwarz, Hyunjik Kim, Marta Garnelo, David Saxton, Pushmeet Kohli, S. M. Ali Eslami, Yee Whye Teh

Abstract: We introduce a unified probabilistic framework for solving sequential decision making problems ranging from Bayesian optimisation to contextual bandits and reinforcement learning. This is accomplished by a probabilistic model-based approach that explains observed data while capturing predictive uncertainty during the decision making process. Crucially, this probabilistic model is chosen to be a Meta-Learning system that allows learning from a distribution of related problems, allowing data efficient adaptation to a target task. As a suitable instantiation of this framework, we explore the use of Neural processes due to statistical and computational desiderata. We apply our framework to a broad range of problem domains, such as control problems, recommender systems and adversarial attacks on RL agents, demonstrating an efficient and general black-box learning approach.

Submitted 12 June, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

arXiv:1807.01613 [pdf, other]

Conditional Neural Processes

Authors: Marta Garnelo, Dan Rosenbaum, Chris J. Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo J. Rezende, S. M. Ali Eslami

Abstract: Deep neural networks excel at function approximation, yet they are typically trained from scratch for each new function. On the other hand, Bayesian methods, such as Gaussian Processes (GPs), exploit prior knowledge to quickly infer the shape of a new function at test time. Yet GPs are computationally expensive, and it can be hard to design appropriate priors. In this paper we propose a family of neural models, Conditional Neural Processes (CNPs), that combine the benefits of both. CNPs are inspired by the flexibility of stochastic processes such as GPs, but are structured as neural networks and trained via gradient descent. CNPs make accurate predictions after observing only a handful of training data points, yet scale to complex functions and large datasets. We demonstrate the performance and versatility of the approach on a range of canonical machine learning tasks, including regression, classification and image completion.

Submitted 4 July, 2018; originally announced July 2018.

arXiv:1803.10760 [pdf, other]

Unsupervised Predictive Memory in a Goal-Directed Agent

Authors: Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap

Abstract: Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement learning (RL) algorithms with deep neural networks, and the excitement surrounding these results has led to the pursuit of related ideas as explanations of non-human animal learning. However, we demonstrate that contemporary RL algorithms struggle to solve simple tasks when enough information is concealed from the sensors of the agent, a property called "partial observability". An obvious requirement for handling partially observed tasks is access to extensive memory, but we show memory is not enough; it is critical that the right information be stored in the right format. We develop a model, the Memory, RL, and Inference Network (MERLIN), in which memory formation is guided by a process of predictive modeling. MERLIN facilitates the solution of tasks in 3D virtual reality environments for which partial observability is severe and memories must be maintained over long durations. Our model demonstrates a single learning agent architecture that can solve canonical behavioural tasks in psychology and neurobiology without strong simplifying assumptions about the dimensionality of sensory input or the duration of experiences.

Submitted 28 March, 2018; originally announced March 2018.

arXiv:1802.08535 [pdf, other]

Can Neural Networks Understand Logical Entailment?

Authors: Richard Evans, David Saxton, David Amos, Pushmeet Kohli, Edward Grefenstette

Abstract: We introduce a new dataset of logical entailments for the purpose of measuring models' ability to capture and exploit the structure of logical expressions against an entailment prediction task. We use this task to compare a series of architectures which are ubiquitous in the sequence-processing literature, in addition to a new model class---PossibleWorldNets---which computes entailment as a "convolution over possible worlds". Results show that convolutional networks present the wrong inductive bias for this class of problems relative to LSTM RNNs, tree-structured neural networks outperform LSTM RNNs due to their enhanced ability to exploit the syntax of logic, and PossibleWorldNets outperform all benchmarks.

Submitted 23 February, 2018; originally announced February 2018.

Comments: Published at ICLR 2018 (main conference)

arXiv:1709.03220 [pdf]

Give Me a Like: How HIV/AIDS Nonprofit Organizations Can Engage Their Audience on Facebook

Authors: Yu-Chao Huang, Yi-Pin Lin, Gregory D. Saxton

Abstract: With the rapid proliferation and adoption of social media among healthcare professionals and organizations, social media-based HIV/AIDS intervention programs have become increasingly popular. However, the question of the effectiveness of the HIV/AIDS messages disseminated via social media has received scant attention in the literature. The current study applies content analysis to examine the relationship between Facebook messaging strategies employed by 110 HIV/AIDS nonprofit organizations and audience reactions in the form of liking, commenting, and sharing behavior. The results reveal that HIV/AIDS nonprofit organizations often use informational messages as one-way communication with their audience instead of dialogic interactions. Some specific types of messages, such as medication-focused messages, engender better audience engagement; in contrast, event-related messages and call-to-action messages appear to translate into lower corresponding audience reactions. The findings provide guidance to HIV/AIDS organizations in developing effective social media communication strategies.

Submitted 10 September, 2017; originally announced September 2017.

Comments: 23 pages, 1 figure, 3 tables, AIDS Education and Prevention, 2016

arXiv:1706.06383 [pdf, other]

Programmable Agents

Authors: Misha Denil, Sergio Gómez Colmenarejo, Serkan Cabi, David Saxton, Nando de Freitas

Abstract: We build deep RL agents that execute declarative programs expressed in formal language. The agents learn to ground the terms in this language in their environment, and can generalize their behavior at test time to execute new programs that refer to objects that were not referenced during training. The agents develop disentangled interpretable representations that allow them to generalize to a wide variety of zero-shot semantic tasks.

Submitted 20 June, 2017; originally announced June 2017.

arXiv:1606.01868 [pdf, other]

Unifying Count-Based Exploration and Intrinsic Motivation

Authors: Marc G. Bellemare, Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton, Remi Munos

Abstract: We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary density model. This technique enables us to generalize count-based exploration algorithms to the non-tabular case. We apply our ideas to Atari 2600 games, providing sensible pseudo-counts from raw pixels. We transform these pseudo-counts into intrinsic rewards and obtain significantly improved exploration in a number of hard games, including the infamously difficult Montezuma's Revenge.

Submitted 7 November, 2016; v1 submitted 6 June, 2016; originally announced June 2016.

arXiv:1208.3394 [pdf]

doi 10.1177/1461444812452411

Modeling the adoption and use of social media by nonprofit organizations

Authors: Seungahn Nah, Gregory D. Saxton

Abstract: This study examines what drives organizational adoption and use of social media through a model built around four key factors - strategy, capacity, governance, and environment. Using Twitter, Facebook, and other data on 100 large US nonprofit organizations, the model is employed to examine the determinants of three key facets of social media utilization: 1) adoption, 2) frequency of use, and 3) dialogue. We find that organizational strategies, capacities, governance features, and external pressures all play a part in these social media adoption and utilization outcomes. Through its integrated, multi-disciplinary theoretical perspective, this study thus helps foster understanding of which types of organizations are able and willing to adopt and juggle multiple social media accounts, to use those accounts to communicate more frequently with their external publics, and to build relationships with those publics through the sending of dialogic messages.

Submitted 16 August, 2012; originally announced August 2012.

Comments: Seungahn Nah and Gregory D. Saxton. (in press). Modeling the adoption and use of social media by nonprofit organizations. New Media & Society, forthcoming

arXiv:1204.3230 [pdf]

doi 10.1111/j.1083-6101.2012.01576.x

Information, Community, and Action: How Nonprofit Organizations Use Social Media

Authors: Kristen Lovejoy, Gregory D. Saxton

Abstract: The rapid diffusion of "microblogging" services such as Twitter is ushering in a new era of possibilities for organizations to communicate with and engage their core stakeholders and the general public. To enhance understanding of the communicative functions microblogging serves for organizations, this study examines the Twitter utilization practices of the 100 largest nonprofit organizations in the United States. The analysis reveals there are three key functions of microblogging updates: "information," "community," and "action." Though the informational use of microblogging is extensive, nonprofit organizations are better at using Twitter to strategically engage their stakeholders via dialogic and community-building practices than they have been with traditional websites. The adoption of social media appears to have engendered new paradigms of public engagement. Keywords: microblogging; Twitter; social media; stakeholder relations; organizational communication; organization-public relations; nonprofit organizations

Submitted 14 April, 2012; originally announced April 2012.

Journal ref: Journal of Computer Mediated Communication, vol. 17, pp. 337-353, 2012

arXiv:1203.5279 [pdf]

Social Media and the Social Good: How Nonprofits Use Facebook to Communicate with the Public

Authors: Gregory D. Saxton, Chao Guo, I-Hsuan Chiu, Bo Feng

Abstract: In this study, we examine the social networking practices of the 100 largest nonprofit organizations in the United States. More specifically, we develop a comprehensive classification scheme to delineate these organizations' use of Facebook as a stakeholder engagement tool. We find that there are 5 primary categories of Facebook "statuses", which can be aggregated into three key dimensions - "information", "community", and "action". Our analysis reveals that, though the "informational" use of Facebook is still significant, nonprofit organizations are better at using Facebook to strategically engage their stakeholders via "dialogic" and "community-building" practices than they have been with traditional websites. The adoption of social media seems to have engendered new paradigms of public engagement.

Submitted 23 March, 2012; originally announced March 2012.

Comments: Chinese-language article

Journal ref: China Third Sector Research, Vol. 1, pp. 40-54, 2011

arXiv:1106.1852 [pdf]

doi 10.1016/j.pubrev.2012.01.005

Engaging Stakeholders through Twitter: How Nonprofit Organizations are Getting More Out of 140 Characters or Less

Authors: Kristen Lovejoy, Richard Waters, Gregory D. Saxton

Abstract: 140 characters seems like too small a space for any meaningful information to be exchanged, but Twitter users have found creative ways to get the most out of each Tweet by using different communication tools. This paper looks into how 73 nonprofit organizations use Twitter to engage stakeholders not only through their tweets, but also through other various communication methods. Specifically, it looks into the organizations' utilization of tweet frequency, following behavior, hyperlinks, hashtags, public messages, retweets, and multimedia files. After analyzing 4,655 tweets, the study found that the nation's largest nonprofits are not using Twitter to maximize stakeholder involvement. Instead, they continue to use social media as a one-way communication channel, as less than 20% of their total tweets demonstrate conversations and roughly 16% demonstrate indirect connections to specific users.

Submitted 17 February, 2012; v1 submitted 9 June, 2011; originally announced June 2011.

Comments: In press, Public Relations Review; 10 pages

Showing 1–12 of 12 results for author: Saxton, D