Decomposing the effects of context valence and feedback information on speed and accuracy during reinforcement learning: a meta-analytical approach using diffusion decision modeling

Cogn Affect Behav Neurosci. 2019 Jun;19(3):490-502. doi: 10.3758/s13415-019-00723-1.

Abstract

Reinforcement learning (RL) models describe how humans and animals learn by trial and error to select actions that maximize rewards and minimize punishments. Traditional RL models focus exclusively on choices, thereby ignoring the interactions between choice preference and response time (RT), and how these interactions are influenced by contextual factors. However, in the field of perceptual decision-making, such interactions have proven important for dissociating between different underlying cognitive processes. Here, we investigated these interactions to shed new light on overlooked differences between learning to seek rewards and learning to avoid losses. We leveraged behavioral data from four RL experiments that manipulated two factors: outcome valence (gains vs. losses) and feedback information (partial vs. complete feedback). A Bayesian meta-analysis revealed that these contextual factors differentially affect RTs and accuracy: while valence affects only RTs, feedback information affects both RTs and accuracy. To dissociate between the latent cognitive processes, we jointly fitted choices and RTs across all experiments with a Bayesian, hierarchical diffusion decision model (DDM). We found that the feedback manipulation affected drift rate, threshold, and non-decision time, suggesting that it was not a mere difficulty effect. Moreover, valence affected non-decision time and threshold, suggesting motor inhibition in punishing contexts. To better understand the learning dynamics, we finally fitted a combination of RL and DDM (RLDDM). We found that whereas the threshold was modulated by trial-specific decision conflict, the non-decision time was modulated by the learned context valence. Overall, our results illustrate the benefits of jointly modeling RTs and choices during RL to reveal subtle mechanistic differences underlying decisions in different learning contexts.
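
For readers unfamiliar with the RLDDM framework referenced in the abstract, the following is a minimal sketch (not the authors' implementation) of its core idea: a delta-rule Q-learner whose trial-wise value difference sets the drift rate of a diffusion process that produces both a choice and an RT. All parameter values, the two-option task, and the `simulate_ddm` helper are illustrative assumptions, not quantities reported in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumed for this sketch, not estimates from the paper)
alpha = 0.3              # learning rate of the delta rule
drift_scale = 2.0        # maps the Q-value difference onto the drift rate
threshold = 1.0          # boundary separation
ndt = 0.3                # non-decision time (s)
n_trials = 60
p_reward = (0.25, 0.75)  # reward probabilities of the two options

def simulate_ddm(v, a, t0, dt=0.001, noise=1.0, max_t=5.0):
    """Simulate one diffusion trial: evidence starts at a/2 and drifts
    toward the upper (option 1) or lower (option 0) boundary.
    Returns (choice, response time)."""
    x, t = a / 2.0, 0.0
    while 0.0 < x < a and t < max_t:
        x += v * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return (1 if x >= a else 0), t0 + t

Q = np.zeros(2)  # learned option values
for trial in range(n_trials):
    # Drift rate scales with the learned value difference (option 1 minus option 0)
    v = drift_scale * (Q[1] - Q[0])
    choice, rt = simulate_ddm(v, threshold, ndt)
    reward = float(rng.random() < p_reward[choice])
    # Delta-rule update of the chosen option only (partial-feedback case)
    Q[choice] += alpha * (reward - Q[choice])
    print(f"trial {trial:02d}  choice={choice}  rt={rt:.3f}s  Q={Q.round(2)}")
```

In the models described in the abstract, the threshold and non-decision time would additionally be modulated by trial-specific decision conflict and learned context valence, respectively; this sketch keeps them fixed for brevity.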

Keywords: Decision-making; Diffusion decision model; Motivation; Reinforcement learning; Response time; Reward.

Publication types

  • Meta-Analysis
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Decision Making* / physiology
  • Feedback, Psychological* / physiology
  • Female
  • Humans
  • Male
  • Models, Biological*
  • Reaction Time* / physiology
  • Reinforcement, Psychology*
  • Young Adult