Contextual modulation of value signals in reward and punishment learning

Stefano Palminteri; Mehdi Khamassi; Mateus Joffily; Giorgio Coricelli

doi:10.1038/ncomms9096

Nat Commun. 2015 Aug 25;6:8096. doi: 10.1038/ncomms9096.

Authors

Stefano Palminteri^{1,2}, Mehdi Khamassi^{3,4}, Mateus Joffily^{4,5}, Giorgio Coricelli^{2,4,6}

Affiliations

¹ Institute of Cognitive Neuroscience (ICN), University College London (UCL), London WC1N 3AR, UK.
² Laboratoire de Neurosciences Cognitives (LNC), Département d'Etudes Cognitives (DEC), Institut National de la Santé et de la Recherche Médicale (INSERM) U960, École Normale Supérieure (ENS), 75005 Paris, France.
³ Institut des Systèmes Intelligents et Robotique (ISIR), Centre National de la Recherche Scientifique (CNRS) UMR 7222, Université Pierre et Marie Curie (UPMC), 70013 Paris, France.
⁴ Interdepartmental Centre for Mind/Brain Sciences (CIMeC), Università degli Studi di Trento, 38060 Trento, Italy.
⁵ Groupe d'Analyse et de Théorie Economique, Centre National de la Recherche Scientifique (CNRS) UMR 5229, Université de Lyon, 69003 Lyon, France.
⁶ Department of Economics, University of Southern California (USC), 90089-0253 Los Angeles, California, USA.

Abstract

Compared with reward seeking, punishment avoidance learning is less clearly understood at both the computational and neurobiological levels. Here we demonstrate, using computational modelling and fMRI in humans, that learning option values on a relative, context-dependent scale offers a simple computational solution for avoidance learning. The context (or state) value sets the reference point to which an outcome should be compared before updating the option value. Consequently, in contexts with an overall negative expected value, successful punishment avoidance acquires a positive value, thus reinforcing the response. As revealed by post-learning assessment of option values, contextual influences are enhanced when subjects are informed about the result of the forgone alternative (counterfactual information). This is mirrored at the neural level by a shift in negative outcome encoding from the anterior insula to the ventral striatum, suggesting that value contextualization also limits the need to mobilize an opponent punishment learning system.
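
The reference-point idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the update rules, the learning rates, the order in which context and option values are updated, the option names ("avoid", "punished"), the deterministic outcomes and the alternating choice schedule are all assumptions made for illustration, and counterfactual feedback is left out. The sketch compares an absolute learner, which updates the option value from the raw outcome, with a relative learner, which first subtracts a learned context value from the outcome.

```python
# Illustrative sketch only; every name and parameter here is an assumption,
# not taken from the paper.

def absolute_update(q, outcome, alpha_q=0.3):
    """Delta-rule update of an option value from the raw outcome."""
    return q + alpha_q * (outcome - q)


def relative_update(q, v, outcome, alpha_q=0.3, alpha_v=0.3):
    """Update the context value v, then update the option value q from
    the outcome expressed relative to that context value."""
    v_new = v + alpha_v * (outcome - v)      # context value = reference point
    relative_outcome = outcome - v_new       # outcome compared with reference
    q_new = q + alpha_q * (relative_outcome - q)
    return q_new, v_new


def simulate(n_trials=40):
    """Hypothetical punishment context with two options:
    'punished' always yields -1, 'avoid' always yields 0.
    Choices simply alternate so that both options are sampled."""
    outcomes = {"punished": -1.0, "avoid": 0.0}
    q_abs = {"punished": 0.0, "avoid": 0.0}
    q_rel = {"punished": 0.0, "avoid": 0.0}
    v = 0.0
    for t in range(n_trials):
        choice = "punished" if t % 2 == 0 else "avoid"
        r = outcomes[choice]
        q_abs[choice] = absolute_update(q_abs[choice], r)
        q_rel[choice], v = relative_update(q_rel[choice], v, r)
    return q_abs, q_rel, v


if __name__ == "__main__":
    q_abs, q_rel, v = simulate()
    print("absolute values:", q_abs)
    print("relative values:", q_rel)
    print("context value:  ", v)
```

Under these assumptions the context value settles below zero, so the neutral outcome of the "avoid" option sits above the reference point and its relative value becomes positive, whereas its absolute value stays at zero and offers nothing to reinforce the avoidance response.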

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Avoidance Learning / physiology*
Bayes Theorem
Brain / physiology
Brain Mapping
Cerebral Cortex / physiology
Computer Simulation
Decision Making / physiology*
Female
Functional Neuroimaging
Humans
Image Processing, Computer-Assisted
Learning / physiology
Magnetic Resonance Imaging
Male
Models, Neurological
Prefrontal Cortex / physiology*
Punishment*
Reward*
Ventral Striatum / physiology*
Young Adult

Grants and funding

617629/ERC_/European Research Council/International