Reference-point centering and range-adaptation enhance human reinforcement learning at the cost of irrational preferences

Sophie Bavard; Maël Lebreton; Mehdi Khamassi; Giorgio Coricelli; Stefano Palminteri

doi:10.1038/s41467-018-06781-2

Nat Commun. 2018 Oct 29;9(1):4503. doi: 10.1038/s41467-018-06781-2.

Authors

Sophie Bavard¹ ² ³, Maël Lebreton⁴ ⁵ ⁶, Mehdi Khamassi⁷ ⁸, Giorgio Coricelli⁹ ¹⁰, Stefano Palminteri¹¹ ¹² ¹³

Affiliations

¹ Laboratoire de Neurosciences Cognitives Computationnelles, Institut National de la Santé et Recherche Médicale, 29 rue d'Ulm, 75005, Paris, France.
² Département d'Etudes Cognitives, Ecole Normale Supérieure, Paris, 75005, France.
³ Institut d'Etudes de la Cognition, Université de Paris Sciences et Lettres, Paris, 75005, France.
⁴ CREED lab, Amsterdam School of Economics, Faculty of Business and Economics, University of Amsterdam, Roetersstraat 11, Amsterdam, 1018 WB, The Netherlands.
⁵ Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, 1018 WB, The Netherlands.
⁶ Swiss Centre for Affective Sciences, University of Geneva, 24 rue du Général-Dufour, Geneva, 1205, Switzerland.
⁷ Institut des Systèmes Intelligents et Robotiques, Centre National de la Recherche Scientifique, 4 place Jussieu, 75005, Paris, France.
⁸ Institut des Sciences de l'Information et de leurs Interactions, Sorbonne Universités, 3 rue Michel-Ange, Paris, 75794, France.
⁹ Department of Economics, University of Southern California, Los Angeles, CA, 90007, USA.
¹⁰ Centro Mente e Cervello, Università di Trento, corso Bettini 21, Rovereto, 38068, Italy.
¹¹ Laboratoire de Neurosciences Cognitives Computationnelles, Institut National de la Santé et Recherche Médicale, 29 rue d'Ulm, 75005, Paris, France.
¹² Département d'Etudes Cognitives, Ecole Normale Supérieure, Paris, 75005, France.
¹³ Institut d'Etudes de la Cognition, Université de Paris Sciences et Lettres, Paris, 75005, France.

Abstract

In economics and perceptual decision-making, contextual effects are well documented: decision weights are adjusted as a function of the distribution of stimuli. Yet, in the reinforcement learning literature, whether and how contextual information pertaining to decision states is integrated in learning algorithms has received comparably little attention. Here, we investigate reinforcement learning behavior and its computational substrates in a task where we orthogonally manipulate outcome valence and magnitude, resulting in systematic variations in state-values. Model comparison indicates that subjects' behavior is best accounted for by an algorithm which includes both reference point-dependence and range-adaptation, two crucial features of state-dependent valuation. In addition, we find that state-dependent outcome valuation progressively emerges, is favored by increasing outcome information, and is correlated with explicit understanding of the task structure. Finally, our data clearly show that, while being locally adaptive (for instance in negative valence and small magnitude contexts), state-dependent valuation comes at the cost of seemingly irrational choices when options are extrapolated out from their original contexts.
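
The trade-off described in the abstract can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions, not the authors' fitted model: the class names, the learning rate of 0.2, the fixed outcome patterns (the "good" option pays on 3 of 4 trials, the "bad" option on 1 of 4), the use of complete feedback, and the particular way outcomes are rescaled and centered are all hypothetical choices made for illustration. One learner rescales each outcome by the largest magnitude seen in its context (range adaptation) and subtracts a running context value (reference-point centering); the other learns from raw outcomes.

```python
from dataclasses import dataclass, field


@dataclass
class RelativeLearner:
    """Hypothetical learner: outcomes are range-scaled, then centered on the context value."""
    alpha: float = 0.2                      # learning rate (assumed value)
    q: dict = field(default_factory=dict)   # option values
    v: float = 0.0                          # running context (state) value
    r_max: float = 0.0                      # largest outcome magnitude seen so far

    def update_trial(self, outcomes):
        # Range adaptation: divide by the largest magnitude observed in this context.
        self.r_max = max([self.r_max] + [abs(r) for r in outcomes.values()])
        scale = self.r_max if self.r_max > 0 else 1.0
        scaled = {opt: r / scale for opt, r in outcomes.items()}
        # Reference-point centering: learn from the outcome relative to the context value.
        for opt, s in scaled.items():
            q_old = self.q.get(opt, 0.0)
            self.q[opt] = q_old + self.alpha * ((s - self.v) - q_old)
        # Update the context value from the mean scaled outcome of the trial.
        mean_scaled = sum(scaled.values()) / len(scaled)
        self.v += self.alpha * (mean_scaled - self.v)


@dataclass
class AbsoluteLearner:
    """Baseline learner: a plain delta rule on raw outcomes."""
    alpha: float = 0.2
    q: dict = field(default_factory=dict)

    def update_trial(self, outcomes):
        for opt, r in outcomes.items():
            q_old = self.q.get(opt, 0.0)
            self.q[opt] = q_old + self.alpha * (r - q_old)


def train(learner, magnitude, n_cycles=10):
    """Train on one context with complete feedback and fixed outcome patterns."""
    good_pattern = [1, 1, 1, 0]   # pays on 3 of 4 trials
    bad_pattern = [0, 0, 0, 1]    # pays on 1 of 4 trials
    for _ in range(n_cycles):
        for g, b in zip(good_pattern, bad_pattern):
            learner.update_trial({"good": g * magnitude, "bad": b * magnitude})
    return learner.q


# Two gain contexts that differ only in outcome magnitude.
rel_small = train(RelativeLearner(), magnitude=0.1)
rel_big = train(RelativeLearner(), magnitude=1.0)
abs_small = train(AbsoluteLearner(), magnitude=0.1)
abs_big = train(AbsoluteLearner(), magnitude=1.0)

# Transfer comparison: good option of the small context vs. bad option of the big one.
print("relative:", rel_small["good"], "vs", rel_big["bad"])
print("absolute:", abs_small["good"], "vs", abs_big["bad"])
```

In this sketch the relative learner ends up with the same option values in both contexts, so the good option from the small-magnitude context is valued above the bad option from the big-magnitude context even though its average payoff is lower (0.075 versus 0.25). The absolute learner ranks the two in line with their average payoffs. This mirrors, in a simplified form, the seemingly irrational transfer preferences the abstract reports.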

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adolescent
Adult
Algorithms
Attention
Behavior / physiology
Computer Simulation
Decision Making / physiology
Female
Humans
Learning / physiology*
Male
Models, Neurological
Reference Values*
Reinforcement, Psychology*
Reward
Young Adult

Grants and funding

EP-D-15-015/EPA/EPA/United States