The computational roots of positivity and confirmation biases in reinforcement learning

Stefano Palminteri; Maël Lebreton

doi:10.1016/j.tics.2022.04.005

Trends Cogn Sci. 2022 Jul;26(7):607-621. doi: 10.1016/j.tics.2022.04.005. Epub 2022 May 31.

Authors

Stefano Palminteri¹, Maël Lebreton²

Affiliations

¹ Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et Recherche Médicale, Paris, France; Département d'Études Cognitives, Ecole Normale Supérieure, Paris, France; Université de Recherche Paris Sciences et Lettres, Paris, France.
² Paris School of Economics, Paris, France; LabNIC, Department of Fundamental Neurosciences, University of Geneva, Geneva, Switzerland; Swiss Center for Affective Science, Geneva, Switzerland.

PMID: 35662490

Abstract

Humans do not integrate new information objectively: outcomes carrying a positive affective value and evidence confirming one's own prior belief are overweighed. Until recently, theoretical and empirical accounts of the positivity and confirmation biases assumed them to be specific to 'high-level' belief updates. We present evidence against this account. Learning rates in reinforcement learning (RL) tasks, estimated across different contexts and species, generally present the same characteristic asymmetry, suggesting that belief and value updating processes share key computational principles and distortions. This bias generates over-optimistic expectations about the probability of making the right choices and, consequently, generates over-optimistic reward expectations. We discuss the normative and neurobiological roots of these RL biases and their position within the greater picture of behavioral decision-making theories.
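The learning-rate asymmetry described in the abstract can be illustrated with a minimal sketch. It assumes a simple delta-rule (Rescorla-Wagner-style) value update with two learning rates, one for positive and one for negative prediction errors. This is a common way to formalize the idea, not necessarily the exact model used in the review; the function names, the parameter values (`alpha_pos`, `alpha_neg`), and the alternating reward sequence are illustrative assumptions.

```python
def asymmetric_update(q, reward, alpha_pos, alpha_neg):
    """One delta-rule update with separate learning rates (assumed form)."""
    delta = reward - q  # prediction error
    # Positivity bias: better-than-expected outcomes are weighted more heavily.
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta


def mean_late_value(rewards, alpha_pos, alpha_neg, q0=0.5, tail=100):
    """Run the learner over `rewards`; return the mean value over the last `tail` steps."""
    q = q0
    trajectory = []
    for r in rewards:
        q = asymmetric_update(q, r, alpha_pos, alpha_neg)
        trajectory.append(q)
    late = trajectory[-tail:]
    return sum(late) / len(late)


# Hypothetical option that pays 1 and 0 in alternation, so its true mean reward is 0.5.
rewards = [1.0, 0.0] * 200

unbiased = mean_late_value(rewards, alpha_pos=0.2, alpha_neg=0.2)
biased = mean_late_value(rewards, alpha_pos=0.3, alpha_neg=0.1)

print(f"symmetric learner:  {unbiased:.3f}")
print(f"asymmetric learner: {biased:.3f}")
```

Under these assumptions the symmetric learner's value estimate settles around the true mean reward, while the learner with `alpha_pos > alpha_neg` settles well above it, which is the over-optimistic reward expectation the abstract refers to. A confirmation-bias variant would, in the same spirit, apply the larger learning rate to outcomes that confirm the current choice (for example, negative prediction errors for the unchosen option).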

Keywords: confirmation; decision; gain; learning; loss; update.

Publication types

Review
Research Support, Non-U.S. Gov't

MeSH terms

Bias
Decision Making*
Humans
Learning
Reinforcement, Psychology*
Reward