Reinforcement learning: Dopamine ramps with fuzzy value estimates

James C R Whittington; Timothy E J Behrens

doi:10.1016/j.cub.2022.01.070

Curr Biol. 2022 Mar 14;32(5):R213-R215. doi: 10.1016/j.cub.2022.01.070.

Authors

James C R Whittington¹, Timothy E J Behrens²

Affiliations

¹ Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX3 9DU, UK.
² Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX3 9DU, UK; Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3AR, UK; Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London W1T 4JG, UK.

PMID: 35290767

Abstract

A new study in reinforcement learning theory shows that extending the temporal difference algorithm to unbiased learning under state uncertainty explains the observed ramping behaviour of dopamine neurons.

Publication types

Comment

MeSH terms

Dopamine*
Learning / physiology
Models, Neurological*
Reinforcement, Psychology
Uncertainty

Substances

Dopamine