Reward prediction error neurons implement an efficient code for reward

Heiko H Schütt; Dongjae Kim; Wei Ji Ma

doi:10.1038/s41593-024-01671-x

Nat Neurosci. 2024 Jul;27(7):1333-1339. doi: 10.1038/s41593-024-01671-x. Epub 2024 Jun 19.

Authors

Heiko H Schütt^# ¹ ², Dongjae Kim^# ³ ⁴, Wei Ji Ma ³

Affiliations

¹ Center for Neural Science and Department of Psychology, New York University, New York, NY, USA.
² Department of Behavioural and Cognitive Sciences, Université du Luxembourg, Esch-Belval, Luxembourg.
³ Center for Neural Science and Department of Psychology, New York University, New York, NY, USA.
⁴ Department of AI-Based Convergence, Dankook University, Yongin, Republic of Korea.

^# Contributed equally.

PMID: 38898182
DOI: 10.1038/s41593-024-01671-x

Abstract

We use efficient coding principles borrowed from sensory neuroscience to derive the optimal neural population to encode a reward distribution. We show that the responses of dopaminergic reward prediction error neurons in mouse and macaque are similar to those of the efficient code in the following ways: the neurons have a broad distribution of midpoints covering the reward distribution; neurons with higher thresholds have higher gains, more convex tuning functions and lower slopes; and their slope is higher when the reward distribution is narrower. Furthermore, we derive learning rules that converge to the efficient code. The learning rule for the position of the neuron on the reward axis closely resembles distributional reinforcement learning. Thus, reward prediction error neuron responses may be optimized to broadcast an efficient reward signal, forming a connection between efficient coding and reinforcement learning, two of the most successful theories in computational neuroscience.
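
The abstract states that the learning rule for each neuron's position on the reward axis closely resembles distributional reinforcement learning, and that the midpoints end up covering the reward distribution. The exact rule is not given here, so the following is only a minimal sketch of a generic quantile-tracking update of the kind used in distributional reinforcement learning, not the authors' derivation. The function names, the uniform reward distribution, the evenly spaced quantile levels, the learning rate and the step count are all assumptions chosen for illustration.

```python
import random


def learn_midpoints(sample_reward, n_neurons=5, lr=0.01, n_steps=50_000, seed=0):
    """Track evenly spaced quantiles of a reward distribution.

    Illustrative stand-in (quantile-regression-style update), NOT the
    learning rule derived in the paper.
    """
    rng = random.Random(seed)
    # Assumed target quantile level per hypothetical neuron: 1/(N+1) ... N/(N+1).
    taus = [(i + 1) / (n_neurons + 1) for i in range(n_neurons)]
    midpoints = [0.0] * n_neurons
    for _ in range(n_steps):
        r = sample_reward(rng)
        for i, tau in enumerate(taus):
            # Step up by lr * tau when the reward exceeds the midpoint,
            # step down by lr * (1 - tau) otherwise. The expected step is
            # zero when the midpoint sits at the tau-quantile.
            if r > midpoints[i]:
                midpoints[i] += lr * tau
            else:
                midpoints[i] -= lr * (1 - tau)
    return taus, midpoints


def uniform_reward(rng):
    # Assumed toy reward distribution: uniform on [0, 10].
    return rng.uniform(0.0, 10.0)


taus, midpoints = learn_midpoints(uniform_reward)
for tau, m in zip(taus, midpoints):
    print(f"quantile level {tau:.2f}: learned midpoint {m:.2f}")
```

Under these assumptions, each midpoint drifts toward its own quantile of the sampled rewards, so the population spreads across the reward distribution rather than collapsing onto its mean. The gain, convexity and slope properties described in the abstract are not modeled in this sketch.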

MeSH terms

Animals
Dopaminergic Neurons / physiology
Macaca mulatta
Male
Mice
Models, Neurological*
Neurons / physiology
Reinforcement, Psychology
Reward*