Flexible control as surrogate reward or dynamic reward maximization

Cognition. 2022 Dec:229:105262. doi: 10.1016/j.cognition.2022.105262. Epub 2022 Sep 11.

Abstract

The utility of a given experience, like interacting with a particular friend or tasting a particular food, fluctuates continually according to homeostatic and hedonic principles. Consequently, to maximize reward, an individual must be able to escape or attain outcomes as preferences change, by switching between actions. Recent work on human and artificial intelligence has defined such flexible instrumental control in information-theoretic terms and postulated that it may serve as a reward surrogate. Another possibility, however, is that the adaptability afforded by flexible control is tacitly implemented by planning for dynamic changes in outcome values. In the current study, an expected utility model that computes decision values over a range of possible monetary gains and losses associated with sensory outcomes provided the best fit to behavioral choice data and performed best in terms of earned rewards. Moreover, consistent with previous work on perceived control and personality, individual differences in dimensional schizotypy were correlated with behavioral choice preferences in conditions with the highest and lowest levels of flexible control. These results contribute to a growing literature on the role of instrumental control in goal-directed choice.

Keywords: Flexible control; Information theory; Instrumental divergence; Intrinsic reward.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Artificial Intelligence*
  • Choice Behavior
  • Humans
  • Personality
  • Reward*