Evidence for positivity and optimism bias abounds in high-level belief updates. However, no consensus has been reached regarding whether learning asymmetries exist in more elementary forms of updates such as reinforcement learning (RL). In RL, the learning asymmetry concerns the difference in sensitivity to positive and negative prediction errors (PEs) when they are incorporated into value estimation, namely the asymmetry of the learning rates associated with positive and negative PEs. Although RL has been established as a canonical framework for characterizing interactions between an agent and its environment, the direction of learning asymmetry remains controversial. Here, we propose that part of the controversy stems from the fact that people may have different value expectations before entering the learning environment. Such a default value expectation influences how PEs are calculated and consequently biases subjects' choices. We test this hypothesis in two learning experiments with stable or varying reinforcement probabilities, across monetary gain, loss, and mixed gain-loss environments. Our results consistently support the model incorporating both asymmetric learning rates and the initial value expectation, highlighting the role of initial expectation in value updating and choice preference. Further simulation and model parameter recovery analyses confirm the unique contribution of initial value expectation in assessing learning rate asymmetry.
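To make the role of the initial expectation concrete, the following minimal sketch shows a value update with separate learning rates for positive and negative PEs. It is an illustration under stated assumptions, not the model fitted in this study: the names `update_value`, `simulate_values`, `q0`, `alpha_pos`, and `alpha_neg` are hypothetical, the parameter values are arbitrary, and the choice rule is omitted.

```python
def update_value(q, reward, alpha_pos, alpha_neg):
    """One asymmetric update: the learning rate depends on the sign of the PE."""
    pe = reward - q  # prediction error relative to the current value estimate
    alpha = alpha_pos if pe > 0 else alpha_neg
    return q + alpha * pe


def simulate_values(rewards, q0, alpha_pos, alpha_neg):
    """Return the value trace, starting from the initial expectation q0."""
    q = q0
    trace = [q]
    for r in rewards:
        q = update_value(q, r, alpha_pos, alpha_neg)
        trace.append(q)
    return trace


# The same outcome (0.5) produces a positive PE for a low initial
# expectation and a negative PE for a high one, so a different
# learning rate is engaged in each case (values are illustrative).
low_start = simulate_values([0.5], q0=0.0, alpha_pos=0.4, alpha_neg=0.2)
high_start = simulate_values([0.5], q0=1.0, alpha_pos=0.4, alpha_neg=0.2)
print(low_start, high_start)
```

In this sketch, an identical outcome sequence is filtered through different learning rates depending only on `q0`, which is one way an unmodeled initial expectation could distort estimates of learning rate asymmetry.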
Copyright: © 2023 Ni et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.