Background: Reward sensitivity is an essential dimension related to mood fluctuations in bipolar disorder (BD), but whether BD during remission is characterized by hypersensitivity or hyposensitivity to reward is still debated. The conflicting findings are probably related to heterogeneous populations within the BD spectrum and to the lack of an evaluation of reward bias. Here, we examine reward maximization vs. punishment avoidance learning within the BD spectrum during remission.
Methods: Patients with BD-I (n = 45) or BD-II (n = 34) and matched healthy controls (HC; n = 30) were included. They performed an instrumental learning task designed to dissociate reward-based from punishment-based reinforcement learning. Computational modeling was used to identify the mechanisms underlying reinforcement learning performance.
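As an illustration of the kind of model such an analysis might fit, the following is a minimal sketch only, not the model used in this study. It assumes a standard delta-rule (Q-learning) update with separate sensitivity parameters for gains and losses and a softmax choice rule whose temperature controls choice randomness. All function names, parameter names, and task details (two options per pair, outcomes of +1, -1, or 0) are hypothetical.

```python
import math
import random


def softmax_choice_prob(q_a, q_b, temperature):
    """Probability of choosing option A over B under a two-option softmax rule.

    Higher temperature means more random (less value-driven) choices.
    """
    return 1.0 / (1.0 + math.exp(-(q_a - q_b) / temperature))


def update_value(q, outcome, learning_rate, reward_sens, punish_sens):
    """Delta-rule update; the outcome is scaled by a valence-specific sensitivity."""
    if outcome > 0:
        scaled = reward_sens * outcome   # gains weighted by reward sensitivity
    elif outcome < 0:
        scaled = punish_sens * outcome   # losses weighted by punishment sensitivity
    else:
        scaled = 0.0
    return q + learning_rate * (scaled - q)


def simulate_pair(n_trials, valence, p_good, learning_rate,
                  reward_sens, punish_sens, temperature, seed=0):
    """Simulate one option pair and return the fraction of 'better option' choices.

    Assumed (hypothetical) task structure:
      valence = +1: gain pair; the better option yields +1 with probability p_good,
                    the worse option with probability 1 - p_good (otherwise 0).
      valence = -1: loss pair; the better option yields -1 with probability
                    1 - p_good, the worse option with probability p_good.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]  # index 0 = better option, index 1 = worse option
    if valence > 0:
        p_outcome = [p_good, 1.0 - p_good]
    else:
        p_outcome = [1.0 - p_good, p_good]

    n_correct = 0
    for _ in range(n_trials):
        p_choose_better = softmax_choice_prob(q[0], q[1], temperature)
        choice = 0 if rng.random() < p_choose_better else 1
        outcome = float(valence) if rng.random() < p_outcome[choice] else 0.0
        q[choice] = update_value(q[choice], outcome, learning_rate,
                                 reward_sens, punish_sens)
        n_correct += int(choice == 0)
    return n_correct / n_trials
```

Under these assumptions, calling `simulate_pair(200, +1, 0.75, 0.3, reward_sens, 1.0, 0.2)` with different values of `reward_sens`, and the same call with `valence=-1`, gives a rough way to explore how valence-specific sensitivity and temperature shape learning in gain and loss contexts.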
Results: Behavioral results showed a significant reward learning deficit across BD subtypes compared to HC, captured at the computational level by a lower sensitivity to rewards than to punishments in both BD subtypes. Computational modeling also revealed higher choice randomness in BD-II than in BD-I, which reflected a tendency of BD-I patients to perform better than BD-II patients during punishment avoidance learning.
Limitations: Our patients were not naive to antipsychotic treatment and were not euthymic (but in syndromic remission) according to the International Society for Bipolar Disorder definition.
Conclusions: Our results are consistent with the reward hyposensitivity theory in BD. Computational modeling suggests distinct underlying mechanisms that produce similar observable behaviors, making it a useful tool for distinguishing how symptoms interact in BD versus other disorders. In the long run, a better understanding of these processes could contribute to better prevention and management of BD.
Keywords: Bipolar disorder; Computational biology; Learning; Punishment; Reinforcement; Reward.
Copyright © 2023 Elsevier B.V. All rights reserved.