Animals and humans replay neural patterns encoding trajectories through their environment, both whilst they solve decision-making tasks and during rest. Both on-task and off-task replay are believed to contribute to flexible decision making, though how their relative contributions differ remains unclear. We investigated this question by using magnetoencephalography (MEG) to study human subjects while they performed a decision-making task that was designed to reveal the decision algorithms employed. We characterised subjects in terms of how flexibly each adjusted their choices to changes in temporal, spatial and reward structure. The more flexible a subject, the more they replayed trajectories during task performance, and this replay was coupled with re-planning of the encoded trajectories. The less flexible a subject, the more they replayed previously preferred trajectories during rest periods between task epochs. The data suggest that online and offline replay both participate in planning but support distinct decision strategies.
Keywords: human; magnetoencephalography; model-based planning; model-free planning; neuroscience; reinforcement learning; replay.
Studies show that humans and animals replay past experiences in their brains. To do this, the brain creates a pattern of electrical activity for each part of a multistep experience and then plays these patterns back in order. Humans and other animals can replay scenarios either while the experience is still happening (i.e. online replay) or later when they are resting or sleeping (i.e. offline replay). Being able to replay an experience and its outcome may help a person or animal plan a better course of action in the future. However, it is poorly understood how online and offline replay each contribute to such planning.
To answer this question, Eldar et al. used a brain imaging tool called magnetoencephalography (MEG for short), which measures the magnetic fields generated by electrical activity inside the brain. This technique was able to detect replays in the brains of individuals performing a particular task, and later whilst they were resting. In the experiments, 40 healthy volunteers played a game in which each location in a space was associated with an image, for example a frog or a traffic sign, and each image was given a value. Participants got paid for moving to more valuable images in one or two steps.
Eldar et al. found that people who replay their steps during a task are able to adjust their choices on the fly, whereas individuals who replay their choices during rests tend to approach a task with a less flexible, more preformed plan. Eldar et al. suggest that replaying an experience too much during rest and not enough in real time might contribute to more rigid behaviors, a theory that could shed light on the mechanisms behind certain behavioral disorders such as obsessive-compulsive disorder. However, more studies are needed to determine whether these two different replay strategies play a causal role in human behavior.
© 2020, Eldar et al.