Animal behavior is shaped by a myriad of mechanisms acting on a wide range of scales, which hampers quantitative reasoning and the identification of general principles. Here, we combine data analysis and theory to investigate the relationship between behavioral plasticity and the heavy-tailed statistics often observed in animal behavior. Specifically, we first leverage high-resolution recordings of C. elegans locomotion to show that stochastic transitions among long-lived behaviors exhibit heavy-tailed first passage time distributions and correlation functions. Such heavy tails can be explained by slow adaptation of behavior over time. This result motivates our second step, in which we introduce a general model that separates fast dynamics on a quasi-stationary multi-well potential from non-ergodic, slowly varying modes. We then show that heavy tails generically emerge in such a model, and we provide a theoretical derivation of the resulting functional form, which can become a power law with exponents that depend on the strength of the fluctuations. Finally, we provide direct support for the generality of our findings by testing them in a C. elegans mutant in which adaptation is suppressed and heavy tails thus disappear, and in recordings of larval zebrafish swimming behavior, in which heavy tails are again prevalent.
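As a rough illustration of how slow variation of an energy landscape can produce heavy-tailed first passage times, the following toy sketch compares escape over a fixed barrier with escape over a barrier whose height is redrawn on each visit. It is not the model analyzed in this work. The Arrhenius-like rate, the exponentially distributed barrier fluctuations, and the names `dwell_times`, `tail_fraction`, `temperature`, `barrier_mean`, and `barrier_scale` are all assumptions made for illustration only.

```python
import numpy as np


def dwell_times(n, temperature=1.0, barrier_mean=2.0, barrier_scale=0.0, seed=0):
    """Sample n dwell times for escape over a barrier redrawn on each visit.

    Assumption: the escape rate is Arrhenius-like, rate = exp(-barrier / temperature).
    barrier_scale = 0 keeps the barrier fixed, giving exponential dwell times.
    barrier_scale > 0 adds exponentially distributed barrier fluctuations,
    a stand-in for slow modes that are frozen during each escape.
    """
    rng = np.random.default_rng(seed)
    barriers = np.full(n, float(barrier_mean))
    if barrier_scale > 0:
        barriers = barriers + rng.exponential(barrier_scale, size=n)
    rates = np.exp(-barriers / temperature)
    # One exponential waiting time per visit, with that visit's own rate.
    return rng.exponential(1.0 / rates)


def tail_fraction(times, multiple=20.0):
    """Fraction of dwell times longer than `multiple` times the median."""
    return float(np.mean(times > multiple * np.median(times)))


fixed = dwell_times(100_000, barrier_scale=0.0)
fluctuating = dwell_times(100_000, barrier_scale=1.0)
print("fixed barrier tail fraction:      ", tail_fraction(fixed))
print("fluctuating barrier tail fraction:", tail_fraction(fluctuating))
```

In this toy setting, mixing exponential waiting times over the distribution of rates gives a survival function that decays at long times as a power law with exponent `temperature / barrier_scale`, so the tail gets heavier as the barrier fluctuations grow relative to the fast noise. With a fixed barrier the tail is exponential and very long dwell times are essentially absent.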