What mechanisms distinguish interactive from non-interactive actions? To answer this question, we tested participants while they took turns playing music with a virtual partner: in the interactive joint action condition, participants played a melody together with their partner by grasping (C note) or pressing (G note) a cube-shaped instrument, with participant and partner alternately playing one note each. In the non-interactive control condition, players' behavior was not guided by a shared melody, so the partner's actions and notes were irrelevant to the participant. In both conditions, the participant's and partner's actions could be physically congruent (e.g., grasp-grasp) or incongruent (e.g., grasp-press), and the partner's association between actions and notes could be consistent with the participant's or reversed. Performance in the non-interactive condition was affected only by physical incongruence, whereas joint action performance was affected only when the partner's action-note associations were reversed. This shows that task interactivity shapes the sensorimotor coding of others' behaviors, and that joint action relies on active prediction of the partner's action effects rather than on passive action imitation. We suggest that these predictions are based on Dyadic Motor Plans, which represent both the agent's and the partner's contributions to the interaction goal, such as playing a melody together.