Past research on the integration of information in memory has typically required participants to attend to and/or commit to memory the stimuli conveying distinct features, making it difficult to examine whether feature pairings can be maintained involuntarily. To address this issue, the integration of voice and location information in auditory sensory memory was measured using a cross-modal oddball task, in which task-irrelevant auditory deviants are known to capture attention involuntarily. Participants categorized visual digits presented shortly after to-be-ignored sounds. These sounds consisted of the same phoneme played simultaneously in both ears but in different voices (female in one ear, male in the other). On most trials, the pairing of voice to location was constant (standard sound). On rare and unpredictable trials, the voices swapped locations (deviant sound). In line with past work on attention capture by auditory novelty, participants were significantly slower to judge the visual digits following the deviant sound, indicating that the links between voice and location had been encoded involuntarily in auditory memory. These results suggest that voices and locations are integrated in memory and that this binding occurs even when participants do not intend to commit any information to memory.