Despite notable variation in children's rates of success on theory of mind tasks, and despite the presumed theoretical implications drawn from a child's success or failure on such tasks, there have been no studies of the test-retest reliability of children's performance on them. Twenty-three children (mean age 49.6 months, SD 8.6) watched three videotaped stories illustrating a false-belief situation: the standard experimenter-narrated false-belief task, a minor variant in which the narration was replaced by a dialogue among the characters, and a third version involving a humorous situation. The time elapsed between test and retest was 2-3 weeks, and the order of presentation was counterbalanced. Results corroborated previous findings of a developmental trend in the understanding of false-belief questions. However, despite a general improvement in children's comprehension of the stories, the test-retest reliability for the false-belief questions was poor. Although changes between the test and retest sessions frequently took the form of children correctly answering questions they had previously failed, a subset of children incorrectly answered questions they had initially passed. These findings underscore the need to validate the techniques used to study children's developing theories of mind.
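To make the notion of test-retest reliability for pass/fail questions concrete, the following is a minimal sketch of one common agreement statistic for dichotomous outcomes, Cohen's kappa. The abstract does not say which reliability statistic the study used, so the choice of kappa is an assumption. The function names and the eight scores are hypothetical, invented purely for illustration; they are not data from the study.

```python
import math
from typing import Sequence, Tuple


def cohen_kappa(test: Sequence[int], retest: Sequence[int]) -> float:
    """Cohen's kappa for paired pass (1) / fail (0) outcomes.

    Assumption: kappa stands in for whatever reliability index a
    given study reports; it corrects raw agreement for chance.
    """
    if len(test) != len(retest) or len(test) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(test)
    # Proportion of children with the same outcome in both sessions.
    observed = sum(a == b for a, b in zip(test, retest)) / n
    # Agreement expected by chance from the two pass rates.
    p_test = sum(test) / n
    p_retest = sum(retest) / n
    expected = p_test * p_retest + (1 - p_test) * (1 - p_retest)
    if expected == 1.0:
        return math.nan  # kappa is undefined when nobody varies
    return (observed - expected) / (1 - expected)


def change_counts(test: Sequence[int], retest: Sequence[int]) -> Tuple[int, int]:
    """Return (fail-to-pass, pass-to-fail) counts between sessions."""
    gained = sum(a == 0 and b == 1 for a, b in zip(test, retest))
    lost = sum(a == 1 and b == 0 for a, b in zip(test, retest))
    return gained, lost


# Hypothetical scores for eight children on one false-belief question.
test_scores = [0, 0, 0, 0, 1, 1, 1, 1]
retest_scores = [1, 1, 0, 0, 1, 1, 1, 0]

print(cohen_kappa(test_scores, retest_scores))    # 0.25
print(change_counts(test_scores, retest_scores))  # (2, 1)
```

In this invented example the pass rate rises from 4 of 8 to 5 of 8, yet kappa is only 0.25, because three of the eight children changed their answer and one of those changes was from pass to fail. A group-level improvement can therefore coexist with low individual-level stability.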