Humans can quickly and accurately recognize objects within briefly presented natural scenes. Previous work has provided evidence that scene context contributes to this process, demonstrating improved naming of objects that were presented in semantically consistent scenes (e.g., a sandcastle on a beach) relative to semantically inconsistent scenes (e.g., a sandcastle on a football field). The current study investigated which processes underlie this scene consistency effect. Specifically, we tested: (1) whether the effect is due to increased visual feature and/or shape overlap for consistent relative to inconsistent scene-object pairs; and (2) whether the effect is mediated by attention to the background scene. Experiment 1 replicated the scene consistency effect reported previously (Davenport and Potter, 2004). Using a new, carefully controlled stimulus set, Experiment 2 showed that the scene consistency effect could not be explained by low-level feature or shape overlap between scenes and target objects. Experiments 3a and 3b investigated whether focused attention modulates the scene consistency effect. Using a location cueing manipulation, we correctly informed participants about the location of the target object on a proportion of trials, allowing them to deploy focused attention toward the target object. Importantly, the effect of scene consistency on target object recognition was independent of spatial attention: it was observed both when attention was focused on the target object and when attention was focused on the background scene. These results indicate that a semantically consistent scene context benefits object recognition independently of the focus of attention. We suggest that the scene consistency effect is primarily driven by global scene properties, or "scene gist", that can be processed with minimal attentional resources.
Keywords: high-level vision; natural scene processing; object recognition; semantic consistency; spatial attention.