Bowers et al. rightly emphasise that deep learning models often fail to capture constraints on visual perception that have been discovered by previous research. However, the solution is not to discard deep learning altogether, but to design stimuli and tasks that more closely reflect the problems that biological vision evolved to solve, such as understanding scenes and preparing skilled action.