The mammalian inner ear houses the vestibular and cochlear sensory organs, which sense balance and sound, respectively. These distinct sensory organs arise from a common prosensory region, but the mechanisms underlying their divergence remain elusive. Here, we show that two evolutionarily conserved homeobox genes, Irx3 and Irx5, are required for the patterning and segregation of the saccular and cochlear sensory domains, as well as for the formation of auditory sensory cells. Irx3/5 were highly expressed in the cochlea, and their deletion resulted in a significantly shortened cochlea and loss of the ductus reuniens, which bridges the vestibule and cochlea. Remarkably, ectopic vestibular hair cells replaced the greater epithelial ridge, a non-sensory structure of the cochlea. Moreover, most auditory sensory cells in the cochlea were transformed into hair cells of vestibular identity, with only a residual organ of Corti remaining in the mid-apical region of Irx3/5 double-knockout mice. Conditional temporal knockouts further revealed that Irx3/5 are essential for controlling cochlear sensory domain formation before embryonic day 14. Our findings demonstrate that Irx3/5 regulate the patterning of vestibular and cochlear sensory cells, providing insight into how the vestibular and cochlear sensory organs separate during mammalian inner ear development.