Phage tailspike proteins are depolymerases that target diverse bacterial surface glycans with high specificity, thereby determining the host specificity of numerous phages. Because their sequence diversity makes tailspike proteins difficult to identify, we developed SpikeHunter, an approach based on the ESM-2 protein language model. Using SpikeHunter, we identified 231,965 tailspike proteins from a dataset comprising 8,434,494 prophages found within 165,365 genomes of five common pathogens. Of these, 143,035 displayed strong associations with serotypes. Moreover, we observed highly similar tailspike proteins in species that share closely related serotypes. We found extensive domain swapping in all five species, with the C-terminal domain being significantly associated with host serotype, highlighting its role in host range determination. Our study presents a comprehensive cross-species analysis of associations between tailspike proteins and serotypes, providing insights applicable to phage therapy and biotechnology.