Prophages can have major clinical implications through their ability to change pathogenic bacterial traits. There is limited understanding of the prophage role in ecological, evolutionary, adaptive processes and pathogenicity of Helicobacter pylori, a widespread bacterium causally associated with gastric cancer. Inferring the exact prophage genomic location and completeness requires complete genomes. The international Helicobacter pylori Genome Project (HpGP) dataset comprises 1011 H. pylori complete clinical genomes enriched with epigenetic data. We thoroughly evaluated the H. pylori prophage genomic content in the HpGP dataset. We investigated population evolutionary dynamics through phylogenetic and pangenome analyses. Additionally, we identified genome rearrangements and assessed the impact of prophage presence on bacterial gene disruption and methylome. We found that 29.5% (298) of the HpGP genomes contain prophages, of which only 32.2% (96) were complete, minimizing the burden of prophage carriage. The prevalence of H. pylori prophage sequences was variable by geography and ancestry, but not by disease status of the human host. Prophage insertion occasionally results in gene disruption that can change the global bacterial epigenome. Gene function prediction allowed the development of the first model for lysogenic-lytic cycle regulation in H. pylori. We have disclosed new prophage inactivation mechanisms that appear to occur by genome rearrangement, merger with other mobile elements, and pseudogene accumulation. Our analysis provides a comprehensive framework for H. pylori prophage biological and genomics, offering insights into lysogeny regulation and bacterial adaptation to prophages.
Keywords: H. pylori; HpGP; genome rearrangement; mobile elements; phage cycle; prophage.