A Novel Approach to Helicobacter pylori Pan-Genome Analysis for Identification of Genomic Islands

PLoS One. 2016 Aug 9;11(8):e0159419. doi: 10.1371/journal.pone.0159419. eCollection 2016.

Abstract

Genomes of a given bacterial species can show great variation in gene content, and thus systematic analysis of the entire gene repertoire, termed the pan-genome, is important for understanding bacterial intra-species diversity, population genetics, and evolution. Here, we analyzed the pan-genome from 30 completely sequenced strains of the human gastric pathogen Helicobacter pylori belonging to various phylogeographic groups, focusing on 991 accessory (not fully conserved) orthologous groups (OGs). We developed a method to evaluate the mobility of genes within a genome, using the gene order in the syntenically conserved regions as a reference, and classified the 991 accessory OGs into five classes: Core, Stable, Intermediate, Mobile, and Unique. Phylogenetic networks based on the gene content of the Core and Stable classes are highly congruent with the network created from the concatenated alignment of fully conserved core genes, in contrast to those of the Intermediate and Mobile classes, which show quite different topologies. By clustering the accessory OGs on the basis of phylogenetic pattern similarity and chromosomal proximity, we identified 60 co-occurring gene clusters (CGCs). In addition to known genomic islands, including the cag pathogenicity island, bacteriophages, and integrating conjugative elements, we identified some novel islands. One island encodes a TerY-phosphorylation triad, which includes the eukaryote-type protein kinase/phosphatase gene pair, and components of a type VII secretion system. Another contains a reverse-transcriptase homolog, which may be involved in the defense against phage infection through altruistic suicide. Many of the CGCs contained restriction-modification (RM) genes. Different RM systems sometimes occupied the same (orthologous) locus in the strains. We anticipate that our method will facilitate pan-genome studies in general and help identify novel genomic islands in various bacterial species.
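
The abstract does not give the details of the clustering step, so the following is only a minimal sketch of the general idea of grouping accessory OGs by phylogenetic pattern similarity and chromosomal proximity. It assumes that a phylogenetic pattern is the set of strains carrying an OG, that similarity is measured with the Jaccard index, and that proximity is a difference in position along a reference gene order. The function names, the toy data, and the thresholds (min_sim, max_gap) are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of strain names."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_ogs(presence, position, min_sim=0.8, max_gap=5):
    """Single-linkage grouping of OGs (illustrative only).

    Two OGs are linked when their strain sets are similar (Jaccard >= min_sim)
    and their positions in a reference gene order differ by at most max_gap.
    Both thresholds are assumed values chosen for this sketch.
    """
    ogs = sorted(presence)
    parent = {og: og for og in ogs}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(ogs, 2):
        similar = jaccard(presence[a], presence[b]) >= min_sim
        near = abs(position[a] - position[b]) <= max_gap
        if similar and near:
            parent[find(a)] = find(b)

    groups = {}
    for og in ogs:
        groups.setdefault(find(og), []).append(og)
    # Report only groups with more than one member.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy input: strains carrying each OG, and each OG's reference position.
presence = {
    "OG1": {"s1", "s2", "s3"},
    "OG2": {"s1", "s2", "s3"},
    "OG3": {"s1", "s2"},
    "OG4": {"s4"},
    "OG5": {"s1", "s2", "s3"},
}
position = {"OG1": 10, "OG2": 11, "OG3": 12, "OG4": 13, "OG5": 500}

clusters = cluster_ogs(presence, position, min_sim=0.6, max_gap=5)
print(clusters)
```

In this toy input, OG5 has the same pattern as OG1 and OG2 but lies far away in the reference order, so it is not grouped with them. OG4 lies nearby but occurs in a different strain, so it is also excluded. Requiring both criteria is what separates a co-occurring cluster from genes that merely share a distribution or merely sit next to each other.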

MeSH terms

  • Chromosomes, Bacterial / genetics
  • DNA Transposable Elements / genetics
  • DNA, Bacterial / genetics
  • Genomic Islands / genetics*
  • Genomics / methods*
  • Helicobacter pylori / genetics*
  • Multigene Family / genetics
  • Phylogeny
  • RNA-Directed DNA Polymerase / genetics

Substances

  • DNA Transposable Elements
  • DNA, Bacterial
  • RNA-Directed DNA Polymerase

Grants and funding

This work was carried out under the Cooperative Research Program of the National Institute for Basic Biology [No. 13-359]. This work was supported by the National Bioscience Database Center, Japan Science and Technology Agency to IU, by KAKENHI from the Japan Society for the Promotion of Science (JSPS) (grant no. 25291080), by KAKENHI from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) (grant nos. 24113506 and 26113704), by the Global COE (Center of Excellence) Project of Genome Information Big Bang from MEXT, by the Programme for Promotion of Basic and Applied Researches for Innovations in Bio-oriented Industry from the Bio-oriented Technology Research Advancement Institution (grant no. 121205003001002100019), and by the Science and technology research promotion program for agriculture, forestry, fisheries and food industry from the Ministry of Agriculture, Forestry, and Fisheries (grant no. 26025A) to IK. M.F. and K.Y. are JSPS Research Fellows. J.A. is an MIT-Japan program fellow. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.