A novel termini analysis theory using HTS data alone for the identification of Enterococcus phage EF4-like genome termini

BMC Genomics. 2015 May 28;16(1):414. doi: 10.1186/s12864-015-1612-3.

Abstract

Background: Enterococcus faecalis and Enterococcus faecium are typical enterococcal bacterial pathogens. Because of antibiotic resistance, the identification of novel E. faecalis and E. faecium phages against antibiotic-resistant Enterococcus has an important impact on public health. In this study, the E. faecalis phage IME-EF4, the E. faecium phage IME-EFm1, and both their hosts were antibiotic resistant. To characterize the genome termini of these two phages, a termini analysis theory was developed that provides terminal sequence information directly, using only high-throughput sequencing (HTS) read frequency statistics.

Results: The complete genome sequences of phages IME-EF4 and IME-EFm1 were determined, and our termini analysis theory was used to determine the genome termini of these two phages. Analysis of HTS read frequencies revealed 9 nt 3' protruding cohesive ends in both the IME-EF4 and IME-EFm1 genomes. On the positive strands of their genomes, these cohesive ends are 5'-TCATCACCG-3' (IME-EF4) and 5'-GGGTCAGCG-3' (IME-EFm1). These results were confirmed experimentally by mega-primer polymerase chain reaction sequencing, terminal run-off sequencing, and adaptor ligation followed by run-off sequencing.

Conclusion: Using this termini analysis theory, the termini of two newly isolated antibiotic-resistant Enterococcus phages, IME-EF4 and IME-EFm1, were identified as a byproduct of HTS. Molecular biology experiments confirmed the identification. Because it does not require time-consuming wet lab termini analysis experiments, the termini analysis theory is a fast and easy means of identifying phage DNA genome termini using HTS read frequency statistics alone. It may aid understanding of phage DNA packaging.
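
The abstract does not spell out the statistic behind "HTS read frequency statistics", so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that reads beginning at a fixed genome terminus pile up at one coordinate far more often than reads beginning at an arbitrary position. The function name `candidate_termini`, the `fold` threshold, and the toy data are all hypothetical, and reads from the two strands would need to be counted separately in practice.

```python
from collections import Counter


def candidate_termini(read_starts, genome_length, fold=100.0):
    """Return (position, count) pairs whose read-start count is at least
    `fold` times the average number of read starts per genome position.

    read_starts   : iterable of 0-based coordinates where mapped reads begin
                    (one strand only; an assumption of this sketch)
    genome_length : length of the assembled genome in nucleotides
    fold          : hypothetical enrichment cutoff, not taken from the paper
    """
    read_starts = list(read_starts)
    counts = Counter(read_starts)
    # Average number of reads starting at any one position.
    mean_per_position = len(read_starts) / genome_length
    threshold = fold * mean_per_position
    return sorted((pos, n) for pos, n in counts.items() if n >= threshold)


# Toy data: one read starting at every position of a 1000 nt genome,
# plus 500 extra reads that all start at position 42.
toy_starts = list(range(1000)) + [42] * 500
print(candidate_termini(toy_starts, genome_length=1000))
```

On the toy data only position 42 clears the cutoff, which mimics how a fixed terminus would stand out against the background of randomly fragmented reads.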

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteriophages / genetics*
  • Drug Resistance, Bacterial
  • Enterococcus / isolation & purification
  • Enterococcus / virology*
  • Enterococcus faecalis / isolation & purification
  • Enterococcus faecalis / virology
  • Enterococcus faecium / isolation & purification
  • Enterococcus faecium / virology
  • Genome, Viral
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Sequence Analysis, DNA / methods*
  • Terminal Repeat Sequences*