Species-specific typing of DNA based on palindrome frequency patterns

DNA Res. 2011 Apr;18(2):117-24. doi: 10.1093/dnares/dsr004. Epub 2011 Mar 23.

Abstract

DNA in its natural, double-stranded form may contain palindromes: sequences that read the same from either end because they are identical to their reverse complement on the sister strand. Short palindromes are underrepresented in all kinds of genomes. The frequency distribution of short palindromes exhibits more than twice the inter-species variance of non-palindromic sequences, which renders palindromes optimally suited for the typing of DNA. Here, we show that, based on palindrome frequency, DNA sequences can be discriminated to the level of species of origin. By plotting the ratios of actual occurrence to expectancy, we generate palindrome frequency patterns that allow different sequences of the same genome to be clustered and plasmids, and in some cases even viruses, to be assigned to their respective host genomes. This finding will be of use in the growing field of metagenomics.
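The ratio of actual occurrence to expectancy described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how expectancy is calculated, so the model here is an assumption: the expected count of a word is the number of windows multiplied by the product of its single-base frequencies. The function names (`reverse_complement`, `is_palindrome`, `palindrome_ratios`) and the default word length `k=4` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_palindrome(seq):
    """A DNA palindrome is identical to its own reverse complement."""
    return seq == reverse_complement(seq)


def palindrome_ratios(dna, k=4):
    """Map each palindromic k-mer to its observed/expected count ratio.

    ASSUMPTION: expectancy uses a zero-order model (independent bases
    with the sequence's own base composition). The source abstract does
    not specify its expectancy model.
    """
    dna = dna.upper()
    base_counts = Counter(b for b in dna if b in "ACGT")
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    base_freq = {b: base_counts[b] / total for b in "ACGT"}

    # Overlapping windows of length k; skip windows with ambiguous bases.
    windows = (dna[i:i + k] for i in range(len(dna) - k + 1))
    valid = [w for w in windows if set(w) <= set("ACGT")]
    observed = Counter(valid)

    ratios = {}
    for word in map("".join, product("ACGT", repeat=k)):
        if not is_palindrome(word):
            continue
        expected = float(len(valid))
        for base in word:
            expected *= base_freq[base]
        ratios[word] = observed[word] / expected if expected > 0 else float("nan")
    return ratios


if __name__ == "__main__":
    pattern = palindrome_ratios("ACGT" * 10, k=4)
    for word in sorted(pattern):
        print(word, round(pattern[word], 2))
```

Under these assumptions, the resulting dictionary is one "palindrome frequency pattern" per sequence. Patterns from different sequences could then be compared or clustered with any standard distance measure, although the specific clustering procedure used in the paper is not given in the abstract.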

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Caenorhabditis elegans / genetics
  • Cluster Analysis
  • DNA / classification*
  • DNA / genetics*
  • DNA Fingerprinting*
  • Genetic Variation
  • Genome / genetics
  • Inverted Repeat Sequences / genetics*
  • Metagenomics
  • Species Specificity
  • Yeasts / genetics

Substances

  • DNA