Linguistics of nucleotide sequences. II: Stationary words in genetic texts and the zonal structure of DNA

J Biomol Struct Dyn. 1989 Apr;6(5):1027-38. doi: 10.1080/07391102.1989.10506529.

Abstract

Words are irregularly distributed in genetic texts. The analysis of this irregularity leads to the notion of stationary and non-stationary words. The polyW and polyS tracts are shown to be the most non-stationary words in genetic texts (here W = [A,T] and S = [G,C]; a polyW tract is a sequence of A,T nucleotides and a polyS tract is a sequence of G,C nucleotides). The distribution of stationary words suggests a method for partitioning DNA into zones. The zones obtained in the case of the phage are interpreted in the light of the Dowe hypothesis of the modular structure of bacteriophage genomes.
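
To make the tract definitions above concrete, the following minimal Python sketch locates maximal polyW and polyS tracts in a sequence and counts how often a word starts in each consecutive window. The abstract does not state the stationarity criterion, so the window counts are only an illustrative proxy for how unevenly a word is distributed, not the authors' method. The function names `find_tracts` and `window_counts`, the `min_len` threshold, and the fixed window size are all assumptions introduced for illustration.

```python
import re


def find_tracts(seq, min_len=3):
    """Return maximal polyW / polyS tracts as (kind, start, tract) tuples.

    Assumption: W = {A, T}, S = {G, C}; a tract is a maximal run drawn from
    one class, at least min_len long (min_len is a hypothetical threshold).
    """
    pattern = re.compile(r"[AT]{%d,}|[GC]{%d,}" % (min_len, min_len))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        kind = "polyW" if match.group()[0] in "AT" else "polyS"
        tracts.append((kind, match.start(), match.group()))
    return tracts


def window_counts(seq, word, window):
    """Count occurrences of word (overlaps allowed) starting in each window.

    Strongly uneven counts across windows hint at a non-stationary word;
    this is an illustrative proxy only, not the criterion from the paper.
    """
    if not word or window <= 0:
        raise ValueError("word must be non-empty and window must be positive")
    seq, word = seq.upper(), word.upper()
    n_windows = (len(seq) + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    start = seq.find(word)
    while start != -1:
        counts[start // window] += 1
        start = seq.find(word, start + 1)  # step by 1 to allow overlaps
    return counts


if __name__ == "__main__":
    print(find_tracts("AATTTGCGCGACGT"))
    print(window_counts("ATATGGGGATAT", "AT", 4))
```

In this toy input the word AT occurs only in the first and last windows, which is the kind of irregular distribution the abstract refers to; a real analysis would need genome-scale sequences and a statistical test for the variation between windows.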

MeSH terms

  • Adenoviridae / genetics
  • Bacteriophage lambda / genetics
  • Base Sequence*
  • DNA* / ultrastructure
  • Escherichia coli / genetics
  • Genes
  • Nucleotides*
  • T-Phages / genetics

Substances

  • Nucleotides
  • DNA