Simple diagnostic statistical tests of models for DNA substitution

J Mol Evol. 1993 Dec;37(6):650-61. doi: 10.1007/BF00182751.

Abstract

The accuracy of models for DNA substitution used in phylogenetic analyses is becoming more important with the increasing availability and analysis of molecular sequence data. It is natural to look for ways of improving these models, and to do this in a planned manner it is useful to identify features of the sequences that the models may not describe adequately. In this paper, I describe three statistics that may give useful diagnostic information on departures from the models' predictions. The distributions of these statistics are discussed and simple significance tests are derived. These tests are based on the (estimated) phylogeny of the sequences and so have the advantage of using the information contained in this tree. Examples are given of the application of the new tests to Markov chain models describing the evolution of primate pseudogene sequences and small-subunit RNA sequences.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • DNA / genetics
  • Globins / genetics
  • Humans
  • Models, Genetic*
  • Molecular Sequence Data
  • Mutation*
  • Phylogeny*
  • Pseudogenes
  • RNA, Small Nuclear / genetics
  • Statistics as Topic* / methods

Substances

  • RNA, Small Nuclear
  • Globins
  • DNA