A simple multiallele model and its application to identifying preferred-unpreferred codons using polymorphism data

Mol Biol Evol. 2010 Jun;27(6):1327-37. doi: 10.1093/molbev/msq023. Epub 2010 Jan 27.

Abstract

Analysis of within-species polymorphism data usually relies on population genetic models that assume two alleles at a locus (e.g., the infinite sites model). However, many problems of interest can be tackled more naturally by multiallele models. In this study, I construct a model that can accommodate an arbitrary number of alleles at a locus, mutational biases, and selective differences between each of the alleles. It is constructed by representing population dynamics by a Markov transition matrix and is based on the assumption that at most two variants exist at each polymorphic site. A likelihood-based method for inferring the selection and mutational parameters of the model is constructed and is shown to have high accuracy. I use this method to jointly infer preferred codons and mutational parameters in Drosophila melanogaster. Twenty-one codons are identified as preferred, 19 of which were found previously by methods that do not use polymorphism data. Interestingly, the selective difference between the fittest and the worst codons encoding the same amino acid is positively correlated with the number of synonymous codons for that amino acid, in agreement with previous analyses of interspecies data using phylogenetic models. The inferred mutation matrix is highly asymmetric, with C→T and G→A being the most common and constituting approximately 18% and 19% of all mutation events, respectively. These results suggest that the new model provides a useful framework for analyzing polymorphism data sampled from multiallele systems.
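To make the idea of a multiallele model with mutational bias and selective differences concrete, here is a minimal sketch. It is not the method of the paper: it assumes the textbook weak-mutation limit, in which the population is almost always fixed for one allele and moves between alleles at rate N × (mutation rate) × (fixation probability). The function names, the haploid population size N, and the example parameter values are all hypothetical and chosen for illustration only.

```python
import numpy as np

def fixation_prob(s, N):
    """Diffusion approximation for the fixation probability of a single new
    mutant with selective advantage s in a haploid population of size N
    (assumption: haploid Wright-Fisher population)."""
    if abs(s) < 1e-12:
        return 1.0 / N  # neutral limit
    return np.expm1(-2.0 * s) / np.expm1(-2.0 * N * s)

def substitution_matrix(mu, s, N):
    """Rate matrix Q between K alleles in the weak-mutation limit.
    mu[i, j] is the (relative) mutation rate from allele i to allele j,
    s[i] is the selection coefficient of allele i."""
    K = len(s)
    Q = np.zeros((K, K))
    for i in range(K):
        for j in range(K):
            if i != j:
                Q[i, j] = N * mu[i, j] * fixation_prob(s[j] - s[i], N)
        Q[i, i] = -Q[i].sum()  # rows of a rate matrix sum to zero
    return Q

def stationary(Q):
    """Stationary distribution pi solving pi Q = 0 with sum(pi) = 1."""
    K = Q.shape[0]
    # Rescale Q so the linear system is well conditioned, then append
    # the normalisation constraint as an extra equation.
    A = np.vstack([Q.T / np.abs(Q).max(), np.ones(K)])
    b = np.zeros(K + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Hypothetical example: four synonymous codons, unbiased mutation,
# two of them weakly favoured.
mu = np.ones((4, 4))
s = np.array([0.0, 1e-4, 2e-4, 0.0])
pi = stationary(substitution_matrix(mu, s, N=1000))
print(pi)
```

With unbiased (symmetric) mutation, the stationary frequencies in this sketch are proportional to exp(2(N−1)s), so fitter alleles are fixed more often; making `mu` asymmetric shifts the frequencies further, which is why selection and mutational bias need to be estimated jointly.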

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles*
  • Animals
  • Codon*
  • Databases, Genetic
  • Drosophila melanogaster / genetics
  • Genome, Insect
  • Models, Genetic*
  • Mutation
  • Polymorphism, Genetic*
  • Selection, Genetic

Substances

  • Codon