Sequence-dependent prediction of recombination hotspots in Saccharomyces cerevisiae

J Theor Biol. 2012 Jan 21:293:49-54. doi: 10.1016/j.jtbi.2011.10.004. Epub 2011 Oct 12.

Abstract

Meiotic recombination does not occur randomly across the genome; it occurs at relatively high frequencies in some genomic regions (hotspots) and at relatively low frequencies in others (coldspots). Identifying hotspots and coldspots would shed light on the mechanism of recombination, but their accurate prediction remains an open question. In this study, we present a model that predicts hotspots and coldspots in yeast using increment of diversity combined with quadratic discriminant analysis (IDQD), based on sequence k-mer frequencies. Five-fold cross-validation showed a total prediction accuracy of 80.3%. Compared with other machine-learning algorithms, the IDQD approach is as powerful as random forest (RF) and outperforms support vector machine (SVM) in identifying hotspots and coldspots. We also predicted increased recombination rates in the regions upstream of transcription start sites and downstream of transcription termination sites. Additionally, the genome-wide recombination map of yeast obtained with the IDQD model is in close agreement with the experimentally generated map, especially at peak locations, although some fine-scale differences exist. Our results highlight the sequence dependency of recombination.
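
As an illustration of the kind of feature the abstract describes, the following minimal sketch counts k-mers in a sequence and computes an increment of diversity between two count vectors. The abstract does not state the formula; the definition used here, D(X) = N ln N - sum of n_i ln n_i and ID(X, S) = D(X + S) - D(X) - D(S), is the commonly cited one and is an assumption about this paper. The function names are hypothetical, and the quadratic discriminant analysis step is not shown.

```python
from collections import Counter
from itertools import product
from math import log


def kmer_counts(seq, k):
    """Count overlapping k-mers over the alphabet ACGT, in a fixed order.

    Windows containing other characters (for example N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get(kmer, 0) for kmer in kmers]


def diversity(counts):
    """Assumed diversity measure: N ln N minus the sum of n_i ln n_i."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * log(total) - sum(n * log(n) for n in counts if n > 0)


def increment_of_diversity(x, s):
    """Assumed definition: ID(X, S) = D(X + S) - D(X) - D(S).

    The value is 0 when the two count vectors are identical and grows
    as their k-mer compositions diverge.
    """
    merged = [a + b for a, b in zip(x, s)]
    return diversity(merged) - diversity(x) - diversity(s)
```

In a classifier of this type, a query sequence would typically be scored against a pooled hotspot count vector and a pooled coldspot count vector, and the resulting values passed on as features; whether the paper follows exactly this arrangement is not stated in the abstract.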

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Discriminant Analysis
  • Genetic Variation
  • Genome, Fungal
  • Meiosis / genetics
  • Models, Genetic*
  • Recombination, Genetic*
  • Saccharomyces cerevisiae / genetics*
  • Transcription, Genetic