Evaluating putative chimeric sequences from PCR-amplified products

Bioinformatics. 2005 Feb 1;21(3):333-7. doi: 10.1093/bioinformatics/bti008. Epub 2004 Sep 3.

Abstract

Motivation: PCR amplification of highly homologous genes from complex DNA mixtures is known to generate a significant proportion of chimeric sequences. Ribosomal RNA genes are used for microbial species detection and identification in natural environments, and current assessments of microbial diversity are based on these sequences. Thus, chimeric sequences could lead to the discovery of non-existent microbial species and false diversity estimates.

Methods: In essence, the only information available for deciding whether a sequence is chimeric comes from comparing it with known, non-chimeric sequences. Putative chimeric sequences were analyzed in fragments of a selected length (referred to as words) by comparing nucleotides at corresponding positions. For each word, the distances between reference sequences (closely related to the tested sequence) were compared with the differences introduced by the tested sequence. The proposed strategy thus accounts for the actual variability in different regions along the analyzed sequences, and provides an efficient way to evaluate putative chimeric sequences.
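
The word-by-word comparison described above can be illustrated with a minimal Python sketch. It is not the Ccode implementation and does not reproduce its statistical evaluation. It assumes that the three sequences are already aligned and of equal length, that words are non-overlapping, and that the distance is a plain mismatch count. The names hamming and word_profile and the toy sequences are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length aligned fragments differ."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return sum(x != y for x, y in zip(a, b))


def word_profile(
    query: str, ref_a: str, ref_b: str, word_len: int
) -> List[Tuple[int, int, int, int]]:
    """Return one row per word: (start, d(ref_a, ref_b), d(query, ref_a), d(query, ref_b)).

    Assumption: sequences are pre-aligned and words do not overlap.
    Trailing positions that do not fill a whole word are ignored.
    """
    if not (len(query) == len(ref_a) == len(ref_b)):
        raise ValueError("all sequences must be aligned to the same length")
    if word_len <= 0:
        raise ValueError("word_len must be positive")

    rows = []
    for start in range(0, len(query) - word_len + 1, word_len):
        end = start + word_len
        q, a, b = query[start:end], ref_a[start:end], ref_b[start:end]
        rows.append((start, hamming(a, b), hamming(q, a), hamming(q, b)))
    return rows


# Toy data (hypothetical): the query is the first half of REF_A
# joined to the second half of REF_B, i.e. an artificial chimera.
REF_A = "ACGTACGTACGT"
REF_B = "TCGAAGGTTCGA"
QUERY = REF_A[:6] + REF_B[6:]

for row in word_profile(QUERY, REF_A, REF_B, word_len=4):
    print(row)
```

In this toy case the query matches one reference in the first words and the other reference in the last word, which is the kind of pattern a chimera would be expected to produce. Whether such a shift is meaningful depends on how much the references differ from each other in the same word, which is why the reference-to-reference distance is reported alongside the query distances.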

Availability: A program implementing the above procedure, Chimera and Cross-Over Detection and Evaluation (Ccode), is available at http://www.irnase.csic.es/users/jmgrau/index.html and http://www.rtphc.csic.es/download.html.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Base Sequence
  • Chimera / genetics*
  • Molecular Sequence Data
  • Polymerase Chain Reaction / methods*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Sequence Homology, Nucleic Acid
  • Software