Fast and accurate probe selection algorithm for large genomes

Proc IEEE Comput Soc Bioinform Conf. 2003:2:65-74.

Abstract

Oligo microarray (DNA chip) technology has had a significant impact on genomic studies in recent years. Many fields, such as gene discovery, drug discovery, toxicological research, and disease diagnosis, stand to benefit from its use. A microarray is an orderly arrangement of thousands of DNA fragments in which each fragment is a probe (or fingerprint) of a gene/cDNA. Each probe must be uniquely associated with a particular gene/cDNA; otherwise, the performance of the microarray is affected. Existing algorithms usually select probes using the criteria of homogeneity, sensitivity, and specificity, and they improve efficiency by employing heuristics. Such approaches reduce accuracy. Instead, we use smart filtering techniques to avoid redundant computation while maintaining accuracy. Based on the new algorithm, optimal short (20 bases) or long (50 or 70 bases) probes can be computed efficiently for large genomes.
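
To make the selection criteria concrete, the following is a minimal sketch of a naive probe filter. It is not the algorithm of this paper, whose filtering techniques are not described in the abstract. The sketch assumes that specificity can be approximated by exact-match uniqueness of a length-k window across all genes, and homogeneity by a GC-content range. It ignores reverse complements, near matches, and melting temperature. The names `select_probes`, `gc_min`, and `gc_max` are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield (position, window) for every length-k window of seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def gc_fraction(s):
    """Fraction of bases in s that are G or C."""
    return sum(c in "GC" for c in s) / len(s)


def select_probes(genes, k, gc_min=0.4, gc_max=0.6):
    """Return, per gene, the windows that occur in no other gene
    (assumed stand-in for specificity) and whose GC fraction lies in
    [gc_min, gc_max] (assumed stand-in for homogeneity)."""
    # Index every window by the set of genes that contain it.
    owners = defaultdict(set)
    for name, seq in genes.items():
        for _, window in kmers(seq, k):
            owners[window].add(name)

    probes = {}
    for name, seq in genes.items():
        probes[name] = [
            (i, window)
            for i, window in kmers(seq, k)
            if owners[window] == {name}
            and gc_min <= gc_fraction(window) <= gc_max
        ]
    return probes


# Toy input; real probes would use k = 20, 50, or 70 as in the abstract.
genes = {"g1": "ACGTACGGTT", "g2": "TTACGTAGCA"}
candidates = select_probes(genes, k=4)
```

Building the index once keeps the uniqueness check to a single dictionary lookup per window, but the index holds every distinct window in memory, which is the kind of cost a method aimed at large genomes would need to reduce.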

MeSH terms

  • Algorithms*
  • Base Sequence
  • Chromosome Mapping / methods*
  • DNA Probes / genetics*
  • Equipment Design
  • Equipment Failure Analysis
  • Genome / genetics*
  • Models, Genetic
  • Models, Statistical
  • Molecular Sequence Data
  • Oligonucleotide Array Sequence Analysis / instrumentation*
  • Oligonucleotide Array Sequence Analysis / methods
  • Pattern Recognition, Automated / methods
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*

Substances

  • DNA Probes