Revealing hidden interval graph structure in STS-content data

Bioinformatics. 1999 Apr;15(4):278-85. doi: 10.1093/bioinformatics/15.4.278.

Abstract

Motivation: STS-content data for genomic mapping contain numerous errors and anomalies resulting in cross-links among distant regions of the genome. Identification of contigs within the data is an important and difficult problem.

Results: This paper introduces a graph algorithm that creates a simplified view of STS-content data. The shape of the resulting structure graph provides a quality check: coherent data produce a straight line, whereas anomalous data produce branches and loops. In the latter case, it is sometimes possible to disentangle the various paths into subsets of the data covering contiguous regions of the genome, i.e. contigs. These straight subgraphs can then be analyzed in standard ways to construct a physical map. A theoretical basis for the method is presented, along with examples of its application to current STS data from human genome centers.
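
The abstract does not spell out how the structure graph is built, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that clones are linked whenever they share an STS, that the clones are layered by breadth-first distance from one apparent end of the data, and that each layer is split into connected groups. The function names and the toy clone sets are hypothetical. One group per layer is consistent with a straight line, while several groups in a layer hint at a branch or loop.

```python
from collections import deque
from itertools import combinations


def overlap_graph(clones):
    """Link two clones when they share at least one STS (assumed rule)."""
    adj = {name: set() for name in clones}
    for a, b in combinations(clones, 2):
        if clones[a] & clones[b]:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def bfs_distances(adj, start):
    """Breadth-first distance from start to every reachable clone."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def count_groups(adj, members):
    """Number of connected groups among the clones of one layer."""
    seen, groups = set(), 0
    for name in members:
        if name in seen:
            continue
        groups += 1
        stack = [name]
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            stack.extend((adj[u] & members) - seen)
    return groups


def structure_profile(clones):
    """Groups per BFS layer, for the component holding the first clone."""
    adj = overlap_graph(clones)
    first = next(iter(adj))
    d0 = bfs_distances(adj, first)
    end = max(d0, key=d0.get)  # a clone farthest from the first one
    dist = bfs_distances(adj, end)
    layers = {}
    for name, d in dist.items():
        layers.setdefault(d, set()).add(name)
    return [count_groups(adj, layers[d]) for d in sorted(layers)]


# Hypothetical toy data: clone name -> set of STS markers it contains.
linear = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}, "D": {4, 5}}
# Clone E mixes STS 3 with an unrelated STS 9, creating a side branch.
branched = dict(linear, E={3, 9}, F={9, 10})

print(structure_profile(linear))
print(structure_profile(branched))
```

This layering is deliberately crude. A short clone nested inside a much longer one can also split a layer even when the data are coherent, so a profile with values above 1 is a prompt for closer inspection rather than proof of an anomaly.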

Availability: Freely available on request.

MeSH terms

  • Algorithms*
  • Contig Mapping*
  • Databases, Factual*
  • Humans
  • Neural Networks, Computer*
  • Sequence Tagged Sites*