Combining pattern discovery and discriminant analysis to predict gene co-regulation

N Simonis; S J Wodak; G N Cohen; J van Helden

doi:10.1093/bioinformatics/bth252

Bioinformatics. 2004 Oct 12;20(15):2370-9. doi: 10.1093/bioinformatics/bth252. Epub 2004 Apr 8.

Authors

N Simonis¹, S J Wodak, G N Cohen, J van Helden

Affiliation

¹ Service de Conformation des Macromolécules Biologiques et Bioinformatique, Centre de Biologie Structurale et Bioinformatique, CP 263, Université Libre de Bruxelles, Bld. du Triomphe B-1050 Bruxelles, Belgium.

PMID: 15073004
DOI: 10.1093/bioinformatics/bth252

Abstract

Motivation: Several pattern discovery methods have been proposed to detect over-represented motifs in upstream sequences of co-regulated genes; they are used, for example, to predict cis-acting elements from clusters of co-expressed genes. The clusters to be analyzed are often noisy, containing a mixture of co-regulated and non-co-regulated genes. We propose a method to discriminate co-regulated from non-co-regulated genes on the basis of counts of pattern occurrences in their non-coding sequences.
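
As an illustration of what "over-represented" can mean for a string pattern, the sketch below counts occurrences of one pattern in a set of upstream sequences and scores the count with a binomial upper tail. The abstract does not state which statistic the authors use, so the binomial model, the function names and the `background_prob` parameter are all assumptions made for this example.

```python
from math import comb


def count_occurrences(seq, pattern):
    """Count occurrences of pattern in seq on one strand, overlaps included."""
    width = len(pattern)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == pattern)


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def overrepresentation(sequences, pattern, background_prob):
    """Return (observed count, binomial P-value) for one pattern.

    Assumption: every start position is an independent trial with success
    probability background_prob. Overlapping occurrences violate this, so
    the P-value is only a rough score.
    """
    width = len(pattern)
    trials = sum(max(len(s) - width + 1, 0) for s in sequences)
    observed = sum(count_occurrences(s, pattern) for s in sequences)
    return observed, binomial_tail(observed, trials, background_prob)
```

A call such as `overrepresentation(upstream_seqs, "CACGTG", 0.0005)` returns the observed count and its tail probability; the sequence list and the background probability here are placeholders, and in practice the background would be estimated from a reference set of non-coding sequences.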

Methods: String-based pattern discovery is combined with discriminant analysis to classify genes on the basis of putative regulatory motifs.
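
The combination described above can be sketched as follows: each gene is represented by a vector of pattern counts, and a two-class linear discriminant separates co-regulated from other genes. This is a minimal sketch, not the authors' implementation. It assumes exactly two patterns (so the covariance matrix is 2x2), equal class priors and a pooled covariance, and all function names are hypothetical.

```python
def count_vector(seq, patterns):
    """Return the number of occurrences of each pattern in seq (one strand)."""
    return [
        sum(1 for i in range(len(seq) - len(p) + 1) if seq[i:i + len(p)] == p)
        for p in patterns
    ]


def mean(rows):
    """Column-wise mean of a list of equal-length count vectors."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]


def pooled_covariance(class_a, class_b):
    """2x2 pooled within-class covariance of two sets of 2-feature vectors."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (class_a, class_b):
        mu = mean(rows)
        for r in rows:
            d = [r[0] - mu[0], r[1] - mu[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(class_a) + len(class_b) - 2
    return [[v / dof for v in row] for row in s]


def fit_lda(class_a, class_b):
    """Fit a Fisher linear discriminant; return (weights, threshold)."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    (a, b), (c, d) = pooled_covariance(class_a, class_b)
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular covariance: features are collinear or constant")
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    # w = inverse(covariance) * (mu_a - mu_b), written out for the 2x2 case
    w = [(d * diff[0] - b * diff[1]) / det, (-c * diff[0] + a * diff[1]) / det]
    # With equal priors, the decision boundary passes through the midpoint
    # of the two class means.
    mid = [(mu_a[0] + mu_b[0]) / 2, (mu_a[1] + mu_b[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]


def classify(x, w, threshold):
    """Assign a count vector to the class on its side of the boundary."""
    score = w[0] * x[0] + w[1] * x[1]
    return "co-regulated" if score > threshold else "other"
```

To use it, build one count vector per gene with `count_vector`, fit on genes with known labels (for instance an annotated regulon against a random selection), then call `classify` on the count vector of each gene in a noisy cluster. Handling more than two patterns would require a general matrix solve in place of the explicit 2x2 inverse.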

Results: The approach is evaluated by comparing the significance of patterns detected in annotated regulons (positive control), random gene selections (negative control) and high-throughput regulons (noisy data) from the yeast Saccharomyces cerevisiae. The classification is evaluated on the annotated regulons, and the robustness and rejection power are assessed with mixtures of co-regulated and random genes.

Publication types

Comparative Study
Evaluation Study
Research Support, Non-U.S. Gov't
Validation Study

MeSH terms

Algorithms*
Computer Simulation
Discriminant Analysis
Gene Expression Regulation / physiology*
Genes, Regulator / genetics*
Models, Genetic
Models, Statistical
Pattern Recognition, Automated / methods*
Saccharomyces cerevisiae / genetics
Saccharomyces cerevisiae / metabolism
Saccharomyces cerevisiae Proteins / genetics*
Sequence Alignment / methods*
Sequence Analysis, DNA / methods*
Transcription Factors / genetics

Substances

Saccharomyces cerevisiae Proteins
Transcription Factors