Motivation: Microarray technology provides access to expression levels of thousands of genes at once, producing large amounts of data. These datasets are valuable only if they are annotated by sufficiently detailed experiment descriptions. However, in many databases a substantial number of these annotations is in free-text format and not readily accessible to computer-aided analysis.
Results: The Multi-Conditional Hybridization Intensity Processing System (M-CHIPS), a data warehousing concept, focuses on providing both structure and algorithms suitable for statistical analysis of a microarray database's entire contents including the experiment annotations. It addresses the rapid growth of the amount of hybridization data, more detailed experimental descriptions, and new kinds of experiments in the future. We have developed a storage concept, a particular instance of which is an organism-specific database. Although these databases may contain different ontologies of experiment annotations, they share the same structure and therefore can be accessed by the very same statistical algorithms. Experiment ontologies have not yet reached their final shape, and standards are reduced to minimal conventions that do not yet warrant extensive description. An ontology-independent structure enables updates of annotation hierarchies during normal database operation without altering the structure.
Availability and supplementary information: http://www.dkfz.de/tbi/services/mchips