Statistical process control for large scale microarray experiments

Bioinformatics. 2002;18 Suppl 1:S155-63. doi: 10.1093/bioinformatics/18.suppl_1.s155.

Abstract

Motivation: Maintaining and controlling data quality is a key problem in large-scale microarray studies. In particular, systematic changes in experimental conditions across multiple chips can seriously affect quality and even lead to false biological conclusions. Traditionally, the influence of these effects can be minimized only by expensive repeated measurements, because a detailed understanding of all process-relevant parameters seems impossible.

Results: We introduce a novel method for microarray process control that estimates quality based solely on the distribution of the actual measurements, without requiring repeated experiments. A robust version of principal component analysis detects single outlier microarrays and thereby enables the use of techniques from multivariate statistical process control. In particular, the T(2) control chart reliably tracks undesired changes in process-relevant parameters. This can be used to improve the microarray process itself, limits necessary repetitions to only the affected samples, and therefore maintains quality in a cost-effective way. We demonstrate the power of the approach on three large sets of DNA methylation microarray data.
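
For illustration only, the sketch below shows how a Hotelling T(2) control chart can be computed over principal component scores of per-chip measurements, which is the general technique the abstract refers to. It is not the authors' implementation: it uses ordinary SVD-based PCA rather than the robust PCA variant described above, the control limit is one common F-distribution approximation for individual observations, and all function names, parameters, and the synthetic data are assumptions made for this sketch.

```python
import numpy as np
from scipy import stats

def hotelling_t2_chart(X, n_components=3, alpha=0.01):
    """Sketch: T(2) control chart over PCA scores of per-chip measurements.

    X           : (n_chips, n_features) matrix of per-chip summary values
    n_components: number of leading principal components to monitor
    alpha       : false-alarm rate used for the upper control limit
    Returns the T(2) statistic per chip and an upper control limit (UCL).
    """
    n, _ = X.shape
    # Center the data and project onto the leading principal components
    # (plain SVD-based PCA here; the paper uses a robust PCA variant).
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, len(s))
    scores = Xc @ Vt[:k].T                # (n, k) PCA scores per chip
    score_var = (s[:k] ** 2) / (n - 1)    # variance along each component

    # Hotelling's T(2): squared Mahalanobis distance of each chip's scores.
    t2 = np.sum(scores ** 2 / score_var, axis=1)

    # One common UCL for individual observations, based on the F distribution.
    ucl = (k * (n - 1) * (n + 1)) / (n * (n - k)) * stats.f.ppf(1 - alpha, k, n - k)
    return t2, ucl

# Usage with synthetic data standing in for per-chip quality features.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
X[37] += 4.0                              # simulate a drifted / outlier chip
t2, ucl = hotelling_t2_chart(X)
print("out-of-control chips:", np.where(t2 > ucl)[0])
```

In such a setup, chips whose T(2) statistic exceeds the control limit are candidates for repetition, while the remaining chips are accepted, which is the cost-saving idea the abstract describes.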

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Artifacts
  • DNA Methylation*
  • Data Interpretation, Statistical*
  • Gene Expression Profiling / methods
  • Gene Expression Profiling / standards
  • Models, Genetic*
  • Models, Statistical*
  • Nucleic Acid Hybridization / genetics
  • Oligonucleotide Array Sequence Analysis / methods*
  • Oligonucleotide Array Sequence Analysis / standards
  • Principal Component Analysis
  • Quality Assurance, Health Care / methods*
  • Quality Assurance, Health Care / standards
  • Quality Control
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Stochastic Processes