The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models

Nat Biotechnol. 2010 Aug;28(8):827-38. doi: 10.1038/nbt.1665. Epub 2010 Jul 30.

Abstract

Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
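
The validation design described above, in which models are tested on data that had not been used for training, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the nearest-centroid classifier and the accuracy metric are stand-ins, and the function names are hypothetical. It is not the method or data of any MAQC-II team.

```python
import random
from statistics import mean


def make_samples(n, shift, rng):
    """Simulate n two-class samples with 5 synthetic 'expression' features.

    Hypothetical data generator: class 1 features are shifted by `shift`.
    """
    samples = []
    for i in range(n):
        label = i % 2
        features = [rng.gauss(shift * label, 1.0) for _ in range(5)]
        samples.append((features, label))
    return samples


def fit_centroids(train):
    """Fit a nearest-centroid model: the per-class mean of each feature."""
    centroids = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def accuracy(centroids, data):
    """Fraction of samples in `data` classified correctly."""
    return mean(1.0 if predict(centroids, x) == y else 0.0 for x, y in data)


rng = random.Random(0)
train = make_samples(60, 2.0, rng)       # used to build the model
validation = make_samples(40, 2.0, rng)  # never seen during training

centroids = fit_centroids(train)
train_acc = accuracy(centroids, train)
val_acc = accuracy(centroids, validation)
print(f"training accuracy:   {train_acc:.2f}")
print(f"validation accuracy: {val_acc:.2f}")
```

The point of the separation is that only the validation figure estimates how a model would behave on new samples; the training figure is computed on the same data the model was fitted to and tends to be optimistic.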

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Breast Neoplasms / diagnosis
  • Breast Neoplasms / genetics
  • Disease Models, Animal
  • Female
  • Gene Expression Profiling / methods
  • Gene Expression Profiling / standards
  • Guidelines as Topic
  • Humans
  • Liver Diseases / etiology
  • Liver Diseases / genetics*
  • Liver Diseases / pathology
  • Lung Diseases / etiology
  • Lung Diseases / genetics*
  • Lung Diseases / pathology
  • Multiple Myeloma / diagnosis
  • Multiple Myeloma / genetics
  • Neoplasms / diagnosis
  • Neoplasms / genetics*
  • Neoplasms / mortality*
  • Neuroblastoma / diagnosis
  • Neuroblastoma / genetics
  • Oligonucleotide Array Sequence Analysis / methods*
  • Oligonucleotide Array Sequence Analysis / standards*
  • Predictive Value of Tests
  • Quality Control
  • Rats
  • Survival Analysis