Robust microarray meta-analysis identifies differentially expressed genes for clinical prediction

ScientificWorldJournal. 2012:2012:989637. doi: 10.1100/2012/989637. Epub 2012 Dec 18.

Abstract

Combining multiple microarray datasets increases sample size and improves the reproducibility of both informative gene identification and subsequent clinical prediction. Although microarrays have increased the rate of genomic data collection, small sample size remains a major obstacle to identifying informative genetic biomarkers. As a result, feature selection methods often suffer from false discoveries, which yield poorly performing predictive models. We develop a meta-analysis-based feature selection method that captures the knowledge in each individual dataset and combines the results using a simple rank average. In a comprehensive study that measures robustness across clinical application (breast, renal, and pancreatic cancer), microarray platform heterogeneity, and classifier (logistic regression, diagonal LDA, and linear SVM), we compare the rank average method to five other meta-analysis methods. Results indicate that rank average meta-analysis consistently performs well relative to these five methods.
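
The rank average idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how genes are scored within each dataset, so the per-gene scores here are assumed to be something like absolute t-statistics, and the gene names, values, and function names (rank_genes, rank_average) are hypothetical. Each dataset is ranked on its own, and the ranks are then averaged over the genes common to all datasets.

```python
from statistics import mean


def rank_genes(scores):
    """Map each gene to its rank within one dataset (1 = highest score).

    Genes with identical scores share the average of the ranks they span.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of genes tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for gene in ordered[i:j + 1]:
            ranks[gene] = tied_rank
        i = j + 1
    return ranks


def rank_average(per_dataset_scores):
    """Average within-dataset ranks over genes present in every dataset.

    Returns (genes ordered from best to worst average rank, gene -> average rank).
    """
    per_dataset_ranks = [rank_genes(s) for s in per_dataset_scores]
    common = set.intersection(*(set(r) for r in per_dataset_ranks))
    avg = {g: mean(r[g] for r in per_dataset_ranks) for g in common}
    # Break ties in average rank alphabetically so the output is deterministic.
    return sorted(avg, key=lambda g: (avg[g], g)), avg


# Hypothetical scores (assumed: higher means more differentially expressed).
dataset_a = {"G1": 4.2, "G2": 1.1, "G3": 3.0, "G4": 0.5}
dataset_b = {"G1": 2.5, "G2": 0.3, "G3": 3.9, "G4": 1.0}
dataset_c = {"G1": 5.0, "G2": 2.0, "G3": 1.5, "G4": 0.2}

ranking, avg_rank = rank_average([dataset_a, dataset_b, dataset_c])
print(ranking)  # ['G1', 'G3', 'G2', 'G4']
```

Because only ranks are combined, scores from different microarray platforms never need to be placed on a common scale, which is one plausible reason a rank-based combination tolerates platform heterogeneity.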

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Breast Neoplasms / genetics
  • Computational Biology / methods*
  • Female
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Kidney Neoplasms / genetics
  • Meta-Analysis as Topic*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pancreatic Neoplasms / genetics
  • Receptors, Estrogen / genetics
  • Reproducibility of Results

Substances

  • Receptors, Estrogen