chip artifact CORRECTion (caCORRECT): a bioinformatics system for quality assurance of genomics and proteomics array data

Ann Biomed Eng. 2007 Jun;35(6):1068-80. doi: 10.1007/s10439-007-9313-y. Epub 2007 Apr 26.

Abstract

Quality assurance of high-throughput "-omics" data is a major concern for biomedical discovery and translational medicine, and is considered a top priority in bioinformatics and systems biology. Here, we report a web-based bioinformatics tool called caCORRECT for chip artifact detection, analysis, and CORRECTion, which removes systematic artifactual noise that is commonly observed in microarray gene expression data. Despite the development of major databases such as GEO, ArrayExpress, caArray, and the SMD to manage and distribute microarray data to the public, reproducibility has been questioned in many cases, including high-profile papers and datasets. Based on both archived and synthetic data, we have designed caCORRECT to have several advanced features: (1) to uncover significant, correctable artifacts that affect the reproducibility of experiments; (2) to improve the integrity and quality of public archives by removing artifacts; (3) to provide a universal quality score to aid users in their selection of suitable microarray data; and (4) to improve the true-positive rate of biomarker selection, as verified by test data. These features are expected to improve the reproducibility of microarray studies. caCORRECT is freely available at: http://caCORRECT.bme.gatech.edu.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Artifacts*
  • Computational Biology / methods*
  • Genomics / methods*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Quality Control
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software Design
  • Software*
  • User-Computer Interface*