CDS: a fold-change based statistical test for concomitant identification of distinctness and similarity in gene expression analysis

Genomics Proteomics Bioinformatics. 2012 Jun;10(3):127-35. doi: 10.1016/j.gpb.2012.06.002. Epub 2012 Jun 25.

Abstract

Identifying differential activity, such as in gene expression, is a major challenge in biostatistics and bioinformatics. Equally important, though much less frequently studied, is the question of similar activity from one biological condition to another. The fold-change, or ratio, is usually considered a relevant criterion for stating difference and similarity between measurements. Importantly, no statistical method for concomitant evaluation of similarity and distinctness currently exists for biological applications. Modern microarray, digital PCR (dPCR), and Next-Generation Sequencing (NGS) technologies frequently provide a means of estimating the coefficient of variation of individual measurements. Using the fold-change, and assuming that measurements are normally distributed with known variances, we designed a novel statistical test that detects differentially and similarly expressed genes concomitantly, within the same formalism (http://cds.ihes.fr). Given two sets of gene measurements in different biological conditions, the probabilities of making type I and type II errors in stating that a gene is differentially or similarly expressed from one condition to the other can be calculated. Furthermore, a confidence interval for the fold-change can be delineated. Finally, we demonstrate that the assumption of normality can be relaxed so that arbitrary distributions are handled numerically. The Concomitant evaluation of Distinctness and Similarity (CDS) statistical test correctly estimates similarities and differences between measurements of gene expression. The implementation, being time and memory efficient, allows the use of the CDS test in high-throughput data analysis such as microarray, dPCR, and NGS experiments.
Importantly, the CDS test can be applied to the comparison of single measurements (N=1), provided the variance (or coefficient of variation) of the signals is known. This also makes CDS a valuable tool in biomedical analysis, where typically only a single measurement per subject is available.
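
The abstract does not give the CDS formulas, so the following is only an illustrative sketch of the general idea of a fold-change test on two single measurements with known standard deviations. It is not the CDS test itself: it swaps in two textbook constructions, a z-test on x - c*y (which is standard normal when the true fold-change equals c) combined with a two one-sided tests (TOST) rule for similarity, and a Fieller-type confidence interval for the ratio. The function names, the threshold c, the assumption that the true denominator mean is positive, and the use of cv * measurement as a known standard deviation are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def ratio_z(x, y, sd_x, sd_y, c):
    """z-score of x - c*y; standard normal when mu_x = c * mu_y."""
    return (x - c * y) / sqrt(sd_x ** 2 + (c * sd_y) ** 2)


def p_distinct_up(x, y, sd_x, sd_y, c):
    """One-sided p-value for H0: mu_x / mu_y <= c (assumes mu_y > 0)."""
    return 1.0 - STD_NORMAL.cdf(ratio_z(x, y, sd_x, sd_y, c))


def p_similar(x, y, sd_x, sd_y, c):
    """TOST p-value for H0: fold-change lies outside (1/c, c), with c > 1."""
    p_upper = STD_NORMAL.cdf(ratio_z(x, y, sd_x, sd_y, c))            # H0: ratio >= c
    p_lower = 1.0 - STD_NORMAL.cdf(ratio_z(x, y, sd_x, sd_y, 1 / c))  # H0: ratio <= 1/c
    return max(p_upper, p_lower)


def fieller_interval(x, y, sd_x, sd_y, level=0.95):
    """Fieller-type confidence interval for mu_x / mu_y with known variances.

    Returns None when the denominator is not clearly different from zero,
    in which case the confidence set is not a bounded interval.
    """
    z = STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    a = y ** 2 - (z * sd_y) ** 2
    if a <= 0:
        return None
    # Roots of (x - r*y)^2 = z^2 * (sd_x^2 + r^2 * sd_y^2) in r.
    disc = z ** 2 * (sd_x ** 2 * a + (x * sd_y) ** 2)
    half_width = sqrt(disc)
    return (x * y - half_width) / a, (x * y + half_width) / a


if __name__ == "__main__":
    cv = 0.10  # hypothetical coefficient of variation of the platform
    for x, y in [(3000.0, 1000.0), (1000.0, 1000.0)]:
        sd_x, sd_y = cv * x, cv * y  # plug-in standard deviations (assumption)
        print(x, y,
              p_distinct_up(x, y, sd_x, sd_y, c=2.0),
              p_similar(x, y, sd_x, sd_y, c=2.0),
              fieller_interval(x, y, sd_x, sd_y))
```

With a hypothetical 10% coefficient of variation, a pair such as (3000, 1000) should come out as distinct at a two-fold threshold and a pair such as (1000, 1000) as similar, each from a single measurement per condition. The two p-values come from the same statistic evaluated at different thresholds, which is one simple way to treat distinctness and similarity within a single formalism.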

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenoma / genetics
  • Adrenal Cortex / metabolism
  • Adrenal Cortex Neoplasms / genetics
  • Carcinoma / genetics
  • Confidence Intervals
  • Gene Expression Profiling / methods
  • Gene Expression Profiling / statistics & numerical data*
  • Humans
  • Oligonucleotide Array Sequence Analysis / methods
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*

Associated data

  • GEO/GSE10927