Microarray data analysis: from hypotheses to conclusions using gene expression data

Cell Oncol. 2004;26(5-6):279-90. doi: 10.1155/2004/943940.

Abstract

We review several commonly used methods for the design of microarray experiments and the analysis of the resulting data. To begin with, some experimental design issues are addressed. Several approaches for pre-processing the data (filtering and normalization) before the statistical analysis stage are then discussed. A common first step in this type of analysis is gene selection based on statistical testing. Two approaches, permutation and model-based methods, are explained, and we emphasize the need to correct for multiple testing. Moreover, powerful approaches based on gene sets are mentioned. Clustering of either genes or samples is frequently performed when analyzing microarray data. We summarize the basics of both unsupervised clustering and supervised classification. The latter may be of use for creating diagnostic arrays, for example. Construction of biological networks, such as pathways, is a statistically challenging and complex task that is a relatively new development and hence mentioned only briefly. We finish with some remarks on literature and software. The emphasis in this paper is on the philosophy behind several statistical issues and on a critical interpretation of microarray-related analysis methods.
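
As an illustration of the gene selection step mentioned above, the following is a minimal sketch of permutation-based testing followed by a multiple testing correction. It is not taken from the paper. The function names, the difference-in-means test statistic, and the choice of the Benjamini-Hochberg false discovery rate adjustment are assumptions made for this example; the abstract does not specify which statistic or correction the authors discuss.

```python
import numpy as np


def permutation_pvalues(expr, labels, n_perm=1000, seed=0):
    """Two-sided permutation p-value per gene (hypothetical helper).

    expr   : array of shape (genes, samples) with expression values
    labels : boolean array of length samples marking the two groups
    """
    rng = np.random.default_rng(seed)
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels, dtype=bool)

    def mean_diff(lab):
        # Assumed test statistic: difference in group means for each gene.
        return expr[:, lab].mean(axis=1) - expr[:, ~lab].mean(axis=1)

    observed = np.abs(mean_diff(labels))
    exceed = np.zeros(expr.shape[0])
    for _ in range(n_perm):
        # Shuffle the sample labels to mimic the null of no group effect.
        permuted = rng.permutation(labels)
        exceed += np.abs(mean_diff(permuted)) >= observed
    # Add-one correction so that no p-value is exactly zero.
    return (exceed + 1) / (n_perm + 1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed choice of correction)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working from the largest p-value downward.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out
```

Genes whose adjusted p-value falls below a chosen threshold would then be carried forward to later steps such as clustering or classification. With thousands of genes on an array, applying the threshold to the raw p-values instead would be expected to produce many false positives, which is the reason the correction matters.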

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Cluster Analysis
  • Gene Expression Profiling / methods
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Image Processing, Computer-Assisted
  • Internet
  • Models, Statistical
  • Multigene Family
  • Oligonucleotide Array Sequence Analysis / methods*
  • Software
  • Statistics as Topic / methods