Donuts, scratches and blanks: robust model-based segmentation of microarray images

Bioinformatics. 2005 Jun 15;21(12):2875-82. doi: 10.1093/bioinformatics/bti447. Epub 2005 Apr 21.

Abstract

Motivation: Inner holes, artifacts and blank spots are common in microarray images, but current image analysis methods do not pay them enough attention. We propose a new robust model-based method for processing microarray images so as to estimate foreground and background intensities. The method starts with a very simple but effective automatic gridding method, and then proceeds in two steps. The first step applies model-based clustering to the distribution of pixel intensities, using the Bayesian Information Criterion (BIC) to choose the number of groups up to a maximum of three. The second step is spatial, finding the large spatially connected components in each cluster of pixels. The method thus combines the strengths of the histogram-based and spatial approaches. It deals effectively with inner holes in spots and with artifacts. It also provides a formal inferential basis for deciding when the spot is blank, namely when the BIC favors one group over two or three.
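The two steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the spotSegmentation or MCLUST implementation: all function names (fit_mixture, large_components, segment_spot) are hypothetical, the mixture is a one-dimensional Gaussian mixture with unequal variances fitted by a plain EM loop started from quantiles, the BIC is written as 2·loglik − p·log(n) so that larger values are better, and treating the brightest cluster as foreground with 4-connectivity and a min_size threshold are choices made here for illustration only.

```python
import math
from collections import deque


def log_normal(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def e_step(xs, weights, means, variances):
    """Return (responsibilities, log-likelihood), using log-sum-exp for stability."""
    resp, loglik = [], 0.0
    for x in xs:
        logs = [math.log(max(w, 1e-300)) + log_normal(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)
        lse = top + math.log(sum(math.exp(l - top) for l in logs))
        resp.append([math.exp(l - lse) for l in logs])
        loglik += lse
    return resp, loglik


def fit_mixture(xs, k, iters=100):
    """Fit a k-component mixture by EM; return (BIC, means, responsibilities)."""
    n = len(xs)
    ordered = sorted(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    floor = max(1e-6 * grand_var, 1e-9)      # assumption: guard against collapsing variances
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]  # assumption: quantile start
    variances = [max(grand_var, floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp, _ll = e_step(xs, weights, means, variances)
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, floor)
            weights[j] = nj / n
    resp, loglik = e_step(xs, weights, means, variances)
    n_params = 3 * k - 1                     # k means, k variances, k - 1 free weights
    return 2 * loglik - n_params * math.log(n), means, resp


def large_components(mask, min_size):
    """Pixels of `mask` in 4-connected components of at least `min_size` pixels."""
    rows, cols = len(mask), len(mask[0])
    seen, kept = set(), set()
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in seen:
                continue
            comp, queue = [], deque([(r, c)])
            seen.add((r, c))
            while queue:                     # breadth-first flood fill
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(comp) >= min_size:
                kept.update(comp)
    return kept


def segment_spot(image, min_size=4):
    """Return None for a blank spot, else the set of (row, col) foreground pixels."""
    xs = [v for row in image for v in row]
    fits = {k: fit_mixture(xs, k) for k in (1, 2, 3)}   # at most three groups
    best_k = max(fits, key=lambda k: fits[k][0])
    if best_k == 1:
        return None                          # BIC favors one group: declare the spot blank
    _, means, resp = fits[best_k]
    bright = max(range(best_k), key=lambda j: means[j])  # assumption: brightest = foreground
    cols = len(image[0])
    mask = [[max(range(best_k), key=lambda j: resp[r * cols + c][j]) == bright
             for c in range(cols)] for r in range(len(image))]
    return large_components(mask, min_size)


# Toy 6x6 spot: dim background, a bright 3x3 block, and one isolated bright pixel.
demo = [[100 + (7 * r + 3 * c) % 5 for c in range(6)] for r in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        demo[r][c] = 1000 + 5 * ((r + c) % 3)
demo[5][5] = 1000                            # artifact far from the spot
foreground = segment_spot(demo)
```

In this toy image the isolated bright pixel shares an intensity cluster with the spot but forms a connected component smaller than min_size, so the spatial step discards it. That is the sense in which the histogram-based and spatial steps complement each other.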

Results: We apply our methods for gridding and segmentation to cDNA microarray images from an HIV infection experiment. In these experiments, our method had better stability across replicates than a fixed-circle segmentation method or the seeded region growing method in the SPOT software, without introducing noticeable bias when estimating the intensities of differentially expressed genes.

Availability: spotSegmentation, an R language package implementing both the gridding and segmentation methods, is available through the Bioconductor project (http://www.bioconductor.org). The segmentation method requires the contributed R package MCLUST for model-based clustering (http://cran.us.r-project.org).

Contact: [email protected].

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Artifacts
  • Gene Expression Profiling / methods*
  • Image Enhancement / methods*
  • Image Interpretation, Computer-Assisted / methods
  • In Situ Hybridization, Fluorescence / methods*
  • Microscopy, Fluorescence / methods*
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pattern Recognition, Automated / methods
  • Software*