Comparison of multivariate data analysis strategies for high-content screening

J Biomol Screen. 2011 Mar;16(3):338-47. doi: 10.1177/1087057110395390. Epub 2011 Feb 18.

Abstract

High-content screening (HCS) is increasingly used in biomedical research, generating multivariate, single-cell data sets. Before a treatment is scored, these complex data sets are processed (e.g., normalized, reduced to a lower dimensionality) to help extract valuable information. However, there has been no published comparison of the performance of these methods. This study compares unbiased approaches for reducing dimensionality and for summarizing cell populations. To evaluate these data-processing strategies, the prediction accuracies and the Z' factors of control compounds in an HCS cell cycle data set were monitored. As expected, dimension reduction led to a lower degree of discrimination between control samples. A high degree of classification accuracy was achieved when the cell population was summarized at the well level using percentile values. In conclusion, the generic data analysis pipeline described here enables a systematic review of alternative strategies for analyzing multiparametric results from biological systems.
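
The two quantities named in the abstract, a well-level percentile summary and the Z' factor of the controls, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`percentile`, `summarize_well`, `z_prime`), the choice of the 25th, 50th, and 75th percentiles, and the linear-interpolation rule are assumptions made for illustration only. The Z' factor uses the standard definition, Z' = 1 - 3(sd_pos + sd_neg) / |mean_pos - mean_neg|.

```python
from statistics import mean, stdev


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty sequence."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def summarize_well(cells, percentiles=(25, 50, 75)):
    """Collapse one well's single-cell rows into per-feature percentile values.

    cells: list of equal-length feature lists, one per cell.
    Returns a flat list ordered feature by feature.
    The default percentile set is an assumption, not taken from the paper.
    """
    features = list(zip(*cells))  # transpose: one tuple per feature
    return [percentile(f, q) for f in features for q in percentiles]


def z_prime(pos, neg):
    """Z' factor from per-well scores of positive and negative controls.

    Uses sample standard deviations; each group needs at least two wells.
    """
    return 1.0 - 3.0 * (stdev(pos) + stdev(neg)) / abs(mean(pos) - mean(neg))
```

Under these assumptions, each well is reduced to one vector with `summarize_well`, and `z_prime` is then computed on a per-well score of the positive and negative control wells. Values close to 1 indicate well-separated controls.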

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cells / metabolism
  • Data Mining
  • Electronic Data Processing / methods*
  • Electronic Data Processing / standards
  • High-Throughput Screening Assays
  • Image Interpretation, Computer-Assisted*
  • Multivariate Analysis*
  • Research Design