A metric and workflow for quality control in the analysis of heterogeneity in phenotypic profiles and screens

Methods. 2016 Mar 1:96:12-26. doi: 10.1016/j.ymeth.2015.10.007. Epub 2015 Nov 4.

Abstract

Heterogeneity is well recognized as a common property of cellular systems that impacts biomedical research and the development of therapeutics and diagnostics. Several studies have shown that analysis of heterogeneity: gives insight into mechanisms of action of perturbagens; can be used to predict optimal combination therapies; and can be applied to tumors where heterogeneity is believed to be associated with adaptation and resistance. Cytometry methods including high content screening (HCS), high-throughput microscopy, flow cytometry, mass spectrometry imaging and digital pathology capture cell-level data for populations of cells. However, it is often assumed that the population response is normally distributed and therefore that the average adequately describes the results. A deeper understanding of the results of the measurements and more effective comparison of perturbagen effects require analysis that takes into account the distribution of the measurements, i.e. the heterogeneity. However, the reproducibility of heterogeneous data collected on different days and on different plates/slides has not previously been evaluated. Here we show that conventional assay quality metrics alone are not adequate for quality control of the heterogeneity in the data. To address this need, we demonstrate the use of the Kolmogorov-Smirnov statistic as a metric for monitoring the reproducibility of heterogeneity in a structure-activity relationship (SAR) screen, and describe a workflow for quality control in heterogeneity analysis. One major challenge in high-throughput biology is the evaluation and interpretation of heterogeneity in thousands of samples, such as compounds in a cell-based screen. In this study we also demonstrate that three previously reported heterogeneity indices capture the shapes of the distributions and provide a means to filter and browse big data sets of cellular distributions in order to compare and identify distributions of interest. These metrics and methods are presented as a workflow for analysis of heterogeneity in large-scale biology projects.
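
As an illustration of the quality-control idea described above, the following minimal Python sketch computes a two-sample Kolmogorov-Smirnov statistic between cell-level measurements. It is not the authors' implementation: the abstract does not specify which samples are compared or what thresholds are used, so the function name `ks_statistic`, the simulated data, and the choice to compare replicate control wells are all assumptions made for illustration. The sketch shows why a mean-based check can miss a change in distribution shape that the KS statistic picks up.

```python
import numpy as np


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest vertical gap between the empirical CDFs of the
    two samples: 0 for identical distributions, 1 for disjoint ones.
    (Hypothetical helper name, not taken from the paper.)
    """
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    if a.size == 0 or b.size == 0:
        raise ValueError("both samples must contain at least one value")
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


# Simulated cell-level readouts (assumed data, arbitrary units).
rng = np.random.default_rng(0)
n_cells = 2000
plate_1 = rng.normal(0.0, 1.0, n_cells)   # control well, plate 1
plate_2 = rng.normal(0.0, 1.0, n_cells)   # same condition, plate 2
# A bimodal population with roughly the same mean as the controls.
bimodal = np.concatenate([
    rng.normal(-1.5, 0.5, n_cells // 2),
    rng.normal(1.5, 0.5, n_cells // 2),
])

ks_replicate = ks_statistic(plate_1, plate_2)
ks_bimodal = ks_statistic(plate_1, bimodal)

print(f"mean plate 1: {plate_1.mean():.3f}, mean bimodal: {bimodal.mean():.3f}")
print(f"KS between replicate plates: {ks_replicate:.3f}")
print(f"KS between plate 1 and bimodal well: {ks_bimodal:.3f}")
```

In this sketch the two replicate plates give a small KS value, while the bimodal well gives a much larger one even though its mean is close to that of the controls. In practice the acceptance threshold for replicate-to-replicate KS values would have to be set from the assay's own control data.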

Keywords: Drug discovery; Heterogeneity; High content screening; Phenotypic profiling; Systems biology.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line, Tumor
  • Decision Trees
  • Epithelial Cells / drug effects
  • Epithelial Cells / metabolism
  • Epithelial Cells / ultrastructure*
  • Flow Cytometry / standards
  • Flow Cytometry / statistics & numerical data*
  • Gene Expression Regulation, Neoplastic*
  • High-Throughput Screening Assays / standards
  • High-Throughput Screening Assays / statistics & numerical data*
  • Humans
  • Interleukin-6 / pharmacology
  • Microscopy / standards
  • Microscopy / statistics & numerical data*
  • Molecular Imaging / standards
  • Molecular Imaging / statistics & numerical data*
  • Phenotype
  • Quality Control
  • Reproducibility of Results
  • STAT3 Transcription Factor / genetics
  • STAT3 Transcription Factor / metabolism
  • Signal Transduction
  • Statistics, Nonparametric

Substances

  • IL6 protein, human
  • Interleukin-6
  • STAT3 Transcription Factor
  • STAT3 protein, human