flowLearn: fast and precise identification and quality checking of cell populations in flow cytometry

Bioinformatics. 2018 Jul 1;34(13):2245-2253. doi: 10.1093/bioinformatics/bty082.

Abstract

Motivation: Identification of cell populations in flow cytometry is a critical part of the analysis and lays the groundwork for many applications and research discovery. The current paradigm of manual analysis is time-consuming and subjective. A common goal of users is to replace manual analysis with automated methods that replicate their results. Supervised tools provide the best performance in such a use case; however, they require fine parameterization to obtain the best results. Hence, there is a strong need for methods that are fast to set up, accurate and interpretable.

Results: flowLearn is a semi-supervised approach for the quality-checked identification of cell populations. Using a very small number of manually gated samples, it predicts gates on other samples through density alignment, with high accuracy and speed. On two state-of-the-art datasets, our tool achieves median F1-measures exceeding 0.99 for 31% and exceeding 0.90 for 80% of all analyzed populations. Furthermore, users can directly interpret and adjust automated gates on new sample files to iteratively improve the initial training.
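As a rough illustration of the idea in the Results, the Python sketch below moves a one-dimensional gate threshold from a manually gated sample to a new sample by aligning their marker densities, then scores the predicted gate against a manual one with the F1-measure. It is a toy under stated assumptions and not the flowLearn implementation: the abstract does not specify the alignment method, so a plain histogram shift search stands in for it, and every function name, parameter and data value here is hypothetical.

```python
from statistics import median


def histogram(values, lo, hi, bins):
    """Normalized histogram on [lo, hi), used here as a crude density estimate."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def best_shift(ref, target, max_shift):
    """Integer bin shift s that maximizes sum_i ref[i] * target[i + s]."""
    def score(s):
        return sum(ref[i] * target[i + s]
                   for i in range(len(ref)) if 0 <= i + s < len(target))
    return max(range(-max_shift, max_shift + 1), key=score)


def transfer_threshold(proto_values, proto_threshold, new_values,
                       lo, hi, bins=50, max_shift=20):
    """Move a 1D gate threshold from a manually gated sample to a new sample.

    Assumption: the two densities differ only by a shift along the marker axis.
    """
    width = (hi - lo) / bins
    shift = best_shift(histogram(proto_values, lo, hi, bins),
                       histogram(new_values, lo, hi, bins), max_shift)
    return proto_threshold + shift * width


def f1_measure(manual, predicted):
    """F1 between two boolean membership masks for one cell population."""
    pairs = list(zip(manual, predicted))
    tp = sum(1 for m, p in pairs if m and p)
    fp = sum(1 for m, p in pairs if not m and p)
    fn = sum(1 for m, p in pairs if m and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented): one manually gated sample with a threshold at 4.0,
# and a new sample whose marker intensities are shifted by +1.0.
proto = [1.7, 1.9, 1.9, 2.1, 5.9, 6.1, 6.1, 6.3]
new = [v + 1.0 for v in proto]

threshold = transfer_threshold(proto, 4.0, new, lo=0.0, hi=10.0)
manual_mask = [False] * 4 + [True] * 4        # manual gate on the new sample
predicted_mask = [v > threshold for v in new]  # gate predicted by alignment

# With several samples, the per-population summary is the median of the scores.
scores = [f1_measure(manual_mask, predicted_mask)]
print(threshold, median(scores))
```

A pure shift is the simplest alignment one could choose; real samples may also differ in scale or peak shape, which is why a more flexible alignment would be needed in practice.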

Availability and implementation: flowLearn is available as an R package at https://github.com/mlux86/flowLearn. Evaluation data are publicly available online. Details can be found in the Supplementary Material.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Flow Cytometry / methods*
  • Software*