Determination of essential phenotypic elements of clusters in high-dimensional entities: DEPECHE

PLoS One. 2019 Mar 7;14(3):e0203247. doi: 10.1371/journal.pone.0203247. eCollection 2019.

Abstract

Technological advances have facilitated an exponential increase in the amount of information that can be derived from single cells, necessitating new computational tools that can make such highly complex data interpretable. Here, we introduce DEPECHE, a rapid, parameter-free, sparse k-means-based algorithm for clustering of multi- and megavariate single-cell data. In a number of computational benchmarks aimed at evaluating the capacity to form biologically relevant clusters, including flow/mass-cytometry and single-cell RNA sequencing data sets with manually curated gold standard solutions, DEPECHE clusters as well as or better than the best-performing clustering algorithms currently available. However, the main advantage of DEPECHE over the state of the art is its unique ability to enhance the interpretability of the formed clusters: it retains only the variables relevant for cluster separation, thereby facilitating computationally efficient analyses as well as understanding of complex datasets. DEPECHE is implemented in the open-source R package DepecheR, currently available at github.com/Theorell/DepecheR.
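
To illustrate how a sparse k-means variant can retain only the variables relevant for cluster separation, the following is a minimal Python sketch under stated assumptions. It is not the DepecheR implementation or its API. It assumes that sparsity comes from an L1 penalty applied to the cluster centers through soft-thresholding, which is one common way to build sparse k-means and which the abstract does not spell out. The function names, the farthest-first initialization, and the hand-set `penalty` value are all hypothetical; DEPECHE itself is described as parameter-free, so it would not require the user to choose such a value.

```python
import numpy as np

def soft_threshold(x, penalty):
    """Shrink values toward zero; anything within +/- penalty becomes exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - penalty, 0.0)

def farthest_first_init(data, k):
    """Deterministic start: most extreme point, then points farthest from chosen centers."""
    centers = [data[np.argmax((data ** 2).sum(axis=1))]]
    for _ in range(k - 1):
        nearest = np.min([((data - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(data[np.argmax(nearest)])
    return np.array(centers, dtype=float)

def assign(data, centers):
    """Label each row with the index of its nearest center (squared Euclidean)."""
    dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def sparse_kmeans(data, k, penalty, n_iter=50):
    """Illustrative L1-penalized k-means (hypothetical helper, not DepecheR).

    `data` is assumed to be centered per variable, so a center coordinate of 0
    means the variable does not separate that cluster from the overall mean.
    """
    centers = farthest_first_init(data, k)
    for _ in range(n_iter):
        labels = assign(data, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                # Minimizes 0.5 * ||mean - mu||^2 + penalty * ||mu||_1 over mu.
                new_centers[j] = soft_threshold(members.mean(axis=0), penalty)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(data, centers), centers

# Synthetic demo: 2 informative variables, 8 pure-noise variables.
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 1.0, size=(200, 10))
group_b = rng.normal(0.0, 1.0, size=(200, 10))
group_a[:, :2] += 8.0
group_b[:, :2] -= 8.0
data = np.vstack([group_a, group_b])
data = data - data.mean(axis=0)  # center each variable

labels, centers = sparse_kmeans(data, k=2, penalty=0.5)
# A variable is retained if any cluster center is non-zero in it.
retained = np.any(centers != 0.0, axis=0)
print("retained variables:", np.flatnonzero(retained))
```

In this sketch, variables whose cluster means stay close to the overall mean are shrunk to exactly zero in every center and therefore drop out of the cluster description, which is the interpretability benefit the abstract emphasizes.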

MeSH terms

  • Algorithms*
  • Cluster Analysis*
  • Computer Simulation
  • Databases, Factual / statistics & numerical data
  • Flow Cytometry / statistics & numerical data
  • Humans
  • Multivariate Analysis
  • Phenotype
  • Single-Cell Analysis / statistics & numerical data
  • Software

Grants and funding

The authors received no specific funding for this work.