DC3 is a method for deconvolution and coupled clustering from bulk and single-cell genomics data

Nat Commun. 2019 Oct 10;10(1):4613. doi: 10.1038/s41467-019-12547-1.

Abstract

Characterizing and interpreting heterogeneous mixtures at the cellular level is a critical problem in genomics. Single-cell assays offer an opportunity to resolve cellular-level heterogeneity: scRNA-seq enables single-cell expression profiling, and scATAC-seq identifies active regulatory elements. Furthermore, while scHi-C can measure the chromatin contacts (i.e., loops) between active regulatory elements and their target genes in single cells, bulk HiChIP can measure such contacts at a higher resolution. In this work, we introduce DC3 (De-Convolution and Coupled-Clustering) as a method for the joint analysis of various bulk and single-cell data, such as HiChIP, RNA-seq and ATAC-seq, from the same heterogeneous cell population. DC3 can simultaneously identify distinct subpopulations, assign single cells to the subpopulations (i.e., clustering) and de-convolve the bulk data into subpopulation-specific data. The subpopulation-specific profiles of gene expression, chromatin accessibility and enhancer-promoter contact obtained by DC3 provide a comprehensive characterization of the gene regulatory system in each subpopulation.
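To make the idea of de-convolving bulk contacts into subpopulation-specific contacts concrete, the following is a minimal sketch and not the DC3 algorithm itself, whose objective function is not given in this abstract. It assumes, purely for illustration, that subpopulation profiles and cell fractions are already known, and that a bulk enhancer-gene contact can be split across subpopulations in proportion to cell fraction times enhancer accessibility times gene expression. The function name `deconvolve_contacts` and all input values are hypothetical.

```python
def deconvolve_contacts(bulk, access, expr, weights):
    """Split each bulk contact count across subpopulations.

    Illustrative heuristic only (an assumption, not the published model).
    bulk[i][j]   : bulk contact count between enhancer i and gene j
    access[k][i] : accessibility of enhancer i in subpopulation k
    expr[k][j]   : expression of gene j in subpopulation k
    weights[k]   : fraction of cells belonging to subpopulation k
    Returns out[k][i][j]; summed over k it equals bulk[i][j] whenever
    at least one subpopulation has a non-zero score for the pair.
    """
    n_sub = len(weights)
    n_enh = len(bulk)
    n_gene = len(bulk[0])
    out = [[[0.0] * n_gene for _ in range(n_enh)] for _ in range(n_sub)]
    for i in range(n_enh):
        for j in range(n_gene):
            # Score each subpopulation by how active both ends of the loop are.
            scores = [weights[k] * access[k][i] * expr[k][j] for k in range(n_sub)]
            total = sum(scores)
            if total == 0:
                # No subpopulation supports this pair; leave it unassigned.
                continue
            for k in range(n_sub):
                out[k][i][j] = bulk[i][j] * scores[k] / total
    return out


# Toy input: 2 subpopulations, 2 enhancers, 2 genes (made-up numbers).
bulk = [[10.0, 0.0], [4.0, 6.0]]
access = [[1.0, 0.0], [1.0, 1.0]]   # subpopulation 0 has only enhancer 0 open
expr = [[2.0, 0.0], [0.0, 3.0]]     # each subpopulation expresses one gene
weights = [0.5, 0.5]

result = deconvolve_contacts(bulk, access, expr, weights)
```

In this toy input, the contact between enhancer 0 and gene 0 is assigned entirely to subpopulation 0, because only that subpopulation has the enhancer open and the gene expressed. A joint method such as the one described above would estimate the subpopulations, the cell assignments and the de-convolved contacts together rather than taking the profiles as given.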

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Animals
  • Cell Line
  • Chromatin
  • Chromatin Immunoprecipitation / statistics & numerical data
  • Cluster Analysis*
  • Computer Simulation
  • Gene Expression Profiling / methods
  • Gene Expression Profiling / statistics & numerical data*
  • Gene Regulatory Networks
  • Genomics / statistics & numerical data*
  • High-Throughput Nucleotide Sequencing / methods
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Mice
  • Promoter Regions, Genetic
  • Single-Cell Analysis / methods
  • Single-Cell Analysis / statistics & numerical data*

Substances

  • Chromatin