Deconvolution of cellular subsets in human tissue based on targeted DNA methylation analysis at individual CpG sites

BMC Biol. 2020 Nov 24;18(1):178. doi: 10.1186/s12915-020-00910-4.

Abstract

Background: The cellular composition of a tissue can be estimated by deconvolution of bulk gene expression profiles or with various single-cell sequencing approaches. Alternatively, DNA methylation (DNAm) profiles have been used to establish an atlas for multiple human tissues and cell types. DNAm is particularly suitable for deconvolution of cell types because each CG dinucleotide (CpG site) has only two states per DNA strand, methylated or non-methylated, and these epigenetic modifications are very consistent during cellular differentiation. So far, deconvolution of DNAm profiles has relied on complex signatures of many CpGs that are often measured by genome-wide analysis with Illumina BeadChip microarrays. In this study, we investigated whether the characterization of cell types in tissue is also feasible with individual cell type-specific CpG sites, which can be addressed by targeted analysis, such as pyrosequencing.

Results: We compiled and curated 579 Illumina 450k BeadChip DNAm profiles of 14 different non-malignant human cell types. A training and validation strategy was applied to identify and test cell type-specific CpGs. We initially focused on estimating the relative amount of fibroblasts using two CpGs, one hypermethylated and one hypomethylated in fibroblasts. The combination of these two DNAm levels into a "FibroScore" correlated with the state of fibrosis and was associated with overall survival in various types of cancer. Furthermore, we identified hypomethylated CpGs for leukocytes, endothelial cells, epithelial cells, hepatocytes, glia, neurons, fibroblasts, and induced pluripotent stem cells. The accuracy of this eight-CpG signature was tested in additional BeadChip datasets of defined cell mixtures, and the results were comparable to previously published signatures based on several thousand CpGs. Finally, we established and validated pyrosequencing assays for the relevant CpGs that can be utilized for classification and deconvolution of cell types.
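The deconvolution step can be illustrated with a minimal sketch, assuming that the DNAm level measured at each marker CpG in a mixture is the fraction-weighted sum of the DNAm levels of the contributing cell types. The keywords list NNLS (non-negative least squares), but the abstract does not describe the implementation; the solver below is a simple projected gradient method chosen to keep the example dependency-free, and the reference values, cell type selection, and function names are hypothetical and not taken from the study.

```python
def nnls_projected_gradient(A, b, iterations=5000):
    """Minimise 0.5 * ||A x - b||^2 subject to x >= 0.

    Illustrative projected gradient solver (an assumption of this sketch,
    not the method used in the study).
    """
    n_rows, n_cols = len(A), len(A[0])
    # The squared Frobenius norm of A is an upper bound on the largest
    # eigenvalue of A^T A, so its inverse is a safe step size.
    step = 1.0 / sum(v * v for row in A for v in row)
    x = [0.0] * n_cols
    for _ in range(iterations):
        residual = [
            sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
            for i in range(n_rows)
        ]
        gradient = [
            sum(A[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * gradient[j]) for j in range(n_cols)]
    return x


def estimate_fractions(reference, bulk):
    """Return cell type fractions, rescaled to sum to 1."""
    weights = nnls_projected_gradient(reference, bulk)
    total = sum(weights)
    if total == 0:
        raise ValueError("no cell type received a positive weight")
    return [w / total for w in weights]


# Hypothetical reference matrix: rows are marker CpGs, columns are cell types.
# Each marker CpG is hypomethylated (0.1) in its own cell type and
# methylated (0.9) in the others. These values are made up for illustration.
CELL_TYPES = ["fibroblast", "leukocyte", "hepatocyte"]
REFERENCE = [
    [0.1, 0.9, 0.9],  # fibroblast marker CpG
    [0.9, 0.1, 0.9],  # leukocyte marker CpG
    [0.9, 0.9, 0.1],  # hepatocyte marker CpG
]

# Simulate a bulk measurement from known (made-up) fractions.
TRUE_FRACTIONS = [0.5, 0.3, 0.2]
BULK = [
    sum(row[j] * TRUE_FRACTIONS[j] for j in range(len(CELL_TYPES)))
    for row in REFERENCE
]

if __name__ == "__main__":
    for name, fraction in zip(CELL_TYPES, estimate_fractions(REFERENCE, BULK)):
        print(f"{name}: {fraction:.3f}")
```

With one marker CpG per cell type the reference matrix is square, so the estimate depends directly on how well each reference DNAm value reflects the pure cell type; measurement noise at a single CpG translates into error in the corresponding fraction.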

Conclusion: This proof-of-concept study demonstrates that DNAm analysis at individual CpGs reflects the cellular composition of cell mixtures and different tissues. Targeted analysis of these genomic regions facilitates robust methods for application in basic research and clinical settings.

Keywords: Cancer; Cell types; CpG; DNA methylation; Deconvolution; Epigenetic; Fibrosis; Human; NNLS; Pyrosequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Physiological Phenomena / genetics*
  • CpG Islands*
  • DNA Methylation*
  • Epigenesis, Genetic*
  • Humans