MethCORR infers gene expression from DNA methylation and allows molecular analysis of ten common cancer types using fresh-frozen and formalin-fixed paraffin-embedded tumor samples

Clin Epigenetics. 2021 Jan 28;13(1):20. doi: 10.1186/s13148-021-01000-0.

Abstract

Background: Transcriptional analysis is widely used to study the molecular biology of cancer and holds great biomarker potential for clinical patient stratification. Yet, accurate transcriptional profiling requires high-quality RNA, which often cannot be retrieved from the formalin-fixed, paraffin-embedded (FFPE) tumor tissue that is routinely collected and archived in clinical departments. To overcome this roadblock to clinical testing, we previously developed MethCORR, a method that infers gene expression from DNA methylation data, which can be robustly retrieved from FFPE tissue. MethCORR was originally developed for colorectal cancer, and with this study we aim to: (1) extend the MethCORR method to 10 additional cancer types and (2) illustrate that the inferred gene expression is accurate and clinically informative.

Results: Regression models to infer gene expression information from DNA methylation were developed for ten common cancer types using matched RNA sequencing and DNA methylation profiles (HumanMethylation450 BeadChip) from The Cancer Genome Atlas Project. Robust and accurate gene expression profiles were inferred for all cancer types: on average, the expression of 11,000 genes was modeled with good accuracy and an intra-sample correlation of R² = 0.90 between inferred and measured gene expression was observed. Molecular pathway analysis and transcriptional subtyping were performed for breast, prostate, and lung cancer samples to illustrate the general usability of the inferred gene expression profiles: overall, a high correlation of r = 0.96 (Pearson) in pathway enrichment scores and a 76% correspondence in molecular subtype calls were observed when using measured and inferred gene expression as input. Finally, inferred expression from FFPE tissue correlated better with RNA sequencing data from matched fresh-frozen tissue than did RNA sequencing data from FFPE tissue (P < 0.0001; Wilcoxon rank-sum test).
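
The general idea of the Results (fit a regression per gene on training samples, infer expression for a new sample, then correlate inferred against measured expression across genes within that sample) can be illustrated with a minimal sketch. This is not the MethCORR implementation: the single methylation predictor per gene, the simulated data, and the names pearson_r, fit_line, and simulate are all assumptions made for illustration only.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def simulate(n_genes=200, n_train=60, seed=0):
    """Simulate one held-out sample: measured vs. inferred expression per gene.

    Assumption (hypothetical, for illustration): each gene's expression
    depends linearly on one methylation value in [0, 1] plus Gaussian noise.
    """
    rng = random.Random(seed)
    measured, inferred = [], []
    for _ in range(n_genes):
        base = rng.uniform(2.0, 12.0)      # gene-specific expression level
        effect = rng.uniform(-6.0, -2.0)   # methylation lowers expression
        # Training samples with matched methylation and expression.
        meth = [rng.random() for _ in range(n_train)]
        expr = [base + effect * m + rng.gauss(0.0, 0.5) for m in meth]
        intercept, slope = fit_line(meth, expr)
        # Held-out sample: infer expression from its methylation value only.
        m_new = rng.random()
        measured.append(base + effect * m_new + rng.gauss(0.0, 0.5))
        inferred.append(intercept + slope * m_new)
    return measured, inferred


measured, inferred = simulate()
r = pearson_r(measured, inferred)
print(f"intra-sample r = {r:.3f}, R^2 = {r * r:.3f}")
```

The values printed by this sketch come from simulated data and say nothing about the accuracy reported in the study; the sketch only shows what an intra-sample correlation between inferred and measured expression measures.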

Conclusions: In all cancers investigated, MethCORR enabled DNA methylation-based transcriptional analysis, making future analysis of cancer possible in situations where high-quality DNA, but not RNA, is available. Here, we provide the framework and resources for MethCORR modeling of ten common cancer types, thereby greatly expanding the possibilities for transcriptional studies of archival FFPE material.

Keywords: Biomarkers; Cancer; DNA methylation; FFPE tissue; Gene expression; Molecular subtypes; RNA sequencing.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA Methylation*
  • Exome Sequencing
  • Formaldehyde
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Neoplasms / diagnosis*
  • Neoplasms / genetics*
  • Paraffin Embedding / methods*
  • Sequence Analysis, RNA / methods*
  • Tissue Fixation / methods*

Substances

  • Formaldehyde