Genome-wide detection of cytosine methylations in plant from Nanopore data using deep learning

Nat Commun. 2021 Oct 13;12(1):5976. doi: 10.1038/s41467-021-26278-9.

Abstract

In plants, cytosine DNA methylations (5mCs) can occur in three sequence contexts, CpG, CHG, and CHH (where H = A, C, or T), which play different roles in the regulation of biological processes. Although long Nanopore reads offer advantages over short-read bisulfite sequencing for detecting 5mCs, existing methods can detect 5mCs only in the CpG context, which limits their application in plants. Here, we develop DeepSignal-plant, a deep learning tool that detects genome-wide 5mCs in all three contexts in plants from Nanopore reads. We sequence Arabidopsis thaliana and Oryza sativa using both Nanopore and bisulfite sequencing. We develop a denoising process for training models, which enables DeepSignal-plant to achieve high correlations with bisulfite sequencing for 5mC detection in all three contexts. Furthermore, DeepSignal-plant can profile more 5mC sites, which will help provide a more complete understanding of the epigenetic mechanisms of different biological processes.
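
The three sequence contexts named in the abstract are determined by the one or two bases that follow a cytosine. The following is a minimal Python sketch of that definition only, not part of DeepSignal-plant; the function name `cytosine_context` is hypothetical, and the sketch assumes a forward-strand sequence with 0-based positions.

```python
def cytosine_context(seq, pos):
    """Classify the cytosine at seq[pos] as 'CpG', 'CHG', or 'CHH'.

    Hypothetical helper for illustration. Returns None if seq[pos] is not
    a cytosine, or if the context cannot be determined (sequence end or
    ambiguous bases such as N). Only the forward strand is considered.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        return None
    following = seq[pos + 1 : pos + 3]  # up to two bases after the C
    # CpG: the cytosine is immediately followed by a guanine.
    if following[:1] == "G":
        return "CpG"
    # CHG and CHH both need two unambiguous bases after the cytosine.
    if len(following) < 2 or any(base not in "ACGT" for base in following):
        return None
    # At this point the first following base is H (A, C, or T).
    return "CHG" if following[1] == "G" else "CHH"


if __name__ == "__main__":
    example = "ACGTCAGCTTC"
    for i, base in enumerate(example):
        if base == "C":
            print(i, cytosine_context(example, i))
```

Cytosines on the reverse strand could be handled the same way by applying the function to the reverse complement of the sequence.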

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Arabidopsis / genetics*
  • Arabidopsis / metabolism
  • CpG Islands
  • Cytosine / metabolism*
  • DNA Methylation
  • DNA, Plant / genetics*
  • DNA, Plant / metabolism
  • Deep Learning
  • Epigenesis, Genetic*
  • Genome, Plant*
  • High-Throughput Nucleotide Sequencing / methods
  • Nanopores
  • Oryza / genetics*
  • Oryza / metabolism
  • Sequence Analysis, DNA
  • Sulfites / chemistry

Substances

  • DNA, Plant
  • Sulfites
  • Cytosine
  • hydrogen sulfite