Interpretable prediction models for widespread m6A RNA modification across cell lines and tissues

Bioinformatics. 2023 Dec 1;39(12):btad709. doi: 10.1093/bioinformatics/btad709.

Abstract

Motivation: RNA N6-methyladenosine (m6A) in Homo sapiens plays vital roles in a variety of biological functions. Precise identification of m6A modifications is thus essential to elucidation of their biological functions and underlying molecular-level mechanisms. Currently available high-throughput single-nucleotide-resolution m6A modification data considerably accelerated the identification of RNA modification sites through the development of data-driven computational methods. Nevertheless, existing methods have limitations in terms of the coverage of single-nucleotide-resolution cell lines and have poor capability in model interpretations, thereby having limited applicability.

Results: In this study, we present CLSM6A, comprising a set of deep learning-based models designed for predicting single-nucleotide-resolution m6A RNA modification sites across eight different cell lines and three tissues. Extensive benchmarking experiments are conducted on well-curated datasets and accordingly, CLSM6A achieves superior performance than current state-of-the-art methods. Furthermore, CLSM6A is capable of interpreting the prediction decision-making process by excavating critical motifs activated by filters and pinpointing highly concerned positions in both forward and backward propagations. CLSM6A exhibits better portability on similar cross-cell line/tissue datasets, reveals a strong association between highly activated motifs and high-impact motifs, and demonstrates complementary attributes of different interpretation strategies.

Availability and implementation: The webserver is available at http://csbio.njust.edu.cn/bioinf/clsm6a. The datasets and code are available at https://github.com/zhangying-njust/CLSM6A/.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenosine / genetics
  • Adenosine / metabolism
  • Humans
  • Nucleotides*
  • RNA* / metabolism
  • Sequence Analysis, RNA / methods

Substances

  • RNA
  • Nucleotides
  • Adenosine