Mudskipper detects combinatorial RNA binding protein interactions in multiplexed CLIP data

Cell Genom. 2024 Jul 10;4(7):100603. doi: 10.1016/j.xgen.2024.100603. Epub 2024 Jul 1.

Abstract

Uncovering protein-RNA interactions enables a deeper understanding of RNA processing. Recent multiplexed crosslinking and immunoprecipitation (CLIP) technologies such as antibody-barcoded eCLIP (ABC) dramatically increase the throughput of mapping RNA binding protein (RBP) binding sites. However, multiplexed CLIP datasets are multivariate, and the signal-to-noise ratio varies from one RBP to another. To address this, we developed Mudskipper, a versatile computational suite comprising two components: a Dirichlet multinomial mixture model that accounts for the multivariate nature of ABC datasets, and a softmasking approach that identifies and removes non-specific protein-RNA interactions for RBPs with a low signal-to-noise ratio. Mudskipper demonstrates superior precision and recall over existing tools on multiplexed datasets and supports analysis of repetitive elements and small non-coding RNAs. Our findings reveal splicing outcomes and variant-associated disruptions, enabling higher-throughput investigations into diseases and regulation mediated by RBPs.
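As an illustration of the general idea behind a Dirichlet multinomial mixture, the sketch below scores one genomic window's read counts (one count per multiplexed RBP) against a set of mixture components and returns the posterior weight of each component. It is a minimal sketch of the textbook model, not Mudskipper's implementation: the function names, the concentration parameters, and the mixture weights are all hypothetical, and parameter fitting is omitted.

```python
import math


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    counts: reads per RBP in one window (hypothetical input layout).
    alpha:  positive concentration parameters, one per RBP.
    """
    n = sum(counts)
    a0 = sum(alpha)
    # Multinomial coefficient: n! / prod(x_k!)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalizing constants
    log_norm = math.lgamma(a0) - math.lgamma(n + a0)
    log_terms = sum(
        math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alpha)
    )
    return log_coef + log_norm + log_terms


def responsibilities(counts, components, weights):
    """Posterior probability of each mixture component for one window."""
    log_post = [
        math.log(w) + dirichlet_multinomial_logpmf(counts, alpha)
        for w, alpha in zip(weights, components)
    ]
    # Subtract the maximum before exponentiating for numerical stability
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]


# Illustrative values only: three RBPs, two components.
# The first component favors RBP 0; the second is a flat background.
components = [[10.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
weights = [0.5, 0.5]
window_counts = [30, 2, 3]
posterior = responsibilities(window_counts, components, weights)
```

In this toy setup a window dominated by reads from the first RBP is assigned mostly to the component whose concentration parameters favor that RBP, which is the sense in which a mixture over count vectors can separate RBP-specific windows from background.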

Keywords: CLIP; RNA; RNA-binding proteins; deep learning; gene regulation; splicing; transcriptomics; variant interpretation.

MeSH terms

  • Binding Sites
  • Computational Biology / methods
  • Humans
  • Immunoprecipitation / methods
  • Protein Binding
  • RNA / genetics
  • RNA / metabolism
  • RNA-Binding Proteins* / genetics
  • RNA-Binding Proteins* / metabolism
  • Software

Substances

  • RNA-Binding Proteins
  • RNA