Positional motif analysis reveals the extent of specificity of protein-RNA interactions observed by CLIP

Klara Kuret; Aram Gustav Amalietti; D Marc Jones; Charlotte Capitanchik; Jernej Ule

doi:10.1186/s13059-022-02755-2

Genome Biol. 2022 Sep 9;23(1):191. doi: 10.1186/s13059-022-02755-2.

Authors

Klara Kuret¹,²,³, Aram Gustav Amalietti¹,³, D Marc Jones³,⁴, Charlotte Capitanchik³,⁴, Jernej Ule⁵,⁶,⁷

Affiliations

¹ National Institute of Chemistry, Hajdrihova 19, SI-1001, Ljubljana, Slovenia.
² Jozef Stefan International Postgraduate School, Jamova cesta 39, 1000, Ljubljana, Slovenia.
³ The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK.
⁴ UK Dementia Research Institute, King's College London, London, UK.
⁵ National Institute of Chemistry, Hajdrihova 19, SI-1001, Ljubljana, Slovenia.
⁶ The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK.
⁷ UK Dementia Research Institute, King's College London, London, UK.

Abstract

Background: Crosslinking and immunoprecipitation (CLIP) is a method used to identify in vivo RNA-protein binding sites on a transcriptome-wide scale. With the increasing amounts of available data for RNA-binding proteins (RBPs), it is important to understand to what degree the enriched motifs specify the RNA-binding profiles of RBPs in cells.

Results: We develop positionally enriched k-mer analysis (PEKA), a computational tool for efficient analysis of enriched motifs from individual CLIP datasets, which minimizes the impact of technical and regional genomic biases by internal data normalization. We cross-validate PEKA with mCross and show that the use of input control for background correction is not required to yield high specificity of enriched motifs. We identify motif classes with common enrichment patterns across eCLIP datasets and across RNA regions, while also observing variations in the specificity and the extent of motif enrichment across eCLIP datasets, between variant CLIP protocols, and between CLIP and in vitro binding data. Thereby, we gain insights into the contributions of technical and regional genomic biases to the enriched motifs, and find how motif enrichment features relate to the domain composition and low-complexity regions of the studied proteins.

Conclusions: Our study provides insights into the overall contributions of regional binding preferences, protein domains, and low-complexity regions to the specificity of protein-RNA interactions, and shows the value of cross-motif and cross-RBP comparison for data interpretation. Our results are presented for exploratory analysis via an online platform in an RBP-centric and motif-centric manner (https://imaps.goodwright.com/apps/peka/).

Keywords: CLIP; Low-complexity region; Protein-RNA interaction; RNA motif; RNA-binding protein; RNA-binding specificity; k-mer.
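
Illustrative sketch (not part of the published abstract, and not PEKA's implementation): the Results above describe scoring k-mers by their positional enrichment, with normalization against the dataset itself rather than an input control. The Python below shows one simple way such a score could be computed. It assumes that equal-length sequence windows centred on crosslink sites are already available, with a foreground set and a background set both drawn from the same CLIP dataset. The function names, the pseudocount, and the max-over-positions score are hypothetical choices made for this illustration only.

```python
from collections import defaultdict


def positional_kmer_freq(windows, k):
    """Return freq[kmer][pos]: fraction of windows with kmer starting at pos."""
    n = len(windows)
    freq = defaultdict(lambda: defaultdict(float))
    for seq in windows:
        for pos in range(len(seq) - k + 1):
            freq[seq[pos:pos + k]][pos] += 1.0 / n
    return freq


def max_positional_enrichment(foreground, background, k, pseudo=1e-3):
    """Score each foreground k-mer by its highest per-position frequency,
    relative to its mean per-position frequency in the background windows.

    Assumption: all windows have the same length and are aligned on the
    crosslink site, so position indices are comparable across windows.
    """
    fg = positional_kmer_freq(foreground, k)
    bg = positional_kmer_freq(background, k)
    n_pos = len(foreground[0]) - k + 1
    scores = {}
    for kmer, by_pos in fg.items():
        # Mean background frequency across all positions (0 if never seen).
        bg_mean = sum(bg[kmer].values()) / n_pos if kmer in bg else 0.0
        scores[kmer] = max(
            (f + pseudo) / (bg_mean + pseudo) for f in by_pos.values()
        )
    return scores


# Toy example: "UGU" sits at the same offset in every foreground window.
foreground = ["AAUGUAA", "CCUGUCC", "GGUGUGG"]
background = ["ACGUACG", "GUACGUA", "CAGUCAG"]
scores = max_positional_enrichment(foreground, background, k=3)
top_kmer = max(scores, key=scores.get)
```

In this toy input the k-mer shared at a fixed offset across foreground windows receives the highest score, while a k-mer that is common in the background windows (here "GUA") is scored much lower. A real analysis would also need to handle window selection, regional stratification, and statistical significance, none of which is attempted here.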

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Binding Sites
Genomics*
Immunoprecipitation
Protein Domains
RNA*

Substances

RNA
