Background: High-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP) allows high-resolution, genome-wide mapping of the binding sites of RNA-binding proteins. This methodology is frequently used to validate predicted targets of microRNA binding, as well as direct targets of other RNA-binding proteins. Hence, the accuracy and sensitivity of binding site identification are critical.
Results: We found that substantial mispriming during reverse transcription results in the overrepresentation of sequences complementary to the primer used for reverse transcription. Up to 45% of peaks in publicly available HITS-CLIP libraries are attributable to this mispriming artifact, and the majority of libraries have detectable levels of mispriming. We also found that standard techniques for validating microRNA-target interactions fail to differentiate between artifactual peaks and physiologically relevant peaks.
Conclusions: Here, we present a modification to the HITS-CLIP protocol that effectively eliminates this artifact and improves the sensitivity and complexity of the resulting libraries.
Keywords: CLIP-seq; HITS-CLIP; PAR-CLIP; iCLIP; microRNA.