Motivation: LINE-1 elements are retrotransposons that are capable of copying their sequence to new genomic loci. LINE-1 derepression is associated with a number of disease states, and has the potential to cause significant cellular damage. Because LINE-1 elements are repetitive, it is difficult to quantify LINE-1 RNA at specific loci and to separate transcripts with protein coding capability from other sources of LINE-1 RNA.
Results: We provide a tool, L1EM that uses the expectation maximization algorithm to quantify LINE-1 RNA at each genomic locus, separating transcripts that are capable of generating retrotransposition from those that are not. We show the accuracy of L1EM on simulated data and against long read sequencing from HEK cells.
Availability and implementation: L1EM is written in python. The source code along with the necessary annotations are available at https://github.com/FenyoLab/L1EM and distributed under GPLv3.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected].