KISSPLICE: de-novo calling alternative splicing events from RNA-seq data

Gustavo A T Sacomoto; Janice Kielbassa; Rayan Chikhi; Raluca Uricaru; Pavlos Antoniou; Marie-France Sagot; Pierre Peterlongo; Vincent Lacroix

doi:10.1186/1471-2105-13-S6-S5

BMC Bioinformatics. 2012 Apr 19;13 Suppl 6(Suppl 6):S5. doi: 10.1186/1471-2105-13-S6-S5.

Authors

Gustavo A T Sacomoto¹, Janice Kielbassa, Rayan Chikhi, Raluca Uricaru, Pavlos Antoniou, Marie-France Sagot, Pierre Peterlongo, Vincent Lacroix

Affiliation

¹ INRIA Grenoble Rhône-Alpes, France.

Abstract

Background: In this paper, we address the problem of identifying and quantifying polymorphisms in RNA-seq data when no reference genome is available, without assembling the full transcripts. Based on the fundamental idea that each polymorphism corresponds to a recognisable pattern in a De Bruijn graph constructed from the RNA-seq reads, we propose a general model for all polymorphisms in such graphs. We then introduce an exact algorithm, called KISSPLICE, to extract alternative splicing events.
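
Illustration (not part of the published abstract): the idea that a polymorphism leaves a recognisable pattern in a De Bruijn graph can be sketched on toy data. The Python sketch below builds a De Bruijn graph from two made-up sequences, one including a short segment and one skipping it, and reports the "bubble" where the two paths diverge and later rejoin. The function names build_de_bruijn and find_bubbles, the sequences, the value k = 4 and the max_len guard are all assumptions chosen for illustration. This is not the KISSPLICE algorithm: it ignores reverse complements, sequencing errors, coverage and the graph compression a real tool would need.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def find_bubbles(graph, max_len=50):
    """Return (source, sink, paths) for branches that leave one node and rejoin.

    Each branch is followed only through unambiguous nodes (one way in,
    one way out); max_len is a hypothetical guard against cycles.
    """
    indegree = defaultdict(int)
    for succs in graph.values():
        for succ in succs:
            indegree[succ] += 1

    bubbles = []
    for source, succs in list(graph.items()):
        if len(succs) < 2:
            continue  # not a branching node
        ends = defaultdict(list)
        for first in sorted(succs):
            path, node = [first], first
            while (indegree[node] == 1
                   and len(graph.get(node, ())) == 1
                   and len(path) < max_len):
                node = next(iter(graph[node]))
                path.append(node)
            ends[node].append(path)
        for sink, paths in ends.items():
            if len(paths) >= 2 and indegree[sink] >= 2:
                bubbles.append((source, sink, paths))
    return bubbles


# Toy "reads": the same flanking sequence with and without a middle segment.
inclusion = "ATGGC" + "TTAAC" + "GTCCA"
exclusion = "ATGGC" + "GTCCA"
reads = [inclusion, exclusion]

bubbles = find_bubbles(build_de_bruijn(reads, k=4))
for source, sink, paths in bubbles:
    print(source, sink, [len(p) for p in paths])
```

In this toy case the two paths between the branching node and the merging node have different lengths, which is the kind of asymmetry one would expect when one variant contains a segment that the other lacks.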

Results: We show that KISSPLICE identifies more correct events than general-purpose transcriptome assemblers. Additionally, on a dataset of 71 million reads from human brain and liver tissues, KISSPLICE identified 3497 alternative splicing events, of which 56% are not present in the annotations. This confirms recent estimates showing that the complexity of alternative splicing has been largely underestimated so far.

Conclusions: We propose new models and algorithms for the detection of polymorphisms in RNA-seq data. This opens the way to a new kind of study on large HTS RNA-seq datasets, where the focus is not the global reconstruction of full-length transcripts, but the local assembly of polymorphic regions. KISSPLICE is available for download at http://alcovna.genouest.org/kissplice/.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Alternative Splicing*
Genome
Humans
Models, Statistical*
Polymorphism, Single Nucleotide
Reference Standards
Sequence Analysis, RNA*
Tandem Repeat Sequences
Transcriptome