Exon nomenclature and classification of transcripts database (ENACTdb): a resource for analyzing alternative splicing mediated proteome diversity

Paras Verma; Deeksha Thakur; Shashi B Pandit

doi:10.1093/bioadv/vbae157

Exon nomenclature and classification of transcripts database (ENACTdb): a resource for analyzing alternative splicing mediated proteome diversity

Bioinform Adv. 2024 Oct 29;4(1):vbae157. doi: 10.1093/bioadv/vbae157. eCollection 2024.

Authors

Paras Verma¹, Deeksha Thakur¹, Shashi B Pandit¹

Affiliation

¹ Department of Biological Sciences, Indian Institute of Science Education and Research (IISER)-Mohali, Punjab, 140306, India.

Abstract

Motivation: Gene transcripts are distinguished by the composition of their exons, and this different exon composition may contribute to advancing proteome complexity. Despite the availability of alternative splicing information documented in various databases, a ready association of exonic variations to the protein sequence remains a mammoth task.

Results: To associate exonic variation(s) with the protein systematically, we designed the Exon Nomenclature and Classification of Transcripts (ENACT) framework for uniquely annotating exons that tracks their loci in gene architecture context with encapsulating variations in splice site(s) and amino acid coding status. After ENACT annotation, predicted protein features (secondary structure/disorder/Pfam domains) are mapped to exon attributes. Thus, ENACTdb provides trackable exonic variation(s) association to isoform(s) and protein features, enabling the assessment of functional variation due to changes in exon composition. Such analyses can be readily performed through multiple views supported by the server. The exon-centric visualizations of ENACT annotated isoforms could provide insights on the functional repertoire of genes due to alternative splicing and its related processes and can serve as an important resource for the research community.

Availability and implementation: The database is publicly available at https://www.iscbglab.in/enactdb/. It contains protein-coding genes and isoforms for Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens.