Exon nomenclature and classification of transcripts database (ENACTdb): a resource for analyzing alternative splicing mediated proteome diversity

Bioinform Adv. 2024 Oct 29;4(1):vbae157. doi: 10.1093/bioadv/vbae157. eCollection 2024.

Abstract

Motivation: Gene transcripts are distinguished by the composition of their exons, and this different exon composition may contribute to advancing proteome complexity. Despite the availability of alternative splicing information documented in various databases, a ready association of exonic variations to the protein sequence remains a mammoth task.

Results: To associate exonic variation(s) with the protein systematically, we designed the Exon Nomenclature and Classification of Transcripts (ENACT) framework for uniquely annotating exons that tracks their loci in gene architecture context with encapsulating variations in splice site(s) and amino acid coding status. After ENACT annotation, predicted protein features (secondary structure/disorder/Pfam domains) are mapped to exon attributes. Thus, ENACTdb provides trackable exonic variation(s) association to isoform(s) and protein features, enabling the assessment of functional variation due to changes in exon composition. Such analyses can be readily performed through multiple views supported by the server. The exon-centric visualizations of ENACT annotated isoforms could provide insights on the functional repertoire of genes due to alternative splicing and its related processes and can serve as an important resource for the research community.

Availability and implementation: The database is publicly available at https://www.iscbglab.in/enactdb/. It contains protein-coding genes and isoforms for Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens.