Full-Length Transcriptome Analysis of Plasmodium falciparum by Single-Molecule Long-Read Sequencing

Front Cell Infect Microbiol. 2021 Feb 23:11:631545. doi: 10.3389/fcimb.2021.631545. eCollection 2021.

Abstract

Malaria, an infectious disease caused by Plasmodium parasites, has continued to cause a large number of deaths annually in recent decades. Despite the significance of Plasmodium falciparum as a model organism for malaria parasites, our understanding of gene expression in this parasite remains limited, because much of the progress on its genome and transcriptome is based on assembly from short sequencing reads. Herein, we report a new version of the transcriptome dataset containing all full-length transcripts across the whole asexual blood stages, obtained by adopting a full-length sequencing approach with optimized experimental conditions for cDNA library preparation. We identified a total of 393 alternative splicing (AS) events, 3,623 long non-coding RNAs (lncRNAs), 1,555 alternative polyadenylation (APA) events, 57 transcription factors (TFs), and 1,721 fusion transcripts in P. falciparum. Furthermore, shotgun proteomic analysis was performed to validate the full-length transcriptome of P. falciparum. More importantly, integration of the full-length transcriptomic and proteomic data identified 160 novel small proteins in lncRNA regions. Collectively, this high-quality, accurate full-length transcriptome dataset and the shotgun proteome analyses shed light on the complex gene expression in malaria parasites and provide a valuable resource for related functional and mechanistic research on P. falciparum genes.

Keywords: Plasmodium falciparum; alternative splicing; full-length RNA-seq; long non-coding RNA; small protein.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing
  • Gene Expression Profiling
  • High-Throughput Nucleotide Sequencing*
  • Plasmodium falciparum* / genetics
  • Proteomics
  • Transcriptome