Single-molecule real-time sequencing identifies massive full-length cDNAs and alternative-splicing events that facilitate comparative and functional genomics study in the hexaploid crop sweet potato

PeerJ. 2019 Nov 15:7:e7933. doi: 10.7717/peerj.7933. eCollection 2019.

Abstract

Background: Sweet potato (Ipomoea batatas (L.) Lam.) is one of the most important crops in many developing countries and a candidate source of bioenergy. However, neither a complete reference genome nor large-scale full-length cDNA sequences are available for this outcrossing hexaploid crop, which impedes progress in I. batatas functional genomics and molecular breeding.

Methods: In this study, we sequenced the full-length transcriptomes of I. batatas and its diploid ancestor I. trifida using single-molecule real-time sequencing and Illumina second-generation sequencing. Using the resulting datasets, we conducted comprehensive intraspecific and interspecific sequence analyses and experimental characterization.

Results: A total of 53,861/51,184 high-quality long-read transcripts were obtained, which covered about 10,439/10,452 loci in the I. batatas/I. trifida genome. These datasets enabled us to predict open reading frames in 96.83%/96.82% of transcripts and to identify 34,963/33,637 full-length cDNA sequences, 1,401/1,457 transcription factors, 25,315/27,090 simple sequence repeats, 1,656/1,389 long non-coding RNAs, and 5,251/8,901 alternative splicing events. Approximately 32.34%/38.54% of transcripts and 46.22%/51.18% of multi-exon transcripts underwent alternative splicing in I. batatas/I. trifida. Moreover, we validated one alternative splicing event in each of 10 genes and identified isoforms expressed specifically in tuberous roots for a starch-branching enzyme, an alpha-glucan phosphorylase, a neutral invertase, and several ABC transporters. Overall, the large-scale long-read transcripts collected and analyzed in this study will serve as a valuable resource for the I. batatas research community and may accelerate progress in structural, functional, and comparative genomics studies of this crop.

Keywords: Alternative splicing; Comparative transcriptome analysis; Single-molecule real-time sequencing; Sweet potato.

Grants and funding

This study was jointly supported by the National Natural Science Foundation of China (Grant No. 31771855), the National Sweet Potato Industry and Research System (Grant No. CARS-11-B-02), startup funding from Fujian Agriculture and Forestry University, and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.