Chromosome scale genome assembly and annotation of coconut cultivar Chowghat Green Dwarf

Sci Rep. 2024 Nov 20;14(1):28778. doi: 10.1038/s41598-024-79768-3.

Abstract

A high-quality genome of coconut (Cocos nucifera L.) is a crucial resource for improving agronomic traits and studying genome evolution within the Arecaceae family. We sequenced the Chowghat Green Dwarf (CGD) cultivar, which is resistant to root (wilt) disease, using Illumina, PacBio, ONT, and Hi-C technologies to produce a chromosome-level genome of ~2.68 Gb with a scaffold N50 of 174 Mb; approximately 97% of the genome (2.62 Gb) could be anchored to 16 pseudo-molecules. In total, 34,483 protein-coding genes were annotated; the BUSCO completeness score was 96.80%, while the k-mer completeness was ~87%. The assembled genome includes 2.19 Gb (81.64%) of repetitive sequences, with long terminal repeats (LTRs) constituting the most abundant class at 53.76%. Our analysis also confirms two whole-genome duplication (WGD) events in the C. nucifera lineage. A genome-wide analysis of LTR insertion times revealed ancient divergence and proliferation of copia and gypsy elements. In addition, 1368 resistance gene analogs (RGAs) were identified in the CGD genome. We also developed a web server, 'Kalpa Genome Resource' (http://210.89.54.198:3000/), to manage and store a comprehensive array of genomic data, including genome sequences, genetic markers, structural and functional annotations such as metabolic pathways, and transcriptomic profiles. The web server has an embedded genome browser for analyzing and visualizing the genome, its genomic elements, and transcriptome data. The built-in BLAST server allows sequence homology searches against the genome, annotated transcriptome, and proteome sequences. The genomic dataset and the database will support comparative genome analysis and can expedite genome-driven breeding and improvement efforts in coconut.
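
The assembly statistics quoted above (scaffold N50 and the fraction of the genome anchored to pseudo-molecules) follow standard definitions, which can be sketched as below. This is a minimal illustration and not the authors' pipeline: the function names `n50` and `anchored_fraction` and the toy scaffold lengths are assumptions made for the example; only the 2.62 Gb and 2.68 Gb figures come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def anchored_fraction(anchored_bp, total_bp):
    """Fraction of the assembly placed on pseudo-molecules."""
    return anchored_bp / total_bp


# Toy scaffold lengths (hypothetical, not from the paper).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_scaffolds)

# Sizes reported in the abstract, in Gb: 2.62 Gb anchored of ~2.68 Gb total.
reported_fraction = anchored_fraction(2.62, 2.68)
print(f"toy N50 = {toy_n50}, anchored fraction = {reported_fraction:.3f}")
```

With the rounded sizes from the abstract, the anchored fraction works out to a little under 98%, in line with the "approximately 97%" stated above; the small difference is expected because the reported sizes are themselves rounded.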

Keywords: Coconut genome; Database; Genome browser; Transposable elements; Web server; Whole genome duplication.

MeSH terms

  • Chromosomes, Plant* / genetics
  • Cocos* / genetics
  • Genome, Plant*
  • Molecular Sequence Annotation*
  • Phylogeny
  • Terminal Repeat Sequences / genetics