Whole Genome Sequencing of the Pirarucu (Arapaima gigas) Supports Independent Emergence of Major Teleost Clades

Genome Biol Evol. 2018 Sep 1;10(9):2366-2379. doi: 10.1093/gbe/evy130.

Abstract

The Pirarucu (Arapaima gigas) is one of the world's largest freshwater fishes and a member of the superorder Osteoglossomorpha (bonytongues), one of the oldest lineages of ray-finned fishes. This species is an obligate air-breather found in the Amazon River basin and has attractive potential for aquaculture. Its phylogenetic position among bony fishes makes the Pirarucu a relevant subject for evolutionary studies of early teleost diversification. Here, we present, for the first time, a draft genome of A. gigas, providing useful information for further functional and evolutionary studies. The A. gigas genome was assembled from 103 Gb of raw reads sequenced on an Illumina platform. The final draft genome assembly was ∼661 Mb, with a contig N50 of 51.23 kb and a scaffold N50 of 668 kb. Repeat sequences accounted for 21.69% of the whole genome, and a total of 24,655 protein-coding genes were predicted from the genome assembly, with an average of nine exons per gene. Phylogenomic analysis based on 24 fish species supported the hypothesis that Osteoglossomorpha and Elopomorpha (eels, tarpons, and bonefishes) are sister groups, together forming a sister lineage to Clupeocephala (the remaining teleosts). Divergence time estimates suggested that the Osteoglossomorpha and Elopomorpha lineages emerged independently within a period of ∼30 Myr in the Jurassic. The draft genome of A. gigas provides a valuable genetic resource for further evolutionary investigations and may also offer valuable data for economic applications.
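For readers unfamiliar with the assembly statistics quoted above, the following minimal sketch shows how an N50 value is conventionally computed: the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size. The function names and the toy contig lengths are hypothetical illustrations, not data or code from the study. The depth estimate simply divides the reported raw read volume by the reported assembly size, and assumes the assembly size approximates the genome size.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to half the total
        # defines the N50.
        if running >= half_total:
            return length


def raw_depth(raw_bases, assembly_size):
    """Approximate raw sequencing depth as total bases / assembly size."""
    return raw_bases / assembly_size


# Hypothetical toy contig lengths (not from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print("toy N50:", n50(toy_contigs))

# Figures reported in the abstract: 103 Gb of raw reads, ~661 Mb assembly.
depth = raw_depth(103e9, 661e6)
print(f"approximate raw depth: {depth:.0f}x")
```

With the abstract's figures, the raw read volume corresponds to roughly 156-fold depth before any read filtering, which the abstract does not describe.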

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Biological Evolution
  • Evolution, Molecular
  • Female
  • Fishes / genetics*
  • Genome
  • Genome Size
  • Male
  • Molecular Sequence Annotation
  • Multigene Family
  • Phylogeny
  • Repetitive Sequences, Nucleic Acid
  • Whole Genome Sequencing