Genome Assembly and Annotation of the Trichoplusia ni Tni-FNL Insect Cell Line Enabled by Long-Read Technologies

Genes (Basel). 2019 Jan 23;10(2):79. doi: 10.3390/genes10020079.

Abstract

Background: Trichoplusia ni-derived cell lines are commonly used to enable recombinant protein expression via baculovirus infection to generate materials approved for clinical use and in clinical trials. To develop systems biology and genome engineering tools to improve protein expression in this host, we performed de novo genome assembly of the Trichoplusia ni-derived cell line Tni-FNL.

Methods: By integration of PacBio single-molecule sequencing, Bionano optical mapping, and 10X Genomics linked-reads data, we have produced a draft genome assembly of Tni-FNL.

Results: Our assembly contains 280 scaffolds, with an N50 scaffold size of 2.3 Mb and a total length of 359 Mb. Annotation of the Tni-FNL genome resulted in 14,101 predicted genes, and 93.2% of the predicted proteome contained recognizable protein domains. Ortholog searches within the superorder Holometabola provided further evidence of high accuracy and completeness of the Tni-FNL genome assembly.
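
For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is conventionally computed: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Conventional definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at least
    half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths (arbitrary units), not Tni-FNL data.
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # total is 150; 50 + 40 = 90 >= 75, so N50 is 40
```

On this definition, an N50 of 2.3 Mb means that scaffolds of at least 2.3 Mb account for at least half of the 359 Mb assembly.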

Conclusions: This first draft Tni-FNL genome assembly was enabled by complementary long-read technologies and represents a high-quality, well-annotated genome. It provides novel insight into the complexity of this insect cell line and can serve as a reference for future large-scale genome engineering work in this and other similar recombinant protein production hosts.

Keywords: PacBio single molecule real-time sequencing; Trichoplusia ni; de novo assembly; insect genome; next generation sequencing; optical mapping.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Cell Line
  • Contig Mapping
  • Genome, Insect*
  • High-Throughput Nucleotide Sequencing
  • Insect Proteins / chemistry
  • Insect Proteins / genetics
  • Lepidoptera / cytology
  • Lepidoptera / genetics*
  • Molecular Sequence Annotation*
  • Protein Domains
  • Sequence Analysis, DNA

Substances

  • Insect Proteins