Long-read HiFi sequencing correctly assembles repetitive heavy fibroin silk genes in new moth and caddisfly genomes

GigaByte. 2022 Jun 30:2022:gigabyte64. doi: 10.46471/gigabyte.64. eCollection 2022.

Abstract

Insect silk is a versatile biomaterial. Lepidoptera and Trichoptera display some of the most diverse uses of silk, with varying strength, adhesive qualities, and elastic properties. Silk fibroin genes are long (>20 Kbp), with many repetitive motifs that make them challenging to sequence. Most research thus far has focused on conserved N- and C-terminal regions of fibroin genes because a full comparison of repetitive regions across taxa has not been possible. Using the PacBio Sequel II system and SMRT sequencing, we generated high fidelity (HiFi) long-read genomic and transcriptomic sequences for the Indianmeal moth (Plodia interpunctella) and genomic sequences for the caddisfly Eubasilissa regina. Both genomes were highly contiguous (N50 = 9.7 Mbp/32.4 Mbp, L50 = 13/11 for P. interpunctella/E. regina, respectively) and complete (BUSCO complete = 99.3%/95.2%), with complete and contiguous recovery of silk heavy fibroin gene sequences. We show that HiFi long-read sequencing is helpful for understanding genes with long, repetitive regions.
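For readers unfamiliar with the contiguity statistics quoted above, the following is a minimal sketch of how N50 and L50 are commonly computed from a list of contig lengths. It is not the authors' pipeline; the function name `n50_l50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the contig at which the running total of
    contigs, sorted longest first, reaches at least half of the
    assembly size. L50 is the number of contigs needed to get there.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_assembly = sum(lengths) / 2
    running_total = 0
    for count, length in enumerate(lengths, start=1):
        running_total += length
        if running_total >= half_assembly:
            return length, count


# Hypothetical toy assembly (lengths in Mbp), not data from the study.
toy_contigs = [8, 5, 4, 3, 2, 2]
n50, l50 = n50_l50(toy_contigs)
print(f"N50 = {n50} Mbp, L50 = {l50}")
```

A larger N50 together with a smaller L50 indicates that fewer, longer contigs cover half of the assembly, which is the sense in which the two genomes are described as highly contiguous.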

Grants and funding

This study was funded by the Smithsonian National Museum of Natural History Global Genome Initiative (GGI-Peer-2018-182) to TPC, RD, TD, AYK, and by the Smithsonian Museum Conservation Institute Federal and Trust funds to TPC and PBF. A grant from the University of Florida Research Opportunity Seed Fund internal award (number AWD06265) was awarded to principal investigators AYK and CGS. The LOEWE Centre for Translational Biodiversity Genomics (TBG) is funded by the Hessen State Ministry of Higher Education, Research and the Arts (HMWK), which financially supported JH and SUP. SH was supported by National Science Foundation award #OPP-1906015.