Multiple approaches for massively parallel sequencing of SARS-CoV-2 genomes directly from clinical samples

Genome Med. 2020 Jun 30;12(1):57. doi: 10.1186/s13073-020-00751-4.

Abstract

Background: COVID-19 (coronavirus disease 2019) has caused a major epidemic worldwide; however, much remains unknown about the epidemiology and evolution of the virus, partly because few full-length SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) genomes have been reported. One reason is that the challenges of sequencing SARS-CoV-2 directly from clinical samples have not been fully addressed; in particular, sequencing samples with low viral load often yields too few viral reads for analysis.

Methods: We applied novel multiplex PCR amplicon-based (amplicon) and hybrid capture-based (capture) sequencing, as well as ultra-high-throughput metatranscriptomic (meta) sequencing, to retrieve complete genomes and inter-individual and intra-individual variations of SARS-CoV-2 from serial dilutions of a cultured isolate and from eight clinical samples covering a range of sample types and viral loads. We also comprehensively examined and compared the sensitivity, accuracy, and other characteristics of these approaches.

Results: We demonstrated that both the amplicon and capture methods efficiently enriched SARS-CoV-2 content from clinical samples, while the enrichment efficiency of amplicon exceeded that of capture in more challenging samples. We found that capture was not as accurate as meta and amplicon in identifying between-sample variations, whereas the amplicon method was not as accurate as the other two in investigating within-sample variations, suggesting that amplicon sequencing is not suitable for studying virus-host interactions and viral transmission, which rely heavily on intra-host dynamics. We illustrated that meta uncovered rich genetic information in the clinical samples besides SARS-CoV-2, providing references for clinical diagnostics and therapeutics. Taking all of the above factors and cost-effectiveness into consideration, we proposed guidance on how to choose a sequencing strategy for SARS-CoV-2 under different situations.

Conclusions: This is, to the best of our knowledge, the first work to systematically investigate inter- and intra-individual variations of SARS-CoV-2 using amplicon- and capture-based whole-genome sequencing, as well as the first comparative study among multiple approaches. Our work offers practical solutions for genome sequencing and analyses of SARS-CoV-2 and other emerging viruses.

Keywords: COVID-19; Emerging infectious diseases; Genomic surveillance; Hybrid capture; Metatranscriptomic sequencing; Multiplex PCR; Quasispecies; Virus evolution; iSNV.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Betacoronavirus / genetics*
  • COVID-19
  • Coronavirus Infections
  • Genetic Variation / genetics
  • Genome, Viral / genetics*
  • High-Throughput Nucleotide Sequencing / methods*
  • Host-Pathogen Interactions / genetics
  • Humans
  • Multiplex Polymerase Chain Reaction / methods
  • Pandemics
  • Pneumonia, Viral
  • RNA, Viral / genetics
  • SARS-CoV-2
  • Whole Genome Sequencing / methods*

Substances

  • RNA, Viral