SARS-CoV-2 Quasispecies Provides an Advantage Mutation Pool for the Epidemic Variants

Microbiol Spectr. 2021 Sep 3;9(1):e0026121. doi: 10.1128/Spectrum.00261-21. Epub 2021 Aug 4.

Abstract

Quasispecies dynamics give RNA viruses considerable adaptability in cell tropism and host range. To study the quasispecies features and intrahost evolution of SARS-CoV-2, we collected samples from nine patients with confirmed infection and sequenced the haplotypes of the spike gene on a single-molecule real-time platform. Fourteen samples were obtained from sputum, nasopharyngeal swabs, or stool, and together they yielded 283,655 high-quality circular consensus sequences. We observed a stable quasispecies structure consisting of one master mutant (mean abundance ∼0.70) followed by numerous minor mutants (mean abundance ∼1.21 × 10⁻³). Under high selective pressure, minor mutants may gain a fitness advantage and become master mutants. The D614G substitution, which later became predominant, was already present among the minor mutants of more than one early patient. An epidemic variant could therefore have originated independently in multiple hosts. The mutant spectra covered ∼85% of the amino acid variations in public genomes (GISAID; frequency ≥ 0.1) and likely provided a pool of advantageous mutations for current and future epidemic variants. Notably, 32 of the 35 antibody escape substitutions we collected were already present in the early quasispecies. Virus populations in different tissues or organs showed signs of potentially independent replication. The quasispecies complexity of sputum samples was significantly lower than that of nasopharyngeal swabs (P = 0.02). Evolutionary analysis revealed that three contiguous S2 domains (HR1, CH, and CD) had undergone positive selection. Cell fusion-related domains may play a crucial role in adaptation to the intrahost immune system. Our findings suggest that future epidemiologic investigations and clinical interventions should take into account the quasispecies information that is missed by a routine single consensus genome.

IMPORTANCE The RNA virus population in a host does not consist of a single consensus haplotype but rather of an ensemble of related sequences termed a quasispecies. Quasispecies dynamics give SARS-CoV-2 considerable capacity for genetic adaptation during intrahost evolution. This adaptation is likely achieved by changing the genetic characteristics of key functional genes, such as the gene encoding the spike glycoprotein. Previous studies applied next-generation sequencing (NGS) to evaluate the quasispecies of SARS-CoV-2, and their results indicated low genetic diversity in the spike gene. However, NGS platforms cannot obtain full-length haplotypes directly without assembly, and they also have difficulty identifying extremely low-frequency variations. We therefore used single-molecule real-time sequencing to obtain the haplotypes of the RNA population directly and to further study the quasispecies features and intrahost evolution of the spike gene.
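
The quantities reported above (master mutant abundance, mean minor mutant abundance, and quasispecies complexity) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not say which complexity metric was used, so Shannon entropy is assumed here as one common choice, and the function name `quasispecies_summary` and its input format (one haplotype string per read) are hypothetical.

```python
from collections import Counter
from math import log


def quasispecies_summary(haplotypes):
    """Summarize a sample given one haplotype sequence per read.

    Hypothetical helper: Shannon entropy is an assumed complexity
    measure, not necessarily the one used in the study.
    """
    if not haplotypes:
        raise ValueError("need at least one haplotype read")

    counts = Counter(haplotypes)
    total = len(haplotypes)

    # The most abundant haplotype is treated as the master mutant;
    # every other distinct haplotype is a minor mutant.
    ranked = counts.most_common()
    master_seq, master_count = ranked[0]
    minor_abundances = [count / total for _, count in ranked[1:]]

    # Shannon entropy over haplotype frequencies (natural log).
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    # Normalize by the maximum entropy for this number of haplotypes.
    max_entropy = log(len(counts)) if len(counts) > 1 else 0.0

    return {
        "master": master_seq,
        "master_abundance": master_count / total,
        "mean_minor_abundance": (
            sum(minor_abundances) / len(minor_abundances)
            if minor_abundances else 0.0
        ),
        "n_haplotypes": len(counts),
        "shannon_entropy": entropy,
        "normalized_entropy": entropy / max_entropy if max_entropy else 0.0,
    }


# Toy input (made-up reads, not data from the study).
toy_reads = ["AAA"] * 7 + ["AAT"] * 2 + ["ATA"]
summary = quasispecies_summary(toy_reads)
```

In this toy sample the master haplotype `AAA` accounts for 7 of 10 reads. Real circular consensus data would first need quality filtering and alignment, which this sketch leaves out.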

Keywords: COVID-19; SARS-CoV-2; quasispecies; spike gene.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adult
  • Aged
  • Base Sequence
  • COVID-19 / virology
  • Child
  • Epidemics*
  • Female
  • Genome, Viral
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Male
  • Middle Aged
  • Mutation*
  • Quasispecies*
  • SARS-CoV-2 / classification*
  • SARS-CoV-2 / genetics*
  • Spike Glycoprotein, Coronavirus / genetics

Substances

  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2