Background: The COVID-19 pandemic poses a serious threat to global health, and pathogenic mutations are a major challenge to disease control. We developed a statistical framework to explore the association between molecular-level mutation activity of SARS-CoV-2 and population-level disease transmissibility of COVID-19.
Methods: We estimated the instantaneous transmissibility of COVID-19 using the time-varying reproduction number (Rt). We quantified the mutation activity of SARS-CoV-2 empirically from (i) the prevalence of emerged amino acid substitutions and (ii) the frequency of these substitutions across the whole sequence. Using a likelihood-based approach, we developed a statistical framework to examine the association between mutation activity and Rt. We used COVID-19 surveillance data from California as a demonstration.
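As an illustration of how Rt can be computed from surveillance counts, the following is a minimal sketch of a renewal-equation ratio estimator, Rt = I_t / sum_s w_s * I_{t-s}. This is one common approach and not necessarily the estimator used in this study; the function name `estimate_rt`, the case counts, and the serial-interval weights are all hypothetical.

```python
def estimate_rt(incidence, serial_weights):
    """Ratio estimator: R_t = I_t / sum_{s>=1} w_s * I_{t-s}.

    incidence      -- list of daily case counts (hypothetical data)
    serial_weights -- discretized serial-interval distribution,
                      serial_weights[0] is the weight for a 1-day lag
    """
    rt = []
    for t in range(len(incidence)):
        # Infection pressure: past cases weighted by the serial interval.
        pressure = sum(
            w * incidence[t - s]
            for s, w in enumerate(serial_weights, start=1)
            if t - s >= 0
        )
        # Rt is undefined when there is no past incidence to infect from.
        rt.append(incidence[t] / pressure if pressure > 0 else None)
    return rt


# Hypothetical daily case counts and serial-interval weights (sum to 1).
cases = [10, 12, 15, 18, 22, 26, 31]
weights = [0.2, 0.5, 0.3]
rt_series = estimate_rt(cases, weights)
```

The first entry of `rt_series` is `None` because no earlier cases exist; in practice, the early values are usually discarded until the full serial-interval window is covered.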
Results: We found a significant positive association between population-level COVID-19 transmissibility and the D614G substitution in the SARS-CoV-2 spike protein. We estimated that each 0.01 increase in the prevalence of glycine (G) at codon 614 was associated with a 0.49% (95% CI: 0.39 to 0.59) increase in Rt, which explained 61% of the Rt variation after accounting for control measures. The modeling framework can be extended to study other infectious pathogens.
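To make the effect size concrete, the sketch below converts the reported per-0.01 estimate into a multiplicative change in Rt for a larger shift in G614 prevalence. It assumes the 0.49% increase compounds multiplicatively across successive 0.01 steps (a log-linear relationship); the abstract does not state the functional form, so this is an illustrative assumption, and the function name `rt_ratio` is hypothetical.

```python
# Reported estimates: % increase in Rt per 0.01 increase in G614 prevalence.
PCT_POINT = 0.49
PCT_CI = (0.39, 0.59)
STEP = 0.01


def rt_ratio(delta_prevalence, pct_per_step=PCT_POINT):
    """Multiplicative change in Rt for a given change in prevalence.

    ASSUMPTION: the per-step percentage effect compounds across steps
    (log-linear model); this is not confirmed by the source.
    """
    steps = delta_prevalence / STEP
    return (1 + pct_per_step / 100) ** steps


# Illustrative shift of 0.5 in prevalence, with the CI bounds carried through.
point = rt_ratio(0.5)
lower, upper = (rt_ratio(0.5, p) for p in PCT_CI)
```

Under this assumption, `point`, `lower`, and `upper` give the Rt ratio implied by the point estimate and the two confidence limits; extrapolating far beyond the observed prevalence range should be treated with caution.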
Conclusions: Our findings show a link between the molecular-level mutation activity of SARS-CoV-2 and the population-level transmission of COVID-19, and provide further evidence for a positive association between the D614G substitution and Rt. Future studies exploring the mechanisms linking SARS-CoV-2 mutations and COVID-19 infectivity are warranted.
Keywords: COVID-19; Mutation; Spike protein; Statistical modeling; Transmission.