The severe acute respiratory coronavirus-2 (SARS-CoV-2) is the cause of the global outbreak of COVID-19. Evidence suggests that the virus is evolving to allow efficient spread through the human population, including vaccinated individuals. Here, we report a study of viral variants from surveillance of the Delaware Valley, including the city of Philadelphia, and variants infecting vaccinated subjects. We sequenced and analyzed complete viral genomes from 2621 surveillance samples from March 2020 to September 2021 and compared them to genome sequences from 159 vaccine breakthroughs. In the early spring of 2020, all detected variants were of the B.1 and closely related lineages. A mixture of lineages followed, notably including B.1.243 followed by B.1.1.7 (alpha), with other lineages present at lower levels. Later isolations were dominated by B.1.617.2 (delta) and other delta lineages; delta was the exclusive variant present by the last time sampled. To investigate whether any variants appeared preferentially in vaccine breakthroughs, we devised a model based on Bayesian autoregressive moving average logistic multinomial regression to allow rigorous comparison. This revealed that B.1.617.2 (delta) showed 3-fold enrichment in vaccine breakthrough cases (odds ratio of 3; 95% credible interval 0.89-11). Viral point substitutions could also be associated with vaccine breakthroughs, notably the N501Y substitution found in the alpha, beta and gamma variants (odds ratio 2.04; 95% credible interval of1.25-3.18). This study thus overviews viral evolution and vaccine breakthroughs in the Delaware Valley and introduces a rigorous statistical approach to interrogating enrichment of breakthrough variants against a changing background. IMPORTANCE SARS-CoV-2 vaccination is highly effective at reducing viral infection, hospitalization and death. However, vaccine breakthrough infections have been widely observed, raising the question of whether particular viral variants or viral mutations are associated with breakthrough. Here, we report analysis of 2621 surveillance isolates from people diagnosed with COVID-19 in the Delaware Valley in southeastern Pennsylvania, allowing rigorous comparison to 159 vaccine breakthrough case specimens. Our best estimate is a 3-fold enrichment for some lineages of delta among breakthroughs, and enrichment of a notable spike substitution, N501Y. We introduce statistical methods that should be widely useful for evaluating vaccine breakthroughs and other viral phenotypes.
Keywords: COVID-19; Philadelphia; SARS-CoV-2; coronavirus; genome sequencing.