Within-host diversity improves phylogenetic and transmission reconstruction of SARS-CoV-2 outbreaks

Elife. 2023 Sep 21:12:e84384. doi: 10.7554/eLife.84384.

Abstract

Accurate inference of who infected whom in an infectious disease outbreak is critical for the delivery of effective infection prevention and control. The increased resolution of pathogen whole-genome sequencing has significantly improved our ability to infer transmission events. Despite this, transmission inference often remains limited by the lack of genomic variation between the source case and infected contacts. Although within-host genetic diversity is common among a wide variety of pathogens, conventional whole-genome sequencing phylogenetic approaches exclusively use consensus sequences, which consider only the most prevalent nucleotide at each position and therefore fail to capture low-frequency variation within samples. We hypothesized that including within-sample variation in a phylogenetic model would help to identify who infected whom in instances in which this was previously impossible. Using whole-genome sequences from SARS-CoV-2 multi-institutional outbreaks as an example, we show how within-sample diversity is partially maintained among repeated serial samples from the same host, how it can be transmitted between cases with known epidemiological links, and how this improves phylogenetic inference and our understanding of who infected whom. Our technique is applicable to other infectious diseases and has immediate clinical utility in infection prevention and control.
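The contrast between a consensus sequence and within-sample variation can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the per-position read counts, the function names, and the 5% minor-variant frequency threshold are all assumptions chosen for illustration.

```python
# Illustrative sketch only: toy read counts and a hypothetical 5% threshold,
# not the data or the method used in the study.

def consensus_sequence(sample):
    """Keep only the most prevalent nucleotide at each position."""
    return "".join(max(counts, key=counts.get) for counts in sample)

def within_sample_variants(sample, min_freq=0.05):
    """Return {position: {base: frequency}} for non-consensus bases
    whose frequency is at least min_freq (an assumed cutoff)."""
    variants = {}
    for pos, counts in enumerate(sample):
        total = sum(counts.values())
        major = max(counts, key=counts.get)
        minor = {
            base: n / total
            for base, n in counts.items()
            if base != major and n / total >= min_freq
        }
        if minor:
            variants[pos] = minor
    return variants

# Each sample is a list of per-position base counts (hypothetical values).
sample_a = [{"A": 98, "G": 2}, {"C": 70, "T": 30}]
sample_b = [{"A": 100}, {"C": 100}]

# Both samples collapse to the same consensus sequence...
print(consensus_sequence(sample_a), consensus_sequence(sample_b))
# ...but only sample_a carries a minor variant above the assumed threshold.
print(within_sample_variants(sample_a))
print(within_sample_variants(sample_b))
```

In this toy case the two samples are indistinguishable by consensus sequence, while the minor variant at the second position is information that a consensus-only analysis discards.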

Keywords: SARS-CoV-2; epidemiology; genetics; genomics; global health; infectious disease; phylogenetics; viruses; within-host diversity.

Plain language summary

During an infectious disease outbreak, tracing who infected whom allows public health scientists to see how a pathogen is spreading and to establish effective control measures. Traditionally, this involves identifying the individuals an infected person comes into contact with and monitoring whether they also become unwell. However, this information is not always available and can be inaccurate. One alternative is to track the genetic data of pathogens as they spread. Over time, pathogens accumulate mutations in their genes that can be used to distinguish them from one another. Genetically similar pathogens are more likely to have spread during the same outbreak, while genetically dissimilar pathogens may have come from different outbreaks. However, there are limitations to this approach. For example, some pathogens accumulate genetic mutations very slowly and may not change enough during an outbreak to be distinguishable from one another. Additionally, some pathogens can spread rapidly, leaving less time for mutations to occur between transmission events. To overcome these challenges, Torres Ortiz et al. developed a more sensitive approach to pathogen genetic testing that took advantage of the multiple pathogen populations that often coexist in an infected patient. Rather than tracking only the most dominant genetic version of the pathogen, this method also looked at the less dominant ones. Torres Ortiz et al. performed genome sequencing of SARS-CoV-2 (the virus that causes COVID-19) samples from 451 healthcare workers, patients, and patient contacts at participating London hospitals. Analysis showed that it was possible to detect multiple genetic populations of the virus within individual patients. These subpopulations were often more similar in patients that had been in contact with one another than in those that had not. Tracking the genetic data of all viral populations enabled Torres Ortiz et al. to trace transmission more accurately than if only the dominant population was used. More accurate genetic tracing could help public health scientists better track pathogen transmission and control outbreaks. This may be especially beneficial in hospital settings where outbreaks can be smaller, and it is important to understand if transmission is occurring within the hospital or if the pathogen is imported from the community. Further research will help scientists understand how pathogen population genetics evolve during outbreaks and may improve the detection of subpopulations present at very low frequencies.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19* / epidemiology
  • Communicable Diseases* / epidemiology
  • Disease Outbreaks
  • Humans
  • Phylogeny
  • SARS-CoV-2 / genetics