Humans can be infected sequentially by different strains of the same virus. Estimating the prevalence of such 'superinfection' for a particular pathogen is vital because superinfection implies a failure of immunologic memory against a given virus despite past exposure, which may signal challenges for future vaccine development. Increasingly, viral deep sequencing and phylogenetic inference can discriminate distinct strains within a host. Yet a population-level study may misrepresent the true prevalence of superinfection for several reasons. First, certain infections such as herpes simplex virus-2 (HSV-2) reactivate only single strains, making multiple samples necessary to detect superinfection. Second, the number of samples collected in a study may be fewer than the actual number of independently acquired strains within a single person. Third, detecting strains that are relatively less abundant can be difficult, even for other infections such as HIV-1, where deep sequencing may identify multiple strains simultaneously. Here we develop a model of superinfection inspired by ecology. We define an infected individual's richness as the number of infecting strains and use ecological evenness to quantify the relative strain abundances. The model uses an expectation-maximization (EM) method to infer the true prevalence of superinfection from limited clinical datasets. Simulation studies with known true prevalence are used to compare our EM method with a standard (naive) calculation. Varying richness, evenness and sampling, we quantify the accuracy and precision of our method. The EM method outperforms the naive calculation in all cases, particularly when sampling is low and richness or unevenness is high. We also consider the sensitivity of the method to our assumptions about clinical data. The simulation studies also provide insight into optimal study designs; estimates of prevalence improve equally whether by enrolling more participants or by gathering more samples per person.
Finally, we apply our method to data from published studies of HSV-2 and HIV-1 superinfection.
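To illustrate why a naive calculation can understate prevalence, the following is a minimal sketch and not the EM method itself. It assumes a simplified setting that is not taken from the studies above: every superinfected person has richness 2, each sample reveals exactly one strain (as with HSV-2 reactivation), and samples are independent draws with probability `p_major` of showing the more abundant strain. The function names and parameter values are hypothetical.

```python
import random


def detect_prob(p_major: float, n_samples: int) -> float:
    """Probability that both strains appear among n_samples single-strain samples.

    Assumes richness 2 with relative abundances p_major and 1 - p_major.
    Detection fails only if every sample shows the same strain.
    """
    return 1.0 - p_major ** n_samples - (1.0 - p_major) ** n_samples


def naive_prevalence(true_prev: float, p_major: float, n_samples: int,
                     n_people: int, rng: random.Random) -> float:
    """Simulate a cohort and return the naive estimate:
    the fraction of people in whom two distinct strains were observed."""
    detected = 0
    for _ in range(n_people):
        if rng.random() >= true_prev:
            continue  # single-strain infection: can never be counted as superinfected
        # Each sample shows strain 0 (major) or strain 1 (minor).
        seen = {0 if rng.random() < p_major else 1 for _ in range(n_samples)}
        if len(seen) == 2:
            detected += 1
    return detected / n_people


if __name__ == "__main__":
    rng = random.Random(1)
    true_prev, p_major, n_samples = 0.20, 0.90, 2  # hypothetical values
    estimate = naive_prevalence(true_prev, p_major, n_samples, 20_000, rng)
    expected = true_prev * detect_prob(p_major, n_samples)
    print(f"true prevalence:      {true_prev:.3f}")
    print(f"naive estimate:       {estimate:.3f}")
    print(f"expected naive value: {expected:.3f}")
```

Under these assumptions the naive estimate is expected to equal the true prevalence multiplied by `detect_prob`, so it shrinks as sampling decreases or as the strain abundances become more uneven, which matches the regimes in which a corrected estimator matters most.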
Keywords: HIV; HSV; ecology; expectation maximization; mathematical modelling; superinfection.
© 2018 The Author(s).