Background: Knowledge of HIV-1 molecular transmission clusters (MTCs), especially from large-scale datasets, is important for designing prevention programmes and public health intervention strategies. We used a large-scale HIV-1 sequence dataset from nine European HIV cohorts and one Canadian cohort to identify MTCs and investigate factors associated with the probability of belonging to an MTC.
Methods: To identify MTCs, we applied maximum likelihood inference to partial pol sequences from 8955 HIV-positive individuals linked to demographic and clinical data. MTCs were defined using two criteria: clusters with bootstrap support >75% (phylogenetic confidence criterion) and clusters in which sequences from a specific region made up >75% of the total number of sequences within the network (geographic criterion). Multivariable logistic regression analysis was used to assess factors associated with belonging to an MTC.
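The two cluster criteria can be illustrated with a minimal sketch. It assumes the criteria are applied jointly and with strict inequalities, which the text above does not spell out; the `Clade` class, the `is_mtc` function and the `min_size` parameter are hypothetical names introduced only for illustration and are not part of the study's pipeline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Clade:
    """Hypothetical container for one candidate cluster from a tree."""
    bootstrap: float       # bootstrap support, in percent
    regions: List[str]     # region label of each sequence in the clade


def is_mtc(clade: Clade,
           min_bootstrap: float = 75.0,
           min_region_share: float = 0.75,
           min_size: int = 2) -> bool:
    """Return True if the clade passes both criteria (assumed to be joint).

    min_size is an assumption: a cluster needs at least two sequences.
    """
    n = len(clade.regions)
    if n < min_size:
        return False
    # Phylogenetic confidence criterion: bootstrap support strictly > 75%.
    if clade.bootstrap <= min_bootstrap:
        return False
    # Geographic criterion: the most common region must account for
    # strictly more than 75% of the sequences in the clade.
    top_count = Counter(clade.regions).most_common(1)[0][1]
    return top_count / n > min_region_share
```

For example, a clade with 90% bootstrap support containing four sequences from one region and one from another (a share of 80%) would pass under these assumptions, whereas a three-to-one split (exactly 75%) would not.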
Results: Although 3700 (41%) sequences belonged to MTCs, proportions differed substantially by country and subtype, ranging from 7% among UK subtype C sequences to 63% among German subtype B sequences. Belonging to an MTC was independently less likely for women than for men (OR = 0.66; P < 0.001), for older individuals (OR = 0.79 per 10-year increase in age; P < 0.001) and for people of non-white ethnicity (OR = 0.44; P < 0.001 and OR = 0.70; P = 0.002 for black and 'other' versus white, respectively). It was more likely among men who have sex with men (MSM) than among other risk groups (OR = 0.62; P < 0.001 and OR = 0.69; P = 0.002 for people who inject drugs and sex between men and women, respectively, versus MSM), for subtype B than for other subtypes (ORs 0.36-0.70 for A, C, CRF01 and CRF02 versus B; all P < 0.05), for individuals with a well-estimated date of seroconversion (OR = 1.44; P < 0.001), for those sampled in a later calendar year (ORs 2.01-2.61 for all post-2002 periods versus pre-2002; all P < 0.01), and for those naïve to antiretroviral therapy at sampling (OR = 1.19; P = 0.010).
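As a reading aid for the odds ratios above, the following sketch shows two standard manipulations under a logistic model: scaling a per-unit odds ratio to several units, and applying an odds ratio to a baseline probability. The function names are hypothetical, and the baseline probability used in the usage note is an arbitrary illustrative value, not an estimate from this study.

```python
def scale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio for a change of `units`, given the OR for one unit.

    Assumes log-odds are linear in the covariate, as in logistic regression.
    """
    return or_per_unit ** units


def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Probability obtained by multiplying the baseline odds by an OR."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)
```

Under these assumptions, an OR of 0.79 per 10-year increase in age corresponds to `scale_odds_ratio(0.79, 2)`, about 0.62, for a 20-year difference. With a purely hypothetical baseline probability of 0.5, `apply_odds_ratio(0.5, 0.66)` gives roughly 0.40, which shows that an odds ratio is not the same as a ratio of probabilities.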
Conclusions: A high proportion (>40%) of individuals belonged to MTCs. Notably, the HIV epidemic dispersal appears to be driven by subtype B viruses spread within MSM networks. Expansion of regional epidemics seems mainly associated with recent MTCs, rather than the growth of older, established ones. This information is important for designing prevention and public health intervention strategies.
Keywords: Clusters; HIV; HIV epidemic; Molecular epidemiology; Phylogenies; Regional epidemics; Transmission networks.