This study aimed to update previous data on HIV-1 integrase variability using bioinformatics methods that combine several statistical measures, from simple entropy and mutation rate to more specific approaches such as the Hellinger distance. A total of 2133 HIV-1 integrase sequences were analyzed: i) 1460 samples from drug-naïve [DN] individuals; ii) 386 samples from drug-experienced but integrase inhibitor (INI)-naïve [IN] individuals; iii) 287 samples from INI-experienced [IE] individuals. Across the three groups, 76 amino acid positions were highly conserved (≤0.2% variation, Hellinger distance: <0.25%), including 35 fully invariant positions, while 80 positions were conserved (>0.2% to <1% variation, Hellinger distance: <1%). The H12-H16-C40-C43 and D64-D116-E152 motifs were all well conserved. Some residues showed marked changes in their mutation distributions, especially between DN and IE samples (Hellinger distance ≥1%). In particular, 15 positions (D6, S24, V31, S39, L74, A91, S119, T122, T124, T125, V126, K160, N222, S230, C280) showed a significantly lower mutation rate in IN and/or IE samples than in DN samples. Conversely, 8 positions showed a significantly higher mutation rate in samples from treated individuals (IN and/or IE) than in DN samples. Some of these positions, such as E92, T97, G140, Y143, Q148 and N155, were already known to be associated with resistance to integrase inhibitors; other positions, including S24, M154, V165 and D270, are not yet documented to be associated with resistance. Our study confirms the high conservation of HIV-1 integrase and identifies highly invariant positions using robust and innovative methods. The role of novel mutations located in the critical region of HIV-1 integrase deserves further investigation.
Keywords: Bioinformatics; Genetic variability; HIV-1; Hellinger distance; Integrase; Mutational rate.
Copyright © 2022 The Authors. Published by Elsevier B.V. All rights reserved.
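As an illustration of the per-position measures named in the abstract (mutation rate, entropy, Hellinger distance), the following is a minimal Python sketch and not the authors' pipeline. It assumes each group is represented by the list of residues observed at one aligned position, and it uses the textbook Hellinger distance on a 0 to 1 scale; the abstract reports distances as percentages, and the exact scaling used in the study is not stated here. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2, sqrt

# The 20 standard amino acids; other symbols (gaps, X) are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_frequencies(residues):
    """Relative frequency of each amino acid at one alignment position."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids observed at this position")
    return {aa: counts.get(aa, 0) / total for aa in AMINO_ACIDS}


def mutation_rate(residues, reference):
    """Fraction of observed residues that differ from the reference residue."""
    observed = [r for r in residues if r in AMINO_ACIDS]
    if not observed:
        raise ValueError("no standard amino acids observed at this position")
    return sum(1 for r in observed if r != reference) / len(observed)


def shannon_entropy(freqs):
    """Shannon entropy (in bits) of a residue frequency distribution."""
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


def hellinger_distance(p, q):
    """Hellinger distance between two distributions (0 = identical, 1 = disjoint)."""
    squared = sum((sqrt(p[aa]) - sqrt(q[aa])) ** 2 for aa in AMINO_ACIDS)
    return sqrt(squared / 2)


if __name__ == "__main__":
    # Toy data (hypothetical): residues seen at one position in two groups.
    dn_residues = ["T"] * 90 + ["A"] * 10
    ie_residues = ["T"] * 50 + ["A"] * 50

    dn_freqs = residue_frequencies(dn_residues)
    ie_freqs = residue_frequencies(ie_residues)

    print("DN mutation rate:", mutation_rate(dn_residues, "T"))
    print("DN entropy (bits):", shannon_entropy(dn_freqs))
    print("Hellinger DN vs IE:", hellinger_distance(dn_freqs, ie_freqs))
```

Unlike the mutation rate, which only counts departures from a reference residue, the Hellinger distance compares the full residue distributions of two groups, so it can also flag positions where the mix of variants shifts even when the overall mutation rate is similar.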