Prevention of adverse HIV treatment outcomes: machine learning to enable proactive support of people at risk of HIV care disengagement in Tanzania

Zhongming Xie; Huiyu Hu; Jillian L Kadota; Laura J Packel; Matilda Mlowe; Sylvester Kwilasa; Werner Maokola; Siraji Shabani; Amon Sabasaba; Prosper F Njau; Jingshen Wang; Sandra I McCoy

doi:10.1136/bmjopen-2024-088782

BMJ Open. 2024 Sep 24;14(9):e088782. doi: 10.1136/bmjopen-2024-088782.

Affiliations

¹ School of Public Health, University of California, Berkeley, California, USA.
² School of Public Health, University of California, Berkeley, California, USA.
³ Health for a Prosperous Nation, Dar es Salaam, United Republic of Tanzania.
⁴ United Republic of Tanzania Ministry of Health, Dodoma, United Republic of Tanzania.

Abstract

Objectives: This study aimed to develop a machine learning (ML) model to predict disengagement from HIV care, high viral load or death among people living with HIV (PLHIV) with the goal of enabling proactive support interventions in Tanzania. The algorithm addressed common challenges when applying ML to electronic medical record (EMR) data: (1) imbalanced outcome distribution; (2) heterogeneity across multisite EMR data and (3) evolving virological suppression thresholds.

Design: Observational study using a national EMR database.

Setting: Conducted in two regions in Tanzania, using data from the National HIV Care database.

Participants: The study included over 6 million HIV care visit records from 295 961 PLHIV in two regions in Tanzania's National HIV Care database from January 2015 to May 2023.

Results: Our ML model effectively identified PLHIV at increased risk of adverse outcomes. Key predictors included past disengagement from care, antiretroviral therapy (ART) status (which tracks a patient's engagement with ART across visits), age and time on ART. The downsampling approach we implemented managed the imbalanced outcome distribution and reduced prediction bias. Site-specific algorithms outperformed a universal approach, highlighting the importance of tailoring ML models to local contexts. A sensitivity analysis confirmed the model's robustness to changes in viral load suppression thresholds.
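
The abstract does not specify how downsampling was carried out. As a minimal sketch of the general idea only, the Python below randomly discards majority-class records until the two outcome classes are the same size. The 1:1 ratio, the function name downsample_majority, the field names patient_id and adverse_outcome, and the toy data are all assumptions made for illustration, not details taken from the study.

```python
import random
from collections import Counter


def downsample_majority(records, label_key="adverse_outcome", seed=0):
    """Randomly drop majority-class records until both classes are equal in size.

    Assumes a binary label (0 or 1) stored under `label_key` in each record.
    The 1:1 target ratio is an illustrative choice, not the study's setting.
    """
    rng = random.Random(seed)
    positives = [r for r in records if r[label_key] == 1]
    negatives = [r for r in records if r[label_key] == 0]
    # Order the two groups by size so the smaller one is the minority class.
    minority, majority = sorted((positives, negatives), key=len)
    # Sample without replacement from the majority class.
    kept = rng.sample(majority, k=len(minority))
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 5 records with an adverse outcome, 95 without.
visits = [{"patient_id": i, "adverse_outcome": 1 if i < 5 else 0} for i in range(100)]
balanced = downsample_majority(visits)
counts = Counter(v["adverse_outcome"] for v in balanced)
print(len(balanced), counts[0], counts[1])
```

In a sketch like this, downsampling would normally be applied to the training split only, so that evaluation still reflects the true outcome prevalence.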

Conclusions: ML models leveraging large-scale databases of patient data offer significant potential to identify PLHIV for interventions to enhance engagement in HIV care in resource-limited settings. Tailoring algorithms to local contexts and flexibility towards evolving clinical guidelines are essential for maximising their impact.

Keywords: HIV & AIDS; electronic health records; machine learning.

Publication types

Observational Study

MeSH terms

Adolescent
Adult
Algorithms
Anti-HIV Agents / therapeutic use
Electronic Health Records*
Female
HIV Infections* / drug therapy
Humans
Machine Learning*
Male
Middle Aged
Tanzania / epidemiology
Treatment Outcome
Viral Load
Young Adult

Substances

Anti-HIV Agents

Grants and funding

R01 MH125746/MH/NIMH NIH HHS/United States