Bias in machine learning applications to address non-communicable diseases at a population-level: a scoping review

Sharon Birdi; Roxana Rabet; Steve Durant; Atushi Patel; Tina Vosoughi; Mahek Shergill; Christy Costanian; Carolyn P Ziegler; Shehzad Ali; David Buckeridge; Marzyeh Ghassemi; Jennifer Gibson; Ava John-Baptiste; Jillian Macklin; Melissa McCradden; Kwame McKenzie; Sharmistha Mishra; Parisa Naraei; Akwasi Owusu-Bempah; Laura Rosella; James Shaw; Ross Upshur; Andrew D Pinto

doi:10.1186/s12889-024-21081-9

BMC Public Health. 2024 Dec 28;24(1):3599. doi: 10.1186/s12889-024-21081-9.

Authors

Sharon Birdi¹, Roxana Rabet¹, Steve Durant¹, Atushi Patel¹, Tina Vosoughi¹, Mahek Shergill¹,², Christy Costanian¹, Carolyn P Ziegler³, Shehzad Ali⁴,⁵,⁶, David Buckeridge⁷, Marzyeh Ghassemi⁸, Jennifer Gibson⁹, Ava John-Baptiste¹⁰, Jillian Macklin¹,¹¹, Melissa McCradden¹²,¹³,¹⁴, Kwame McKenzie¹⁵,¹⁶, Sharmistha Mishra¹⁷,¹⁸,¹⁹,²⁰,²¹, Parisa Naraei²², Akwasi Owusu-Bempah²³, Laura Rosella¹²,²⁴,²⁵,²⁶, James Shaw²⁷, Ross Upshur²⁸,¹²,⁹, Andrew D Pinto²⁹,³⁰,³¹,³²

Affiliations

¹ Upstream Lab, MAP Centre for Urban Health Solutions, Li Ka Shing Knowledge Institute, Unity Health Toronto, 30 Bond Street, Toronto, ON, M5B 1W8, Canada.
² Michael G. DeGroote School of Medicine, McMaster University, Hamilton, ON, Canada.
³ Library Services, Unity Health Toronto, St. Michael's Hospital, Toronto, ON, Canada.
⁴ Department of Epidemiology and Biostatistics, Western Centre for Public Health & Family Medicine, Western University, London, ON, Canada.
⁵ Division of Epidemiology, Dalla Lana School of Public Health, Toronto, ON, Canada.
⁶ Department of Laboratory Medicine and Pathobiology, Temerty Faculty of Medicine, Toronto, ON, Canada.
⁷ Department of Epidemiology, Biostatistics and Occupational Health, School of Population and Global Health, McGill University, Montreal, QC, Canada.
⁸ Department of Electrical Engineering and Computer Science (EECS) and Institute for Medical Engineering & Science (IMES), MIT, Cambridge, MA, USA.
⁹ Joint Centre for Bioethics, University of Toronto, Toronto, ON, Canada.
¹⁰ Departments of Epidemiology & Biostatistics, Anesthesia & Perioperative Medicine, Schulich Interfaculty Program in Public Health, Western University, London, ON, Canada.
¹¹ Undergraduate Medical Education, Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
¹² Division of Clinical Public Health, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
¹³ Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada.
¹⁴ Genetics & Genome Biology, SickKids Research Institute, Toronto, ON, Canada.
¹⁵ Wellesley Institute, Toronto, ON, Canada.
¹⁶ CAMH, Toronto, ON, Canada.
¹⁷ Division of Infectious Diseases, Department of Medicine, Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
¹⁸ MAP Centre for Urban Health Solutions, Li Ka Shing Knowledge Institute, Unity Health Toronto, Toronto, ON, Canada.
¹⁹ Institute of Medical Science, Faculty of Medicine, University of Toronto, Toronto, Canada.
²⁰ Institute of Health Policy, Management and Evaluation, Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada.
²¹ ICES, Toronto, ON, Canada.
²² Department of Computer Science, Toronto Metropolitan University, Toronto, ON, Canada.
²³ Department of Sociology, Faculty of Arts & Sciences, University of Toronto, Toronto, ON, Canada.
²⁴ Institute for Better Health, Trillium Health Partners, Toronto, ON, Canada.
²⁵ Department of Health Sciences, University of York, York, UK.
²⁶ WHO Collaborating Centre for Knowledge Translation and Health Technology Assessment in Health Equity, Ottawa Centre for Health Equity, Ottawa, ON, Canada.
²⁷ Department of Physical Therapy, Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
²⁸ Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
²⁹ Upstream Lab, MAP Centre for Urban Health Solutions, Li Ka Shing Knowledge Institute, Unity Health Toronto, 30 Bond Street, Toronto, ON, M5B 1W8, Canada.
³⁰ Department of Family and Community Medicine, St. Michael's Hospital, Toronto, ON, Canada.
³¹ Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
³² Division of Clinical Public Health, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.

Abstract

Background: Machine learning (ML) is increasingly used in population and public health to support epidemiological studies, surveillance, and evaluation. Our objective was to conduct a scoping review to identify studies that use ML in population health, with a focus on its use in non-communicable diseases (NCDs). We also examined potential algorithmic biases in model design, training, and implementation, as well as efforts to mitigate these biases.

Methods: We searched the peer-reviewed, indexed literature up to March 2022 in Medline, Embase, the Cochrane Central Register of Controlled Trials, the Cochrane Database of Systematic Reviews, CINAHL, Scopus, the ACM Digital Library, Inspec, and Web of Science (Science Citation Index, Social Sciences Citation Index, and Emerging Sources Citation Index).

Results: The search identified 27 310 studies, of which 65 were included. Study aims were separated into algorithm comparison (n = 13, 20%) or disease modelling for population-health-related outputs (n = 52, 80%). We extracted data on NCD type, data sources, technical approach, possible algorithmic bias, and jurisdiction. Type 2 diabetes was the most studied NCD. The most common use of ML was for risk modelling. Bias mitigation was not extensively addressed, and most of the methods reported focused on mitigating sex-related bias.

Conclusion: This review examines current applications of ML in NCDs, highlighting potential biases and strategies for mitigation. Future research should focus on communicable diseases and the transferability of ML models in low- and middle-income settings. Our findings can guide the development of guidelines for the equitable use of ML to improve population health outcomes.

Keywords: Artificial intelligence; Machine learning; Non-communicable disease; Population health.

Publication types

Review
Systematic Review

MeSH terms

Algorithms
Bias*
Humans
Machine Learning*
Noncommunicable Diseases* / epidemiology
Noncommunicable Diseases* / prevention & control
Population Health
