Mitigating the impact of biased artificial intelligence in emergency decision-making

Hammaad Adam; Aparna Balagopalan; Emily Alsentzer; Fotini Christia; Marzyeh Ghassemi

doi:10.1038/s43856-022-00214-4

Commun Med (Lond). 2022 Nov 21;2(1):149. doi: 10.1038/s43856-022-00214-4.

Authors

Hammaad Adam¹, Aparna Balagopalan², Emily Alsentzer³,⁴,⁵, Fotini Christia⁶,⁷,⁸, Marzyeh Ghassemi²,⁴,⁹

Affiliations

¹ Institute for Data Systems and Society, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
² Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
³ Harvard-MIT Program in Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
⁴ Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
⁵ Division of General Internal Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA.
⁶ Institute for Data Systems and Society, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
⁷ Sociotechnical Systems Research Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
⁸ Department of Political Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
⁹ CIFAR AI Chair, Vector Institute, Toronto, ON, M5G 1M1, Canada.

Abstract

Background: Prior research has shown that artificial intelligence (AI) systems often encode biases against minority subgroups. However, little work has focused on ways to mitigate the harm discriminatory algorithms can cause in high-stakes settings such as medicine.

Methods: In this study, we experimentally evaluated the impact biased AI recommendations have on emergency decisions, where participants respond to mental health crises by calling for either medical or police assistance. We recruited 438 clinicians and 516 non-experts to participate in our web-based experiment. We evaluated participant decision-making with and without advice from biased and unbiased AI systems. We also varied the style of the AI advice, framing it either as prescriptive recommendations or descriptive flags.

Results: Participant decisions are unbiased without AI advice. However, both clinicians and non-experts are influenced by prescriptive recommendations from a biased algorithm, choosing police help more often in emergencies involving African-American or Muslim men. Crucially, using descriptive flags rather than prescriptive recommendations allows respondents to retain their original, unbiased decision-making.

Conclusions: Our work demonstrates the practical danger of using biased models in health contexts, and suggests that appropriately framing decision support can mitigate the effects of AI bias. These findings must be carefully considered in the many real-world clinical scenarios where inaccurate or biased models may be used to inform important decisions.

Plain language summary

Artificial intelligence (AI) systems that make decisions based on historical data are increasingly common in health care settings. However, many AI models exhibit problematic biases, as data often reflect human prejudices against minority groups. In this study, we used a web-based experiment to evaluate the impact biased models can have when used to inform human decisions. We found that though participants were not inherently biased, they were strongly influenced by advice from a biased model if it was offered prescriptively (i.e., “you should do X”). This adherence led their decisions to be biased against African-American and Muslim individuals. However, framing the same advice descriptively (i.e., without recommending a specific action) allowed participants to remain fair. These results demonstrate that though discriminatory AI can lead to poor outcomes for minority groups, appropriately framing advice can help mitigate its effects.