Developing a fair and interpretable representation of the clock drawing test for mitigating low education and racial bias

Sci Rep. 2024 Jul 29;14(1):17444. doi: 10.1038/s41598-024-68481-w.

Abstract

The clock drawing test (CDT) is a neuropsychological assessment tool used to screen an individual's cognitive ability. In this study, we developed a Fair and Interpretable Representation of Clock drawing test (FaIRClocks) to evaluate and mitigate classification bias against people with less than 8 years of education while screening their cognitive function using an array of neuropsychological measures. We represented clock drawings by a previously published 10-dimensional deep learning feature set trained on publicly available data from the National Health and Aging Trends Study (NHATS). These embeddings were further fine-tuned with clocks from a preoperative cognitive screening program at the University of Florida to predict three cognitive scores: the Mini-Mental State Examination (MMSE) total score, an attention composite z-score (ATT-C), and a memory composite z-score (MEM-C). ATT-C and MEM-C scores were developed by averaging z-scores based on normative references. The cognitive screening classifiers were initially tested to compare their performance in patients with low education (≤ 8 years) versus patients with higher education (> 8 years), and across racial groups. Results indicated that the initial unweighted classifiers confounded lower education with cognitive compromise, resulting in a 100% type I error rate for this group. Therefore, the samples were re-weighted using multiple fairness metrics to achieve sensitivity/specificity and positive/negative predictive value (PPV/NPV) balance across groups. In summary, we report the FaIRClocks model, which shows promise for identifying and mitigating bias against people with less than 8 years of education during preoperative cognitive screening.
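
The bias audit and re-weighting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which re-weighting scheme was used, so the sketch assumes a common one (reweighing in the style of Kamiran and Calders, where each sample gets the weight P(group) x P(label) / P(group, label)). The function names group_metrics and reweighing_weights, the group labels "low" and "high", and the ten toy patients are all hypothetical and chosen only to show a group whose unimpaired members are all misclassified.

```python
from collections import Counter


def group_metrics(y_true, y_pred, groups):
    """Per-group sensitivity, specificity, PPV, and NPV (label 1 = impaired)."""
    counts = {}
    for t, p, g in zip(y_true, y_pred, groups):
        cell = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts.setdefault(g, Counter())[cell] += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        g: {
            "sensitivity": ratio(c["tp"], c["tp"] + c["fn"]),
            "specificity": ratio(c["tn"], c["tn"] + c["fp"]),
            "ppv": ratio(c["tp"], c["tp"] + c["fp"]),
            "npv": ratio(c["tn"], c["tn"] + c["fn"]),
        }
        for g, c in counts.items()
    }


def reweighing_weights(y_true, groups):
    """Assumed scheme: weight = P(group) * P(label) / P(group, label)."""
    n = len(y_true)
    group_n = Counter(groups)
    label_n = Counter(y_true)
    joint_n = Counter(zip(groups, y_true))
    return [
        group_n[g] * label_n[y] / (n * joint_n[(g, y)])
        for g, y in zip(groups, y_true)
    ]


# Hypothetical toy data: every low-education patient is flagged as impaired.
groups = ["low"] * 4 + ["high"] * 6
y_true = [1, 0, 0, 0] + [1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1] + [1, 0, 0, 0, 0, 1]

metrics = group_metrics(y_true, y_pred, groups)
print(metrics["low"]["specificity"])   # 0.0: all unimpaired "low" patients are false positives
print(metrics["high"]["specificity"])  # 0.75

weights = reweighing_weights(y_true, groups)
```

In this sketch the weights sum to the sample size and make group membership and label statistically independent in the weighted data. They would typically be passed to a classifier's per-sample weight argument during retraining, after which the per-group metrics are recomputed to check whether the gap has narrowed.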

Keywords: AI Fairness; Attention; Memory; Mini-mental state examination; Relevance factor variational autoencoder; Semi-supervised deep learning.

MeSH terms

  • Aged
  • Aged, 80 and over
  • Cognition / physiology
  • Cognitive Dysfunction / diagnosis
  • Deep Learning
  • Educational Status*
  • Female
  • Humans
  • Male
  • Mental Status and Dementia Tests
  • Middle Aged
  • Neuropsychological Tests
  • Racism*