The clock drawing test (CDT) is a neuropsychological assessment tool used to evaluate a patient's cognitive ability. In this study, we developed a Fair and Interpretable Representation of Clock drawing tests (FaIRClocks) to evaluate and mitigate bias against people with lower education when predicting their cognitive status. We represented clock drawings as 10-dimensional latent embeddings using a Relevance Factor Variational Autoencoder (RF-VAE) network pretrained on publicly available clock drawings from the National Health and Aging Trends Study (NHATS) dataset. These embeddings were then fine-tuned to predict three cognitive scores: the Mini-Mental State Examination (MMSE) total score, attention composite z-score (ATT-C), and memory composite z-score (MEM-C). The classifiers were first evaluated to compare their performance in patients with low education (<= 8 years) against patients with higher education (> 8 years). Results indicated that the initial unweighted classifiers confounded lower education with cognitive impairment, resulting in a 100% type I error rate for the low-education group. Therefore, the samples were re-weighted using multiple fairness metrics to achieve balanced performance. In summary, we report the FaIRClocks model, which a) can identify attention and memory deficits using clock drawings and b) exhibits identical performance between people with higher and lower education levels.
Keywords: AI Fairness; Mini-Mental State Examination; Relevance Factor Variational Autoencoder; attention; memory; semi-supervised deep learning.
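To make the fairness evaluation and re-weighting step concrete, the following is a minimal illustrative sketch, not the implementation used in this study. It assumes binary labels (1 = impaired), a binary education group indicator, and one common re-weighting scheme in which each sample is weighted by P(group) * P(label) / P(group, label); the abstract does not specify which scheme or fairness metrics were actually used. The function names and the toy data are hypothetical.

```python
import numpy as np

def group_false_positive_rate(y_true, y_pred, group):
    """Type I error rate (false positives / actual negatives) for each group."""
    rates = {}
    for g in np.unique(group):
        negatives = (group == g) & (y_true == 0)
        # The rate is undefined (NaN) when a group has no actual negatives.
        rates[g.item()] = float(y_pred[negatives].mean()) if negatives.any() else float("nan")
    return rates

def reweighing_weights(y_true, group):
    """Assumed scheme: weight = P(group) * P(label) / P(group, label)."""
    weights = np.empty(len(y_true), dtype=float)
    for g in np.unique(group):
        for y in np.unique(y_true):
            cell = (group == g) & (y_true == y)
            if cell.any():
                weights[cell] = (group == g).mean() * (y_true == y).mean() / cell.mean()
    return weights

# Hypothetical toy data: group 1 = low education, label 1 = impaired.
y_true = np.array([0, 0, 1, 1, 0, 0, 0, 1])
group = np.array([1, 1, 1, 1, 0, 0, 0, 0])
# An unweighted classifier that flags every low-education patient as impaired.
y_pred = np.array([1, 1, 1, 1, 0, 0, 0, 1])

fpr = group_false_positive_rate(y_true, y_pred, group)
weights = reweighing_weights(y_true, group)
print(fpr)
print(weights)
```

In this toy setting every unimpaired low-education patient is misclassified, which mirrors the kind of group-specific type I error the abstract describes. The resulting weights can be passed as per-sample weights when refitting a classifier, so that the weighted label distribution no longer depends on education group.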