Building a Human Digital Twin (HDTwin) Using Large Language Models for Cognitive Diagnosis: Algorithm Development and Validation

Gina Sprint; Maureen Schmitter-Edgecombe; Diane Cook

doi:10.2196/63866

JMIR Form Res. 2024 Dec 23:8:e63866. doi: 10.2196/63866.

Authors

Gina Sprint^#¹, Maureen Schmitter-Edgecombe², Diane Cook^#²

Affiliations

¹ Department of Computer Science, Gonzaga University, Spokane, WA, United States.
² School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, United States.

^# Contributed equally.

PMID: 39715540

Abstract

Background: Human digital twins have the potential to change the practice of personalizing cognitive health diagnosis because these systems can integrate multiple sources of health information and influence into a unified model. Cognitive health is multifaceted, yet researchers and clinical professionals struggle to align diverse sources of information into a single model.

Objective: This study aims to introduce HDTwin, a method for unifying heterogeneous data using large language models. HDTwin is designed to predict cognitive diagnoses and offer explanations for its inferences.

Methods: HDTwin integrates cognitive health data from multiple sources, including demographic, behavioral, ecological momentary assessment, n-back test, speech, and baseline experimenter testing session markers. Data are converted into text prompts for a large language model. The system then combines these inputs with relevant external knowledge from scientific literature to construct a predictive model. The model's performance is validated using data from 3 studies involving 124 participants, comparing its diagnostic accuracy with baseline machine learning classifiers.
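The step of converting data into text prompts can be illustrated with a minimal sketch. It is not the authors' implementation: the function name markers_to_prompt, the marker names, the participant identifier, all values, and the wording of the closing question are hypothetical, chosen only to show how markers grouped by source might be flattened into text for a large language model. No model is called here.

```python
from typing import Mapping


def markers_to_prompt(
    participant_id: str,
    markers: Mapping[str, Mapping[str, object]],
) -> str:
    """Render markers grouped by data source as plain-text prompt lines.

    Hypothetical helper for illustration; the paper's actual prompt
    format is not given in the abstract.
    """
    lines = [f"Participant {participant_id}"]
    for source, values in markers.items():
        # One heading per data source, then one line per marker.
        lines.append(f"{source}:")
        for name, value in values.items():
            lines.append(f"  - {name}: {value}")
    # Closing question is an assumed wording, not taken from the paper.
    lines.append(
        "Question: What cognitive diagnosis do these markers suggest, "
        "and why?"
    )
    return "\n".join(lines)


# Illustrative values only; these are not data from the study.
example_markers = {
    "demographic": {"age": 72, "education_years": 16},
    "n-back test": {"mean_accuracy": 0.84},
    "ecological momentary assessment": {"mean_self_reported_fatigue": 2.1},
}

prompt = markers_to_prompt("P001", example_markers)
print(prompt)
```

Keeping each source under its own heading is one way to let a prompt carry several heterogeneous sources at once, which is the fusion idea the Methods paragraph describes.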

Results: HDTwin achieved a peak accuracy of 0.81 with automated marker selection, significantly outperforming baseline classifiers. On average, HDTwin yielded accuracy=0.77, precision=0.88, recall=0.63, and Matthews correlation coefficient=0.57. In comparison, the baseline classifiers yielded average accuracy=0.65, precision=0.86, recall=0.35, and Matthews correlation coefficient=0.36. The experiments also showed that HDTwin was more accurate when information sources were fused than when it used a single source. HDTwin's chatbot interface supports interactive dialogue, aiding in diagnosis interpretation and allowing further exploration of patient data.
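For readers less familiar with the four reported metrics, the following sketch shows how each is computed from a binary confusion matrix using the standard definitions. The function name binary_metrics and the example counts are assumptions for illustration; the counts are not the study's confusion matrix, which the abstract does not report.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and MCC from confusion counts.

    A ratio with a zero denominator is reported as 0.0 by convention.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Matthews correlation coefficient: uses all four cells, so it stays
    # informative when the two classes are imbalanced.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; not taken from the paper.
metrics = binary_metrics(tp=5, fp=1, tn=8, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these definitions, similar precision paired with much lower recall, as reported for the baseline classifiers, means that positive predictions are about as reliable but many more true positive cases are missed.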

Conclusions: HDTwin integrates diverse cognitive health data, enhancing the accuracy and explainability of cognitive diagnoses. This approach outperforms traditional models and provides an interface for navigating patient information. The approach shows promise for improving early detection and intervention strategies in cognitive health.

Keywords: artificial intelligence; chatbot; cognitive diagnosis; cognitive health; digital behavior marker; digital twin; health information; human digital twin; interview marker; large language models; machine learning; smartwatch.

©Gina Sprint, Maureen Schmitter-Edgecombe, Diane Cook. Originally published in JMIR Formative Research (https://formative.jmir.org), 23.12.2024.

MeSH terms

Algorithms*
Cognition / physiology
Female
Humans
Male