Enhancing dementia risk screening with GAN-synthesized periodontal examination and general blood test data

Katsunori Oyama; Toshiki Isogai; Yohei Nakayama; Ryoki Kobayashi; Daisuke Kitano; Kenji Karako; Kaoru Sakatani

doi:10.3389/fneur.2024.1379916

Front Neurol. 2024 Aug 14:15:1379916. doi: 10.3389/fneur.2024.1379916. eCollection 2024.

Authors

Katsunori Oyama¹, Toshiki Isogai², Yohei Nakayama³,⁴, Ryoki Kobayashi³,⁵, Daisuke Kitano⁶, Kenji Karako⁷, Kaoru Sakatani⁷,⁸

Affiliations

¹ Department of Computer Science, College of Engineering, Nihon University, Koriyama, Japan.
² Graduate School of Computer Science, Nihon University, Koriyama, Japan.
³ Research Institute of Oral Science, Nihon University School of Dentistry at Matsudo, Matsudo, Japan.
⁴ Department of Periodontology, Nihon University School of Dentistry at Matsudo, Matsudo, Japan.
⁵ Department of Infection and Immunology, Nihon University School of Dentistry at Matsudo, Matsudo, Japan.
⁶ Division of Cardiology, Department of Medicine, Nihon University School of Medicine, Itabashi, Japan.
⁷ Department of Human and Engineered Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan.
⁸ Institute of Gerontology, The University of Tokyo, Bunkyo, Japan.

Abstract

Introduction: This study aimed to investigate the effectiveness of data augmentation in improving dementia risk prediction with machine learning models. Recent studies have shown that basic blood tests are cost-effective in predicting cognitive function. However, developing models that address various conditions is challenging because of constraints associated with blood test results and cognitive assessments, including high costs, limited sample sizes, and missing data from tests not performed in certain facilities. Although often limited by small sample sizes, periodontal examination data have also emerged as a cost-effective screening tool.

Methods: To address these challenges, this study explored the effectiveness of data augmentation using the Synthetic Minority Over-sampling Technique for Regression with Gaussian noise (SMOGN), a Generative Adversarial Network (GAN), and a Conditional Tabular GAN (CTGAN) on periodontal examination and blood test data. The datasets included parameters such as cognitive assessment results from the Mini-Mental State Examination (MMSE), demographic characteristics, periodontal examination data, and blood test results. Linear regression models, random forests, and deep neural networks were used to evaluate the effectiveness of the synthesized data.
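As a rough illustration of what oversampling tabular regression data involves, the following minimal sketch interpolates between a measured row and its nearest neighbor and then adds Gaussian noise. It is a simplified stand-in under stated assumptions, not the study's pipeline: it does not reproduce SMOGN itself, the GAN, or the CTGAN, and the function names (`nearest_neighbor`, `augment`), the `noise_sd` parameter, the column layout, and the toy values are all hypothetical.

```python
import math
import random


def nearest_neighbor(idx, rows):
    """Return the index of the row closest (Euclidean) to rows[idx], excluding itself."""
    best, best_d = None, math.inf
    for j, other in enumerate(rows):
        if j == idx:
            continue
        d = math.dist(rows[idx], other)
        if d < best_d:
            best, best_d = j, d
    return best


def augment(rows, n_new, noise_sd=0.05, seed=0):
    """Create n_new synthetic rows by interpolating between a randomly chosen
    row and its nearest neighbor, then adding Gaussian noise to every column
    (features and target alike). Features are not rescaled here, so distances
    are dominated by the column with the largest range."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(rows))
        j = nearest_neighbor(i, rows)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            [a + t * (b - a) + rng.gauss(0.0, noise_sd)
             for a, b in zip(rows[i], rows[j])]
        )
    return synthetic


# Toy rows, invented for illustration only:
# (age, hypothetical blood marker, MMSE score)
measured = [
    [62.0, 4.1, 29.0],
    [70.0, 3.9, 27.0],
    [75.0, 3.8, 26.0],
    [81.0, 3.5, 23.0],
    [90.0, 3.2, 20.0],
]

synthetic = augment(measured, n_new=20)
print(len(synthetic), "synthetic rows generated")
```

Interpolation of this kind can only fill gaps between existing rows; generative models such as GANs instead learn a sampling distribution from the measured data, which is the distinction the study evaluates.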

Results: This study used measured data from 108 participants together with synthesized data generated from those measurements. External validity was evaluated using a separate dataset of 41 participants with missing items. The results suggested that standard GANs offer the advantage of greater data diversity for investigating models, whereas CTGANs preserve the data structure and linear relationships of the measured tabular data, which substantially improves linear regression models.

Discussion: Importantly, because the synthesized data interpolated sparse regions of the input distributions, such as age, models trained on these data maintained prediction accuracy for test data with extreme inputs. These findings suggest that GAN-synthesized data can effectively address regression problems and improve dementia risk prediction.

Keywords: blood test; cognitive function; deep learning; generative adversarial networks; periodontal examination.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported in part by a JSPS Grant-in-Aid for Scientific Research (B), grant number JP23K25233, and by Nihon University Research Grants for 2020.