Comparison of cardiovascular risk prediction models developed using machine learning based on data from a Sri Lankan cohort with World Health Organization risk charts for predicting cardiovascular risk among Sri Lankans: a cohort study

Chamila Mettananda; Maheeka Solangaarachchige; Prasanna Haddela; Anuradha Supun Dassanayake; Anuradhani Kasturiratne; Rajitha Wickremasinghe; Norihiro Kato; Hithanadura Janaka de Silva

doi:10.1136/bmjopen-2023-081434

BMJ Open. 2025 Jan 15;15(1):e081434. doi: 10.1136/bmjopen-2023-081434.

Authors

Chamila Mettananda¹, Maheeka Solangaarachchige² ³, Prasanna Haddela³, Anuradha Supun Dassanayake⁴, Anuradhani Kasturiratne⁵, Rajitha Wickremasinghe⁵, Norihiro Kato⁶, Hithanadura Janaka de Silva⁷

Affiliations

¹ Department of Pharmacology, University of Kelaniya Faculty of Medicine, Ragama, Western, Sri Lanka.
² Examination Unit, University of Kelaniya Faculty of Medicine, Ragama, Western, Sri Lanka.
³ Department of Information Technology, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka.
⁴ Department of Pharmacology, University of Kelaniya Faculty of Medicine, Ragama, Western, Sri Lanka.
⁵ Department of Public Health, University of Kelaniya Faculty of Medicine, Ragama, Sri Lanka.
⁶ Gene Diagnostics and Therapeutics, National Center for Global Health and Medicine Research Institute, Shinjuku-ku, Tokyo, Japan.
⁷ Department of Medicine, University of Kelaniya Faculty of Medicine, Ragama, Sri Lanka.

PMID: 39819943
DOI: 10.1136/bmjopen-2023-081434

Abstract

Introduction: Models derived from non-Sri Lankan cohorts are used for cardiovascular (CV) risk stratification of Sri Lankans.

Objective: To develop a CV risk prediction model using machine learning (ML) based on data from a Sri Lankan cohort followed up for 10 years, and to compare the predictions with WHO risk charts.

Design: Cohort study.

Setting: The model was developed using the Ragama Health Study (RHS), an ongoing, prospective, population-based cohort study of patients randomly selected from the Ragama Medical Officer of Health area, Sri Lanka, which focuses on the epidemiology of non-communicable diseases. The external validation cohort included patients admitted to Colombo North Teaching Hospital (CNTH), a tertiary care hospital in Sri Lanka, from January 2019 through August 2020.

Participants: All RHS participants aged 40-64 years in 2007, without cardiovascular disease (CVD) at baseline, who had complete 10-year outcome data by 2017, were included in model development. Patients aged 40-74 years admitted to CNTH during the study period, either with an incident CV event (CVE) or with a disease other than an acute CVE, who had complete data for CVD risk calculation, were used for external validation of the model.

Methods: Using the follow-up data of the cohort, we developed two ML models for predicting 10-year CV risk: one using six conventional CV risk variables (age, gender, smoking status, systolic blood pressure, history of diabetes, and total cholesterol level) and one using all available variables (n=75). The ML models were derived using supervised-learning classification algorithms. We compared the predictive performance of our ML models with WHO risk charts (2019, Southeast Asia) using the area under the receiver operating characteristic curve (AUC-ROC) and calibration plots. We validated the 6-variable model in an external hospital-based cohort.
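The two evaluation measures named above can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: the function names `auc_roc` and `hosmer_lemeshow`, the default of 10 risk groups, and the toy data are all assumptions introduced here. AUC-ROC is computed as the proportion of event/non-event pairs in which the event case received the higher predicted risk, and the Hosmer-Lemeshow statistic compares observed and expected event counts within groups ordered by predicted risk.

```python
from math import fsum


def auc_roc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties in predicted risk count as half a correctly ranked pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(scores, labels, groups=10):
    """Hosmer-Lemeshow chi-square over groups ordered by predicted risk.

    Each group contributes (observed - expected)^2 / (n * p * (1 - p)),
    where p is the mean predicted risk in that group.
    """
    pairs = sorted(zip(scores, labels), key=lambda t: t[0])
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = fsum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        # Skip degenerate groups whose mean predicted risk is 0 or 1.
        if 0 < expected < size:
            chi2 += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return chi2


# Toy data (hypothetical, not from the study): predicted risks and outcomes.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]

print(auc_roc(toy_scores, toy_labels))                    # 13 of 16 pairs ranked correctly
print(hosmer_lemeshow(toy_scores, toy_labels, groups=2))  # near 0: observed matches expected
```

An AUC-ROC of 0.5 corresponds to ranking event and non-event cases no better than chance, and a small Hosmer-Lemeshow χ² (large p value) indicates no detected departure from good calibration.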

Results: Among the 2596 participants in the baseline cohort, 179 incident CVEs were observed over 10 years. WHO risk charts predicted only 10 CVEs (AUC-ROC: 0.51, 95% CI 0.42 to 0.60), while the new 6-variable ML model predicted 125 CVEs (AUC-ROC: 0.72, 95% CI 0.66 to 0.78) and the 75-variable ML model predicted 124 CVEs (AUC-ROC: 0.74, 95% CI 0.68 to 0.80). Calibration results (Hosmer-Lemeshow test) for the 6-variable ML model and the WHO risk charts were χ²=12.85 (p=0.12) and χ²=15.58 (p=0.05), respectively. In the external validation cohort, the 6-variable ML model had a sensitivity of 70.3%, specificity of 94.9%, positive predictive value of 87.3%, negative predictive value of 86.6% and calibration of χ²=8.22 (p=0.41); the corresponding values for the WHO risk charts were 23.7%, 79.0%, 35.8%, 67.7% and χ²=81.94 (p<0.0001).
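As a reading aid, the sketch below shows how the reported quantities relate to each other. Two assumptions are made here that the abstract does not state: the Hosmer-Lemeshow tests are taken to have 8 degrees of freedom (the conventional 10 risk groups minus 2), and the 2x2 counts passed to `diagnostic_metrics` are invented purely to illustrate the formulas, since the abstract reports percentages only. Both function names are hypothetical.

```python
from math import exp


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{i<df/2} (x/2)^i / i!,
    which is valid only for a positive even number of degrees of freedom.
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # events correctly flagged
        "specificity": tn / (tn + fp),  # non-events correctly cleared
        "ppv": tp / (tp + fp),          # flagged cases that had an event
        "npv": tn / (tn + fn),          # cleared cases that had no event
    }


# Reported chi-square values, evaluated under the ASSUMED df of 8.
for chi2 in (12.85, 15.58, 8.22):
    print(chi2, round(chi2_sf_even_df(chi2, 8), 2))  # 0.12, 0.05, 0.41

# Hypothetical counts (not the study's data), chosen only to show the arithmetic.
example = diagnostic_metrics(tp=70, fp=10, fn=30, tn=190)
print(example)
```

Under the assumed 8 degrees of freedom, the computed p values round to the figures given in the abstract, which suggests, but does not confirm, that 10 risk groups were used.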

Conclusions: ML-based models derived from a cohort of Sri Lankans improved the overall accuracy of CV-risk prediction compared with the WHO risk charts for this cohort of Southeast Asians.

Keywords: Cardiac Epidemiology; Preventive Medicine; Primary Prevention; Risk management.

Publication types

Comparative Study

MeSH terms

Adult
Aged
Cardiovascular Diseases* / epidemiology
Cohort Studies
Female
Heart Disease Risk Factors
Humans
Machine Learning*
Male
Middle Aged
Prospective Studies
Risk Assessment / methods
Risk Factors
Sri Lanka / epidemiology
World Health Organization