Machine learning-based classification of valvular heart disease using cardiovascular risk factors

Sci Rep. 2024 Oct 17;14(1):24396. doi: 10.1038/s41598-024-67973-z.

Abstract

Valvular heart disease (VHD) is a globally significant cause of mortality, particularly among aging populations. Despite advances in percutaneous and surgical interventions, it remains uncertain which cardiovascular risk factors contribute significantly to this condition. This study investigates these uncertainties and the role of machine learning in classifying VHD based on cardiovascular risk factors. The investigation has two phases: feature extraction and classification. Feature extraction is first performed using a wrapper approach and then refined with binary logistic regression. The classification phase employs five classifiers, Artificial Neural Network (ANN), XGBoost, Random Forest (RF), Naïve Bayes, and Support Vector Machine (SVM), along with advanced methods such as SVM combined with Principal Component Analysis (PCA) and a majority-voting ensemble method (MV5). Data on VHD cases were collected from DHQ Hospital Faisalabad using simple random sampling. Several performance measures, including the ROC curve, F-measure, sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC), and Kappa, are used to assess the results. The findings show that the combination of SVM with PCA achieves the highest overall performance, while the MV5 ensemble method also demonstrates high accuracy and a balance between sensitivity and specificity. The variation in VHD prevalence linked to specific risk factors highlights the importance of a comprehensive approach to reducing the burden of this disease. The exceptional performance of SVM + PCA and MV5 highlights their value in diagnosing VHD and advancing knowledge in biomedicine.
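
As an illustration only, the majority-voting step and several of the reported measures can be sketched in plain Python. The abstract does not describe how MV5 was implemented, so this is a minimal sketch that assumes a simple unweighted vote over the five classifiers' binary predictions and assumes that "Kappa" refers to Cohen's kappa. The function names `majority_vote` and `binary_metrics` are hypothetical, and the labels are made-up toy values, not study data.

```python
from collections import Counter
from math import sqrt


def majority_vote(predictions):
    """Combine label lists (one list per classifier) by unweighted majority.

    With five binary voters there can be no ties.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def binary_metrics(y_true, y_pred):
    """Confusion-matrix measures for labels coded 1 (VHD) and 0 (no VHD)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / n
    pr_sum = precision + sensitivity
    f_measure = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0

    # Matthews correlation coefficient
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "mcc": mcc,
        "kappa": kappa,
    }


# Toy ground truth and toy predictions from five hypothetical classifiers
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
classifier_preds = [
    [1, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 0, 1, 1],
]

mv5_pred = majority_vote(classifier_preds)
metrics = binary_metrics(y_true, mv5_pred)
print(mv5_pred)
print(metrics)
```

An odd number of voters is a convenient choice for binary classification because every sample receives a strict majority, so no tie-breaking rule is needed.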

Keywords: Bioinformatics; Cardiovascular; Machine learning; Majority voting; Risk factors; Valvular heart disease.

MeSH terms

  • Adult
  • Aged
  • Bayes Theorem
  • Female
  • Heart Disease Risk Factors
  • Heart Valve Diseases*
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Neural Networks, Computer
  • Principal Component Analysis
  • ROC Curve
  • Risk Factors
  • Support Vector Machine*