QSAR Regression Models for Predicting HMG-CoA Reductase Inhibition

Pharmaceuticals (Basel). 2024 Oct 30;17(11):1448. doi: 10.3390/ph17111448.

Abstract

Background/objectives: HMG-CoA reductase is an enzyme that regulates the initial stage of cholesterol synthesis, and its inhibitors are widely used in the treatment of cardiovascular diseases.

Methods: We developed a set of quantitative structure-activity relationship (QSAR) models for human HMG-CoA reductase inhibitors, using nested cross-validation as the primary validation method. To build the QSAR models, we combined various machine learning regression algorithms, feature selection methods, and molecular fingerprint or descriptor datasets.
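
To make the validation scheme concrete, the following is a minimal sketch of nested cross-validation in plain Python. It is not the authors' pipeline (the keywords indicate the study used mlr3): the one-dimensional ridge model, the round-robin fold assignment, the penalty grid, the toy data, and all function names are assumptions chosen only for illustration. The inner loop selects a hyperparameter using the outer training data alone, and the outer loop scores the refitted model on data that played no part in that selection.

```python
def kfold_indices(n, k):
    """Return k (train, test) index pairs over range(n), assigned round-robin."""
    pairs = []
    for fold in range(k):
        test = list(range(fold, n, k))
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        pairs.append((train, test))
    return pairs


def fit_ridge_1d(x, y, lam):
    """Fit y ~ slope * x + intercept with an L2 penalty lam on the slope."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    slope = sxy / (sxx + lam)
    return slope, y_mean - slope * x_mean


def mse(model, x, y):
    """Mean squared error of a (slope, intercept) model on (x, y)."""
    slope, intercept = model
    return sum((slope * a + intercept - b) ** 2 for a, b in zip(x, y)) / len(x)


def take(values, idx):
    """Select the elements of values at the given indices."""
    return [values[i] for i in idx]


def nested_cv(x, y, lams, outer_k=5, inner_k=3):
    """Return one (chosen penalty, outer test MSE) pair per outer fold."""
    results = []
    for train_idx, test_idx in kfold_indices(len(x), outer_k):
        x_tr, y_tr = take(x, train_idx), take(y, train_idx)

        def inner_error(lam):
            # Inner loop: mean validation error using outer-training data only.
            errs = [
                mse(fit_ridge_1d(take(x_tr, tr), take(y_tr, tr), lam),
                    take(x_tr, va), take(y_tr, va))
                for tr, va in kfold_indices(len(x_tr), inner_k)
            ]
            return sum(errs) / len(errs)

        best_lam = min(lams, key=inner_error)
        # Outer loop: refit on the full outer-training set, score on held-out data.
        model = fit_ridge_1d(x_tr, y_tr, best_lam)
        results.append((best_lam, mse(model, take(x, test_idx), take(y, test_idx))))
    return results


# Toy, noise-free data (hypothetical, not from the study).
x = [float(i) for i in range(20)]
y = [2.0 * a + 1.0 for a in x]
results = nested_cv(x, y, lams=[0.0, 1.0, 10.0])
```

Because model selection happens inside each outer training set, the outer-fold errors estimate the performance of the whole tuning procedure rather than of one tuned model, which is the usual motivation for nesting.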

Results: We built and evaluated a total of 300 models and selected 21 that demonstrated good performance (coefficient of determination, R² ≥ 0.70, or concordance correlation coefficient, CCC ≥ 0.85). Six of these top-performing models met both performance criteria and were used to construct five ensemble models. For each of the six best-performing models, we identified the descriptors most important in explaining HMG-CoA reductase inhibition. We used the top models to screen over 220,000 chemical compounds from the ZINC 15 database for potential new inhibitors. Only a small fraction (237 out of approximately 220,000 compounds) had reliable predictions with mean pIC50 values ≥ 8 (IC50 values ≤ 10 nM). Our SVM-based ensemble model predicted IC50 values < 10 nM for roughly 0.08% of the screened compounds. We also illustrated the potential application of these QSAR models to understanding the cholesterol-lowering activities of herbal extracts, such as those reported for an extract prepared from the Iris × germanica rhizome.
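
The selection thresholds and the pIC50 cutoff quoted above can be illustrated with a short Python sketch. The standard definitions are assumed here: R² = 1 − SS_res/SS_tot, Lin's concordance correlation coefficient with population (1/n) variances, and pIC50 = −log10(IC50 in mol/L). The abstract does not state which variants the authors used, and the function names and example values are hypothetical.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def concordance_cc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population variances assumed)."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def meets_both_criteria(r2, ccc):
    """Both thresholds quoted in the abstract: R2 >= 0.70 and CCC >= 0.85."""
    return r2 >= 0.70 and ccc >= 0.85


def pic50_to_ic50_nm(pic50):
    """Convert pIC50 to IC50 in nM (1 mol/L = 1e9 nM)."""
    return 10 ** (9 - pic50)


def ic50_nm_to_pic50(ic50_nm):
    """Convert IC50 in nM to pIC50."""
    return 9 - math.log10(ic50_nm)


# Hypothetical observed and predicted pIC50 values, for illustration only.
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 7.0, 8.0, 8.5]
r2 = r_squared(observed, predicted)
ccc = concordance_cc(observed, predicted)
```

Under these definitions, a pIC50 of 8 corresponds to an IC50 of 10 nM, so the condition pIC50 ≥ 8 is equivalent to IC50 ≤ 10 nM, as stated in the results.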

Conclusions: Our QSAR models can accurately predict inhibition of human HMG-CoA reductase and have the potential to accelerate the discovery of novel cholesterol-lowering agents. They may also be applied to understand the mechanisms underlying the reported cholesterol-lowering activities of herbal extracts.

Keywords: HMG-CoA reductase; Iris germanica; MACCS fingerprints; QSAR; feature selection; machine learning; mlr3; molecular descriptors; nested cross-validation; statins; virtual screening.

Grants and funding

Publication of this paper was supported by the University of Medicine and Pharmacy Carol Davila, through the institutional program Publish not Perish.