Three-class classification models of logS and logP derived by using GA-CG-SVM approach

Mol Divers. 2009 May;13(2):261-8. doi: 10.1007/s11030-009-9108-1. Epub 2009 Jan 31.

Abstract

In this investigation, three-class classification models of aqueous solubility (logS) and lipophilicity (logP) were developed using a support vector machine (SVM) method combined with a genetic algorithm (GA) for feature selection and a conjugate gradient (CG) method for parameter optimization. Five-fold cross-validation and an independent test set were used to evaluate the SVM classification models. For logS, the overall prediction accuracy is 87.1% for the training set and 90.0% for the test set. For logP, the overall prediction accuracy is 81.0% for the training set and 82.0% for the test set. In general, for both logS and logP, the prediction accuracies of the three-class models are slightly lower, by several percent, than those of two-class models. A comparison between the performance of the GA-CG-SVM models and that of GA-SVM models shows that SVM parameter optimization has a significant impact on the quality of the SVM classification model.
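
The evaluation protocol named in the abstract (5-fold cross-validation scored by overall prediction accuracy over three classes) can be illustrated with a minimal sketch. This is not the authors' code: the toy data, the class coding (0 = low, 1 = medium, 2 = high), the function names, and the majority-class placeholder model are all assumptions made for illustration. The placeholder stands in for the GA-CG-SVM model, which the abstract does not specify in enough detail to reproduce.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def majority_class_fit(X_train, y_train):
    """Placeholder model (NOT an SVM): always predicts the most common training class."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return lambda X: [most_common for _ in X]


def cross_validate(X, y, fit, k=5, seed=0):
    """Pool the held-out predictions from all k folds and score them once."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        y_pred.extend(model([X[i] for i in held_out]))
        y_true.extend(y[i] for i in held_out)
    return overall_accuracy(y_true, y_pred)


# Hypothetical toy data: one descriptor per compound, three class labels.
X = [[float(i)] for i in range(30)]
y = [0] * 20 + [1] * 6 + [2] * 4
cv_accuracy = cross_validate(X, y, majority_class_fit)
```

Any classifier can be substituted for `majority_class_fit`, provided it takes training descriptors and labels and returns a prediction function. The same `overall_accuracy` function would then score an independent test set.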

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Genetics
  • Hydrophobic and Hydrophilic Interactions
  • Models, Chemical*
  • Sensitivity and Specificity
  • Solubility
  • Water / chemistry*

Substances

  • Water