Quantitative prediction of logk of peptides in high-performance liquid chromatography based on molecular descriptors by using the heuristic method and support vector machine

J Chem Inf Comput Sci. 2004 Nov-Dec;44(6):1979-86. doi: 10.1021/ci049891a.

Abstract

Support vector machine (SVM), a new method, and the heuristic method (HM) were used for the first time to develop nonlinear and linear models, respectively, relating the capacity factor (logk) of 75 peptides to seven molecular descriptors. The molecular descriptors, which represent the structural features of the compounds, included only constitutional and topological descriptors, which can be obtained easily without optimizing the structure of the molecule. The seven descriptors selected by the heuristic method in CODESSA were used as inputs for the SVM. The results obtained by SVM were compared with those obtained by the heuristic method, and the SVM model gave the better predictions. For the test set, a predictive correlation coefficient R = 0.9801 and a root-mean-square error of 0.1523 were obtained. The predicted values agree very well with the experimental values. However, the linear model of the heuristic method is easier to understand and readier for a chemist to use. This paper provides a new and effective method for predicting the chromatographic retention of peptides and offers some insight into the structural features related to their capacity factor.
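
The two test-set statistics quoted in the abstract, the correlation coefficient R and the root-mean-square error, can be illustrated with a minimal sketch. It assumes that R is the Pearson correlation between experimental and predicted logk and that the error is averaged over the number of test compounds; the abstract does not state either convention. The function names and the five logk values are hypothetical placeholders, not data from the paper.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_o * ss_p)


def rmse(observed, predicted):
    """Root-mean-square error, dividing by n (an assumed convention)."""
    n = len(observed)
    if n != len(predicted) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / n)


if __name__ == "__main__":
    # Hypothetical experimental and predicted logk values for a tiny test set
    logk_experimental = [0.10, 0.45, 0.80, 1.20, 1.65]
    logk_predicted = [0.15, 0.40, 0.85, 1.10, 1.70]
    print("R    =", round(pearson_r(logk_experimental, logk_predicted), 4))
    print("RMSE =", round(rmse(logk_experimental, logk_predicted), 4))
```

Applying the same two functions to the predictions of the linear and the nonlinear model on one shared test set gives the kind of side-by-side comparison the abstract describes.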

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Chromatography, High Pressure Liquid
  • Computer Simulation*
  • Linear Models
  • Peptides / chemistry*

Substances

  • Peptides