Multivariate adaptive regression splines (MARS) in chromatographic quantitative structure-retention relationship studies

J Chromatogr A. 2004 Nov 5;1055(1-2):11-9. doi: 10.1016/j.chroma.2004.07.112.

Abstract

The multivariate adaptive regression splines (MARS) methodology was applied to build quantitative structure-retention relationships (QSRRs). The response (dependent variable) in the MARS models consisted of the logarithms of the extrapolated retention factors (log k(w)) of 83 structurally diverse drugs on a Unisphere PBD column, using isocratic elutions at pH 11.7. A set of 266 molecular descriptors was used as predictor (independent) variables in the MARS model building. The optimal MARS model uses 34 basis functions to describe the retention and has acceptable predictive properties for new objects. The molecular descriptors included in the model describe hydrophobicity, molecular size, complexity, shape and polarisability. Some additional MARS models were created using alternative strategies. These include models with log P as the single predictor and models built from only the three most important molecular descriptors. The use of classification and regression trees (CART) as a feature selection technique for the predictor variables in the MARS model was also investigated. Finally, it was studied whether allowing quadratic terms instead of interaction terms leads to better MARS models.
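To make the terms "basis function", "interaction term" and "quadratic term" concrete, the following minimal sketch shows how a MARS-type model can be evaluated as a weighted sum of hinge functions and their products. It illustrates the general technique only and is not the model from this study: the descriptor order (log P, then a molecular size descriptor), the knots, the coefficients and the function names `hinge`, `product` and `mars_predict` are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# A basis function maps a descriptor vector to a single number.
Basis = Callable[[Sequence[float]], float]


def hinge(var: int, knot: float, sign: int = 1) -> Basis:
    """Return the hinge function h(x) = max(0, sign * (x[var] - knot))."""
    def h(x: Sequence[float]) -> float:
        return max(0.0, sign * (x[var] - knot))
    return h


def product(*terms: Basis) -> Basis:
    """Multiply basis functions: two different hinges give an interaction
    term, the same hinge twice gives a quadratic term."""
    def b(x: Sequence[float]) -> float:
        result = 1.0
        for term in terms:
            result *= term(x)
        return result
    return b


def mars_predict(intercept: float,
                 weighted_bases: List[Tuple[float, Basis]],
                 x: Sequence[float]) -> float:
    """Evaluate intercept + sum of coefficient * basis function."""
    return intercept + sum(coef * basis(x) for coef, basis in weighted_bases)


# Hypothetical toy model: x[0] stands in for log P, x[1] for a size descriptor.
h_logp = hinge(0, 2.0)                    # active when x[0] > 2.0
h_size_below = hinge(1, 300.0, sign=-1)   # active when x[1] < 300.0

model = [
    (0.8, h_logp),                            # main effect
    (-0.002, h_size_below),                   # main effect
    (0.001, product(h_logp, h_size_below)),   # interaction term
]

# Variant with a quadratic term in place of the interaction term.
quadratic_model = model[:2] + [(0.05, product(h_logp, h_logp))]

toy_x = [3.0, 250.0]  # made-up descriptor values, not data from the study
print(mars_predict(1.0, model, toy_x))
print(mars_predict(1.0, quadratic_model, toy_x))
```

In an actual MARS fit the knots, the selected descriptors and the number of basis functions (34 in the optimal model reported here) are chosen by the forward selection and backward pruning steps of the algorithm rather than set by hand as in this sketch.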

MeSH terms

  • Chromatography, Liquid / methods*
  • Models, Theoretical
  • Multivariate Analysis
  • Quantitative Structure-Activity Relationship