Binary classification of chalcone derivatives with LDA or KNN based on their antileishmanial activity and molecular descriptors selected using the Successive Projections Algorithm feature-selection technique

Eur J Pharm Sci. 2014 Jan 23:51:189-95. doi: 10.1016/j.ejps.2013.09.019. Epub 2013 Sep 30.

Abstract

Chalcones are naturally occurring aromatic ketones consisting of an α,β-unsaturated carbonyl system joining two aryl rings. These compounds are reported to exhibit several pharmacological activities, including antiparasitic, antibacterial, antifungal, anticancer, immunomodulatory, nitric oxide inhibitory and anti-inflammatory effects. In the present work, a Quantitative Structure-Activity Relationship (QSAR) study is carried out to classify chalcone derivatives as active or inactive with respect to their antileishmanial activity on the basis of molecular descriptors. For this purpose, two descriptor-selection techniques are employed: the Successive Projections Algorithm (SPA) and the Genetic Algorithm (GA). The selected descriptors are initially employed to build Linear Discriminant Analysis (LDA) models. An additional investigation is then carried out to determine whether the results can be improved by using a non-parametric classification technique (One Nearest Neighbour, 1NN). In a case study involving 100 chalcone derivatives, the 1NN models were found to provide better rates of correct classification than LDA in both the training and test sets. The best result was achieved by a SPA-1NN model with six molecular descriptors, which provided correct classification rates of 97% and 84% for the training and test sets, respectively.
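
The SPA-1NN combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (spa_select, predict_1nn, correct_rate), the mean-centring step, the Euclidean distance and the toy descriptor matrix are all assumptions made for illustration. Only the projection phase of SPA is shown, in which each new descriptor is the one with the largest component orthogonal to those already chosen; the choice of starting descriptor and of the number of descriptors, which in practice would be guided by a validation criterion, is left out.

```python
import math

def spa_select(X, start, n_select):
    """Projection phase of SPA: pick n_select column indices of X, beginning at `start`."""
    n_cols = len(X[0])
    # Mean-centre each descriptor column (an assumption of this sketch).
    cols = []
    for j in range(n_cols):
        col = [row[j] for row in X]
        mean = sum(col) / len(col)
        cols.append([v - mean for v in col])
    selected = [start]
    while len(selected) < n_select:
        ref = cols[selected[-1]]
        ref_norm2 = sum(v * v for v in ref)
        if ref_norm2 == 0.0:
            break  # last selected column is null: nothing to project against
        best, best_norm = None, -1.0
        for j in range(n_cols):
            if j in selected:
                continue
            # Remove from column j its component along the last selected column.
            coef = sum(a * b for a, b in zip(cols[j], ref)) / ref_norm2
            cols[j] = [a - coef * b for a, b in zip(cols[j], ref)]
            norm = math.sqrt(sum(v * v for v in cols[j]))
            if norm > best_norm:
                best, best_norm = j, norm
        if best is None:
            break  # every column has already been selected
        selected.append(best)
    return selected

def take_columns(rows, idx):
    """Keep only the descriptor columns listed in idx."""
    return [[row[j] for j in idx] for row in rows]

def predict_1nn(train_X, train_y, x):
    """Return the label of the training sample closest to x (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]

def correct_rate(train_X, train_y, test_X, test_y):
    """Fraction of test samples whose 1NN prediction matches the true label."""
    hits = sum(predict_1nn(train_X, train_y, x) == y for x, y in zip(test_X, test_y))
    return hits / len(test_y)

# Toy data (invented): 4 compounds x 4 descriptors; column 1 duplicates
# column 0 up to scale and column 3 is constant.
X_train = [
    [1.0, 2.0, 0.5, 3.0],
    [2.0, 4.0, 0.1, 3.0],
    [3.0, 6.0, 0.9, 3.0],
    [4.0, 8.0, 0.3, 3.0],
]
y_train = [0, 0, 1, 1]  # 0 = inactive, 1 = active
X_test = [[1.2, 2.4, 0.4, 3.0], [3.8, 7.6, 0.2, 3.0]]
y_test = [0, 1]

chosen = spa_select(X_train, start=0, n_select=2)
rate = correct_rate(take_columns(X_train, chosen), y_train,
                    take_columns(X_test, chosen), y_test)
print("selected descriptors:", chosen)
print("test-set correct classification rate:", rate)
```

In this toy example the collinear and constant descriptors have zero norm after projection, so the projection step passes over them; this is the sense in which SPA favours descriptors that carry non-redundant information.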

Keywords: Antileishmanial activity; Genetic Algorithm; Linear Discriminant Analysis; One Nearest Neighbour; Successive Projections Algorithm.

MeSH terms

  • Algorithms
  • Chalcone / chemistry*
  • Chalcone / pharmacology*
  • Discriminant Analysis
  • Models, Molecular
  • Quantitative Structure-Activity Relationship

Substances

  • Chalcone