Latent semantic structure indexing (LaSSI) for defining chemical similarity

J Med Chem. 2001 Apr 12;44(8):1177-84. doi: 10.1021/jm000393c.

Abstract

A novel method for computing chemical similarity from chemical substructure descriptors is described. This new method, called LaSSI, uses the singular value decomposition (SVD) of a chemical descriptor-molecule matrix to create a low-dimensional representation of the original descriptor space. Ranking molecules by similarity to a probe molecule in the reduced-dimensional space has several advantages over analogous ranking in the original descriptor space: matching latent structures is more robust than matching discrete descriptors, choosing the number of singular values provides a rational way to vary the "fuzziness" of the search, and the reduction in the dimensionality of the chemical space increases searching speed. LaSSI also allows the calculation of the similarity between two descriptors and between a descriptor and a molecule.
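The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it follows the general latent semantic indexing recipe, and the function name `lassi_rank`, the use of cosine similarity, the projection of the probe through the truncated left singular vectors, and the toy matrix are all assumptions made for illustration. Rows of `X` are substructure descriptors, columns are molecules, and `k` is the number of singular values retained.

```python
import numpy as np

def lassi_rank(X, probe, k):
    """Rank molecules (columns of X) by similarity to a probe descriptor vector.

    Hypothetical helper: cosine similarity in a k-dimensional SVD space is an
    assumed choice, not something the abstract specifies.
    """
    # Thin SVD of the descriptor-by-molecule matrix: X = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

    # Coordinates of each molecule in the reduced space, shape (n_molecules, k)
    mols = (s_k[:, None] * Vt_k).T

    # Project the probe's descriptor vector into the same space
    p = U_k.T @ probe

    # Cosine similarity between the probe and every molecule
    norms = np.linalg.norm(mols, axis=1) * np.linalg.norm(p)
    sims = (mols @ p) / (norms + 1e-12)

    # Molecule indices sorted from most to least similar
    order = np.argsort(-sims)
    return order, sims

# Toy data (invented): 5 descriptors x 4 molecules, two structural families
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# Use molecule 0 as the probe and keep two singular values
order, sims = lassi_rank(X, X[:, 0], k=2)
print(order, np.round(sims, 3))
```

In this toy case molecules 0 and 1 differ in one descriptor yet land on the same latent direction when `k=2`, which is the kind of "fuzzy" matching the abstract refers to. Raising `k` toward the full rank makes the ranking approach the one obtained in the original descriptor space.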

MeSH terms

  • Algorithms
  • Combinatorial Chemistry Techniques
  • Databases, Factual*
  • Drug Design
  • Models, Molecular*
  • Molecular Structure*
  • Organic Chemicals*

Substances

  • Organic Chemicals