On finding nonisomorphic connected subgraphs and distinct molecular substructures

J Chem Inf Comput Sci. 2001 Mar-Apr;41(2):314-20. doi: 10.1021/ci000092b.

Abstract

The problem of finding all nonisomorphic connected subgraphs of a given graph (all distinct substructures of a given molecular structure) is discussed. A computer program is introduced that first generates all connected subgraphs and then eliminates duplicates using a combination of highly discriminating graph invariants. The program is broadly applicable, in particular to molecular graphs, which may or may not contain unsaturation or heteroatoms. The number of distinct substructures (Ns), proposed earlier as a measure of a compound's complexity that takes its symmetry into account, is thus obtained automatically. As expected from the nature of the problem, the computational effort increases exponentially with problem size, so in most cases complexity measures other than Ns are to be preferred.
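
The two-stage idea described in the abstract (generate all connected subgraphs, then discard duplicates) can be illustrated with a minimal Python sketch. This is not the authors' program. All function names are hypothetical, bond orders (unsaturation) are ignored, and single atoms are counted once per element label, which is an assumption. The abstract does not say which invariants the program combines, so the sketch uses a cheap invariant (edge count plus atom label and degree pairs) only to sort subgraphs into buckets, and settles remaining ties with a brute-force canonical form that is practical only for very small graphs.

```python
from itertools import permutations


def connected_edge_subsets(edges):
    """Yield every non-empty connected set of edges as a frozenset."""
    edges = [tuple(sorted(e)) for e in edges]
    frontier = [frozenset([e]) for e in edges]
    seen = set(frontier)
    while frontier:
        sub = frontier.pop()
        yield sub
        atoms = {v for e in sub for v in e}
        # Grow the subgraph by one edge that touches it, so it stays connected.
        for e in edges:
            if e not in sub and (e[0] in atoms or e[1] in atoms):
                bigger = sub | {e}
                if bigger not in seen:
                    seen.add(bigger)
                    frontier.append(bigger)


def invariant(sub, labels):
    """Cheap invariant: edge count plus sorted (element, degree) pairs."""
    degree = {}
    for u, v in sub:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return (len(sub), tuple(sorted((labels[a], d) for a, d in degree.items())))


def canonical_form(sub, labels):
    """Brute-force canonical form: smallest relabeling over all permutations.

    Assumption: stands in for the stronger invariants of the real program.
    """
    atoms = sorted({v for e in sub for v in e})
    best = None
    for perm in permutations(range(len(atoms))):
        relabel = dict(zip(atoms, perm))
        atom_part = tuple(sorted((relabel[a], labels[a]) for a in atoms))
        edge_part = tuple(
            sorted(tuple(sorted((relabel[u], relabel[v]))) for u, v in sub)
        )
        key = (atom_part, edge_part)
        if best is None or key < best:
            best = key
    return best


def count_distinct_substructures(edges, labels):
    """Count nonisomorphic connected subgraphs, single atoms included."""
    buckets = {}
    for sub in connected_edge_subsets(edges):
        buckets.setdefault(invariant(sub, labels), []).append(sub)
    # Assumption: isolated atoms are distinguished by element label only.
    total = len(set(labels.values()))
    for subs in buckets.values():
        if len(subs) == 1:
            total += 1  # the invariant alone already separates this subgraph
        else:
            total += len({canonical_form(s, labels) for s in subs})
    return total


if __name__ == "__main__":
    butane = [(0, 1), (1, 2), (2, 3)]
    carbons = {i: "C" for i in range(4)}
    print(count_distinct_substructures(butane, carbons))
```

For the hydrogen-suppressed graph of n-butane the sketch enumerates six connected edge subsets and reports four distinct substructures (atom, bond, two-bond path, three-bond path). Because the set of connected subgraphs itself grows exponentially with the number of bonds, the enumeration stage dominates the cost whatever duplicate filter is used.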