Assessment of methods to define the applicability domain of structural alert models

J Chem Inf Model. 2011 May 23;51(5):975-85. doi: 10.1021/ci1000967. Epub 2011 Apr 13.

Abstract

It is important that in silico models for use in chemical safety legislation, such as REACH, are compliant with the OECD Principles for the Validation of (Q)SARs. Structural alert models can be useful under these circumstances but lack an adequately defined applicability domain. This paper examines several methods of domain definition for structural alert models with the aim of assessing which are the most useful. Specifically, these methods were the use of fragments, chemical descriptor ranges, structural similarity, and specific applicability domain definition software. Structural alerts for mutagenicity in Derek for Windows (DfW) were used as examples, and Ames test data were used to define and test the domain of chemical space where the alerts produce reliable results. The usefulness of each domain was assessed on the criterion that confidence in the correctness of predictions should be greater inside the domain than outside it. By using a combination of structural similarity and chemical fragments, a domain was produced where the majority of correct positive predictions for mutagenicity were within the domain and a large proportion of the incorrect positive predictions were outside it. However, this was not found for the negative predictions; the percentages of true and false predictions of inactivity differed little between those falling within the applicability domain and those falling outside it. A hypothesis for this difference between positive and negative predictions is that differences in structure between training and test compounds are more likely to remove the toxic potential of a compound containing a structural alert than to add an unknown mechanism of action (structural alert) to a molecule which does not already contain an alert. This could be especially true for well-studied endpoints such as the Ames assay, where the majority of mechanisms of action are likely to be known.
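The assessment criterion described above (predictions should be more reliable inside the domain than outside it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: compounds are represented as plain sets of fragment labels, similarity is a Tanimoto coefficient over those sets, the threshold of 0.5 is arbitrary, and the names `tanimoto`, `in_domain`, `Prediction`, and `accuracy_by_domain` are hypothetical. It is not the DfW implementation or the authors' domain definition.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional

Fragments = FrozenSet[str]  # hypothetical representation: fragment labels of one compound


def tanimoto(a: Fragments, b: Fragments) -> float:
    """Tanimoto (Jaccard) similarity between two fragment sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def in_domain(query: Fragments, training: List[Fragments],
              threshold: float = 0.5) -> bool:
    """Combined fragment and similarity check (illustrative assumption).

    A query is inside the domain only if (1) every one of its fragments
    occurs somewhere in the training set and (2) at least one training
    compound is similar to it at or above the threshold.
    """
    known = frozenset().union(*training)
    if not query <= known:
        return False  # contains a fragment never seen in training
    return any(tanimoto(query, t) >= threshold for t in training)


class Prediction(NamedTuple):
    fragments: Fragments
    predicted_positive: bool  # model output, e.g. an alert fired
    observed_positive: bool   # experimental result, e.g. Ames test


def accuracy_by_domain(preds: List[Prediction], training: List[Fragments],
                       threshold: float = 0.5) -> Dict[str, Optional[float]]:
    """Fraction of correct predictions inside and outside the domain."""
    counts = {True: [0, 0], False: [0, 0]}  # inside? -> [correct, total]
    for p in preds:
        inside = in_domain(p.fragments, training, threshold)
        counts[inside][1] += 1
        if p.predicted_positive == p.observed_positive:
            counts[inside][0] += 1
    return {
        ("inside" if inside else "outside"): (c / t if t else None)
        for inside, (c, t) in counts.items()
    }


# Toy data, invented purely for illustration.
training = [frozenset({"a", "b", "c"}), frozenset({"a", "d"})]
preds = [
    Prediction(frozenset({"a", "b"}), True, True),
    Prediction(frozenset({"a", "e"}), True, False),
    Prediction(frozenset({"a", "b", "c"}), True, True),
    Prediction(frozenset({"c", "d"}), True, True),
]
result = accuracy_by_domain(preds, training)
```

Under the criterion in the abstract, a domain definition would be judged useful if the "inside" fraction is clearly higher than the "outside" fraction. Running the same comparison separately for positive and negative predictions would mirror the distinction the abstract draws between them.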

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Cytotoxins / chemistry*
  • Likelihood Functions
  • Models, Chemical*
  • Mutagens / chemistry*
  • Quantitative Structure-Activity Relationship
  • Software*

Substances

  • Cytotoxins
  • Mutagens