Evaluation of Different Methods for Identification of Structural Alerts Using Chemical Ames Mutagenicity Data Set as a Benchmark

Chem Res Toxicol. 2017 Jun 19;30(6):1355-1364. doi: 10.1021/acs.chemrestox.7b00083. Epub 2017 May 23.

Abstract

Identifying structural alerts for toxicity is useful in drug discovery and in other fields such as environmental protection. With structural alerts, researchers can quickly identify potentially toxic compounds and learn how to modify them. Hence, it is important to determine structural alerts from a large number of compounds quickly and accurately. Many methods for identifying structural alerts have already been reported, but how to evaluate them remains a problem. In this paper, we evaluated four methods for monosubstructure identification using three indices (accuracy rate, coverage rate, and information gain) to compare their advantages and disadvantages. Kazius' Ames mutagenicity data set was used as the benchmark. The four methods were MoSS (graph-based), SARpy (fragment-based), and two fingerprint-based methods: Bioalerts and the fingerprint (FP) method we previously used. The results showed that Bioalerts and FP could detect key substructures with high accuracy and coverage rates because they allowed unclosed rings and wildcard atom or bond types. However, they also produced redundancy, so their predictive performance was not as good as that of SARpy. SARpy was competitive in predictive performance on both the training set and the external validation set. These results might help users select appropriate methods and might guide further development of methods for identifying structural alerts.
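The abstract names the three indices but does not define them. As a rough illustration only, the sketch below computes them for a single candidate substructure under assumed definitions: accuracy rate as the fraction of compounds containing the substructure that are mutagenic, coverage rate as the fraction of mutagenic compounds that contain it, and information gain as the reduction in class entropy after splitting the data set on presence of the substructure. These definitions, and the names `entropy` and `alert_indices`, are assumptions made for this example and may differ from those used in the paper.

```python
import math


def entropy(pos, neg):
    """Binary Shannon entropy (in bits) of a set with pos/neg class counts."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def alert_indices(has_alert, is_mutagen):
    """Return (accuracy_rate, coverage_rate, information_gain).

    has_alert[i]  -- True if compound i contains the candidate substructure
    is_mutagen[i] -- True if compound i is labelled mutagenic
    The three definitions used here are assumptions, not taken from the paper.
    """
    pairs = list(zip(has_alert, is_mutagen))
    n = len(pairs)
    tp = sum(1 for a, m in pairs if a and m)          # alert present, mutagen
    fp = sum(1 for a, m in pairs if a and not m)      # alert present, non-mutagen
    fn = sum(1 for a, m in pairs if not a and m)      # alert absent, mutagen
    tn = sum(1 for a, m in pairs if not a and not m)  # alert absent, non-mutagen

    accuracy_rate = tp / (tp + fp) if (tp + fp) else 0.0
    coverage_rate = tp / (tp + fn) if (tp + fn) else 0.0

    if n == 0:
        return accuracy_rate, coverage_rate, 0.0

    # Entropy of the whole set minus the weighted entropy of the two subsets
    # obtained by splitting on presence of the substructure.
    h_before = entropy(tp + fn, fp + tn)
    h_after = ((tp + fp) / n) * entropy(tp, fp) + ((fn + tn) / n) * entropy(fn, tn)
    return accuracy_rate, coverage_rate, h_before - h_after


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    has_alert = [True, True, True, False, False, False, False, False]
    is_mutagen = [True, True, False, True, False, False, False, False]
    print(alert_indices(has_alert, is_mutagen))
```

In practice the `has_alert` flags would come from a substructure search over the benchmark compounds; that step is left out here so the example needs only the standard library.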

Publication types

  • Evaluation Study

MeSH terms

  • Benchmarking*
  • Databases, Pharmaceutical
  • Datasets as Topic
  • Humans
  • Molecular Structure
  • Mutagenicity Tests / methods*
  • Mutagenicity Tests / standards
  • Mutagens / analysis*
  • Mutagens / chemistry*
  • Mutagens / toxicity

Substances

  • Mutagens