Biodegradation is the principal environmental dissipation process. Due to a lack of comprehensive experimental data, high study cost and time-consuming, in silico approaches for assessing the biodegradable profiles of chemicals are encouraged and is an active current research topic. Here we developed in silico methods to estimate chemical biodegradability in the environment. At first 1440 diverse compounds tested under the Japanese Ministry of International Trade and Industry (MITI) protocol were used. Four different methods, namely support vector machine, k-nearest neighbor, naïve Bayes, and C4.5 decision tree, were used to build the combinatorial classification probability models of ready versus not ready biodegradability using physicochemical descriptors and fingerprints separately. The overall predictive accuracies of the best models were more than 80% for the external test set of 164 diverse compounds. Some privileged substructures were further identified for ready or not ready biodegradable chemicals by combining information gain and substructure fragment analysis. Moreover, 27 new predicted chemicals were selected for experimental assay through the Japanese MITI test protocols, which validated that all 27 compounds were predicted correctly. The predictive accuracies of our models outperform the commonly used software of the EPI Suite. Our study provided critical tools for early assessment of biodegradability of new organic chemicals in environmental hazard assessment.
© 2012 American Chemical Society