QSTR with extended topochemical atom indices. 2. Fish toxicity of substituted benzenes

J Chem Inf Comput Sci. 2004 Mar-Apr;44(2):559-67. doi: 10.1021/ci0342066.

Abstract

Considering the importance of quantitative structure-toxicity relationship (QSTR) studies in the field of aquatic toxicology from the viewpoint of ecological safety assessment, fish toxicity of various benzene derivatives has been modeled by the multiple regression technique using recently introduced extended topochemical atom (ETA) indices. The toxicity data have also been modeled using other selected topological descriptors and physicochemical variables, and the best ETA model has been compared to the non-ETA ones. Principal component factor analysis was used as the data preprocessing step to reduce the dimensionality of the data matrix and identify the important variables that are devoid of collinearities. All-possible-subsets regression was also applied to the parameters to cross-check the variable selection for the best model. Multiple linear regression analyses show that the best non-ETA model involves ¹χ, ALogP98, and LUMO (energy) as predictor variables and the quality of the relation is as follows: n = 92, Q² = 0.718, Rₐ² = 0.730, R² = 0.738, R = 0.859, F = 82.8 (df 3, 88), s = 0.340. On the other hand, the best ETA model has the following quality: n = 92, Q² = 0.865, Rₐ² = 0.876, R² = 0.885, R = 0.941, F = 92.6 (df 7, 84), s = 0.230. The ETA relations showed positive contributions of molecular bulk (size), chloro and hydroxy substitutions in the benzene ring, and the simultaneous presence of methyl and nitro substitutions to the toxicity. Further, the presence of fluoro and ether functionality, amino or nitro functionality in an otherwise unsubstituted ring, and nitro functionality that is ortho to a chloro substituent decreases toxicity. An attempt to use non-ETA descriptors along with ETA ones did not improve the quality in comparison to the best ETA model. Interestingly, the ETA model developed here for fish toxicity is better than the previously reported models on the same data set. Thus, it appears that ETA descriptors have significant potential in QSAR/QSPR/QSTR studies, which warrants extensive evaluation.
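
The fit statistics quoted for the two models are linked by standard multiple-regression identities, so the adjusted R² and F values can be recomputed from n, R², and the number of predictors. The following is a minimal sketch of that consistency check, not the authors' procedure; the function names are illustrative, and the predictor counts (3 and 7) are read off the reported degrees of freedom.

```python
# Consistency check of reported regression statistics using the standard
# identities for ordinary least squares with an intercept.
# Function names are illustrative (not from the paper).

def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2: float, n: int, p: int) -> float:
    """Overall F ratio with (p, n - p - 1) degrees of freedom."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# (label, n, number of predictors p, reported R^2), taken from the abstract.
models = [
    ("best non-ETA model", 92, 3, 0.738),
    ("best ETA model", 92, 7, 0.885),
]

for label, n, p, r2 in models:
    print(
        f"{label}: df = ({p}, {n - p - 1}), "
        f"adjusted R^2 = {adjusted_r2(r2, n, p):.3f}, "
        f"F = {f_statistic(r2, n, p):.1f}"
    )
```

Because the inputs are R² values already rounded to three decimals, the recomputed figures match the reported ones only to within rounding: about 0.729 and 82.6 for the non-ETA model (reported 0.730 and 82.8), and about 0.875 and 92.3 for the ETA model (reported 0.876 and 92.6). The adjusted R² comparison is the relevant one here, since the ETA model uses seven predictors against three.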