Acute toxicity dataset for QSAR modeling and predicting missing data of six pesticides

Data Brief. 2020 Jan 17:29:105150. doi: 10.1016/j.dib.2020.105150. eCollection 2020 Apr.

Abstract

This data article presents 1) the acute toxicity values (LC50 or EC50, in μg·L⁻¹) of various chemicals for ten species, which were used to develop ten robust quantitative structure-activity relationship (QSAR) models, 2) the values of the descriptors used in the ten QSAR models, and 3) the acute toxicity values of six pesticides (acetochlor, chlorpyrifos, dimethoate, glyphosate, malathion, and paraquat) for various species, which were used to establish species sensitivity distribution (SSD) models. The LC50 or EC50 data (μg·L⁻¹) were collected from the PAN pesticide database and the United States Environmental Protection Agency ecotoxicology database and/or were predicted by the QSAR models. The descriptor values in the ten QSAR models are those of the optimal descriptors, which were computed with the DRAGON software (version 7) and subsequently optimized by partial least squares modeling. All the data included in this manuscript are related to the research titled "Conlecs: A novel procedure for deriving the concentration limits of chemicals outside the criteria of human drinking water using existing criteria and species sensitivity distribution based on quantitative structure-activity relationship prediction" [1].

Keywords: Non-linear curve fitting; Toxicity prediction; Variable importance in projection; Variable selection and modeling; Water quality criteria.
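
As an illustration of how per-species acute toxicity values like those described in the abstract can feed a species sensitivity distribution, the following is a minimal sketch that assumes a log-normal SSD, a common choice that the abstract does not confirm as the form used in the related research. The function names and the toxicity values are hypothetical and are not taken from the dataset.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal_ssd(toxicity_ug_per_l):
    """Fit a normal distribution to log10-transformed LC50/EC50 values.

    Assumption: sensitivities across species are log-normally distributed.
    """
    logs = [math.log10(v) for v in toxicity_ug_per_l]
    return NormalDist(mu=mean(logs), sigma=stdev(logs))


def hazardous_concentration(ssd, fraction=0.05):
    """Concentration (ug/L) at which `fraction` of species is expected to be affected."""
    return 10 ** ssd.inv_cdf(fraction)


def fraction_affected(ssd, concentration_ug_per_l):
    """Expected fraction of species affected at the given concentration (ug/L)."""
    return ssd.cdf(math.log10(concentration_ug_per_l))


# Hypothetical LC50/EC50 values (ug/L), one per species; NOT from the dataset.
example_values = [12.0, 45.0, 150.0, 380.0, 900.0,
                  2500.0, 7000.0, 16000.0, 52000.0, 110000.0]

ssd = fit_lognormal_ssd(example_values)
hc5 = hazardous_concentration(ssd, 0.05)
hc50 = hazardous_concentration(ssd, 0.50)
print(f"HC5  = {hc5:.3g} ug/L")
print(f"HC50 = {hc50:.3g} ug/L")
```

Under this assumption, the HC5 (the concentration expected to affect 5% of species) is simply the 5th percentile of the fitted distribution, back-transformed from the log10 scale. In such a workflow, QSAR-predicted values would enter the input list in the same way as measured ones.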