Computational modeling has emerged as a time-saving and cost-effective alternative to traditional animal testing for assessing the potential hazards of chemicals. However, few computational modeling studies of immunotoxicity have been reported, and few models are available for predicting toxicants, owing to the lack of training data and the complex mechanisms of immunotoxicity. In this study, we employed a data-driven quantitative structure-activity relationship (QSAR) modeling workflow to extensively enlarge the limited training data by revealing multiple targets involved in immunotoxicity. To this end, a probe data set of 6,341 chemicals was obtained from a high-throughput screening (HTS) assay testing for activation of the aryl hydrocarbon receptor (AhR) signaling pathway, a key event leading to immunotoxicity. Searching this probe data set against PubChem yielded 3,183 assays with testing results for varying proportions of these 6,341 compounds. Of these, 100 assays were selected for QSAR model development based on their correlations with AhR agonism. For each assay, twelve individual QSAR models were built using the combinations of four machine-learning algorithms and three molecular fingerprints. Five-fold cross-validation of the resulting models showed good predictivity (average CCR = 0.73). A total of 20 assays were further selected based on QSAR model performance, and their QSAR models showed good predictivity for potential immunotoxicants among external chemicals. This study provides a computational modeling strategy that can use large public toxicity data sets to model immunotoxicity and other toxicity endpoints that have limited training data and complicated toxicity mechanisms.
© 2024 The Authors. Co-published by Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, and American Chemical Society.
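As an illustration of the evaluation setup described in the abstract, the following minimal sketch enumerates a four-by-three grid of algorithm and fingerprint combinations and computes a CCR value for one set of predictions. It assumes that CCR denotes the correct classification rate as commonly defined in QSAR work, that is, the mean of sensitivity and specificity; the abstract does not spell this out. The algorithm and fingerprint names are hypothetical placeholders, since the abstract does not name them, and the labels in the usage lines are toy values, not data from the study.

```python
from itertools import product


def ccr(y_true, y_pred):
    """Correct classification rate, assumed here to be the mean of
    sensitivity and specificity for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present to compute CCR")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical placeholder names: the abstract only gives the counts
# (four algorithms, three fingerprints), not their identities.
algorithms = ["algorithm_1", "algorithm_2", "algorithm_3", "algorithm_4"]
fingerprints = ["fingerprint_1", "fingerprint_2", "fingerprint_3"]

# One model per (algorithm, fingerprint) pair, giving twelve models per assay.
model_grid = list(product(algorithms, fingerprints))

# Toy labels for illustration only.
toy_true = [1, 1, 0, 0]
toy_pred = [1, 0, 0, 0]
toy_ccr = ccr(toy_true, toy_pred)
```

Under this definition, a model that predicts every compound as inactive scores 0.5 regardless of class balance, which is one reason a balanced metric of this kind is often preferred over plain accuracy for imbalanced HTS data.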