Predicting the bioconcentration of chemical compounds plays a crucial role in assessing environmental risks and toxicological impacts. This study presents a robust multitask deep learning model for predicting bioconcentration potential. The model assigns compounds to one of three categories: non-bioconcentrative (non-BC), weakly bioconcentrative (weak-BC), and strongly bioconcentrative (strong-BC). We also employed SHapley Additive exPlanations (SHAP) for model interpretation. The binary classification models (non-BC vs BC and weak-BC vs strong-BC) showed good predictive performance, achieving accuracy values over 90% and area under the curve (AUC) values of 0.95. The final ternary classification model provided an overall accuracy of 91.11%. Comparative analysis of molecular physicochemical properties showed that molecular weight, polar surface area, solubility, and hydrogen bonding are important for chemical bioconcentration. In addition, we identified eight structural alerts responsible for chemical bioconcentration. We made the model available as a free online tool named BCdpi-predictor, accessible at http://bcdpi.sapredictor.cn/, with which users can predict the bioconcentration potential of chemical compounds. The model has significant implications for environmental policy and regulatory frameworks, such as REACH, by providing a more accurate and interpretable method for assessing chemical risks. We hope that the results of this study can provide helpful tools and meaningful information for chemical bioconcentration prediction in environmental risk assessment.
Keywords: BCdpi-predictor web server; Bioconcentration; Interpretable deep neural network; Multitask classification; Structural alerts.
Copyright © 2024 Elsevier Inc. All rights reserved.