Breast milk is a vital source of essential nutrients for infants. However, contamination of human milk through the transfer of environmental chemicals from the maternal exposome is a significant concern for infant health. The milk-to-plasma concentration (M/P) ratio is a critical metric that quantifies the extent to which these chemicals transfer from maternal plasma into breast milk, and thus the infant's exposure. Machine learning-based predictive toxicology models can help identify chemicals with a high propensity to transfer into human milk. To this end, we build classification- and regression-based models using multiple machine learning algorithms and the largest curated data set to date, comprising 375 chemicals with known M/P ratios. Our support vector machine (SVM)-based classifier outperforms the other models across several performance metrics when evaluated on both the (internal) test data and an external test data set. Specifically, on the (internal) test data the SVM-based classifier achieves a classification accuracy of 77.33%, a specificity of 84%, a sensitivity of 64%, and an F-score of 65.31%. On the external test data set, the SVM-based classifier generalizes with a sensitivity of 77.78%. Although we were able to build highly predictive classification models, our best regression models for predicting the M/P ratio of chemicals achieve only moderate R² values on the (internal) test data. Consistent with the earlier literature, our study highlights the challenges in developing accurate regression models for predicting the M/P ratio of xenobiotic chemicals. Overall, this study attests to the immense potential of predictive computational toxicology models in characterizing the myriad chemicals in the human exposome.
© 2024 The Authors. Published by American Chemical Society.
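For readers who want to see how the four classification metrics quoted above relate to one another, the following is a minimal sketch that computes them from a binary confusion matrix. The counts used here are not reported in the abstract. They are an assumption: one set of integers that reproduces the stated percentages if the (internal) test set held 75 chemicals, with the positive class taken to be chemicals with a high propensity to transfer into milk. The function name `classification_metrics` is hypothetical and is not taken from the authors' code.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return standard binary classification metrics as fractions in [0, 1]."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total       # share of all chemicals classified correctly
    sensitivity = tp / (tp + fn)       # recall on the positive class
    specificity = tn / (tn + fp)       # recall on the negative class
    precision = tp / (tp + fp)         # share of predicted positives that are correct
    # F-score (F1): harmonic mean of precision and sensitivity
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


# ASSUMED counts (not stated in the source): 25 positives and 50 negatives,
# chosen only because they are consistent with the reported percentages.
metrics = classification_metrics(tp=16, fn=9, tn=42, fp=8)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these assumed counts the sketch returns an accuracy of 77.33%, a specificity of 84.00%, a sensitivity of 64.00%, and an F-score of 65.31%, matching the figures quoted for the (internal) test data. It also makes the trade-off visible: specificity is well above sensitivity, so under this reading the classifier misses more high-transfer chemicals than it falsely flags.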