An explainable machine-learning approach for revealing the complex synthesis path-property relationships of nanomaterials

Nanoscale. 2023 Sep 29;15(37):15358-15367. doi: 10.1039/d3nr02273k.

Abstract

Machine learning (ML) models have recently shown important advantages in predicting nanomaterial properties, avoiding many trial-and-error explorations. However, the complex variables that control the formation of nanomaterials with desired properties remain poorly understood, owing to the low interpretability of ML models and the lack of detailed mechanistic information on nanomaterial properties. In this study, we developed a methodology for accurately predicting the relationships between multiple synthesis parameters and nanomaterial properties, with the aim of making the mechanisms behind those properties more interpretable. As a proof of concept, we designed glutathione-gold nanoclusters (GSH-AuNCs) exhibiting an appropriate fluorescence quantum yield (QY). First, we conducted 189 experiments and synthesized different GSH-AuNCs by varying the thiol-to-metal molar ratio, reaction temperature, and reaction time within reasonable ranges. The fluorescence QY of GSH-AuNCs could be systematically and independently programmed using different experimental parameters. We then used this limited set of GSH-AuNC synthesis parameter data to train an extreme gradient boosting regressor model. We improved the interpretability of the ML model by combining individual conditional expectation, double-variable partial dependence, and feature interaction network analyses, which together established the relationship between multiple synthesis parameters and the fluorescence QY of GSH-AuNCs. The results represent an essential step towards revealing the complex fluorescence mechanism of thiolated AuNCs. Finally, by searching the synthesis parameter space with the trained ML model, we constructed a multidimensional synthesis phase diagram comprising more than 6.0 × 10⁴ prediction variables for accurately predicting the fluorescence QY of GSH-AuNCs. Our methodology is a general and powerful complementary strategy for application in material informatics.
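
The interpretability steps named in the abstract (individual conditional expectation, double-variable partial dependence, and a search over the synthesis parameter space) can be illustrated with a minimal Python sketch. It is not the authors' code: `predict_qy` is a hypothetical toy surrogate standing in for the trained extreme gradient boosting regressor, and the parameter ranges, grid resolutions, and sample points are assumptions chosen only for illustration.

```python
from itertools import product
from statistics import mean


def predict_qy(ratio, temp_c, time_h):
    """Hypothetical toy surrogate for a trained regressor; NOT fitted to real data."""
    raw = 5.0 - (ratio - 1.5) ** 2 - 0.001 * (temp_c - 70.0) ** 2 + 0.1 * time_h
    return max(0.0, raw)


def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi, inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def ice_curves(model, samples, feature_idx, grid):
    """Individual conditional expectation: one curve per sample,
    varying a single feature over the grid while the others stay fixed."""
    curves = []
    for sample in samples:
        curve = []
        for value in grid:
            x = list(sample)
            x[feature_idx] = value
            curve.append(model(*x))
        curves.append(curve)
    return curves


def partial_dependence_2d(model, samples, i, j, grid_i, grid_j):
    """Double-variable partial dependence: the prediction averaged over
    all samples for each pair of values of features i and j."""
    surface = []
    for vi in grid_i:
        row = []
        for vj in grid_j:
            preds = []
            for sample in samples:
                x = list(sample)
                x[i] = vi
                x[j] = vj
                preds.append(model(*x))
            row.append(mean(preds))
        surface.append(row)
    return surface


# Assumed ranges and resolutions (illustrative only): 40 x 50 x 30 = 60,000 points.
ratios = linspace(0.5, 3.0, 40)    # thiol-to-metal molar ratio
temps = linspace(25.0, 100.0, 50)  # reaction temperature, degrees C
times = linspace(1.0, 24.0, 30)    # reaction time, hours

# Hypothetical synthesis conditions used as the background samples.
samples = [(1.0, 50.0, 6.0), (2.0, 70.0, 12.0), (1.5, 90.0, 24.0)]

ice = ice_curves(predict_qy, samples, 1, temps)
pd_surface = partial_dependence_2d(predict_qy, samples, 0, 1, ratios[::8], temps[::10])

# "Phase diagram": the predicted QY at every point of the parameter grid.
phase_diagram = {
    (r, t, h): predict_qy(r, t, h) for r, t, h in product(ratios, temps, times)
}
best_conditions = max(phase_diagram, key=phase_diagram.get)
```

Averaging the ICE curves over the samples gives the one-variable partial dependence. With a real fitted model, `predict_qy` would be replaced by a thin wrapper around that model's prediction call, and the grid would follow the experimentally explored ranges.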