Traditionally, materials discovery has been driven more by empirical evidence and intuition than by systematic design. However, the advent of "big data" and an exponential increase in computational power have reshaped the landscape. Today, simulations, artificial intelligence (AI), and machine learning (ML) are used to predict material properties, which dramatically accelerates the discovery of novel materials. For instance, combinatorial megalibraries, in which millions of distinct nanoparticles are created on a single chip, have spurred the need for automated characterization tools. This paper presents an ML model developed specifically to perform real-time binary classification of grayscale high-angle annular dark-field (HAADF) images of nanoparticles sourced from these megalibraries. Because errors in downstream processing are costly, a primary requirement for the model was to minimize false positives while remaining effective on unseen images. We describe the computational challenges and our solutions, including managing memory constraints, reducing training time, and using Neural Architecture Search tools. The final model exceeded our expectations, achieving over 95% precision and a weighted F-score of more than 90% on our test data set. We discuss the model's development, the challenges encountered, and the outcomes for the application of AI and ML to materials discovery.
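The two reported metrics can be illustrated with a minimal sketch. It assumes that "weighted F-score" refers to the F-beta score, which the abstract does not specify, and the value beta = 0.5 (which weights precision more heavily than recall, in line with the stated goal of minimizing false positives) is likewise an assumption. The function name `precision_recall_fbeta` and the sample labels are hypothetical and are not taken from the paper.

```python
def precision_recall_fbeta(y_true, y_pred, beta=0.5):
    """Return (precision, recall, F-beta) for binary labels, where 1 = positive.

    beta < 1 weights precision more heavily than recall; beta = 0.5 is an
    assumed value for illustration, not one reported in the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    b2 = beta * beta
    denom = b2 * precision + recall
    fbeta = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, fbeta


# Hypothetical labels: no false positives, but two positives are missed.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0]
precision, recall, f_half = precision_recall_fbeta(y_true, y_pred, beta=0.5)
```

In this hypothetical case the classifier makes no false-positive calls but misses half of the positives, so the precision-weighted score (beta = 0.5) is higher than the balanced F1 score (beta = 1) would be for the same predictions.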
Keywords: automated characterization; combinatorial megalibraries; machine learning; nanomaterials.
© The Author(s) 2024. Published by Oxford University Press on behalf of the Microscopy Society of America.