Property Prediction for Complex Compounds Using Structure-Free Mendeleev Encoding and Machine Learning

J Chem Inf Model. 2024 Dec 23;64(24):9205-9214. doi: 10.1021/acs.jcim.4c01343. Epub 2024 Dec 12.

Abstract

Predicting the properties of unseen materials from the chemical formula alone, before synthesis and characterization, has advantages for research and resource planning. This can be achieved using suitable structure-free encoding and machine learning methods, but additional processing decisions are required. In this study, we compare a variety of structure-free materials encodings and machine learning algorithms to predict the structure/property relationships of battery materials. We found that the physical units used to measure the property labels have an important impact on the predictive ability of the models, regardless of the computational approach. Property labels expressed with respect to weight give excellent performance, but property labels expressed with respect to volume cannot be predicted with confidence using only chemical information, even when the underlying physical characteristics are the same. These results contrast with previous studies of unsupervised learning and classification, where structure-free encoding excelled, and highlight that the way the structural features or property labels of materials are represented plays an important role in the predictive ability of machine learning models.
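
The two ideas in the abstract, encoding a material from its formula alone and the dependence of the label on its physical units, can be illustrated with a minimal sketch. This is not the Mendeleev encoding evaluated in the paper, whose details are not given in the abstract. It is a generic fractional-composition vector, and the element list, function names, and numeric inputs are hypothetical placeholders chosen for illustration. The unit conversion assumes the standard relation that a per-volume quantity equals the per-weight quantity multiplied by density.

```python
import re
from collections import Counter

# Hypothetical, illustrative element ordering (an assumption, not from the paper).
ELEMENTS = ["Li", "O", "Mn", "Fe", "Co", "Ni", "P"]

def parse_formula(formula: str) -> Counter:
    """Count atoms in a flat formula such as 'LiFePO4' (no brackets or hydrates)."""
    counts = Counter()
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] += float(amount) if amount else 1.0
    return counts

def encode(formula: str) -> list[float]:
    """Structure-free feature vector: atomic fraction of each listed element.

    Elements missing from ELEMENTS are ignored in the output vector,
    but they still count toward the total number of atoms.
    """
    counts = parse_formula(formula)
    total = sum(counts.values())
    return [counts.get(element, 0.0) / total for element in ELEMENTS]

def volumetric_from_gravimetric(per_gram: float, density_g_per_cm3: float) -> float:
    """Convert a per-gram label to a per-cm^3 label.

    The density argument is the point of the sketch: it depends on how the
    atoms are packed, which a formula-only encoding does not describe.
    """
    return per_gram * density_g_per_cm3

if __name__ == "__main__":
    print(encode("LiCoO2"))
    # Placeholder numbers, not measured data.
    print(volumetric_from_gravimetric(100.0, 2.0))
```

The conversion needs a density that the formula-only feature vector does not contain, which is consistent with the reported difficulty of predicting volume-based labels from chemical information alone.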

MeSH terms

  • Algorithms
  • Machine Learning*