A comprehensive data network for data-driven study of battery materials

Sci Technol Adv Mater. 2024 Sep 11;25(1):2403328. doi: 10.1080/14686996.2024.2403328. eCollection 2024.

Abstract

Data-driven materials research that uses machine learning for property prediction and material design requires materials data that are large in quantity, wide in variety, and high in quality. Battery materials are commonly polycrystalline, ceramic, or composite, so multiscale data on substances, materials, and batteries are required. In this work, we develop a data network composed of three interlinked databases. From it we can obtain comprehensive data on substances (such as crystal structures and electronic structures), on materials (such as chemical composition, structure, and properties), and on batteries (such as battery composition, operation conditions, and capacity). The data are extracted from research papers on solid electrolytes and cathode materials, selected by screening more than 330 thousand papers using natural language processing tools. Data extraction and curation are carried out by editors specialized in materials science and trained in data standardization.
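To make the idea of three interlinked databases concrete, the following is a minimal sketch of how substance, material, and battery records might reference one another by identifier. It is not the schema of the published data network: every class name, field name, and identifier is hypothetical, and the capacity value is a placeholder rather than a measurement taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Substance:
    """Substance-level record (hypothetical fields)."""
    substance_id: str
    formula: str
    space_group: str


@dataclass
class Material:
    """Material-level record; links to one or more substances."""
    material_id: str
    composition: str
    substance_ids: list


@dataclass
class Battery:
    """Battery-level record; links to cathode and electrolyte materials."""
    battery_id: str
    cathode_material_id: str
    electrolyte_material_id: str
    capacity_mAh_per_g: float  # placeholder value, not real data


def substances_for_battery(battery, materials, substances):
    """Follow battery -> material -> substance links and return the substances."""
    found = []
    for material_id in (battery.cathode_material_id,
                        battery.electrolyte_material_id):
        for substance_id in materials[material_id].substance_ids:
            found.append(substances[substance_id])
    return found


# Illustrative records only; identifiers and the capacity are made up.
substances = {
    "S1": Substance("S1", "LiCoO2", "R-3m"),
    "S2": Substance("S2", "Li7La3Zr2O12", "Ia-3d"),
}
materials = {
    "M1": Material("M1", "LiCoO2", ["S1"]),
    "M2": Material("M2", "Li7La3Zr2O12", ["S2"]),
}
battery = Battery("B1", "M1", "M2", capacity_mAh_per_g=100.0)

linked = substances_for_battery(battery, materials, substances)
print([s.formula for s in linked])
```

Keeping each scale in its own table and linking by identifier means a battery-level query can reach crystal-structure data without duplicating it in every battery record.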

Keywords: Material databases; battery material; capacity; cathode; crystal structure; data curation; ionic conductivity; natural language processing; solid electrolyte.

Plain language summary

We develop a comprehensive data network to accelerate battery material research. It links three databases of multiscale data that expert curators extracted from research papers, which were selected by screening more than 330,000 papers with natural language processing.

Grants and funding

The funding information has been included in the Acknowledgment section.