Data-driven materials research for property prediction and materials design using machine learning requires materials data that are large in quantity, wide in variety, and high in quality. Battery materials are commonly polycrystalline solids, ceramics, and composites, so multiscale data on substances, materials, and batteries are required. In this work, we develop a data network composed of three interlinked databases, from which we can obtain comprehensive data on substances (such as crystal structures and electronic structures), on materials (such as chemical composition, structure, and properties), and on batteries (such as battery composition, operating conditions, and capacity). The data are extracted from research papers on solid electrolytes and cathode materials, which were selected by screening more than 330,000 papers with natural language processing tools. Data extraction and curation are carried out by editors who specialize in materials science and are trained in data standardization.
Keywords: Material databases; battery material; capacity; cathode; crystal structure; data curation; ionic conductivity; natural language processing; solid electrolyte.
We develop a comprehensive data network to accelerate battery material research, integrating multiscale data from three databases and 330,000+ papers using natural language processing and expert curation.
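As a rough illustration of how three interlinked levels of data (substances, materials, batteries) might reference one another, the following minimal sketch uses Python dataclasses. Every class name, field name, and identifier here is hypothetical and chosen only for illustration; the actual schema of the databases is not given in this abstract, and no property values are included.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


# NOTE: all names below are assumptions for illustration, not the real schema.
@dataclass
class Substance:
    substance_id: str
    formula: str
    space_group: Optional[str] = None    # crystal-structure data
    band_gap_ev: Optional[float] = None  # electronic-structure data


@dataclass
class Material:
    material_id: str
    composition: str
    substance_ids: List[str] = field(default_factory=list)    # links to substances
    properties: Dict[str, float] = field(default_factory=dict)


@dataclass
class Battery:
    battery_id: str
    cathode_material_id: str       # link to a material record
    electrolyte_material_id: str   # link to a material record
    temperature_c: Optional[float] = None   # operating condition
    capacity_mah_g: Optional[float] = None


def substances_for_battery(
    battery: Battery,
    materials: Dict[str, Material],
    substances: Dict[str, Substance],
) -> List[Substance]:
    """Follow battery -> material -> substance links, skipping missing records."""
    found: List[Substance] = []
    for mid in (battery.cathode_material_id, battery.electrolyte_material_id):
        material = materials.get(mid)
        if material is None:
            continue
        for sid in material.substance_ids:
            substance = substances.get(sid)
            if substance is not None and substance not in found:
                found.append(substance)
    return found


# Placeholder records with made-up identifiers and no measured values.
substances = {
    "S1": Substance("S1", "LiCoO2"),
    "S2": Substance("S2", "Li7La3Zr2O12"),
}
materials = {
    "M1": Material("M1", "LiCoO2", substance_ids=["S1"]),
    "M2": Material("M2", "Li7La3Zr2O12", substance_ids=["S2"]),
}
battery = Battery("B1", cathode_material_id="M1", electrolyte_material_id="M2")

linked = substances_for_battery(battery, materials, substances)
```

In this sketch the links are plain identifier strings, so a battery record can be traced down to the crystal-structure level with two dictionary lookups; a real implementation would more likely use foreign keys in a relational or graph database.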
© 2024 The Author(s). Published by National Institute for Materials Science in partnership with Taylor & Francis Group.