ReactionCode: format for reaction searching, analysis, classification, transform, and encoding/decoding

J Cheminform. 2020 Dec 3;12(1):72. doi: 10.1186/s13321-020-00476-x.

Abstract

Over the past two decades, many formats for molecules and reactions have been created. These formats were mostly developed to serve as identifiers and to support representation, classification, analysis and data exchange. Considerable effort has gone into molecule formats, but far less into reaction formats, where most of the work has been done by companies and has resulted in proprietary formats. Here, we present ReactionCode: a new open-source format that allows one to encode a reaction into, and decode it from, a multi-layer, machine-readable code that aggregates reactants and products into a condensed graph of reaction (CGR). This format is flexible and can be used for reaction similarity searching and classification. It is also designed for database organization, for machine learning applications, and as a new reaction transform language.

Keywords: Classification; Decoding; Encoding; Reaction; ReactionCode; Searching.
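
To illustrate the condensed graph of reaction (CGR) mentioned in the abstract, the following minimal Python sketch merges the atom-mapped bonds of reactants and products into one graph in which each bond carries its order before and after the reaction. It is not the ReactionCode encoding itself and does not reproduce its layers; the function names (condensed_graph, reaction_center), the bond-dictionary representation and the example reaction (bromoethane plus hydroxide giving ethanol plus bromide, hydrogens omitted) are assumptions made only for illustration.

```python
from typing import Dict, Tuple

# A bond is keyed by a pair of atom-map numbers (smaller number first).
Bond = Tuple[int, int]


def condensed_graph(
    reactant_bonds: Dict[Bond, int],
    product_bonds: Dict[Bond, int],
) -> Dict[Bond, Tuple[int, int]]:
    """Merge both sides of a mapped reaction into a single graph.

    Each bond is labelled (order_before, order_after); an order of 0
    means the bond is absent on that side of the reaction.
    """
    cgr = {}
    for bond in sorted(set(reactant_bonds) | set(product_bonds)):
        cgr[bond] = (reactant_bonds.get(bond, 0), product_bonds.get(bond, 0))
    return cgr


def reaction_center(cgr: Dict[Bond, Tuple[int, int]]) -> Dict[Bond, Tuple[int, int]]:
    """Keep only the bonds whose order changes during the reaction."""
    return {bond: orders for bond, orders in cgr.items() if orders[0] != orders[1]}


# Hypothetical atom mapping: 1 = C, 2 = C, 3 = Br, 4 = O (hydrogens omitted).
reactants = {(1, 2): 1, (2, 3): 1}  # C-C and C-Br single bonds
products = {(1, 2): 1, (2, 4): 1}   # C-C and C-O single bonds

cgr = condensed_graph(reactants, products)
center = reaction_center(cgr)
print(cgr)
print(center)
```

In this sketch the unchanged C-C bond is labelled (1, 1), the broken C-Br bond (1, 0) and the formed C-O bond (0, 1), so a single graph records both sides of the reaction, and the changed bonds alone identify the reaction center.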