Electron ionization (EI) mass spectral library searching is the usual way to identify compounds in gas chromatography/mass spectrometry. However, the number of compounds whose EI mass spectra are registered in a library is still small compared with the popular compound databases. As a result, some compounds cannot be identified by conventional library searching, and searching for them may return false positives. Here we report the development of a machine learning model, trained on chemical formulae and EI mass spectra, that predicts the EI mass spectrum from the chemical structure. Using this model, we created a database of predicted EI mass spectra for 100 million compounds in PubChem. We also propose a method for improving the time and accuracy of library searches that include an extensive mass spectrum library.
Keywords: GC-MS; electron ionization; library search; mass spectrum prediction.
Copyright © 2023 Kubo, Azusa Kubota, Haruki Ishioka, Takuhiro Hizume, Masaaki Ubukata, Kenji Nagatomo, Takaya Satoh, Mitsuyoshi Yoshida, and Fuminori Uematsu.