FGTN: Fragment-based graph transformer network for predicting reproductive toxicity

Arch Toxicol. 2024 Dec;98(12):4077-4092. doi: 10.1007/s00204-024-03866-4. Epub 2024 Sep 18.

Abstract

Reproductive toxicity is an important issue in chemical safety. Traditional laboratory testing methods are costly and time-consuming, and they raise ethical concerns. Only a few in silico models have been reported to predict human reproductive toxicity, and none of them makes full use of the topological information of compounds. In addition, most existing atom-based graph neural network methods attribute model predictions to individual nodes or edges rather than to chemically meaningful fragments or substructures. In the current study, we develop a novel fragment-based graph transformer network (FGTN) approach to build a quantitative structure-activity relationship (QSAR) model of human reproductive toxicity that considers the internal topological structure of compounds. In the FGTN model, a compound is represented as a graph in which fragments are the nodes and the bonds linking two fragments are the edges. A molecule-level super node is further introduced and connected to all fragment nodes by undirected edges, so that global molecular features are obtained from the fragment embeddings. The FGTN model achieved an accuracy (ACC) of 0.861 and an area under the receiver operating characteristic curve (AUC) of 0.914 on nonredundant blind tests, outperforming traditional fingerprint-based machine learning models and an atom-based graph convolutional network (GCN) model. The FGTN model can attribute toxicity predictions to fragments, generating specific structural alerts for positive compounds. Moreover, FGTN may also be able to distinguish between various chemical isomers. We believe that FGTN can be used as a reliable and effective tool for human reproductive toxicity prediction, contributing to the advancement of chemical safety assessment.
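
The graph representation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `build_fragment_graph`, the `<MOL>` label for the super node, and the toy fragment list are all assumptions made for illustration, and the fragmentation step itself (how a compound is split into fragments) is not specified in the abstract and is taken as given input here.

```python
from typing import Dict, List, Set, Tuple


def build_fragment_graph(
    fragments: List[str],
    fragment_bonds: List[Tuple[int, int]],
) -> Tuple[List[str], Dict[int, Set[int]]]:
    """Build an undirected fragment graph with one molecule-level super node.

    fragments:      fragment labels (hypothetical; e.g. SMILES-like strings).
    fragment_bonds: index pairs (i, j) for bonds linking fragment i and j.
    Returns the node labels and an adjacency map; the super node is the
    last node and is connected to every fragment node.
    """
    n = len(fragments)
    super_idx = n  # assumed convention: super node gets the last index
    nodes = list(fragments) + ["<MOL>"]
    adjacency: Dict[int, Set[int]] = {i: set() for i in range(n + 1)}

    # Edges between fragments: one per inter-fragment bond, stored both ways.
    for a, b in fragment_bonds:
        if a == b or not (0 <= a < n and 0 <= b < n):
            raise ValueError(f"invalid fragment bond: ({a}, {b})")
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Undirected edges between the super node and every fragment node.
    for i in range(n):
        adjacency[i].add(super_idx)
        adjacency[super_idx].add(i)

    return nodes, adjacency


# Toy input (hypothetical): three fragments joined in a chain.
nodes, adjacency = build_fragment_graph(
    ["c1ccccc1", "C(=O)", "N"],
    [(0, 1), (1, 2)],
)
```

In this toy case the super node has index 3 and is adjacent to all three fragment nodes, which is what would let a message-passing or attention layer pool fragment embeddings into a single molecule-level feature.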

Keywords: Fragment; Graph transformer network; Reproductive toxicity; Structure alerts.

MeSH terms

  • Computer Simulation
  • Humans
  • Neural Networks, Computer*
  • Quantitative Structure-Activity Relationship*
  • Reproduction* / drug effects
  • Toxicity Tests / methods