PubChem Protein, Gene, Pathway, and Taxonomy Data Collections: Bridging Biology and Chemistry through Target-Centric Views of PubChem Data

J Mol Biol. 2022 Jun 15;434(11):167514. doi: 10.1016/j.jmb.2022.167514. Epub 2022 Feb 25.

Abstract

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public chemical database at the U.S. National Institutes of Health. Visited by millions of users every month, it serves as a key chemical information resource for biomedical research communities. PubChem data comes from hundreds of contributors and is organized into multiple collections by record type. Among these are the Protein, Gene, Pathway, and Taxonomy data collections. Records in these collections contain information on chemicals related to a given biological target (i.e., protein, gene, pathway, or taxon), helping users analyze and interpret the biological activity data of molecules. In addition, annotations about the biological targets are collected from authoritative or curated data sources and integrated into the four collections. The content can be accessed programmatically through PubChem's web service interfaces (including PUG View). A machine-readable representation of this content is also provided within PubChemRDF.
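
As an illustration of the programmatic access mentioned above, the following is a minimal Python sketch of how a target record might be requested through PUG View. The abstract does not give the URL layout, so the base path and the `<collection>/<identifier>/JSON` pattern used here are assumptions that should be checked against the current PUG View documentation. The helper names `pug_view_url` and `fetch_record` are hypothetical and are not part of any PubChem client library.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed PUG View base path; verify against the PUG View documentation.
PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data"

# The four target-centric collections named in the abstract.
COLLECTIONS = {"protein", "gene", "pathway", "taxonomy"}


def pug_view_url(collection, identifier, fmt="JSON"):
    """Build a PUG View URL for one record (hypothetical helper)."""
    if collection not in COLLECTIONS:
        raise ValueError(f"unsupported collection: {collection!r}")
    # Keep ':' unescaped so source-prefixed identifiers stay readable.
    record_id = quote(str(identifier), safe=":")
    return f"{PUG_VIEW_BASE}/{collection}/{record_id}/{fmt}"


def fetch_record(collection, identifier, timeout=30):
    """Download one record and parse the JSON response into a dict."""
    url = pug_view_url(collection, identifier)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Under these assumptions, `fetch_record("taxonomy", 9606)` would request the record for the NCBI Taxonomy ID of human, and `fetch_record("protein", "P00533")` would request a protein record by accession. Both identifiers are illustrative examples, not values taken from the abstract.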

Keywords: bioactivity; bioinformatics; cheminformatics; drug discovery; public chemical database.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Biology
  • Databases, Chemical*
  • Drug Discovery
  • Proteins / genetics

Substances

  • Proteins