Visualisation and subsets of the chemical universe database GDB-13 for virtual screening

J Comput Aided Mol Des. 2011 Jul;25(7):637-47. doi: 10.1007/s10822-011-9436-y. Epub 2011 May 27.

Abstract

The chemical universe database GDB-13, which enumerates 977 million organic molecules of up to 13 atoms of C, N, O, S and Cl following simple chemical stability and synthetic feasibility rules, represents a vast reservoir for new fragments. GDB-13 was classified using the MQN system discussed in the preceding paper for the analysis of PubChem fragments. Two hundred and fifty-five subsets of GDB-13 were generated by the combinatorial use of eight restrictive criteria, including fragment-like ("rule of three") and scaffold-like (no acyclic carbon atoms) filters. Virtual screening for analogs of 15 commercial drugs of 13 or fewer non-hydrogen atoms shows that retrieving MQN-neighbors of a query molecule from GDB-13 or its subsets provides on average a 38-fold enrichment in structural analogs (Daylight-type substructure fingerprint Tanimoto T_SF > 0.7), and a 75-fold enrichment in shape-similar analogs (ROCS TanimotoCombo score > 1.4). An MQN-searchable version of GDB-13 is provided at www.gdb.unibe.ch.
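Two numbers in the abstract can be illustrated with a short sketch. The count of 255 subsets is consistent with taking every non-empty combination of the eight criteria (2^8 - 1 = 255); this reading is an inference from the abstract, not a statement of the authors' procedure. The structural-analog threshold uses the Tanimoto coefficient, computed here on fingerprints represented as sets of on-bit indices. The criterion labels other than the two named in the abstract are hypothetical placeholders, and the helper names `nonempty_combinations` and `tanimoto` are introduced only for illustration.

```python
from itertools import combinations

# Placeholder labels: the abstract names only two of the eight criteria
# (fragment-like and scaffold-like); the other six are hypothetical.
CRITERIA = ["fragment_like", "scaffold_like"] + [
    f"criterion_{i}" for i in range(3, 9)
]


def nonempty_combinations(criteria):
    """Return every non-empty combination of the given criteria."""
    result = []
    for size in range(1, len(criteria) + 1):
        result.extend(combinations(criteria, size))
    return result


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / union


subsets = nonempty_combinations(CRITERIA)
print(len(subsets))  # 255, i.e. 2**8 - 1

# Toy fingerprints (made-up bit indices, not real Daylight-type fingerprints).
query = {1, 2, 3}
candidate = {2, 3, 4}
print(tanimoto(query, candidate))  # 0.5, below the 0.7 cutoff used in the abstract
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets; the set-based form is used here only to keep the arithmetic visible.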

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Combinatorial Chemistry Techniques
  • Databases, Factual / classification*
  • Drug Discovery*
  • Humans
  • Informatics*
  • Ligands
  • Peptide Fragments / chemistry*
  • Pharmaceutical Preparations / chemistry*
  • Proteins / chemistry*
  • User-Computer Interface*

Substances

  • Ligands
  • Peptide Fragments
  • Pharmaceutical Preparations
  • Proteins