Development of a protein-ligand extended connectivity (PLEC) fingerprint and its application for binding affinity predictions

Bioinformatics. 2019 Apr 15;35(8):1334-1341. doi: 10.1093/bioinformatics/bty757.

Abstract

Motivation: Fingerprints (FPs) are the most common small molecule representation in cheminformatics. A wide variety of FPs exists, and the Extended Connectivity Fingerprint (ECFP) is one of the best suited for general applications. Despite this abundance, only a few FPs represent the 3D structure of the molecule, and hardly any encode protein-ligand interactions.

Results: Here, we present a Protein-Ligand Extended Connectivity (PLEC) FP that implicitly encodes protein-ligand interactions by pairing the ECFP environments from the ligand and the protein. PLEC FPs were used to construct different machine learning models tailored for predicting protein-ligand affinities (pKi/d). Even the simplest linear model built on the PLEC FP achieved a Pearson correlation of Rp = 0.817 on the PDBbind v2016 'core set', demonstrating its descriptive power.
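
The pairing idea described above can be illustrated with a toy sketch. This is not the ODDT implementation: the function names (pair_hash, toy_plec), the 4.5 Å contact cutoff, the 4096-bit size, the SHA-1 hashing, and the precomputed integer environment identifiers are all assumptions made for illustration, and pairing only environments of equal depth is a simplification.

```python
import hashlib
import math
from collections import Counter


def pair_hash(lig_env, prot_env, size):
    """Map one (ligand environment, protein environment) pair to a bit index."""
    digest = hashlib.sha1(f"{lig_env}:{prot_env}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


def toy_plec(ligand_atoms, protein_atoms, cutoff=4.5, size=4096):
    """Count hashed environment pairs for atoms no farther apart than `cutoff`.

    Each atom is (xyz, envs), where envs lists hypothetical precomputed
    environment identifiers, one per depth (index 0 = the atom itself).
    """
    counts = Counter()
    for lig_xyz, lig_envs in ligand_atoms:
        for prot_xyz, prot_envs in protein_atoms:
            if math.dist(lig_xyz, prot_xyz) > cutoff:
                continue  # atoms are not in contact
            # Simplification (assumption): pair environments of equal depth only.
            for lig_env, prot_env in zip(lig_envs, prot_envs):
                counts[pair_hash(lig_env, prot_env, size)] += 1
    return counts


# Toy complex: one ligand atom, one protein atom in contact, one far away.
ligand = [((0.0, 0.0, 0.0), [1, 2])]
protein = [((3.0, 0.0, 0.0), [10, 20]), ((10.0, 0.0, 0.0), [30, 40])]
fp = toy_plec(ligand, protein)
```

In this toy complex only the protein atom 3 Å from the ligand atom contributes, so the sparse count vector holds two hashed pairs (one per depth). A vector of this kind, folded to a fixed length, is what a downstream regression model would take as input.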

Availability and implementation: The PLEC FP has been implemented in the Open Drug Discovery Toolkit (https://github.com/oddt/oddt).

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Ligands
  • Machine Learning
  • Protein Binding
  • Proteins