GASS: identifying enzyme active sites with genetic algorithms

Bioinformatics. 2015 Mar 15;31(6):864-70. doi: 10.1093/bioinformatics/btu746. Epub 2014 Nov 10.

Abstract

Motivation: Currently, 25% of proteins annotated in Pfam have unknown function. One way of predicting a protein's function is to examine its active site, which has two main parts: the catalytic site and the substrate binding site. The active site is more conserved than the other residues of the protein and can be a rich source of information for protein function prediction. This article presents a new heuristic method, named genetic active site search (GASS), which searches for given active site 3D templates in proteins of unknown function. The method can perform non-exact amino acid matches (conservative mutations), can find amino acids in different chains and does not impose any restrictions on the active site size.
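The abstract does not specify how GASS encodes candidates, scores them, or applies genetic operators, so the following is only a minimal sketch of the general idea under stated assumptions: a candidate is one residue index per template position, the cost is the summed difference between template and candidate inter-residue distances, and the `SUBSTITUTIONS` table, the `search` function and all parameter values are hypothetical illustrations rather than the authors' implementation.

```python
import math
import random
from itertools import combinations

# Hypothetical conservative-substitution table (illustrative, not from the paper).
SUBSTITUTIONS = {"ASP": {"ASP", "GLU"}, "SER": {"SER", "THR"}}

def allowed(template_name):
    """Residue types accepted at a template position (exact or conservative)."""
    return SUBSTITUTIONS.get(template_name, {template_name})

def cost(individual, template, protein):
    """Assumed fitness: sum of |template distance - candidate distance| over all pairs."""
    if len(set(individual)) < len(individual):
        return math.inf  # the same residue cannot fill two template positions
    total = 0.0
    for i, j in combinations(range(len(template)), 2):
        d_template = math.dist(template[i][1], template[j][1])
        d_candidate = math.dist(protein[individual[i]][2], protein[individual[j]][2])
        total += abs(d_template - d_candidate)
    return total

def search(template, protein, pop_size=60, generations=80, mutation_rate=0.2, seed=0):
    """Toy genetic search; returns (best_indices, best_cost) or None if a pool is empty."""
    rng = random.Random(seed)
    # One candidate pool per template position; chains are deliberately not filtered.
    pools = [[k for k, (name, _chain, _xyz) in enumerate(protein) if name in allowed(t_name)]
             for t_name, _ in template]
    if any(not pool for pool in pools):
        return None

    def score(ind):
        return cost(ind, template, protein)

    population = [tuple(rng.choice(pool) for pool in pools) for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = [min(population, key=score)]  # elitism: keep the current best
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            a = min(rng.sample(population, 3), key=score)
            b = min(rng.sample(population, 3), key=score)
            # One-point crossover (no-op for single-residue templates).
            cut = rng.randrange(1, len(template)) if len(template) > 1 else 0
            child = list(a[:cut] + b[cut:])
            # Mutation: swap one position for another residue from the same pool.
            if rng.random() < mutation_rate:
                pos = rng.randrange(len(child))
                child[pos] = rng.choice(pools[pos])
            next_gen.append(tuple(child))
        population = next_gen
    best = min(population, key=score)
    return best, score(best)

# Made-up template: (residue type, reference coordinates).
template = [("SER", (0.0, 0.0, 0.0)), ("HIS", (3.0, 0.0, 0.0)), ("ASP", (0.0, 4.0, 0.0))]
# Made-up protein: (residue type, chain, coordinates). The first three entries are the
# template translated by (10, 10, 10), with GLU standing in for ASP on a second chain.
protein = [
    ("SER", "A", (10.0, 10.0, 10.0)),
    ("HIS", "A", (13.0, 10.0, 10.0)),
    ("GLU", "B", (10.0, 14.0, 10.0)),
    ("SER", "A", (50.0, 50.0, 50.0)),
    ("HIS", "B", (-20.0, 5.0, 0.0)),
    ("ASP", "A", (30.0, -30.0, 30.0)),
]
best, best_cost = search(template, protein)
```

Because the cost depends only on inter-residue distances, it is unaffected by rotation or translation of the protein, and because candidate pools ignore chain identifiers and accept substitutions, the sketch mirrors the cross-chain and non-exact matching described above. Each residue is reduced to a single point here; which atoms a real template would use is left open.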

Results: GASS results were compared with those catalogued in the Catalytic Site Atlas (CSA) on four different datasets and with two other methods: amino acid pattern search for substructures and motifs (ASSAM) and catalytic site identification (CSI). The results show that GASS can correctly identify >90% of the templates searched. Experiments were also run using data from the substrate binding site prediction competition of CASP10, in which GASS ranked fourth among the 18 methods considered.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Binding Sites
  • Catalytic Domain*
  • Computer Simulation
  • Databases, Protein*
  • Humans
  • Protein Structure, Tertiary
  • Proteins / chemistry*

Substances

  • Proteins