Application of machine learning to structural molecular biology

Philos Trans R Soc Lond B Biol Sci. 1994 Jun 29;344(1310):365-71. doi: 10.1098/rstb.1994.0075.

Abstract

Inductive logic programming, a machine learning technique implemented in the program GOLEM, has been applied to three problems in structural molecular biology: the prediction of protein secondary structure; the identification of rules governing the arrangement of beta-sheet strands in the tertiary folding of proteins; and the modelling of a quantitative structure-activity relationship (QSAR) for a series of drugs. For secondary structure prediction and the QSAR, GOLEM yielded predictions comparable with those of contemporary approaches, including neural networks. Rules for beta-strand arrangement have been derived, and we plan to compare their accuracy with that of rules obtained by human inspection. In all three studies GOLEM discovered rules that provided insight into the stereochemistry of the system. We conclude that machine learning, used together with human intervention, will provide a powerful tool for discovering patterns in biological sequences and structures.

MeSH terms

  • Amino Acid Sequence
  • Artificial Intelligence*
  • Humans
  • Learning
  • Molecular Biology / methods*
  • Molecular Sequence Data
  • Protein Folding
  • Protein Structure, Secondary
  • Proteins / chemistry*

Substances

  • Proteins