Image-based crystal detection: a machine-learning approach

Acta Crystallogr D Biol Crystallogr. 2008 Dec;64(Pt 12):1187-95. doi: 10.1107/S090744490802982X. Epub 2008 Nov 18.

Abstract

The ability of computers to learn from and annotate large databases of crystallization-trial images provides not only the ability to reduce the workload of crystallization studies, but also an opportunity to annotate crystallization trials as part of a framework for improving screening methods. Here, a system is presented that scores sets of images based on the likelihood of containing crystalline material as perceived by a machine-learning algorithm. The system can be incorporated into existing crystallization-analysis pipelines, whereby specialists examine images as they normally would with the exception that the images appear in rank order according to a simple real-valued score. Promising results are shown for 319 112 images associated with 150 structures solved by the Joint Center for Structural Genomics pipeline during the 2006-2007 year. Overall, the algorithm achieves a mean receiver operating characteristic score of 0.919 and a 78% reduction in human effort per set when considering an absolute score cutoff for screening images, while incurring a loss of five out of 150 structures.
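The ranking, receiver operating characteristic score and cutoff-based screening described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes each image already has a real-valued score from some classifier and a human label (1 for crystalline material, 0 otherwise). The function names `roc_auc` and `screen_set`, the example scores and the cutoff value are all hypothetical. The abstract reports a mean score across image sets, so a function like `roc_auc` would presumably be applied to each set and the results averaged.

```python
from typing import List, Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive image outscores a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative image")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def screen_set(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[List[int], float, bool]:
    """Rank one image set by score and keep only images at or above the cutoff.

    Returns the indices a specialist would still inspect (best score first),
    the fraction of images skipped (the reduction in human effort), and
    whether at least one crystal-containing image survives the cutoff."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = [i for i in order if scores[i] >= cutoff]
    effort_reduction = 1.0 - len(kept) / len(scores)
    crystal_retained = any(labels[i] for i in kept)
    return kept, effort_reduction, crystal_retained


# Hypothetical scores and labels for a single five-image set.
example_scores = [0.9, 0.2, 0.7, 0.1, 0.4]
example_labels = [1, 0, 1, 0, 0]
example_kept, example_reduction, example_retained = screen_set(
    example_scores, example_labels, cutoff=0.5
)
```

In this reading, a set counts as lost when `crystal_retained` is false, that is, when every crystal-containing image falls below the cutoff. This would correspond to the trade-off the abstract reports between reduced effort per set and the five structures lost out of 150.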

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Crystallization
  • Crystallography, X-Ray / instrumentation
  • Crystallography, X-Ray / methods*
  • Crystallography, X-Ray / trends
  • Database Management Systems / economics
  • Database Management Systems / instrumentation
  • Image Interpretation, Computer-Assisted
  • Image Processing, Computer-Assisted / instrumentation
  • Image Processing, Computer-Assisted / methods*
  • Proteins / chemistry*
  • ROC Curve

Substances

  • Proteins