Probabilistic principal component analysis with expectation maximization (PPCA-EM) facilitates volume classification and estimates the missing data

J Struct Biol. 2010 Jul;171(1):18-30. doi: 10.1016/j.jsb.2010.04.002. Epub 2010 Apr 10.

Abstract

We have developed a new method for classifying 3D reconstructions with missing data obtained by electron microscopy techniques. The method is based on principal component analysis (PCA) combined with expectation maximization. The missing data, together with the principal components, are treated as hidden variables that are estimated by maximizing a likelihood function. PCA in 3D is similar to PCA for 2D image analysis. A lower dimensional subspace of significant features is selected, into which the data are projected, and if desired, subsequently classified. In addition, our new algorithm estimates the missing data for each individual volume within the lower dimensional subspace. Application to both a large model data set and cryo-electron microscopy experimental data demonstrates the good performance of the algorithm and illustrates its potential for studying macromolecular assemblies with continuous conformational variations.
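
The idea of treating missing entries as hidden variables and estimating them inside a low-dimensional subspace can be illustrated with a small sketch. The code below is not the authors' implementation. It is a simplified, EM-style iterative PCA imputation (the zero-noise limit of probabilistic PCA), and it assumes each volume has been flattened into one row of a matrix, with a boolean mask marking the observed entries. The function name `em_pca_impute` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def em_pca_impute(X, mask, n_components=2, n_iter=500, tol=1e-9):
    """Illustrative EM-style PCA with missing data (not the published algorithm).

    X    : (n_samples, n_features) array; values where mask is False are ignored.
    mask : boolean array of the same shape, True where an entry was observed.
    Returns the completed data, the per-sample scores, and the components.
    """
    X = np.asarray(X, dtype=float)
    mask = np.asarray(mask, dtype=bool)

    # Initialise the missing entries with the mean of the observed values
    # in each column (feature).
    counts = np.maximum(mask.sum(axis=0), 1)
    col_means = np.where(mask, X, 0.0).sum(axis=0) / counts
    filled = np.where(mask, X, col_means)

    for _ in range(n_iter):
        # "M-step": fit a rank-q PCA model to the currently completed data.
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        scores = U[:, :n_components] * s[:n_components]
        recon = scores @ Vt[:n_components] + mean

        # "E-step": replace only the missing entries with their
        # reconstruction from the subspace; observed entries stay fixed.
        updated = np.where(mask, X, recon)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break

    # Final PCA of the completed data; the scores can be passed to any
    # classifier or clustering routine.
    mean = filled.mean(axis=0)
    U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    return filled, scores, Vt[:n_components]
```

The returned scores are the coordinates of each volume in the lower-dimensional subspace, which is where a subsequent classification step would operate. A full probabilistic PCA treatment would also estimate a noise variance and posterior covariances for the hidden variables; this sketch omits both for brevity.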

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Cryoelectron Microscopy
  • Imaging, Three-Dimensional
  • Models, Statistical*
  • Principal Component Analysis*
  • Probability