FreeContact: fast and free software for protein contact prediction from residue co-evolution

BMC Bioinformatics. 2014 Mar 26;15:85. doi: 10.1186/1471-2105-15-85.

Abstract

Background: Twenty years of improved technology and a growing number of sequences now render residue-residue contact constraints, inferred from correlated mutations in large protein families, accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. In addition, EVfold-mfDCA depends on proprietary software.
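The idea of reading contacts from correlated mutations can be illustrated with a much simpler score than the ones named above. The following is a minimal sketch, assuming a gap-free toy alignment: it computes mutual information between alignment columns and applies an average product correction. It is neither mean-field DCA nor sparse inverse covariance estimation, and it is not FreeContact code; the function names `column_mi` and `apc_scores` and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def column_mi(col_i, col_j):
    """Mutual information (in nats) between two alignment columns."""
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


def apc_scores(alignment):
    """Score every column pair by MI minus the average product correction.

    `alignment` is a list of equal-length sequences (illustrative input only;
    no sequence weighting, pseudocounts, or gap handling is applied).
    """
    length = len(alignment[0])
    columns = [[seq[k] for seq in alignment] for k in range(length)]
    mi = {
        (i, j): column_mi(columns[i], columns[j])
        for i, j in combinations(range(length), 2)
    }
    # Mean MI of each column with all other columns.
    col_mean = [0.0] * length
    for (i, j), value in mi.items():
        col_mean[i] += value
        col_mean[j] += value
    col_mean = [total / (length - 1) for total in col_mean]
    overall_mean = sum(mi.values()) / len(mi)
    if overall_mean == 0:
        return dict(mi)  # no co-variation at all; nothing to correct
    return {
        (i, j): value - col_mean[i] * col_mean[j] / overall_mean
        for (i, j), value in mi.items()
    }


# Toy alignment: columns 0 and 2 change together, column 1 is conserved,
# column 3 varies independently of the others.
alignment = ["AKLE", "AKLD", "GKVE", "GKVD"]
scores = apc_scores(alignment)
top_pair = max(scores, key=scores.get)
print(top_pair, round(scores[top_pair], 3))
```

In this toy case the co-varying columns 0 and 2 receive the highest score. Pairwise scores like this one also pick up indirect correlations, which is the problem that global approaches such as direct coupling analysis and inverse covariance estimation are meant to address.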

Results: Here, we present FreeContact, a fast, open-source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with a negligible decrease in performance. EVfold-mfDCA itself was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library "libfreecontact", complete with the command-line tool "freecontact", as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability.

Conclusions: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud).

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / genetics
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Proteins