SIMAP--structuring the network of protein similarities

Thomas Rattei; Patrick Tischler; Roland Arnold; Franz Hamberger; Jörg Krebs; Jan Krumsiek; Benedikt Wachinger; Volker Stümpflen; Werner Mewes

doi:10.1093/nar/gkm963

Nucleic Acids Res. 2008 Jan;36(Database issue):D289-92. doi: 10.1093/nar/gkm963. Epub 2007 Nov 23.

Authors

Thomas Rattei¹, Patrick Tischler, Roland Arnold, Franz Hamberger, Jörg Krebs, Jan Krumsiek, Benedikt Wachinger, Volker Stümpflen, Werner Mewes

Affiliation

¹ Chair of Genome Oriented Bioinformatics, Center of Life and Food Science, Technische Universität München, 85350 Freising-Weihenstephan, Germany.

Abstract

Protein sequences are the most important source of evolutionary and functional information for new proteins. In order to facilitate the computationally intensive tasks of sequence analysis, the Similarity Matrix of Proteins (SIMAP) database aims to provide a comprehensive and up-to-date dataset of the pre-calculated sequence similarity matrix and sequence-based features like InterPro domains for all proteins contained in the major public sequence databases. As of September 2007, SIMAP covers approximately 17 million proteins and more than 6 million non-redundant sequences and provides a complete annotation based on InterPro 16. Novel features of SIMAP include a new, portlet-based web portal providing multiple, structured views on retrieved proteins and integration of protein clusters and a unique search method for similar domain architectures. Access to SIMAP is freely provided for academic use through the web portal for individuals at http://mips.gsf.de/simap/ and through Web Services for programmatic access at http://mips.gsf.de/webservices/services/SimapService2.0?wsdl.
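
The gap between roughly 17 million protein entries and 6 million non-redundant sequences arises because identical sequences appear under several database entries. The following is a minimal sketch of that collapsing step, not SIMAP's actual implementation: the abstract does not say how sequences are keyed, so the MD5-based key, the function names, and the example entry identifiers are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def sequence_key(sequence: str) -> str:
    """Return a stable key for a sequence.

    Assumption: the MD5 digest of the uppercased residues is used as the key.
    The abstract does not specify any keying scheme.
    """
    normalized = sequence.strip().upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def group_by_sequence(entries):
    """Map each distinct sequence key to the entry ids that share that sequence."""
    groups = defaultdict(list)
    for entry_id, sequence in entries:
        groups[sequence_key(sequence)].append(entry_id)
    return dict(groups)


# Hypothetical entries: two database records carry the same sequence.
entries = [
    ("dbA:P1", "MKTAYIAKQR"),
    ("dbB:X9", "mktayiakqr"),
    ("dbA:P2", "MSTNPKPQRK"),
]
groups = group_by_sequence(entries)
print(len(entries), "entries,", len(groups), "non-redundant sequences")
```

Under this assumption, similarities would only need to be computed once per distinct sequence and could then be shared by every entry in the same group.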

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Databases, Protein*
Internet
Protein Structure, Tertiary
Proteins / classification
Sequence Alignment*
Sequence Analysis, Protein*
User-Computer Interface

Substances

Proteins