SiPAN: simultaneous prediction and alignment of protein-protein interaction networks

Bioinformatics. 2015 Jul 15;31(14):2356-63. doi: 10.1093/bioinformatics/btv160. Epub 2015 Mar 18.

Abstract

Motivation: Network prediction as applied to protein-protein interaction (PPI) networks has received considerable attention within the last decade. Because of the limitations of experimental techniques for interaction detection and network construction, several computational methods for PPI network reconstruction and growth have been suggested. Such methods usually limit the scope of study to a single network, employing data based on genomic context, structure, domain, sequence information or existing network topology. Incorporating multiple species network data for network reconstruction and growth entails the design of novel models encompassing both network reconstruction and network alignment, since the goal of network alignment is to provide functionally orthologous proteins from multiple networks and such orthology information can be used in guiding interolog transfers. However, such an approach raises the classical chicken or egg problem; alignment methods assume error-free networks, whereas network prediction via orthology works affectively if the functionally orthologous proteins are determined with high precision. Thus to resolve this intertwinement, we propose a framework to handle both problems simultaneously, that of SImultaneous Prediction and Alignment of Networks (SiPAN).

Results: We present an algorithm that solves the SiPAN problem in accordance with its simultaneous nature. Bearing the same name as the defined problem itself, the SiPAN algorithm employs state-of-the-art alignment and topology-based interaction confidence construction algorithms, which are used as benchmark methods for comparison purposes as well. To demonstrate the effectiveness of the proposed network reconstruction via SiPAN, we consider two scenarios; one that preserves the network sizes and the other where the network sizes are increased. Through extensive tests on real-world biological data, we show that the network qualities of SiPAN reconstructions are as good as those of original networks and in some cases SiPAN networks are even better, especially for the former scenario. An alternative state-of-the-art network reconstruction algorithm random walk with resistance produces networks considerably worse than the original networks and those reproduced via SiPAN in both cases.

Availability and implementation: Freely available at http://webprs.khas.edu.tr/∼cesim/SiPAN.tar.gz.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Humans
  • Models, Biological*
  • Protein Interaction Mapping / methods*
  • Protein Interaction Maps*
  • Proteins / chemistry
  • Proteins / metabolism*
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins