A novel framework for phage-host prediction via logical probability theory and network sparsification

Ankang Wei; Huanghan Zhan; Zhen Xiao; Weizhong Zhao; Xingpeng Jiang

doi:10.1093/bib/bbae708

Brief Bioinform. 2024 Nov 22;26(1):bbae708. doi: 10.1093/bib/bbae708.

Authors

Ankang Wei^{1,2,3}, Huanghan Zhan^{1,2}, Zhen Xiao^{1,2,3}, Weizhong Zhao^{1,2,4}, Xingpeng Jiang^{1,2,4}

Affiliations

¹ Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China.
² School of Computer Science, Central China Normal University, Wuhan 430079, China.
³ School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China.
⁴ National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China.

PMID: 39780485
DOI: 10.1093/bib/bbae708

Abstract

Bacterial resistance has emerged as one of the greatest threats to human health, and phages have shown tremendous potential for combating drug-resistant bacteria by lysing their hosts. Identifying phage-host interactions (PHI) is therefore crucial for addressing bacterial infections. Some existing computational methods for predicting PHI are suboptimal in terms of prediction efficiency because they draw on a limited range of information types. Although some supporting information has emerged, the generalizability of models that use it is limited by the small scale of the databases. In addition, most existing models overlook the sparsity of association data, which also severely impairs their predictive performance. In this study, we propose a dual-view sparse network model (DSPHI) to predict PHI that leverages logical probability theory and network sparsification. Specifically, we first construct similarity networks from the sequences of phages and hosts, respectively, and then sparsify these networks so that the model focuses on key information during learning, thereby improving prediction efficiency. Next, we use logical probability theory to compute high-order logical information between phages (hosts), known as mutual information. We then attach this information, in the form of nodes, to the sparse phage (host) similarity network, yielding a phage (host) heterogeneous network that better integrates the two information views, reducing the computational complexity of the model and enhancing its information aggregation capability. The hidden features of phages and hosts are then explored through graph learning algorithms. Experimental results demonstrate that mutual information is effective for predicting PHI and that sparsifying the similarity networks significantly improves the model's predictive performance.
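
The two building blocks named in the abstract, network sparsification and mutual information, can be illustrated with a minimal sketch. The abstract does not state the sparsification rule or the data on which mutual information is computed, so everything below is an assumption made for illustration: sparsification is shown as a top-k nearest-neighbour filter on a similarity matrix, and mutual information is shown as Shannon mutual information between two binary association profiles. The function names `sparsify_top_k` and `binary_mutual_information` are hypothetical and do not come from the DSPHI code.

```python
import numpy as np


def sparsify_top_k(sim, k):
    """Keep each node's k strongest neighbours in a similarity matrix.

    Assumption: a top-k filter stands in for the paper's sparsification
    step, whose exact rule is not given in the abstract.
    Requires 1 <= k < number of nodes.
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    work = sim.copy()
    np.fill_diagonal(work, -np.inf)            # never pick a node as its own neighbour
    top = np.argsort(-work, axis=1)[:, :k]     # indices of the k largest entries per row
    keep = np.zeros((n, n), dtype=bool)
    keep[np.repeat(np.arange(n), k), top.ravel()] = True
    keep |= keep.T                             # union rule keeps the graph symmetric
    return np.where(keep, sim, 0.0)            # dropped edges (and the diagonal) become 0


def binary_mutual_information(x, y):
    """Shannon mutual information (in bits) between two binary vectors.

    Assumption: x and y are 0/1 association profiles of two phages
    (or two hosts); the paper's definition may differ.
    """
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    mi = 0.0
    for a in (False, True):
        for b in (False, True):
            p_ab = np.mean((x == a) & (y == b))   # joint probability of (a, b)
            p_a = np.mean(x == a)                 # marginal of x
            p_b = np.mean(y == b)                 # marginal of y
            if p_ab > 0:                          # 0 * log(0) is treated as 0
                mi += p_ab * np.log2(p_ab / (p_a * p_b))
    return mi
```

In this sketch, the union rule in `sparsify_top_k` keeps an edge if either endpoint ranks the other among its top k, so a node can end up with more than k neighbours; an intersection rule would give a sparser graph. Pairwise mutual-information scores of this kind could then be attached to the sparse network as additional nodes, as the abstract describes, although how that is done in DSPHI is not specified here.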

Keywords: graph convolutional network; logical probability theory; metagenomic data; network sparsification; phage–host interactions.

MeSH terms

Algorithms
Bacteria / genetics
Bacteria / virology
Bacteriophages* / genetics
Computational Biology / methods
Host-Pathogen Interactions
Probability Theory
