Stochastic block hypergraph model

Phys Rev E. 2024 Sep;110(3-1):034312. doi: 10.1103/PhysRevE.110.034312.

Abstract

The stochastic block model is widely used to generate graphs with a community structure, but no simple counterpart currently exists for hypergraphs, in which more than two nodes can be connected together through a hyperedge. Here we discuss such a hypergraph generalization, which is based on the clustering connection probability P_{ij} between nodes of communities i and j and uses an explicit and adjustable hyperedge formation process. We focus on the standard case where P_{ij}=pδ_{ij}+q(1-δ_{ij}) with 0≤q≤p (δ_{ij} is the Kronecker delta). We propose a model that satisfies three criteria: (i) it should be as simple as possible; (ii) when p=q it should be equivalent to the standard random hypergraph model; and (iii) it should use an explicit and adjustable hyperedge formation process, so that the model is intuitive and can easily express different real-world formation processes. We first show that for such a model the degree distribution and the hyperedge size distribution can be approximated by binomial distributions with effective parameters that depend on the number of communities and on q/p. In addition, the composition of hyperedges ranges from 'pure' hyperedges (comprising nodes belonging to the same community) for q=0 to 'mixed' hyperedges (comprising nodes from different communities) for q=p. We test various formation processes, and our results suggest that processes that depend on the composition of the hyperedge tend to favor the dominant community and lead to hyperedges with a smaller diversity. In contrast, formation processes that are independent of the hyperedge structure yield hyperedges comprising a larger diversity of communities. The advantages of the proposed model are its simplicity and flexibility, which make it a good candidate for testing community-related problems, such as community detection, the impact of communities on various dynamics, and visualization.
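To make the connection probability and the two limiting cases (q=0 and q=p) concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the formation process defined in the paper: the function names `connection_matrix` and `sample_hyperedge`, the seed-based growth procedure, and the two rules `"seed"` and `"majority"` are all hypothetical choices made here. The only part taken directly from the abstract is the form P_{ij}=pδ_{ij}+q(1-δ_{ij}) with 0≤q≤p.

```python
import random


def connection_matrix(k, p, q):
    """Return the k x k matrix with P[i][j] = p if i == j, else q."""
    if not 0 <= q <= p <= 1:
        raise ValueError("expected 0 <= q <= p <= 1")
    return [[p if i == j else q for j in range(k)] for i in range(k)]


def sample_hyperedge(labels, P, rng, rule="seed"):
    """Grow one hyperedge from a random seed node.

    ASSUMPTION: this growth procedure is a hypothetical illustration,
    not the process used in the paper.

    rule="seed":     a candidate joins with probability P[c_seed][c_cand],
                     independent of the hyperedge's current composition.
    rule="majority": the reference community is the one currently dominant
                     in the hyperedge, so acceptance depends on composition.
    """
    if rule not in ("seed", "majority"):
        raise ValueError("rule must be 'seed' or 'majority'")
    nodes = list(range(len(labels)))
    seed = rng.choice(nodes)
    edge = [seed]
    candidates = [v for v in nodes if v != seed]
    rng.shuffle(candidates)
    for v in candidates:
        if rule == "seed":
            ref = labels[seed]
        else:
            members = [labels[u] for u in edge]
            ref = max(set(members), key=members.count)
        # Accept the candidate with the block probability for (ref, its community).
        if rng.random() < P[ref][labels[v]]:
            edge.append(v)
    return edge


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [0] * 5 + [1] * 5          # two communities of five nodes each
    P = connection_matrix(2, 0.8, 0.1)  # example values, chosen arbitrarily
    print(sample_hyperedge(labels, P, rng, rule="seed"))
    print(sample_hyperedge(labels, P, rng, rule="majority"))
```

In this sketch, setting q=0 can only produce hyperedges whose nodes share the seed's community, while setting q=p makes acceptance independent of community labels, which mirrors the 'pure' and 'mixed' limits described in the abstract.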