Revealing biases in the sampling of ecological interaction networks

PeerJ. 2019 Sep 2;7:e7566. doi: 10.7717/peerj.7566. eCollection 2019.

Abstract

The structure of ecological interactions is commonly understood through analyses of interaction networks. However, these analyses may be sensitive to sampling biases with respect to both the interactors (the nodes of the network) and interactions (the links between nodes), because the detectability of species and their interactions is highly heterogeneous. These ecological and statistical issues directly affect ecologists' abilities to accurately construct ecological networks. However, statistical biases introduced by sampling are difficult to quantify in the absence of full knowledge of the underlying ecological network's structure. To explore properties of large-scale ecological networks, we developed the software EcoNetGen, which constructs and samples networks with predetermined topologies. These networks may represent a wide variety of communities that vary in size and types of ecological interactions. We sampled these networks with different mathematical sampling designs that correspond to methods used in field observations. The observed networks generated by each sampling process were then analyzed with respect to the number of components, size of components and other network metrics. We show that the sampling effort needed to estimate underlying network properties depends strongly both on the sampling design and on the underlying network topology. In particular, networks with random or scale-free modules require more complete sampling to reveal their structure, compared to networks whose modules are nested or bipartite. Overall, modules with nested structure were the easiest to detect, regardless of the sampling design used. Sampling a network starting with any species that had a high degree (e.g., abundant generalist species) was consistently found to be the most accurate strategy to estimate network structure. 
Because high-degree species tend to be generalists, abundant in natural communities relative to specialists, and connected to each other, sampling by degree may be common, though unintentional, in empirical sampling of networks. Conversely, sampling according to module (representing different interaction types or taxa) gives a rather complete view of certain modules but fails to provide a complete picture of the underlying network. To reduce biases introduced by sampling methods, we recommend that these findings be incorporated into field design considerations for projects aiming to characterize large species interaction networks.
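
As a rough illustration of the comparison described in the abstract, the sketch below builds a small modular network, samples a subset of species under two designs (uniformly at random and weighted by degree), and counts the components of each observed network. It is a simplified stand-in written in plain Python and is not the EcoNetGen implementation: the function names, the random modular generator, the parameter values, and the choice to keep only links between sampled species are all assumptions made for this example.

```python
import random


def make_modular_network(n_modules, module_size, p_in, p_out, rng):
    """Hypothetical generator: dense random links within modules, sparse between."""
    n = n_modules * module_size
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            same_module = (u // module_size) == (v // module_size)
            if rng.random() < (p_in if same_module else p_out):
                adj[u].add(v)
                adj[v].add(u)
    return adj


def sample_nodes(adj, k, design, rng):
    """Pick k distinct nodes uniformly ('random') or weighted by degree ('degree')."""
    nodes = list(adj)
    if design == "random":
        return set(rng.sample(nodes, k))
    if design != "degree":
        raise ValueError(f"unknown design: {design}")
    remaining = nodes[:]
    chosen = set()
    while len(chosen) < k:
        # +1 keeps isolated nodes drawable (an assumption of this sketch)
        weights = [len(adj[v]) + 1 for v in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.add(pick)
        remaining.remove(pick)
    return chosen


def induced_subgraph(adj, kept):
    """Observed network: only links whose two endpoints were both sampled."""
    return {v: adj[v] & kept for v in kept}


def count_components(adj):
    """Count connected components with an iterative depth-first search."""
    seen = set()
    count = 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count


rng = random.Random(42)
full = make_modular_network(n_modules=4, module_size=25, p_in=0.15, p_out=0.005, rng=rng)
print("full network components:", count_components(full))
for design in ("random", "degree"):
    observed = induced_subgraph(full, sample_nodes(full, 30, design, rng))
    print(design, "sample components:", count_components(observed))
```

Repeating the loop over many seeds and sample sizes, and swapping in other module topologies, would give a crude analogue of the sampling-effort comparison; the numbers from a single run of this sketch should not be read as reproducing the paper's results.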

Keywords: Ecological networks; Field sampling design; Food webs; Modularity; Nestedness; Network metrics; Network topology; Species interaction networks.

Grants and funding

This work was conducted as a part of the Ecological Network Dynamics Working Group at the National Institute for Mathematical and Biological Synthesis, sponsored by the National Science Foundation through NSF Award #DBI-1300426, with additional support from The University of Tennessee, Knoxville, and the International Centre for Theoretical Physics ICTP-SAIFR #2016/01343-7 FAPESP. Marcus A.M. de Aguiar was supported by FAPESP (grants #2016/06054-3 and #2016/01343-7) and CNPq (grant #302049/2015-0). Erica Newman was supported by the University of Arizona Bridging Biodiversity and Conservation Science program. Publication fees were provided by the Berkeley Research Impact Initiative (BRII) sponsored by the UC Berkeley Library. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.