Motivation: Characterizing the diversity of microbial communities and understanding the environmental factors that influence community diversity are central goals of microbial ecology. The development and application of cultivation-independent molecular tools has allowed for rapid surveying of microbial community composition at unprecedented resolutions and frequencies. There is a growing need to discern robust patterns and relationships within these datasets that provide insight into microbial ecology. Pearson correlation coefficient (PCC) analysis is commonly used to identify linear relationships between two species, or between species and environmental factors. However, this approach may not capture more complex interactions that occur in situ; thus, alternative analyses were explored.
Results: In this paper, we introduce local similarity analysis (LSA), a technique that can identify more complex dependence associations among species, as well as associations between species and environmental factors, without requiring significant data reduction. To illustrate its ability to identify relationships that PCC may miss, we first applied LSA to simulated data. We then applied LSA to a marine microbial observatory dataset and identified unique, significant associations that were not detected by PCC analysis. LSA results, combined with results from PCC analysis, were used to construct a theoretical ecological network that allows for easy visualization of the most significant associations. Biological implications of the significant associations detected by LSA are discussed. We also identify additional applications where LSA would be beneficial.
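Illustration: The contrast between PCC and a local similarity score can be sketched on a toy pair of series in which one is a time-shifted copy of the other. The sketch below is not the implementation described under Availability. It assumes a dynamic-programming recurrence that accumulates products of normalized values along diagonals within a maximum time delay, and it uses plain z-scores for normalization. The function names, the `max_delay` parameter and the toy data are hypothetical, and significance testing is omitted.

```python
import math


def zscore(xs):
    """Center and scale a series to zero mean and unit (population) variance."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    zx, zy = zscore(xs), zscore(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def local_similarity(xs, ys, max_delay=2):
    """Signed local similarity score (assumed recurrence, see lead-in).

    pos[i][j] and neg[i][j] hold the best positively and negatively
    associated run ending at xs[i-1], ys[j-1], with |i - j| <= max_delay.
    """
    n = len(xs)
    zx, zy = zscore(xs), zscore(ys)
    pos = [[0.0] * (n + 1) for _ in range(n + 1)]
    neg = [[0.0] * (n + 1) for _ in range(n + 1)]
    best, sign = 0.0, 1
    for i in range(1, n + 1):
        lo = max(1, i - max_delay)
        hi = min(n, i + max_delay)
        for j in range(lo, hi + 1):
            prod = zx[i - 1] * zy[j - 1]
            # Extend the run on the same diagonal, or restart at zero.
            pos[i][j] = max(0.0, pos[i - 1][j - 1] + prod)
            neg[i][j] = max(0.0, neg[i - 1][j - 1] - prod)
            if pos[i][j] > best:
                best, sign = pos[i][j], 1
            if neg[i][j] > best:
                best, sign = neg[i][j], -1
    return sign * best / n


# Toy data (hypothetical): series_b repeats the peak of series_a two steps later.
series_a = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
series_b = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]

pcc = pearson(series_a, series_b)
ls = local_similarity(series_a, series_b, max_delay=2)
print(f"PCC = {pcc:.3f}, local similarity = {ls:.3f}")
```

On this toy pair the PCC is weak because the peaks do not coincide in time, whereas the local similarity score is high once a delay of two steps is allowed. This is the kind of time-lagged association that a zero-lag correlation cannot detect.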
Availability: The algorithms are implemented in S-PLUS/R and are available upon request from the corresponding author.