Designing host-associated microbiomes using the consumer/resource model

mSystems. 2024 Dec 9:e0106824. doi: 10.1128/msystems.01068-24. Online ahead of print.

Abstract

A key step toward rational microbiome engineering is in silico sampling of realistic microbial communities that correspond to desired host phenotypes, and vice versa. This remains challenging due to a lack of generative models that simultaneously capture compositions of host-associated microbiomes and host phenotypes. To that end, we present a generative model based on the mechanistic consumer/resource (C/R) framework. In the model, variation in microbial ecosystem composition arises due to differences in the availability of effective resources (inferred latent variables), while species' resource preferences remain conserved. Simultaneously, the latent variables are used to model phenotypic states of hosts. In silico microbiomes generated by our model accurately reproduce universal and dataset-specific statistics of bacterial communities. The model allows us to address three salient questions in host-associated microbial ecologies: (i) Which host phenotypes maximally constrain the composition of the host-associated microbiomes? (ii) How context-specific are phenotype/microbiome associations? (iii) What are plausible microbiome compositions that correspond to desired host phenotypes? Our approach aids the analysis and design of microbial communities associated with host phenotypes of interest.

Importance: Generative models are extremely popular in modern biology. They have been used to model the variation of protein sequences, entire genomes, and RNA sequencing profiles. Importantly, generative models have been used to extrapolate and interpolate to unobserved regimes of data to design biological systems with desired properties. For example, there has been a boom in machine-learning models aiding in the design of proteins with user-specified structures or functions. Host-associated microbiomes play important roles in animal health and disease, as well as the productivity and environmental footprint of livestock species. However, there are no generative models of host-associated microbiomes. One chief reason is that off-the-shelf machine-learning models are data hungry, and microbiome studies usually deal with large variability and small sample sizes. Moreover, microbiome compositions are heavily context dependent, with characteristics of the host and the abiotic environment leading to distinct patterns in host-microbiome associations. Consequently, off-the-shelf generative modeling has not been successfully applied to microbiomes. To address these challenges, we develop a generative model for host-associated microbiomes derived from the consumer/resource (C/R) framework. This derivation allows us to fit the model to readily available cross-sectional microbiome profile data. Using data from three animal hosts, we show that this mechanistic generative model has several salient features: the model identifies a latent space that represents variables that determine the growth and, therefore, relative abundances of microbial species. Probabilistic modeling of variation in this latent space allows us to generate realistic in silico microbial communities. The model can assign probabilities to microbiomes, thereby allowing us to discriminate between dissimilar ecosystems. Importantly, the model predictively captures host-associated microbiomes and the corresponding hosts' phenotypes, enabling the design of microbial communities associated with user-specified host characteristics.
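
The generative idea described above (shared species resource preferences, host-specific latent resources, and a phenotype read from the same latent variables) can be illustrated with a toy sketch. This is not the authors' implementation: the softmax link between latent resources and relative abundances, the independent Gaussian prior on the latent variables, the linear phenotype readout, and every name and number below (`PREFERENCES`, `readout`, `generate`, and so on) are assumptions chosen only to make the structure concrete.

```python
import math
import random


def sample_latent(means, stds, rng):
    """Draw one host's latent 'effective resource' vector.
    Assumption: independent Gaussian latents (illustrative prior only)."""
    return [rng.gauss(m, s) for m, s in zip(means, stds)]


def relative_abundances(preferences, z):
    """Map latent resources to a composition that sums to 1.
    Assumption: a softmax over preference-weighted resources; the
    preference matrix is shared by all hosts, only z varies."""
    scores = [sum(p * zi for p, zi in zip(row, z)) for row in preferences]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def host_phenotype(readout, z, intercept=0.0):
    """Assumption: the host phenotype is a linear function of the same latents."""
    return intercept + sum(r * zi for r, zi in zip(readout, z))


def generate(preferences, means, stds, readout, n, seed=0):
    """Sample n (microbiome, phenotype) pairs from the toy model."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z = sample_latent(means, stds, rng)
        pairs.append((relative_abundances(preferences, z),
                      host_phenotype(readout, z)))
    return pairs


# Hypothetical values: 3 species (rows) by 2 latent resources (columns).
PREFERENCES = [[1.0, 0.0], [0.2, 0.8], [-0.5, 1.0]]
samples = generate(PREFERENCES, means=[0.0, 0.0], stds=[1.0, 0.5],
                   readout=[0.7, -0.3], n=5)
```

Because the microbiome and the phenotype are both functions of the same latent vector, constraining the phenotype in such a model restricts the latent variables and, through them, the plausible compositions; that shared dependence is what the design use case relies on.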

Keywords: consumer/resource model; generative modeling; host-associated microbiomes.