-
Strong, but not weak, noise correlations are beneficial for population coding
Authors:
Gabriel Mahuas,
Thomas Buffet,
Olivier Marre,
Ulisse Ferrari,
Thierry Mora
Abstract:
Neural correlations play a critical role in sensory information coding. They are of two kinds: signal correlations, when neurons have overlapping sensitivities, and noise correlations from network effects and shared noise. It is commonly thought that stimulus and noise correlations should have opposite signs to improve coding. However, experiments from early sensory systems and cortex typically show the opposite effect, with many pairs of neurons showing both types of correlations to be positive and large. Here, we develop a theory of information coding by correlated neurons which resolves this paradox. We show that noise correlations are always beneficial if they are strong enough. Extensive tests on retinal recordings under different visual stimuli confirm our predictions. Finally, using neuronal recordings and modeling, we show that for high dimensional stimuli noise correlation benefits the encoding of fine-grained details of visual stimuli, at the expense of large-scale features, which are already well encoded.
Submitted 26 June, 2024;
originally announced June 2024.
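As a rough illustration of the claim in the abstract above, the sketch below uses a textbook two-neuron Gaussian model, not the theory developed in the paper. It assumes unit-variance noise with correlation `rho` and tuning-curve slopes `a` and `b` (all hypothetical names and values), and evaluates the linear Fisher information J(rho) = (a^2 + b^2 - 2 rho a b) / (1 - rho^2). When the two slopes share a sign (positive signal correlation), a weak positive noise correlation lowers the information, while a strong one raises it above the uncorrelated value.

```python
def linear_fisher_information(a: float, b: float, rho: float) -> float:
    """Linear Fisher information of two unit-variance Gaussian neurons.

    a, b : tuning-curve slopes of the two neurons (hypothetical values).
    rho  : noise correlation between the two neurons, with -1 < rho < 1.
    The result is f'^T Sigma^{-1} f' for Sigma = [[1, rho], [rho, 1]].
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return (a * a + b * b - 2.0 * rho * a * b) / (1.0 - rho * rho)


# Slopes with the same sign stand in for positive signal correlation.
A, B = 1.0, 0.5

for rho in (0.0, 0.3, 0.9):
    info = linear_fisher_information(A, B, rho)
    print(f"rho={rho:.1f}  J={info:.3f}")
```

In this toy model the information diverges as `rho` approaches 1 whenever `a` differs from `b`, which is why sufficiently strong correlations eventually help.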
-
Computational detection of antigen specific B cell receptors following immunization
Authors:
Maria Francesca Abbate,
Thomas Dupic,
Emmanuelle Vigne,
Melody A. Shahsavarian,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
B cell receptors (BCRs) play a crucial role in recognizing and fighting foreign antigens. High-throughput sequencing enables in-depth sampling of the BCR repertoire after immunization. However, only a minor fraction of BCRs actively participate in any given infection. To what extent can we accurately identify antigen-specific sequences directly from BCR repertoires? We present a computational method grounded in sequence similarity, aimed at identifying statistically significant responsive BCRs. This method leverages well-known characteristics of affinity maturation and expected diversity. We validate its effectiveness using longitudinally sampled human immune repertoire data following influenza vaccination and SARS-CoV-2 infections. We show that different lineages converge to the same responding CDR3, demonstrating convergent selection within an individual. The outcomes of this method hold promise for application in vaccine development, personalized medicine, and antibody-derived therapeutics.
Submitted 20 December, 2023;
originally announced December 2023.
-
Variability in the local and global composition of human T-cell receptor repertoires during thymic development across cell types and individuals
Authors:
Giulio Isacchini,
Valentin Quiniou,
Hélène Vantomme,
Paul Stys,
Encarnita Mariotti-Ferandiz,
David Klatzmann,
Aleksandra M. Walczak,
Thierry Mora,
Armita Nourmohammad
Abstract:
The adaptive immune response relies on T cells that combine phenotypic specialization with diversity of T cell receptors (TCRs) to recognize a wide range of pathogens. TCRs are acquired and selected during T cell maturation in the thymus. Characterizing TCR repertoires across individuals and T cell maturation stages is important for better understanding adaptive immune responses and for developing new diagnostics and therapies. Analyzing a dataset of human TCR repertoires from thymocyte subsets, we find that the variability between individuals generated during the TCR V(D)J recombination is maintained through all stages of T cell maturation and differentiation. The inter-individual variability of repertoires of the same cell type is of comparable magnitude to the variability across cell types within the same individual. To zoom in on smaller scales than whole repertoires, we defined a distance measuring the relative overlap of locally similar sequences in repertoires. We find that the whole repertoire models correctly predict local similarity networks, suggesting a lack of forbidden T cell receptor sequences. The local measure correlates well with distances calculated using whole repertoire traits and carries information about cell types.
Submitted 25 July, 2023;
originally announced July 2023.
-
A quantitative theory of viral-immune coevolution is within reach
Authors:
Thierry Mora,
Aleksandra Walczak
Abstract:
Pathogens drive changes in host immune systems that in turn exert pressure for pathogens to evolve. Quantifying and understanding this constant coevolutionary process has clear practical global health implications. Yet it is also more accessible than macroevolution, which makes it a fascinating system for learning about the basic laws of evolution. Focusing on immune-viral evolution, we present an overview of theoretical and experimental approaches that have recently started coming together to build the foundations for a quantitative and predictive co-evolutionary theory.
Submitted 7 June, 2023;
originally announced June 2023.
-
Evolutionary stability of antigenically escaping viruses
Authors:
Victor Chardès,
Andrea Mazzolini,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
Antigenic variation is the main immune escape mechanism for RNA viruses like influenza or SARS-CoV-2. While high mutation rates promote antigenic escape, they also induce large mutational loads and reduced fitness. It remains unclear how this cost-benefit trade-off selects the mutation rate of viruses. Using a traveling wave model for the co-evolution of viruses and host immune systems in a finite population, we investigate how immunity affects the evolution of the mutation rate and other non-antigenic traits, such as virulence. We first show that the nature of the wave depends on how cross-reactive immune systems are, reconciling previous approaches. The immune-virus system behaves like a Fisher wave at low cross-reactivities, and like a fitness wave at high cross-reactivities. These regimes predict different outcomes for the evolution of non-antigenic traits. At low cross-reactivities, the evolutionarily stable strategy is to maximize the speed of the wave, implying a higher mutation rate and increased virulence. At large cross-reactivities, where our estimates place H3N2 influenza, the stable strategy is to increase the basic reproductive number, keeping the mutation rate to a minimum and virulence low.
Submitted 21 July, 2023; v1 submitted 21 April, 2023;
originally announced April 2023.
-
Towards a quantitative theory of tolerance
Authors:
Thierry Mora,
Aleksandra M. Walczak
Abstract:
A cornerstone of the classical view of tolerance is the elimination of self-reactive T cells during negative selection in the thymus. However, high-throughput T-cell receptor sequencing data has so far failed to detect substantial signatures of negative selection in the observed repertoires. In addition, quantitative estimates as well as recent experiments suggest that the elimination of self-reactive T cells is at best incomplete. We discuss several recent theoretical ideas that can explain tolerance while being consistent with these observations, including collective decision making through quorum sensing, and sensitivity to change through dynamic tuning and adaptation. We propose that a unified quantitative theory of tolerance should combine these elements to explain the plasticity of the immune system and its robustness to autoimmunity.
Submitted 13 March, 2023;
originally announced March 2023.
-
Dynamical information synergy in biochemical signaling networks
Authors:
Lauritz Hahn,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
Biological cells encode information about their environment through biochemical signaling networks that control their internal state and response. This information is often encoded in the dynamical patterns of the signaling molecules, rather than just their instantaneous concentrations. Here, we analytically calculate the information contained in these dynamics for a number of paradigmatic cases in the linear regime, for both static and time-dependent input signals. When considering oscillatory output dynamics, we report the emergence of synergy between successive measurements, meaning that the joint information in two measurements exceeds the sum of the individual information. We extend our analysis numerically beyond the scope of linear input encoding to reveal synergetic effects in the cases of frequency or damping modulation, both of which are relevant to classical biochemical signaling systems.
Submitted 21 April, 2023; v1 submitted 10 January, 2023;
originally announced January 2023.
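The synergy condition stated in the abstract above can be written compactly. The symbols are assumptions introduced here for illustration: $X$ is the input signal, and $Y_1$ and $Y_2$ are two successive measurements of the output.

```latex
\Delta I \;=\; I(X; Y_1, Y_2) \;-\; \bigl[\, I(X; Y_1) + I(X; Y_2) \,\bigr] \;>\; 0
```

A negative $\Delta I$ would instead indicate redundancy between the two measurements.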
-
Combining mutation and recombination statistics to infer clonal families in antibody repertoires
Authors:
Natanael Spisak,
Gabriel Athènes,
Thomas Dupic,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
B-cell repertoires are characterized by a diverse set of receptors of distinct specificities generated through two processes of somatic diversification: V(D)J recombination and somatic hypermutations. B cell clonal families stem from the same V(D)J recombination event, but differ in their hypermutations. Clonal family identification is key to understanding B-cell repertoire function, evolution and dynamics. We present HILARy (High-precision Inference of Lineages in Antibody Repertoires), an efficient, fast and precise method to identify clonal families from single- or paired-chain repertoire sequencing datasets. HILARy combines probabilistic models that capture the receptor generation and selection statistics with adapted clustering methods to achieve consistently high inference accuracy. It automatically leverages the phylogenetic signal of shared mutations in difficult repertoire subsets. Exploiting the high sensitivity of the method, we find that the statistics of evolutionary properties such as the site frequency spectrum and dN/dS ratio do not depend on the junction length. We also identify a broad range of selection pressures spanning two orders of magnitude.
Submitted 15 March, 2024; v1 submitted 22 December, 2022;
originally announced December 2022.
-
A small-correlation expansion to quantify information in noisy sensory systems
Authors:
Gabriel Mahuas,
Olivier Marre,
Thierry Mora,
Ulisse Ferrari
Abstract:
Neural networks encode information through their collective spiking activity in response to external stimuli. This population response is noisy and strongly correlated, with complex interplay between correlations induced by the stimulus, and correlations caused by shared noise. Understanding how these correlations affect information transmission has so far been limited to pairs or small groups of neurons, because the curse of dimensionality impedes the evaluation of mutual information in larger populations. Here we develop a small-correlation expansion to compute the stimulus information carried by a large population of neurons, yielding interpretable analytical expressions in terms of the neurons' firing rates and pairwise correlations. We validate the approximation on synthetic data and demonstrate its applicability to electrophysiological recordings in the vertebrate retina, allowing us to quantify the effects of noise correlations between neurons and of memory in single neurons.
Submitted 24 November, 2022;
originally announced November 2022.
-
Generalized Glauber dynamics for inference in biology
Authors:
Xiaowen Chen,
Maciej Winiarski,
Alicja Puscian,
Ewelina Knapska,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
Large interacting systems in biology often exhibit emergent dynamics, such as coexistence of multiple time scales, manifested by fat tails in the distribution of waiting times. While existing tools in statistical inference, such as maximum entropy models, reproduce the empirical steady state distributions, it remains challenging to learn dynamical models. We present a novel inference method, called generalized Glauber dynamics. Constructed through a non-Markovian fluctuation dissipation theorem, generalized Glauber dynamics tunes the dynamics of an interacting system, while keeping the steady state distribution fixed. We motivate the need for the method on real data from Eco-HAB, an automated habitat for testing behavior in groups of mice under semi-naturalistic conditions, and present it on simple Ising spin systems. We show its applicability for experimental data, by inferring dynamical models of social interactions in a group of mice that reproduce both its collective behavior and the long tails observed in individual dynamics.
Submitted 21 July, 2023; v1 submitted 6 August, 2022;
originally announced August 2022.
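For orientation, the sketch below implements the standard (Markovian) Glauber dynamics on a small Ising system, which is the baseline that the abstract above generalizes. It is not the authors' generalized method. It assumes inverse temperature 1 and the energy E = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j, and all function names, parameters and values are hypothetical.

```python
import math
import random


def glauber_step(spins, couplings, fields, rng):
    """Apply one standard heat-bath Glauber update in place and return spins."""
    n = len(spins)
    i = rng.randrange(n)
    # Local field acting on spin i: external field plus coupled neighbours.
    local = fields[i] + sum(couplings[i][j] * spins[j] for j in range(n) if j != i)
    # Heat-bath rule: spin i is set to +1 with probability 1 / (1 + exp(-2 * local)).
    p_up = 1.0 / (1.0 + math.exp(-2.0 * local))
    spins[i] = 1 if rng.random() < p_up else -1
    return spins


def simulate(n_spins=5, n_steps=1000, coupling=0.2, seed=0):
    """Run the chain on a fully connected system with uniform couplings."""
    rng = random.Random(seed)
    couplings = [[0.0 if i == j else coupling for j in range(n_spins)]
                 for i in range(n_spins)]
    fields = [0.0] * n_spins
    spins = [rng.choice((-1, 1)) for _ in range(n_spins)]
    trajectory = []
    for _ in range(n_steps):
        glauber_step(spins, couplings, fields, rng)
        trajectory.append(list(spins))  # store a copy of the current state
    return trajectory


if __name__ == "__main__":
    states = simulate()
    mean_magnetization = sum(sum(s) for s in states) / (len(states) * len(states[0]))
    print(f"mean magnetization: {mean_magnetization:.3f}")
```

This update rule leaves the Boltzmann distribution of the stated energy stationary. Changing the dynamics while preserving that steady state is the problem the abstract addresses.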
-
Inspecting the interaction between HIV and the immune system through genetic turnover
Authors:
Andrea Mazzolini,
Thierry Mora,
Aleksandra M Walczak
Abstract:
Chronic infections of the human immunodeficiency virus (HIV) create a very complex co-evolutionary process, where the virus tries to escape the continuously adapting host immune system. Quantitative details of this process are largely unknown and could help in disease treatment and vaccine development. Here we study a longitudinal dataset of ten HIV-infected people, where both the B-cell receptors and the virus are deeply sequenced. We focus on simple measures of turnover, which quantify how much the composition of the viral strains and the immune repertoire change between time points. At the single-patient level, the viral-host turnover rates do not show any statistically significant correlation; however, they correlate if the information is aggregated across patients. In particular, we identify an anti-correlation: large changes in the viral pool composition come with small changes in the B-cell receptor repertoire. This result seems to contradict the naive expectation that when the virus mutates quickly, the immune repertoire needs to change to keep up. However, we show that the observed anti-correlation naturally emerges and can be understood in terms of simple population-genetics models.
Submitted 26 July, 2022;
originally announced July 2022.
-
From evolution to folding of repeat proteins
Authors:
Ezequiel A. Galpern,
Jacopo Marchi,
Thierry Mora,
Aleksandra M. Walczak,
Diego U. Ferreiro
Abstract:
Repeat proteins are made with tandem copies of similar amino acid stretches that fold into elongated architectures. Due to their symmetry, these proteins constitute excellent model systems to investigate how evolution relates to structure, folding and function. Here, we propose a scheme to map evolutionary information at the sequence level to a coarse-grained model for repeat-protein folding and use it to investigate the folding of thousands of repeat proteins. We model the energetics by a combination of an inverse Potts model scheme with an explicit mechanistic model of duplications and deletions of repeats to calculate the evolutionary parameters of the system at the single-residue level. This is used to inform an Ising-like model that allows for the generation of folding curves, apparent domain emergence and occupation of intermediate states that are highly compatible with experimental data in specific case studies. We analyzed the folding of thousands of natural Ankyrin-repeat proteins and found that a multiplicity of folding mechanisms is possible. Fully cooperative all-or-none transitions are obtained for arrays with enough sequence-similar elements and strong interactions between them, while non-cooperative element-by-element intermittent folding arises if the elements are dissimilar and the interactions between them are energetically weak. In between, we characterised nucleation-propagation and multi-domain folding mechanisms. Finally, we showed that stability and cooperativity of a repeat-array can be quantitatively predicted from a simple energy score, paving the way for guiding protein folding design with a co-evolutionary model.
Submitted 24 February, 2022;
originally announced February 2022.
-
Learning the statistics and landscape of somatic mutation-induced insertions and deletions in antibodies
Authors:
Cosimo Lupo,
Natanael Spisak,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
Affinity maturation is crucial for improving the binding affinity of antibodies to antigens. This process is mainly driven by point substitutions caused by somatic hypermutations of the immunoglobulin gene. It also includes deletions and insertions of genomic material known as indels. While the landscape of point substitutions has been extensively studied, a detailed statistical description of indels is still lacking. Here we present a probabilistic inference tool to learn the statistics of indels from repertoire sequencing data, which overcomes the pitfalls and biases of standard annotation methods. The model includes antibody-specific maturation ages to account for variable mutational loads in the repertoire. After validation on synthetic data, we applied our tool to a large dataset of human immunoglobulin heavy chains. The inferred model allows us to identify universal statistical features of indels in heavy chains. We report distinct insertion and deletion hotspots, and show that the distribution of lengths of indels follows a geometric distribution, which puts constraints on future mechanistic models of the hypermutation process.
Submitted 4 April, 2022; v1 submitted 15 December, 2021;
originally announced December 2021.
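The abstract above reports that indel lengths follow a geometric distribution. As a minimal sketch, not the authors' inference tool, the snippet below fits a geometric law P(l) = p (1 - p)^(l - 1) for l >= 1 by maximum likelihood, for which the estimate of p is the reciprocal of the mean length. The function names and the toy lengths are hypothetical.

```python
def fit_geometric(lengths):
    """Maximum-likelihood estimate of p for P(l) = p * (1 - p)**(l - 1), l >= 1."""
    if not lengths or any(length < 1 for length in lengths):
        raise ValueError("lengths must be a non-empty list of integers >= 1")
    # For this parameterization the MLE is 1 / mean(lengths).
    return len(lengths) / sum(lengths)


def geometric_pmf(length, p):
    """Probability of observing an indel of the given length under the fit."""
    return p * (1.0 - p) ** (length - 1)


# Made-up indel lengths, used only to exercise the functions.
toy_lengths = [1, 1, 2, 4]
p_hat = fit_geometric(toy_lengths)
print(p_hat, geometric_pmf(3, p_hat))
```

A real analysis would also have to handle the annotation biases and variable mutational loads that the abstract mentions, which this sketch ignores.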
-
Affinity maturation for an optimal balance between long-term immune coverage and short-term resource constraints
Authors:
Victor Chardès,
Massimo Vergassola,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
In order to target threatening pathogens, the adaptive immune system performs a continuous reorganization of its lymphocyte repertoire. Following an immune challenge, the B cell repertoire can evolve cells of increased specificity for the encountered strain. This process of affinity maturation generates a memory pool whose diversity and size remain difficult to predict. We assume that the immune system follows a strategy that maximizes the long-term immune coverage and minimizes the short-term metabolic costs associated with affinity maturation. This strategy is defined as an optimal decision process on a finite dimensional phenotypic space, where a pre-existing population of naive cells is sequentially challenged with a neutrally evolving strain. We unveil a trade-off between immune protection against future strains and the necessary reorganization of the repertoire. This plasticity of the repertoire drives the emergence of distinct regimes for the size and diversity of the memory pool, depending on the density of naive cells and on the mutation rate of the strain. The model predicts power-law distributions of clonotype sizes observed in data, and rationalizes antigenic imprinting as a strategy to minimize metabolic costs while keeping good immune protection against future strains.
Submitted 25 November, 2021; v1 submitted 26 July, 2021;
originally announced July 2021.
-
Antigenic waves of virus-immune co-evolution
Authors:
Jacopo Marchi,
Michael Lässig,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
The evolution of many microbes and pathogens, including circulating viruses such as seasonal influenza, is driven by immune pressure from the host population. In turn, the immune systems of infected populations get updated, chasing viruses even further away. Quantitatively understanding how these dynamics result in observed patterns of rapid pathogen and immune adaptation is instrumental to epidemiological and evolutionary forecasting. Here we present a mathematical theory of co-evolution between immune systems and viruses in a finite-dimensional antigenic space, which describes the cross-reactivity of viral strains and immune systems primed by previous infections. We show the emergence of an antigenic wave that is pushed forward and canalized by cross-reactivity. We obtain analytical results for shape, speed, and angular diffusion of the wave. In particular, we show that viral-immune co-evolution generates a new emergent timescale, the persistence time of the wave's direction in antigenic space, which can be much longer than the coalescence time of the viral population. We compare these dynamics to the observed antigenic turnover of influenza strains, and we discuss how the dimensionality of antigenic space impacts on the predictability of the evolutionary dynamics. Our results provide a concrete and tractable framework to describe pathogen-host co-evolution.
Submitted 7 May, 2021; v1 submitted 20 February, 2021;
originally announced February 2021.
-
NoisET: Noise learning and Expansion detection of T-cell receptors
Authors:
Meriem Bensouda Koraichi,
Maximilian Puelma Touzel,
Andrea Mazzolini,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
High-throughput sequencing of T- and B-cell receptors makes it possible to track immune repertoires across time, in different tissues, in acute and chronic diseases and in healthy individuals. However, quantitative comparison between repertoires is confounded by variability in the read count of each receptor clonotype due to sampling, library preparation, and expression noise. We review methods for accounting for both biological and experimental noise and present an easy-to-use Python package, NoisET, that implements and generalizes a previously developed Bayesian method. It can be used to learn experimental noise models for repertoire sequencing from replicates, and to detect responding clones following a stimulus. We test the package on different repertoire sequencing technologies and datasets. We review how such approaches have been used to identify responding clonotypes in vaccination and disease data. Availability: NoisET is freely available to use with source code at github.com/statbiophys/NoisET.
Submitted 17 July, 2022; v1 submitted 6 February, 2021;
originally announced February 2021.
-
Non-equilibrium dynamics of adaptation in sensory systems
Authors:
Daniele Conti,
Thierry Mora
Abstract:
Adaptation is used by biological sensory systems to respond to a wide range of environmental signals, by adapting their response properties to the statistics of the stimulus in order to maximize information transmission. We derive rules of optimal adaptation to changes in the mean and variance of a continuous stimulus in terms of Bayesian filters, and map them onto stochastic equations that couple the state of the environment to an internal variable controlling the response function. We calculate numerical and exact results for the speed and accuracy of adaptation, and its impact on information transmission. We find that, in the regime of efficient adaptation, the speed of adaptation scales sublinearly with the rate of change of the environment. Finally, we exploit the mathematical equivalence between adaptation and stochastic thermodynamics to quantitatively relate adaptation to the irreversibility of the adaptation time course, defined by the rate of entropy production. Our results suggest a means to empirically quantify adaptation in a model-free and non-parametric way.
Submitted 12 April, 2021; v1 submitted 19 November, 2020;
originally announced November 2020.
-
Deep generative selection models of T and B cell receptor repertoires with soNNia
Authors:
Giulio Isacchini,
Aleksandra M Walczak,
Thierry Mora,
Armita Nourmohammad
Abstract:
Subclasses of lymphocytes carry different functional roles to work together to produce an immune response and lasting immunity. In addition to these functional roles, T and B-cell lymphocytes rely on the diversity of their receptor chains to recognize different pathogens. The lymphocyte subclasses emerge from common ancestors generated with the same diversity of receptors during selection processes. Here we leverage biophysical models of receptor generation with machine learning models of selection to identify specific sequence features characteristic of functional lymphocyte repertoires and subrepertoires. Specifically, using only repertoire-level sequence information, we classify CD4$^+$ and CD8$^+$ T-cells, find correlations between receptor chains arising during selection and identify T-cell subsets that are targets of pathogenic epitopes. We also show examples of when simple linear classifiers do as well as more complex machine learning methods.
Submitted 26 March, 2021; v1 submitted 5 November, 2020;
originally announced November 2020.
-
Learning the heterogeneous hypermutation landscape of immunoglobulins from high-throughput repertoire data
Authors:
Natanael Spisak,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
Somatic hypermutations of immunoglobulin (Ig) genes occurring during affinity maturation drive B-cell receptors' ability to evolve strong binding to their antigenic targets. The landscape of these mutations is highly heterogeneous, with certain regions of the Ig gene being preferentially targeted. However, a rigorous quantification of this bias has been difficult because of phylogenetic correlations between sequences and the interference of selective forces. Here, we present an approach that corrects for these issues, and use it to learn a model of hypermutation preferences from a recently published large IgH repertoire dataset. The obtained model predicts mutation profiles accurately and in a reproducible way, including in the previously uncharacterized Complementarity Determining Region 3, revealing that both the sequence context of the mutation and its absolute position along the gene are important. In addition, we show that hypermutations occurring concomitantly along B-cell lineages tend to co-localize, suggesting a possible mechanism for accelerating affinity maturation.
Submitted 8 September, 2020; v1 submitted 23 July, 2020;
originally announced July 2020.
-
Immune Fingerprinting through Repertoire Similarity
Authors:
Thomas Dupic,
Meriem Bensouda Koraichi,
Anastasia Minervina,
Mikhail Pogorelyy,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
Immune repertoires provide a unique fingerprint reflecting the immune history of individuals, with potential applications in precision medicine. However, the question of how personal that information is and how it can be used to identify individuals has not been explored. Here, we show that individuals can be uniquely identified from repertoires of just a few thousand lymphocytes. We present "Immprint," a classifier using an information-theoretic measure of repertoire similarity to distinguish pairs of repertoire samples coming from the same versus different individuals. Using published T-cell receptor repertoires and statistical modeling, we tested its ability to identify individuals with great accuracy, including identical twins, by computing false positive and false negative rates $< 10^{-6}$ from samples composed of 10,000 T-cells. We verified through longitudinal datasets and simulations that the method is robust to acute infections and the passage of time. These results emphasize the private and personal nature of repertoire data.
Submitted 5 October, 2020; v1 submitted 24 June, 2020;
originally announced June 2020.
-
A new inference approach for training shallow and deep generalized linear models of noisy interacting neurons
Authors:
Gabriel Mahuas,
Giulio Isacchini,
Olivier Marre,
Ulisse Ferrari,
Thierry Mora
Abstract:
Generalized linear models are one of the most efficient paradigms for predicting the correlated stochastic activity of neuronal networks in response to external stimuli, with applications in many brain areas. However, when dealing with complex stimuli, the inferred coupling parameters often do not generalize across different stimulus statistics, leading to degraded performance and blowup instabilities. Here, we develop a two-step inference strategy that allows us to train robust generalized linear models of interacting neurons, by explicitly separating the effects of correlations in the stimulus from network interactions in each training step. Applying this approach to the responses of retinal ganglion cells to complex visual stimuli, we show that, compared to classical methods, the models trained in this way exhibit improved performance, are more stable, yield robust interaction networks, and generalize well across complex visual statistics. The method can be extended to deep convolutional neural networks, leading to models with high predictive accuracy for both the neuron firing rates and their correlations.
Submitted 15 November, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Longitudinal high-throughput TCR repertoire profiling reveals the dynamics of T cell memory formation after mild COVID-19 infection
Authors:
Anastasia A. Minervina,
Ekaterina A. Komech,
Aleksei Titov,
Meriem Bensouda Koraichi,
Elisa Rosati,
Ilgar Z. Mamedov,
Andre Franke,
Grigory A. Efimov,
Dmitriy M. Chudakov,
Thierry Mora,
Aleksandra M. Walczak,
Yuri B. Lebedev,
Mikhail V. Pogorelyy
Abstract:
COVID-19 is a global pandemic caused by the SARS-CoV-2 coronavirus. T cells play a key role in the adaptive antiviral immune response by killing infected cells and facilitating the selection of virus-specific antibodies. However, neither the dynamics and cross-reactivity of the SARS-CoV-2-specific T cell response nor the diversity of resulting immune memory are well understood. In this study we use longitudinal high-throughput T cell receptor (TCR) sequencing to track changes in the T cell repertoire following two mild cases of COVID-19. In both donors we identified CD4+ and CD8+ T cell clones with transient clonal expansion after infection. The antigen specificity of CD8+ TCR sequences to SARS-CoV-2 epitopes was confirmed by both MHC tetramer binding and presence in a large database of SARS-CoV-2 epitope-specific TCRs. We describe characteristic motifs in TCR sequences of COVID-19-reactive clones and show preferential occurrence of these motifs in a publicly available large dataset of repertoires from COVID-19 patients. We show that in both donors the majority of infection-reactive clonotypes acquire memory phenotypes. Certain T cell clones were detected in the memory fraction at the pre-infection timepoint, suggesting participation of pre-existing cross-reactive memory T cells in the immune response to SARS-CoV-2.
Submitted 30 September, 2020; v1 submitted 17 May, 2020;
originally announced May 2020.
-
SOS: Online probability estimation and generation of T and B cell receptors
Authors:
Giulio Isacchini,
Carlos Olivares,
Armita Nourmohammad,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
Recent advances in modelling VDJ recombination and subsequent selection of T and B cell receptors provide useful tools to analyze and compare immune repertoires across time, individuals, and tissues. A suite of tools--IGoR [1], OLGA [2] and SONIA [3]--has been publicly released to the community, allowing the inference of generative and selection models from high-throughput sequencing data. However, using these tools requires some scripting or command-line skills and familiarity with complex datasets. As a result, the application of the above models has not been available to a broad audience. In this application note, we fill this gap by presenting Simple OLGA & SONIA (SOS), a web-based interface where users with no coding skills can compute the generation and post-selection probabilities of their sequences, as well as generate batches of synthetic sequences. The application also functions on mobile phones.
Submitted 29 March, 2020;
originally announced March 2020.
-
Population variability in the generation and thymic selection of T-cell repertoires
Authors:
Zachary Sethna,
Giulio Isacchini,
Thomas Dupic,
Thierry Mora,
Aleksandra M. Walczak,
Yuval Elhanati
Abstract:
The diversity of T-cell receptor (TCR) repertoires is achieved by a combination of two intrinsically stochastic steps: random receptor generation by VDJ recombination, and selection based on the recognition of random self-peptides presented on the major histocompatibility complex. These processes lead to a large receptor variability within and between individuals. However, the characterization of the variability is hampered by the limited size of the sampled repertoires. We introduce a new software tool, SONIA, to facilitate inference of individual-specific computational models for the generation and selection of the TCR beta chain (TRB) from sequenced repertoires of 651 individuals, separating and quantifying the variability of the two processes of generation and selection in the population. We find not only that most of the variability is driven by the VDJ generation process, but also that there is a large degree of consistency between individuals, with the inter-individual variance of repertoires being about 2% of the intra-individual variance. Known viral-specific TCRs follow the same generation and selection statistics as all TCRs.
Submitted 9 January, 2020;
originally announced January 2020.
-
Building general Langevin models from discrete data sets
Authors:
Federica Ferretti,
Victor Chardès,
Thierry Mora,
Aleksandra M. Walczak,
Irene Giardina
Abstract:
Many living and complex systems exhibit second order emergent dynamics. Limited experimental access to the configurational degrees of freedom results in data that appears to be generated by a non-Markovian process. This poses a challenge in the quantitative reconstruction of the model from experimental data, even in the simple case of equilibrium Langevin dynamics of Hamiltonian systems. We develop a novel Bayesian inference approach to learn the parameters of such stochastic effective models from discrete finite length trajectories. We first discuss the failure of naive inference approaches based on the estimation of derivatives through finite differences, regardless of the time resolution and the length of the sampled trajectories. We then derive, adopting higher order discretization schemes, maximum likelihood estimators for the model parameters that provide excellent results even with moderately long trajectories. We apply our method to second order models of collective motion and show that our results also hold in the presence of interactions.
Submitted 13 May, 2020; v1 submitted 22 December, 2019;
originally announced December 2019.
-
Inferring the immune response from repertoire sequencing
Authors:
Maximilian Puelma Touzel,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
High-throughput sequencing of B- and T-cell receptors makes it possible to track immune repertoires across time, in different tissues, and in acute and chronic diseases or in healthy individuals. However, quantitative comparison between repertoires is confounded by variability in the read count of each receptor clonotype due to sampling, library preparation, and expression noise. Here, we present a general Bayesian approach to disentangle repertoire variations from these stochastic effects. Using replicate experiments, we first show how to learn the natural variability of read counts by inferring the distributions of clone sizes as well as an explicit noise model relating true frequencies of clones to their read count. We then use that null model as a baseline to infer a model of clonal expansion from two repertoire time points taken before and after an immune challenge. Applying our approach to yellow fever vaccination as a model of acute infection in humans, we identify candidate clones participating in the response.
Submitted 16 April, 2020; v1 submitted 17 December, 2019;
originally announced December 2019.
-
On generative models of T-cell receptor sequences
Authors:
Giulio Isacchini,
Zachary Sethna,
Yuval Elhanati,
Armita Nourmohammad,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
T-cell receptors (TCR) are key proteins of the adaptive immune system, generated randomly in each individual, whose diversity underlies our ability to recognize infections and malignancies. Modeling the distribution of TCR sequences is of key importance for immunology and medical applications. Here, we compare two inference methods trained on high-throughput sequencing data: a knowledge-guided approach, which accounts for the details of sequence generation, supplemented by a physics-inspired model of selection; and a knowledge-free Variational Auto-Encoder based on deep artificial neural networks. We show that the knowledge-guided model outperforms the deep network approach at predicting TCR probabilities, while being more interpretable, at a lower computational cost.
Submitted 13 March, 2020; v1 submitted 27 November, 2019;
originally announced November 2019.
-
Primary and secondary anti-viral response captured by the dynamics and phenotype of individual T cell clones
Authors:
Anastasia A. Minervina,
Mikhail V. Pogorelyy,
Ekaterina A. Komech,
Vadim K. Karnaukhov,
Petra Bacher,
Elisa Rosati,
Andre Franke,
Dmitriy M. Chudakov,
Ilgar Z. Mamedov,
Yuri B. Lebedev,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
The diverse repertoire of T-cell receptors (TCR) plays a key role in the adaptive immune response to infections. Previous studies show that secondary responses to the yellow fever vaccine - the model for acute infection in humans - are weaker than primary ones, but only quantitative measurements can describe the concentration changes and lineage fates for distinct T-cell clones in vivo over time. Using TCR alpha and beta repertoire sequencing for T-cell subsets, as well as single-cell RNAseq and TCRseq, we track the concentrations and phenotypes of individual T-cell clones in response to primary and secondary yellow fever immunization showing their large diversity. We confirm the secondary response is an order of magnitude weaker, albeit $\sim10$ days faster than the primary one. Estimating the fraction of the T-cell response directed against the single immunodominant epitope, we identify the sequence features of TCRs that define the high precursor frequency of the two major TCR motifs specific for this particular epitope. We also show the consistency of clonal expansion dynamics between bulk alpha and beta repertoires, using a new methodology to reconstruct alpha-beta pairings from clonal trajectories.
Submitted 3 March, 2020; v1 submitted 25 October, 2019;
originally announced October 2019.
-
Physical limit to concentration sensing in a changing environment
Authors:
Thierry Mora,
Ilya Nemenman
Abstract:
Cells adapt to changing environments by sensing ligand concentrations using specific receptors. The accuracy of sensing is ultimately limited by the finite number of ligand molecules bound by receptors. Previously derived physical limits to sensing accuracy have assumed that the concentration was constant and ignored its temporal fluctuations. We formulate the problem of concentration sensing in a strongly fluctuating environment as a non-linear field-theoretic problem, for which we find an excellent approximate Gaussian solution. We derive a new physical bound on the relative error in concentration $c$ which scales as $δc/c \sim (Dacτ)^{-1/4}$ with ligand diffusivity $D$, receptor cross-section $a$, and characteristic fluctuation time scale $τ$, in stark contrast with the usual Berg and Purcell bound $δc/c \sim (DacT)^{-1/2}$ for a perfect receptor sensing concentration during time $T$. We show how the bound can be achieved by a simple biochemical network downstream of the receptor that adapts the kinetics of signaling as a function of the square root of the sensed concentration.
Submitted 12 August, 2019;
originally announced August 2019.
-
How many different clonotypes do immune repertoires contain?
Authors:
Thierry Mora,
Aleksandra M. Walczak
Abstract:
Immune repertoires rely on diversity of T-cell and B-cell receptors to protect us against foreign threats. The ability to recognize a wide variety of pathogens is linked to the number of different clonotypes expressed by an individual. Out of the estimated $\sim 10^{12}$ different B and T cells in humans, how many of them express distinct receptors? We review current and past estimates for these numbers. We point out a fundamental limitation of current methods, which ignore the tail of small clones in the distribution of clone sizes. We show that this tail strongly affects the total number of clones, but it is impractical to access experimentally. We propose that combining statistical models with mechanistic models of lymphocyte clonal dynamics offers possible new strategies for estimating the number of clones.
Submitted 18 July, 2019;
originally announced July 2019.
-
Quantitative Immunology for Physicists
Authors:
Grégoire Altan-Bonnet,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
The adaptive immune system is a dynamical, self-organized multiscale system that protects vertebrates from both pathogens and internal irregularities, such as tumours. For these reasons it fascinates physicists, yet the multitude of different cells, molecules and sub-systems is often also petrifying. Despite this complexity, as experiments on different scales of the adaptive immune system become more quantitative, many physicists have made both theoretical and experimental contributions that help predict the behaviour of ensembles of cells and molecules that participate in an immune response. Here we review some recent contributions with an emphasis on quantitative questions and methodologies. We also provide a more general methods section that presents some of the wide array of theoretical tools used in the field.
Submitted 28 July, 2019; v1 submitted 8 July, 2019;
originally announced July 2019.
-
Multi-lineage evolution in viral populations driven by host immune systems
Authors:
Jacopo Marchi,
Michael Lässig,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
Viruses evolve in the background of host immune systems that exert selective pressure and drive viral evolutionary trajectories. This interaction leads to different evolutionary patterns in antigenic space. Examples observed in nature include the effectively one-dimensional escape characteristic of influenza A and the prolonged coexistence of lineages in influenza B. Here we use an evolutionary model for viruses in the presence of immune host systems with finite memory to delineate parameter regimes of these patterns in a two-dimensional antigenic space. We find that for small effective mutation rates and mutation jump ranges, a single lineage is the only stable solution. Large effective mutation rates combined with large mutational jumps in antigenic space lead to multiple stably co-existing lineages over prolonged evolutionary periods. These results combined with observations from data constrain the parameter regimes for the adaptation of viruses, including influenza.
Submitted 18 June, 2019;
originally announced June 2019.
-
Size and structure of the sequence space of repeat proteins
Authors:
Jacopo Marchi,
Ezequiel A. Galpern,
Rocio Espada,
Diego U. Ferreiro,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
The coding space of protein sequences is shaped by evolutionary constraints set by requirements of function and stability. We show that the coding space of a given protein family--the total number of sequences in that family--can be estimated using models of maximum entropy trained on multiple sequence alignments of naturally occurring amino acid sequences. We analyzed and calculated the size of three abundant repeat protein families, whose members are large proteins made of many repetitions of conserved portions of ~30 amino acids. While amino acid conservation at each position of the alignment explains most of the reduction of diversity relative to completely random sequences, we found that correlations between amino acid usage at different positions significantly impact that diversity. We quantified the impact of different types of correlations, functional and evolutionary, on sequence diversity. Analysis of the detailed structure of the coding space of the families revealed a rugged landscape, with many local energy minima of varying sizes with a hierarchical structure, reminiscent of frustrated energy landscapes of spin glasses in physics. This clustered structure indicates a multiplicity of subtypes within each family, and suggests new strategies for protein design.
Submitted 3 July, 2019; v1 submitted 11 May, 2019;
originally announced May 2019.
-
Cost and benefits of CRISPR spacer acquisition
Authors:
Serena Bradde,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
CRISPR-Cas mediated immunity in bacteria allows bacterial populations to protect themselves against pathogens. However, it also exposes them to the dangers of auto-immunity by developing protection that targets their own genome. Using a simple model of the coupled dynamics of phage and bacterial populations, we explore how acquisition rates affect the survival rate of the bacterial colony. We find that the optimal strategy depends on the initial population sizes of both viruses and bacteria. Additionally, certain combinations of acquisition and dynamical rates and initial population sizes guarantee protection, due to a dynamical balance between the evolving population sizes, without relying on acquisition of viral spacers. Outside this regime, the high cost of auto-immunity limits the acquisition rate. We discuss these optimal survival strategies in terms of recent experiments.
Submitted 8 November, 2018;
originally announced November 2018.
-
Receptor crosstalk improves concentration sensing of multiple ligands
Authors:
Martin Carballo-Pacheco,
Jonathan Desponds,
Tatyana Gavrilchenko,
Andreas Mayer,
Roshan Prizak,
Gautam Reddy,
Ilya Nemenman,
Thierry Mora
Abstract:
Cells need to reliably sense external ligand concentrations to achieve various biological functions such as chemotaxis or signaling. The molecular recognition of ligands by surface receptors is degenerate in many systems leading to crosstalk between different receptors. Crosstalk is often thought of as a deviation from optimal specific recognition, as the binding of non-cognate ligands can interfere with the detection of the receptor's cognate ligand, possibly leading to a false triggering of a downstream signaling pathway. Here we quantify the optimal precision of sensing the concentrations of multiple ligands by a collection of promiscuous receptors. We demonstrate that crosstalk can improve precision in concentration sensing and discrimination tasks. To achieve superior precision, the additional information about ligand concentrations contained in short binding events of the non-cognate ligand should be exploited. We present a proofreading scheme to realize an approximate estimation of multiple ligand concentrations that reaches a precision close to the derived optimal bounds. Our results help rationalize the observed ubiquity of receptor crosstalk in molecular sensing.
Submitted 10 October, 2018;
originally announced October 2018.
-
Detecting T-cell receptors involved in immune responses from single repertoire snapshots
Authors:
Mikhail V Pogorelyy,
Anastasia A Minervina,
Mikhail Shugay,
Dmitriy M Chudakov,
Yuri B Lebedev,
Thierry Mora,
Aleksandra M Walczak
Abstract:
Hypervariable T-cell receptors (TCR) play a key role in adaptive immunity, recognising a vast diversity of pathogen-derived antigens. High throughput sequencing of TCR repertoires (RepSeq) produces huge datasets of T-cell receptor sequences from blood and tissue samples. However, our ability to extract clinically relevant information from RepSeq data is limited, mainly because little is known about TCR-disease associations. Here we present a statistical approach called ALICE (Antigen-specific Lymphocyte Identification by Clustering of Expanded sequences) that identifies TCR sequences that are actively involved in the current immune response from a single RepSeq sample, and apply it to repertoires of patients with a variety of disorders - autoimmune disease (ankylosing spondylitis), patients under cancer immunotherapy, or subject to an acute infection (live yellow fever vaccine). The method's robustness is demonstrated by the agreement of its predictions with independent assays, and is supported by its ability to selectively detect responding TCR in the memory but not in the naive subset. ALICE requires no longitudinal data collection nor large cohorts, and is thus directly applicable to most RepSeq datasets. Its results facilitate the identification of TCR variants associated with a wide variety of diseases and conditions, which can be used for diagnostics, rational vaccine design and evaluation of the adaptive immune system state.
Submitted 23 July, 2018;
originally announced July 2018.
-
OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs
Authors:
Zachary Sethna,
Yuval Elhanati,
Curtis G. Callan Jr.,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
Motivation: High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability of generation by V(D)J recombination of T- and B-cell receptors of any specific nucleotide sequence. These generation probabilities are very non-homogeneous, ranging over 20 orders of magnitude in real repertoires. Since the function of a receptor really depends on its protein sequence, it is important to be able to predict this probability of generation at the amino acid level. However, brute-force summation over all the nucleotide sequences with the correct amino acid translation is computationally intractable. The purpose of this paper is to present a solution to this problem.
Results: We use dynamic programming to construct an efficient and flexible algorithm, called OLGA (Optimized Likelihood estimate of immunoGlobulin Amino-acid sequences), for calculating the probability of generating a given CDR3 amino acid sequence or motif, with or without V/J restriction, as a result of V(D)J recombination in B or T cells. We apply it to databases of epitope-specific T-cell receptors to evaluate the probability that a typical human subject will possess T cells responsive to specific disease-associated epitopes. The model prediction shows an excellent agreement with published data. We suggest that OLGA may be a useful tool to guide vaccine design.
Availability: Source code is available at https://github.com/zsethna/OLGA
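To make the intractability of brute-force summation concrete, the sketch below counts how many nucleotide sequences translate to a given amino acid sequence under the standard genetic code. It is an illustration only and not part of OLGA: the function name `count_codings` and the example CDR3 are assumptions chosen for this example.

```python
# Number of codons encoding each amino acid in the standard genetic code
# (61 sense codons in total; stop codons are excluded).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_codings(aa_seq: str) -> int:
    """Return the number of nucleotide sequences that translate to aa_seq."""
    total = 1
    for aa in aa_seq:
        # Each position contributes independently, so the counts multiply.
        total *= CODON_DEGENERACY[aa]
    return total


if __name__ == "__main__":
    # Hypothetical 13-residue CDR3, used only to show the order of magnitude.
    print(count_codings("CASSLGQAYEQYF"))
```

Even a short CDR3 has more than a million nucleotide codings, and a brute-force approach would have to evaluate the generation probability of each one separately. This is the cost that a dynamic programming formulation avoids.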
Submitted 13 November, 2018; v1 submitted 12 July, 2018;
originally announced July 2018.
-
Genesis of the alpha beta T-cell receptor
Authors:
Thomas Dupic,
Quentin Marcou,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
The T-cell receptor (TCR) repertoire relies on the diversity of receptors composed of two chains, called $α$ and $β$, to recognize pathogens. Using results of high throughput sequencing and computational chain-pairing experiments of human TCR repertoires, we quantitatively characterize the $αβ$ generation process. We estimate the probabilities of a rescue recombination of the $β$ chain on the second chromosome upon failure or success on the first chromosome. Unlike $β$ chains, $α$ chains recombine simultaneously on both chromosomes, resulting in correlated statistics of the two genes which we predict using a mechanistic model. We find that $\sim 28 \%$ of cells express both $α$ chains. We report that clones sharing the same $β$ chain but different $α$ chains are overrepresented, suggesting that they respond to common immune challenges. Altogether, our statistical analysis gives a complete quantitative mechanistic picture that results in the observed correlations in the generative process. We learn that the probability to generate any TCR$αβ$ is lower than $10^{-12}$ and estimate the generation diversity and sharing properties of the $αβ$ TCR repertoire.
Submitted 11 December, 2018; v1 submitted 28 June, 2018;
originally announced June 2018.
-
Modeling the correlated activity of neural populations: A review
Authors:
Christophe Gardella,
Olivier Marre,
Thierry Mora
Abstract:
The principles of neural encoding and computations are inherently collective and usually involve large populations of interacting neurons with highly correlated activities. While theories of neural function have long recognized the importance of collective effects in populations of neurons, only in the past two decades has it become possible to record from many cells simultaneously using advanced experimental techniques with single-spike resolution, and to relate these correlations to function and behaviour. This review focuses on the modeling and inference approaches that have been recently developed to describe the correlated spiking activity of populations of neurons. We cover a variety of models describing correlations between pairs of neurons as well as between larger groups, synchronous or delayed in time, with or without the explicit influence of the stimulus, and including or not latent variables. We discuss the advantages and drawbacks of each method, as well as the computational challenges related to their application to recordings of ever larger populations.
Submitted 21 June, 2018;
originally announced June 2018.
-
How a well-adapting immune system remembers
Authors:
Andreas Mayer,
Vijay Balasubramanian,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
An adaptive agent predicting the future state of an environment must weigh trust in new observations against prior experiences. In this light, we propose a view of the adaptive immune system as a dynamic Bayesian machinery that updates its memory repertoire by balancing evidence from new pathogen encounters against past experience of infection to predict and prepare for future threats. This framework links the observed initial rapid increase of the memory pool early in life followed by a mid-life plateau to the ease of learning salient features of sparse environments. We also derive a modulated memory pool update rule in agreement with current vaccine response experiments. Our results suggest that pathogenic environments are sparse and that memory repertoires significantly decrease infection costs even with moderate sampling. The predicted optimal update scheme maps onto commonly considered competitive dynamics for antigen receptors.
Submitted 13 November, 2018; v1 submitted 14 June, 2018;
originally announced June 2018.
-
Precise tracking of vaccine-responding T-cell clones reveals convergent and personalized response in identical twins
Authors:
Mikhail V. Pogorelyy,
Anastasia A. Minervina,
Maximilian Puelma Touzel,
Anastasiia L. Sycheva,
Ekaterina A. Komech,
Elena I. Kovalenko,
Galina G. Karganova,
Evgeniy S. Egorov,
Alexander Yu. Komkov,
Dmitriy M. Chudakov,
Ilgar Z. Mamedov,
Thierry Mora,
Aleksandra M. Walczak,
Yuri B. Lebedev
Abstract:
T-cell receptor (TCR) repertoire data contain information about infections that could be used in disease diagnostics and vaccine development, but extracting that information remains a major challenge. Here we developed a statistical framework to detect TCR clone proliferation and contraction from longitudinal repertoire data. We applied this framework to data from three pairs of identical twins immunized with the yellow fever vaccine. We identified 500-1500 responding TCRs in each donor and validated them using three independent assays. While the responding TCRs were mostly private, albeit with higher overlap between twins, they could be well predicted using a classifier based on sequence similarity. Our method can also be applied to samples obtained post-infection, making it suitable for systematic discovery of new infection-specific TCRs in the clinic.
Submitted 12 April, 2018;
originally announced April 2018.
-
Predicting the spectrum of TCR repertoire sharing with a data-driven model of recombination
Authors:
Yuval Elhanati,
Zachary Sethna,
Curtis G. Callan Jr.,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
Despite the extreme diversity of T cell repertoires, many identical T-cell receptor (TCR) sequences are found in a large number of individual mice and humans. These widely-shared sequences, often referred to as 'public', have been suggested to be over-represented due to their potential immune functionality or their ease of generation by V(D)J recombination. Here we show that even for large cohorts the observed degree of sharing of TCR sequences between individuals is well predicted by a model accounting for the known quantitative statistical biases in the generation process, together with a simple model of thymic selection. Whether a sequence is shared by many individuals is predicted to depend on the number of queried individuals and the sampling depth, as well as on the sequence itself, in agreement with the data. We introduce the degree of publicness conditional on the queried cohort size and the size of the sampled repertoires. Based on these observations we propose a public/private sequence classifier, 'PUBLIC' (Public Universal Binary Likelihood Inference Classifier), based on the generation probability, which performs very well even for small cohort sizes.
Submitted 2 March, 2018;
originally announced March 2018.
-
Fierce selection and interference in B-cell repertoire response to chronic HIV-1
Authors:
Armita Nourmohammad,
Jakub Otwinowski,
Marta Łuksza,
Thierry Mora,
Aleksandra M Walczak
Abstract:
During chronic infection, HIV-1 engages in a rapid coevolutionary arms race with the host's adaptive immune system. While it is clear that HIV exerts strong selection on the adaptive immune system, the characteristics of the somatic evolution that shape the immune response are still unknown. Traditional population genetics methods fail to distinguish chronic immune response from healthy repertoire evolution. Here, we infer the evolutionary modes of B-cell repertoires and identify complex dynamics with a constant production of better B-cell receptor mutants that compete, maintaining large clonal diversity and potentially slowing down adaptation. A substantial fraction of mutations that rise to high frequencies in pathogen engaging CDRs of B-cell receptors (BCRs) are beneficial, in contrast to many such changes in structurally relevant frameworks that are deleterious and circulate by hitchhiking. We identify a pattern where BCRs in patients who experience larger viral expansions undergo stronger selection with a rapid turnover of beneficial mutations due to clonal interference in their CDR3 regions. Using population genetics modeling, we show that the extinction of these beneficial mutations can be attributed to the rise of competing beneficial alleles and clonal interference. The picture is of a dynamic repertoire, where better clones may be outcompeted by new mutants before they fix.
Submitted 25 July, 2020; v1 submitted 24 February, 2018;
originally announced February 2018.
-
Separating intrinsic interactions from extrinsic correlations in a network of sensory neurons
Authors:
Ulisse Ferrari,
Stephane Deny,
Matthew Chalk,
Gasper Tkacik,
Olivier Marre,
Thierry Mora
Abstract:
Correlations in sensory neural networks have both extrinsic and intrinsic origins. Extrinsic or stimulus correlations arise from shared inputs to the network, and thus depend strongly on the stimulus ensemble. Intrinsic or noise correlations reflect biophysical mechanisms of interactions between neurons, which are expected to be robust to changes of the stimulus ensemble. Despite the importance of this distinction for understanding how sensory networks encode information collectively, no method exists to reliably separate intrinsic interactions from extrinsic correlations in neural activity data, limiting our ability to build predictive models of the network response. In this paper we introduce a general strategy to infer population models of interacting neurons that collectively encode stimulus information. The key to disentangling intrinsic from extrinsic correlations is to infer the couplings between neurons separately from the encoding model, and to combine the two using corrections calculated in a mean-field approximation. We demonstrate the effectiveness of this approach on retinal recordings. The same coupling network is inferred from responses to radically different stimulus ensembles, showing that these couplings indeed reflect stimulus-independent interactions between neurons. The inferred model accurately predicts the collective response of retinal ganglion cell populations as a function of the stimulus.
Submitted 22 February, 2018; v1 submitted 5 January, 2018;
originally announced January 2018.
-
A simple model for low variability in neural spike trains
Authors:
Ulisse Ferrari,
Stephane Deny,
Olivier Marre,
Thierry Mora
Abstract:
Neural noise sets a limit to information transmission in sensory systems. In several areas, the spiking response (to a repeated stimulus) has shown a higher degree of regularity than predicted by a Poisson process. However, a simple model to explain this low variability is still lacking. Here we introduce a new model, with a correction to Poisson statistics, which can accurately predict the regularity of neural spike trains in response to a repeated stimulus. The model has only two parameters, but can reproduce the observed variability in retinal recordings in various conditions. We show analytically why this approximation can work. In a model of the spike-emitting process where a refractory period is assumed, we derive that our simple correction can approximate the spike train statistics well over a broad range of firing rates. Our model can be easily plugged into stimulus processing models, such as the linear-nonlinear model or its generalizations, to replace the commonly assumed Poisson spike train hypothesis. It estimates the amount of information transmitted much more accurately than Poisson models in retinal recordings. Thanks to its simplicity, this model has the potential to explain low variability in other areas.
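As a rough illustration of why a refractory period makes spiking more regular than Poisson, the sketch below uses a textbook renewal-process calculation. It is not the two-parameter model of the paper: the function name `isi_cv_squared` and the assumption of an absolute refractory period followed by an exponential waiting time are choices made for this example.

```python
def isi_cv_squared(rate_hz: float, refractory_s: float) -> float:
    """Squared coefficient of variation of inter-spike intervals (ISIs).

    Assumed model: each ISI is an absolute refractory period `refractory_s`
    plus an exponential waiting time with rate `rate_hz`. Then
        mean(ISI) = refractory_s + 1 / rate_hz
        var(ISI)  = 1 / rate_hz**2
    """
    mean_isi = refractory_s + 1.0 / rate_hz
    var_isi = 1.0 / rate_hz**2
    return var_isi / mean_isi**2


if __name__ == "__main__":
    # Without refractoriness the process is Poisson and CV^2 equals 1.
    print(isi_cv_squared(rate_hz=100.0, refractory_s=0.0))
    # A 2 ms refractory period pushes CV^2 below 1 (sub-Poisson variability).
    print(isi_cv_squared(rate_hz=100.0, refractory_s=0.002))
```

For a renewal process, the Fano factor of spike counts in long time windows tends to this squared coefficient of variation, so a value below 1 corresponds to spike counts that are less variable than Poisson counts.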
Submitted 4 January, 2018;
originally announced January 2018.
-
Physical epistatic landscape of antibody binding affinity
Authors:
Rhys M. Adams,
Justin B. Kinney,
Aleksandra M. Walczak,
Thierry Mora
Abstract:
Affinity maturation produces antibodies that bind antigens with high specificity by accumulating mutations in the antibody sequence. Mapping out the antibody-antigen affinity landscape can give us insight into the accessible paths during this rapid evolutionary process. By developing a carefully controlled null model for noninteracting mutations, we characterized epistasis in affinity measurements of a large library of antibody variants obtained by Tite-Seq, a recently introduced Deep Mutational Scan method yielding physical values of the binding constant. We show that representing affinity as the binding free energy minimizes epistasis. Yet, we find that epistatically interacting sites contribute substantially to binding. In addition to negative epistasis, we report a large amount of beneficial epistasis, enlarging the space of high-affinity antibodies as well as their mutational accessibility. These properties suggest that the degeneracy of antibody sequences that can bind a given antigen is enhanced by epistasis - an important property for vaccine design.
Submitted 11 December, 2017;
originally announced December 2017.
-
Blindfold learning of an accurate neural metric
Authors:
Christophe Gardella,
Olivier Marre,
Thierry Mora
Abstract:
The brain has no direct access to physical stimuli, but only to the spiking activity evoked in sensory organs. It is unclear how the brain can structure its representation of the world based on differences between those noisy, correlated responses alone. Here we show how to build a distance map of responses from the structure of the population activity of retinal ganglion cells, allowing for the accurate discrimination of distinct visual stimuli from the retinal response. We introduce the Temporal Restricted Boltzmann Machine to learn the spatiotemporal structure of the population activity, and use this model to define a distance between spike trains. We show that this metric outperforms existing neural distances at discriminating pairs of stimuli that are barely distinguishable. The proposed method provides a generic and biologically plausible way to learn to associate similar stimuli based on their spiking responses, without any other knowledge of these stimuli.
Submitted 13 October, 2017;
originally announced October 2017.
-
Method for identification of condition-associated public antigen receptor sequences
Authors:
Mikhail V. Pogorelyy,
Anastasia A. Minervina,
Dmitriy M. Chudakov,
Ilgar Z. Mamedov,
Yury B. Lebedev,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
Diverse repertoires of hypervariable immunoglobulin receptors (TCR and BCR) recognize antigens in the adaptive immune system. The development of immunoglobulin receptor repertoire sequencing methods makes it possible to perform repertoire-wide disease association studies of antigen receptor sequences. We developed a statistical framework for associating receptors to disease from only a small cohort of patients, with no need for a control cohort. Our method successfully identifies previously validated Cytomegalovirus and type 1 diabetes responsive receptors.
Submitted 27 September, 2017;
originally announced September 2017.
-
Disorder and the neural representation of complex odors: smelling in the real world
Authors:
Kamesh Krishnamurthy,
Ann M Hermundstad,
Thierry Mora,
Aleksandra M Walczak,
Vijay Balasubramanian
Abstract:
Animals smelling in the real world use a small number of receptors to sense a vast number of natural molecular mixtures, and proceed to learn arbitrary associations between odors and valences. Here, we propose a new interpretation of how the architecture of olfactory circuits is adapted to meet these immense complementary challenges. First, the diffuse binding of receptors to many molecules compresses a vast odor space into a tiny receptor space, while preserving similarity. Next, lateral interactions "densify" and decorrelate the response, enhancing robustness to noise. Finally, disordered projections from the periphery to the central brain reconfigure the densely packed information into a format suitable for flexible learning of associations and valences. We test our theory empirically using data from Drosophila. Our theory suggests that the neural processing of olfactory information differs from the other senses in its fundamental use of disorder.
Submitted 6 July, 2017;
originally announced July 2017.
-
IGoR: a tool for high-throughput immune repertoire analysis
Authors:
Quentin Marcou,
Thierry Mora,
Aleksandra M Walczak
Abstract:
High-throughput immune repertoire sequencing promises to lead to new statistical diagnostic tools for medicine and biology. Successful implementations of these methods require a correct characterization, analysis and interpretation of these datasets. We present IGoR -- a new comprehensive tool that takes B- or T-cell receptor sequence reads and quantitatively characterizes the statistics of receptor generation from both cDNA and gDNA. It probabilistically annotates sequences, and its modular structure can be used to investigate models of increasing biological complexity for different organisms. For B-cells IGoR returns the hypermutation statistics, which we use to reveal co-localization of hypermutations along the sequence. We demonstrate that IGoR outperforms existing tools in accuracy and estimate the sample sizes needed for reliable repertoire characterization.
Submitted 23 May, 2017;
originally announced May 2017.