Search | arXiv e-print repository

Torchtree: flexible phylogenetic model development and inference using PyTorch

Authors: Mathieu Fourment, Matthew Macaulay, Christiaan J Swanepoel, Xiang Ji, Marc A Suchard, Frederick A Matsen IV

Abstract: Bayesian inference has predominantly relied on the Markov chain Monte Carlo (MCMC) algorithm for many years. However, MCMC is computationally laborious, especially for complex phylogenetic models of time trees. This bottleneck has led to the search for alternatives, such as variational Bayes, which can scale better to large datasets. In this paper, we introduce torchtree, a framework written in Py… ▽ More Bayesian inference has predominantly relied on the Markov chain Monte Carlo (MCMC) algorithm for many years. However, MCMC is computationally laborious, especially for complex phylogenetic models of time trees. This bottleneck has led to the search for alternatives, such as variational Bayes, which can scale better to large datasets. In this paper, we introduce torchtree, a framework written in Python that allows developers to easily implement rich phylogenetic models and algorithms using a fixed tree topology. One can either use automatic differentiation, or leverage torchtree's plug-in system to compute gradients analytically for model components for which automatic differentiation is slow. We demonstrate that the torchtree variational inference framework performs similarly to BEAST in terms of speed and approximation accuracy. Furthermore, we explore the use of the forward KL divergence as an optimizing criterion for variational inference, which can handle discontinuous and non-differentiable models. Our experiments show that inference using the forward KL divergence tends to be faster per iteration compared to the evidence lower bound (ELBO) criterion, although the ELBO-based inference may converge faster in some cases. Overall, torchtree provides a flexible and efficient framework for phylogenetic model development and inference using PyTorch. △ Less

Submitted 25 June, 2024; originally announced June 2024.

Comments: 23 pages, 3 tables, and 4 figures in main text, plus supplementary materials

arXiv:2308.08708 [pdf, other]

Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

Authors: Patrick Butlin, Robert Long, Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, Stephen M. Fleming, Chris Frith, Xu Ji, Ryota Kanai, Colin Klein, Grace Lindsay, Matthias Michel, Liad Mudrik, Megan A. K. Peters, Eric Schwitzgebel, Jonathan Simon, Rufin VanRullen

Abstract: Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argues for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of con… ▽ More Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argues for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory. From these theories we derive "indicator properties" of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties. We use these indicator properties to assess several recent AI systems, and we discuss how future systems might implement them. Our analysis suggests that no current AI systems are conscious, but also suggests that there are no obvious technical barriers to building AI systems which satisfy these indicators. △ Less

Submitted 22 August, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

arXiv:2307.05519 [pdf]

Physical Color Calibration of Digital Pathology Scanners for Robust Artificial Intelligence Assisted Cancer Diagnosis

Authors: Xiaoyi Ji, Richard Salmon, Nita Mulliqi, Umair Khan, Yinxi Wang, Anders Blilie, Henrik Olsson, Bodil Ginnerup Pedersen, Karina Dalsgaard Sørensen, Benedicte Parm Ulhøi, Svein R Kjosavik, Emilius AM Janssen, Mattias Rantalainen, Lars Egevad, Pekka Ruusuvuori, Martin Eklund, Kimmo Kartasalo

Abstract: The potential of artificial intelligence (AI) in digital pathology is limited by technical inconsistencies in the production of whole slide images (WSIs), leading to degraded AI performance and posing a challenge for widespread clinical application as fine-tuning algorithms for each new site is impractical. Changes in the imaging workflow can also lead to compromised diagnoses and patient safety r… ▽ More The potential of artificial intelligence (AI) in digital pathology is limited by technical inconsistencies in the production of whole slide images (WSIs), leading to degraded AI performance and posing a challenge for widespread clinical application as fine-tuning algorithms for each new site is impractical. Changes in the imaging workflow can also lead to compromised diagnoses and patient safety risks. We evaluated whether physical color calibration of scanners can standardize WSI appearance and enable robust AI performance. We employed a color calibration slide in four different laboratories and evaluated its impact on the performance of an AI system for prostate cancer diagnosis on 1,161 WSIs. Color standardization resulted in consistently improved AI model calibration and significant improvements in Gleason grading performance. The study demonstrates that physical color calibration provides a potential solution to the variation introduced by different scanners, making AI-based cancer diagnostics more reliable and applicable in clinical settings. △ Less

Submitted 7 July, 2023; originally announced July 2023.

arXiv:2304.12239 [pdf, other]

Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction

Authors: Zhifeng Gao, Xiaohong Ji, Guojiang Zhao, Hongshuai Wang, Hang Zheng, Guolin Ke, Linfeng Zhang

Abstract: Recently deep learning based quantitative structure-activity relationship (QSAR) models has shown surpassing performance than traditional methods for property prediction tasks in drug discovery. However, most DL based QSAR models are restricted to limited labeled data to achieve better performance, and also are sensitive to model scale and hyper-parameters. In this paper, we propose Uni-QSAR, a po… ▽ More Recently deep learning based quantitative structure-activity relationship (QSAR) models has shown surpassing performance than traditional methods for property prediction tasks in drug discovery. However, most DL based QSAR models are restricted to limited labeled data to achieve better performance, and also are sensitive to model scale and hyper-parameters. In this paper, we propose Uni-QSAR, a powerful Auto-ML tool for molecule property prediction tasks. Uni-QSAR combines molecular representation learning (MRL) of 1D sequential tokens, 2D topology graphs, and 3D conformers with pretraining models to leverage rich representation from large-scale unlabeled data. Without any manual fine-tuning or model selection, Uni-QSAR outperforms SOTA in 21/22 tasks of the Therapeutic Data Commons (TDC) benchmark under designed parallel workflow, with an average performance improvement of 6.09\%. Furthermore, we demonstrate the practical usefulness of Uni-QSAR in drug discovery domains. △ Less

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2303.13642 [pdf, other]

Random-effects substitution models for phylogenetics via scalable gradient approximations

Authors: Andrew F. Magee, Andrew J. Holbrook, Jonathan E. Pekar, Itzue W. Caviedes-Solis, Fredrick A. Matsen IV, Guy Baele, Joel O. Wertheim, Xiang Ji, Philippe Lemey, Marc A. Suchard

Abstract: Phylogenetic and discrete-trait evolutionary inference depend heavily on an appropriate characterization of the underlying character substitution process. In this paper, we present random-effects substitution models that extend common continuous-time Markov chain models into a richer class of processes capable of capturing a wider variety of substitution dynamics. As these random-effects substitut… ▽ More Phylogenetic and discrete-trait evolutionary inference depend heavily on an appropriate characterization of the underlying character substitution process. In this paper, we present random-effects substitution models that extend common continuous-time Markov chain models into a richer class of processes capable of capturing a wider variety of substitution dynamics. As these random-effects substitution models often require many more parameters than their usual counterparts, inference can be both statistically and computationally challenging. Thus, we also propose an efficient approach to compute an approximation to the gradient of the data likelihood with respect to all unknown substitution model parameters. We demonstrate that this approximate gradient enables scaling of sampling-based inference, namely Bayesian inference via Hamiltonian Monte Carlo, under random-effects substitution models across large trees and state-spaces. Applied to a dataset of 583 SARS-CoV-2 sequences, an HKY model with random-effects shows strong signals of nonreversibility in the substitution process, and posterior predictive model checks clearly show that it is a more adequate model than a reversible model. When analyzing the pattern of phylogeographic spread of 1441 influenza A virus (H3N2) sequences between 14 regions, a random-effects phylogeographic substitution model infers that air travel volume adequately predicts almost all dispersal rates. A random-effects state-dependent substitution model reveals no evidence for an effect of arboreality on the swimming mode in the tree frog subfamily Hylinae. Simulations reveal that random-effects substitution models can accommodate both negligible and radical departures from the underlying base substitution model. We show that our gradient-based inference approach is over an order of magnitude more time efficient than conventional approaches. △ Less

Submitted 25 September, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

arXiv:2303.04390 [pdf, other]

Many-core algorithms for high-dimensional gradients on phylogenetic trees

Authors: Karthik Gangavarapu, Xiang Ji, Guy Baele, Mathieu Fourment, Philippe Lemey, Frederick A. Matsen IV, Marc A. Suchard

Abstract: The rapid growth in genomic pathogen data spurs the need for efficient inference techniques, such as Hamiltonian Monte Carlo (HMC) in a Bayesian framework, to estimate parameters of these phylogenetic models where the dimensions of the parameters increase with the number of sequences $N$. HMC requires repeated calculation of the gradient of the data log-likelihood with respect to (wrt) all branch-… ▽ More The rapid growth in genomic pathogen data spurs the need for efficient inference techniques, such as Hamiltonian Monte Carlo (HMC) in a Bayesian framework, to estimate parameters of these phylogenetic models where the dimensions of the parameters increase with the number of sequences $N$. HMC requires repeated calculation of the gradient of the data log-likelihood with respect to (wrt) all branch-length-specific (BLS) parameters that traditionally takes $\mathcal{O}(N^2)$ operations using the standard pruning algorithm. A recent study proposes an approach to calculate this gradient in $\mathcal{O}(N)$, enabling researchers to take advantage of gradient-based samplers such as HMC. The CPU implementation of this approach makes the calculation of the gradient computationally tractable for nucleotide-based models but falls short in performance for larger state-space size models, such as codon models. Here, we describe novel massively parallel algorithms to calculate the gradient of the log-likelihood wrt all BLS parameters that take advantage of graphics processing units (GPUs) and result in many fold higher speedups over previous CPU implementations. We benchmark these GPU algorithms on three computing systems using three evolutionary inference examples: carnivores, dengue and yeast, and observe a greater than 128-fold speedup over the CPU implementation for codon-based models and greater than 8-fold speedup for nucleotide-based models. As a practical demonstration, we also estimate the timing of the first introduction of West Nile virus into the continental Unites States under a codon model with a relaxed molecular clock from 104 full viral genomes, an inference task previously intractable. We provide an implementation of our GPU algorithms in BEAGLE v4.0.0, an open source library for statistical phylogenetics that enables parallel calculations on multi-core CPUs and GPUs. △ Less

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2302.06403 [pdf, other]

Sources of Richness and Ineffability for Phenomenally Conscious States

Authors: Xu Ji, Eric Elmoznino, George Deane, Axel Constant, Guillaume Dumas, Guillaume Lajoie, Jonathan Simon, Yoshua Bengio

Abstract: Conscious states (states that there is something it is like to be in) seem both rich or full of detail, and ineffable or hard to fully describe or recall. The problem of ineffability, in particular, is a longstanding issue in philosophy that partly motivates the explanatory gap: the belief that consciousness cannot be reduced to underlying physical processes. Here, we provide an information theore… ▽ More Conscious states (states that there is something it is like to be in) seem both rich or full of detail, and ineffable or hard to fully describe or recall. The problem of ineffability, in particular, is a longstanding issue in philosophy that partly motivates the explanatory gap: the belief that consciousness cannot be reduced to underlying physical processes. Here, we provide an information theoretic dynamical systems perspective on the richness and ineffability of consciousness. In our framework, the richness of conscious experience corresponds to the amount of information in a conscious state and ineffability corresponds to the amount of information lost at different stages of processing. We describe how attractor dynamics in working memory would induce impoverished recollections of our original experiences, how the discrete symbolic nature of language is insufficient for describing the rich and high-dimensional structure of experiences, and how similarity in the cognitive function of two individuals relates to improved communicability of their experiences to each other. While our model may not settle all questions relating to the explanatory gap, it makes progress toward a fully physicalist explanation of the richness and ineffability of conscious experience: two important aspects that seem to be part of what makes qualitative character so puzzling. △ Less

Submitted 20 June, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

arXiv:2211.05220 [pdf, other]

TreeFlow: probabilistic programming and automatic differentiation for phylogenetics

Authors: Christiaan Swanepoel, Mathieu Fourment, Xiang Ji, Hassan Nasif, Marc A Suchard, Frederick A Matsen IV, Alexei Drummond

Abstract: Probabilistic programming frameworks are powerful tools for statistical modelling and inference. They are not immediately generalisable to phylogenetic problems due to the particular computational properties of the phylogenetic tree object. TreeFlow is a software library for probabilistic programming and automatic differentiation with phylogenetic trees. It implements inference algorithms for phyl… ▽ More Probabilistic programming frameworks are powerful tools for statistical modelling and inference. They are not immediately generalisable to phylogenetic problems due to the particular computational properties of the phylogenetic tree object. TreeFlow is a software library for probabilistic programming and automatic differentiation with phylogenetic trees. It implements inference algorithms for phylogenetic tree times and model parameters given a tree topology. We demonstrate how TreeFlow can be used to quickly implement and assess new models. We also show that it provides reasonable performance for gradient-based inference algorithms compared to specialized computational libraries for phylogenetics. △ Less

Submitted 9 November, 2022; originally announced November 2022.

Comments: 34 pages, 8 figures

arXiv:2211.02168 [pdf, other]

doi 10.1093/gbe/evad099

Automatic differentiation is no panacea for phylogenetic gradient computation

Authors: Mathieu Fourment, Christiaan J. Swanepoel, Jared G. Galloway, Xiang Ji, Karthik Gangavarapu, Marc A. Suchard, Frederick A. Matsen IV

Abstract: Gradients of probabilistic model likelihoods with respect to their parameters are essential for modern computational statistics and machine learning. These calculations are readily available for arbitrary models via automatic differentiation implemented in general-purpose machine-learning libraries such as TensorFlow and PyTorch. Although these libraries are highly optimized, it is not clear if th… ▽ More Gradients of probabilistic model likelihoods with respect to their parameters are essential for modern computational statistics and machine learning. These calculations are readily available for arbitrary models via automatic differentiation implemented in general-purpose machine-learning libraries such as TensorFlow and PyTorch. Although these libraries are highly optimized, it is not clear if their general-purpose nature will limit their algorithmic complexity or implementation speed for the phylogenetic case compared to phylogenetics-specific code. In this paper, we compare six gradient implementations of the phylogenetic likelihood functions, in isolation and also as part of a variational inference procedure. We find that although automatic differentiation can scale approximately linearly in tree size, it is much slower than the carefully-implemented gradient calculation for tree likelihood and ratio transformation operations. We conclude that a mixed approach combining phylogenetic libraries with machine learning libraries will provide the optimal combination of speed and model flexibility moving forward. △ Less

Submitted 4 June, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

Comments: 17 pages and 2 figures in main text, plus supplementary materials

arXiv:2209.13518 [pdf]

Graph-Based Active Machine Learning Method for Diverse and Novel Antimicrobial Peptides Generation and Selection

Authors: Bonaventure F. P. Dossou, Dianbo Liu, Xu Ji, Moksh Jain, Almer M. van der Sloot, Roger Palou, Michael Tyers, Yoshua Bengio

Abstract: As antibiotic-resistant bacterial strains are rapidly spreading worldwide, infections caused by these strains are emerging as a global crisis causing the death of millions of people every year. Antimicrobial Peptides (AMPs) are one of the candidates to tackle this problem because of their potential diversity, and ability to favorably modulate the host immune response. However, large-scale screenin… ▽ More As antibiotic-resistant bacterial strains are rapidly spreading worldwide, infections caused by these strains are emerging as a global crisis causing the death of millions of people every year. Antimicrobial Peptides (AMPs) are one of the candidates to tackle this problem because of their potential diversity, and ability to favorably modulate the host immune response. However, large-scale screening of new AMP candidates is expensive, time-consuming, and now affordable in developing countries, which need the treatments the most. In this work, we propose a novel active machine learning-based framework that statistically minimizes the number of wet-lab experiments needed to design new AMPs, while ensuring a high diversity and novelty of generated AMPs sequences, in multi-rounds of wet-lab AMP screening settings. Combining recurrent neural network models and a graph-based filter (GraphCC), our proposed approach delivers novel and diverse candidates and demonstrates better performances according to our defined metrics. △ Less

Submitted 18 September, 2022; originally announced September 2022.

Comments: Under Review at Sciences Advances

arXiv:2201.07291 [pdf, other]

Accelerating Bayesian inference of dependency between complex biological traits

Authors: Zhenyu Zhang, Akihiko Nishimura, Nídia S. Trovão, Joshua L. Cherry, Andrew J. Holbrook, Xiang Ji, Philippe Lemey, Marc A. Suchard

Abstract: Inferring dependencies between complex biological traits while accounting for evolutionary relationships between specimens is of great scientific interest yet remains infeasible when trait and specimen counts grow large. The state-of-the-art approach uses a phylogenetic multivariate probit model to accommodate binary and continuous traits via a latent variable framework, and utilizes an efficient… ▽ More Inferring dependencies between complex biological traits while accounting for evolutionary relationships between specimens is of great scientific interest yet remains infeasible when trait and specimen counts grow large. The state-of-the-art approach uses a phylogenetic multivariate probit model to accommodate binary and continuous traits via a latent variable framework, and utilizes an efficient bouncy particle sampler (BPS) to tackle the computational bottleneck -- integrating many latent variables from a high-dimensional truncated normal distribution. This approach breaks down as the number of specimens grows and fails to reliably characterize conditional dependencies between traits. Here, we propose an inference pipeline for phylogenetic probit models that greatly outperforms BPS. The novelty lies in 1) a combination of the recent Zigzag Hamiltonian Monte Carlo (Zigzag-HMC) with linear-time gradient evaluations and 2) a joint sampling scheme for highly correlated latent variables and correlation matrix elements. In an application exploring HIV-1 evolution from 535 viruses, the inference requires joint sampling from an 11,235-dimensional truncated normal and a 24-dimensional covariance matrix. Our method yields a 5-fold speedup compared to BPS and makes it possible to learn partial correlations between candidate viral mutations and virulence. Computational speedup now enables us to tackle even larger problems: we study the evolution of influenza H1N1 glycosylations on around 900 viruses. For broader applicability, we extend the phylogenetic probit model to incorporate categorical traits, and demonstrate its use to study Aquilegia flower and pollinator co-evolution. △ Less

Submitted 7 September, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

Comments: 39 pages, 5 figures, 3 tables

arXiv:2111.02044 [pdf, other]

Categorical Difference and Related Brain Regions of the Attentional Blink Effect

Authors: Renzhou Gui, Xiaohong Ji

Abstract: Attentional blink (AB) is a biological effect, showing that for 200 to 500ms after paying attention to one visual target, it is difficult to notice another target that appears next, and attentional blink magnitude (ABM) is a indicating parameter to measure the degree of this effect. Researchers have shown that different categories of images can access the consciousness of human mind differently, a… ▽ More Attentional blink (AB) is a biological effect, showing that for 200 to 500ms after paying attention to one visual target, it is difficult to notice another target that appears next, and attentional blink magnitude (ABM) is a indicating parameter to measure the degree of this effect. Researchers have shown that different categories of images can access the consciousness of human mind differently, and produce different ranges of ABM values. So in this paper, we compare two different types of images, categorized as animal and object, by predicting ABM values directly from image features extracted from convolutional neural network (CNN), and indirectly from functional magnetic resonance imaging (fMRI) data. First, for two sets of images, we separately extract their average features from layers of Alexnet, a classic model of CNN, then input the features into a trained linear regression model to predict ABM values, and we find higher-level instead of lower-level image features determine the categorical difference in AB effect, and mid-level image features predict ABM values more correctly than low-level and high-level image features. Then we employ fMRI data from different brain regions collected when the subjects viewed 50 test images to predict ABM values, and conclude that brain regions covering relatively broader areas, like LVC, HVC and VC, perform better than other smaller brain regions, which means AB effect is more related to synthetic impact of several visual brain regions than only one particular visual regions. △ Less

Submitted 3 November, 2021; originally announced November 2021.

Comments: Accepted in PhotonIcs and Electromagnetics Research Symposium (PIERS) 2021

arXiv:2110.13298 [pdf, other]

Scalable Bayesian divergence time estimation with ratio transformations

Authors: Xiang Ji, Alexander A. Fisher, Shuo Su, Jeffrey L. Thorne, Barney Potter, Philippe Lemey, Guy Baele, Marc A. Suchard

Abstract: Divergence time estimation is crucial to provide temporal signals for dating biologically important events, from species divergence to viral transmissions in space and time. With the advent of high-throughput sequencing, recent Bayesian phylogenetic studies have analyzed hundreds to thousands of sequences. Such large-scale analyses challenge divergence time reconstruction by requiring inference on… ▽ More Divergence time estimation is crucial to provide temporal signals for dating biologically important events, from species divergence to viral transmissions in space and time. With the advent of high-throughput sequencing, recent Bayesian phylogenetic studies have analyzed hundreds to thousands of sequences. Such large-scale analyses challenge divergence time reconstruction by requiring inference on highly-correlated internal node heights that often become computationally infeasible. To overcome this limitation, we explore a ratio transformation that maps the original N - 1 internal node heights into a space of one height parameter and N - 2 ratio parameters. To make analyses scalable, we develop a collection of linear-time algorithms to compute the gradient and Jacobian-associated terms of the log-likelihood with respect to these ratios. We then apply Hamiltonian Monte Carlo sampling with the ratio transform in a Bayesian framework to learn the divergence times in four pathogenic virus phylogenies: West Nile virus, rabies virus, Lassa virus and Ebola virus. Our method both resolves a mixing issue in the West Nile virus example and improves inference efficiency by at least 5-fold for the Lassa and rabies virus examples. Our method also makes it now computationally feasible to incorporate mixed-effects molecular clock models for the Ebola virus example, confirms the findings from the original study and reveals clearer multimodal distributions of the divergence times of some clades of interest. △ Less

Submitted 25 October, 2021; originally announced October 2021.

Comments: 34 pages, 6 figures

arXiv:2105.07119 [pdf, other]

Shrinkage-based random local clocks with scalable inference

Authors: Alexander A. Fisher, Xiang Ji, Akihiko Nishimura, Philippe Lemey, Marc A. Suchard

Abstract: Local clock models propose that the rate of molecular evolution is constant within phylogenetic sub-trees. Current local clock inference procedures scale poorly to large taxa problems, impose model misspecification, or require a priori knowledge of the existence and location of clocks. To overcome these challenges, we present an autocorrelated, Bayesian model of heritable clock rate evolution that… ▽ More Local clock models propose that the rate of molecular evolution is constant within phylogenetic sub-trees. Current local clock inference procedures scale poorly to large taxa problems, impose model misspecification, or require a priori knowledge of the existence and location of clocks. To overcome these challenges, we present an autocorrelated, Bayesian model of heritable clock rate evolution that leverages heavy-tailed priors with mean zero to shrink increments of change between branch-specific clocks. We further develop an efficient Hamiltonian Monte Carlo sampler that exploits closed form gradient computations to scale our model to large trees. Inference under our shrinkage-clock exhibits an over 3-fold speed increase compared to the popular random local clock when estimating branch-specific clock rates on a simulated dataset. We further show our shrinkage-clock recovers known local clocks within a rodent and mammalian phylogeny. Finally, in a problem that once appeared computationally impractical, we investigate the heritable clock structure of various surface glycoproteins of influenza A virus in the absence of prior knowledge about clock placement. △ Less

Submitted 14 May, 2021; originally announced May 2021.

Comments: 24 pages, 6 figures

arXiv:2103.03348 [pdf, other]

From viral evolution to spatial contagion: a biologically modulated Hawkes model

Authors: Andrew J. Holbrook, Xiang Ji, Marc A. Suchard

Abstract: Mutations sometimes increase contagiousness for evolving pathogens. During an epidemic, scientists use viral genome data to infer a shared evolutionary history and connect this history to geographic spread. We propose a model that directly relates a pathogen's evolution to its spatial contagion dynamics -- effectively combining the two epidemiological paradigms of phylogenetic inference and self-e… ▽ More Mutations sometimes increase contagiousness for evolving pathogens. During an epidemic, scientists use viral genome data to infer a shared evolutionary history and connect this history to geographic spread. We propose a model that directly relates a pathogen's evolution to its spatial contagion dynamics -- effectively combining the two epidemiological paradigms of phylogenetic inference and self-exciting process modeling -- and apply this \emph{phylogenetic Hawkes process} to a Bayesian analysis of 23,422 viral cases from the 2014-2016 Ebola outbreak in West Africa. The proposed model is able to detect individual viruses with significantly elevated rates of spatiotemporal propagation for a subset of 1,610 samples that provide genome data. Finally, to facilitate model application in big data settings, we develop massively parallel implementations for the gradient and Hessian of the log-likelihood and apply our high performance computing framework within an adaptively preconditioned Hamiltonian Monte Carlo routine. △ Less

Submitted 10 September, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

arXiv:2006.12323 [pdf, other]

Automatic Recall Machines: Internal Replay, Continual Learning and the Brain

Authors: Xu Ji, Joao Henriques, Tinne Tuytelaars, Andrea Vedaldi

Abstract: Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity. We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective, without extraneous buffers or generator networks. Instead the implicit memory of learned… ▽ More Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity. We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective, without extraneous buffers or generator networks. Instead the implicit memory of learned samples within the assessed model itself is exploited. Furthermore, whereas existing work focuses on reinforcing the full seen data distribution, we show that optimizing for not forgetting calls for the generation of samples that are specialized to each real training batch, which is more efficient and scalable. We consider high-level parallels with the brain, notably the use of a single model for inference and recall, the dependency of recalled samples on the current environment batch, top-down modulation of activations and learning, abstract recall, and the dependency between the degree to which a task is learned and the degree to which it is recalled. These characteristics emerge naturally from the method without being controlled for. △ Less

Submitted 12 December, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: NeurIPS 2020 Workshop on BabyMind

arXiv:1912.09185 [pdf, other]

Large-scale inference of correlation among mixed-type biological traits with phylogenetic multivariate probit models

Authors: Zhenyu Zhang, Akihiko Nishimura, Paul Bastide, Xiang Ji, Rebecca P. Payne, Philip Goulder, Philippe Lemey, Marc A. Suchard

Abstract: Inferring concerted changes among biological traits along an evolutionary history remains an important yet challenging problem. Besides adjusting for spurious correlation induced from the shared history, the task also requires sufficient flexibility and computational efficiency to incorporate multiple continuous and discrete traits as data size increases. To accomplish this, we jointly model mixed… ▽ More Inferring concerted changes among biological traits along an evolutionary history remains an important yet challenging problem. Besides adjusting for spurious correlation induced from the shared history, the task also requires sufficient flexibility and computational efficiency to incorporate multiple continuous and discrete traits as data size increases. To accomplish this, we jointly model mixed-type traits by assuming latent parameters for binary outcome dimensions at the tips of an unknown tree informed by molecular sequences. This gives rise to a phylogenetic multivariate probit model. With large sample sizes, posterior computation under this model is problematic, as it requires repeated sampling from a high-dimensional truncated normal distribution. Current best practices employ multiple-try rejection sampling that suffers from slow-mixing and a computational cost that scales quadratically in sample size. We develop a new inference approach that exploits 1) the bouncy particle sampler (BPS) based on piecewise deterministic Markov processes to simultaneously sample all truncated normal dimensions, and 2) novel dynamic programming that reduces the cost of likelihood and gradient evaluations for BPS to linear in sample size. In an application with 535 HIV viruses and 24 traits that necessitates sampling from a 12,840-dimensional truncated normal, our method makes it possible to estimate the across-trait correlation and detect factors that affect the pathogen's capacity to cause disease. This inference framework is also applicable to a broader class of covariance structures beyond comparative biology. △ Less

Submitted 23 September, 2020; v1 submitted 19 December, 2019; originally announced December 2019.

Comments: 24 pages, 6 figures, 2 tables. Version accepted by Annals of Applied Statistics

arXiv:1908.08608 [pdf, other]

A phylogenetic approach disentangles interlocus gene conversion tract length and initiation rate

Authors: Xiang Ji, Jeffrey L. Thorne

Abstract: Interlocus gene conversion (IGC) homogenizes paralogs. Little is known regarding the mutation events that cause IGC and even less is known about the IGC mutations that experience fixation. To disentangle the rates of fixed IGC mutations from the tract lengths of these fixed mutations, we employ a composite likelihood procedure. We characterize the procedure with simulations. We apply the procedure… ▽ More Interlocus gene conversion (IGC) homogenizes paralogs. Little is known regarding the mutation events that cause IGC and even less is known about the IGC mutations that experience fixation. To disentangle the rates of fixed IGC mutations from the tract lengths of these fixed mutations, we employ a composite likelihood procedure. We characterize the procedure with simulations. We apply the procedure to duplicated primate introns and to protein-coding paralogs from both yeast and primates. Our estimates from protein-coding data concerning the mean length of fixed IGC tracts were unexpectedly low and are associated with high degrees of uncertainty. In contrast, our estimates from the primate intron data had lengths in the general range expected from IGC mutation studies. While it is challenging to separate the rate at which fixed IGC mutations initiate from the average number of nucleotide positions that these IGC events affect, all of our analyses indicate that IGC is responsible for a substantial proportion of evolutionary change in duplicated regions. Our results suggest that IGC should be considered whenever the evolution of multigene families is examined. △ Less

Submitted 22 August, 2019; originally announced August 2019.

Comments: 5 figures, 2 tables

arXiv:1906.04834 [pdf, other]

Relaxed random walks at scale

Authors: Alexander A. Fisher, Xiang Ji, Philippe Lemey, Marc A. Suchard

Abstract: Relaxed random walk (RRW) models of trait evolution introduce branch-specific rate multipliers to modulate the variance of a standard Brownian diffusion process along a phylogeny and more accurately model overdispersed biological data. Increased taxonomic sampling challenges inference under RRWs as the number of unknown parameters grows with the number of taxa. To solve this problem, we present a… ▽ More Relaxed random walk (RRW) models of trait evolution introduce branch-specific rate multipliers to modulate the variance of a standard Brownian diffusion process along a phylogeny and more accurately model overdispersed biological data. Increased taxonomic sampling challenges inference under RRWs as the number of unknown parameters grows with the number of taxa. To solve this problem, we present a scalable method to efficiently fit RRWs and infer this branch-specific variation in a Bayesian framework. We develop a Hamiltonian Monte Carlo (HMC) sampler to approximate the high-dimensional, correlated posterior that exploits a closed-form evaluation of the gradient of the trait data log-likelihood with respect to all branch-rate multipliers simultaneously. Our gradient calculation achieves computational complexity that scales only linearly with the number of taxa under study. We compare the efficiency of our HMC sampler to the previously standard univariable Metropolis-Hastings approach while studying the spatial emergence of the West Nile virus in North America in the early 2000s. Our method achieves an over 300-fold speed-increase over the univariable approach. Additionally, we demonstrate the scalability of our method by applying the RRW to study the correlation between five mammalian life history traits in a phylogenetic tree with 3650 tips. △ Less

Submitted 14 November, 2019; v1 submitted 11 June, 2019; originally announced June 2019.

Comments: 18 pages, 4 figures

arXiv:1905.12146 [pdf, other]

Gradients do grow on trees: a linear-time ${\cal O}\hspace{-0.2em}\left( N \right)$-dimensional gradient for statistical phylogenetics

Authors: Xiang Ji, Zhenyu Zhang, Andrew Holbrook, Akihiko Nishimura, Guy Baele, Andrew Rambaut, Philippe Lemey, Marc A. Suchard

Abstract: Calculation of the log-likelihood stands as the computational bottleneck for many statistical phylogenetic algorithms. Even worse is its gradient evaluation, often used to target regions of high probability. Order ${\cal O}\hspace{-0.2em}\left( N \right)$-dimensional gradient calculations based on the standard pruning algorithm require ${\cal O}\hspace{-0.2em}\left( N^2 \right)$ operations where N… ▽ More Calculation of the log-likelihood stands as the computational bottleneck for many statistical phylogenetic algorithms. Even worse is its gradient evaluation, often used to target regions of high probability. Order ${\cal O}\hspace{-0.2em}\left( N \right)$-dimensional gradient calculations based on the standard pruning algorithm require ${\cal O}\hspace{-0.2em}\left( N^2 \right)$ operations where N is the number of sampled molecular sequences. With the advent of high-throughput sequencing, recent phylogenetic studies have analyzed hundreds to thousands of sequences, with an apparent trend towards even larger data sets as a result of advancing technology. Such large-scale analyses challenge phylogenetic reconstruction by requiring inference on larger sets of process parameters to model the increasing data heterogeneity. To make this tractable, we present a linear-time algorithm for ${\cal O}\hspace{-0.2em}\left( N \right)$-dimensional gradient evaluation and apply it to general continuous-time Markov processes of sequence substitution on a phylogenetic tree without a need to assume either stationarity or reversibility. We apply this approach to learn the branch-specific evolutionary rates of three pathogenic viruses: West Nile virus, Dengue virus and Lassa virus. Our proposed algorithm significantly improves inference efficiency with a 126- to 234-fold increase in maximum-likelihood optimization and a 16- to 33-fold computational performance increase in a Bayesian framework. △ Less

Submitted 28 May, 2019; originally announced May 2019.

arXiv:1112.3496 [pdf]

Increased Coupling in the Saliency Network is the main cause/effect of Attention Deficit Hyperactivity Disorder

Authors: Xiaoxi Ji, Wei Cheng, Jie Zhang, Tian Ge, Li Sun, Yufeng Wang, Jianfeng Feng

Abstract: To uncover the underlying mechanisms of mental disorders such as attention deficit hyperactivity disorder (ADHD) for improving both early diagnosis and therapy, it is increasingly recognized that we need a better understanding of how the brain's functional connections are altered. A new brain wide association study (BWAS) has been developed and used to investigate functional connectivity changes i… ▽ More To uncover the underlying mechanisms of mental disorders such as attention deficit hyperactivity disorder (ADHD) for improving both early diagnosis and therapy, it is increasingly recognized that we need a better understanding of how the brain's functional connections are altered. A new brain wide association study (BWAS) has been developed and used to investigate functional connectivity changes in the brains of patients suffering from ADHD using resting state fMRI data. To reliably find out the most significantly altered functional connectivity links and associate them with ADHD, a meta-analysis on a cohort of ever reported largest population comprising 249 patients and 253 healthy controls is carried out. The greatest change in ADHD patients was the increased coupling of the saliency network involving the anterior cingulate gyrus and anterior insula. A voxel-based morphometry analysis was also carried out but this revealed no evidence in the ADHD patients for altered grey matter volumes in the regions showing altered functional connectivity. This is the first evidence for the involvement of the saliency network in ADHD and it suggests that this may reflect increased sensitivity over the integration of the incoming sensory information and his/her own thoughts and the network as a switch is bias towards to the central executive network. △ Less

Submitted 15 December, 2011; originally announced December 2011.

arXiv:1108.0538 [pdf, other]

doi 10.1371/journal.pone.0029049

Mechanochemical modeling of dynamic microtubule growth involving sheet-to-tube transition

Authors: Xiang-Ying Ji, Xi-Qiao Feng

Abstract: Microtubule dynamics is largely influenced by nucleotide hydrolysis and the resultant tubulin configuration changes. The GTP cap model has been proposed to interpret the stabilizing mechanism of microtubule growth from the view of hydrolysis effects. Besides, the microtubule growth involves the closure of a curved sheet at its growing end. The curvature conversion also helps to stabilize the succe… ▽ More Microtubule dynamics is largely influenced by nucleotide hydrolysis and the resultant tubulin configuration changes. The GTP cap model has been proposed to interpret the stabilizing mechanism of microtubule growth from the view of hydrolysis effects. Besides, the microtubule growth involves the closure of a curved sheet at its growing end. The curvature conversion also helps to stabilize the successive growth, and the curved sheet is referred to as the conformational cap. However, there still lacks theoretical investigation on the mechanical-chemical coupling growth process of microtubules. In this paper, we study the growth mechanisms of microtubules by using a coarse-grained molecular method. Firstly, the closure process involving a sheet-to-tube transition is simulated. The results verify the stabilizing effect of the sheet structure, and the minimum conformational cap length that can stabilize the growth is demonstrated to be two dimers. Then, we show that the conformational cap can function independently of the GTP cap, signifying the pivotal role of mechanical factors. Furthermore, based on our theoretical results, we describe a Tetris-like growth style of microtubules: the stochastic tubulin assembly is regulated by energy and harmonized with the seam zipping such that the sheet keeps a practically constant length during growth. △ Less

Submitted 2 August, 2011; originally announced August 2011.

Comments: 23 pages, 7 figures. 2 supporting movies have not been uploaded due to the file type restrictions

Showing 1–22 of 22 results for author: Ji, X