-
Gene Regulatory Network Inference with Covariance Dynamics
Authors:
Yue Wang,
Peng Zheng,
Yu-Chen Cheng,
Zikun Wang,
Aleksandr Aravkin
Abstract:
Determining gene regulatory network (GRN) structure is a central problem in biology, with a variety of inference methods available for different types of data. For a widely prevalent and challenging use case, namely single-cell gene expression data measured after intervention at multiple time points with unknown joint distributions, there is only one known specifically developed method, which does not fully utilize the rich information contained in this data type. We develop an inference method for the GRN in this case, netWork infErence by covariaNce DYnamics, dubbed WENDY. The core idea of WENDY is to model the dynamics of the covariance matrix and to solve these dynamics as an optimization problem to determine the regulatory relationships. To evaluate its effectiveness, we compare WENDY with other inference methods using synthetic data and experimental data. Our results demonstrate that WENDY performs well across different data sets.
Submitted 17 June, 2024;
originally announced July 2024.
-
Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation
Authors:
Can Xu,
Haosen Wang,
Weigang Wang,
Pengfei Zheng,
Hongyang Chen
Abstract:
Denoising diffusion models have shown great potential in multiple research areas. Existing diffusion-based generative methods for de novo 3D molecule generation face two major challenges. Since the majority of heavy atoms in molecules can connect to multiple atoms through single bonds, using pair-wise distances alone to model molecular geometries is insufficient. Therefore, the first challenge involves proposing an effective neural network as the denoising kernel that is capable of capturing complex multi-body interatomic relationships and learning high-quality features. Due to the discrete nature of graphs, mainstream diffusion-based methods for molecules rely heavily on predefined rules and generate edges in an indirect manner. The second challenge involves adapting molecule generation to diffusion and accurately predicting the existence of bonds. In our research, we view the iterative updating of molecule conformations in the diffusion process as consistent with molecular dynamics and introduce a novel molecule generation method named Geometric-Facilitated Molecular Diffusion (GFMDiff). For the first challenge, we introduce a Dual-Track Transformer Network (DTN) to fully excavate global spatial relationships and learn high-quality representations, which contribute to accurate predictions of features and geometries. For the second challenge, we design a Geometric-Facilitated Loss (GFLoss), which intervenes in the formation of bonds during training, instead of directly embedding edges into the latent space. Comprehensive experiments on current benchmarks demonstrate the superiority of GFMDiff.
Submitted 22 April, 2024; v1 submitted 5 January, 2024;
originally announced January 2024.
-
COSINE: A Web Server for Clonal and Subclonal Structure Inference and Evolution in Cancer Genomics
Authors:
Xiguo Yuan,
Yuan Zhao,
Yang Guo,
Linmei Ge,
Wei Liu,
Shiyu Wen,
Qi Li,
Zhangbo Wan,
Peina Zheng,
Tao Guo,
Zhida Li,
Martin Peifer,
Yupeng Cun
Abstract:
Cancers evolve from the mutation of a single cell through sequential clonal and subclonal expansions as somatic mutations are acquired. Inferring clonal and subclonal structures from bulk or single-cell tumor genomic sequencing data has a huge impact on cancer evolution studies. Clonal state and mutational order can provide detailed insight into tumor origin and its future development. In the past decade, a variety of methods have been developed for subclonal reconstruction using bulk tumor sequencing data. As these methods have been developed in different programming languages and use different input data formats, their use and comparison can be problematic. Therefore, we established a web server for clonal and subclonal structure inference and evolution of cancer genomic data (COSINE), which includes 12 popular subclonal reconstruction methods. We decomposed each method into a detailed workflow of single processing steps, presented through a user-friendly interface. To the best of our knowledge, this is the first web server providing online subclonal inference, including the most popular subclonal reconstruction methods. COSINE is freely accessible at www.clab-cosine.net or http://bio.rj.run:48996/cun-web.
Submitted 28 March, 2021;
originally announced March 2021.
-
Computer Assisted Localization of a Heart Arrhythmia
Authors:
Chris Vogl,
Peng Zheng,
Stephen P. Seslar,
Aleksandr Y. Aravkin
Abstract:
We consider the problem of locating a point-source heart arrhythmia using data from a standard diagnostic procedure, where a reference catheter is placed in the heart, and arrival times from a second diagnostic catheter are recorded as the diagnostic catheter moves around within the heart. We model this situation as a nonconvex feasibility problem, where given a set of arrival times, we look for a source location that is consistent with the available data. We develop a new optimization approach and fast algorithm to obtain online proposals for the next location to suggest to the operator as she collects data. We validate the procedure using a Monte Carlo simulation based on patients' electrophysiological data. The proposed procedure robustly and quickly locates the source of arrhythmias without any prior knowledge of heart anatomy.
Submitted 9 July, 2018;
originally announced July 2018.
-
Learning Nonlinear Brain Dynamics: van der Pol Meets LSTM
Authors:
German Abrevaya,
Irina Rish,
Aleksandr Y. Aravkin,
Guillermo Cecchi,
James Kozloski,
Pablo Polosecki,
Peng Zheng,
Silvina Ponce Dawson,
Juliana Rhee,
David Cox
Abstract:
Many real-world data sets, especially in biology, are produced by complex nonlinear dynamical systems. In this paper, we focus on brain calcium imaging (CaI) of different organisms (zebrafish and rat), aiming to build a model of joint activation dynamics in large neuronal populations, including the whole brain of zebrafish. We propose a new approach for capturing the dynamics of temporal SVD components that uses the coupled (multivariate) van der Pol (VDP) oscillator, a nonlinear ordinary differential equation (ODE) model describing neural activity, with a new parameter estimation technique that combines variable projection optimization and stochastic search. We show that the approach successfully handles nonlinearities and hidden state variables in the coupled VDP. The approach is accurate, achieving 0.82 to 0.94 correlation between the actual and model-generated components, and interpretable, as VDP's coupling matrix reveals anatomically meaningful positive (excitatory) and negative (inhibitory) interactions across different brain subsystems corresponding to spatial SVD components. Moreover, VDP is comparable to (or sometimes better than) recurrent neural networks (LSTM) for short-term prediction of future brain activity; VDP needs fewer parameters to train, which was an advantage on our small training data. Finally, the overall best predictive method, greatly outperforming both VDP and LSTM in short- and long-term predictive settings on both datasets, was the new hybrid VDP-LSTM approach, which used VDP to simulate a large domain-specific dataset for LSTM pretraining; note that simple LSTM data augmentation via noisy versions of the training data was much less effective.
Submitted 20 July, 2019; v1 submitted 24 May, 2018;
originally announced May 2018.
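For readers unfamiliar with the coupled van der Pol model named in the abstract above, the following is a minimal sketch of one common form of it, integrated with a fixed-step fourth-order Runge-Kutta scheme. The specific equations, the linear coupling term `W @ x`, and the names `mu`, `W`, `coupled_vdp_rhs`, and `simulate` are assumptions made for illustration; the abstract does not give the paper's exact formulation, and the hidden-state handling and variable projection estimator it mentions are not reproduced here.

```python
import numpy as np

def coupled_vdp_rhs(state, mu, W):
    """Right-hand side of an assumed coupled van der Pol system.

    state = [x_1..x_n, y_1..y_n], where
        dx_i/dt = y_i
        dy_i/dt = mu_i * (1 - x_i^2) * y_i - x_i + sum_j W_ij * x_j
    The linear coupling through W is an illustrative choice.
    """
    n = len(mu)
    x, y = state[:n], state[n:]
    dx = y
    dy = mu * (1.0 - x**2) * y - x + W @ x
    return np.concatenate([dx, dy])

def simulate(state0, mu, W, dt, steps):
    """Integrate the system with classical RK4; returns (steps + 1, 2n) array."""
    state = np.asarray(state0, dtype=float)
    traj = np.empty((steps + 1, state.size))
    traj[0] = state
    for k in range(steps):
        k1 = coupled_vdp_rhs(state, mu, W)
        k2 = coupled_vdp_rhs(state + 0.5 * dt * k1, mu, W)
        k3 = coupled_vdp_rhs(state + 0.5 * dt * k2, mu, W)
        k4 = coupled_vdp_rhs(state + dt * k3, mu, W)
        state = state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj[k + 1] = state
    return traj

# Two uncoupled oscillators (W = 0) started from different amplitudes.
mu = np.array([1.0, 1.0])
W = np.zeros((2, 2))
traj = simulate([2.0, 0.5, 0.0, 0.0], mu, W, dt=0.01, steps=1000)
```

In this sketch, positive entries of `W` would act as excitatory influences and negative entries as inhibitory ones, which is the sense in which a coupling matrix can be read as an interaction map.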
-
Chaotic Neuronal Oscillations in Spontaneous Cortical-Subcortical Networks
Authors:
Pengsheng Zheng
Abstract:
Oscillatory activities are widely observed in specific frequency bands of recorded field potentials in different brain regions, and play critical roles in processing neural information. Understanding the structure of these oscillatory activities is essential for understanding brain function. So far, many details remain elusive about their rhythmic structures and how these oscillations are generated. We show that many oscillatory activities in spontaneous cortical-subcortical networks, such as delta, spindle, gamma, high-gamma and sharp wave ripple bands in different brain regions, are genuine chaotic time series that can be reconstructed as chaotic attractors through an appropriately selected embedding delay and dimension. The reconstructed attractors are approximated by a simple radial basis function, enabling high-precision short-term prediction. Simultaneously recorded oscillatory activities in multiple brain regions differ greatly in terms of temporal phase and amplitude but can be approximated by the same function. Our results suggest that neural oscillations are produced by deterministic chaotic systems. The occurrence of neural oscillation events is predetermined, and the brain possibly knows when and where information will be processed and transferred in the future as a result of the deterministic dynamics.
Submitted 21 July, 2015;
originally announced July 2015.
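The attractor reconstruction mentioned in the abstract above relies on time-delay embedding, which can be sketched in a few lines. This is a generic illustration of the standard technique, not the author's code; the function name `delay_embed` and the example values of `dim` and `delay` are hypothetical, and the abstract does not say how the delay and dimension were selected.

```python
import numpy as np

def delay_embed(x, dim, delay):
    """Build delay vectors [x(t), x(t + delay), ..., x(t + (dim - 1) * delay)].

    Returns an array of shape (n, dim), where n = len(x) - (dim - 1) * delay.
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * delay
    if n <= 0:
        raise ValueError("time series too short for this dim/delay")
    # Column i holds the series shifted forward by i * delay samples.
    return np.stack([x[i * delay : i * delay + n] for i in range(dim)], axis=1)

# Toy series 0..9 embedded in 3 dimensions with a delay of 2 samples.
embedded = delay_embed(np.arange(10), dim=3, delay=2)
```

Each row of `embedded` is one point on the reconstructed trajectory; a predictor such as a radial basis function model would then be fitted to map a row to the value that follows it.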
-
Theoretical modelling discriminates the stochastic and deterministic hypothesis of cell reprogramming
Authors:
Jiawei Yan,
Pu Zheng,
Xingjie Pan
Abstract:
How to induce differentiated cells into pluripotent cells has long interested researchers, since pluripotent stem cells offer remarkable potential in numerous subfields of biological research. However, the nature of cell reprogramming, especially its mechanisms, remains elusive, because most protocols for inducing pluripotent stem cells were discovered by screening rather than from knowledge of gene regulation networks. Generally, two hypotheses have been proposed to explain the mechanism, termed the elite model and the stochastic model, which regard the reprogramming process as a deterministic process or a stochastic process, respectively. However, these two models cannot yet be discriminated experimentally. Here we use a general mathematical model that can fit both hypotheses to elucidate the nature of cell reprogramming. We investigate this process from a novel perspective: timing. We calculate the time of reprogramming in a general way and find that noise would play a significant role if the stochastic hypothesis holds. Thus, the two hypotheses may be discriminated experimentally by measuring the time of reprogramming under different magnitudes of noise. Because our approach is general, our results should facilitate broad studies of the rational design of cell reprogramming protocols.
Submitted 7 December, 2014; v1 submitted 8 September, 2014;
originally announced September 2014.