Search | arXiv e-print repository

arXiv:2406.08140 [pdf]

Functional voxel hierarchy and afferent capacity revealed mental state transition on dynamic correlation resting-state fMRI

Authors: Dong Soo Lee, Hyun Joo Kim, Youngmin Huh, Yeon Koo Kang, Wonseok Whi, Hyekyoung Lee, Hyejin Kang

Abstract: Voxel hierarchy on dynamic brain graphs is produced by k core percolation on functional dynamic amplitude correlation of resting-state fMRI. Directed graphs and their afferent/efferent capacities are produced by Markov modeling of the universal cover of undirected graphs simultaneously with the calculation of volume entropy. Positive and unsigned negative brain graphs were analyzed separately on sliding-window representation to underpin the visualization and quantitation of mental dynamic states with their transitions. Voxel hierarchy animation maps of positive graphs revealed abrupt changes in coreness k and kmaxcore, which we called mental state transitions. Afferent voxel capacities of the positive graphs also revealed transient modules composed of dominating voxels/independent components and their exchanges representing mental state transitions. Animation and quantification plots of voxel hierarchy and afferent capacity corroborated each other in underpinning mental state transitions and afferent module exchange on the positive directed functional connectivity graphs. We propose the use of spatiotemporal trajectories of voxels on positive dynamic graphs to construct hierarchical structures by k core percolation and quantified in- and out-flows of information of voxels by volume entropy/directed graphs to subserve diverse resting mental state transitions on resting-state fMRI graphs in normal human individuals.

Submitted 12 June, 2024; originally announced June 2024.
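
Note on the entry above: the abstract relies on k core percolation and on the coreness k of each voxel. As a minimal sketch of what coreness means on an undirected graph (this is the generic peeling procedure, not the authors' pipeline; the function name `core_numbers` and the toy edge list are illustrative assumptions), one could write:

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: coreness} for an undirected simple graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored.
    Coreness of a node is the largest k such that the node belongs to
    a subgraph in which every node has degree at least k.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(neighbors) for node, neighbors in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        node = min(remaining, key=lambda n: degree[n])
        # The running maximum of peeled degrees is the coreness.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbor in adj[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return core


if __name__ == "__main__":
    # Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to 0.
    toy_edges = [(0, 1), (1, 2), (0, 2), (0, 3)]
    core = core_numbers(toy_edges)
    print(core[3])              # 1: the pendant node is peeled first
    print(max(core.values()))   # 2: the triangle forms the maximal core
```

This quadratic-time version favors readability; a bucket-based implementation would be preferable for voxel-scale graphs.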

arXiv:2405.18986 [pdf, other]

Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space

Authors: Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung, Hyun Joo Ro, Meeyoung Cha, Ho Min Kim

Abstract: Proteins are complex molecules responsible for different functions in nature. Enhancing the functionality of proteins and cellular fitness can significantly impact various industries. However, protein optimization using computational methods remains challenging, especially when starting from low-fitness sequences. We propose LatProtRL, an optimization method to efficiently traverse a latent space learned by an encoder-decoder leveraging a large protein language model. To escape local optima, our optimization is modeled as a Markov decision process using reinforcement learning acting directly in latent space. We evaluate our approach on two important fitness optimization tasks, demonstrating its ability to achieve comparable or superior fitness over baseline methods. Our findings and in vitro evaluation show that the generated sequences can reach high-fitness regions, suggesting a substantial potential of LatProtRL in lab-in-the-loop scenarios.

Submitted 29 May, 2024; originally announced May 2024.

Comments: ICML 2024

arXiv:2405.16357 [pdf, other]

Exploring the Enigma of Neural Dynamics Through A Scattering-Transform Mixer Landscape for Riemannian Manifold

Authors: Tingting Dan, Ziquan Wei, Won Hwa Kim, Guorong Wu

Abstract: The human brain is a complex inter-wired system that emerges spontaneous functional fluctuations. In spite of tremendous success in the experimental neuroscience field, a system-level understanding of how brain anatomy supports various neural activities remains elusive. Capitalizing on the unprecedented amount of neuroimaging data, we present a physics-informed deep model to uncover the coupling mechanism between brain structure and function through the lens of data geometry that is rooted in the widespread wiring topology of connections between distant brain regions. Since deciphering the puzzle of self-organized patterns in functional fluctuations is the gateway to understanding the emergence of cognition and behavior, we devise a geometric deep model to uncover manifold mapping functions that characterize the intrinsic feature representations of evolving functional fluctuations on the Riemannian manifold. In lieu of learning unconstrained mapping functions, we introduce a set of graph-harmonic scattering transforms to impose the brain-wide geometry on top of manifold mapping functions, which allows us to cast the manifold-based deep learning into a reminiscent of MLP-Mixer architecture (in computer vision) for Riemannian manifold.
As a proof-of-concept approach, we explore a neural-manifold perspective to understand the relationship between (static) brain structure and (dynamic) function, challenging the prevailing notion in cognitive neuroscience by proposing that neural activities are essentially excited by brain-wide oscillation waves living on the geometry of human connectomes, instead of being confined to focal areas.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 15 pages, 6 figures

MSC Class: 51H30 ACM Class: I.3.5

arXiv:2405.01974 [pdf, other]

Multitask Extension of Geometrically Aligned Transfer Encoder

Authors: Sung Moon Ko, Sumin Lee, Dae-Woong Jeong, Hyunseung Kim, Chanhui Lee, Soorin Yim, Sehui Han

Abstract: Molecular datasets often suffer from a lack of data. It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved. Here, we leverage mutual information across different tasks in molecular data to address this issue. We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transfer Encoder (GATE), to a multi-task setup. Thus, we connect multiple molecular tasks by aligning the curved coordinates onto locally flat coordinates, ensuring the flow of information from source tasks to support performance on target data.

Submitted 3 May, 2024; originally announced May 2024.

Comments: 7 pages, 3 figures, 2 tables

arXiv:2403.02706 [pdf, other]

DeepBioisostere: Discovering Bioisosteres with Deep Learning for a Fine Control of Multiple Molecular Properties

Authors: Hyeongwoo Kim, Seokhyun Moon, Wonho Zhung, Jaechang Lim, Woo Youn Kim

Abstract: Optimizing molecules to improve their properties is a fundamental challenge in drug design. For a fine-tuning of molecular properties without losing bio-activity validated in advance, the concept of bioisosterism has emerged. Many in silico methods have been proposed for discovering bioisosteres, but they require expert knowledge for their applications or are restricted to known databases. Here, we introduce DeepBioisostere, a deep generative model to design suitable bioisosteric replacements. Our model allows an end-to-end chemical replacement by intelligently selecting fragments for removal and insertion along with their attachment orientation. Through various scenarios of multiple property control, we showcase the model's capability to modulate specific properties, addressing the challenge in molecular optimization. Our model's innovation lies in its capacity to design a bioisosteric replacement reflecting the compatibility with the surroundings of the modification site, facilitating the control of sophisticated properties like drug-likeness. DeepBioisostere can also provide previously unseen bioisosteric replacements, highlighting its capability for exploring diverse chemical modifications rather than just mining them from known databases. Lastly, we employed DeepBioisostere to improve the sensitivity of a known SARS-CoV-2 main protease inhibitor to the E166V mutant that exhibits drug resistance to the inhibitor, demonstrating its potential application in lead optimization.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 32 pages, 7 figures, and 2 tables for main text

arXiv:2402.05961 [pdf, other]

Genetic-guided GFlowNets for Sample Efficient Molecular Optimization

Authors: Hyeonah Kim, Minsu Kim, Sanghyeok Choi, Jinkyoo Park

Abstract: The challenge of discovering new molecules with desired properties is crucial in domains like drug discovery and material design. Recent advances in deep learning-based generative methods have shown promise but face the issue of sample efficiency due to the computational expense of evaluating the reward function. This paper proposes a novel algorithm for sample-efficient molecular optimization by distilling a powerful genetic algorithm into deep generative policy using GFlowNets training, the off-policy method for amortized inference. This approach enables the deep generative policy to learn from domain knowledge, which has been explicitly integrated into the genetic algorithm. Our method achieves state-of-the-art performance in the official molecular optimization benchmark, significantly outperforming previous methods. It also demonstrates effectiveness in designing inhibitors against SARS-CoV-2 with substantially fewer reward calls.

Submitted 25 May, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

Comments: 26 pages (including 13 pages of appendix)

arXiv:2401.04873 [pdf, other]

Electrostatics of Salt-Dependent Reentrant Phase Behaviors Highlights Diverse Roles of ATP in Biomolecular Condensates

Authors: Yi-Hsuan Lin, Tae Hun Kim, Suman Das, Tanmoy Pal, Jonas Wessén, Atul Kaushik Rangadurai, Lewis E. Kay, Julie D. Forman-Kay, Hue Sun Chan

Abstract: Liquid-liquid phase separation (LLPS) involving intrinsically disordered protein regions (IDRs) is a major physical mechanism for biological membraneless compartmentalization. The multifaceted electrostatic effects in these biomolecular condensates are exemplified here by experimental and theoretical investigations of the different salt- and ATP-dependent LLPSs of an IDR of messenger RNA-regulating protein Caprin1 and its phosphorylated variant pY-Caprin1, exhibiting, e.g., reentrant behaviors in some instances but not others. Experimental data are rationalized by physical modeling using analytical theory, molecular dynamics, and polymer field-theoretic simulations, indicating in general that interchain salt bridges enhance LLPS of polyelectrolytes such as Caprin1 and that the high valency of ATP-magnesium is a significant factor for its colocalization with the condensed phases, as similar trends are observed for several other IDRs. Our findings underscore the role of biomolecular condensates in modulating ion concentrations and its functional ramifications.

Submitted 18 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

Comments: 67 pages, 2 main-text tables, 8 main-text figures, 6 supporting figures, 155 references. Submitted to eLife

arXiv:2310.15263 [pdf, other]

One-hot Generalized Linear Model for Switching Brain State Discovery

Authors: Chengrui Li, Soon Ho Kim, Chris Rodgers, Hannah Choi, Anqi Wu

Abstract: Exposing meaningful and interpretable neural interactions is critical to understanding neural circuits. Inferred neural interactions from neural signals primarily reflect functional interactions. In a long experiment, subject animals may experience different stages defined by the experiment, stimuli, or behavioral states, and hence functional interactions can change over time. To model dynamically changing functional interactions, prior work employs state-switching generalized linear models with hidden Markov models (i.e., HMM-GLMs). However, we argue they lack biological plausibility, as functional interactions are shaped and confined by the underlying anatomical connectome. Here, we propose a novel prior-informed state-switching GLM. We introduce both a Gaussian prior and a one-hot prior over the GLM in each state. The priors are learnable. We will show that the learned prior should capture the state-constant interaction, shedding light on the underlying anatomical connectome and revealing more likely physical neuron interactions. The state-dependent interaction modeled by each GLM offers traceability to capture functional variations across multiple brain states. Our methods effectively recover true interaction structures in simulated data, achieve the highest predictive likelihood with real neural datasets, and render interaction structures and hidden states more interpretable when applied to real neural data.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2308.00354 [pdf, other]

Self-supervised Multidimensional Scaling with $F$-ratio: Improving Microbiome Visualization

Authors: Hyungseok Kim, Soobin Kim, Megan M. Morris, Jeffrey A. Kimbrel, Xavier Mayali, Cullen R. Buie

Abstract: Multidimensional scaling (MDS) is an unsupervised learning technique that preserves pairwise distances between observations and is commonly used for analyzing multivariate biological datasets. Recent advances in MDS have achieved successful classification results, but the configurations heavily depend on the choice of hyperparameters, limiting its broader application. Here, we present a self-supervised MDS approach informed by the dispersions of observations that share a common binary label ($F$-ratio). Our visualization accurately configures the $F$-ratio while consistently preserving the global structure with a low data distortion compared to existing dimensionality reduction tools. Using an algal microbiome dataset, we show that this new method better illustrates the community's response to the host, suggesting its potential impact on microbiology and ecology data analysis.

Submitted 1 August, 2023; originally announced August 2023.

arXiv:2308.00298 [pdf, ps, other]

Finite population effects on optimal communication for social foragers

Authors: Hyunjoong Kim, Yoichiro Mori, Joshua B Plotkin

Abstract: Foraging is crucial for animals to survive. Many species forage in groups, as individuals communicate to share information about the location of available resources. For example, eusocial foragers, such as honey bees and many ants, recruit members from their central hive or nest to a known foraging site. However, the optimal level of communication and recruitment depends on the overall group size, the distribution of available resources, and the extent of interference between multiple individuals attempting to forage from a site. In this paper, we develop a discrete-time Markov chain model of eusocial foragers, who communicate information with a certain probability. We compare the stochastic model and its corresponding infinite-population limit. We find that foraging efficiency tapers off when recruitment probability is too high -- a phenomenon that does not occur in the infinite-population model, even though it occurs for any finite population size. The marginal inefficiency at high recruitment probability increases as the population increases, similar to a boundary layer. In particular, we prove there is a significant gap between the foraging efficiency of finite and infinite population models in the extreme case of complete communication. We also analyze this phenomenon by approximating the stationary distribution of foragers over sites in terms of mean escape times from multiple quasi-steady states. We conclude that for any finite group of foragers, an individual who has found a resource should only sometimes recruit others to the same resource.
We discuss the relationship between our analysis and multi-agent multi-arm bandit problems.

Submitted 1 August, 2023; originally announced August 2023.

arXiv:2305.13338 [pdf]

Gene Set Summarization using Large Language Models

Authors: Marcin P. Joachimiak, J. Harry Caufield, Nomi L. Harris, Hyeongsik Kim, Christopher J. Mungall

Abstract: Molecular biologists frequently interpret gene lists derived from high-throughput experiments and computational analysis. This is typically done as a statistical enrichment analysis that measures the over- or under-representation of biological function terms associated with genes or their properties, based on curated assertions from a knowledge base (KB) such as the Gene Ontology (GO). Interpreting gene lists can also be framed as a textual summarization task, enabling the use of Large Language Models (LLMs), potentially utilizing scientific texts directly and avoiding reliance on a KB. We developed SPINDOCTOR (Structured Prompt Interpolation of Natural Language Descriptions of Controlled Terms for Ontology Reporting), a method that uses GPT models to perform gene set function summarization as a complement to standard enrichment analysis. This method can use different sources of gene functional information: (1) structured text derived from curated ontological KB annotations, (2) ontology-free narrative gene summaries, or (3) direct model retrieval. We demonstrate that these methods are able to generate plausible and biologically valid summary GO term lists for gene sets. However, GPT-based approaches are unable to deliver reliable scores or p-values and often return terms that are not statistically significant. Crucially, these methods were rarely able to recapitulate the most precise and informative term from standard enrichment, likely due to an inability to generalize and reason using an ontology.
Results are highly nondeterministic, with minor variations in prompt resulting in radically different term lists. Our results show that at this point, LLM-based methods are unsuitable as a replacement for standard term enrichment analysis and that manual curation of ontological assertions remains necessary.

Submitted 3 July, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

arXiv:2303.11833 [pdf, other]

doi 10.1039/D3SC05281H

Materials Discovery with Extreme Properties via Reinforcement Learning-Guided Combinatorial Chemistry

Authors: Hyunseung Kim, Haeyeon Choi, Dongju Kang, Won Bo Lee, Jonggeol Na

Abstract: The goal of most materials discovery is to discover materials that are superior to those currently known. Fundamentally, this is close to extrapolation, which is a weak point for most machine learning models that learn the probability distribution of data. Herein, we develop reinforcement learning-guided combinatorial chemistry, which is a rule-based molecular designer driven by trained policy for selecting subsequent molecular fragments to get a target molecule. Since our model has the potential to generate all possible molecular structures that can be obtained from combinations of molecular fragments, unknown molecules with superior properties can be discovered. We theoretically and empirically demonstrate that our model is more suitable for discovering better compounds than probability distribution-learning models. In an experiment aimed at discovering molecules that hit seven extreme target properties, our model discovered 1,315 of all target-hitting molecules and 7,629 of five target-hitting molecules out of 100,000 trials, whereas the probability distribution-learning models failed. Moreover, it has been confirmed that every molecule generated under the binding rules of molecular fragments is 100% chemically valid. To illustrate the performance in actual problems, we also demonstrate that our models work well on two practical applications: discovering protein docking molecules and HIV inhibitors.

Submitted 7 May, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

Comments: 18 pages, 8 figures

Journal ref: Chemical Science, 2024

arXiv:2302.12455 [pdf]

doi 10.1088/2050-6120/acfb58

A new twist on PIFE: photoisomerisation-related fluorescence enhancement

Authors: Evelyn Ploetz, Benjamin Ambrose, Anders Barth, Richard Börner, Felix Erichson, Achillefs N. Kapanidis, Harold D. Kim, Marcia Levitus, Timothy M. Lohman, Abhishek Mazumder, David S. Rueda, Fabio D. Steffen, Thorben Cordes, Steven W. Magennis, Eitan Lerner

Abstract: PIFE was first used as an acronym for protein-induced fluorescence enhancement, which refers to the increase in fluorescence observed upon the interaction of a fluorophore, such as a cyanine, with a protein. This fluorescence enhancement is due to changes in the rate of cis/trans photoisomerisation. It is clear now that this mechanism is generally applicable to interactions with any biomolecule and, in this review, we propose that PIFE is thereby renamed according to its fundamental working principle as photoisomerisation-related fluorescence enhancement, keeping the PIFE acronym intact. We discuss the photochemistry of cyanine fluorophores, the mechanism of PIFE, its advantages and limitations, and recent approaches to turn PIFE into a quantitative assay. We provide an overview of its current applications to different biomolecules and discuss potential future uses, including the study of protein-protein interactions, protein-ligand interactions and conformational changes in biomolecules.

Submitted 10 July, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

Comments: No Comments

MSC Class: N/A

arXiv:2209.04742 [pdf, other]

Physical limits on galvanotaxis

Authors: Ifunanya Nwogbaga, A Hyun Kim, Brian A. Camley

Abstract: Eukaryotic cells can polarize and migrate in response to electric fields via "galvanotaxis," which aids wound healing. Experimental evidence suggests cells sense electric fields via molecules on the cell's surface redistributing via electrophoresis and electroosmosis, though the sensing species has not yet been conclusively identified. We develop a model that links sensor redistribution and galvanotaxis using maximum likelihood estimation. Our model predicts a single universal curve for how galvanotactic directionality depends on field strength. We can collapse measurements of galvanotaxis in keratocytes, neural crest cells, and granulocytes to this curve, suggesting that stochasticity due to the finite number of sensors may limit galvanotactic accuracy. We find cells can achieve experimentally observed directionalities with either a few (~100) highly-polarized sensors, or many (~10,000) sensors with a ~6-10% change in concentration across the cell. We also identify additional signatures of galvanotaxis via sensor redistribution, including the presence of a tradeoff between accuracy and variance in cells being controlled by rapidly switching fields. Our approach shows how the physics of noise at the molecular scale can limit cell-scale galvanotaxis, providing important constraints on sensor properties, and allowing for new tests to determine the specific molecules underlying galvanotaxis.

Submitted 21 July, 2023; v1 submitted 10 September, 2022; originally announced September 2022.

arXiv:2206.00133 [pdf, other]

Pre-training via Denoising for Molecular Property Prediction

Authors: Sheheryar Zaidi, Michael Schaarschmidt, James Martens, Hyunjik Kim, Yee Whye Teh, Alvaro Sanchez-Gonzalez, Peter Battaglia, Razvan Pascanu, Jonathan Godwin

Abstract: Many important problems involving molecular property prediction from 3D structures have limited data, posing a generalization challenge for neural networks. In this paper, we describe a pre-training technique based on denoising that achieves a new state-of-the-art in molecular property prediction by utilizing large datasets of 3D molecular structures at equilibrium to learn meaningful representations for downstream tasks. Relying on the well-known link between denoising autoencoders and score-matching, we show that the denoising objective corresponds to learning a molecular force field -- arising from approximating the Boltzmann distribution with a mixture of Gaussians -- directly from equilibrium structures. Our experiments demonstrate that using this pre-training objective significantly improves performance on multiple benchmarks, achieving a new state-of-the-art on the majority of targets in the widely used QM9 dataset. Our analysis then provides practical insights into the effects of different factors -- dataset sizes, model size and architecture, and the choice of upstream and downstream datasets -- on pre-training.

Submitted 24 October, 2022; v1 submitted 31 May, 2022; originally announced June 2022.
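
Note on the entry above: the link between denoising and a force field that the abstract mentions can be sketched in a few lines. The notation here ($x$ for an equilibrium structure, $\tilde{x}$ for its noised copy, $\sigma$ for the noise scale, $E$ for the energy, $k_B T$ for the thermal energy) is assumed for illustration and is not taken from the abstract; this is the textbook argument under isotropic Gaussian noise, not necessarily the paper's exact derivation.

```latex
% Noising an equilibrium structure with isotropic Gaussian noise:
\tilde{x} = x + \sigma\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)

% Score of the Gaussian noising kernel centered on x:
\nabla_{\tilde{x}} \log q_\sigma(\tilde{x} \mid x)
  = -\frac{\tilde{x} - x}{\sigma^{2}}
  = -\frac{\epsilon}{\sigma}

% For a Boltzmann distribution, the score is the force up to a constant factor:
p(x) \propto \exp\!\left(-\frac{E(x)}{k_B T}\right)
  \;\Longrightarrow\;
  \nabla_{x} \log p(x) = -\frac{\nabla_{x} E(x)}{k_B T} = \frac{F(x)}{k_B T}
```

Under these assumptions, a network trained to predict the added noise $\epsilon$ is, up to the factor $-1/\sigma$, regressing onto a score, and a score of a Boltzmann-like density is a rescaled force; this is the sense in which denoising can be read as learning a force field.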

arXiv:2204.09768 [pdf, other]

doi 10.1103/PhysRevE.106.054411

Optimality of intercellular signaling: direct transport versus diffusion

Authors: Hyunjoong Kim, Yoichiro Mori, Joshua B. Plotkin

Abstract: Intercellular signaling has an important role in organism development, but not all communication occurs using the same mechanism. Here, we analyze the energy efficiency of intercellular signaling by two canonical mechanisms: diffusion of signaling molecules and direct transport mediated by signaling cellular protrusions. We show that efficient contact formation for direct transport can be established by an optimal rate of projecting protrusions, which depends on the availability of information about the location of the target cell. The optimal projection rate also depends on how signaling molecules are transported along the protrusion, in particular the ratio of the energy cost for contact formation and molecule synthesis. Also, we compare the efficiency of the two signaling mechanisms, under various model parameters. We find that the direct transport is favored over the diffusion when transporting a large amount of signaling molecules. There is a critical number of signaling molecules at which the efficiency of the two mechanisms are the same. The critical number is small when the distance between cells is far, which helps explain why protrusion-based mechanisms are observed in long-range cellular communications.

Submitted 24 October, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

arXiv:2204.08699 [pdf, other]

Weak tension accelerates hybridization and dehybridization of short oligonucleotides

Authors: Derek J. Hart, Jiyoun Jeong, James C. Gumbart, Harold D. Kim

Abstract: The hybridization and dehybridization of DNA subject to tension is relevant to fundamental genetic processes and to the design of DNA-based mechanobiology assays. While strong tension accelerates DNA melting and decelerates DNA annealing, the effects of tension weaker than 5 pN are less clear. In this study, we developed a DNA bow assay, which uses the bending rigidity of double-stranded DNA (dsDNA) to exert weak tension on a single-stranded DNA (ssDNA) target in the range of 2 pN to 6 pN. Combining this assay with single-molecule FRET, we measured the hybridization and dehybridization kinetics between a 15 nt ssDNA under tension and a 8-9 nt oligo, and found that both the hybridization and dehybridization rates monotonically increase with tension for various nucleotide sequences tested. These findings suggest that the nucleated duplex in its transition state is more extended than the pure dsDNA or ssDNA counterpart. Our simulations using the coarse-grained oxDNA2 model indicate that the increased extension of the transition state is due to exclusion interactions between unpaired ssDNA regions in close proximity to one another. This study highlights an example where the ideal worm-like chain models fail to explain the kinetic behavior of DNA in the low force regime.

Submitted 19 April, 2022; originally announced April 2022.

arXiv:2204.06714 [pdf, other]

graph-GPA 2.0: A Graphical Model for Multi-disease Analysis of GWAS Results with Integration of Functional Annotation Data

Authors: Qiaolan Deng, Jin Hyun Nam, Ayse Selen Yilmaz, Won Chang, Maciej Pietrzak, Lang Li, Hang J. Kim, Dongjun Chung

Abstract: Genome-wide association studies (GWAS) have successfully identified a large number of genetic variants associated with traits and diseases. However, it still remains challenging to fully understand functional mechanisms underlying many associated variants. This is especially the case when we are interested in variants shared across multiple phenotypes. To address this challenge, we propose graph-GPA 2.0 (GGPA 2.0), a novel statistical framework to integrate GWAS datasets for multiple phenotypes and incorporate functional annotations within a unified framework. We conducted simulation studies to evaluate GGPA 2.0. The results indicate that incorporating functional annotation data using GGPA 2.0 does not only improve detection of disease-associated variants, but also allows to identify more accurate relationships among diseases. We analyzed five autoimmune diseases and five psychiatric disorders with the functional annotations derived from GenoSkyline and GenoSkyline-Plus and the prior disease graph generated by biomedical literature mining. For autoimmune diseases, GGPA 2.0 identified enrichment for blood, especially B cells and regulatory T cells across multiple diseases. Psychiatric disorders were enriched for brain, especially prefrontal cortex and inferior temporal lobe for bipolar disorder (BIP) and schizophrenia (SCZ), respectively. Finally, GGPA 2.0 successfully identified the pleiotropy between BIP and SCZ. These results demonstrate that GGPA 2.0 can be a powerful tool to identify associated variants associated with each phenotype or those shared across multiple phenotypes, while also promoting understanding of functional mechanisms underlying the associated variants.

Submitted 13 April, 2022; originally announced April 2022.

arXiv:2203.13982 [pdf]

Implications of Mortality Displacement for Effect Modification and Selection Bias

Authors: Honghyok Kim, Jong-Tae Lee, Roger D. Peng, Kelvin C. Fong, Michelle L. Bell

Abstract: Mortality displacement is the concept that deaths are moved forward in time (e.g., a few days, several months, and years) by exposure from when they would occur without the exposure, which is common in environmental time-series studies. Using concepts of a frail population and loss of life expectancy, it is understood that mortality displacement may decrease rate ratio (RR). Such decreases are thought to be minimal or substantial depending on study populations. Environmental epidemiologists have interpreted RR considering mortality displacement. This theoretical paper reveals that mortality displacement can be formulated as a built-in selection bias of RR in Cox models due to unmeasured risk factors independent from exposure of interest, and mortality displacement can also be viewed as an effect modifier by integrating the concepts of rate and loss of life expectancy. Thus, depending on the framework through which we view bias, mortality displacement can be categorized as selection bias in the bias taxonomy of epidemiology, and simultaneously mortality displacement can be seen as an effect modifier. This dichotomy provides useful implications regarding policy, effect modification, exposure time-windows selection, and generalizability, specifically why research in epidemiology may produce unexpected and heterogeneous RR over different studies and sub-populations.

Submitted 25 March, 2022; originally announced March 2022.

Comments: This is an epidemiological theory paper

arXiv:2203.09719 [pdf, other]

Evolution as Explanation: The Origins of Neural Codes and their Efficiencies

Authors: Han Kim

Abstract: Neural codes appear efficient. Naturally, neuroscientists contend that an efficient process is responsible for generating efficient codes. They argue that natural selection is the efficient process that generates those codes. Although natural selection is an adaptive process, evolution itself, is not. Evolution consists of not only natural selection, but also neutral stochastic forces that can generate biological inefficiencies. The explanatory power of natural selection cannot be appealed to, without regards for the remaining evolutionary forces. In this paper, we aim to reformulate the explanatory role of evolutionary forces on neural coding, with special attention to neutral forces. We propose a framework that argues for differing contributions of adaptive and stochastic evolutionary forces, for different phenotypic `levels', including those of neural codes. We assert that this framework is of special interest to neuroscience, because the field has derived much progress from an efficiency-based worldview. We advocate for a pluralistic neuroscience capable of appealing to both adaptive and non-adaptive explanations.

Submitted 17 March, 2022; originally announced March 2022.

Comments: 19 pages total (includes references), 4 figures

arXiv:2106.06877 [pdf, other]

GPA-Tree: Statistical Approach for Functional-Annotation-Tree-Guided Prioritization of GWAS Results

Authors: Aastha Khatiwada, Bethany J. Wolf, Ayse Selen Yilmaz, Paula S. Ramos, Maciej Pietrzak, Andrew Lawson, Kelly J. Hunt, Hang J. Kim, Dongjun Chung

Abstract: Motivation: In spite of great success of genome-wide association studies (GWAS), multiple challenges still remain. First, complex traits are often associated with many single nucleotide polymorphisms (SNPs), each with small or moderate effect sizes. Second, our understanding of the functional mechanisms through which genetic variants are associated with complex traits is still limited. To address these challenges, we propose GPA-Tree and it simultaneously implements association mapping and identifies key combinations of functional annotations related to risk-associated SNPs by combining a decision tree algorithm with a hierarchical modeling framework. Results: First, we implemented simulation studies to evaluate the proposed GPA-Tree method and compared its performance with existing statistical approaches. The results indicate that GPA-Tree outperforms existing statistical approaches in detecting risk-associated SNPs and identifying the true combinations of functional annotations with high accuracy. Second, we applied GPA-Tree to a systemic lupus erythematosus (SLE) GWAS and functional annotation data including GenoSkyline and GenoSkylinePlus. The results from GPA-Tree highlight the dysregulation of blood immune cells, including but not limited to primary B, memory helper T, regulatory T, neutrophils and CD8+ memory T cells in SLE. These results demonstrate that GPA-Tree can be a powerful tool that improves association mapping while facilitating understanding of the underlying genetic architecture of complex traits and potential mechanisms linking risk-associated SNPs with complex traits.

Submitted 12 June, 2021; originally announced June 2021.

Comments: 14 pages, 3 figures

arXiv:2104.04547 [pdf, other]

High-Throughput Virtual Screening of Small Molecule Inhibitors for SARS-CoV-2 Protein Targets with Deep Fusion Models

Authors: Garrett A. Stevenson, Derek Jones, Hyojin Kim, W. F. Drew Bennett, Brian J. Bennion, Monica Borucki, Feliza Bourguet, Aidan Epstein, Magdalena Franco, Brooke Harmon, Stewart He, Max P. Katz, Daniel Kirshner, Victoria Lao, Edmond Y. Lau, Jacky Lo, Kevin McLoughlin, Richard Mosesso, Deepa K. Murugesh, Oscar A. Negrete, Edwin A. Saada, Brent Segelke, Maxwell Stefan, Marisa W. Torres, Dina Weilhammer, et al. (7 additional authors not shown)

Abstract: Structure-based Deep Fusion models were recently shown to outperform several physics- and machine learning-based protein-ligand binding affinity prediction methods. As part of a multi-institutional COVID-19 pandemic response, over 500 million small molecules were computationally screened against four protein structures from the novel coronavirus (SARS-CoV-2), which causes COVID-19. Three enhancements to Deep Fusion were made in order to evaluate more than 5 billion docked poses on SARS-CoV-2 protein targets. First, the Deep Fusion concept was refined by formulating the architecture as one, coherently backpropagated model (Coherent Fusion) to improve binding-affinity prediction accuracy. Secondly, the model was trained using a distributed, genetic hyper-parameter optimization. Finally, a scalable, high-throughput screening capability was developed to maximize the number of ligands evaluated and expedite the path to experimental evaluation. In this work, we present both the methods developed for machine learning-based high-throughput screening and results from using our computational pipeline to find SARS-CoV-2 inhibitors.

Submitted 31 May, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

arXiv:2012.00001 [pdf, other]

Utilizing stability criteria in choosing feature selection methods yields reproducible results in microbiome data

Authors: Lingjing Jiang, Niina Haiminen, Anna-Paola Carrieri, Shi Huang, Yoshiki Vazquez-Baeza, Laxmi Parida, Ho-Cheol Kim, Austin D. Swafford, Rob Knight, Loki Natarajan

Abstract: Feature selection is indispensable in microbiome data analysis, but it can be particularly challenging as microbiome data sets are high-dimensional, underdetermined, sparse and compositional. Great efforts have recently been made on developing new methods for feature selection that handle the above data characteristics, but almost all methods were evaluated based on performance of model predictions. However, little attention has been paid to address a fundamental question: how appropriate are those evaluation criteria? Most feature selection methods often control the model fit, but the ability to identify meaningful subsets of features cannot be evaluated simply based on the prediction accuracy. If tiny changes to the training data would lead to large changes in the chosen feature subset, then many of the biological features that an algorithm has found are likely to be a data artifact rather than real biological signal. This crucial need of identifying relevant and reproducible features motivated the reproducibility evaluation criterion such as Stability, which quantifies how robust a method is to perturbations in the data. In our paper, we compare the performance of popular model prediction metric MSE and proposed reproducibility criterion Stability in evaluating four widely used feature selection methods in both simulations and experimental microbiome applications. We conclude that Stability is a preferred feature selection criterion over MSE because it better quantifies the reproducibility of the feature selection method.

Submitted 30 November, 2020; originally announced December 2020.

Report number: https://doi.org/10.1111/biom.13481

arXiv:2009.06076 [pdf, ps, other]

Stochastic Turing pattern formation in a model with active and passive transport

Authors: Hyunjoong Kim, Paul C. Bressloff

Abstract: We investigate Turing pattern formation in a stochastic and spatially discretized version of a reaction diffusion advection (RDA) equation, which was previously introduced to model synaptogenesis in C. elegans. The model describes the interactions between a passively diffusing molecular species and an advecting species that switches between anterograde and retrograde motor-driven transport (bidirectional transport). Within the context of synaptogenesis, the diffusing molecules can be identified with the protein kinase CaMKII and the advecting molecules as glutamate receptors. The stochastic dynamics evolves according to an RDA master equation, in which advection and diffusion are both modeled as hopping reactions along a one-dimensional array of chemical compartments. Carrying out a linear noise approximation of the RDA master equation leads to an effective Langevin equation, whose power spectrum provides a means of extending the definition of a Turing instability to stochastic systems, namely, in terms of the existence of a peak in the power spectrum at a non-zero spatial frequency. We thus show how noise can significantly extend the range over which spontaneous patterns occur, which is consistent with previous studies of RD systems.

Submitted 13 September, 2020; originally announced September 2020.

Comments: 26 pages, 8 figures

arXiv:2008.00629 [pdf, other]

Superlinear Precision and Memory in Simple Population Codes

Authors: Jimmy H. J. Kim, Ila Fiete, David J. Schwab

Abstract: The brain constructs population codes to represent stimuli through widely distributed patterns of activity across neurons. An important figure of merit of population codes is how much information about the original stimulus can be decoded from them. Fisher information is widely used to quantify coding precision and specify optimal codes, because of its relationship to mean squared error (MSE) under certain assumptions. When neural firing is sparse, however, optimizing Fisher information can result in codes that are highly sub-optimal in terms of MSE. We find that this discrepancy arises from the non-local component of error not accounted for by the Fisher information. Using this insight, we construct optimal population codes by directly minimizing the MSE. We study the scaling properties of MSE with coding parameters, focusing on the tuning curve width. We find that the optimal tuning curve width for coding no longer scales as the inverse population size, and the quadratic scaling of precision with system size predicted by Fisher information alone no longer holds. However, superlinearity is still preserved with only a logarithmic slowdown. We derive analogous results for networks storing the memory of a stimulus through continuous attractor dynamics, and show that similar scaling properties optimize memory and representation.

Submitted 2 August, 2020; originally announced August 2020.

Comments: 5 pages, 4 figures

arXiv:2005.10925 [pdf, other]

doi 10.1016/j.bpj.2021.01.043

First passage time study of DNA strand displacement

Authors: D. W. Bo Broadwater, Jr., Alexander W. Cook, Harold D. Kim

Abstract: DNA strand displacement, where a single-stranded nucleic acid invades a DNA duplex, is pervasive in genomic processes and DNA engineering applications. The kinetics of strand displacement have been studied in bulk; however, the kinetics of the underlying strand exchange were obfuscated by a slow bimolecular association step. Here, we use a novel single-molecule Fluorescence Resonance Energy Transfer (smFRET) approach termed the "fission" assay to obtain the full distribution of first passage times of unimolecular strand displacement. At a frame time of 4.4 ms, the first passage time distribution for a 14-nt displacement domain exhibited a nearly monotonic decay with little delay. Among the eight different sequences we tested, the mean displacement time was on average 35 ms and varied by up to a factor of 13. The measured displacement kinetics also varied between complementary invaders and between RNA and DNA invaders of the same base sequence except for T$\rightarrow$U substitution. However, displacement times were largely insensitive to the monovalent salt concentration in the range of 0.25 M to 1 M. Using a one-dimensional random walk model, we infer that the single-step displacement time is in the range of $\sim 30 μs$ to $\sim 300 μs$ depending on the base identity. The framework presented here is broadly applicable to the kinetic analysis of multistep processes investigated at the single-molecule level.

Submitted 20 May, 2021; v1 submitted 21 May, 2020; originally announced May 2020.

Comments: To be published in Biophysical Journal

arXiv:2005.07704 [pdf, other]

Improved Protein-ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference

Authors: Derek Jones, Hyojin Kim, Xiaohua Zhang, Adam Zemla, Garrett Stevenson, William D. Bennett, Dan Kirshner, Sergio Wong, Felice Lightstone, Jonathan E. Allen

Abstract: Predicting accurate protein-ligand binding affinity is important in drug discovery but remains a challenge even with computationally expensive biophysics-based energy scoring methods and state-of-the-art deep learning approaches. Despite the recent advances in the deep convolutional and graph neural network based approaches, the model performance depends on the input data representation and suffers from distinct limitations. It is natural to combine complementary features and their inference from the individual models for better predictions. We present fusion models to benefit from different feature representations of two neural network models to improve the binding affinity prediction. We demonstrate effectiveness of the proposed approach by performing experiments with the PDBBind 2016 dataset and its docking pose complexes. The results show that the proposed approach improves the overall prediction compared to the individual neural network models with greater computational efficiency than related biophysics based energy scoring functions. We also discuss the benefit of the proposed fusion inference with several example complexes. The software is made available as open source at https://github.com/llnl/fast.

Submitted 17 May, 2020; originally announced May 2020.

arXiv:2005.05437 [pdf]

End-to-End Automatic Sleep Stage Classification Using Spectral-Temporal Sleep Features

Authors: Hyeong-Jin Kim, Minji Lee, Seong-Whan Lee

Abstract: Sleep disorder is one of many neurological diseases that can greatly affect the quality of daily life. It is very burdensome to manually classify the sleep stages to detect sleep disorders. Therefore, automatic sleep stage classification techniques are needed. However, the previous automatic sleep scoring methods using raw signals still show low classification performance. In this study, we proposed an end-to-end automatic sleep staging framework based on optimal spectral-temporal sleep features using a sleep-edf dataset. The input data were modified using a bandpass filter and then applied to a convolutional neural network model. For five sleep stage classification, the classification performance was 85.6% and 91.1% using the raw input data and the proposed input, respectively. This result also shows the highest performance compared to conventional studies using the same dataset. The proposed framework has shown high performance by using optimal features associated with each sleep stage, which may help to find new features in the automatic sleep stage method.

Submitted 4 May, 2020; originally announced May 2020.

arXiv:2005.01325 [pdf]

Prediction of Event Related Potential Speller Performance Using Resting-State EEG

Authors: Gi-Hwan Shin, Minji Lee, Hyeong-Jin Kim, Seong-Whan Lee

Abstract: Event-related potential (ERP) speller can be utilized in device control and communication for locked-in or severely injured patients. However, problems such as inter-subject performance instability and ERP-illiteracy are still unresolved. Therefore, it is necessary to predict classification performance before performing an ERP speller in order to use it efficiently. In this study, we investigated the correlations with ERP speller performance using a resting-state before an ERP speller. In specific, we used spectral power and functional connectivity according to four brain regions and five frequency bands. As a result, the delta power in the frontal region and functional connectivity in the delta, alpha, gamma bands are significantly correlated with the ERP speller performance. Also, we predicted the ERP speller performance using EEG features in the resting-state. These findings may contribute to investigating the ERP-illiteracy and considering the appropriate alternatives for each user.

Submitted 7 May, 2020; v1 submitted 4 May, 2020; originally announced May 2020.

Comments: Accepted to IEEE EMBC 2020

arXiv:2004.12665 [pdf]

Estimation of Infection Rate and Prediction of Initial Infected Individuals of COVID-19

Authors: Seo Yoon Chae, Kyoung-Eun Lee, Hyun Min Lee, Nam Jun, Quang Ahn Le, Biseko Juma Mafwele, Tae Ho Lee, Doo Hwan Kim, Jae Woo Lee

Abstract: We consider the pandemic spreading of COVID-19 for some selected countries after the outbreak of the coronavirus in Wuhan City, China. We estimated the infection rate and the initial infected individuals of COVID-19 by using the officially reported data at the early stage of the epidemic for the susceptible (S), infectable (I), quarantined (Q), and the confirmed recovered (Rk) population model, the so-called SIQRk model. In the reported data we know the quarantined cases and the recovered cases. We can not know the recovered cases from the asymptomatic cases. In the SIQRk model we can estimate the model parameters and the initial infecting cases (confirmed and asymptomatic cases) from the data fits. We obtained the infection rate in the range between 0.233 and 0.462, the basic reproduction number R0 in the range between 1.8 and 3.5, and the initial number of infected individuals in the range between 10 and 8409 for some selected countries. By using fitting parameters we estimated the maximum time of the infection for Germany when the government are performing the quarantine policy. The disease is undergoing to the calm state about six months after first patients were identified.

Submitted 27 April, 2020; originally announced April 2020.

Comments: 12 pages, 2 figures

arXiv:2004.10762 [pdf, other]

doi 10.1098/rsif.2020.0954

Automated Contact Tracing: a game of big numbers in the time of COVID-19

Authors: Hyunju Kim, Ayan Paul

Abstract: One of the more widely advocated solutions for slowing down the spread of COVID-19 has been automated contact tracing. Since proximity data can be collected by personal mobile devices, the natural proposal has been to use this for automated contact tracing providing a major gain over a manual implementation. In this work, we study the characteristics of voluntary and automated contact tracing and its effectiveness for mapping the spread of a pandemic due to the spread of SARS-CoV-2. We highlight the infrastructure and social structures required for automated contact tracing to work. We display the vulnerabilities of the strategy to inadequate sampling of the population, which results in the inability to sufficiently determine significant contact with infected individuals. Of crucial importance will be the participation of a significant fraction of the population for which we derive a minimum threshold. We conclude that relying largely on automated contact tracing without population-wide participation to contain the spread of the SARS-CoV-2 pandemic can be counterproductive and allow the pandemic to spread unchecked. The simultaneous implementation of various mitigation methods along with automated contact tracing is necessary for reaching an optimal solution to contain the pandemic.

Submitted 9 February, 2021; v1 submitted 22 April, 2020; originally announced April 2020.

Comments: 10 pages and 2 figures

Report number: DESY 20-069, HU-EP-20/10

Journal ref: J. The Royal Soc. Interface 18, 20200954

arXiv:2002.01571 [pdf]

doi 10.3390/e22090986

Antifragility Predicts the Robustness and Evolvability of Biological Networks through Multi-class Classification with a Convolutional Neural Network

Authors: Hyobin Kim, Stalin Muñoz, Pamela Osuna, Carlos Gershenson

Abstract: Robustness and evolvability are essential properties to the evolution of biological networks. To determine if a biological network is robust and/or evolvable, it is required to compare its functions before and after mutations. However, this sometimes takes a high computational cost as the network size grows. Here we develop a predictive method to estimate the robustness and evolvability of biological networks without an explicit comparison of functions. We measure antifragility in Boolean network models of biological systems and use this as the predictor. Antifragility occurs when a system benefits from external perturbations. By means of the differences of antifragility between the original and mutated biological networks, we train a convolutional neural network (CNN) and test it to classify the properties of robustness and evolvability. We found that our CNN model successfully classified the properties. Thus, we conclude that our antifragility measure can be used as a predictor of the robustness and evolvability of biological networks.

Submitted 2 September, 2020; v1 submitted 4 February, 2020; originally announced February 2020.

Comments: 22 pages, 10 figures

arXiv:1902.11214 [pdf]

doi 10.1155/2019/2783217

A Multilayer Structure Facilitates the Production of Antifragile Systems in Boolean Network Models

Authors: Hyobin Kim, Omar K. Pineda, Carlos Gershenson

Abstract: Antifragility is a property from which systems are able to resist stress and furthermore benefit from it. Even though antifragile dynamics is found in various real-world complex systems where multiple subsystems interact with each other, the attribute has not been quantitatively explored yet in those complex systems which can be regarded as multilayer networks. Here we study how the multilayer structure affects the antifragility of the whole system. By comparing single-layer and multilayer Boolean networks based on our recently proposed antifragility measure, we found that the multilayer structure facilitated the production of antifragile systems. Our measure and findings will be useful for various applications such as exploring properties of biological systems with multilayer structures and creating more antifragile engineered systems.

Submitted 10 December, 2019; v1 submitted 28 February, 2019; originally announced February 2019.

Comments: 15 pages, 8 figures

Journal ref: Complexity, 2019:11

arXiv:1901.00263 [pdf, other]

doi 10.1103/PhysRevLett.122.218101

Base-pair mismatch can destabilize small DNA loops through cooperative kinking

Authors: Jiyoun Jeong, Harold D. Kim

Abstract: Base pair mismatch can relieve mechanical stress in highly strained DNA molecules, but how it affects their kinetic stability is not known. Using single-molecule Fluorescence Resonance Energy Transfer (FRET), we measured the lifetimes of tightly bent DNA loops with and without base pair mismatch. Surprisingly, for loops captured by stackable sticky ends, the mismatch decreased the loop lifetime despite reducing the overall bending stress, and the decrease was largest when the mismatch was placed at the DNA midpoint. These findings show that base pair mismatch transfers bending stress to the opposite side of the loop through an allosteric mechanism known as cooperative kinking. Based on this mechanism, we present a three-state model that explains the apparent dichotomy between thermodynamic and kinetic stability of DNA loops.

Submitted 1 January, 2019; originally announced January 2019.

Journal ref: Phys. Rev. Lett. 122, 218101 (2019)

arXiv:1812.09352 [pdf, other]

Determinants of cyclization-decyclization kinetics of short DNA with sticky ends

Authors: Jiyoun Jeong, Harold D. Kim

Abstract: Cyclization of DNA with sticky ends is commonly used to construct DNA minicircles and to measure DNA bendability. The cyclization probability of short DNA (< 150 bp) has a strong length dependence, but how it depends on the rotational positioning of the sticky ends around the helical axis is less clear. To shed light upon the determinants of the cyclization probability of short DNA, we measured cyclization and decyclization rates of ~100-bp DNA with sticky ends over two helical periods using single-molecule Fluorescence Resonance Energy Transfer (FRET). The cyclization rate increases monotonically with length, indicating no excess twisting, while the decyclization rate oscillates with length, higher at half-integer helical turns and lower at integer helical turns. The oscillation profile is kinetically and thermodynamically consistent with a three-state cyclization model in which sticky-ended short DNA first bends into a torsionally-relaxed teardrop, and subsequently transitions to a more stable loop upon terminal base stacking. We also show that the looping probability density (the J factor) extracted from this study is in good agreement with the worm-like chain model near 100 bp. For shorter DNA, we discuss various experimental factors that prevent an accurate measurement of the J factor.

Submitted 21 December, 2018; originally announced December 2018.

arXiv:1812.06760 [pdf]

doi 10.1155/2019/3728621

A Novel Antifragility Measure Based on Satisfaction and Its Application to Random and Biological Boolean Networks

Authors: Omar K. Pineda, Hyobin Kim, Carlos Gershenson

Abstract: Antifragility is a property that enhances the capability of a system in response to external perturbations. Although the concept has been applied in many areas, a practical measure of antifragility has not been developed yet. Here we propose a simply calculable measure of antifragility, based on the change of "satisfaction" before and after adding perturbations, and apply it to random Boolean networks (RBNs). Using the measure, we found that ordered RBNs are the most antifragile. Also, we demonstrated that seven biological systems are antifragile. Our measure and results can be used in various applications of Boolean networks (BNs) including creating antifragile engineering systems, identifying the genetic mechanism of antifragile biological systems, and developing new treatment strategies for various diseases.

Submitted 24 April, 2019; v1 submitted 17 December, 2018; originally announced December 2018.

Comments: 15 pages, 7 figures

Journal ref: Complexity, 2019:10

arXiv:1805.01447 [pdf, other]

doi 10.1103/PhysRevLett.121.138102

Logic and connectivity jointly determine criticality in biological gene regulatory networks

Authors: Bryan C. Daniels, Hyunju Kim, Douglas Moore, Siyu Zhou, Harrison Smith, Bradley Karas, Stuart A. Kauffman, Sara I. Walker

Abstract: The complex dynamics of gene expression in living cells can be well-approximated using Boolean networks. The average sensitivity is a natural measure of stability in these systems: values below one indicate typically stable dynamics associated with an ordered phase, whereas values above one indicate chaotic dynamics. This yields a theoretically motivated adaptive advantage to being near the critical value of one, at the boundary between order and chaos. Here, we measure average sensitivity for 66 publicly available Boolean network models describing the function of gene regulatory circuits across diverse living processes. We find the average sensitivity values for these networks are clustered around unity, indicating they are near critical. In many types of random networks, mean connectivity <K> and the average activity bias of the logic functions <p> have been found to be the most important network properties in determining average sensitivity, and by extension a network's criticality. Surprisingly, many of these gene regulatory networks achieve the near-critical state with <K> and <p> far from that predicted for critical systems: randomized networks sharing the local causal structure and local logic of biological networks better reproduce their critical behavior than controlling for macroscale properties such as <K> and <p> alone. This suggests the local properties of genes interacting within regulatory networks are selected to collectively be near-critical, and this non-local property of gene regulatory network dynamics cannot be predicted using the density of interactions alone.

Submitted 3 May, 2018; originally announced May 2018.

Comments: 10 pages, 7 figures

Journal ref: Phys. Rev. Lett. 121, 138102 (2018)

arXiv:1803.10996 [pdf, other]

Dihedral angle prediction using generative adversarial networks

Authors: Hyeongki Kim

Abstract: Several dihedral angles prediction methods were developed for protein structure prediction and their other applications. However, distribution of predicted angles would not be similar to that of real angles. To address this we employed generative adversarial networks (GAN). Generative adversarial networks are composed of two adversarially trained networks: a discriminator and a generator. A discriminator distinguishes samples from a dataset and generated samples while a generator generates realistic samples. Although the discriminator of GANs is trained to estimate density, GAN model is intractable. On the other hand, noise-contrastive estimation (NCE) was introduced to estimate a normalization constant of an unnormalized statistical model and thus the density function. In this thesis, we introduce noise-contrastive estimation generative adversarial networks (NCE-GAN) which enables explicit density estimation of a GAN model. And a new loss for the generator is proposed. We also propose residue-wise variants of auxiliary classifier GAN (AC-GAN) and Semi-supervised GAN to handle sequence information in a window. In our experiment, the conditional generative adversarial network (C-GAN), AC-GAN and Semi-supervised GAN were compared. And experiments done with improved conditions were invested. We identified a phenomenon of AC-GAN that distribution of its predicted angles is composed of unusual clusters. The distribution of the predicted angles of Semi-supervised GAN was most similar to the Ramachandran plot. We found that adding the output of the NCE as an additional input of the discriminator is helpful to stabilize the training of the GANs and to capture the detailed structures. Adding regression loss and using predicted angles by regression loss only model could improve the conditional generation performance of the C-GAN and AC-GAN.

Submitted 29 March, 2018; originally announced March 2018.

Comments: 72 pages, MSc thesis under the supervision of Assoc. Prof. Thomas Hamelryck and Asst. Prof. Wouter Boomsma

arXiv:1803.07850 [pdf]

Contribution of Data Categories to Readmission Prediction Accuracy

Authors: Wendong Ge, Hee Yeun Kim, Sonali Desai, Leonid Perlovsky, Alexander Turchin

Abstract: Identification of patients at high risk for readmission could help reduce morbidity and mortality as well as healthcare costs. Most of the existing studies on readmission prediction did not compare the contribution of data categories. In this study we analyzed relative contribution of 90,101 variables across 398,884 admission records corresponding to 163,468 patients, including patient demographics, historical hospitalization information, discharge disposition, diagnoses, procedures, medications and laboratory test results. We established an interpretable readmission prediction model based on Logistic Regression in scikit-learn, and added the available variables to the model one by one in order to analyze the influences of individual data categories on readmission prediction accuracy. Diagnosis related groups (c-statistic increment of 0.0933) and discharge disposition (c-statistic increment of 0.0269) were the strongest contributors to model accuracy. Additionally, we also identified the top ten contributing variables in every data category.

Submitted 22 March, 2018; v1 submitted 21 March, 2018; originally announced March 2018.

arXiv:1801.04919 [pdf]

How Criticality of Gene Regulatory Networks Affects the Resulting Morphogenesis under Genetic Perturbations

Authors: Hyobin Kim, Hiroki Sayama

Abstract: Whereas the relationship between criticality of gene regulatory networks (GRNs) and dynamics of GRNs at a single cell level has been vigorously studied, the relationship between the criticality of GRNs and system properties at a higher level has remained unexplored. Here we aim at revealing a potential role of criticality of GRNs at a multicellular level which are hard to uncover through the single-cell-level studies, especially from an evolutionary viewpoint. Our model simulated the growth of a cell population from a single seed cell. All the cells were assumed to have identical GRNs. We induced genetic perturbations to the GRN of the seed cell by adding, deleting, or switching a regulatory link between a pair of genes. From numerical simulations, we found that the criticality of GRNs facilitated the formation of nontrivial morphologies when the GRNs were critical in the presence of the evolutionary perturbations. Moreover, the criticality of GRNs produced topologically homogenous cell clusters by adjusting the spatial arrangements of cells, which led to the formation of nontrivial morphogenetic patterns. Our findings corresponded to an epigenetic viewpoint that heterogeneous and complex features emerge from homogeneous and less complex components through the interactions among them. Thus, our results imply that highly structured tissues or organs in morphogenesis of multicellular organisms might stem from the criticality of GRNs.

Submitted 25 January, 2018; v1 submitted 12 January, 2018; originally announced January 2018.

Comments: 34 pages, 17 figures, 1 table

arXiv:1711.07894 [pdf, other]

Quantifying Performance of Bipedal Standing with Multi-channel EMG

Authors: Yanan Sui, Kun ho Kim, Joel W. Burdick

Abstract: Spinal cord stimulation has enabled humans with motor complete spinal cord injury (SCI) to independently stand and recover some lost autonomic function. Quantifying the quality of bipedal standing under spinal stimulation is important for spinal rehabilitation therapies and for new strategies that seek to combine spinal stimulation and rehabilitative robots (such as exoskeletons) in real time feedback. To study the potential for automated electromyography (EMG) analysis in SCI, we evaluated the standing quality of paralyzed patients undergoing electrical spinal cord stimulation using both video and multi-channel surface EMG recordings during spinal stimulation therapy sessions. The quality of standing under different stimulation settings was quantified manually by experienced clinicians. By correlating features of the recorded EMG activity with the expert evaluations, we show that multi-channel EMG recording can provide accurate, fast, and robust estimation for the quality of bipedal standing in spinally stimulated SCI patients. Moreover, our analysis shows that the total number of EMG channels needed to effectively predict standing quality can be reduced while maintaining high estimation accuracy, which provides more flexibility for rehabilitation robotic systems to incorporate EMG recordings.

Submitted 21 November, 2017; originally announced November 2017.

Journal ref: IROS 2017

arXiv:1708.07882 [pdf, other]

The Role of Criticality of Gene Regulatory Networks in Morphogenesis

Authors: Hyobin Kim, Hiroki Sayama

Abstract: Gene regulatory network (GRN)-based morphogenetic models have recently gained an increasing attention. However, the relationship between microscopic properties of intracellular GRNs and macroscopic properties of morphogenetic systems has not been fully understood yet. Here we propose a theoretical morphogenetic model representing an aggregation of cells, and reveal the relationship between criticality of GRNs and morphogenetic pattern formation. In our model, the positions of the cells are determined by spring-mass-damper kinetics. Each cell has an identical Kauffman's $NK$ random Boolean network (RBN) as its GRN. We varied the properties of GRNs from ordered, through critical, to chaotic by adjusting node in-degree $K$. We randomly assigned four cell fates to the attractors of RBNs for cellular behaviors. By comparing diverse morphologies generated in our morphogenetic systems, we investigated what the role of the criticality of GRNs is in forming morphologies. We found that nontrivial spatial patterns were generated most frequently when GRNs were at criticality. Our finding indicates that the criticality of GRNs facilitates the formation of nontrivial morphologies in GRN-based morphogenetic systems.

Submitted 10 October, 2018; v1 submitted 12 August, 2017; originally announced August 2017.

Comments: 11 pages, 13 figures, 1 table; accepted for publication in IEEE Transactions on Cognitive and Developmental Systems

arXiv:1708.07880 [pdf]

doi 10.1098/rsta.2016.0358

How causal analysis can reveal autonomy in models of biological systems

Authors: William Marshall, Hyunju Kim, Sara I. Walker, Giulio Tononi, Larissa Albantakis

Abstract: Standard techniques for studying biological systems largely focus on their dynamical, or, more recently, their informational properties, usually taking either a reductionist or holistic perspective. Yet, studying only individual system elements or the dynamics of the system as a whole disregards the organisational structure of the system - whether there are subsets of elements with joint causes or effects, and whether the system is strongly integrated or composed of several loosely interacting components. Integrated information theory (IIT), offers a theoretical framework to (1) investigate the compositional cause-effect structure of a system, and to (2) identify causal borders of highly integrated elements comprising local maxima of intrinsic cause-effect power. Here we apply this comprehensive causal analysis to a Boolean network model of the fission yeast (Schizosaccharomyces pombe) cell-cycle. We demonstrate that this biological model features a non-trivial causal architecture, whose discovery may provide insights about the real cell cycle that could not be gained from holistic or reductionist approaches. We also show how some specific properties of this underlying causal architecture relate to the biological notion of autonomy. Ultimately, we suggest that analysing the causal organisation of a system, including key features like intrinsic control and stable causal borders, should prove relevant for distinguishing life from non-life, and thus could also illuminate the origin of life problem.

Submitted 25 August, 2017; originally announced August 2017.

Comments: 15 pages, 4 figures, to appear in Philosophical Transactions of the Royal Society A

arXiv:1707.05961 [pdf]

doi 10.1016/j.neuroimage.2009.05.036

Multidimensional classification of hippocampal shape features discriminates Alzheimer's disease and mild cognitive impairment from normal aging

Authors: Emilie Gerardin, Gaël Chételat, Marie Chupin, Rémi Cuingnet, Béatrice Desgranges, Ho-Sung Kim, Marc Niethammer, Bruno Dubois, Stéphane Lehéricy, Line Garnero, Francis Eustache, Olivier Colliot

Abstract: We describe a new method to automatically discriminate between patients with Alzheimer's disease (AD) or mild cognitive impairment (MCI) and elderly controls, based on multidimensional classification of hippocampal shape features. This approach uses spherical harmonics (SPHARM) coefficients to model the shape of the hippocampi, which are segmented from magnetic resonance images (MRI) using a fully automatic method that we previously developed. SPHARM coefficients are used as features in a classification procedure based on support vector machines (SVM). The most relevant features for classification are selected using a bagging strategy. We evaluate the accuracy of our method in a group of 23 patients with AD (10 males, 13 females, age $\pm$ standard-deviation (SD) = 73 $\pm$ 6 years, mini-mental score (MMS) = 24.4 $\pm$ 2.8), 23 patients with amnestic MCI (10 males, 13 females, age $\pm$ SD = 74 $\pm$ 8 years, MMS = 27.3 $\pm$ 1.4) and 25 elderly healthy controls (13 males, 12 females, age $\pm$ SD = 64 $\pm$ 8 years), using leave-one-out cross-validation. For AD vs controls, we obtain a correct classification rate of 94%, a sensitivity of 96%, and a specificity of 92%. For MCI vs controls, we obtain a classification rate of 83%, a sensitivity of 83%, and a specificity of 84%. This accuracy is superior to that of hippocampal volumetry and is comparable to recently published SVM-based whole-brain classification methods, which relied on a different strategy. This new method may become a useful tool to assist in the diagnosis of Alzheimer's disease.

Submitted 19 July, 2017; originally announced July 2017.

Comments: Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database

Journal ref: NeuroImage, 47 (4), pp.1476-86, 2009

arXiv:1704.01693 [pdf]

doi 10.1038/s41467-019-10729-5

Switch-like enhancement of epithelial-mesenchymal transition by YAP through feedback regulation of WT1 and small Rho-family GTPases

Authors: JinSeok Park, Deok-Ho Kim, Sagar R. Shah, Hong-Nam Kim, Kshitiz, David Ellison, Peter Kim, Kahp-Yang Suh, Alfredo Quiñones-Hinojosa, Andre Levchenko

Abstract: Collective cell migration is a hallmark of developmental and patho-physiological states, including wound healing and invasive cancer growth. The integrity of the expanding epithelial sheets can be influenced by extracellular cues, including cell-cell and cell-matrix interactions. We show the nano-scale topography of the extracellular matrix underlying epithelial cell layers can have a strong effect on the speed and morphology of the fronts of the expanding sheet triggering epithelial-mesenchymal transition (EMT). We further demonstrate that this behavior depends on the mechano-sensitivity of the transcription regulator YAP and two new feedback cross-regulation mechanisms: through Wilms Tumor-1 and E-cadherin, loosening cell-cell contacts, and through Rho GTPase family proteins, enhancing cell migration. These YAP-dependent regulatory feedback loops result in a switch-like change in the signaling and expression of EMT-related markers, leading to a robust enhancement in invasive epithelial sheet expansion, which might lead to a poorer clinical outcome in renal and other cancers.

Submitted 5 April, 2017; originally announced April 2017.

arXiv:1701.07940 [pdf]

Inferring clonal composition from multiple tumor biopsies

Authors: Matteo Manica, Hyunjae Ryan Kim, Roland Mathis, Philippe Chouvarine, Dorothea Rutishauser, Laura De Vargas Roditi, Bence Szalai, Ulrich Wagner, Kathrin Oehl, Karim Saba, Arati Pati, Julio Saez-Rodriguez, Angshumoy Roy, Donald W. Parsons, Peter J. Wild, María Rodríguez Martínez, Pavel Sumazin

Abstract: Explicit accounting for copy number alterations can dramatically improve mutation frequency estimates, leading to more accurate phylogeny reconstructions and subclone characterizations.

Submitted 22 November, 2019; v1 submitted 26 January, 2017; originally announced January 2017.

arXiv:1701.05278 [pdf, ps, other]

doi 10.3938/jkps.70.108

Climate change alters diffusion of forest pest: A model study

Authors: Woo Seong Jo, Hwang-Yong Kim, Beom Jun Kim

Abstract: Population dynamics with spatial information is applied to understand the spread of pests. We introduce a model describing how pests spread in discrete space. The number of pest descendants at each site is controlled by local information such as temperature, precipitation, and the density of pine trees. Our simulation leads to a pest spreading pattern comparable to the real data for pine needle gall midge in the past. We also simulate the model in two different climate conditions based on two different representative concentration pathways scenarios for the future. We observe that after an initial stage of a slow spread of pests, a sudden change in the spreading speed occurs, which is soon followed by a large-scale outbreak. We found that a future climate change causes the outbreak point to occur earlier and that the detailed spatio-temporal pattern of the spread depends on the source position from which the initial pest infection starts.

Submitted 18 January, 2017; originally announced January 2017.

Comments: 8 pages, 3 figures

Journal ref: Journal of the Korean Physical Society 70, 108-115 (2017)

arXiv:1612.08948 [pdf]

doi 10.1073/pnas.1700054114

A mechano-chemical feedback underlies co-existence of qualitatively distinct cell polarity patterns within diverse cell populations

Authors: JinSeok Park, William R. Holmes, Sung-Hoon Lee, Hong-Nam Kim, Deok-Ho Kim, Moon Kyu Kwak, Chiaochun Joanne Wang, Kahp-Yang Suh, Leah Edelstein-Keshet, Andre Levchenko

Abstract: Cell polarization and directional cell migration can display random, persistent and oscillatory dynamic patterns. However, it is not clear if these polarity patterns can be explained by the same underlying regulatory mechanism. Here, we show that random, persistent and oscillatory migration accompanied by polarization can simultaneously occur in populations of melanoma cells derived from tumors with different degrees of aggressiveness. We demonstrate that all these patterns and the probabilities of their occurrence are quantitatively accounted for by a simple mechanism involving a spatially distributed, mechano-chemical feedback coupling the dynamically changing extracellular matrix (ECM)-cell contacts to the activation of signaling downstream of the Rho-family small GTPases. This mechanism is supported by a predictive mathematical model and extensive experimental validation, and can explain previously reported results for diverse cell types. In melanoma, this mechanism also accounts for the effects of genetic and environmental perturbations, including mutations linked to invasive cell spread. The resulting mechanistic understanding of cell polarity quantitatively captures the relationship between population variability and phenotypic plasticity, with the potential to account for a wide variety of cell migration states in diverse pathological and physiological conditions.

Submitted 28 December, 2016; originally announced December 2016.

arXiv:1602.05652 [pdf, other]

doi 10.1016/j.bpj.2016.02.027

The effect of base pair mismatch on DNA strand displacement

Authors: Bo Broadwater, Harold Kim

Abstract: DNA strand displacement is a key reaction in DNA homologous recombination and DNA mismatch repair and is also heavily utilized in DNA-based computation and locomotion. Despite its ubiquity in science and engineering, sequence-dependent effects of displacement kinetics have not been extensively characterized. Here, we measured toehold-mediated strand displacement kinetics using single-molecule fluorescence in the presence of a single base pair mismatch. The apparent displacement rate varied significantly when the mismatch was introduced in the invading DNA strand. The rate generally decreased as the mismatch in the invader was encountered earlier in displacement. Our data indicate that a single base pair mismatch in the invader stalls branch migration, and displacement occurs via direct dissociation of the destabilized incumbent strand from the substrate strand. We combined both branch migration and direct dissociation into a model, which we term, the concurrent displacement model, and used the first passage time approach to quantitatively explain the salient features of the observed relationship. We also introduce the concept of splitting probabilities to justify that the concurrent model can be simplified into a three-step sequential model in the presence of an invader mismatch. We expect our model to become a powerful tool to design DNA-based reaction schemes with broad functionality.

Submitted 17 February, 2016; originally announced February 2016.

Journal ref: Biophysical Journal 110 (2016) 1476-1484

arXiv:1508.04174 [pdf, other]

New Scaling Relation for Information Transfer in Biological Networks

Authors: Hyunju Kim, Paul Davies, Sara Imari Walker

Abstract: Living systems are often described utilizing informational analogies. An important open question is whether information is merely a useful conceptual metaphor, or intrinsic to the operation of biological systems. To address this question, we provide a rigorous case study of the informational architecture of two representative biological networks: the Boolean network model for the cell-cycle regulatory network of the fission yeast S. pombe and that of the budding yeast S. cerevisiae. We compare our results for these biological networks to the same analysis performed on ensembles of two different types of random networks. We show that both biological networks share features in common that are not shared by either ensemble. In particular, the biological networks in our study, on average, process more information than the random networks. They also exhibit a scaling relation in information transferred between nodes that distinguishes them from either ensemble: even when compared to the ensemble of random networks that shares important topological properties, such as a scale-free structure. We show that the most biologically distinct regime of this scaling relation is associated with the dynamics and function of the biological networks. Information processing in biological networks is therefore interpreted as an emergent property of topology (causal structure) and dynamics (function). These results demonstrate quantitatively how the informational architecture of biologically evolved networks can distinguish them from other classes of network architecture that do not share the same informational properties.

Submitted 17 August, 2015; originally announced August 2015.

Showing 1–50 of 66 results for author: Kim, H