Search | arXiv e-print repository

The computational power of discrete chemical reaction networks with bounded executions

Abstract: Chemical reaction networks (CRNs) model systems where molecules interact according to a finite set of reactions such as $A + B \to C$, representing that if a molecule of $A$ and $B$ collide, they disappear and a molecule of $C$ is produced. CRNs can compute Boolean-valued predicates $φ:\mathbb{N}^d \to \{0,1\}$ and integer-valued functions $f:\mathbb{N}^d \to \mathbb{N}$; for instance $X_1 + X_2 \to Y$ computes the function $\min(x_1,x_2)$. We study the computational power of execution bounded CRNs, in which only a finite number of reactions can occur from the initial configuration (e.g., ruling out reversible reactions such as $A \rightleftharpoons B$). The power and composability of such CRNs depend crucially on some other modeling choices that do not affect the computational power of CRNs with unbounded executions, namely whether an initial leader is present, and whether (for predicates) all species are required to "vote" for the Boolean output. If the CRN starts with an initial leader, and can allow only the leader to vote, then all semilinear predicates and functions can be stably computed in $O(n \log n)$ parallel time by execution bounded CRNs. However, if no initial leader is allowed, all species vote, and the CRN is "noncollapsing" (does not shrink from initially large to final $O(1)$ size configurations), then execution bounded CRNs are severely limited, able to compute only eventually constant predicates. A key tool is to characterize execution bounded CRNs as precisely those with a nonnegative linear potential function that is strictly decreased by every reaction, a result that may be of independent interest.

Submitted 12 August, 2024; v1 submitted 14 May, 2024; originally announced May 2024.
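The two ideas in the abstract above, the reaction $X_1 + X_2 \to Y$ computing $\min(x_1,x_2)$ and a linear potential that every reaction strictly decreases, can be illustrated with a minimal Python sketch. The data representation, the function names, and the choice of weight 1 per species are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

# Assumed representation: a reaction is a (reactants, products) pair of counts.
MIN_CRN = [(Counter({"X1": 1, "X2": 1}), Counter({"Y": 1}))]
REVERSIBLE = [
    (Counter({"A": 1}), Counter({"B": 1})),
    (Counter({"B": 1}), Counter({"A": 1})),
]


def run_to_completion(config, reactions, max_steps=10_000):
    """Fire enabled reactions until none applies (or max_steps is hit)."""
    config = Counter(config)
    steps = 0
    while steps < max_steps:
        enabled = [
            (r, p) for r, p in reactions
            if all(config[s] >= n for s, n in r.items())
        ]
        if not enabled:
            break
        reactants, products = enabled[0]
        config = config - reactants + products
        steps += 1
    return config, steps


def strictly_decreases(weights, reactions):
    """Check that a linear potential with these weights drops on every reaction."""
    def potential(counts):
        return sum(weights[s] * n for s, n in counts.items())
    return all(potential(p) < potential(r) for r, p in reactions)


final, steps = run_to_completion({"X1": 5, "X2": 3}, MIN_CRN)
print(final["Y"], steps)
print(strictly_decreases({"X1": 1, "X2": 1, "Y": 1}, MIN_CRN))
print(strictly_decreases({"A": 1, "B": 1}, REVERSIBLE))
```

With inputs 5 and 3 the run stops after three reactions with three copies of `Y`. For the reversible pair, no single weight assignment can make both directions strictly decreasing, which is consistent with the abstract's remark that reversible reactions are ruled out.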

arXiv:2403.07099 [pdf, other]

Rate-independent continuous inhibitory chemical reaction networks are Turing-universal

Authors: Kim Calabrese, David Doty

Abstract: We study the model of continuous chemical reaction networks (CRNs), consisting of reactions such as $A+B \to C+D$ that can transform some continuous, nonnegative real-valued quantity (called a *concentration*) of chemical species $A$ and $B$ into equal concentrations of $C$ and $D$. Such a reaction can occur from any state in which both reactants $A$ and $B$ are present, i.e., have positive concentration. We modify the model to allow *inhibitors*, for instance, reaction $A+B \to^{I} C+D$ can occur only if the reactants $A$ and $B$ are present and the inhibitor $I$ is absent. The computational power of non-inhibitory CRNs has been studied. For instance, the reaction $X_1+X_2 \to Y$ can be thought to compute the function $f(x_1,x_2) = \min(x_1,x_2)$. Under an "adversarial" model in which reaction rates can vary arbitrarily over time, it was found that exactly the continuous, piecewise linear functions can be computed, ruling out even simple functions such as $f(x) = x^2$. In contrast, in this paper we show that inhibitory CRNs can compute any computable function $f:\mathbb{N}\to\mathbb{N}$.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2311.17166 [pdf, other]

Is stochastic thermodynamics the key to understanding the energy costs of computation?

Authors: David Wolpert, Jan Korbel, Christopher Lynn, Farita Tasnim, Joshua Grochow, Gülce Kardeş, James Aimone, Vijay Balasubramanian, Eric de Giuli, David Doty, Nahuel Freitas, Matteo Marsili, Thomas E. Ouldridge, Andrea Richa, Paul Riechers, Édgar Roldán, Brenda Rubenstein, Zoltan Toroczkai, Joseph Paradiso

Abstract: The relationship between the thermodynamic and computational characteristics of dynamical physical systems has been a major theoretical interest since at least the 19th century, and has been of increasing practical importance as the energetic cost of digital devices has exploded over the last half century. One of the most important thermodynamic features of real-world computers is that they operate very far from thermal equilibrium, in finite time, with many quickly (co-)evolving degrees of freedom. Such computers also must almost always obey multiple physical constraints on how they work. For example, all modern digital computers are periodic processes, governed by a global clock. Another example is that many computers are modular, hierarchical systems, with strong restrictions on the connectivity of their subsystems. These properties hold both for naturally occurring computers, like brains or Eukaryotic cells, and for digital systems. These features of real-world computers are absent in 20th century analyses of the thermodynamics of computational processes, which focused on quasi-statically slow processes. However, the field of stochastic thermodynamics has been developed in the last few decades - and it provides the formal tools for analyzing systems that have exactly these features of real-world computers. We argue here that these tools, together with other tools currently being developed in stochastic thermodynamics, may help us understand at a far deeper level just how the fundamental physical properties of dynamic systems are related to the computation that they perform.

Submitted 29 August, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

Comments: Updated version

arXiv:2309.06957 [pdf, other]

Harvesting Brownian Motion: Zero Energy Computational Sampling

Authors: David Doty, Niels Kornerup, Austin Luchsinger, Leo Orshansky, David Soloveichik, Damien Woods

Abstract: The key factor currently limiting the advancement of computational power of electronic computation is no longer the manufacturing density and speed of components, but rather their high energy consumption. While it has been widely argued that reversible computation can escape the fundamental Landauer limit of $k_B T\ln(2)$ Joules per irreversible computational step, there is disagreement around whether indefinitely reusable computation can be achieved without energy dissipation. Here we focus on the relatively simpler context of sampling problems, which take no input and so avoid modeling the energy costs of the observer perturbing the machine to change its input. Given an algorithm $A$ for generating samples from a distribution, we desire a device that can perpetually generate samples from that distribution driven entirely by Brownian motion. We show that such a device can efficiently execute algorithm $A$ in the sense that we must wait only $O(\text{time}(A)^2)$ between samples. We consider two output models: Las Vegas, which samples from the exact probability distribution every $4$ tries in expectation, and Monte Carlo, in which every try succeeds but the distribution is only approximated. We base our model on continuous-time random walks over the state space graph of a general computational machine, with a space-bounded Turing machine as one instantiation. The problem of sampling a computationally complex probability distribution with no energy dissipation informs our understanding of the energy requirements of computation, and may lead to more energy efficient randomized algorithms.

Submitted 28 August, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: 20 pages, 6 figures

MSC Class: 60J28 ACM Class: G.3; F.1.m
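For a sense of scale, the Landauer limit $k_B T\ln(2)$ quoted in the abstract above can be evaluated numerically. The sketch below uses the exact SI value of the Boltzmann constant; the temperature of 300 K is an assumed room-temperature figure chosen for illustration, and the function name is hypothetical.

```python
import math

# Boltzmann constant in joules per kelvin (exact value in the SI since 2019).
K_B = 1.380649e-23


def landauer_limit_joules(temperature_kelvin):
    """Return k_B * T * ln(2), the Landauer limit per irreversible step."""
    return K_B * temperature_kelvin * math.log(2)


# Assumed temperature for illustration: 300 K, roughly room temperature.
print(landauer_limit_joules(300.0))
```

At 300 K this comes to roughly $2.87 \times 10^{-21}$ Joules per irreversible step.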

arXiv:2307.01939 [pdf, other]

Optimal Information Encoding in Chemical Reaction Networks

Authors: Austin Luchsinger, David Doty, David Soloveichik

Abstract: Discrete chemical reaction networks formalize the interactions of molecular species in a well-mixed solution as stochastic events. Given their basic mathematical and physical role, the computational power of chemical reaction networks has been widely studied in the molecular programming and distributed computing communities. While for Turing-universal systems there is a universal measure of optimal information encoding based on Kolmogorov complexity, chemical reaction networks are not Turing universal unless error and unbounded molecular counts are permitted. Nonetheless, here we show that the optimal number of reactions to generate a specific count $x \in \mathbb{N}$ with probability $1$ is asymptotically equal to a "space-aware" version of the Kolmogorov complexity of $x$, defined as $\mathrm{\widetilde{K}s}(x) = \min_p\left\{\lvert p \rvert / \log \lvert p \rvert + \log(\texttt{space}(\mathcal{U}(p))) : \mathcal{U}(p) = x \right\}$, where $p$ is a program for universal Turing machine $\mathcal{U}$. This version of Kolmogorov complexity incorporates not just the length of the shortest program for generating $x$, but also the space usage of that program. Probability $1$ computation is captured by the standard notion of stable computation from distributed computing, but we limit our consideration to chemical reaction networks obeying a stronger constraint: they "know when they are done" in the sense that they produce a special species to indicate completion. As part of our results, we develop a module for encoding and unpacking any $b$ bits of information via $O(b/\log{b})$ reactions, which is information-theoretically optimal for incompressible information. Our work provides one answer to the question of how succinctly chemical self-organization can be encoded -- in the sense of generating precise molecular counts of species as the desired state.

Submitted 4 July, 2023; originally announced July 2023.

ACM Class: F.1.1

arXiv:2307.01550 [pdf, other]

Thermodynamically Driven Signal Amplification

Authors: Joshua Petrack, David Soloveichik, David Doty

Abstract: The field of chemical computation attempts to model computational behavior that arises when molecules, typically nucleic acids, are mixed together. Thermodynamic binding networks (TBNs) are a highly abstracted model that focuses on which molecules are bound to each other in a "thermodynamically stable" sense. Stability is measured based only on how many bonds are formed and how many total complexes are in a configuration, without focusing on how molecules are binding or how they became bound. We study the problem of signal amplification: detecting a small quantity of some molecule and amplifying its signal to something more easily detectable. This problem has natural applications such as disease diagnosis. By focusing on thermodynamically favored outcomes, we seek to design chemical systems that perform the task of signal amplification robustly without relying on kinetic pathways that can be error prone and require highly controlled conditions (e.g., PCR amplification). It might appear that a small change in concentrations can result in only small changes to the thermodynamic equilibrium of a molecular system. However, we show that it is possible to design a TBN that can "exponentially amplify" a signal represented by a single copy of a monomer called the analyte: this TBN has exactly one stable state before adding the analyte and exactly one stable state afterward, and those two states "look very different" from each other. We also show a corresponding negative result: a doubly exponential upper bound, meaning that there is no TBN that can amplify a signal by an amount more than doubly exponential in the number and sizes of different molecules that comprise it. Our work informs the fundamental question of how a thermodynamic equilibrium can change as a result of a small change to the system (adding a single molecule copy).

Submitted 4 July, 2023; originally announced July 2023.

Comments: 25 pages including abstract and appendix. 7 figures. To be published in proceedings of DNA29

arXiv:2202.12864 [pdf, other]

Dynamic size counting in population protocols

Authors: David Doty, Mahsa Eftekhari

Abstract: The population protocol model describes a network of anonymous agents that interact asynchronously in pairs chosen at random. Each agent starts in the same initial state $s$. We introduce the *dynamic size counting* problem: approximately counting the number of agents in the presence of an adversary who at any time can remove any number of agents or add any number of new agents in state $s$. A valid solution requires that after each addition/removal event, resulting in population size $n$, with high probability each agent "quickly" computes the same constant-factor estimate of the value $\log_2 n$ (how quickly is called the *convergence* time), which remains the output of every agent for as long as possible (the *holding* time). Since the adversary can remove agents, the holding time is necessarily finite: even after the adversary stops altering the population, it is impossible to *stabilize* to an output that never again changes. We first show that a protocol solves the dynamic size counting problem if and only if it solves the *loosely-stabilizing counting* problem: that of estimating $\log n$ in a *fixed-size* population, but where the adversary can initialize each agent in an arbitrary state, with the same convergence time and holding time. We then show a protocol solving the loosely-stabilizing counting problem with the following guarantees: if the population size is $n$, $M$ is the largest initial estimate of $\log n$, and $s$ is the maximum integer initially stored in any field of the agents' memory, we have expected convergence time $O(\log n + \log M)$, expected polynomial holding time, and expected memory usage of $O(\log^2 (s) + (\log \log n)^2)$ bits. Interpreted as a dynamic size counting protocol, when changing from population size $n_{prev}$ to $n_{next}$, the convergence time is $O(\log n_{next} + \log \log n_{prev})$.

Submitted 25 February, 2022; originally announced February 2022.

MSC Class: 68M18; 68W15 ACM Class: F.1; F.2.2

arXiv:2107.13681 [pdf, other]

doi 10.1145/3590776

Rate-Independent Computation in Continuous Chemical Reaction Networks

Authors: Ho-Lin Chen, David Doty, Wyatt Reeves, David Soloveichik

Abstract: Coupled chemical interactions in a well-mixed solution are commonly formalized as chemical reaction networks (CRNs). However, despite the widespread use of CRNs in the natural sciences, the range of computational behaviors exhibited by CRNs is not well understood. Here we study the following problem: what functions $f:\mathbb{R}^k \to \mathbb{R}$ can be computed by a CRN, in which the CRN eventually produces the correct amount of the "output" molecule, no matter the rate at which reactions proceed? This captures a previously unexplored, but very natural class of computations: for example, the reaction $X_1 + X_2 \to Y$ can be thought to compute the function $y = \min(x_1, x_2)$. Such a CRN is robust in the sense that it is correct no matter the kinetic model of chemistry, so long as it respects the stoichiometric constraints. We develop a reachability relation based on "what could happen" if reaction rates can vary arbitrarily over time. We define *stable computation* analogously to probability 1 computation in distributed computing, and connect it with a seemingly stronger notion of rate-independent computation based on convergence under a wide class of generalized rate laws. We also consider the "dual-rail representation" that can represent negative values as the difference of two concentrations and allows the composition of CRN modules. We prove that a function is rate-independently computable if and only if it is piecewise linear (with rational coefficients) and continuous (dual-rail representation), or non-negative with discontinuities occurring only when some inputs switch from zero to positive (direct representation). The many contexts where continuous piecewise linear functions are powerful targets for implementation, combined with the systematic construction we develop for computing these functions, demonstrate the potential of rate-independent chemical computation.

Submitted 7 April, 2023; v1 submitted 28 July, 2021; originally announced July 2021.

Comments: accepted to JACM (https://doi.org/10.1145/3590776); preliminary version appeared in ITCS 2014: http://doi.org/10.1145/2554797.2554827
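The rate-independence of $X_1 + X_2 \to Y$ described in the abstract above can be illustrated with a small discretized sketch: however much flux an arbitrary schedule requests at each step, the reaction can only consume what both reactants still have, so the output settles at $\min(x_1, x_2)$ once enough total flux has been requested. The function name and the flux-schedule representation are assumptions for illustration; this is not the paper's formal reachability relation.

```python
import random


def run_min_crn(x1, x2, fluxes):
    """Apply X1 + X2 -> Y with an arbitrary per-step flux schedule.

    Each requested flux is capped by the remaining reactant concentrations,
    which keeps all concentrations nonnegative.
    """
    y = 0.0
    for requested in fluxes:
        amount = min(requested, x1, x2)
        x1 -= amount
        x2 -= amount
        y += amount
    return x1, x2, y


# Two very different (assumed) schedules applied to the same inputs.
rng = random.Random(0)
erratic = [rng.uniform(0.0, 1.0) for _ in range(100)]
bursty = [0.4, 0.0, 2.0, 0.1]

print(run_min_crn(3.0, 1.5, erratic))
print(run_min_crn(3.0, 1.5, bursty))
```

Both schedules exhaust the scarcer reactant, so the final amount of `Y` agrees with $\min(3.0, 1.5)$ up to floating-point rounding.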

arXiv:2106.10201 [pdf, other]

A time and space optimal stable population protocol solving exact majority

Authors: David Doty, Mahsa Eftekhari, Leszek Gąsieniec, Eric Severson, Grzegorz Stachowiak, Przemysław Uznański

Abstract: We study population protocols, a model of distributed computing appropriate for modeling well-mixed chemical reaction networks and other physical systems where agents exchange information in pairwise interactions, but have no control over their schedule of interaction partners. The well-studied *majority* problem is that of determining in an initial population of $n$ agents, each with one of two opinions $A$ or $B$, whether there are more $A$, more $B$, or a tie. A *stable* protocol solves this problem with probability 1 by eventually entering a configuration in which all agents agree on a correct consensus decision of $\mathsf{A}$, $\mathsf{B}$, or $\mathsf{T}$, from which the consensus cannot change. We describe a protocol that solves this problem using $O(\log n)$ states ($\log \log n + O(1)$ bits of memory) and optimal expected time $O(\log n)$. The number of states $O(\log n)$ is known to be optimal for the class of polylogarithmic time stable protocols that are "output dominant" and "monotone". These are two natural constraints satisfied by our protocol, making it simultaneously time- and state-optimal for that class. We introduce a key technique called a "fixed resolution clock" to achieve partial synchronization. Our protocol is *nonuniform*: the transition function has the value $\left \lceil {\log n} \right \rceil$ encoded in it. We show that the protocol can be modified to be uniform, while increasing the state complexity to $Θ(\log n \log \log n)$.

Submitted 20 January, 2022; v1 submitted 4 June, 2021; originally announced June 2021.

Comments: Applied FOCS reviewers' comments

ACM Class: F.1; F.2.2

Journal ref: FOCS 2021: Proceedings of the 62nd Annual IEEE Symposium on Foundations of Computer Science, Feb 2022

arXiv:2105.08559 [pdf, other]

Simulating 3-symbol Turing machines with SIMD||DNA

Authors: David Doty, Aaron Ong

Abstract: SIMD||DNA is a model of DNA strand displacement allowing parallel in-memory computation on DNA storage. We show how to simulate an arbitrary 3-symbol space-bounded Turing machine with a SIMD||DNA program, giving a more direct and efficient route to general-purpose information manipulation on DNA storage than the Rule 110 simulation of [Wang, Chalk, Soloveichik, DNA 2019]. We also develop software that can simulate SIMD||DNA programs and produce SVG figures.

Submitted 15 February, 2022; v1 submitted 18 May, 2021; originally announced May 2021.

arXiv:2105.05408 [pdf, ps, other]

doi 10.1016/j.tcs.2021.08.038

A survey of size counting in population protocols

Authors: David Doty, Mahsa Eftekhari

Abstract: The population protocol model describes a network of $n$ anonymous agents who cannot control with whom they interact. The agents collectively solve some computational problem through random pairwise interactions, each agent updating its own state in response to seeing the state of the other agent. They are equivalent to the model of chemical reaction networks, describing abstract chemical reactions such as $A+B \rightarrow C+D$, when the latter is subject to the restriction that all reactions have two reactants and two products, and all rate constants are 1. The counting problem is that of designing a protocol so that $n$ agents, all starting in the same state, eventually converge to states where each agent encodes in its state an exact or approximate description of population size $n$. In this survey paper, we describe recent algorithmic advances on the counting problem.

Submitted 22 October, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

MSC Class: 68M18; 68W15

arXiv:2105.04702 [pdf, other]

ppsim: A software package for efficiently simulating and visualizing population protocols

Authors: David Doty, Eric Severson

Abstract: We introduce ppsim, a software package for efficiently simulating population protocols, a widely-studied subclass of chemical reaction networks (CRNs) in which all reactions have two reactants and two products. Each step in the dynamics involves picking a uniform random pair from a population of $n$ molecules to collide and have a (potentially null) reaction. In a recent breakthrough, Berenbrink, Hammer, Kaaser, Meyer, Penschuck, and Tran [ESA 2020] discovered a population protocol simulation algorithm quadratically faster than the naive algorithm, simulating $Θ(\sqrt{n})$ reactions in *constant* time (independently of $n$, though the time scales with the number of species), while preserving the *exact* stochastic dynamics. ppsim implements this algorithm, with a tightly optimized Cython implementation that can exactly simulate hundreds of billions of reactions in seconds. It dynamically switches to the CRN Gillespie algorithm for efficiency gains when the number of applicable reactions in a configuration becomes small. As a Python library, ppsim also includes many useful tools for data visualization in Jupyter notebooks, allowing robust visualization of time dynamics such as histogram plots at time snapshots and averaging repeated trials. Finally, we give a framework that takes any CRN with only bimolecular (2 reactant, 2 product) or unimolecular (1 reactant, 1 product) reactions, with arbitrary rate constants, and compiles it into a continuous-time population protocol. This lets ppsim exactly sample from the chemical master equation (unlike approximate heuristics such as tau-leaping or LNA), while achieving asymptotic gains in running time. In linked Jupyter notebooks, we demonstrate the efficacy of the tool on some protocols of interest in molecular programming, including the approximate majority CRN and CRN models of DNA strand displacement reactions.

Submitted 1 July, 2021; v1 submitted 10 May, 2021; originally announced May 2021.
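For contrast with the fast algorithm mentioned in the abstract above, the naive dynamics (pick a uniform random pair, apply a possibly null transition) can be sketched in plain Python. This sketch does not use ppsim or its API. The transition table is one common formulation of a three-state approximate majority protocol and is an assumption here, as are the function and variable names.

```python
import random
from collections import Counter

# Assumed rule set: opposing opinions cancel to "undecided" (U),
# and a decided agent converts an undecided one. Other pairs are null.
TRANSITIONS = {
    ("A", "B"): ("U", "U"),
    ("B", "A"): ("U", "U"),
    ("A", "U"): ("A", "A"),
    ("U", "A"): ("A", "A"),
    ("B", "U"): ("B", "B"),
    ("U", "B"): ("B", "B"),
}


def simulate_naive(initial_counts, num_interactions, seed=0):
    """Naive simulation: one uniformly random pair of agents per step."""
    rng = random.Random(seed)
    agents = [s for s, c in initial_counts.items() for _ in range(c)]
    for _ in range(num_interactions):
        i, j = rng.sample(range(len(agents)), 2)
        pair = (agents[i], agents[j])
        if pair in TRANSITIONS:
            agents[i], agents[j] = TRANSITIONS[pair]
    return Counter(agents)


print(simulate_naive({"A": 70, "B": 30}, num_interactions=20_000))
```

Each interaction costs one loop iteration here, which is the baseline that the abstract describes the batched algorithm as improving on quadratically.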

arXiv:2012.15800

A stable majority population protocol using logarithmic time and states

Authors: David Doty, Mahsa Eftekhari, Eric Severson

Abstract: We study population protocols, a model of distributed computing appropriate for modeling well-mixed chemical reaction networks and other physical systems where agents exchange information in pairwise interactions, but have no control over their schedule of interaction partners. The well-studied *majority* problem is that of determining in an initial population of $n$ agents, each with one of two opinions $A$ or $B$, whether there are more $A$, more $B$, or a tie. A *stable* protocol solves this problem with probability 1 by eventually entering a configuration in which all agents agree on a correct consensus decision of $A$, $B$, or $T$, from which the consensus cannot change. We describe a protocol that solves this problem using $O(\log n)$ states ($\log \log n + O(1)$ bits of memory) and optimal expected time $O(\log n)$. The number of states $O(\log n)$ is known to be optimal for the class of stable protocols that are "output dominant" and "monotone". These are two natural constraints satisfied by our protocol, making it state-optimal for that class. We use, and develop novel analysis of, a key technique called a "fixed resolution clock" due to Gasieniec, Stachowiak, and Uznanski, who showed a majority protocol using $O(\log n)$ time and states that has a positive probability of error. Our protocol is *nonuniform*: the transition function has the value $\left \lceil {\log n} \right \rceil$ encoded in it. We show that the protocol can be modified to be uniform, while increasing the state complexity to $Θ(\log n \log \log n)$.

Submitted 20 June, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

Comments: We combined this paper with arXiv:2011.07392 and have a new updated version at arXiv:2106.10201

arXiv:2011.10677 [pdf, other]

Computing properties of thermodynamic binding networks: An integer programming approach

Authors: David Haley, David Doty

Abstract: The thermodynamic binding networks (TBN) model is a tool for studying engineered molecular systems. The TBN model allows one to reason about their behavior through a simplified abstraction that ignores details about molecular composition, focusing on two key determinants of a system's energetics common to any chemical substrate: how many molecular bonds are formed, and how many separate complexes exist in the system. We formulate as an integer program the NP-hard problem of computing stable (a.k.a., minimum energy) configurations of a TBN: those configurations that maximize the number of bonds and complexes. We provide open-source software solving this integer program. We give empirical evidence that this approach enables dramatically faster computation of TBN stable configurations than previous approaches based on SAT solvers. Furthermore, unlike SAT-based approaches, our integer programming formulation can reason about TBNs in which some molecules have unbounded counts. These improvements in turn allow us to efficiently automate verification of desired properties of practical TBNs. Finally, we show that the TBN has a natural representation with a unique Hilbert basis describing the "fundamental components" out of which locally minimal energy configurations are composed. This characterization helps verify correctness of not only stable configurations, but entire "kinetic pathways" in a TBN.

Submitted 11 May, 2021; v1 submitted 20 November, 2020; originally announced November 2020.

arXiv:2005.11841 [pdf, other]

scadnano: A browser-based, scriptable tool for designing DNA nanostructures

Authors: David Doty, Benjamin L Lee, Tristan Stérin

Abstract: We introduce $\textit{scadnano}$ (https://scadnano.org) (short for "scriptable cadnano"), a computational tool for designing synthetic DNA structures. Its design is based heavily on cadnano, the most widely-used software for designing DNA origami, with three main differences: 1. scadnano runs entirely in the browser, with $\textit{no software installation}$ required. 2. scadnano designs, while they can be edited manually, can also be created and edited by a $\textit{well-documented Python scripting library}$, to help automate tedious tasks. 3. The scadnano file format is $\textit{easily human-readable}$. This goal is closely aligned with the scripting library, intended to be helpful when debugging scripts or interfacing with other software. The format is also somewhat more expressive than that of cadnano, able to describe a broader range of DNA structures than just DNA origami.

Submitted 16 July, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

Comments: accepted to DNA 2020 (26th International Meeting on DNA Computing and Molecular Programming)

arXiv:2003.09532 [pdf]

doi 10.4230/LIPIcs.DISC.2020.6

Message complexity of population protocols

Authors: Talley Amir, James Aspnes, David Doty, Mahsa Eftekhari, Eric Severson

Abstract: The standard population protocol model assumes that when two agents interact, each observes the entire state of the other agent. We initiate the study of $\textit{message complexity}$ for population protocols, where the state of an agent is divided into an externally-visible $\textit{message}$ and an internal component, where only the message can be observed by the other agent in an interaction. We consider the case of $O(1)$ message complexity. When time is unrestricted, we obtain an exact characterization of the stably computable predicates based on the number of internal states $s(n)$: If $s(n) = o(n)$ then the protocol computes semilinear predicates (unlike the original model, which can compute non-semilinear predicates with $s(n) = O(\log n)$), and otherwise it computes a predicate decidable by a nondeterministic $O(n \log s(n))$-space-bounded Turing machine. We then introduce novel $O(\mathrm{polylog}(n))$ expected time protocols for junta/leader election and general purpose broadcast correct with high probability, and approximate and exact population size counting correct with probability 1. Finally, we show that the main constraint on the power of bounded-message-size protocols is the size of the internal states: with unbounded internal states, any computable function can be computed with probability 1 in the limit by a protocol that uses only $\textit{1-bit}$ messages.

Submitted 23 September, 2020; v1 submitted 20 March, 2020; originally announced March 2020.

ACM Class: F.1.1

arXiv:1907.06068 [pdf, other]

Time-optimal self-stabilizing leader election in population protocols

Authors: Janna Burman, Ho-Lin Chen, Hsueh-Ping Chen, David Doty, Thomas Nowak, Eric Severson, Chuan Xu

Abstract: We consider the standard population protocol model, where (a priori) indistinguishable and anonymous agents interact in pairs according to uniformly random scheduling. The self-stabilizing leader election problem requires the protocol to converge on a single leader agent from any possible initial configuration. We initiate the study of time complexity of population protocols solving this problem in its original setting: with probability 1, in a complete communication graph. The only previously known protocol by Cai, Izumi, and Wada [Theor. Comput. Syst. 50] runs in expected parallel time $Θ(n^2)$ and has the optimal number of $n$ states in a population of $n$ agents. The existing protocol has the additional property that it becomes silent, i.e., the agents' states eventually stop changing. Observing that any silent protocol solving self-stabilizing leader election requires $Ω(n)$ expected parallel time, we introduce a silent protocol that uses optimal $O(n)$ parallel time and states. Without any silence constraints, we show that it is possible to solve self-stabilizing leader election in asymptotically optimal expected parallel time of $O(\log n)$, but using at least exponential states (a quasi-polynomial number of bits). All of our protocols (and also that of Cai et al.) work by solving the more difficult ranking problem: assigning agents the ranks $1,\ldots,n$.

Submitted 28 November, 2021; v1 submitted 13 July, 2019; originally announced July 2019.

Comments: fixed typo in Figure 2

Journal ref: PODC 2021: Proceedings of the 2021 ACM Symposium on Principles of Distributed Computing, July 2021, pages 33-44

arXiv:1903.02637 [pdf, other]

Composable computation in discrete chemical reaction networks

Authors: Eric E. Severson, David Haley, David Doty

Abstract: We study the composability of discrete chemical reaction networks (CRNs) that stably compute (i.e., with probability 0 of error) integer-valued functions $f:\mathbb{N}^d\to\mathbb{N}$. We consider output-oblivious CRNs in which the output species is never a reactant (input) to any reaction. The class of output-oblivious CRNs is fundamental, appearing in earlier studies of CRN computation, because it is precisely the class of CRNs that can be composed by simply renaming the output of the upstream CRN to match the input of the downstream CRN. Our main theorem precisely characterizes the functions $f$ stably computable by output-oblivious CRNs with an initial leader. The key necessary condition is that for sufficiently large inputs, $f$ is the minimum of a finite number of nondecreasing quilt-affine functions. (An affine function is linear with a constant offset; a quilt-affine function is linear with a periodic offset).

Submitted 31 May, 2019; v1 submitted 26 February, 2019; originally announced March 2019.
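
The parenthetical definition of a quilt-affine function (linear with a periodic offset) can be made concrete with a small sketch. The helper name `quilt_affine` and the choice of $\lfloor 3x/2 \rfloor$ as the example are illustrative assumptions, not taken from the abstract; the sketch only checks that this one function decomposes into a linear term plus an offset that depends on $x \bmod 2$.

```python
from fractions import Fraction

def quilt_affine(gradient, period, offsets):
    """Build g(x) = gradient * x + offsets[x mod period] (hypothetical helper)."""
    def g(x):
        return gradient * x + offsets[x % period]
    return g

# Illustrative example: floor(3x/2) = (3/2) x + B(x mod 2), with B(0) = 0, B(1) = -1/2.
floor_three_halves = quilt_affine(Fraction(3, 2), 2, [Fraction(0), Fraction(-1, 2)])

values = [floor_three_halves(x) for x in range(6)]
print(values)
```

An affine function is the special case with period 1, where the offset is a single constant.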

arXiv:1811.01235 [pdf, other]

doi 10.4230/LIPIcs.ICALP.2017.141

Hardness of computing and approximating predicates and functions with leaderless population protocols

Authors: Amanda Belleville, David Doty, David Soloveichik

Abstract: Population protocols are a distributed computing model appropriate for describing massive numbers of agents with limited computational power. A population protocol "has an initial leader" if every valid initial configuration contains a single agent in a special "leader" state that helps to coordinate the computation. Although the class of predicates and functions computable with probability 1 is the same whether or not there is an initial leader (semilinear functions and predicates), it is not known whether a leader is necessary for fast computation. Efficient population protocols are generally defined as those computing in polylogarithmic in $n$ (parallel) time. We consider leaderless population protocols, regarding the computation finished when a configuration is reached from which a different output is no longer reachable. In this setting we show that a wide class of functions and predicates computable by population protocols are not efficiently computable (they require at least linear time to stabilize on a correct answer), nor are some linear functions even efficiently approximable. For example, the widely studied parity, majority, and equality predicates cannot be computed in sublinear time. Moreover, it requires at least linear time for a population protocol even to approximate any linear function with a coefficient outside of $\mathbb{N}$: for sufficiently small $γ> 0$, the output of a sublinear time protocol can stabilize outside the interval $f(m) (1 \pm γ)$ on infinitely many inputs $m$. We also show that it requires linear time to exactly compute a wide range of semilinear functions (e.g., $f(m)=m$ if $m$ is even and $2m$ if $m$ is odd). Finally, we show that with a sufficiently large value of $γ$, a population protocol can approximate any linear $f$ with nonnegative rational coefficients, within approximation factor $γ$, in $O(\log n)$ time.

Submitted 3 November, 2018; originally announced November 2018.

Comments: published in Proceedings of ICALP 2017

arXiv:1810.12889 [pdf, other]

Programming Substrate-Independent Kinetic Barriers with Thermodynamic Binding Networks

Authors: Keenan Breik, Cameron Chalk, David Doty, David Haley, David Soloveichik

Abstract: Engineering molecular systems that exhibit complex behavior requires the design of kinetic barriers. For example, an effective catalytic pathway must have a large barrier when the catalyst is absent. While programming such energy barriers seems to require knowledge of the specific molecular substrate, we develop a novel substrate-independent approach. We extend the recently-developed model known as thermodynamic binding networks, demonstrating programmable kinetic barriers that arise solely from the thermodynamic driving forces of bond formation and the configurational entropy of forming separate complexes. Our kinetic model makes relatively weak assumptions, which implies that energy barriers predicted by our model would exist in a wide variety of systems and conditions. We demonstrate that our model is robust by showing that several variations in its definition result in equivalent energy barriers. We apply this model to design catalytic systems with an arbitrarily large energy barrier to uncatalyzed reactions. Our results could yield robust amplifiers using DNA strand displacement, a popular technology for engineering synthetic reaction pathways, and suggest design strategies for preventing undesired kinetic behavior in a variety of molecular systems.

Submitted 27 January, 2020; v1 submitted 30 October, 2018; originally announced October 2018.

arXiv:1808.08913 [pdf, other]

Efficient size estimation and impossibility of termination in uniform dense population protocols

Authors: David Doty, Mahsa Eftekhari

Abstract: We study uniform population protocols: networks of anonymous agents whose pairwise interactions are chosen at random, where each agent uses an identical transition algorithm that does not depend on the population size $n$. Many existing polylog$(n)$ time protocols for leader election and majority computation are nonuniform: to operate correctly, they require all agents to be initialized with an approximate estimate of $n$ (specifically, the exact value $\lfloor \log n \rfloor$). Our first main result is a uniform protocol for calculating $\log(n) \pm O(1)$ with high probability in $O(\log^2 n)$ time and $O(\log^4 n)$ states ($O(\log \log n)$ bits of memory). The protocol is converging but not terminating: it does not signal when the estimate is close to the true value of $\log n$. If it could be made terminating, this would allow composition with protocols, such as those for leader election or majority, that require a size estimate initially, to make them uniform (though with a small probability of failure). We do show how our main protocol can be indirectly composed with others in a simple and elegant way, based on the leaderless phase clock, demonstrating that those protocols can in fact be made uniform. However, our second main result implies that the protocol cannot be made terminating, a consequence of a much stronger result: a uniform protocol for any task requiring more than constant time cannot be terminating even with probability bounded above 0, if infinitely many initial configurations are dense: any state present initially occupies $Ω(n)$ agents. (In particular, no leader is allowed.) Crucially, the result holds no matter the memory or time permitted. Finally, we show that with an initial leader, our size-estimation protocol can be made terminating with high probability, with the same asymptotic time and space bounds.

Submitted 28 July, 2019; v1 submitted 27 August, 2018; originally announced August 2018.

Comments: Using leaderless phase clock

arXiv:1805.04832 [pdf, other]

Exact size counting in uniform population protocols in nearly logarithmic time

Authors: David Doty, Mahsa Eftekhari, Othon Michail, Paul G. Spirakis, Michail Theofilatos

Abstract: We study population protocols: networks of anonymous agents that interact under a scheduler that picks pairs of agents uniformly at random. The _size counting problem_ is that of calculating the exact number $n$ of agents in the population, assuming no leader (each agent starts in the same state). We give the first protocol that solves this problem in sublinear time. The protocol converges in $O(\log n \log \log n)$ time and uses $O(n^{60})$ states ($O(1) + 60 \log n$ bits of memory per agent) with probability $1-O(\frac{\log \log n}{n})$. The time to converge is also $O(\log n \log \log n)$ in expectation. Crucially, unlike most published protocols with $ω(1)$ states, our protocol is _uniform_: it uses the same transition algorithm for any population size, so does not need an estimate of the population size to be embedded into the algorithm. A sub-protocol is the first uniform sublinear-time leader election population protocol, taking $O(\log n \log \log n)$ time and $O(n^{18})$ states. The state complexity of both the counting and leader election protocols can be reduced to $O(n^{30})$ and $O(n^{9})$ respectively, while increasing the time to $O(\log^2 n)$.

Submitted 13 May, 2018; originally announced May 2018.

arXiv:1709.07922 [pdf, other]

Thermodynamic Binding Networks

Authors: David Doty, Trent A. Rogers, David Soloveichik, Chris Thachuk, Damien Woods

Abstract: Strand displacement and tile assembly systems are designed to follow prescribed kinetic rules (i.e., exhibit a specific time-evolution). However, the expected behavior in the limit of infinite time--known as thermodynamic equilibrium--is often incompatible with the desired computation. Basic physical chemistry implicates this inconsistency as a source of unavoidable error. Can the thermodynamic equilibrium be made consistent with the desired computational pathway? In order to formally study this question, we introduce a new model of molecular computing in which computation is driven by the thermodynamic driving forces of enthalpy and entropy. To ensure greatest generality we do not assume that there are any constraints imposed by geometry and treat monomers as unstructured collections of binding sites. In this model we design Boolean AND/OR formulas, as well as a self-assembling binary counter, where the thermodynamically favored states are exactly the desired final output configurations. Though inspired by DNA nanotechnology, the model is sufficiently general to apply to a wide variety of chemical systems.

Submitted 22 September, 2017; originally announced September 2017.

arXiv:1702.05704 [pdf, other]

Computational Complexity of Atomic Chemical Reaction Networks

Authors: David Doty, Shaopeng Zhu

Abstract: Informally, a chemical reaction network is "atomic" if each reaction may be interpreted as the rearrangement of indivisible units of matter. There are several reasonable definitions formalizing this idea. We investigate the computational complexity of deciding whether a given network is atomic according to each of these definitions. Our first definition, primitive atomic, which requires each reaction to preserve the total number of atoms, is shown to be equivalent to mass conservation. Since it is known that it can be decided in polynomial time whether a given chemical reaction network is mass-conserving, the equivalence gives an efficient algorithm to decide primitive atomicity. Another definition, subset atomic, further requires that all atoms are species. We show that deciding whether a given network is subset atomic is in $\textsf{NP}$, and the problem "is a network subset atomic with respect to a given atom set" is strongly $\textsf{NP}$-$\textsf{Complete}$. A third definition, reachably atomic, studied by Adleman, Gopalkrishnan et al., further requires that each species has a sequence of reactions splitting it into its constituent atoms. We show that there is a $\textbf{polynomial-time algorithm}$ to decide whether a given network is reachably atomic, improving upon the result of Adleman et al. that the problem is $\textbf{decidable}$. We show that the reachability problem for reachably atomic networks is $\textsf{Pspace}$-$\textsf{Complete}$. Finally, we demonstrate equivalence relationships between our definitions and some special cases of another existing definition of atomicity due to Gnacadja.

Submitted 1 October, 2017; v1 submitted 19 February, 2017; originally announced February 2017.

ACM Class: F.1.1

arXiv:1604.03687 [pdf, other]

Democratic, Existential, and Consensus-Based Output Conventions in Stable Computation by Chemical Reaction Networks

Authors: Robert Brijder, David Doty, David Soloveichik

Abstract: We show that some natural output conventions for error-free computation in chemical reaction networks (CRNs) lead to a common level of computational expressivity. Our main results are that the standard consensus-based output convention has equivalent computational power to (1) existence-based and (2) democracy-based output conventions. The CRNs using the former output convention have only "yes" voters, with the interpretation that the CRN's output is yes if any voters are present and no otherwise. The CRNs using the latter output convention define output by majority vote among "yes" and "no" voters. Both results are proven via a generalized framework that simultaneously captures several definitions, directly inspired by a Petri net result of Esparza, Ganty, Leroux, and Majumder [CONCUR 2015]. These results support the thesis that the computational expressivity of error-free CRNs is intrinsic, not sensitive to arbitrary definitional choices.

Submitted 10 July, 2017; v1 submitted 13 April, 2016; originally announced April 2016.

Comments: 16 pages, 2 figures

arXiv:1602.01600 [pdf, other]

Design of Geometric Molecular Bonds

Authors: David Doty, Andrew Winslow

Abstract: An example of a nonspecific molecular bond is the affinity of any positive charge for any negative charge (like-unlike), or of nonpolar material for itself when in aqueous solution (like-like). This contrasts with specific bonds such as the affinity of the DNA base A for T, but not for C, G, or another A. Recent experimental breakthroughs in DNA nanotechnology demonstrate that a particular nonspecific like-like bond ("blunt-end DNA stacking" that occurs between the ends of any pair of DNA double-helices) can be used to create specific "macrobonds" by careful geometric arrangement of many nonspecific blunt ends, motivating the need for sets of macrobonds that are orthogonal: two macrobonds not intended to bind should have relatively low binding strength, even when misaligned. To address this need, we introduce geometric orthogonal codes that abstractly model the engineered DNA macrobonds as two-dimensional binary codewords. While motivated by completely different applications, geometric orthogonal codes share features with the optical orthogonal codes studied by Chung, Salehi, and Wei. The main technical difference is the importance of 2D geometry in defining codeword orthogonality.

Submitted 12 February, 2017; v1 submitted 4 February, 2016; originally announced February 2016.

Comments: Accepted to appear in IEEE Transactions on Molecular, Biological, and Multi-Scale Communications

arXiv:1502.04246 [pdf, ps, other]

Stable Leader Election in Population Protocols Requires Linear Time

Authors: David Doty, David Soloveichik

Abstract: A population protocol *stably elects a leader* if, for all $n$, starting from an initial configuration with $n$ agents each in an identical state, with probability 1 it reaches a configuration $\mathbf{y}$ that is correct (exactly one agent is in a special leader state $\ell$) and stable (every configuration reachable from $\mathbf{y}$ also has a single agent in state $\ell$). We show that any population protocol that stably elects a leader requires $Ω(n)$ expected "parallel time" --- $Ω(n^2)$ expected total pairwise interactions --- to reach such a stable configuration. Our result also informs the understanding of the time complexity of chemical self-organization by showing an essential difficulty in generating exact quantities of molecular species quickly.

Submitted 20 August, 2016; v1 submitted 14 February, 2015; originally announced February 2015.

Comments: accepted to Distributed Computing special issue of invited papers from DISC 2015; significantly revised proof structure and intuitive explanations
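
To make the notions of stable leader election and parallel time concrete, here is a minimal simulation sketch. It uses the classic two-state protocol in which two leaders that meet demote one of them; this protocol is an assumed illustration and is not described in the abstract. The function name `elect_leader` and the uniformly random ordered-pair scheduler are likewise assumptions of the sketch. Parallel time is taken to be the number of pairwise interactions divided by $n$.

```python
import random

def elect_leader(n, seed=0):
    """Simulate the two-state protocol (leader, leader) -> (leader, follower).

    Returns (number of leaders at the end, parallel time), where parallel
    time is the interaction count divided by n.
    """
    rng = random.Random(seed)
    is_leader = [True] * n          # every agent starts in the same state
    leaders = n
    interactions = 0
    while leaders > 1:
        a, b = rng.sample(range(n), 2)   # uniformly random ordered pair
        interactions += 1
        if is_leader[a] and is_leader[b]:
            is_leader[b] = False         # the responder becomes a follower
            leaders -= 1
    return leaders, (interactions / n if n else 0.0)

leaders, parallel_time = elect_leader(200)
print(leaders, parallel_time)
```

With $k$ leaders remaining, a random ordered pair consists of two leaders with probability $k(k-1)/(n(n-1))$, so the expected number of interactions for this particular protocol is $\sum_{k=2}^{n} n(n-1)/(k(k-1)) = (n-1)^2$, which is linear parallel time and consistent with the lower bound stated above.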

arXiv:1411.6672 [pdf, other]

Pattern overlap implies runaway growth in hierarchical tile systems

Authors: Ho-Lin Chen, David Doty, Ján Maňuch, Arash Rafiey, Ladislav Stacho

Abstract: We show that in the hierarchical tile assembly model, if there is a producible assembly that overlaps a nontrivial translation of itself consistently (i.e., the pattern of tile types in the overlap region is identical in both translations), then arbitrarily large assemblies are producible. The significance of this result is that tile systems intended to controllably produce finite structures must avoid pattern repetition in their producible assemblies that would lead to such overlap. This answers an open question of Chen and Doty (SODA 2012), who showed that so-called "partial-order" systems producing a unique finite assembly *and* avoiding such overlaps must require time linear in the assembly diameter. An application of our main result is that any system producing a unique finite assembly is automatically guaranteed to avoid such overlaps, simplifying the hypothesis of Chen and Doty's main theorem.

Submitted 24 November, 2014; originally announced November 2014.

arXiv:1409.4828 [pdf, other]

Fast algorithmic self-assembly of simple shapes using random agitation

Authors: Ho-Lin Chen, David Doty, Dhiraj Holden, Chris Thachuk, Damien Woods, Chun-Tao Yang

Abstract: We study the power of uncontrolled random molecular movement in the nubot model of self-assembly. The nubot model is an asynchronous nondeterministic cellular automaton augmented with rigid-body movement rules (push/pull, deterministically and programmatically applied to specific monomers) and random agitations (nondeterministically applied to every monomer and direction with equal probability all of the time). Previous work on the nubot model showed how to build simple shapes such as lines and squares quickly---in expected time that is merely logarithmic in their size. These results crucially make use of the programmable rigid-body movement rule: the ability for a single monomer to control the movement of large objects quickly, and only at a time and place of the programmers' choosing. However, in engineered molecular systems, molecular motion is largely uncontrolled and fundamentally random. This raises the question of whether similar results can be achieved in a more restrictive, and perhaps easier to justify, model where uncontrolled random movements, or agitations, are happening throughout the self-assembly process and are the only form of rigid-body movement. We show that this is indeed the case: we give a polylogarithmic expected time construction for squares using agitation, and a sublinear expected time construction to build a line. Such results are impossible in an agitation-free (and movement-free) setting and thus show the benefits of exploiting uncontrolled random movement.

Submitted 16 September, 2014; originally announced September 2014.

Comments: Conference version at DNA20

arXiv:1304.7804 [pdf, ps, other]

Producibility in hierarchical self-assembly

Authors: David Doty

Abstract: Three results are shown on producibility in the hierarchical model of tile self-assembly. It is shown that a simple greedy polynomial-time strategy decides whether an assembly A is producible. The algorithm can be optimized to use O(|A| log^2 |A|) time. Cannon, Demaine, Demaine, Eisenstat, Patitz, Schweller, Summers, and Winslow showed that the problem of deciding if an assembly A is the unique producible terminal assembly of a tile system T can be solved in O(|A|^2 |T| + |A| |T|^2) time for the special case of noncooperative "temperature 1" systems. It is shown that this can be improved to O(|A| |T| log |T|) time. Finally, it is shown that if two assemblies are producible, and if they can be overlapped consistently -- i.e., if the positions that they share have the same tile type in each assembly -- then their union is also producible.

Submitted 1 May, 2013; v1 submitted 29 April, 2013; originally announced April 2013.

arXiv:1304.4519 [pdf, ps, other]

Leaderless deterministic chemical reaction networks

Authors: David Doty, Monir Hajiaghayi

Abstract: This paper answers an open question of Chen, Doty, and Soloveichik [1], who showed that a function f:N^k --> N^l is deterministically computable by a stochastic chemical reaction network (CRN) if and only if the graph of f is a semilinear subset of N^{k+l}. That construction crucially used "leaders": the ability to start in an initial configuration with constant but non-zero counts of species other than the k species X_1,...,X_k representing the input to the function f. The authors asked whether deterministic CRNs without a leader retain the same power. We answer this question affirmatively, showing that every semilinear function is deterministically computable by a CRN whose initial configuration contains only the input species X_1,...,X_k, and zero counts of every other species. We show that this CRN completes in expected time O(n), where n is the total number of input molecules. This time bound is slower than the O(log^5 n) achieved in [1], but faster than the O(n log n) achieved by the direct construction of [1] (Theorem 4.1 in the latest online version of [1]), since the fast construction of that paper (Theorem 4.4) relied heavily on the use of a fast, error-prone CRN that computes arbitrary computable functions, and which crucially uses a leader.

Submitted 16 April, 2013; originally announced April 2013.

Comments: arXiv admin note: substantial text overlap with arXiv:1204.4176

arXiv:1304.0872 [pdf, ps, other]

Timing in chemical reaction networks

Authors: David Doty

Abstract: Chemical reaction networks (CRNs) formally model chemistry in a well-mixed solution. CRNs are widely used to describe information processing occurring in natural cellular regulatory networks, and with upcoming advances in synthetic biology, CRNs are a promising programming language for the design of artificial molecular control circuitry. Due to a formal equivalence between CRNs and a model of distributed computing known as population protocols, results transfer readily between the two models. We show that if a CRN respects finite density (at most O(n) additional molecules can be produced from n initial molecules), then starting from any dense initial configuration (all molecular species initially present have initial count Omega(n), where n is the initial molecular count and volume), every producible species is produced in constant time with high probability. This implies that no CRN obeying the stated constraints can function as a timer, able to produce a molecule, but doing so only after a time that is an unbounded function of the input size. This has consequences regarding an open question of Angluin, Aspnes, and Eisenstat concerning the ability of population protocols to perform fast, reliable leader election and to simulate arbitrary algorithms from a uniform initial state.

Submitted 3 April, 2013; originally announced April 2013.

arXiv:1204.4176 [pdf, other]

Deterministic Function Computation with Chemical Reaction Networks

Authors: Ho-Lin Chen, David Doty, David Soloveichik

Abstract: Chemical reaction networks (CRNs) formally model chemistry in a well-mixed solution. CRNs are widely used to describe information processing occurring in natural cellular regulatory networks, and with upcoming advances in synthetic biology, CRNs are a promising language for the design of artificial molecular control circuitry. Nonetheless, despite the widespread use of CRNs in the natural sciences, the range of computational behaviors exhibited by CRNs is not well understood. CRNs have been shown to be efficiently Turing-universal when allowing for a small probability of error. CRNs that are guaranteed to converge on a correct answer, on the other hand, have been shown to decide only the semilinear predicates. We introduce the notion of function, rather than predicate, computation by representing the output of a function f:N^k --> N^l by a count of some molecular species, i.e., if the CRN starts with n_1,...,n_k molecules of some "input" species X1,...,Xk, the CRN is guaranteed to converge to having f(n_1,...,n_k) molecules of the "output" species Y1,...,Yl. We show that a function f:N^k --> N^l is deterministically computed by a CRN if and only if its graph {(x,y) | f(x) = y} is a semilinear set. Furthermore, each semilinear function f can be computed on input x in expected time O(polylog(|x|)).

Submitted 10 January, 2013; v1 submitted 18 April, 2012; originally announced April 2012.

Comments: fixed errors in previous version
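
As an illustration of the function-computation notion described in the abstract above, the following is a minimal sketch, not the construction from the paper. It assumes a small four-reaction CRN commonly used as an example for max(x1, x2); the species names Z1, Z2, and K, the helper names, and the uniform random choice among enabled reactions are all assumptions made for this example. The scheduler picks any enabled reaction, so the sketch shows only that the final output count does not depend on reaction order; it says nothing about expected time.

```python
import random
from collections import Counter

# Hypothetical example CRN for max(x1, x2); each reaction is (reactants, products).
#   X1 -> Z1 + Y,  X2 -> Z2 + Y,  Z1 + Z2 -> K,  K + Y -> (nothing)
# Y is produced x1 + x2 times and consumed min(x1, x2) times, leaving max(x1, x2).
MAX_CRN = [
    (Counter({"X1": 1}), Counter({"Z1": 1, "Y": 1})),
    (Counter({"X2": 1}), Counter({"Z2": 1, "Y": 1})),
    (Counter({"Z1": 1, "Z2": 1}), Counter({"K": 1})),
    (Counter({"K": 1, "Y": 1}), Counter()),
]


def applicable(config, reactants):
    """A reaction is enabled when every reactant is present in sufficient count."""
    return all(config[species] >= n for species, n in reactants.items())


def run_to_completion(reactions, initial, rng):
    """Fire arbitrary enabled reactions until none is enabled; return the final counts."""
    config = Counter(initial)
    while True:
        enabled = [r for r in reactions if applicable(config, r[0])]
        if not enabled:
            return config
        reactants, products = rng.choice(enabled)
        config.subtract(reactants)
        config.update(products)


def crn_max(x1, x2, seed=0):
    """Start with only the input species present and read off the count of Y."""
    final = run_to_completion(MAX_CRN, {"X1": x1, "X2": x2}, random.Random(seed))
    return final["Y"]
```

Running `crn_max` with different seeds on the same input changes the order in which reactions fire but not the final count of Y, which is the sense in which the output is deterministic.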

arXiv:1111.3097 [pdf, other]

The tile assembly model is intrinsically universal

Authors: David Doty, Jack H. Lutz, Matthew J. Patitz, Robert T. Schweller, Scott M. Summers, Damien Woods

Abstract: We prove that the abstract Tile Assembly Model (aTAM) of nanoscale self-assembly is intrinsically universal. This means that there is a single tile assembly system U that, with proper initialization, simulates any tile assembly system T. The simulation is "intrinsic" in the sense that the self-assembly process carried out by U is exactly that carried out by T, with each tile of T represented by an m x m "supertile" of U. Our construction works for the full aTAM at any temperature, and it faithfully simulates the deterministic or nondeterministic behavior of each T. Our construction succeeds by solving an analog of the cell differentiation problem in developmental biology: Each supertile of U, starting with those in the seed assembly, carries the "genome" of the simulated system T. At each location of a potential supertile in the self-assembly of U, a decision is made whether and how to express this genome, i.e., whether to generate a supertile and, if so, which tile of T it will represent. This decision must be achieved using asynchronous communication under incomplete information, but it achieves the correct global outcome(s).

Submitted 7 April, 2012; v1 submitted 14 November, 2011; originally announced November 2011.

arXiv:1104.5226 [pdf, ps, other]

Parallelism and Time in Hierarchical Self-Assembly

Authors: Ho-Lin Chen, David Doty

Abstract: We study the role that parallelism plays in time complexity of Winfree's abstract Tile Assembly Model (aTAM), a model of molecular algorithmic self-assembly. In the "hierarchical" aTAM, two assemblies, both consisting of multiple tiles, are allowed to aggregate together, whereas in the "seeded" aTAM, tiles attach one at a time to a growing assembly. Adleman, Cheng, Goel, and Huang ("Running Time and Program Size for Self-Assembled Squares", STOC 2001) showed how to assemble an n x n square in O(n) time in the seeded aTAM using O(log n / log log n) unique tile types, where both of these parameters are optimal. They asked whether the hierarchical aTAM could allow a tile system to use the ability to form large assemblies in parallel before they attach to break the Omega(n) lower bound for assembly time. We show that there is a tile system with the optimal O(log n / log log n) tile types that assembles an n x n square using O(log^2 n) parallel "stages", which is close to the optimal Omega(log n) stages, forming the final n x n square from four n/2 x n/2 squares, which are themselves recursively formed from n/4 x n/4 squares, etc. However, despite this nearly maximal parallelism, the system requires superlinear time to assemble the square. We extend the definition of *partial order tile systems* studied by Adleman et al. in a natural way to hierarchical assembly and show that no hierarchical partial order tile system can build any shape with diameter N in less than time Omega(N), demonstrating that in this case the hierarchical model affords no speedup whatsoever over the seeded model. We strengthen the Omega(N) time lower bound for deterministic seeded systems of Adleman et al. to nondeterministic seeded systems. Finally, we show that for infinitely many n, a tile system can assemble an n x n' rectangle, with n > n', in time O(n^{4/5} log n), breaking the linear-time lower bound.

Submitted 12 February, 2017; v1 submitted 27 April, 2011; originally announced April 2011.

Comments: accepted to appear in SIAM Journal on Computing

arXiv:1011.3493 [pdf, ps, other]

Program Size and Temperature in Self-Assembly

Authors: Ho-Lin Chen, David Doty, Shinnosuke Seki

Abstract: Winfree's abstract Tile Assembly Model (aTAM) is a model of molecular self-assembly of DNA complexes known as tiles, which float freely in solution and attach one at a time to a growing "seed" assembly based on specific binding sites on their four sides. We show that there is a polynomial-time algorithm that, given an n x n square, finds the minimal tile system (i.e., the system with the smallest number of distinct tile types) that uniquely self-assembles the square, answering an open question of Adleman, Cheng, Goel, Huang, Kempe, Moisset de Espanes, and Rothemund ("Combinatorial Optimization Problems in Self-Assembly", STOC 2002). Our investigation leading to this algorithm reveals other positive and negative results about the relationship between the size of a tile system and its "temperature" (the binding strength threshold required for a tile to attach).

Submitted 2 March, 2011; v1 submitted 15 November, 2010; originally announced November 2010.

Comments: The previous version contained more sections, but we have split that paper into two. The other half will be posted as a separate paper

arXiv:1006.2897 [pdf, ps, other]

The Power of Nondeterminism in Self-Assembly

Authors: Nathaniel Bryans, Ehsan Chiniforooshan, David Doty, Lila Kari, Shinnosuke Seki

Abstract: We investigate the role of nondeterminism in Winfree's abstract Tile Assembly Model (aTAM), which was conceived to model artificial molecular self-assembling systems constructed from DNA. Of particular practical importance is to find tile systems that minimize resources such as the number of distinct tile types, each of which corresponds to a set of DNA strands that must be custom-synthesized in actual molecular implementations of the aTAM. We seek to identify to what extent the use of nondeterminism in tile systems affects the resources required by such molecular shape-building algorithms. We first show a "molecular computability theoretic" result: there is an infinite shape S that is uniquely assembled by a tile system but not by any deterministic tile system. We then show an analogous phenomenon in the finitary "molecular complexity theoretic" case: there is a finite shape S that is uniquely assembled by a tile system with c tile types, but every deterministic tile system that uniquely assembles S has more than c tile types. In fact we extend the technique to derive a stronger (classical complexity theoretic) result, showing that the problem of finding the minimum number of tile types that uniquely assemble a given finite shape is Sigma-P-2-complete. In contrast, the problem of finding the minimum number of deterministic tile types that uniquely assemble a shape was shown to be NP-complete by Adleman, Cheng, Goel, Huang, Kempe, Moisset de Espanés, and Rothemund (Combinatorial Optimization Problems in Self-Assembly, STOC 2002). The conclusion is that nondeterminism confers extra power to assemble a shape from a small tile system, but unless the polynomial hierarchy collapses, it is computationally more difficult to exploit this power by finding the size of the smallest tile system, compared to finding the size of the smallest deterministic tile system.

Submitted 25 November, 2010; v1 submitted 15 June, 2010; originally announced June 2010.

Comments: Accepted to SODA 2011. The previous version of this paper (which appears in the SODA proceedings) had open questions about computing the minimum number of tile types to weakly self-assemble a set. The answer to these questions is "no", by a very simple imitation of the proof that Kolmogorov complexity is uncomputable based on the Berry paradox. These open questions have been removed

arXiv:1004.3993 [pdf, ps, other]

An Oracle Strongly Separating Deterministic Time from Nondeterministic Time, via Kolmogorov Complexity

Authors: David Doty

Abstract: Hartmanis used Kolmogorov complexity to provide an alternate proof of the classical result of Baker, Gill, and Solovay that there is an oracle relative to which P is not NP. We refine the technique to strengthen the result, constructing an oracle relative to which a conjecture of Lipton is false.

Submitted 22 April, 2010; originally announced April 2010.

arXiv:1004.0995 [pdf, ps, other]

Strong Fault-Tolerance for Self-Assembly with Fuzzy Temperature

Authors: David Doty, Matthew J. Patitz, Dustin Reishus, Robert T. Schweller, Scott M. Summers

Abstract: We consider the problem of fault-tolerance in nanoscale algorithmic self-assembly. We employ a variant of Winfree's abstract Tile Assembly Model (aTAM), the two-handed aTAM, in which square "tiles" -- a model of molecules constructed from DNA for the purpose of engineering self-assembled nanostructures -- aggregate according to specific binding sites of varying strengths, and in which large aggregations of tiles may attach to each other, in contrast to the seeded aTAM, in which tiles aggregate one at a time to a single specially-designated "seed" assembly. We focus on a major cause of errors in tile-based self-assembly: that of unintended growth due to "weak" strength-1 bonds, which if allowed to persist, may be stabilized by subsequent attachment of neighboring tiles in the sense that at least energy 2 is now required to break apart the resulting assembly; i.e., the errant assembly is stable at temperature 2. We study a common self-assembly benchmark problem, that of assembling an n x n square using O(log n) unique tile types, under the two-handed model of self-assembly. Our main result achieves a much stronger notion of fault-tolerance than those achieved previously. Arbitrary strength-1 growth is allowed (i.e., the temperature is "fuzzy" and may drift from 2 to 1 for arbitrarily long); however, any assembly that grows sufficiently to become stable at temperature 2 is guaranteed to assemble at temperature 2 into the correct final assembly of an n x n square. In other words, errors due to insufficient attachment, which is the cause of errors studied in earlier papers on fault-tolerance, are prevented absolutely in our main construction, rather than only with high probability and for sufficiently small structures, as in previous fault-tolerance studies.

Submitted 6 April, 2010; originally announced April 2010.

ACM Class: F.1.1

arXiv:1003.3275 [pdf, ps, other]

doi 10.1007/978-3-642-18305-8_3

Scalable, Time-Responsive, Digital, Energy-Efficient Molecular Circuits using DNA Strand Displacement

Authors: Ehsan Chiniforooshan, David Doty, Lila Kari, Shinnosuke Seki

Abstract: We propose a novel theoretical biomolecular design to implement any Boolean circuit using the mechanism of DNA strand displacement. The design is scalable: all species of DNA strands can in principle be mixed and prepared in a single test tube, rather than requiring separate purification of each species, which is a barrier to large-scale synthesis. The design is time-responsive: the concentration of output species changes in response to the concentration of input species, so that time-varying inputs may be continuously processed. The design is digital: Boolean values of wires in the circuit are represented as high or low concentrations of certain species, and we show how to construct a single-input, single-output signal restoration gate that amplifies the difference between high and low, which can be distributed to each wire in the circuit to overcome signal degradation. This means we can achieve a digital abstraction of the analog values of concentrations. Finally, the design is energy-efficient: if input species are specified ideally (meaning absolutely 0 concentration of unwanted species), then output species converge to their ideal concentrations at steady-state, and the system at steady-state is in (dynamic) equilibrium, meaning that no energy is consumed by irreversible reactions until the input again changes. Drawbacks of our design include the following. If input is provided non-ideally (small positive concentration of unwanted species), then energy must be continually expended to maintain correct output concentrations even at steady-state. In addition, our fuel species - those species that are permanently consumed in irreversible reactions - are not "generic"; each gate in the circuit is powered by its own specific type of fuel species. Hence different circuits must be powered by different types of fuel. Finally, we require input to be given according to the dual-rail convention, so that an input of 0 is specified not only by the absence of a certain species, but by the presence of another. That is, we do not construct a "true NOT gate" that sets its output to high concentration if and only if its input's concentration is low. It remains an open problem to design scalable, time-responsive, digital, energy-efficient molecular circuits that additionally solve one of these problems, or to prove that some subset of their resolutions are mutually incompatible.

Submitted 18 March, 2010; v1 submitted 16 March, 2010; originally announced March 2010.

Comments: version 2: the paper itself is unchanged from version 1, but the arXiv software stripped some asterisk characters out of the abstract whose purpose was to highlight words. These characters have been replaced with underscores in version 2. The arXiv software also removed the second paragraph of the abstract, which has been (attempted to be) re-inserted. Also, although the secondary subject is "Soft Condensed Matter", this classification was chosen by the arXiv moderators after submission, not chosen by the authors. The authors consider this submission to be a theoretical computer science paper.

ACM Class: F.1.1

arXiv:1002.2746 [pdf, ps, other]

doi 10.1007/978-3-642-18305-8_4

Negative Interactions in Irreversible Self-Assembly

Authors: David Doty, Lila Kari, Benoit Masson

Abstract: This paper explores the use of negative (i.e., repulsive) interactions in the abstract Tile Assembly Model defined by Winfree. Winfree postulated negative interactions to be physically plausible in his Ph.D. thesis, and Reif, Sahu, and Yin explored their power in the context of reversible attachment operations. We explore the power of negative interactions with irreversible attachments, and we achieve two main results. Our first result is an impossibility theorem: after t steps of assembly, Omega(t) tiles will be forever bound to an assembly, unable to detach. Thus negative glue strengths do not afford unlimited power to reuse tiles. Our second result is a positive one: we construct a set of tiles that can simulate a Turing machine with space bound s and time bound t, while ensuring that no intermediate assembly grows larger than O(s), rather than O(s * t) as required by the standard Turing machine simulation with tiles.

Submitted 13 February, 2010; originally announced February 2010.

ACM Class: F.1.1; F.1.m; F.m; J.2

arXiv:1001.0208 [pdf, ps, other]

Intrinsic Universality in Self-Assembly

Authors: David Doty, Jack H. Lutz, Matthew J. Patitz, Scott M. Summers, Damien Woods

Abstract: We show that the Tile Assembly Model exhibits a strong notion of universality where the goal is to give a single tile assembly system that simulates the behavior of any other tile assembly system. We give a tile assembly system that is capable of simulating a very wide class of tile systems, including itself. Specifically, we give a tile set that simulates the assembly of any tile assembly system in a class of systems that we call \emph{locally consistent}: each tile binds with exactly the strength needed to stay attached, and there are no glue mismatches between tiles in any produced assembly. Our construction is reminiscent of the studies of \emph{intrinsic universality} of cellular automata by Ollinger and others, in the sense that our simulation of a tile system $T$ by a tile system $U$ represents each tile in an assembly produced by $T$ by a $c \times c$ block of tiles in $U$, where $c$ is a constant depending on $T$ but not on the size of the assembly $T$ produces (which may in fact be infinite). Also, our construction improves on earlier simulations of tile assembly systems by other tile assembly systems (in particular, those of Soloveichik and Winfree, and of Demaine et al.) in that we simulate the actual process of self-assembly, not just the end result, as in Soloveichik and Winfree's construction, and we do not discriminate against infinite structures. Both previous results simulate only temperature 1 systems, whereas our construction simulates tile assembly systems operating at temperature 2.

Submitted 3 February, 2010; v1 submitted 1 January, 2010; originally announced January 2010.

arXiv:0906.3251 [pdf, ps, other]

doi 10.4204/EPTCS.1.6

Limitations of Self-Assembly at Temperature One (extended abstract)

Authors: David Doty, Matthew J. Patitz, Scott M. Summers

Abstract: We prove that if a subset X of the integer Cartesian plane weakly self-assembles at temperature 1 in a deterministic (Winfree) tile assembly system satisfying a natural condition known as *pumpability*, then X is a finite union of doubly periodic sets. This shows that only the most simple of infinite shapes and patterns can be constructed using pumpable temperature 1 tile assembly systems, and gives strong evidence for the thesis that temperature 2 or higher is required to carry out general-purpose computation in a tile assembly system. Finally, we show that general-purpose computation is possible at temperature 1 if negative glue strengths are allowed in the tile assembly model.

Submitted 17 June, 2009; originally announced June 2009.

Journal ref: EPTCS 1, 2009, pp. 67-69

arXiv:0903.1857 [pdf, ps, other]

Limitations of Self-Assembly at Temperature 1

Authors: David Doty, Matthew J Patitz, Scott M Summers

Abstract: We prove that if a set $X \subseteq \mathbb{Z}^2$ weakly self-assembles at temperature 1 in a deterministic tile assembly system satisfying a natural condition known as \emph{pumpability}, then $X$ is a finite union of semi-doubly periodic sets. This shows that only the most simple of infinite shapes and patterns can be constructed using pumpable temperature 1 tile assembly systems, and gives evidence for the thesis that temperature 2 or higher is required to carry out general-purpose computation in a tile assembly system. Finally, we show that general-purpose computation \emph{is} possible at temperature 1 if negative glue strengths are allowed in the tile assembly model.

Submitted 10 March, 2009; originally announced March 2009.

Comments: 10 page conference submission with additional technical appendix containing proofs

arXiv:0903.0889 [pdf, ps, other]

A Domain-Specific Language for Programming in the Tile Assembly Model

Authors: David Doty, Matthew J. Patitz

Abstract: We introduce a domain-specific language (DSL) for creating sets of tile types for simulations of the abstract Tile Assembly Model. The language defines objects known as tile templates, which represent related groups of tiles, and a small number of basic operations on tile templates that help to eliminate the error-prone drudgery of enumerating such tile types manually or with low-level constructs of general-purpose programming languages. The language is implemented as a class library in Python (a so-called internal DSL), but is presented independently of Python or object-oriented programming, with emphasis on supporting the creation of visual editing tools for programmatically creating large sets of complex tile types without needing to write a program.

Submitted 4 March, 2009; originally announced March 2009.

arXiv:0901.1849 [pdf, ps, other]

Randomized Self-Assembly for Exact Shapes

Authors: David Doty

Abstract: Working in Winfree's abstract tile assembly model, we show that a constant-size tile assembly system can be programmed through relative tile concentrations to build an n x n square with high probability, for any sufficiently large n. This answers an open question of Kao and Schweller (Randomized Self-Assembly for Approximate Shapes, ICALP 2008), who showed how to build an approximately n x n square using tile concentration programming, and asked whether the approximation could be made exact with high probability. We show how this technique can be modified to answer another question of Kao and Schweller, by showing that a constant-size tile assembly system can be programmed through tile concentrations to assemble arbitrary finite *scaled shapes*, which are shapes modified by replacing each point with a c x c block of points, for some integer c. Furthermore, we exhibit a smooth tradeoff between specifying bits of n via tile concentrations versus specifying them via hard-coded tile types, which allows tile concentration programming to be employed for specifying a fraction of the bits of "input" to a tile assembly system, under the constraint that concentrations can only be specified to a limited precision. Finally, to account for some unrealistic aspects of the tile concentration programming model, we show how to modify the construction to use only concentrations that are arbitrarily close to uniform.

Submitted 16 July, 2010; v1 submitted 13 January, 2009; originally announced January 2009.

Comments: Conference version accepted to FOCS 2009. Present version accepted to SIAM Journal on Computing, which adds new sections on arbitrary scaled shapes, smooth trade-off between specifying bits of n through concentrations versus hardcoded tile types, and construction that uses concentrations arbitrarily close to uniform to fix potential thermodynamic problems with model
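
The abstract above defines a scaled shape as a shape in which each point is replaced by a c x c block of points. A minimal sketch of that definition follows, assuming a shape is represented as a set of integer (x, y) pairs; the function name `scale_shape` and the representation are choices made for this example and are not taken from the paper.

```python
def scale_shape(shape, c):
    """Replace each point (x, y) of the shape with a c x c block of points."""
    if c < 1:
        raise ValueError("scale factor c must be a positive integer")
    return {
        (c * x + dx, c * y + dy)
        for (x, y) in shape
        for dx in range(c)
        for dy in range(c)
    }
```

For instance, scaling the three-point L-shape {(0, 0), (1, 0), (0, 1)} by c = 2 yields a 12-point shape made of three 2 x 2 blocks.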

arXiv:cs/0701123 [pdf, ps, other]

Feasible Depth

Authors: David Doty, Philippe Moser

Abstract: This paper introduces two complexity-theoretic formulations of Bennett's logical depth: finite-state depth and polynomial-time depth. It is shown that for both formulations, trivial and random infinite sequences are shallow, and a slow growth law holds, implying that deep sequences cannot be created easily from shallow sequences. Furthermore, the E analogue of the halting language is shown to be polynomial-time deep, by proving a more general result: every language to which a nonnegligible subset of E can be reduced in uniform exponential time is polynomial-time deep.

Submitted 11 April, 2007; v1 submitted 19 January, 2007; originally announced January 2007.

Comments: Accepted to Computation and Logic in the Real World, Proceedings of the 3rd Conference on Computability in Europe (CiE), 2007

arXiv:cs/0701089 [pdf, ps, other]

Constructive Dimension and Turing Degrees

Authors: Laurent Bienvenu, David Doty, Frank Stephan

Abstract: This paper examines the constructive Hausdorff and packing dimensions of Turing degrees. The main result is that every infinite sequence S with constructive Hausdorff dimension dim_H(S) and constructive packing dimension dim_P(S) is Turing equivalent to a sequence R with dim_H(R) <= (dim_H(S) / dim_P(S)) - epsilon, for arbitrary epsilon > 0. Furthermore, if dim_P(S) > 0, then dim_P(R) >= 1 - epsilon. The reduction thus serves as a *randomness extractor* that increases the algorithmic randomness of S, as measured by constructive dimension. A number of applications of this result shed new light on the constructive dimensions of Turing degrees. A lower bound of dim_H(S) / dim_P(S) is shown to hold for the Turing degree of any sequence S. A new proof is given of a previously-known zero-one law for the constructive packing dimension of Turing degrees. It is also shown that, for any regular sequence S (that is, dim_H(S) = dim_P(S)) such that dim_H(S) > 0, the Turing degree of S has constructive Hausdorff and packing dimension equal to 1. Finally, it is shown that no single Turing reduction can be a universal constructive Hausdorff dimension extractor, and that bounded Turing reductions cannot extract constructive Hausdorff dimension. We also exhibit sequences on which weak truth-table and bounded Turing reductions differ in their ability to extract dimension.

Submitted 8 April, 2010; v1 submitted 14 January, 2007; originally announced January 2007.

Comments: The version of this paper appearing in Theory of Computing Systems, 45(4):740-755, 2009, had an error in the proof of Theorem 2.4, due to insufficient care with the choice of delta. This version modifies that proof to fix the error.

arXiv:cs/0609096

Finite-State Dimension and Lossy Decompressors

Authors: David Doty, Philippe Moser

Abstract: This paper examines information-theoretic questions regarding the difficulty of compressing data versus the difficulty of decompressing data and the role that information loss plays in this interaction. Finite-state compression and decompression are shown to be of equivalent difficulty, even when the decompressors are allowed to be lossy. Inspired by Kolmogorov complexity, this paper defines the optimal *decompression* ratio achievable on an infinite sequence by finite-state decompressors (that is, finite-state transducers outputting the sequence in question). It is shown that the optimal compression ratio achievable on a sequence S by any *information lossless* finite state compressor, known as the finite-state dimension of S, is equal to the optimal decompression ratio achievable on S by any finite-state decompressor. This result implies a new decompression characterization of finite-state dimension in terms of lossy finite-state transducers.

Submitted 28 November, 2006; v1 submitted 18 September, 2006; originally announced September 2006.

Comments: We found that Theorem 3.11, which was basically the motive for this paper, was already proven by Sheinwald, Ziv, and Lempel in 1991 and 1995 papers

arXiv:cs/0606078

Dimension Extractors and Optimal Decompression

Authors: David Doty

Abstract: A *dimension extractor* is an algorithm designed to increase the effective dimension -- i.e., the amount of computational randomness -- of an infinite binary sequence, in order to turn a "partially random" sequence into a "more random" sequence. Extractors are exhibited for various effective dimensions, including constructive, computable, space-bounded, time-bounded, and finite-state dimension. Using similar techniques, the Kucera-Gacs theorem is examined from the perspective of decompression, by showing that every infinite sequence S is Turing reducible to a Martin-Loef random sequence R such that the asymptotic number of bits of R needed to compute n bits of S, divided by n, is precisely the constructive dimension of S, which is shown to be the optimal ratio of query bits to computed bits achievable with Turing reductions. The extractors and decompressors that are developed lead directly to new characterizations of some effective dimensions in terms of optimal decompression by Turing reductions.

Submitted 11 March, 2007; v1 submitted 18 June, 2006; originally announced June 2006.

Comments: This report was combined with a different conference paper "Every Sequence is Decompressible from a Random One" (cs.IT/0511074, at http://dx.doi.org/10.1007/11780342_17), and both titles were changed, with the conference paper incorporated as section 5 of this new combined paper. The combined paper was accepted to the journal Theory of Computing Systems, as part of a special issue of invited papers from the second conference on Computability in Europe, 2006

ACM Class: F.1.3; E.4; H.1.1
