David Doty. Computer Science, University of California–Davis, CA, USA. https://web.cs.ucdavis.edu/~doty/ https://orcid.org/0000-0002-3922-172X. Funding: NSF awards 2211793, 1900931, 1844976, and DoE EXPRESS award SC0024467.
Ben Heckmann. CIT, Technical University of Munich, Germany and Computer Science, University of California–Davis, CA, USA. Funding: NSF award 1844976.
Copyright: David Doty and Ben Heckmann.
Subject classification: Theory of computation → Models of computation
The computational power of discrete chemical reaction networks with bounded executions
Abstract
Chemical reaction networks (CRNs) model systems where molecules interact according to a finite set of reactions such as $A + B \to C$, representing that if a molecule of $A$ and $B$ collide, they disappear and a molecule of $C$ is produced. CRNs can compute Boolean-valued predicates $\phi: \mathbb{N}^d \to \{0,1\}$ and integer-valued functions $f: \mathbb{N}^d \to \mathbb{N}$; for instance $X_1 + X_2 \to Y$ computes the function $\min(x_1, x_2)$.
We study the computational power of execution bounded CRNs, in which only a finite number of reactions can occur from the initial configuration (e.g., ruling out reversible reactions such as $A \rightleftharpoons B$). The power and composability of such CRNs depend crucially on some other modeling choices that do not affect the computational power of CRNs with unbounded executions, namely whether an initial leader is present, and whether (for predicates) all species are required to “vote” for the Boolean output. If the CRN starts with an initial leader, and can allow only the leader to vote, then all semilinear predicates and functions can be stably computed in $O(n \log n)$ parallel time by execution bounded CRNs.
However, if no initial leader is allowed, all species vote, and the CRN is “noncollapsing” (does not shrink from initially large to final $O(1)$ size configurations), then execution bounded CRNs are severely limited, able to compute only eventually constant predicates. A key tool is to characterize execution bounded CRNs as precisely those with a nonnegative linear potential function that is strictly decreased by every reaction, a result that may be of independent interest.
keywords:
chemical reaction networks, population protocols, stable computation
1 Introduction
Chemical reaction networks (CRNs) are a fundamental tool for understanding and designing molecular systems. By abstracting chemical reactions into a set of finite, rule-based transformations, CRNs allow us to model the behavior of complex systems. For instance, the CRN with a single reaction, $2X \to Y$, produces one $Y$ every time two $X$ molecules randomly react together, effectively calculating the function $f(x) = \lfloor x/2 \rfloor$ if the initial count of $X$ molecules is interpreted as the input and the count of $Y$ as the output. A commonly studied special case of CRNs is the population protocol model of distributed computing [Angluin2006ComputationalPower], in which each reaction has exactly two reactants and two products, e.g., $A + B \to C + D$. This model assumes idealized conditions where reactions can proceed indefinitely, constrained only by the availability of reactants in the well-mixed solution.
Precisely the semilinear predicates and functions can be computed stably, roughly meaning that the output is correct no matter the order in which reactions happen. In population protocols or other CRNs with a finite reachable configuration space, this means that the output is correct with probability 1 under a stochastic scheduler that picks the next molecules to react at random. However, existing constructions to compute semilinear predicates and functions use CRNs with unbounded executions, meaning that it is possible to execute infinitely many reactions from the initial configuration. CRNs with bounded executions have several advantages. With an absolute guarantee on how many reactions will happen before the CRN terminates, wet-lab implementations need only supply a bounded amount of fuel to power the reactions. Such CRNs are simpler to reason about: each reaction brings it “closer” to the answer. They also lead to a simpler definition of stable computation than is typically employed: an execution bounded CRN stably computes a predicate/function if it gets the correct answer after sufficiently many reactions.
To study this topic, we limit the classical, discrete CRN model to networks that must eventually reach a configuration where no further reactions can occur, regardless of the sequence of reactions executed. By guaranteeing a finite endpoint for CRN computations and later integrating the concept of decreasing potential, we aim to align our models more closely with their implementations in the physical world.
This restriction is nontrivial because the techniques in [Chen2012DeterministicFunction] and [Doty2013Leaderless] rely on reversible reactions catalyzed by species we expect to be depleted once a computational step has terminated. This trick seems to add computational power to our system by undoing certain reactions as long as a specific species is present. Consider the following CRN computing . The input values are given as counts of copies of , and the count of molecules in the stable output:
(1)–(4) [reaction equations not preserved in the source text]
Reactions (1) and (2) compute , storing the result in the count of . Next, reaction (3) can be applied exactly times. But since the order of reactions is a stochastic process, we might consume copies of in (3), before all of is subtracted from it. Therefore, we add reaction (4), using as a catalyst to undo reaction (3) as long as copies of are present, indicating that the first step of computation has not terminated. A similar technique is used in [Chen2012DeterministicFunction], where semilinear sets are understood as a finite union of linear sets, shown to be computable in parallel by CRNs. A reversible, catalyzed reaction finally converts the output of one of the CRNs to the global output. Among other questions, we explore how the constructions of [Chen2012DeterministicFunction] and [Doty2013Leaderless] can be modified to provide equal computational power while guaranteeing bounded execution.
Section 3 defines execution boundedness (Definition 3.1). Furthermore, we introduce alternative characterizations of the class for use in later proofs, such as the lack of self-covering execution paths. Section 4 and 5 contain the main positive results of the paper and provide the concrete constructions used to decide semilinear sets and functions using execution bounded CRNs whose initial configurations contain a single leader. Section 6 discusses the limitations of execution bounded CRNs, introducing the concept of a “linear potential function” as a core characterization of these systems. We demonstrate that entirely execution bounded CRNs that are leaderless and non-collapsing (such as all population protocols), can only stably decide trivial semilinear predicates: the eventually constant predicates (Definition 6.11).
2 Preliminaries
We use established notation from [Chen2012DeterministicFunction, Doty2013Leaderless] and stable computation definitions from [Angluin2006ComputationalPower] for (discrete) chemical reaction networks.
2.1 Notation
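Let $\mathbb{N}$ denote the nonnegative integers. For any finite set $\Lambda$, we write $\mathbb{N}^\Lambda$ to mean the set of functions $f: \Lambda \to \mathbb{N}$. Equivalently, $\mathbb{N}^\Lambda$ can be interpreted as the set of vectors indexed by the elements of $\Lambda$, and so $\mathbf{c} \in \mathbb{N}^\Lambda$ specifies nonnegative integer counts for all elements of $\Lambda$. $\mathbf{c}(i)$ denotes the $i$-th coordinate of $\mathbf{c}$, and if $\mathbf{c}$ is indexed by elements of $\Lambda$, then $\mathbf{c}(S)$ denotes the count of species $S \in \Lambda$.
For two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^k$, we write $\mathbf{x} \geq \mathbf{y}$ to denote that $\mathbf{x}(i) \geq \mathbf{y}(i)$ for all $i$, $\mathbf{x} \gneq \mathbf{y}$ to denote that $\mathbf{x} \geq \mathbf{y}$ but $\mathbf{x} \neq \mathbf{y}$, and $\mathbf{x} > \mathbf{y}$ to denote that $\mathbf{x}(i) > \mathbf{y}(i)$ for all $i$. In the case that $\mathbf{y} = \mathbf{0}$, we say that $\mathbf{x}$ is nonnegative, semipositive, and positive, respectively. Similarly define $\leq$, $\lneq$, $<$.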
For a matrix or vector , define , ranges over all the entries of .
2.2 Chemical Reaction Networks
A chemical reaction network (CRN) is a pair , where is a finite set of chemical species, and is a finite set of reactions over , where each reaction is a pair indicating the reactants and products . A population protocol [angluin2004computation] is a CRN in which all reactions obey . We write reactions such as to represent the reaction . A configuration of a CRN assigns integer counts to every species . When convenient, we use the notation to describe a configuration with copies of species , i.e., , and any species that is not listed is assumed to have a zero count. If some configuration is understood from context, for a species , we write to denote A reaction is said to be applicable in configuration if . If the reaction is applicable, applying it results in configuration , and we write .
An execution is a finite or infinite sequence of one or more configurations such that, for all and . denotes that is finite, starts at , and ends at . In this case we say is reachable from . Let . Note that the reachability relation is additive: if , then for all , .
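The definitions of applicability, applying a reaction, and executions can be made concrete with a small sketch. The representation below (dictionaries mapping species names to counts) and the helper names `applicable`, `apply_reaction`, and `run_until_terminal` are illustrative assumptions, not notation from the paper; the example reaction, which turns two copies of a species `X` into one `Y`, is likewise only an illustration.

```python
from collections import Counter

def applicable(config, reaction):
    """A reaction (reactants, products) is applicable if the configuration
    contains at least the required count of every reactant."""
    reactants, _ = reaction
    return all(config.get(s, 0) >= k for s, k in reactants.items())

def apply_reaction(config, reaction):
    """Return the configuration obtained by removing the reactants and
    adding the products; raise if the reaction is not applicable."""
    if not applicable(config, reaction):
        raise ValueError("reaction is not applicable")
    reactants, products = reaction
    result = Counter(config)
    result.subtract(reactants)
    result.update(products)
    return +result  # unary plus drops species whose count is zero

def run_until_terminal(config, reactions):
    """Follow one execution by always applying the first applicable reaction.
    This loops forever if that particular schedule never terminates."""
    config, steps = Counter(config), 0
    while True:
        enabled = [r for r in reactions if applicable(config, r)]
        if not enabled:
            return config, steps
        config = apply_reaction(config, enabled[0])
        steps += 1

# Illustrative reaction: two X molecules become one Y molecule.
halve = ({"X": 2}, {"Y": 1})
final, steps = run_until_terminal({"X": 5}, [halve])
print(dict(final), steps)
```

Additivity of reachability shows up here as the fact that `applicable` only checks lower bounds on counts: adding extra molecules to a configuration never disables a reaction.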
For a CRN where and , define the stoichiometric matrix of as follows. The species are ordered , and the reactions are ordered , and . In other words, is the net amount of produced when executing the ’th reaction. For instance, if the CRN has two reactions and , then
Remark 2.1.
Let . Then the vector represents the change in species counts that results from applying reactions by amounts described in . In the above example, if , then , meaning that executing the first reaction twice () and the second reaction once () causes to decrease by 1, to stay the same, and to increase by 4.
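As a sketch of how the stoichiometric matrix is used, the following builds the matrix from a list of reactions and multiplies it by a vector of reaction counts. The two reactions are hypothetical: they were chosen only so that executing the first twice and the second once gives the net change quoted in the remark (first species decreases by 1, second unchanged, third increases by 4), and they are not claimed to be the paper's own example. The function names are also assumptions.

```python
def stoichiometric_matrix(species, reactions):
    """Entry [i][j] is the net amount of species i produced by reaction j.
    Each reaction is a (reactants, products) pair of dicts."""
    return [
        [products.get(s, 0) - reactants.get(s, 0) for reactants, products in reactions]
        for s in species
    ]

def net_change(matrix, u):
    """Matrix-vector product M u: the change in species counts when
    reaction j is executed u[j] times."""
    return [sum(row[j] * u[j] for j in range(len(u))) for row in matrix]

species = ["X1", "X2", "X3"]
# Hypothetical reactions: X1 -> X2 + 2 X3   and   2 X2 -> X1
reactions = [
    ({"X1": 1}, {"X2": 1, "X3": 2}),
    ({"X2": 2}, {"X1": 1}),
]
M = stoichiometric_matrix(species, reactions)
print(M)
print(net_change(M, [2, 1]))
```

Note that the product only records the net effect; it does not by itself say whether some ordering of those reactions is actually applicable from a given configuration.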
2.3 Stable computation with CRNs
To capture the result of computations done by a CRN, we generalize the definitions to include information about how to interpret the final configuration after letting the CRN run until the result cannot change anymore (characterized below as stable computation). Computation primarily involves two classes of functions: 1. evaluating predicates to determine properties of the input (akin to deciding a set defined by these properties), and 2. executing general functions that map an input configuration to an output, denoted as .
A chemical reaction decider (CRD) is a tuple , where is a CRN, is the set of input species, is the set of yes voters, and is the set of no voters. If , we say the CRD is all-voting. We define a global output partial function as follows. is undefined if either , or if there exist and such that and . In other words, we require a unanimous vote as our output. We say is stable if, for all such that , We say a CRD stably decides the predicate if, for any valid initial configuration with , for all configurations implies such that is stable and . We associate to a predicate the set of inputs on which outputs 1, so we can equivalently say the CRD stably decides the set
A chemical reaction computer is a tuple , where is a CRN, is the set of input species, is the output species, and is the initial context. A configuration is stable if, for every such that , i.e. the output can never change again. We say that stably computes a function if for any valid initial configuration and any implies such that is stable and where denotes restriction of to .
For a CRD or CRC with initial context and input species , we say a is a valid initial configuration if , where for all ; i.e., is the initial context plus only input species.
2.4 Time model
The following model of stochastic chemical kinetics is widely used in quantitative biology and other fields dealing with chemical reactions between species present in small counts [Gillespie77]. It ascribes probabilities to execution sequences, and also defines the time of reactions, allowing us to study the computational complexity of the CRN computation in Sections 4 and 5. If the volume is defined to be $n$, the total number of molecules, then the time model is essentially equivalent to the notion of parallel time studied in population protocols [AngluinAE2008Fast]. In this paper, the rate constants of all reactions are $1$, and we define the kinetic model with this assumption. A reaction is unimolecular if it has one reactant and bimolecular if it has two reactants. We use no higher-order reactions in this paper.
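The kinetics of a CRN is described by a continuous-time Markov process as follows. Given a fixed volume $v > 0$, the propensity of a unimolecular reaction $\alpha: X \to \ldots$ in configuration $\mathbf{c}$ is $\rho(\mathbf{c}, \alpha) = \mathbf{c}(X)$. The propensity of a bimolecular reaction $\alpha: X + Y \to \ldots$, where $X \neq Y$, is $\rho(\mathbf{c}, \alpha) = \frac{\mathbf{c}(X)\,\mathbf{c}(Y)}{v}$. The propensity of a bimolecular reaction $\alpha: X + X \to \ldots$ is $\rho(\mathbf{c}, \alpha) = \frac{1}{2} \cdot \frac{\mathbf{c}(X)\,(\mathbf{c}(X) - 1)}{v}$. The propensity function determines the evolution of the system as follows. The time until the next reaction occurs is an exponential random variable with rate $\rho(\mathbf{c}) = \sum_{\alpha} \rho(\mathbf{c}, \alpha)$ (note that $\rho(\mathbf{c}) = 0$ if no reactions are applicable to $\mathbf{c}$). The probability that the next reaction will be a particular $\alpha_{\text{next}}$ is $\frac{\rho(\mathbf{c}, \alpha_{\text{next}})}{\rho(\mathbf{c})}$.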
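The kinetic model can be sketched as a short stochastic simulation. This is a minimal illustration assuming the standard mass-action propensities with all rate constants equal to 1 and at most two reactants per reaction; the function names and the tuple-based reaction encoding are assumptions made for this example.

```python
import random

def propensity(config, reactants, volume):
    """Propensity of a reaction with rate constant 1.
    `reactants` is a tuple of one or two species names."""
    if len(reactants) == 1:
        return float(config.get(reactants[0], 0))
    a, b = reactants
    if a != b:
        return config.get(a, 0) * config.get(b, 0) / volume
    n = config.get(a, 0)
    return n * (n - 1) / (2 * volume)

def gillespie(config, reactions, volume, rng):
    """Simulate until no reaction is applicable; return (final config, elapsed time).
    Each reaction is (reactants, products), both tuples of species names.
    Only terminates if the sampled execution is finite."""
    config, t = dict(config), 0.0
    while True:
        props = [propensity(config, reactants, volume) for reactants, _ in reactions]
        total = sum(props)
        if total == 0:
            return config, t
        t += rng.expovariate(total)  # exponential waiting time with rate `total`
        reactants, products = rng.choices(reactions, weights=props)[0]
        for s in reactants:
            config[s] -= 1
        for s in products:
            config[s] = config.get(s, 0) + 1

rng = random.Random(0)
final, elapsed = gillespie({"X": 10}, [(("X", "X"), ("Y",))], volume=10, rng=rng)
print(final, elapsed)
```

The elapsed time depends on the random seed, but the final counts here do not, since the single reaction can fire exactly five times from ten copies of `X`.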
The kinetic model is based on the physical assumption of well-mixedness that is valid in a dilute solution. Thus, we assume the finite density constraint, which stipulates that a volume required to execute a CRN must be proportional to the maximum molecular count obtained during execution [SolCooWinBru08]. In other words, the total concentration (molecular count per volume) is bounded. This realistically constrains the speed of the computation achievable by CRNs.
For a CRD or CRC stably computing a predicate/function, the stabilization time is the function defined for all as the worst-case expected time to reach from any valid initial configuration of size to a stable configuration.
2.5 Semilinear sets, predicates, functions
Definition 2.2.
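A set $L \subseteq \mathbb{N}^d$ is linear if there are vectors $\mathbf{b}, \mathbf{p}_1, \ldots, \mathbf{p}_k \in \mathbb{N}^d$ such that $L = \{ \mathbf{b} + n_1 \mathbf{p}_1 + \ldots + n_k \mathbf{p}_k \mid n_1, \ldots, n_k \in \mathbb{N} \}$. A set is semilinear if it is a finite union of linear sets. A predicate $\phi: \mathbb{N}^d \to \{0,1\}$ is semilinear if the set $\phi^{-1}(1)$ is semilinear. A function $f: \mathbb{N}^d \to \mathbb{N}$ is semilinear if its graph $\{ (\mathbf{x}, y) \in \mathbb{N}^{d+1} \mid f(\mathbf{x}) = y \}$ is semilinear.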
The following is a famous characterization of the computational power of CRNs [Angluin2006ComputationalPower, chen2023rate].
Theorem 2.3 ([Angluin2006ComputationalPower, chen2023rate]).
A predicate/function is stably computable by a CRD/CRC if and only if it is semilinear.
Definition 2.4.
 is a threshold set if there are constants such that . is a mod set if there are constants such that .
The following well-known characterization of semilinear sets is useful.
Theorem 2.5 ([Ginsburg1966Semigroups]).
A set is semilinear if and only if it is a Boolean combination (union, intersection, complement) of threshold and mod sets.
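To make the characterization concrete, the sketch below represents threshold and mod sets as Python predicates and combines them with Boolean operations. The exact form of the threshold inequality is an assumption here (a weighted sum compared with `>=` against a constant); the constructor names `threshold` and `mod` are likewise illustrative.

```python
def threshold(weights, t):
    """Predicate for the set of x with sum_i weights[i] * x[i] >= t.
    The direction and strictness of the inequality are assumptions."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) >= t

def mod(weights, c, m):
    """Predicate for the set of x with sum_i weights[i] * x[i] congruent to c modulo m."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) % m == c % m

# A Boolean combination: "x1 exceeds x2, and x1 + x2 is even".
more_x1 = threshold([1, -1], 1)
even_total = mod([1, 1], 0, 2)
combined = lambda x: more_x1(x) and even_total(x)

for x in [(3, 1), (3, 2), (1, 3), (4, 4)]:
    print(x, combined(x))
```

Union, intersection, and complement of the underlying sets correspond to `or`, `and`, and `not` on the predicates.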
3 Execution bounded chemical reaction networks
In this section, we define execution bounded CRNs and state a few alternate characterizations of the definition.
Definition 3.1.
A CRN is execution bounded from configuration if all executions starting at are finite. A CRD or CRC is execution bounded if it is execution bounded from every valid initial configuration. is entirely execution bounded if it is execution bounded from every configuration.
This is a distinct concept from the notion of “bounded” CRNs studied by Rackoff [Rackoff1978CoveringBoundedness] (under the equivalent formalism of vector addition systems). That paper defines a CRN to be bounded from a configuration if is finite (and shows that the decision problem of determining whether this is true is $\mathsf{EXPSPACE}$-complete). We use the term execution bounded to avoid confusion with this concept.
We first observe that being execution bounded from implies a slightly stronger condition: there is a uniform upper bound on the length of all executions from . (Footnote 1: In other words, this rules out the possibility that, although all executions from are finite, there are infinitely many of them, each longer than the previous.)
A CRN is execution bounded from if and only if there is a constant such that all executions from have length at most . Equivalently, there are finitely many executions from .
Proof 3.2.
We use Kőnig’s lemma to show that in the absence of an infinite path, the number of all possible paths must be finite, which directly implies a global bound on the length of all executions. We represent the set of all executions for as a tree where each edge represents a single reaction applied and each node stores the complete execution sequence starting from configuration . Note that this construction is slightly different from a more straightforward graph with the reachable states as nodes, which would not give us a tree, since the same state can be reached by different executions. Formally, we generate the tree as follows: where , }. In other words, all the executions from of length are the nodes at depth of this tree. One can think of the nodes as being labeled by configurations rather than executions (specifically the final configuration of the execution, with the tree rooted at ), but the same configuration can label multiple nodes if it can be reached from via different executions. In this case the children of a configuration are those that are reachable from it by applying a single reaction.
This tree is finitely branching, as we can only choose from a finite number of reactions at any node. By definition of execution bounded, there is no execution sequence with an infinite length. Due to the bijection between paths in and executions possible in , there is no infinite path in the tree. By Kőnig’s Lemma, the tree has a finite number of nodes, guaranteeing a single bound (the depth of the tree) on the length of every execution.
The next lemma characterizes execution boundedness as equivalent to having a finite reachable state space with no cycles.
Lemma 3.3.
A CRN is execution bounded from if and only if is finite and, for all , except by the zero-length execution.
Proof 3.4.
Every configuration reachable from is reached through some execution contained in as a node, and there are only finitely many such executions (by the observation above). Multiple distinct executions can produce the same configuration, but one execution cannot produce multiple configurations. Thus, there exists a surjection from the nodes of onto , and must also be finite. For the second part of the condition, we prove its contrapositive and assume there exists , . Let . It holds that and . We can construct an infinite-length execution , which must also be valid under the reactions of , making not execution bounded from .
If is finite and contains no such , then we can construct a finite, directed, acyclic graph where , . The longest path in the graph has length of at most . A bijection exists between paths in and executions possible in starting from . We set satisfying that each execution has length of at most , making execution bounded.
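The finite acyclic reachability graph in this argument can be explored directly for small examples. The sketch below is an illustration under stated assumptions: configurations are tuples of counts in a fixed species order, reactions are pairs of reactant and product vectors, and the function `longest_execution` (a hypothetical name) returns `None` both when it finds a cycle and when it gives up after exploring `limit` configurations, so `None` means "not shown to be execution bounded".

```python
class _NotBounded(Exception):
    """Raised internally when a cycle is found or the exploration limit is hit."""

def longest_execution(start, reactions, limit=10_000):
    """Length (number of reactions) of the longest execution from `start`,
    or None if a cycle is found or more than `limit` configurations are seen."""
    memo = {}        # configuration -> longest execution length from it
    on_path = set()  # configurations on the current depth-first path

    def successors(c):
        for reactants, products in reactions:
            if all(ci >= ri for ci, ri in zip(c, reactants)):
                yield tuple(ci - ri + pi for ci, ri, pi in zip(c, reactants, products))

    def visit(c):
        if c in memo:
            return memo[c]
        if c in on_path or len(memo) + len(on_path) >= limit:
            raise _NotBounded()
        on_path.add(c)
        best = 0
        for d in successors(c):
            best = max(best, 1 + visit(d))
        on_path.discard(c)
        memo[c] = best
        return best

    try:
        return visit(tuple(start))
    except _NotBounded:
        return None

# Species order (X, Y); the single reaction 2X -> Y.
print(longest_execution((5, 0), [((2, 0), (0, 1))]))
# Species order (A, B); the reversible pair A -> B and B -> A.
print(longest_execution((1, 0), [((1, 0), (0, 1)), ((0, 1), (1, 0))]))
```

The recursion depth grows with the length of the longest execution, so this is only suitable for small configurations.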
The following result is used frequently in impossibility proofs for CRNs and population protocols, and it will help us prove another characterization of execution bounded CRNs in Section 3.
Lemma 3.5.
(Dickson’s Lemma) For every infinite sequence $\mathbf{x}_0, \mathbf{x}_1, \ldots \in \mathbb{N}^k$ of nonnegative integer vectors, there are $i < j$ such that $\mathbf{x}_i \leq \mathbf{x}_j$.
We first observe an equivalent characterization of execution boundedness that will be useful in the negative results of Section 6.
Definition 3.6.
An execution is self-covering if for some , . It is strictly self-covering if . We also refer to these as (strict) self-covering paths. (Footnote 2: Rackoff [Rackoff1978CoveringBoundedness] uses the term “self-covering” to mean what we call strictly self-covering here, and points out that Karp and Miller [karp1969parallel] showed that is infinite if and only if there is a strictly self-covering path from . The distinction between these concepts is illustrated by the CRN . From any configuration , is finite (), and there is no strict self-covering path. However, from (say) , there is a (nonstrict) self-covering path , and by repeating, this CRN has an infinite cycling execution within its finite configuration space .)
A CRN is execution bounded from if and only if there is no self-covering path from .
Proof 3.7.
For the forward direction, assume there is a self-covering path from , which reaches to and later to . Then the reactions going from to can be repeated indefinitely (in a cycle if , and increasing some molecular counts unboundedly if ), so is not execution bounded from .
For the reverse direction, assume is not execution bounded from . Then there is an infinite execution By Dickson’s Lemma there are such that , i.e., is self-covering.
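The self-covering condition is easy to check on a concrete finite prefix of an execution. The sketch below assumes configurations are tuples of counts in a fixed species order; the function name `find_self_covering` is an assumption for this illustration.

```python
def find_self_covering(execution):
    """Search a finite list of configurations for indices i < j with
    execution[i] <= execution[j] coordinate-wise.
    Returns (i, j, strict) for the first such pair found, or None.
    `strict` is True when the two configurations differ."""
    for j in range(len(execution)):
        for i in range(j):
            if all(a <= b for a, b in zip(execution[i], execution[j])):
                return i, j, execution[i] != execution[j]
    return None

# Repeating a reaction that turns one X into two X: counts 1, 2, 3 (strictly self-covering).
print(find_self_covering([(1,), (2,), (3,)]))
# A reversible pair A -> B -> A from one A, species order (A, B): a cycle, not strict.
print(find_self_covering([(1, 0), (0, 1), (1, 0)]))
# The reaction 2X -> Y from four X, species order (X, Y): no self-covering pair.
print(find_self_covering([(4, 0), (2, 1), (0, 2)]))
```

By Dickson's Lemma, every sufficiently long prefix of an infinite execution makes this search succeed, which is the content of the reverse direction of the proof.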
4 Execution bounded CRDs stably decide all semilinear sets
In this section, we will show the computational equivalence between execution bounded and execution unbounded CRNs by a construction. The following is the main result of this section.
Theorem 4.1.
Exactly the semilinear sets are stably decidable by execution bounded CRDs. Furthermore, each can be stably decided with expected stabilization time .
Since semilinear sets are Boolean combinations of mod and threshold predicates, we prove this theorem by showing that execution bounded CRDs can decide mod and threshold sets individually as well as any Boolean combination in the following lemmas. To ensure execution boundedness in the last step, we require the following property.
Definition 4.2.
Let be a CRD with voting species . We say is single-voting if for any valid initial configuration and any s.t. , , i.e., exactly one voter is present in every reachable configuration.
Every mod set is stably decidable by an execution bounded, single-voting CRD with stabilization time .
We design a CRD with exactly one leader present at all times, cycling through “states” while consuming the input and accepting on state . Let be the set of input species and start with only one leader, i.e. set the initial context and for all other species. For each add the following reaction: Let only vote yes and all other species no, i.e. . For any valid initial configuration, reaches a stable configuration which votes yes if and only if the input is in the mod set, and no otherwise.
Proof 4.3.
terminates with the correct output value: At any point in time, there is a single leader present (the initial configuration contains a single leader and each reaction produces and consumes one). Every reaction satisfies the following invariant (for the leader’s subscript ): where is the updated count of species in the current configuration. By design of , there will be a reaction applicable as long as there are copies of (a leader with any subscript can react with any ). After applying this reaction as often as possible, we have reached a stable configuration with as the only species present.
is execution bounded: Every reaction reduces the count of chemicals by one. Every possible execution contains exactly configurations, where is the number of all molecules in the starting configuration.
is single-voting: Initially, is present and the only voter. Every valid input contains no voter and every reaction results in no change to the count of copies of .
stabilizes in time: We start with in volume . reactions must occur before terminates. For the first reaction, we have a rate of , for the last (with only the leader and one present), our rate will be . Thus, the expected time for all reactions to complete is
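The leader-driven construction and its time analysis can be sketched as follows. The concrete reaction form is an assumption, since it is not legible in the text above: a leader in state `i` that meets an input molecule of type `j` moves to state `(i + w_j) mod m` and consumes the input. The time estimate assumes volume `n` and one leader, so that with `k` inputs left the next reaction has rate `k / n`. All function names are illustrative.

```python
import random

def mod_crd(x, weights, c, m, rng):
    """Simulate a single leader consuming all input molecules in random order.
    x[j] is the count of input species j. Returns (vote, number of reactions)."""
    pool = [j for j, count in enumerate(x) for _ in range(count)]
    rng.shuffle(pool)  # the scheduler may present inputs in any order
    state, steps = 0, 0
    for j in pool:
        state = (state + weights[j]) % m  # assumed reaction: L_state + X_j -> L_new
        steps += 1
    return state == c % m, steps

def expected_time(n):
    """Sum of expected waiting times n/k for k = n, ..., 1 remaining inputs,
    which equals n times the n-th harmonic number."""
    return sum(n / k for k in range(1, n + 1))

rng = random.Random(1)
vote, steps = mod_crd((3, 2), weights=(1, 2), c=1, m=3, rng=rng)
print(vote, steps, expected_time(5))
```

Because every reaction consumes one input molecule, the number of reactions always equals the input size, matching the execution-boundedness argument; the harmonic sum grows like `n log n`.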
Every threshold set is stably decidable by an execution bounded, single-voting CRD with stabilization time .
We design a CRD which multiplies the input molecules according to their weight and consumes positive and negative units alternatingly using a single leader. Once no more reaction is applicable, the leader’s state will indicate whether or not there are positive units left and the threshold is met. Let be the set of input species and the yes voter. We first add reactions to multiply the input species by their respective weights. For all , add the reaction:
(5) |
and represent “positive” and “negative” units respectively. Now add reactions to consume and alternatingly using a leader until we run out of one species:
(6) | ||||
(7) |
Finally, initialize the CRD with one and the threshold number copies of (or if is negative), i.e. , if , or if , and for all other species. For any valid initial configuration, reaches a stable configuration which votes yes if and only if the weighted sum of inputs is above the threshold, and no otherwise.
Proof 4.4.
is single-voting since it starts with a single leader and no reaction changes the count of molecules.
stabilizes in time: First, all input species will be converted to instances of or . We run these reactions until no . As they are independent of molecules other than the reactant, these reactions have a rate of , so the expected time until the next reaction is . The total time for reactions (5) to complete is therefore . The time for reactions (6) and (7) on the other hand is asymptotically dominated by the last reaction, where and , where , so . Let be the counts of and assume without loss of generality . We get:
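The alternating-consumption idea can be checked with a small randomized simulation. Several details below are assumptions, because the reactions themselves are not legible in the text above: the unit species are called `P` and `N`, the leader starts in the yes state, a yes leader consumes one `N` and flips to no, a no leader consumes one `P` and flips to yes, and the decided predicate is taken to be "weighted sum `>=` threshold".

```python
import random

def threshold_crd(x, weights, t, rng):
    """Simulate the sketch: inputs are converted to |w_j| positive or negative
    units, and a single leader alternately consumes N and P units.
    Returns (final vote, number of reactions)."""
    inputs = list(x)
    p_units = max(-t, 0)  # a negative threshold is encoded as initial P units
    n_units = max(t, 0)   # a positive threshold is encoded as initial N units
    leader_yes, steps = True, 0
    while True:
        enabled = [("convert", j) for j, count in enumerate(inputs) if count > 0]
        if leader_yes and n_units > 0:
            enabled.append(("eat_n", None))
        if not leader_yes and p_units > 0:
            enabled.append(("eat_p", None))
        if not enabled:
            return leader_yes, steps
        kind, j = rng.choice(enabled)  # arbitrary scheduling of applicable reactions
        if kind == "convert":
            inputs[j] -= 1
            if weights[j] > 0:
                p_units += weights[j]
            else:
                n_units += -weights[j]
        elif kind == "eat_n":
            n_units -= 1
            leader_yes = False
        else:
            p_units -= 1
            leader_yes = True
        steps += 1

rng = random.Random(3)
print(threshold_crd((4, 1), weights=(1, -2), t=2, rng=rng)[0])
```

Every reaction either consumes an input molecule or consumes a unit without producing new ones, so each run terminates; the final vote depends only on the total unit counts, not on the order in which reactions were scheduled.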
Lemma 4.5.
If sets are stably decided by some execution bounded, single-voting CRD, then so are , and with stabilization time .
Proof 4.6.
To stably decide , swap the yes and no voters.
For and , consider a construction where we decide both sets separately and record both of their votes in a new voter species. For this, we allow the set of all voters to be a strict subset of all species. We first add reactions to duplicate our input with reactions of the form
(8) |
by two separate CRDs. Subsequently, we add reactions to record the separate votes in one of four new voter species: . The first and second CRN determine the first and second subscript respectively. For and if are leaders of and respectively, add the reactions:
(9) | |||
(10) |
E.g. if is the no voter of the first CRD, we would add and . We let the yes voters be: to stably decide or to stably decide .
Reaction (8) will complete in time and is clearly execution bounded since the input is finite and not produced in any reaction. Consequently, two separate CRNs run in time as shown in the two preceding lemmas. After stabilization of the parallel CRNs, we expect reactions (9) and (10) to happen exactly once. Each molecule involved is a leader and has count in volume . This leads to a rate of , so the expected time for one reaction to happen is . It is important to note that reactions (9) and (10) do not result in unbounded executions due to the unanimous vote in parallel CRDs. In both mod sets and threshold sets, the leader changes its vote a maximum of times, with only one leader ever present at any time. Again, we start with only one voter present initially and no reaction changes the count of voters, making our construction single-voting.
Since semilinear predicates are exactly Boolean combinations of threshold and mod predicates, the mod-set and threshold-set lemmas above, together with Lemma 4.5, imply the following.
Theorem 4.7.
Every semilinear set is stably decidable by an execution bounded, single-voting CRD, with stabilization time
We can also prove the same result for all-voting CRDs. Note, however, that such CRDs cannot be “composed” using the constructions of Lemma 4.5 and Theorem 5.5, which crucially rely on the assumption that the CRDs being used as “subroutines” are single-voting.
Every semilinear set is stably decidable by an execution bounded, all-voting CRD, with stabilization time
Proof 4.8.
By Theorem 4.7, every semilinear set is stably decided by a single-voting CRD. We convert this to an all-voting CRD, where every species is required to vote yes or no, by “propagating” the final vote (recorded in the single voter voting no or voting yes) back to all other molecules. A superscript indicates the “global” decision. The execution boundedness proven in Lemma 4.5 ensures that the leader propagates the final vote only finitely many times. For each vote and each voter voting , and all other species , replace species with two versions and , and add reactions:
(11) |
The original reactions of the CRD must also be replaced with “functionally identical” reactions for the new versions of species. For example, the reaction becomes
In the middle two cases we can pick the superscripts of the products arbitrarily, whereas in the first and last case, we must choose the product votes to match those of the reactants to ensure stable states remain stable.
A vote change of the leader leads to the propagation of the vote to at most molecules once using reaction (11). This reaction dominates the runtime, as a single molecule is required to interact with each other molecule. We cannot speed this process up using an epidemic style process as conflicting votes would make the CRN execution unbounded. The original CRD takes time to converge on a correct output for the single voter . At that point, a standard coupon collector argument shows that the voter takes expected time to correct the votes of all other species via reaction (11).
5 Execution bounded CRCs stably compute all semilinear functions
In this section we shift focus from computing Boolean-valued predicates to integer-valued functions , showing that execution bounded CRCs can stably compute the same class of functions (semilinear) as unrestricted CRCs.
Similar to [Chen2012DeterministicFunction, Doty2013Leaderless], we compute semilinear functions by decomposing them into “affine pieces”, which we will show can be computed by execution bounded CRNs and combined by using semilinear predicates to decide which linear function to apply for a given input. (Footnote 3: While this proof generalizes to multivariate output functions as in [Chen2012DeterministicFunction, Doty2013Leaderless], to simplify notation we focus on single output functions. Multi-valued functions can be equivalently thought of as separate single output functions , which can be computed in parallel by independent CRCs.)
We say a partial function is affine if there exist vectors , with and a nonnegative integer such that . This definition of affine function may appear contrived, but the main utility of the definition is that it satisfies Section 5. For convenience, we can work only with integer-valued molecule counts by multiplying by after the dot product, where may be taken to be the least common multiple of the denominators of the rational coefficients in the original definition such that :
We say that a partial function is a diff-representation of if and, for all , if , then , and . In other words, represents as the difference of its two outputs and , with the larger output possibly being larger than the original function’s output, but at most a multiplicative constant larger [Doty2013Leaderless].
Lemma 5.1.
Let be an affine partial function. Then there is a diff-representation of and an execution bounded CRC that monotonically stably computes in expected time .
Proof 5.2.
Define a CRC with input species and output species . We need to ensure that after stabilizing,
To account for the offset, start with copies of .
For the offset, we must reduce the number of by . Since the result will be used in the next reaction, we want to produce a new species and require to not be consumed during the computation. We achieve this by adding reactions that let consume itself times (keeping track with a subscript) and converting to once has been reached. For each and , if , add the reaction
(12) |
If , add the reaction
(13) |
Runtime: In volume , the rate of reactions (12) and (13) would be ( molecules have the chance to react with any of the others), so the expected time for the next reaction is . The expected time for the whole process is . Further, the reactions are execution bounded since both strictly decrease the number of their reactants and exactly reactions will happen.
To account for the coefficient, we multiply by , then divide by using similar reactions as for the subtraction. To multiply by , add the following reaction for each :
(14) |
For each , if , add the reactions
(15) | |||
(16) |
If , add the reactions
(17) | |||
(18) |
Reactions (14) complete in expected time , while (17) and (18) complete in by a similar analysis as for the first two reactions. As for execution boundedness, (14) is only applicable once for every ; all other reactions start with a number of reactants which are a constant factor of and decrease the count of their reactants by one in each reaction.
We require the following result due to Chen, Doty, Soloveichik [Chen2012DeterministicFunction], guaranteeing that any semilinear function can be built from affine partial functions.
Lemma 5.3 ([Chen2012DeterministicFunction]).
Let be a semilinear function. Then there is a finite set of affine partial functions, where each is a linear set, such that, for each , if is defined, then , and .
We strengthen Lemma 5.3 to show we may assume each is disjoint from the others. This is needed not only to prove Theorem 5.5, but to correct the proof of Lemma 4.4 in [Chen2012DeterministicFunction], which implicitly assumed the domains are disjoint.
Let be a semilinear function. Then there is a finite set of affine partial functions, where each is a linear set, and for all , such that, for each , if is defined, then , and .
Proof 5.4.
By [Ito1969SemilinearSetsFiniteUnionDisjointLinearSets, Theorem 2], every semilinear set is a finite union of disjoint fundamental linear sets. The author defines a linear set as fundamental if span a -dimensional vector space in , i.e., all vectors are linearly independent in . (Footnote 4: This distinction is significant because not all integer-valued linear sets can be represented using solely linearly independent vectors. An illustrative example is , as discussed in [Chen2012DeterministicFunction]. The vectors are not linearly independent in , yet this set cannot be expressed with fewer than three basis vectors.) The proof of Lemma 5.3 in [Chen2012DeterministicFunction] shows that each linear set comprising the semilinear graph of corresponds to one partial affine function . The fact that [Ito1969SemilinearSetsFiniteUnionDisjointLinearSets, Theorem 2] lets us assume each is disjoint from the others immediately implies that each is disjoint from the others.
The next theorem shows that semilinear functions can be computed by execution bounded CRCs in expected time .
Theorem 5.5.
Let be a semilinear function. Then there is an execution bounded CRC that stably computes with stabilization time , in expectation and with probability at least .
Proof 5.6.
We employ the same construction as [Chen2012DeterministicFunction] with minor alterations. Define a CRC with input species and output species . By the disjoint-domain strengthening of Lemma 5.3, we decompose our semilinear function into partial affine functions (with linear, disjoint domains), which can be computed in parallel by Lemma 5.1. Further, we decide which function to use by computing the predicate “” (Theorem 4.7). We interpret each and as an “inactive” version of “active” output species and . Let be the yes and no voters respectively voting whether lies in the domain of the -th partial function. Now, we convert the function result of the applicable partial affine function to the global output by adding the following reactions for each .
(19) | ||||
(20) | ||||
(21) |
Reaction (19) produces an output copy of species and (20) and (21) reverse the first reaction using only bimolecular reactions. Both are catalyzed by the vote of the -th predicate result. Also add reactions
(22) | |||
(23) |
and
(24) | ||||
(25) |
Reactions (22) and (23) activate and deactivate the “negative” output values and reactions (24) and (25) allow two active partial outputs to cancel out and consume the excess in the process. When the input is in the domain of function , exactly one copy of will be present, otherwise one copy of . Since we know that the predicate computation is execution bounded and produces at most one voter, the catalytic reaction will also happen at most as often as the leader changes its vote. Therefore, it is also execution bounded.
6 Limitations of execution bounded CRNs
The main positive results of the paper (the theorem that execution bounded CRDs decide all semilinear sets, and Theorem 5.5) rely on the assumption that valid initial configurations have a single leader (in particular, they are execution bounded only from configurations with a single leader, but not from arbitrary configurations). Section 4 shows that we may assume the CRD deciding a semilinear set is all-voting. However, for the “constructive” results (the Boolean closure lemma for CRDs and Theorem 5.5), which compose the output of a CRD with downstream computation, using as a “subroutine” to stably compute a more complex set/function, the constructions crucially use the assumption that is single-voting (i.e., only the leader of votes) to argue the resulting composed CRN is execution bounded. In this section we show these assumptions are necessary, proving that execution bounded CRNs without those constraints are severely limited in their computational abilities.
We show that entirely execution bounded CRNs (from every configuration) can be characterized by a simpler property of having a “linear potential function” that essentially measures how close the CRN is to reaching a terminal configuration. We then use this characterization to prove that entirely execution bounded CRNs can stably decide only limited semilinear predicates (eventually constant, Definition 6.11), assuming all species vote, and that molecular counts cannot decrease to in stable configurations (see Definition 6.8).
6.1 Linear potential functions
We define a linear potential function of a CRN to be a nonnegative linear function of states that each reaction strictly decreases.
Definition 6.1.
A linear potential function for a CRN is a nonnegative, linear function of configurations, such that for each reaction ,
Note that since is required to be nonnegative on all configurations , it must be nondecreasing in each species, i.e., all coefficients must be nonnegative (though some are permitted to be 0). Intuitively, we can think of as assigning a nonnegative “mass” to each species (the mass of is ), such that each reaction removes a positive amount of mass from the system.
Remark 6.2.
A system of linear inequalities with rational coefficients has a real solution if and only if it has a rational solution. For any homogeneous system (where all inequalities are comparing to 0), any positive scalar multiple of a solution is also a solution. By clearing denominators, a system has a rational solution if and only if it has an integer solution. Thus, one can equivalently define a linear potential function to be a function such that each , i.e., we may assume . In particular, since is decreased by each reaction, it is decreased by at least 1.
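The denominator-clearing step of Remark 6.2 can be illustrated with a short sketch. The function name `clear_denominators` and the sample masses are our own hypothetical choices, not part of the paper; the sketch only shows that scaling a rational solution of a homogeneous system by the least common multiple of its denominators yields an integer solution.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9+


def clear_denominators(solution):
    """Scale a rational vector by the LCM of its denominators.

    For a homogeneous system, a positive multiple of a solution is
    again a solution, so the returned integer vector still solves it.
    """
    fracs = [Fraction(v) for v in solution]
    scale = lcm(*(f.denominator for f in fracs))
    return [int(f * scale) for f in fracs]


# Hypothetical homogeneous system (an assumption for illustration):
#   v_C - v_A - v_B < 0   and   v_A - v_C < 0
rational = [Fraction(1, 2), Fraction(2, 3), Fraction(5, 6)]  # (v_A, v_B, v_C)
integral = clear_denominators(rational)
v_a, v_b, v_c = integral
```

After scaling, every strict inequality between integers holds with slack at least 1, which is the sense in which each reaction decreases the potential by at least 1.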
A CRN may or may not have a linear potential function. Although it is not straightforward to “syntactically check” a CRN to see if it has a linear potential function, it is efficiently decidable: a CRN has a linear potential function if and only if the following system of linear inequalities has a solution (which can be solved in polynomial time using linear programming techniques; the variables to solve for are the for each ), where the ’th reaction has reactants and products , and species has mass :
For example, for the reactions and , for each reaction to strictly decrease the potential function , must satisfy and . In this case, works.
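As a minimal sketch of what the inequalities require, the following checks whether a given candidate mass assignment is a linear potential function: all masses are nonnegative, and for every reaction the total mass of the products is strictly less than that of the reactants. The reactions `A + B -> C` and `C -> A`, the masses, and all function names are hypothetical choices for illustration, not the example from the text. The sketch only verifies a candidate; finding one in general would use a linear programming solver.

```python
def potential(masses, config):
    """Total mass of a configuration given as {species: count}."""
    return sum(masses[s] * n for s, n in config.items())


def is_linear_potential(masses, reactions):
    """Check nonnegativity and strict decrease under every reaction.

    reactions is a list of (reactants, products) pairs, each a
    {species: count} dictionary.
    """
    if any(m < 0 for m in masses.values()):
        return False
    return all(
        potential(masses, products) < potential(masses, reactants)
        for reactants, products in reactions
    )


# Hypothetical CRN: A + B -> C and C -> A.
reactions = [({"A": 1, "B": 1}, {"C": 1}), ({"C": 1}, {"A": 1})]
masses = {"A": 1, "B": 2, "C": 2}
ok = is_linear_potential(masses, reactions)

# A reversible pair A -> B, B -> A admits no potential; this candidate fails.
reversible = [({"A": 1}, {"B": 1}), ({"B": 1}, {"A": 1})]
bad = is_linear_potential({"A": 1, "B": 1}, reversible)
```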
The following is a variant of Farkas’ Lemma [farkas1902theorie], one of several similar “Theorems of the Alternative” stating that exactly one of two different linear systems has a solution. (See [mangasarian1994nonlinear, Chapter 4] for a list of such theorems.) A proof can be found in [gale1960theory, Theorem 2.10].
Theorem 6.3.
Let be a real matrix. Exactly one of the following statements is true.
-
1.
There is a vector such that .
-
2.
There is a vector such that .
We require the following discrete variant of Theorem 6.3. The geometric intuition of this version is illustrated in Figure 1.
Let be a rational matrix. Exactly one of the following statements is true.
-
1.
There is an integer vector such that .
-
2.
There is an integer vector such that .
Proof 6.4.
For convenience when we use this discrete variant in proving Theorem 6.5, we swapped the roles of and in left- vs. right-multiplication with ; the real-valued version of the statement is equivalent to Theorem 6.3 by taking the transpose of .
To see that we may assume the vectors are integer-valued if is rational-valued, recall that a system of linear equalities/inequalities with rational coefficients has a solution if and only if it has a rational solution. Since the system is homogeneous (the matrix-vector product is compared to the zero vector ), any multiple of a solution is also a solution. By clearing denominators, it has a rational solution if and only if it has an integer solution.
Although we do not need the following fact, it is worthwhile to observe that, if is integer-valued (as in our application), then the solution or (whichever exists) in Section 6.1 has entries that are at most exponential in i.e., at most exponential in the sum of absolute values of entries of (see e.g., [papadimitriou1981complexity]). So in particular when we consider having small size entries, this means the solution or has entries that are at most exponential in the number of rows and columns of . When is a stoichiometric matrix, this corresponds to the number of species and reactions, respectively, of the CRN.
The discrete variant of Theorem 6.3 above will help us prove the following theorem characterizing CRNs with bounded executions from all configurations. Theorem 6.5 is used in this paper only to prove Theorem 6.9 and the main negative result of Section 6.3, but it may also be of independent interest, since it equates a “global, infinitary, difficult-to-check” condition (bounded executions from all configurations) with a “local, easy-to-check” condition (having a linear potential function).
Theorem 6.5.
A CRN has a linear potential function if and only if it is entirely execution bounded.
Proof 6.6.
Let be a CRN. The forward direction is easy: assuming has potential function , since each reaction decreases by at least 1 (see Remark 6.2), starting from configuration , we can execute at most reactions while keeping nonnegative. Thus is entirely execution bounded.
To see the reverse direction, assume that is execution bounded from every configuration, and let be the stoichiometric matrix of . We claim there is no integer vector satisfying ; for the sake of contradiction suppose otherwise. Interpreting as counts of reactions to execute, for any sufficiently large configuration , all reactions in can be applied (in arbitrary order), and the vector describes the resulting change in species counts, reaching configuration . Since , this path is self-covering, i.e., . But since is execution bounded from every configuration, by Section 3, has no self-covering path from any configuration, a contradiction. This establishes the claim that has no integer solution .
By the discrete variant of Theorem 6.3, there is an integer vector such that . Let be the coefficients of a linear function , i.e., . Then the vector represents the amount changes by one unit of each reaction, i.e., is the amount increases when executing reaction once. Since , this means that every reaction strictly decreases , i.e., is a linear potential function for .
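Both directions of the proof can be made concrete with a small sketch. We assume a hypothetical CRN (`A + B -> C`, `C -> A`) and integer masses `y = (1, 2, 2)`; none of these names or values come from the paper. The code builds the stoichiometric matrix (species by reactions), computes the per-reaction change in potential as the product of `y` with that matrix, and then runs an arbitrary execution to confirm that the number of reactions never exceeds the initial potential.

```python
import random

SPECIES = ["A", "B", "C"]
# Hypothetical reactions as (reactants, products) pairs.
REACTIONS = [({"A": 1, "B": 1}, {"C": 1}), ({"C": 1}, {"A": 1})]


def stoichiometric_matrix(species, reactions):
    """Rows are species, columns are reactions, entries are net changes."""
    return [[p.get(s, 0) - r.get(s, 0) for r, p in reactions] for s in species]


def potential_changes(y, matrix):
    """Entry j is the change in potential caused by one unit of reaction j."""
    return [
        sum(y[i] * matrix[i][j] for i in range(len(y)))
        for j in range(len(matrix[0]))
    ]


def run_to_termination(config, reactions, rng):
    """Apply arbitrary applicable reactions until none applies; count steps."""
    config = dict(config)
    steps = 0
    while True:
        applicable = [
            (r, p) for r, p in reactions
            if all(config.get(s, 0) >= n for s, n in r.items())
        ]
        if not applicable:
            return steps
        r, p = rng.choice(applicable)
        for s, n in r.items():
            config[s] -= n
        for s, n in p.items():
            config[s] = config.get(s, 0) + n
        steps += 1


y = [1, 2, 2]  # assumed masses of A, B, C
M = stoichiometric_matrix(SPECIES, REACTIONS)
changes = potential_changes(y, M)

start = {"A": 3, "B": 4, "C": 2}
bound = sum(y[i] * start[s] for i, s in enumerate(SPECIES))
steps = run_to_termination(start, REACTIONS, random.Random(0))
```

Since every entry of `changes` is at most -1, the loop must terminate within `bound` steps no matter which applicable reaction is chosen.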
Remark 6.7.
By employing the real-valued Theorem 6.3 in place of its discrete variant, the above proof also shows that Theorem 6.5 holds for the continuous model of CRNs [chen2023rate], in which species amounts are modeled as continuous nonnegative real concentrations. In this case, a continuous CRN would be defined to be execution bounded from configuration if each reaction can be executed by at most a finite (real-valued) amount from .
6.2 Impossibility of stably deciding majority and parity
In this section, we prove Theorem 6.9, which is a special case of our main negative result in Section 6.3. We give a self-contained proof of Theorem 6.9 because it is simpler and serves as an intuitive warmup to some of the key ideas used in proving the main result, without the complexities of dealing with arbitrary semilinear sets.
Theorem 6.9 shows a limitation on the computational power of entirely execution bounded, all-voting CRNs, but it requires an additional constraint on the CRN for the result to hold (and we later give counterexamples showing that this extra hypothesis is provably necessary), described in the following definition.
Definition 6.8.
Let be a CRD. The output size of is the function defined , the size of the smallest stable configuration reachable from any valid initial configuration of size . A CRD is non-collapsing if .
Put another way, is collapsing if there is a constant such that, from infinitely many initial configurations , can reach a stable configuration of size at most . All population protocols are non-collapsing, since every reaction preserves the configuration size.
Theorem 6.9.
No noncollapsing, all-voting, entirely execution bounded CRD can stably decide the majority predicate or the parity predicate .
Proof 6.10.
Let be a CRD obeying the stated conditions, and suppose for the sake of contradiction that stably decides the majority predicate (so ).
We consider the sequence of stable configurations defined as follows. Let be a stable configuration reachable from initial configuration ; since the correct answer is yes, all species present in vote yes. Now add a single copy of . By additivity, the configuration is reachable from , for which the correct answer in this case is no. Thus, since stably decides majority, from , a stable “no” configuration is reachable; call this . Now add a single . Since the correct answer is yes, from a stable “yes” configuration is reachable, call it .
Continuing in this way, we have a sequence of stable configurations where all species in vote yes and all species in vote no. Since is noncollapsing, the size of the configurations and increases without bound as . (Possibly , i.e., the size is not necessarily monotonically increasing, but for all sufficiently large , we have .) Since all species vote, for some constant , to get from to , at least reactions must occur. This is because all species in must be removed since they vote yes, and each reaction removes at most molecules. (Concretely, let , i.e., 1 over the most net molecules consumed in any reaction.) Similarly, to get from to , at least reactions must occur.
Since is entirely execution bounded, by Theorem 6.5, has a linear potential function , where . Adding a single to increases by the constant . Since grows without bound, the number of reactions to get from to increases without bound as , and since each reaction strictly decreases by at least 1, the total change in that results from adding and then going from to is unbounded in , so unboundedly negative for sufficiently large (negative once is large enough that ). Similarly, adding a single to and going from to , the resulting total change in is unbounded and (eventually) negative.
starts this process at the constant . Before and are large enough that and (i.e., large enough that the net change in is negative resulting from adding a single input and going to the next stable configuration), could increase, if (resp. ) is larger than the net decrease in due to following reactions to get from to (resp. from to ).
However, since is noncollapsing, this can only happen for a constant number of (so never reaches more than a constant above its initial value ), after which strictly decreases after each round of this process. At some point in this process, will not be able to reach all the way to the next or without becoming negative, a contradiction.
The argument for parity is similar, but instead of alternating adding then , in each round we always add one more to flip the correct answer.
Theorem 6.9 is false without the noncollapsing hypothesis. The following collapsing, leaderless (but all-voting and entirely execution bounded) CRD stably decides majority: Species vote yes, while vote no:
It has bounded executions from any configuration: of the first reaction can occur, and the other reactions decrease molecular count, so are limited by the total configuration size. However, it is collapsing since a stable configuration of size 1 is always reachable. Theorem 6.9 is similarly false without the all-voting hypothesis; for each of the reactions with one product above, add another non-voting product . This converts the CRD to be noncollapsing but not all-voting. Of course, the execution bounded hypothesis is also necessary: the original population protocols paper [angluin2004computation] showed that all-voting, noncollapsing, leaderless population protocols can stably decide all semilinear predicates.
The following collapsing, all-voting, leaderless (but entirely execution bounded) CRD stably decides parity. Let the input species be named . Species votes yes, votes no:
It has bounded executions from any configuration: exactly reactions can occur since each reduces by 1. Similar to above, by adding the non-voting product to each reaction above, this CRD becomes noncollapsing but not all-voting, showing that the all-voting hypothesis is also necessary for stably deciding parity.
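To illustrate how a collapsing, all-voting CRD can decide parity with bounded executions, the following sketch simulates one hypothetical CRD of this kind. The species names `O` (votes yes, for odd) and `E` (votes no) and the reactions `O + O -> E`, `O + E -> O`, `E + E -> E` are our own assumptions and are not necessarily the reactions intended in the text. Each reaction reduces the molecular count by exactly 1, so from n molecules exactly n - 1 reactions occur and a single voter remains.

```python
import random

# Hypothetical reactions, keyed by the sorted pair of reactants.
REACTIONS = {("O", "O"): "E", ("E", "O"): "O", ("E", "E"): "E"}


def run_parity(n, rng):
    """Start with n >= 1 copies of input species O; react random pairs."""
    molecules = ["O"] * n
    steps = 0
    while len(molecules) >= 2:
        rng.shuffle(molecules)
        a = molecules.pop()
        b = molecules.pop()
        molecules.append(REACTIONS[tuple(sorted((a, b)))])
        steps += 1
    return molecules, steps


results = {n: run_parity(n, random.Random(n)) for n in range(1, 21)}
```

The parity of the count of `O` is invariant under all three reactions, which is why the last remaining molecule records the parity of the input regardless of the order of collisions.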
6.3 Impossibility of stably deciding not eventually constant predicates
We now present our main negative result, which generalizes Theorem 6.9 to show that such CRNs can stably decide only trivial (eventually constant) predicates.
Definition 6.11.
Let be a predicate. We say is eventually constant if there is such that is constant on , i.e., either or .
In other words, although may have an infinite number of each output, “sufficiently far from the boundary of the positive orthant” (where all coordinates exceed ), only one output appears. See Figure 2 for a 2D example.
For any set and , write to denote the set which is translated by vector . Let denote the unit vector in direction , i.e., and for .
Definition 6.12.
We say is periodic if, for some , for some finite set , . We say is the period of and say that is -periodic. Equivalently, is -periodic if, for all and all unit vectors , .
In other words, is periodic if it is a union of copies of a finite subset of the hypercube with a corner at the origin, translated in each direction by every nonnegative integer multiple of the hypercube’s width. See Figure 3. Note that if is -periodic, then it is also -periodic for every positive integer multiple of .
Lemma 6.13.
Let be a Boolean combination of mod sets. Then is periodic.
Proof 6.14.
We prove this by induction on the number of mod sets. For the base case, let be a single mod set, where and are constants. Letting and in Definition 6.12 works. Let . Then for all , , so , meaning that , so is -periodic.
The inductive case amounts to showing that periodic sets are closed under Boolean operations of union, intersection, and complement. Clearly the complement of any periodic set is also periodic.
Inductively assume that are periodic; we argue that is periodic. Letting be the least common multiple of their periods, we may assume both and are -periodic with the same period . Then for all and all unit vectors , and . Thus , so is also -periodic. Similar reasoning shows is -periodic (one can also appeal to DeMorgan’s Laws).
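The closure argument can be checked numerically on small cases. The sketch below assumes the usual form of a mod set (points whose weighted coordinate sum is congruent to a fixed residue) and tests the shift condition of Definition 6.12 by brute force on a finite box; the specific coefficients and the helper names are hypothetical. A finite check is only evidence, not a proof, but it shows the period of a union being the least common multiple of the two periods.

```python
from itertools import product
from math import lcm  # available in Python 3.9+


def mod_set(a, b, c):
    """Membership test for {x : a . x = b (mod c)} (assumed form)."""
    return lambda x: sum(ai * xi for ai, xi in zip(a, x)) % c == b % c


def is_periodic_on_box(pred, period, dim, bound):
    """Check pred(x) == pred(x + period * e_i) for all x in [0, bound)^dim."""
    for x in product(range(bound), repeat=dim):
        for i in range(dim):
            shifted = tuple(
                xj + (period if j == i else 0) for j, xj in enumerate(x)
            )
            if pred(x) != pred(shifted):
                return False
    return True


m1 = mod_set((1, 2), 0, 3)  # hypothetical mod set with modulus 3
m2 = mod_set((1, 1), 1, 4)  # hypothetical mod set with modulus 4


def union(x):
    return m1(x) or m2(x)


def complement(x):
    return not m1(x)


common_period = lcm(3, 4)
```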
Each threshold set is defined by a hyperplane that partitions into the sets (on one side of the hyperplane, including integer points on the hyperplane itself) and (on the other side of the hyperplane). More generally, several threshold sets partition into multiple disjoint subsets we call “regions”. Furthermore, any predicate that is a Boolean combination of threshold sets has constant output in any region; the next definition formalizes this.
Definition 6.15.
Let be a Boolean combination of threshold sets . A region of is a convex polytope such that, for all , for all , . The output of the region is the value 1 if and 0 if . (Note these are the only two possibilities, since no individual threshold set is exited or entered as we move within .) A region is totally unbounded if, for all , , i.e., contains points that are arbitrarily large on all components. A region is called partially bounded if it is not totally unbounded.
Put another way, predicates defined by Boolean combinations of threshold sets are defined by -dimensional hyperplanes that partition into regions, where in each region, the output of the predicate is all yes, or all no. In fact this is an exact characterization of Boolean combinations of threshold predicates.
Definition 6.16.
For any set , the recession cone of is
the set of vectors such that, from any point in , one can move in direction forever without leaving .
A region defined by threshold sets is totally unbounded if and only if , i.e., the recession cone of contains a positive vector.
Lemma 6.17.
Let be a Boolean combination of threshold sets that is not eventually constant. Then there are two adjacent totally unbounded regions , with opposite outputs, such that the normal vector of the hyperplane separating and has at least one negative component and at least one positive component.
Proof 6.18.
See Figure 5 for an example in 2D. Since is not eventually constant, it must have two totally unbounded regions and with opposite outputs; assume WLOG that has output . Let be sufficiently large that all partially bounded regions of are subsets of . Now, simply pick any points and . There is some path from to that follows only unit vectors (i.e., moves only to adjacent points that are distance 1 from the previous point), such that every intermediate point also obeys .
Then this path never enters a partially bounded region of , since they are all subsets of . Thus, since the path starts in a region with output 0, ends in a region with output 1, there must be two adjacent points on the path, where is in a totally unbounded region with output 0 and is in a totally unbounded region with output 1.
Finally, we must show that the normal vector of the hyperplane separating from has a negative and a positive entry. Recall that a threshold set is defined by , where and (Definition 2.4). Since is a Boolean combination of threshold sets and are adjacent with opposite outputs, there must be some threshold set such that , but (or vice versa, but assume WLOG, since we could replace with in the Boolean combination defining ). Equivalently, we can think of the regions and as being separated by the hyperplane , with normal vector and offset , such that all points obey , and all points obey . The transition between the regions at points and involves crossing the hyperplane, where the inequality changes from to , which defines the boundary between different outputs (0 in and 1 in ). Therefore, the points on the hyperplane necessarily lie exactly at the boundary between these regions.
We show that cannot be nonnegative or nonpositive. Suppose (scale the normal vector by otherwise). Since is totally unbounded, it contains points that are arbitrarily large on all components. More formally, there is a strictly increasing sequence such that all Since , . This contradicts the previous assumption that all points obey (geometrically, we would cross the hyperplane somewhere and land in ). Symmetric reasoning applies to the case . We conclude that the separating hyperplane must have a normal vector with at least one positive and at least one negative component, establishing the lemma.
The next lemma shows that there exists a vector parallel to the hyperplane separating the two regions. In other words, we can move along while increasing every component.
Let be a hyperplane with normal vector . Then there is a positive vector with if and only if has at least one negative component and at least one positive component.
Proof 6.19.
If and then . Similarly, if and then . So to get , must have at least one positive and at least one negative element.
We construct as follows: Let denote the indices of the positive coordinates of and the indices of the negative coordinates. Our goal is to balance out the positive and negative parts of the dot product, given by . Set to be the sum of the positive coordinates of if and the sum of the absolute values of negative coordinates of otherwise:
Substituting into the formula shows the correctness. For brevity, let if and if as above.
Finally, if is not integer-valued, scale it by the least common multiple of all coordinate denominators to ensure without altering the dot product.
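The balancing construction in this proof can be sketched for an integer normal vector as follows. The index conditions were lost from the text above, so the assignment here is the one that makes the dot product vanish: coordinates where the normal vector is negative receive the sum of its positive entries, and all other coordinates receive the sum of the absolute values of its negative entries. The function name and the sample vectors are hypothetical.

```python
def positive_parallel_vector(t):
    """Return a strictly positive integer y with y . t == 0, or None.

    None is returned when t lacks a positive or a negative entry,
    in which case no such y exists.
    """
    pos_sum = sum(c for c in t if c > 0)
    neg_sum = sum(-c for c in t if c < 0)
    if pos_sum == 0 or neg_sum == 0:
        return None
    # Positive part contributes pos_sum * neg_sum, negative part the opposite.
    return [pos_sum if c < 0 else neg_sum for c in t]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


t1 = [2, -1, 0]
y1 = positive_parallel_vector(t1)

t2 = [3, -1, -1, 2]
y2 = positive_parallel_vector(t2)

none_case = positive_parallel_vector([1, 2, 0])
```

Coordinates where the normal vector is zero do not affect the dot product, so any positive value would do there; the sketch reuses one of the two sums for simplicity.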
Lemma 6.20.
Let be a semilinear predicate that is not eventually constant. Then there is an infinite sequence and constant , such that for all ,
-
1.
(correct answer swaps for each subsequent point),
-
2.
(inputs are increasing), and
-
3.
(adjacent inputs are “close”).
Proof 6.21.
We associate to the set where , i.e., .
Since is semilinear, it is a Boolean combination of threshold sets and mod sets . Recall Definition 6.15, where the threshold sets partition into regions, where moving within a region does not cross any hyperplanes defining the threshold sets, thus does not change the Boolean value [?] for any . Suppose we have regions . Then we can rewrite as a Boolean combination of mod sets only, intersected with . We do this by replacing each in the original Boolean expression with either or , depending on whether or , respectively. (Footnote 5: For example, if the expression is , if the points are in but not or , this becomes .) (Note by the definition of region these are the only two possibilities.) Let be this Boolean combination of mod sets, such that . By Lemma 6.13, is periodic.
Consider a totally unbounded region . By the characterization following Definition 6.16, contains a positive vector . We have two cases:
- for some totally unbounded region , is not constant:
-
This is illustrated in Figures 3 and 4, which show two subcases. Figure 3 shows the subcase where, for some and point , defining , the sequence is not constant. Since is periodic, the sequence is periodic with period . So we can find a subsequence obeying all three conditions of the lemma. In particular, it suffices to choose a point (resp. ) let such that (resp. ), letting , and let , and subsequent elements of the subsequence are the same distances apart ().
Figure 4 shows the subcase where, for all and , defining , the sequence is constant. However, since is not constant, we can still find a sequence , but unlike the previous subcase, it is not a subsequence of points collinear along one vector .
Since is periodic and not constant, and since is totally unbounded, for every , there is such that , i.e., for every point in the region in , there is a larger point in not in . Also, since is periodic, there is a constant independent of such that . By symmetric reasoning, there is a such that and .
Let be arbitrary. For all , choose based on as above, such that , , and if is odd and if is even. Then the sequence satisfies the lemma.
- for all totally unbounded regions , is constant:
-
This implies that the mod sets can be “factored out” of the Boolean expression defining in terms of threshold sets and the mod sets , which will give the same output as in totally unbounded regions. Put another way, is a Boolean combination of the threshold sets , where represents all the totally unbounded regions.
By Lemma 6.17, two adjacent totally unbounded regions of have opposite outputs. See Figure 5 for an example of picking the points below. These adjacent regions are separated by some hyperplane , such that , but for some unit vector , , i.e., all of is contained in , but the entire hyperplane adjacent to in direction , consists of points not in . Note this is not true for general hyperplanes, e.g., one whose orthogonal vector is , where both unit vectors and would move off the hyperplane, but in the “yes” direction where the point is still contained in the threshold set. However, since is separating two totally unbounded regions, some strictly positive vector is parallel to , i.e., obeys for ’s orthogonal vector . By Section 6.3, has at least one positive coordinate (say ) and at least one negative coordinate (say ), so that unit vector moves to one side of and moves to the other side.
In this case, we let be some vector parallel to , let , sufficiently large that the vector , starting at , does not cross any of the hyperplanes of (as in Figure 5). Define the rest of the infinite sequence as
By the arguments given above, for all odd , and for all even , , satisfying condition (1). If is even, then , so clearly , satisfying condition (2). If is odd, then . Since , we have , so , satisfying condition (2). Finally, , satisfying condition (3).
If a noncollapsing, all-voting, entirely execution bounded CRD stably decides a predicate , then is eventually constant.
This proof is similar to that of Theorem 6.9. In that proof, we repeatedly add a “constant amount of additional input or , which flips the output”. For more general semilinear, but not eventually constant, predicates, we dig into the structure of the semilinear set to find a sequence of constant-size vectors representing additional inputs that flip the correct answer. Any predicate that is not eventually constant has infinitely many yes inputs and infinitely many no inputs, but in general they could be increasingly far apart: e.g., if and only if for even . For the potential function argument to work, each subsequent input needs to be at most a constant larger than the previous.
But if is semilinear (and not eventually constant) then we can show that there is a sequence of increasing inputs , each a constant distance from the next (), flipping the output (). Roughly, this is true for one of two reasons. Using Theorem 2.5, is a Boolean combination of threshold and mod sets. Either the mod sets are not combined to be trivially or , in which case we can find some vector that, followed infinitely far from some starting point (so ) periodically hits both yes inputs () and no inputs (). (See Figures 3 and 4.) Otherwise, the mod sets can be removed, simplifying the Boolean combination to only threshold sets, in which case the infinite sequence can be obtained by moving along a threshold hyperplane that separates yes from no inputs. (See Figure 5.)
Proof 6.22.
This proof is similar to that of Theorem 6.9, with the vectors defined below playing the role of the “constant amount of additional input or that flips the correct answer” in that proof.
Let be a CRD obeying the stated conditions, and suppose for the sake of contradiction that stably decides a semilinear predicate that is not eventually constant.
By Lemma 6.20, there is an infinite sequence such that
-
1.
(i.e., the correct answer swaps for each subsequent input)
-
2.
, i.e., the inputs are increasing (on at least one coordinate), and
-
3.
for some constant , , i.e., adjacent inputs are “close”.
Assume WLOG that . For each , let , noting by condition (2) that .
We consider the sequence of stable configurations defined as follows. Let be a stable configuration reachable from ; since the correct answer is no, all species present in vote no. Now add to . By additivity, the configuration is reachable from . Since the correct answer for is yes, must go from to a stable “yes” configuration, call this . Now add to . Since the correct answer is no, must now reach from to a stable “no” configuration, call it . By condition (3), each obeys for some constant .
Continuing in this way, we have a sequence of stable configurations where all species in vote yes for odd , and all species in vote no for even . Since is noncollapsing, the size of the configurations increases without bound as . (Possibly , i.e., the size is not necessarily monotonically nondecreasing, but for all sufficiently large , we have .)
Since all species vote, for some constant , to get from to , at least reactions must occur. This is because all species in must be removed since they vote the opposite of the voters in , and each reaction removes at most molecules. (Concretely, let , i.e., 1 over the most net molecules consumed in any reaction.)
Since is entirely execution bounded, by Theorem 6.5, has a linear potential function , where . Adding to increases by , which is bounded above by a constant since . Since grows without bound, the number of reactions to get from to increases without bound as , and since each reaction strictly decreases by at least 1, the total change in that results from adding and then going from to is unbounded in , so unboundedly negative for sufficiently large (negative once is large enough that ).
However, started at the constant . Before is large enough that (i.e., large enough that the net change in is negative resulting from adding a single input and going to the next stable configuration), could increase, if is larger than the net decrease in due to following reactions to get from to .
However, since is noncollapsing, this can only happen for a constant number of (so never reaches more than a constant above its initial value ), after which point strictly decreases after each round of this process.
At some point in this process, will not be able to reach all the way to the next without becoming negative, a contradiction.
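The potential-function bookkeeping used in this proof can be illustrated with a minimal sketch. The statement of Theorem 6.5 lies outside this section, so the form assumed here (nonnegative weights per species, with every reaction lowering the weighted sum by at least 1) is taken from the description in the proof above. The example reactions, the weights, and the names `potential` and `is_linear_potential` are hypothetical.

```python
def potential(weights, config):
    """Linear potential: weighted sum of species counts (assumed form)."""
    return sum(weights[s] * n for s, n in config.items())


def is_linear_potential(weights, reactions):
    """Check nonnegative weights and a decrease of at least 1 per reaction.

    Each reaction is a (reactants, products) pair of dicts species -> count.
    """
    if any(w < 0 for w in weights.values()):
        return False
    for reactants, products in reactions:
        change = potential(weights, products) - potential(weights, reactants)
        if change > -1:  # the reaction must lower the potential by >= 1
            return False
    return True


# Hypothetical CRN 2X -> Y with unit weights: each reaction lowers the potential by 1.
shrinking = [({"X": 2}, {"Y": 1})]
# Hypothetical CRN X -> 2X: no nonnegative weights can work; unit weights fail.
growing = [({"X": 1}, {"X": 2})]
unit_weights = {"X": 1, "Y": 1}

# If the check passes, the initial potential bounds the length of any execution.
max_reactions_from_5X = potential(unit_weights, {"X": 5})
```

Under these assumptions, the initial potential is an upper bound on the number of reactions in any execution, which is the quantity the proof shows is eventually exhausted.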
The statement of Theorem 6.9 does not mention the concept of a leader, but it would typically apply to leaderless CRDs. A CRD may be execution bounded from configurations with a single leader, but not execution bounded when multiple leaders are present (preventing the use of Theorem 6.5, which requires the CRD to be execution bounded from all configurations). For example, in Lemma 4.5, reaction (9) occurs finitely many times if the leader/voter or has count 1. However, if and can be present simultaneously (e.g., if we start with two leaders), then the reactions and can flip between and infinitely often in an unbounded execution.
If the CRN is leaderless, however, we have the following, which says that if it is execution bounded from valid initial configurations, then it is execution bounded from all configurations.
If a leaderless CRD or CRC is execution bounded, then it is entirely execution bounded.
A proof is in the appendix. Since is leaderless, the sum of two valid initial configurations is also valid. Thus if we can produce some species from a valid initial configuration, we can produce arbitrarily large counts of all species by adding up sufficiently many initial configurations. This means that for any configuration , from any sufficiently large valid initial configuration , some is reachable from . But if is execution bounded from , since , it must also be execution bounded from , thus also from since by additivity any reactions applicable to are also applicable to .
Proof 6.23.
Let be a leaderless CRD or CRC. Let be any configuration. We first show that some is reachable from a valid initial configuration .
We may assume without loss of generality that only contains species producible from valid initial configurations; otherwise, we obtain an equivalent CRN by removing the unproducible species from .
Since is leaderless, the sum of two valid initial configurations is also valid. Then each species being producible means that there is a valid initial configuration such that for some , and , i.e., at least one copy of can be produced. Let . By additivity, , where , noting that . In other words, all species are producible in arbitrarily large counts from some valid initial configuration.
Now we argue that all species can simultaneously be produced in arbitrarily large counts from some valid initial configuration; in particular, we can reach a configuration with counts at least . Let . Since each , by additivity we have , where . Then for each , , so .
Since all executions from are finite, all executions from are finite. By additivity, any sequence of reactions applicable to is also applicable to . Thus all executions from must be finite as well, i.e., is entirely execution bounded since is an arbitrary configuration.
The preceding lemma lets us replace “entirely execution bounded” in Theorem 6.9 with “leaderless and execution bounded”:
Corollary 6.24.
If a noncollapsing, all-voting, leaderless, execution bounded CRD stably decides a predicate , then is eventually constant.
In particular, since the original model of population protocols [angluin2004computation] defined them as leaderless and all-voting—and since population protocols are noncollapsing—we have the following.
Corollary 6.25.
If an execution bounded population protocol stably decides a predicate , then is eventually constant.
6.4 Feedforward CRNs
We show that another common constraint, feedforwardness, significantly reduces computational power, making it impossible to decide even simple mod and threshold sets.
Definition 6.26.
A CRN is reaction-feedforward if reactions can be ordered such that, for all , no reactant of appears in (as either reactant or product).
Reaction-feedforward CRNs are significant in the sense that many numerical-valued functions computed by continuous real-valued CRNs (where the count of some species is interpreted as the output, e.g., computes ) can be computed by reaction-feedforward CRNs [chen2023rate]. (Footnote 6: The definition of feedforward in reference [chen2023rate] is different from the definition given here, being based on an ordering of species rather than reactions. However, it is straightforward to verify by inspection that the CRNs given for the positive results of [chen2023rate] are reaction-feedforward according to Definition 6.26.) Compared to general CRNs, reaction-feedforward CRNs are easy to analyze and to prove correct. One reason is that, if a reaction-feedforward CRN can reach a terminal configuration from at all, then it is execution bounded from .
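As a rough illustration of Definition 6.26, the following sketch searches for a valid reaction order. It assumes the definition reads "for every pair of reactions with the first earlier in the order, no reactant of the earlier reaction appears in the later one", which matches how the definition is applied later in this section. The function name `reaction_feedforward_order` and the three example CRNs are hypothetical and are not taken from the paper.

```python
from graphlib import CycleError, TopologicalSorter


def reaction_feedforward_order(reactions):
    """Return a valid reaction order (list of indices), or None if none exists.

    Each reaction is a (reactants, products) pair of dicts species -> count.
    If a reactant of reaction a appears anywhere in reaction b, then a may not
    precede b, so b must come before a; a valid order exists exactly when
    these precedence constraints are acyclic.
    """
    species_in = [set(r) | set(p) for r, p in reactions]
    sorter = TopologicalSorter()
    for a, (reactants_a, _) in enumerate(reactions):
        sorter.add(a)
        for b in range(len(reactions)):
            if a != b and set(reactants_a) & species_in[b]:
                sorter.add(a, b)  # b is a required predecessor of a
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# Hypothetical examples (species names are illustrative only).
chain = [({"A": 1}, {"B": 1}), ({"B": 1}, {"C": 1})]
shared_reactant = [({"X": 1, "Y": 1}, {"N": 1}), ({"X": 1, "N": 1}, {"Y": 1})]
growing = [({"X": 1}, {"X": 2})]
```

Under this reading, two distinct reactions that share a reactant force a cycle in the constraints, so such a CRN is never reaction-feedforward, while a single reaction is trivially reaction-feedforward even if it is not execution bounded.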
There is a similar definition, called simply feedforward in [chen2023rate], based on ordering of species rather than reactions. We use the term species-feedforward to avoid confusion with Definition 6.26. We say a reaction produces a species if , and it consumes if .
Definition 6.27.
A CRN is species-feedforward if species can be ordered such that every reaction producing a species consumes an earlier species where .
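Definition 6.27 can be checked mechanically for a given species order. The sketch below is illustrative only: it assumes that "produces" and "consumes" refer to a positive or negative net change in a species' count, and that "earlier" means a smaller index in the order. The name `is_species_feedforward` and the example CRNs are hypothetical.

```python
def is_species_feedforward(order, reactions):
    """Check a candidate species order against the species-feedforward condition.

    order: list of species names, earliest first.
    reactions: list of (reactants, products) dicts species -> count.
    """
    rank = {s: i for i, s in enumerate(order)}
    for reactants, products in reactions:
        species = set(reactants) | set(products)
        # Net change of each species in this reaction (assumed meaning of
        # "produces" and "consumes").
        net = {s: products.get(s, 0) - reactants.get(s, 0) for s in species}
        consumed_ranks = [rank[s] for s, d in net.items() if d < 0]
        for s, d in net.items():
            if d > 0 and not any(i < rank[s] for i in consumed_ranks):
                return False  # s is produced without consuming an earlier species
    return True


# Hypothetical examples.
cascade = [({"A": 1}, {"B": 2}), ({"B": 1}, {"C": 1})]
growing = [({"X": 1}, {"X": 2})]
```

With the order A, B, C the cascade passes the check, whereas the growing reaction fails for its only possible order because it produces X without consuming anything.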
Although the term “linear potential function” was not used in [chen2023rate], it is shown in [chen2023rate, Lemma 4.8] that species-feedforward CRNs have a linear potential function (assigning weight to species for a suitably large constant ), thus are entirely execution bounded. The same is not always true of reaction-feedforward CRNs; for example, is reaction-feedforward but not execution bounded. However, we can use techniques similar to those used in proofs for so-called noncompetitive CRNs in [vasic2022programming] to show that “reasonable” reaction-feedforward CRNs are execution bounded.
Lemma 6.28.
Suppose in a reaction-feedforward CRN that by execution , and by execution . If any reaction occurs fewer times in than in , then is not terminal.
Proof 6.29.
Here we equivalently think of an execution from as a sequence of reactions, since from those and we can deduce the configurations in the execution. Define as the number of times reaction occurs in the execution . Let be the first reaction in the reaction-feedforward order such that . Assume, for brevity of explanation, that has only one reactant, denoted ; the argument below, however, is general and applies to any number of reactants in .
By the definition of a reaction-feedforward CRN, the reactions through do not affect the count of . Further, reactions through can only produce and not consume it, reactions through can increase the count of , and among them, only can decrease it. Let . Let represent the prefix sequence of where the transition corresponds to the st execution of reaction . The configuration is thus the configuration just before occurs more in than in .
Note that reactions through occur at least as often in as in (i.e. for to ). Therefore, they occur at least as often in as in , since is a prefix of . Moreover, by our choice of , . So is present in , i.e. . Thus, is applicable at , so is not terminal.
The following corollary implies that any reaction-feedforward CRN that can reach a terminal configuration from is execution bounded from .
Corollary 6.30.
In a reaction-feedforward CRN , if there is a terminal configuration reachable from initial configuration , then is reached by every sufficiently long execution from . Furthermore, all of these executions are permutations of the same number of each reaction type. In particular, is execution bounded from .
Proof 6.31.
Let be the execution leading from to . Consider any execution with . By the pigeonhole principle, must involve more occurrences of some reaction than does. By Lemma 6.28, this would imply that is not terminal, which contradicts the premise that is terminal. Therefore, no execution can be longer than . Consider any execution where . must be a permutation of , as any deviation resulting in more of any reaction would, by the pigeonhole principle, lead to a contradiction of the terminality of . To address the possibility of a shorter terminal execution, consider any execution with . There must be some reaction occurring more frequently in than in , and by Lemma 6.28, cannot reach a terminal configuration.
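The conclusion of Corollary 6.30 can be checked exhaustively on a small instance. The sketch below uses a hypothetical two-reaction CRN (A → B, then B + C → D), which is reaction-feedforward in that order under the reading that no reactant of an earlier reaction appears in a later one. It enumerates every maximal execution from a hypothetical initial configuration and collects the terminal configuration together with the number of times each reaction fired.

```python
from collections import Counter

# Hypothetical reaction-feedforward CRN: r1 = A -> B, r2 = B + C -> D.
REACTIONS = [
    (Counter({"A": 1}), Counter({"B": 1})),
    (Counter({"B": 1, "C": 1}), Counter({"D": 1})),
]


def applicable(config, reactants):
    """A reaction is applicable if every reactant is present in sufficient count."""
    return all(config[s] >= n for s, n in reactants.items())


def outcomes(config, fired=(0,) * len(REACTIONS)):
    """Explore all maximal executions from config.

    Returns the set of (terminal configuration, per-reaction firing counts).
    """
    results = set()
    terminal = True
    for k, (reactants, products) in enumerate(REACTIONS):
        if applicable(config, reactants):
            terminal = False
            next_config = config - reactants + products
            next_fired = fired[:k] + (fired[k] + 1,) + fired[k + 1:]
            results |= outcomes(next_config, next_fired)
    if terminal:
        results.add((frozenset(config.items()), fired))
    return results


all_outcomes = outcomes(Counter({"A": 3, "C": 2}))
```

For this instance every interleaving ends in the same terminal configuration with the same firing counts, so all maximal executions have the same length, as the corollary predicts.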
As noted, in the model of continuous CRNs, it is known that all the functions that can be stably computed (the continuous, piecewise linear functions) can be stably computed by reaction-feedforward CRNs [chen2023rate]. In contrast, with discrete CRNs computing predicates, we show that reaction-feedforward CRNs cannot stably decide all semilinear sets by giving two counterexamples, showing that reaction-feedforward CRDs can decide neither “most” mod sets (6.31) nor “most” threshold sets (6.32). Specifically, we chose the parity and majority predicates as our counterexamples, although the techniques generalize to more complex mod and threshold sets, e.g., .
Reaction-feedforward CRDs cannot stably decide the parity predicate .
Proof 6.32.
We show that in any possible construction, the input species must be a reactant of two distinct reactions. By letting the CRN stabilize and then introducing another input molecule, there must exist a set of reactions inverting the output in either direction, consisting of at least two reactions with as reactant, breaking the reaction-feedforward condition.
Consider the set of even numbers. A simple, non-reaction-feedforward CRD that decides parity is:
where is the input species, is a yes voter, and is a no voter, initialized with and . Whichever way these reactions are ordered, a reactant of the first reaction appears in the second reaction. Thus, the CRN is not reaction-feedforward.
To show that no such CRN could decide parity, we show that any construction requires us to have at least one reactant reappear in a later reaction, or even stronger: at least one species must be a reactant of two distinct reactions. Specifically, this is true for the input species .
To motivate the choice of species, let’s consider an even simpler parity computing CRD.
where is both input and votes no, votes yes, initialized with and . Only the input species appears twice as a reactant. Intuitively, this is true for all CRDs because we expect the input to be able to change our answer in either way, reversing the previous one.
Suppose for the sake of contradiction that there is a reaction-feedforward CRD which stably decides whether the initial number of input is even. We withhold two copies of and let stabilize on the correct output of yes. Denote as the set of yes voters. We denote the no voters with . Only species contained in are present in the stable, correct output configuration.
Now, we release one of the remaining copies of . We first run the chain reaction (if any) starting from only one . Let be the set of species producible from (e.g., if there is no reaction , then is just ). Without loss of generality, we assume , that is, all of ’s direct products are yes-voters (if not, exchange and in what follows). To correct the answer (), must consume all species currently present and produce at least one copy of a species in . It follows that for all , contains a reaction with as a reactant. Further, none of these reactions contain a reactant of , since none are present in the current configuration.
Finally, we release the last remaining copy of . Again, we produce the set from . To invert the vote again, we must consume all and produce at least one member of . The reaction(s) consuming must have a member of as a reactant since the configuration is stable without . Further, the reaction cannot be any of the ones from before since they contain a member of as reactant.
Since there are at least two reactions sharing a common species as reactant, the reactions cannot be ordered such that no reactant of the first of these reactions appears in the latter one. This makes non-reaction-feedforward, contradicting our initial assumption.
Reaction-feedforward CRDs cannot stably decide the majority predicate .
Proof 6.33.
Suppose, for the sake of contradiction, there exists a reaction-feedforward CRD which stably decides the predicate. We let stabilize on input (yielding output yes), while withholding two copies of . We release one . Again, we consider the full set of species a single could produce before reacting with other molecules (denoted ). Without loss of generality, we consider all of them yes voters, i.e., . The correct output now changes to no, and all yes voters must be consumed by reactions whose reactants are all yes voters; further, these reactions include every species in as a reactant.
Once the vote has reversed and stabilized, the configuration contains only species of . We release the last and let it produce . Since , i.e., all elements are yes voters, but the correct vote is still no, all must be consumed again. This time, they must be consumed in reactions involving no voters, which must be distinct from the reactions in the previous step. Thus, all species appear at least twice as a reactant, breaking the reaction-feedforward condition.
7 Conclusion
We explored the computational capabilities of execution bounded Chemical Reaction Networks (CRNs), which terminate after a finite number of reactions. This constraint aligns the model with practical scenarios where fuel supply is limited.
Our findings illustrate that the computational power of these CRNs varies significantly based on structural choices. Specifically, CRNs with an initial leader and the ability to allow only the leader to vote can stably compute all semilinear predicates and functions in parallel time. Without an initial leader, and requiring all species to vote, these networks are limited to computing eventually constant predicates. This limitation holds considerable weight for decentralized systems modeled by population protocols, which inherently exhibit these traits. Additionally, we introduced a new characterization of execution bounded networks through a nonnegative linear potential function, providing a novel theoretical tool for analyzing the physical constraints of CRNs.
A key question remains open: Can execution bounded CRNs compute semilinear functions and predicates within polylogarithmic time? Angluin, Aspnes and Eisenstat introduced a fast population protocol that simulates a register machine with high probability. This protocol can perform standard operations like comparison, addition, subtraction, and multiplication and division by constants in time [AngluinAE2008Fast]. Chen, Doty and Soloveichik applied this construction to CRNs in [Chen2012DeterministicFunction], showing that semilinear functions can be computed by CRNs without error in expected polylogarithmic time in the kinetic model. Central to their success in both cases is the “phase clock”, which generates a clock signal to indicate the probable completion of an epidemic style chain reaction and orders more recent instructions to overwrite older ones. This clock is inherently unbounded in its execution, cycling through stages.
References
- [1] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J Fischer, and René Peralta. Computation in networks of passively mobile finite-state sensors. In PODC 2004: Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing, pages 290–299, 2004.
- [2] Dana Angluin, James Aspnes, and David Eisenstat. Fast computation by population protocols with a leader. Distributed Computing, 21(3):183–199, September 2008.
- [3] Dana Angluin, James Aspnes, David Eisenstat, and Eric Ruppert. The computational power of population protocols, 2006.
- [4] Ho-Lin Chen, David Doty, Wyatt Reeves, and David Soloveichik. Rate-independent computation in continuous chemical reaction networks. Journal of the ACM, 70(3), May 2023.
- [5] Ho-Lin Chen, David Doty, and David Soloveichik. Deterministic function computation with chemical reaction networks. Natural Computing, 13(4):517–534, 2014. Preliminary version appeared in DNA 2012.
- [6] David Doty and Monir Hajiaghayi. Leaderless deterministic chemical reaction networks. Natural Computing, 14(2):213–223, 2015. Preliminary version appeared in DNA 2013.
- [7] Julius Farkas. Theorie der einfachen Ungleichungen. Journal für die reine und angewandte Mathematik (Crelles Journal), 1902(124):1–27, 1902.
- [8] David Gale. The theory of linear economic models. University of Chicago Press, 1960.
- [9] Daniel T. Gillespie. Exact stochastic simulation of coupled chemical reactions. Journal of Physical Chemistry, 81(25):2340–2361, 1977.
- [10] S. Ginsburg and E. H. Spanier. Semigroups, Presburger formulas, and languages. Pacific Journal of Mathematics, 16(2):285–296, 1966.
- [11] Ryuichi Ito. Every semilinear set is a finite union of disjoint linear sets. Journal of Computer and System Sciences, 3(2):221–231, 1969.
- [12] Richard M Karp and Raymond E Miller. Parallel program schemata. Journal of Computer and System Sciences, 3(2):147–195, 1969.
- [13] Olvi L Mangasarian. Nonlinear programming. SIAM, 1994.
- [14] Christos H Papadimitriou. On the complexity of integer programming. Journal of the ACM (JACM), 28(4):765–768, 1981.
- [15] Charles Rackoff. The covering and boundedness problems for vector addition systems. Theoretical Computer Science, 6(2):223–231, 1978.
- [16] David Soloveichik, Matthew Cook, Erik Winfree, and Jehoshua Bruck. Computation with finite stochastic chemical reaction networks. Natural Computing, 7(4):615–633, 2008.
- [17] Marko Vasić, Cameron Chalk, Austin Luchsinger, Sarfraz Khurshid, and David Soloveichik. Programming and training rate-independent chemical reaction networks. Proceedings of the National Academy of Sciences, 119(24):e2111552119, 2022.