License: CC BY 4.0
arXiv:2312.04187v2 [cs.CC] 14 Dec 2023

Enumerating Complexity Revisited

Alexander Shekhovtsov
Moscow Institute of Physics and Technology
[email protected]
   Georgii Zakharov
Moscow Institute of Physics and Technology
[email protected]

Abstract

Consider a subset $S$ of the positive integers. In this paper, we improve the upper bound on the length of a shortest program that enumerates $S$ in terms of the probability that a random program enumerates $S$.

So far, the best-known upper bound was due to Solovay, who proved that the minimum length of a program enumerating $S$ is at most $3$ times the negative binary logarithm of the probability that a random program enumerates $S$. Later, Vereshchagin showed that the constant can be improved from $3$ to $2$ for finite sets. By improving the method proposed by Solovay, we demonstrate that any bound for finite sets implies the same bound for infinite sets, up to additive logarithmic terms. Thus, by Vereshchagin's result, the constant can be replaced by $2$ for every set $S$.

Organization

In Section 1, we introduce definitions of deterministic and randomized Kolmogorov complexity of a set. Then, we present the results known prior to our work. After that, we formulate the main result of the paper in Theorem 1.3. In Section 2, in order to prove Theorem 1.3, we describe the notions of cats and ants. Both cats and ants move along the edges of the graph $G$ described in Subsection 2.1. The position of a cat at time $t$ is the set enumerated by the deterministic machine stopped after $t$ steps. Similarly, the position of an ant at time $t$ is the set enumerated by the randomized machine stopped after $t$ steps. Then, we introduce shadow positions for the ants, which follow the ants with some delay, and describe how the shadow positions mimic the real ones. The shadow positions correspond to another randomized machine $M'$, which is used in the proof of Theorem 1.3. After that, we are finally able to state Lemma 2.1. Lastly, we prove that Theorem 1.3 follows from Lemma 2.1. In Section 3, we prove Lemma 2.1. Section 4 contains a discussion of the results.

1 Introduction

To formulate the main results of this work, we will introduce the concepts of complexities for deterministic and probabilistic enumerating machines.

1.1 Deterministic Enumerating Machine

Consider a deterministic machine $D$ that takes a binary string as input and enumerates a subset of the natural numbers. $D$ is not required to halt and may enumerate an infinite set. We define the complexity of a set $S \subset \mathbb{N}$, denoted $I_D(S)$, as the minimum length of an input on which $D$ enumerates $S$. If there is no such input, then $I_D(S) = \infty$. By a standard theorem, there exists a machine $D_0$ such that for any other machine $D$ there is a constant $c$ such that, for all $S$:

$$I_{D_0}(S) \leq I_D(S) + c$$

The machine $D_0$ is referred to as universal. Set $D$ equal to any such universal machine $D_0$ and define $I(S)$ as $I_D(S)$. Thus, $I(S)$ is well-defined up to an additive constant.

1.2 Probabilistic Enumerating Machine

Consider a probabilistic machine $M$ that uses arbitrarily many random bits, uniformly distributed and independent, and enumerates a subset of the natural numbers. As in the deterministic case, $M$ is not required to halt and may enumerate an infinite set. Such a machine defines a mapping from the Cantor space $\Omega$ (all infinite sequences of zeros and ones) to the family of subsets of the natural numbers. An outcome $w \in \Omega$ of the random bits corresponds to the set enumerated by $M$ when $w$ is used as its sequence of random bits. The image of the uniform measure under this mapping is a measure on the family of all subsets of the natural numbers, and this is the measure we refer to when discussing the probability (for machine $M$) of enumerating a certain set $S$. This probability is non-zero only if the set $S$ is enumerable, as implied by a variant of the de Leeuw–Moore–Shannon–Shapiro theorem for enumeration problems [2]. We define the complexity $H_M(S)$ as the negative binary logarithm of the probability that $M$ enumerates $S$: $H_M(S) = -\log_2 \Pr(M \text{ enumerates } S)$. If this probability is zero, we consider the complexity to be infinite.
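As a small illustration of this definition, the following sketch computes the enumeration probability and the complexity $H_M(S)$ exactly for a toy machine. The machine `toy_machine`, the helper names, and the restriction to machines that read at most `n_bits` random bits are assumptions made for this example only; they do not appear in the paper.

```python
from fractions import Fraction
from itertools import product
from math import log2


def toy_machine(bits):
    """Hypothetical machine that reads at most two random bits
    and returns the (finite) set it enumerates."""
    if bits[0] == 0:
        return frozenset({1})       # path "0": outputs 1
    if bits[1] == 0:
        return frozenset({1})       # path "10": a different path, same set
    return frozenset({1, 2})        # path "11": outputs 1 and 2


def enumeration_probability(machine, n_bits, target):
    """Exact probability that the machine enumerates `target`,
    assuming it never reads more than n_bits random bits."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=n_bits):
        if machine(bits) == target:
            total += Fraction(1, 2 ** n_bits)
    return total


def complexity_H(machine, n_bits, target):
    """H_M(S) = -log2 Pr(M enumerates S); infinite if the probability is 0."""
    p = enumeration_probability(machine, n_bits, target)
    return float("inf") if p == 0 else -log2(p)


print(enumeration_probability(toy_machine, 2, frozenset({1})))  # 3/4
print(complexity_H(toy_machine, 2, frozenset({1, 2})))          # 2.0
```

Note that several computation paths may contribute to the probability of the same set, which is why $H_M(S)$ can be smaller than the number of random bits used on any single path.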

Similarly to the deterministic case, there exists a machine $M_0$ such that for any other machine $M$ there is a constant $c$ such that, for all $S$:

$$H_{M_0}(S) \leq H_M(S) + c$$

Such a machine $M_0$ is referred to as optimal. Set $M$ equal to any such optimal machine $M_0$ and define $H(S)$ as $H_M(S)$. Therefore, $H(S)$ is well-defined up to an additive constant.

1.3 Connection Between Deterministic and Randomized Complexity

Since $M$ can generate an input $x$ for machine $D$ using $|x| + O(\log |x|)$ of its random bits (for instance, one can first generate the length $|x|$ and then $x$ itself), we have $H(S) \leq I(S) + O(\log I(S))$ for all $S \subset \mathbb{N}$.
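The parenthetical remark above can be made concrete with a self-delimiting code. The sketch below uses one standard scheme (the length of $x$ in binary with every bit doubled, then the terminator `01`, then $x$); the specific scheme and the function names are assumptions chosen for illustration, and any prefix-free encoding of the length would serve equally well.

```python
from typing import Tuple


def encode_self_delimiting(x: str) -> str:
    """Encode the bit string x as: binary length of x with each bit doubled,
    then the terminator '01', then x itself."""
    length_bits = format(len(x), "b")
    prefix = "".join(b + b for b in length_bits) + "01"
    return prefix + x


def decode_self_delimiting(stream: str) -> Tuple[str, str]:
    """Read one encoded string from the front of `stream`.
    Returns (x, remaining bits). Assumes the stream is well-formed."""
    i = 0
    length_bits = []
    while stream[i:i + 2] != "01":      # doubled bits are '00' or '11'
        length_bits.append(stream[i])
        i += 2
    i += 2                              # skip the terminator
    n = int("".join(length_bits), 2)
    return stream[i:i + n], stream[i + n:]


code = encode_self_delimiting("10110")
print(code, len(code))                  # 1100110110110 13
print(decode_self_delimiting(code + "111"))
```

A random bit sequence begins with the code of $x$ with probability $2^{-(|x| + O(\log |x|))}$, so a machine that decodes its random bits in this way and then runs $D$ on $x$ enumerates $S$ with at least that probability.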

However, obtaining the converse inequality is more challenging. Solovay proved a linear upper bound.

Theorem 1.1.

(Solovay, [4]). There exists a constant $c$ such that for any $S \subset \mathbb{N}$, it holds that $I(S) \leq 3 \cdot H(S) + 2 \log H(S) + c$.

Later, Vereshchagin improved the constant from three to two for finite sets:

Theorem 1.2.

(Vereshchagin, [5]). There exists a constant $c$ such that for any finite $S \subset \mathbb{N}$, it holds that $I(S) \leq 2 \cdot H(S) + 2 \log H(S) + c$.

Vereshchagin also asked whether it is possible to improve Solovay's bound by reducing the length of the auxiliary string in Solovay's algorithm. We answer this question in the affirmative. We will show how to extend any upper bound for finite sets to infinite sets. This result is formulated in the following theorem.

Theorem 1.3.

Let $\alpha \geq 1$ be such that for any finite $S \subset \mathbb{N}$,

$$I(S) \leq \alpha \cdot H(S) + O(\log H(S)).$$

Then the same bound holds for any infinite $S$ (possibly with a different constant in the $O$-term).

Applying Theorem 1.3 to Theorem 1.2, we improve the constant in Solovay's bound from three to two (up to additive logarithmic terms). Thus, we obtain the following theorem.

Theorem 1.4.

There exists a constant $c$ such that for any $S \subset \mathbb{N}$, it holds that $I(S) \leq 2 \cdot H(S) + O(\log H(S)) + c$.

To prove Theorem 1.3, we will introduce convenient terms. We will represent the behavior of deterministic and randomized machines as the movement of cats and ants on a directed graph.

2 Graph, Cats, Ants, Shadow Positions

2.1 Graph of Subsets of Natural Numbers

Let $G$ be a directed graph whose vertices are the finite sets of natural numbers, with edges leading from each set to all of its proper supersets.

2.2 Cats

We enumerate the inputs for $D$ in increasing order of their length (for example: $\varepsilon$, 0, 1, 00, 01, 10, 11, 000, …, where $\varepsilon$ here denotes the empty string). We associate a cat with each input for $D$. Each cat simulates the actions of $D$ on the corresponding input. At any given time, a cat is positioned at the vertex corresponding to what $D$ has enumerated up to that point. At time zero, all cats are at the vertex $\varnothing$. As machine $D$ outputs new numbers, the cats move along the edges of the graph to new positions. The cats are numbered in the same way as the inputs they correspond to. Hence, $I(S) \leq k$ if and only if at least one of the first $2^{k+1} - 1$ cats enumerates the set $S$.
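The numbering of cats can be sketched as follows. The 0-based indexing and the function names `cat_index` and `cat_input` are assumptions of this example; the paper only fixes the order of the inputs.

```python
def cat_index(x: str) -> int:
    """0-based position of the input x in the order ε, 0, 1, 00, 01, ...
    (shorter strings first, then lexicographic)."""
    shorter = 2 ** len(x) - 1           # number of strings shorter than x
    return shorter + (int(x, 2) if x else 0)


def cat_input(i: int) -> str:
    """Inverse of cat_index: the input simulated by cat number i."""
    n = (i + 1).bit_length() - 1        # length of the input
    offset = i - (2 ** n - 1)           # rank among strings of length n
    return format(offset, "b").zfill(n) if n else ""


print([cat_input(i) for i in range(8)])
# ['', '0', '1', '00', '01', '10', '11', '000']
```

There are $2^0 + 2^1 + \ldots + 2^k = 2^{k+1} - 1$ inputs of length at most $k$, and they occupy exactly the indices $0, \ldots, 2^{k+1} - 2$, which is the counting behind the equivalence stated above.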

2.3 Ants

We model the behavior of machine $M$ on different inputs using the metaphor of "ants". As long as machine $M$ does not request random bits and only enumerates numbers, we envision the movement of a unit-sized ant on the graph $G$. The ant starts at the empty set and, as numbers appear in the output, moves to the set of numbers enumerated so far.

When the machine requests a random bit, the ant splits into two "children", each of half the size. One follows the behavior of the machine with random bit $0$, and the other follows the behavior with random bit $1$, potentially moving differently. At some point, an ant may need the next random bit, at which point it splits again into two, and so on. At any moment in this modeling process, the ants are finite in number, and each corresponds to a binary word (the random bits the ant already knows) and hence to a cone in the Cantor space (the set of infinite sequences extending that word). These cones form a partition of the Cantor space. Requesting a random bit corresponds to dividing one of the cones in half.

Using this metaphor, we distinguish between an "ant" itself (a node in the infinite binary tree) and its "position" in the graph. A node indicates which random bits have been used in machine $M$'s computation, while a position represents the set of numbers that have already appeared in machine $M$'s output. There are two types of changes that occur during the modeling process:

  • The machine may request a random bit (in some computation path). Then, one ant (corresponding to the word $x$ of bits already requested) splits into two (corresponding to nodes $x0$ and $x1$), both inheriting the current position.

  • The machine (in some computation path) outputs a new number $m$. The ant corresponding to the bits already requested updates its position (moving to the graph vertex obtained from the previous position by adding $m$).

In these terms, the mapping from the Cantor space to $\mathcal{P}(\mathbb{N})$ can be described as follows: for each point $w$ in the Cantor space, there is a sequence of ants, each being a child of the previous one and continuing its path on the graph. The union of these paths enumerates a set of natural numbers, which is the image of the point $w$. (It is possible that only a finite number of bits of $w$ will be used. In this case, the last ant will not split further, and its path on the graph may be either finite or infinite.)
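The splitting and movement of ants described in this subsection can be simulated on a toy machine. In the sketch below, the machine is given by a hypothetical table `TOY` that, for each word of random bits already read, lists the numbers output so far and whether another random bit is requested; this table-based interface and the round-based schedule are simplifying assumptions of the example, not the paper's construction.

```python
from fractions import Fraction

# Hypothetical machine: word of bits read -> (numbers output, requests a bit?)
TOY = {
    "":   ({1}, True),
    "0":  ({1, 2}, False),
    "1":  ({1}, True),
    "10": ({1, 3}, False),
    "11": ({1, 2, 3}, False),
}


def toy_replay(word):
    """Replay the toy machine on the random bits in `word`."""
    return TOY[word]


def simulate(machine, rounds):
    """Return a dict mapping each ant (binary word) to its position."""
    ants = {"": frozenset()}                 # one unit-sized ant at the empty set
    for _ in range(rounds):
        next_ants = {}
        for word, position in ants.items():
            output, wants_bit = machine(word)
            position = position | frozenset(output)   # move along an edge of G
            if wants_bit:
                next_ants[word + "0"] = position      # children inherit
                next_ants[word + "1"] = position      # the current position
            else:
                next_ants[word] = position
        ants = next_ants
    return ants


def weight(word):
    """Size of an ant: the measure of its cone in the Cantor space."""
    return Fraction(1, 2 ** len(word))


ants = simulate(toy_replay, 3)
for word, position in sorted(ants.items()):
    print(word, sorted(position), weight(word))
print(sum(weight(w) for w in ants))          # the cones partition the space: 1
```

At every round the words of the living ants form a prefix-free set whose cones cover the whole Cantor space, so the weights always sum to one.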

2.4 Shadow Positions

In the description of the construction, in addition to the positions of the ants on the graph, we use their shadow positions, which follow their actual positions with some delay. Specifically, when an ant’s position changes, its shadow position remains unchanged, except for specific explicitly highlighted shifts. At the moment of such a shift, the shadow position of the ant moves to its real position. When an ant splits, its children inherit not only its position but also its shadow position. Then, the children may diverge, and after shifts, their shadow positions may also diverge.

From this description, it is evident that the shadow position of an ant is one of its previous (real) positions and is therefore a subset of its current position. Each ant can participate in several shifts (or none at all). For convenience, when we later discuss the number of shifts in which a given ant participated, we mean the total number of shifts for both the ant and its ancestors. (This number will be important to us, particularly whether it is finite or infinite.)

2.5 Shifts

The evolution and positions of the ants are determined by the simulation of machine $M$ and do not depend on us. To describe the entire process, we need to explain when and to which ants shifts are applied. A shift depends on a parameter $\varepsilon$ and a function $l(\varepsilon)$ of it. We will fix $\varepsilon$ and the function $l(\varepsilon)$ later.

The conditions for a shift arise when there is a finite set $X$ for which there are more than $\varepsilon$ ants whose positions are supersets of $X$ and whose shadow positions are subsets of $X$. (Here "more" refers to the cumulative "weight" of the ants, that is, the measure of the corresponding set in the Cantor space.) More precisely, the shadow positions of these ants are first temporarily set equal to $X$, and we wait until one of the first $l(\varepsilon)$ cats enters* $X$ (the movement of the cats is simulated from the very beginning until a cat enters the vertex $X$), after which the shadow positions are set equal to the real ones.

Note*: It is possible that none of the first $l(\varepsilon)$ cats will ever enter $X$. However, we will choose $l(\varepsilon)$ in such a way that such a cat is always found.

If there are several vertices $X$ where the conditions for a shift hold, any of them is chosen. Then (with the new positions), it is checked whether there is another vertex where the shift condition is met, and a shift is also applied there, and so on, until there are no vertices where a shift is possible. Then the simulation of machine $M$ and the corresponding processes of ant movement and splitting are resumed.

Note: It is possible in principle that in the same vertex, after a shift is performed, there are conditions for another shift (for example, the positions of a large number of ants and their shadow positions were in $X$, which made the shift possible, but nothing changed after it). However, we do not perform a second shift in the same vertex. (But we check all the others, to see whether there are any other vertices where a shift is possible. Sooner or later, all possible shifts will be exhausted, since there is only a finite number of vertices where a shift is possible at the current simulation moment.)
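The shift condition of this subsection can be checked by brute force on a small configuration. The sketch below is only an illustration under simplifying assumptions: ants are given as hypothetical triples (weight, position, shadow position), candidate sets $X$ are searched among subsets of the union of all positions (exponential in its size), and both the waiting for a cat to enter $X$ and the rule against repeating a shift in the same vertex are omitted.

```python
from fractions import Fraction
from itertools import combinations


def find_shift_vertex(ants, eps):
    """Return a finite set X such that the ants with shadow ⊆ X ⊆ position
    have total weight greater than eps, or None if no such X exists."""
    universe = sorted(set().union(*(pos for _, pos, _ in ants)))
    for size in range(len(universe) + 1):
        for combo in combinations(universe, size):
            X = frozenset(combo)
            total = sum(w for w, pos, shadow in ants if shadow <= X <= pos)
            if total > eps:
                return X
    return None


def apply_shift(ants, X):
    """Move the shadow position of every participating ant to its real position."""
    return [
        (w, pos, pos if shadow <= X <= pos else shadow)
        for w, pos, shadow in ants
    ]


ants = [
    (Fraction(1, 2), frozenset({1, 2}), frozenset()),
    (Fraction(1, 4), frozenset({1, 3}), frozenset({1})),
    (Fraction(1, 4), frozenset({1, 2, 3}), frozenset({1})),
]
X = find_shift_vertex(ants, Fraction(3, 4))
print(sorted(X))                    # [1]
print(apply_shift(ants, X))
```

Only subsets of the union of current positions need to be examined, because any $X$ satisfying the condition is contained in the position of at least one ant.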

2.6 Main Lemma

Lemma 2.1.

Let $S$ be a set whose probability of enumeration is greater than $\varepsilon$ (the threshold from the shift conditions). Let $S' \subset S$ be a finite subset of it. Then at some point, a shift will occur for a set $X$ located between $S'$ and $S$ (i.e., $S' \subset X \subset S$).

Before proving Lemma 2.1, let us show that it implies the desired bound for the complexity of enumerating infinite sets. Consider an increasing sequence of finite sets $S'$ converging to $S$ and apply Lemma 2.1 to each of them, obtaining, for each, its set $X$ (and one of the first $l(\varepsilon)$ cats servicing this set). Since there are only finitely many such cats, one of them must have serviced infinitely many of the sets $S'$, never leaving $S$ and visiting supersets of infinitely many sets $S'$. Therefore, it will eventually enumerate $S$. Thus, we have:

$$I(S) \leq \log_2 l(\varepsilon)$$

Now we will show how to use this to prove Theorem 1.3.

Proof of Theorem 1.3. Let $k$ be an integer. We will show that if we set $\varepsilon = 2^{-k}$ and $l(\varepsilon) = 2^{\alpha k + O(\log k)}$, then during a shift, there will always be a cat among the first $l(\varepsilon)$ that comes to $X$. Using $I(S) \leq \log_2 l(\varepsilon)$, we will then obtain Theorem 1.3.

Consider a randomized machine $M'$ that models the movement of the shadow positions of the ants. To do this, $M'$ needs to know $\varepsilon$, so $M'$ uses the first $2\lceil \log_2 k \rceil + 2$ of its bits to determine $k$ and then computes $\varepsilon = 2^{-k}$. Thus, if at some point shadow ants of total weight at least $2^{-k}$ are concentrated in a set $X$, then $M'$ is in this set with probability at least $\varepsilon' = 2^{-k - 2\log_2 k - 3}$. However,

$$\alpha \log_2 (1/\varepsilon') + O(\log_2 \log_2 (1/\varepsilon')) = \alpha k + O(\log_2 k).$$

Therefore, we can choose $l(\varepsilon)$ such that any set enumerated by $M'$ with probability at least $\varepsilon'$ will be enumerated by one of the first $l(\varepsilon)$ cats. In other words, during a shift, there will always be a cat that comes to $X$.

$\square$
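The estimate used in the proof above can be unpacked step by step. The following is a sketch that treats $\alpha$ as a fixed constant and assumes $k \geq 2$; the explicit constant $6$ is a convenience of this note and is not taken from the paper.

$$
\begin{aligned}
\log_2 (1/\varepsilon') &= k + 2\log_2 k + 3, \\
\alpha \log_2 (1/\varepsilon') &= \alpha k + 2\alpha \log_2 k + 3\alpha = \alpha k + O(\log_2 k), \\
\log_2 \log_2 (1/\varepsilon') &= \log_2 (k + 2\log_2 k + 3) \leq \log_2 (6k) = \log_2 k + \log_2 6 = O(\log_2 k).
\end{aligned}
$$

Adding the last two lines gives $\alpha k + O(\log_2 k)$, which is the exponent allowed in $l(\varepsilon) = 2^{\alpha k + O(\log k)}$.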

Now, it only remains to prove Lemma 2.1.

3 Proof of Lemma 2.1

3.1 Number of Shifts

At every moment $t$ of the process, the Cantor space is divided into cones corresponding to ants. For each ant, we can count the total number of shifts in which it (itself or its ancestors) participated. We define $W_k^t$ as the set in the Cantor space corresponding to the ants with $k$ or more shifts by time $t$. This is a basic (open and closed) set; the larger $k$ is, the smaller $W_k^t$ is (for a given $t$). On the other hand, $W_k^t$ grows with increasing $t$. We can take the union over all $t$ and obtain an open set $W_k$ (it is even effectively open since the process is algorithmic; we assume that $\varepsilon$ is rational). We obtain a decreasing sequence of open sets:

$$W_1 \supset W_2 \supset W_3 \supset \ldots \supset W_k \supset \ldots$$

in the Cantor space. Membership of a sequence $w$ in the set $W_k$ means that in the chain of ants obtained by following the bits of $w$, there were at least $k$ shifts (in total, over all ants in the chain). This chain can be finite or infinite (the same ant, without splitting, can participate in several shifts, or even in infinitely many of them).

Now, we can consider the set $W_\infty = \bigcap_k W_k$. It consists of those sequences for which the corresponding chain of ants is involved in an infinite number of shifts.

3.2 The Case of Non-empty Intersection

In addition to $W_\infty$, consider the set $U$ in the Cantor space consisting of those sequences on which the probabilistic machine $M$ enumerates the set $S$. Do the sets $U$ and $W_\infty$ intersect? We will show that if they do intersect, then the statement of Lemma 2.1 is true, and then we will derive a contradiction from the assumption that the intersection is empty.

Assume that the intersection is not empty, and let $w$ be a sequence belonging to both $U$ and $W_\infty$. Then, with the bits from $w$, the machine $M$ enumerates $S$, and the ants corresponding to $w$ are involved in an infinite number of shifts. Since the machine $M$ enumerates $S$, all the positions of these ants are subsets of $S$. Furthermore, for our finite set $S'$, there will be a moment when the position of the ant becomes a superset of $S'$. The same holds for the shadow position at some later moment, because with each shift (and there are infinitely many of them), the shadow position catches up with the real position. After this, another shift will occur, and the set $X$ for this shift will contain $S'$ (because it contains the current shadow position) and will be contained in $S$ (because it is contained in the real position of the same ant).

3.3 The Case of Empty Intersection

We must derive a contradiction from the assumption that the intersection of the sets $U$ (sequences on which the machine enumerates $S$) and $W_\infty = \bigcap_k W_k$ is empty. (Note that we can forget about $S'$ at this point.) Assume that it is. Then the sets $U \cap W_k$ decrease and have an empty intersection. Therefore, their measures tend to zero, and we can find an $N$ such that the measure of the intersection $U \cap W_N$ is very small (less than some $\delta$, which we will choose later and which will be less than the excess of the measure of $U$ above the threshold $\varepsilon$). Choose and fix such an $N$.

The set $W_N$ is open and is the union of the increasing sequence of sets $W_N^t$. For a sufficiently large $T$, the difference between the measures of $W_N$ and $W_N^T$ can be made arbitrarily small. Recall that for a given $t$, the set $W_N^t$ consists of the cones (in the Cantor space) over the ants (vertices of a binary tree) that have already undergone $N$ shifts by time $t$. The rest of the ants with their cones form the complement of the set $W_N^t$. Splitting of ants does not change these sets, but as time increases, there will be shifts involving these ants or their descendants, and then some part of the complement of $W_N^t$ will move into $W_N^t$. We want this process to be almost complete by time $T$. Moreover, we want the similar processes for all sets $W_1, \ldots, W_N$ to be almost complete as well. We will require that the difference between the measures of $W_n$ and $W_n^T$ is less than $\delta/N$ for all $n = 1, 2, \ldots, N$. (Since $N$ is fixed before choosing $T$, this is possible.) This means that the total measure of the ants that will undergo their $1$st, $2$nd, …, $(N-1)$th shift after time $T$ (either themselves or their descendants) is not greater than $N \cdot (\delta/N) = \delta$.
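The last bound is a union bound, which can be written out as follows. Here $\mu$ denotes the uniform measure on the Cantor space (a symbol introduced for this note only). A sequence whose chain of ants undergoes its $n$th shift after time $T$ lies in $W_n$ but not in $W_n^T$, so for the indices $n = 1, \ldots, N$ covered by the requirement above,

$$
\mu\Big(\bigcup_{n=1}^{N} \big(W_n \setminus W_n^T\big)\Big) \leq \sum_{n=1}^{N} \big(\mu(W_n) - \mu(W_n^T)\big) < N \cdot \frac{\delta}{N} = \delta.
$$

The equality $\mu(W_n \setminus W_n^T) = \mu(W_n) - \mu(W_n^T)$ uses the inclusion $W_n^T \subset W_n$.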

So, consider all ants that have undergone fewer than $N$ shifts by time $T$. What proportion of their continuations consists of sequences from $U$ (which lead to the enumeration of $S$)? By construction, the intersection of $U$ with $W_N$ has measure less than $\delta$, and the intersection with $W_N^T$ is even smaller. Therefore, the measure we are interested in is no less than $P(U) - \delta$. Let us look at the positions of these ants. Some of them have exited $S$ (their positions are not subsets of $S$), and no continuation of such an ant enumerates $S$. Therefore, if we discard them, we are left with a finite set of ants for which:

  • their current positions are subsets of $S$ (and this is also true for their shadow positions).

  • the measure of their continuations that are in $U$ is not less than $P(U) - \delta$.

  • the measure of them and their descendants who will undergo a shift is not more than $\delta$.

Let $X$ be the finite subset of $S$ that is the union of all shadow positions of the ants from this set. All points in $U$ enumerate $S$, so for the corresponding ants, their position will eventually become a superset of $X$. Applying continuity of measure again, we conclude that the measure of the ants whose position becomes a superset of $X$ by some sufficiently late moment is no less than $P(U) - 2\delta$. For some of these ants, the shadow position may have changed, but the measure of such ants is not greater than $\delta$. So if $P(U) - 3\delta > \varepsilon$, conditions for a shift with the set $X$ will be created at some point. Therefore, the measure of the ants that undergo a shift will be greater than $\varepsilon$, and we can assume without loss of generality that $\delta < \varepsilon$. This contradicts the fact that the measure of the ants that undergo a shift is at most $\delta$.

Thus, the case of an empty intersection is impossible.
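For concreteness, one admissible choice of $\delta$ in the argument above can be written out. This particular choice is an assumption of this note; the argument only needs $\delta > 0$, $P(U) - 3\delta > \varepsilon$, and $\delta < \varepsilon$. Since $P(U) > \varepsilon$ by the hypothesis of Lemma 2.1, we may take

$$
\delta = \min\Big\{\frac{\varepsilon}{2},\ \frac{P(U) - \varepsilon}{4}\Big\}, \qquad
P(U) - 3\delta \geq P(U) - \frac{3\,(P(U) - \varepsilon)}{4} = \varepsilon + \frac{P(U) - \varepsilon}{4} > \varepsilon.
$$

With this choice, $\delta \leq \varepsilon/2 < \varepsilon$ and $\delta < P(U) - \varepsilon$, so all the requirements placed on $\delta$ in Subsection 3.3 hold simultaneously.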

4 Discussion

One of the most natural questions in theoretical computer science is how much more efficient randomized computations are compared to deterministic ones. As Theorem 1.3 demonstrates, for enumeration the advantage of randomized machines over deterministic ones is no greater for infinite sets than for finite ones. Moreover, the gap between the complexities can be reduced to a factor of 2 (up to additive logarithmic terms). To prove Theorem 1.3, we used a trick first used by Solovay: instead of cats (the deterministic machine) trying to catch the actual positions of ants (the randomized machine), the cats catch the shadow positions of the ants, which are slightly modified real positions. In Solovay's method, managing shadow positions required transmitting an additional $H(S)$ bits of information. We were able to reduce the number of additional bits to $O(\log H(S))$ and thereby improve the upper bound.

Can we reduce the upper bound even further or find a matching lower bound? In [5], Vereshchagin used Martin's game to obtain an upper bound for finite sets. However, Ageev [1] proved that the quadratic bound is both a lower and an upper bound in Martin's game. Thus, to improve the upper bound further, a novel technique must be developed. It is also unknown whether the lower bound in Martin's game extends to a lower bound on deterministic complexity.

5 Acknowledgements

We extend our gratitude to Nikolay Vereshchagin, Alexander Shen, and Daniil Musatov for their invaluable assistance in both validating our findings and contributing to the writing of this article.

References

  • [1] M. Ageev. Martin’s game: a lower bound for the number of sets. Theoretical Computer Science, 289(1):871–876, 2002.
  • [2] K. de Leeuw, E. F. Moore, C. E. Shannon, and N. Shapiro. Computability by probabilistic machines. In C. E. Shannon and J. McCarthy, editors, Automata Studies, pages 183–212. Princeton University Press, Princeton, NJ, 1956.
  • [3] D. A. Martin. Borel indeterminacy. Annals of Mathematics, 102:363–371, 1978.
  • [4] R. M. Solovay. On random r.e. sets. In A. I. Arruda, N. C. A. da Costa, and R. Chuaqui, editors, Non-Classical Logics, Model Theory and Computability, pages 283–307. North-Holland, Amsterdam, 1977.
  • [5] N. K. Vereshchagin. Kolmogorov complexity of enumerating finite sets. Inform. Process. Lett., 103(1):34–39, 2007.