License: arXiv.org perpetual non-exclusive license
arXiv:2404.04552v1 [cs.DS] 06 Apr 2024

Fast and Simple Sorting Using Partial Information

Bernhard Haeupler
ETH Zurich & CMU
Partially funded by the European Union’s Horizon 2020 ERC grant 949272.

Richard Hladík
ETH Zurich
Supported by the VILLUM Foundation grant 54451. The work was done while this author was visiting BARC at the University of Copenhagen. The author would like to thank Rasmus Pagh for hosting him there.

John Iacono
Université libre de Bruxelles
This work was supported by the Fonds de la Recherche Scientifique-FNRS.

Vaclav Rozhon
ETH Zurich
Supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 853109).

Robert E. Tarjan
Princeton University
Partially supported by a gift from Microsoft.

Jakub Tětek
BARC, Univ. of Copenhagen
Supported by the VILLUM Foundation grant 54451.

Abstract

We consider the problem of sorting a set of items having an unknown total order by doing binary comparisons of the items, given the outcomes of some pre-existing comparisons. We present a simple algorithm with a running time of $\mathrm{O}(m+n+\log T)$, where $n$, $m$, and $T$ are the number of items, the number of pre-existing comparisons, and the number of total orders consistent with the outcomes of the pre-existing comparisons, respectively. The algorithm does $\mathrm{O}(\log T)$ comparisons.

Our running time and comparison bounds are best possible up to constant factors, thus resolving a problem that has been studied intensely since 1976 (Fredman, Theoretical Computer Science). The best previous algorithm with a bound of $\mathrm{O}(\log T)$ on the number of comparisons has a time bound of $\mathrm{O}(n^{2.5})$ and is significantly more complicated. Our algorithm combines three classic algorithms: topological sort, heapsort with the right kind of heap, and efficient insertion into a sorted list.

1 Introduction

We consider the problem of sorting a set of items that are elements of a totally ordered set, assuming we are given for free the outcomes of certain comparisons between the items. This problem has been called sorting under partial information [Fre76, KK92], and it has been intensely studied since 1976.

We present a simple algorithm for this problem that runs in $\mathrm{O}(m+n+\log T)$ time and does $\mathrm{O}(\log T)$ comparisons, where $n$, $m$, and $T$ are the number of items, the number of pre-existing comparisons, and the number of total orders consistent with the pre-existing comparisons, respectively. These bounds are tight. The best previous algorithm with an $\mathrm{O}(\log T)$ comparison bound has a time bound of $\mathrm{O}(n^{2.5})$ and is much more complicated [Car+10].

Our algorithm for this problem combines three classic algorithms in a natural way: topological sort, heapsort with the right kind of heap, and efficient insertion into a sorted list. Unlike previous algorithms, ours does not use an estimate of $T$ to determine the next comparison.

Our algorithm is closely related to a recent result of [Hae+23] proving that Dijkstra’s algorithm with an appropriate heap is universally optimal for the task of sorting vertices according to their distance from a source vertex. We use the same kind of heap, and we use similar ideas to analyze our algorithm.

Graph-theoretical problem formulation

We assume the input is given in the form of a directed graph $G$ whose vertices are the items, having an arc $vw$ for each given comparison outcome $v<w$. The goal is to compute the unknown total order of the vertices of $G$ by doing additional binary comparisons of the vertices. The parameters $n$ and $m$ denote the number of vertices and the number of arcs of $G$, respectively.

The problem has a solution if and only if $G$ is acyclic; that is, $G$ is a directed acyclic graph (DAG). Each possible solution is a topological order of the DAG, a total order such that if $vw$ is an arc, $v<w$. A lower bound on the number of comparisons needed to solve the problem is $\log T$, where $T$ is the number of possible topological orders of $G$. (Throughout this paper, “$\log$” without a base denotes the base-two logarithm.)
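To make the quantity $T$ concrete, the following is a minimal sketch that counts the topological orders of a tiny DAG by dynamic programming over vertex subsets and evaluates the resulting lower bound $\lceil \log T \rceil$. The function name `count_topological_orders`, the vertex numbering 0..n-1, and the example arcs are assumptions made for illustration; the approach takes exponential time and is not part of the algorithm in this paper.

```python
import math
from functools import lru_cache


def count_topological_orders(n, arcs):
    """Count topological orders of a DAG on vertices 0..n-1 (illustrative only).

    Uses dynamic programming over subsets, so it is exponential in n.
    """
    preds = [0] * n  # preds[w] = bitmask of vertices v such that v -> w is an arc
    for v, w in arcs:
        preds[w] |= 1 << v

    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def ways(placed):
        # Number of ways to complete an order whose prefix is exactly `placed`.
        if placed == full:
            return 1
        total = 0
        for w in range(n):
            unplaced = ((placed >> w) & 1) == 0
            if unplaced and (preds[w] & placed) == preds[w]:
                total += ways(placed | (1 << w))
        return total

    return ways(0)


# Hypothetical input: 0 < 1 and 0 < 2 are known; vertex 3 is unconstrained.
T = count_topological_orders(4, [(0, 1), (0, 2)])
lower_bound = math.ceil(math.log2(T))  # comparisons any algorithm needs in the worst case
```

In this example, vertex 0 must precede vertices 1 and 2, which holds in one third of the 24 permutations, so $T = 8$ and at least 3 comparisons are needed.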

An equivalent view of the problem is that the given comparisons define a partial order, and the problem is to find an unknown linear extension of this partial order by doing additional comparisons. We prefer the DAG view, because the input DAG may not be transitively closed (if $uv$ and $vw$ are arcs then so is $uw$) nor transitively reduced (if $vw$ is an arc then there is no other path from $v$ to $w$). We want a solution for an arbitrary DAG, and we do not want to take the time to compute either its transitive reduction or its transitive closure. From now on, we call the problem DAG sorting.

Roadmap

In Section 2, we briefly discuss related work on DAG sorting, on heaps with the working-set bound (a key ingredient in our algorithm), on related sorting and ordering problems, and on sampling and counting topological orders. In Section 3, we present our basic algorithm, which we call topological heapsort. We prove that it runs in $\mathrm{O}(m+n+\log T)$ time and does $\mathrm{O}(n+\log T)$ comparisons. We eliminate the additive $n$ term in the number of comparisons in Section 4 by adding an additional step to our algorithm, producing an algorithm that we call topological heapsort with insertion. This algorithm is best possible to within constant factors in both its running time and its number of comparisons.

2 Related Work

DAG sorting

[Fre76] considered the generalization of the DAG sorting problem in which there is an unknown total order selected from an arbitrary subset of the total orders of $n$ elements, and the problem is to find the unknown total order by doing binary comparisons of the elements. He showed that if there are $T$ possible total orders, $\log T+2n$ binary comparisons suffice. His algorithm is highly inefficient, however.

For the special case of sorting a DAG, [Fre76] and [Lin84] independently conjectured that there always exists a balancing comparison: a comparison “$x<y$?” such that the fraction of the topological orders of $G$ for which the answer is “yes” lies between $\delta$ and $1-\delta$ for $\delta=1/3$. This is the 1/3–2/3 conjecture. [KS84] showed that the claim is true for $\delta=3/11$, and the value of $\delta$ has since been improved several times [KL91, BFT95, Bri99]. It follows immediately that there is an algorithm that always does at most $\log_{\frac{1}{1-\delta}} T$ comparisons: Repeatedly find a balancing comparison.
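The comparison count in the last sentence can be spelled out as follows. This is a short worked derivation under the stated assumption that every comparison is balancing with parameter $\delta$; the numerical constants are rounded.

```latex
% After each balancing comparison, at most a (1 - \delta) fraction of the
% consistent topological orders survives. After c comparisons:
\[
  (1-\delta)^{c}\, T \le 1
  \quad\Longleftrightarrow\quad
  c \ge \log_{\frac{1}{1-\delta}} T = \frac{\log T}{\log\frac{1}{1-\delta}} .
\]
% For the proven value \delta = 3/11:
\[
  \frac{\log T}{\log (11/8)} \approx 2.18 \log T ,
\]
% and for the conjectured value \delta = 1/3:
\[
  \frac{\log T}{\log (3/2)} \approx 1.71 \log T .
\]
```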

[KK92] gave a polynomial-time algorithm for the DAG sorting problem that does $\mathrm{O}(\log T)$ comparisons. Their algorithm uses the ellipsoid method to find a good comparison. [Car+10] gave several algorithms for sorting a DAG that avoid the use of the ellipsoid method. In particular, they devised an algorithm that runs in $\mathrm{O}(n^{2.5})$ time and does $\mathrm{O}(\log T)$ comparisons. This is the fastest previous algorithm that sorts a DAG in $\mathrm{O}(\log T)$ comparisons. Another algorithm in [Car+10] has the same time bound and does $(1+\epsilon)\log T+\mathrm{O}_{\epsilon}(n)$ comparisons.

[DM18] considered a variant of the problem in which only the $k$ smallest elements need to be sorted. They devised an $\mathrm{O}(n\log k+m)$ time algorithm that does $\mathrm{O}(n\log k)$ comparisons. For additional related work, see the survey of [CF13].

Heaps with the working-set bound

Our algorithm substantially deviates from the techniques previously used for DAG sorting. Our basic algorithm is a classical topological sorting algorithm modified to use a heap to choose the next-smallest vertex. We prove that if the heap has the so-called working-set bound, then our DAG sorting algorithm is efficient.

A heap is a data structure $H$ storing a set of items, each having a key selected from a totally ordered set. The heap is initially empty. For our purpose, heaps support two operations. The first is $\textit{insert}(H,x)$, which inserts item $x$ into heap $H$; $x$ must not be in $H$ already and must have a predefined key. The second is $\textit{delete-min}(H)$, which deletes and returns an item in $H$ with minimum key. Some heap implementations support a third operation, $\textit{decrease-key}(H,x,k)$, which, given the location of an item $x$ in heap $H$ such that $x$ has key greater than $k$, replaces the key of $x$ in heap $H$ by $k$. Fibonacci heaps [Fre+86a] and other equally efficient heap implementations support insert and decrease-key in $\mathrm{O}(1)$ amortized time and delete-min on an $n$-item heap in $\mathrm{O}(\log n)$ time.

In our application we need a more refined bound for delete-mins that depends on the sequence of operations done on the heap. This is the working-set bound, defined as follows. Consider an item $x$ that is inserted into the heap at some time and later deleted. An item is in the working set of $x$ if it is $x$ or is inserted after $x$ but before $x$ is deleted. The number of items in the working set of $x$ can change over time as other items are inserted and deleted. The working-set size $w(x)$ of $x$ is the maximum size of its working set while $x$ is in the heap. The heap has the working-set bound if each insertion takes $\mathrm{O}(1)$ time and each delete-min of an item $x$ takes $\mathrm{O}(1+\log w(x))$ time. These bounds can be amortized. (If decrease-key is a supported operation, it has no required time bound.) We call a heap that has the working-set bound a working-set heap. Although it is not obvious, one can prove that a heap with the working-set bound has amortized time bounds of $\mathrm{O}(1)$ for insert and $\mathrm{O}(\log n)$ for delete-min [Hae+24].
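As an illustration of the definition, here is a small sketch that computes the working-set size $w(x)$ of every item from a sequence of heap operations. It assumes the reading in which an item leaves all working sets once it is deleted, so the working set of $x$ at a given time consists of $x$ and the items inserted after $x$ that are still in the heap. The function name `working_set_sizes` and the operation encoding are hypothetical, and the quadratic simulation is for exposition only.

```python
def working_set_sizes(ops):
    """Return a dict mapping each item x to its working-set size w(x).

    `ops` is a list of ("insert", x) and ("delete", x) pairs, where a
    "delete" stands for the delete-min that removes x.
    """
    in_heap = []  # items currently in the heap, in insertion order
    w = {}
    for op, x in ops:
        if op == "insert":
            in_heap.append(x)
            size = len(in_heap)
            for pos, y in enumerate(in_heap):
                # Working set of y right now: y and every later-inserted item
                # that is still present, i.e. the suffix starting at y.
                w[y] = max(w.get(y, 0), size - pos)
        else:
            # Deletions only shrink working sets, so no maximum changes here.
            in_heap.remove(x)
    return w


# Hypothetical operation sequence on items a, b, c, d.
ops = [
    ("insert", "a"), ("insert", "b"), ("insert", "c"),
    ("delete", "b"),
    ("insert", "d"),
    ("delete", "a"), ("delete", "c"), ("delete", "d"),
]
sizes = working_set_sizes(ops)
```

Here item a sees three items at once (a, b, c and later a, c, d), so $w(a)=3$, while d is inserted last and has $w(d)=1$.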

The working-set bound is related to a similar bound for binary search trees, one possessed by splay trees [ST85].

[Iac00] proved that pairing heaps [Fre+86], a form of self-adjusting heap, have the working-set bound, provided that the heap ends empty. Splay trees used as heaps in an appropriate way also have the working-set bound, again provided that the heap ends empty [Hae+24]. [Elm06] devised a heap with an even better bound: The time for a delete-min of $x$ is $\mathrm{O}(1+\log s(x))$, where $s(x)$ is the size of the working set of $x$ when $x$ is deleted. One can obtain the same bound in a more straightforward way using a finger search tree [Hae+24]. [Hae+23] developed a heap with the working-set bound that also supports decrease-key operations in $\mathrm{O}(1)$ amortized time.

See [KS18, Mun+19, EFI13, BHM13] for other input-sensitive time bounds for heaps.

Related sorting and ordering problems

Our approach is strongly influenced by recent work [Hae+23] on the distance-ordering problem. The input to this problem is a directed graph with non-negative arc weights and a source vertex. The problem is to sort the vertices of the input graph by their distance from the source. This problem can be solved in $\mathrm{O}(m+n\log n)$ time and comparisons by running Dijkstra’s algorithm using a Fibonacci heap [Fre+86a]. The bounds are best possible in the worst case. The authors of [Hae+23] improve this worst-case result by giving a so-called universally optimal algorithm: They show that if Dijkstra’s algorithm is implemented using a heap with the working-set bound, then for any fixed unweighted graph $G$ with a given source vertex, the algorithm takes the minimum time needed (to within a constant factor) to solve the problem on $G$ with a worst-case choice of arc weights. Moreover, a variant of Dijkstra’s algorithm minimizes not only the time but also the number of comparisons to within a constant factor.

The DAG sorting problem also asks for a universally optimal algorithm in that on any given DAG it should minimize the running time and number of comparisons. Both problems are generalizations of sorting. Not only are the two problems similar, but we show that they can be solved by similar techniques.

Another algorithm related to ours is the adaptive heapsort algorithm of [LP93]. This algorithm sorts a sequence of numbers by first building a heap-ordered tree, the Cartesian tree of the sequence, defined recursively as follows: The root is the minimum number in the sequence, say $x$, its left subtree is the Cartesian tree of the subsequence preceding $x$, and its right subtree is the Cartesian tree of the subsequence following $x$. The Cartesian tree can be built in at most $2n-3$ comparisons. The algorithm finishes the sort using a heap to store the possible minima, initially only the root. After a delete-min, say of $v$, the children of $v$ in the Cartesian tree are inserted in the heap. [LP93] use a standard heap in their algorithm, but if one uses a working-set heap instead, our results imply that adaptive heapsort does $\mathrm{O}(n+\log T)$ comparisons, where $T$ is the number of topological orders of the Cartesian tree viewed as a DAG.
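The description of adaptive heapsort above can be sketched as follows. This is an illustrative version, not the implementation of [LP93]: it builds the Cartesian tree with the usual stack-based scan and then uses Python's `heapq` binary heap as a stand-in, which does not have the working-set bound. The names `cartesian_tree` and `adaptive_heapsort` are chosen here for illustration, and ties between equal keys are broken by position.

```python
import heapq


def cartesian_tree(seq):
    """Build the min-rooted Cartesian tree of seq.

    Returns (root, left, right), where left[i] and right[i] are the indices
    of the children of position i (or None).
    """
    n = len(seq)
    left = [None] * n
    right = [None] * n
    stack = []  # indices on the rightmost path, keys non-decreasing
    for i in range(n):
        last = None
        while stack and seq[stack[-1]] > seq[i]:
            last = stack.pop()
        left[i] = last              # popped subtree lies before position i
        if stack:
            right[stack[-1]] = i    # i becomes the new right child
        stack.append(i)
    root = stack[0] if stack else None
    return root, left, right


def adaptive_heapsort(seq):
    """Sort seq by repeatedly deleting the minimum among the tree frontier."""
    root, left, right = cartesian_tree(seq)
    heap = [] if root is None else [(seq[root], root)]
    out = []
    while heap:
        key, i = heapq.heappop(heap)
        out.append(key)
        for child in (left[i], right[i]):
            if child is not None:
                heapq.heappush(heap, (seq[child], child))
    return out


result = adaptive_heapsort([3, 1, 4, 1, 5])
```

The heap only ever holds the children of already-output nodes, which is why nearly sorted inputs keep it small.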

An ordering problem that is dual to DAG sorting is that of producing a given partial order on a totally ordered set, by doing enough comparisons so that the partial order induced by the comparison outcomes is isomorphic to the input partial order. The two problems are dual in the sense that if an $n$-element partial order has $T$ topological orders, the partial-order production problem on the same partial order requires $\log n!-\log T$ comparisons [Sch76, Yao89], both worst-case and expected-case. Cardinal et al. [Car+09] gave an algorithm for partial-order production that does a number of comparisons within a lower-order term of the lower bound plus $\mathrm{O}(n)$. One step in their algorithm finds a greedy coloring of the comparability graph of the partially ordered set. We use a similar concept, that of a greedy clique partition of an interval graph (see Section 3.2), but only in our analysis, not in our algorithm, and our partition is of a different graph, one determined by the behavior of our algorithm.

Sampling and counting topological orders

Even though we give an algorithm that does $\mathrm{O}(\log T)$ comparisons, calculating $\log T$ exactly is hard, since the problem of determining $T$ is #P-complete [BW91]. There are algorithms that compute $T$ approximately [DFK91, Ban+10], however. [BW91] have shown that a constant-factor approximation to $T$ can be obtained with high probability using $\mathrm{O}(n^{2}\operatorname{polylog} n)$ calls to an oracle that samples a uniformly random topological order. There are multiple results on sampling a random topological order [Mat91, KK91, BD99, Hub06], with the state-of-the-art approach having $\mathrm{O}(n^{3}\log n)$ time complexity and leading to an $\mathrm{O}(n^{5}\operatorname{polylog} n)$ approximation algorithm for $T$. In Appendix A, we show how to use one oracle call to get a constant-factor approximation to $\log T$, and hence a polynomial approximation to $T$.

3 Topological Heapsort and Its Efficiency

This section introduces our basic algorithm, which we call topological heapsort. The algorithm combines two classic algorithms, topological sort [Knu97] and heapsort [Wil64, Flo64], in a simple way. We describe the algorithm in Section 3.1. In Section 3.2, we prove that our algorithm runs in $\mathrm{O}(m+n+\log T)$ time and does $\mathrm{O}(n+\log T)$ comparisons if it is implemented with a working-set heap.

3.1 Topological Heapsort

We start by recalling a basic result in graph theory: A directed graph is a DAG if and only if it has a topological order. One can prove this by the following classic topological sorting algorithm [Kah62, Knu97]: Call a vertex a source if it has no entering arcs. Given a directed graph, repeatedly delete a source and its outgoing arcs, until there are either no vertices or no sources. In the former case, the vertex deletion order is a topological order; in the latter case, one can find a cycle by starting at any remaining vertex and building a path by repeatedly traversing some arc in the reverse direction and continuing until a vertex is repeated.

To make this algorithm efficient, one needs to keep track of the current set of sources. Kahn [Kah62] does this by maintaining for each vertex its current in-degree (number of incoming arcs). The sources are the vertices with in-degree zero. When a vertex is deleted, the in-degrees of its immediate successors (those reached by an outgoing arc) are decremented. In each iteration, Kahn finds a source by examining all the remaining vertices, which results in $\mathrm{O}(n^{2}+m)$ worst-case running time. Knuth [Knu97] adds the idea of maintaining the current set of sources in a separate data structure, for which he uses a queue. This reduces the running time to $\mathrm{O}(m+n)$.
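The queue-based variant described above can be sketched as follows. This is a minimal illustration assuming vertices are numbered 0..n-1 and the input is a list of arcs; the function name `topological_order` is hypothetical, and returning `None` for a cyclic input is a choice made here rather than something the text prescribes.

```python
from collections import deque


def topological_order(n, arcs):
    """Return a topological order of the graph, or None if it has a cycle."""
    out_arcs = [[] for _ in range(n)]
    in_degree = [0] * n
    for v, w in arcs:
        out_arcs[v].append(w)
        in_degree[w] += 1

    # The queue holds the current sources (vertices with in-degree zero).
    sources = deque(v for v in range(n) if in_degree[v] == 0)
    order = []
    while sources:
        v = sources.popleft()
        order.append(v)
        for w in out_arcs[v]:
            in_degree[w] -= 1      # conceptually deletes the arc v -> w
            if in_degree[w] == 0:
                sources.append(w)  # w has just become a source

    # If some vertex was never output, no sources remained: there is a cycle.
    return order if len(order) == n else None


order = topological_order(4, [(0, 1), (1, 2), (0, 2), (3, 2)])
```

Each vertex enters the queue once and each arc is examined once, which gives the $\mathrm{O}(m+n)$ running time.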

Given a DAG, our task is to find a specific topological order, the one corresponding to the unknown total order of the vertices. Our basic algorithm is the topological sorting algorithm with the current set of sources stored in a heap. The key of a vertex is the vertex itself. Each step deletes the minimum vertex, say $v$, from the heap, adds $v$ to the total order, decrements the in-degree of each vertex $w$ such that $vw$ is an arc, inserts into the heap each such $w$ whose in-degree is now zero, and finally deletes $v$ and its outgoing arcs from the DAG. The complete description of the algorithm is given in Algorithm 1.

This algorithm is not only a version of topological sort, it is also a form of heapsort [Wil64, Flo64], which in turn is a form of selection sort: The algorithm adds the vertices to the total order in increasing order, using a heap to do so. It differs from standard heapsort in that the heap contains only the current sources, which are the only candidates for the next minimum, rather than all the undeleted vertices. We call the algorithm topological heapsort.

Data: A directed acyclic graph $G$
Result: A sorted list of vertices $Q$
Initialize $Q$ as an empty list to store sorted vertices
Compute the in-degree of each vertex
Initialize heap $H$ to contain all vertices with in-degree zero (the sources)
while $G$ is not empty:
       Remove the minimum vertex $v$ from $H$
       Add $v$ to the back of $Q$
       foreach arc $vw$ do
              Decrement the in-degree of $w$
              if $w$ now has in-degree zero (is a source):
                     Add $w$ to $H$
       end foreach
       Delete $v$ and each outgoing arc $vw$ from $G$
Algorithm 1: Topological heapsort

Topological heapsort is correct because if $v$ is an undeleted vertex that is not a source, it cannot be the smallest undeleted vertex. The running time of the algorithm is $\mathrm{O}(m+n)$ plus the time required for $n$ insertions into $H$ and $n$ intermixed delete-mins from $H$. All the vertex comparisons are in the heap operations. The input can be a list of the arcs in $G$; if it is, as part of the initialization we build for each vertex a list of its outgoing arcs.

We implement topological heapsort using a working-set heap.
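For readers who want to experiment, the following is a minimal sketch of topological heapsort. It deliberately substitutes Python's `heapq` binary heap for a working-set heap, so it illustrates the control flow and where comparisons occur but does not achieve the bounds claimed in this paper. The comparison oracle `less`, the wrapper class `Keyed`, and the example ranks are assumptions introduced for illustration.

```python
import heapq


def topological_heapsort(n, arcs, less):
    """Sort vertices 0..n-1 of a DAG using the comparison oracle less(v, w).

    Returns (order, number_of_oracle_calls). Uses heapq as a stand-in heap;
    a working-set heap would be needed for the bounds discussed in the text.
    """
    calls = [0]  # mutable counter shared with the wrapper class below

    class Keyed:
        """Wraps a vertex so that heapq's `<` goes through the oracle."""
        __slots__ = ("v",)

        def __init__(self, v):
            self.v = v

        def __lt__(self, other):
            calls[0] += 1
            return less(self.v, other.v)

    out_arcs = [[] for _ in range(n)]
    in_degree = [0] * n
    for v, w in arcs:
        out_arcs[v].append(w)
        in_degree[w] += 1

    # The heap holds only the current sources.
    heap = [Keyed(v) for v in range(n) if in_degree[v] == 0]
    heapq.heapify(heap)
    order = []
    while heap:
        v = heapq.heappop(heap).v  # smallest source = smallest undeleted vertex
        order.append(v)
        for w in out_arcs[v]:
            in_degree[w] -= 1
            if in_degree[w] == 0:
                heapq.heappush(heap, Keyed(w))
    return order, calls[0]


# Hypothetical hidden order: rank[v] is the position of vertex v.
rank = [2, 0, 3, 1]
order, calls = topological_heapsort(
    4, [(1, 0), (3, 2)], lambda a, b: rank[a] < rank[b]
)
```

All oracle calls happen inside `__lt__`, mirroring the observation that every vertex comparison is made by a heap operation.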

3.2 Efficiency of Topological Heapsort

To prove that topological heapsort is efficient, we must estimate the number of comparisons needed to sort the vertices of a given DAG $G$. This is at least $\log T$, where $T$ is the number of topological orders of $G$, by a standard information-theory lower bound argument.

For completeness we present a version of this argument. Given a sorting algorithm, consider the adversary that begins with the set of all topological orders consistent with the comparison outcomes so far, and that responds to each comparison with the outcome that eliminates at most half of the previously consistent orders. Then the algorithm must do at least $\log T$ comparisons to verify that it has the correct order. If it does fewer, then even if it guesses the order, it will be correct at most half the time if the adversary responds with a uniformly random consistent total order.

We shall show that topological heapsort matches this bound to within a constant factor plus an additive term in $n$. Specifically, we shall prove the following theorem:

Theorem 3.1.

Topological heapsort implemented with a working-set heap runs in $\mathrm{O}(n+m+\log T)$ time and does $\mathrm{O}(n+\log T)$ comparisons.

Our approach is very similar to that used in [Hae+23]. We prove the theorem by developing a lower bound on $\log T$ that can be related to the time required by the delete-min operations in topological heapsort via the working-set bound.

Given a run of topological heapsort, let $t(v)$ and $t'(v)$ be the times at which vertex $v$ is inserted into and deleted from the heap $H$, respectively. We associate the time interval $I(v)=[t(v),t'(v)]$ with vertex $v$. We define the interval graph $\mathcal{I}$ of the run to be the undirected graph whose vertices are the vertices of the DAG $G$, with an edge connecting two vertices if their intervals overlap. A clique of $\mathcal{I}$ is a set of pairwise adjacent vertices. For any clique $C$ of $\mathcal{I}$, there is at least one time that is common to all of the corresponding intervals, namely $\max\{\,t(v)\mid v\in C\,\}$, the insertion time of the last-inserted vertex in the clique. We call this time the critical time of $C$.

We partition the vertices of $\mathcal{I}$ into cliques in a greedy fashion, by selecting any clique of maximum size, deleting its vertices and incident edges, and repeating until there are no vertices left. Since the vertices of $\mathcal{I}$ and $G$ are the same, this also partitions the vertices of $G$. Although the first clique deleted is maximal in $\mathcal{I}$ (no additional vertices can be added), this is not true of later cliques, since it may be possible to add to them vertices that were deleted earlier. Let $C_1,C_2,C_3,\dots,C_k$ be the sequence of cliques, in increasing order by critical time. This order is in general different from their order of selection. We denote by $|C_i|$ the number of vertices in $C_i$.
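The greedy clique partition is used only in the analysis, but a small sketch may help in following it. The code below assumes each vertex comes with a closed interval of distinct integer timestamps and relies on the fact stated earlier that every clique of an interval graph shares a common time, which can be taken to be the insertion time of one of its members. The function name `greedy_clique_partition` and the example intervals are hypothetical, and the cubic running time is irrelevant to the paper's bounds.

```python
def greedy_clique_partition(intervals):
    """Greedily partition vertices into cliques of the interval graph.

    `intervals` maps each vertex to (t_insert, t_delete) with
    t_insert <= t_delete. Returns the cliques sorted by critical time,
    i.e. by the latest insertion time among the clique's members.
    """
    remaining = dict(intervals)
    cliques = []
    while remaining:
        best = []
        # A maximum clique is a largest set of intervals containing a common
        # point; it suffices to try the start point of each remaining interval.
        for s, _ in remaining.values():
            members = [u for u, (a, b) in remaining.items() if a <= s <= b]
            if len(members) > len(best):
                best = members
        cliques.append(best)
        for u in best:
            del remaining[u]
    cliques.sort(key=lambda c: max(intervals[u][0] for u in c))
    return cliques


# Hypothetical run: x is alone in the heap early; a, b, c overlap later.
intervals = {"x": (0, 1), "a": (2, 6), "b": (3, 7), "c": (4, 8)}
partition = greedy_clique_partition(intervals)
```

In this example the clique {a, b, c} is selected first, yet it appears second in the output because its critical time (4) is later than that of {x} (0), illustrating that the order by critical time can differ from the order of selection.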

If $vw$ is an arc of $G$, then $v$ must be deleted from $H$ before $w$ is inserted. Thus if $v\in C_i$ and $w\in C_j$, then $i<j$. In particular, no arcs of $G$ have both ends in the same $C_i$. This allows us to prove the following lemma:

Lemma 3.2.

$\log T\geq\sum_{i=1}^{k}|C_i|\log|C_i|-n\log e$.

Proof.

We can obtain a topological order of $G$ by arranging the vertices of $C_1$ in any order, followed by the vertices of $C_2$ in any order, and so on. Thus $T\geq\prod_{i=1}^{k}|C_i|!$. The lemma follows by taking logarithms and applying Stirling’s approximation. ∎
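For reference, the final step of the proof can be written out as follows. This is a sketch that uses the elementary form $c!\geq(c/e)^{c}$ of Stirling’s approximation, valid for every integer $c\geq 1$, together with the fact that the cliques partition the $n$ vertices.

```latex
\begin{align*}
  \log T
    &\geq \sum_{i=1}^{k} \log\bigl(|C_i|!\bigr)
       && \text{since } T \geq \prod_{i=1}^{k} |C_i|! \\
    &\geq \sum_{i=1}^{k} \bigl(|C_i| \log |C_i| - |C_i| \log e\bigr)
       && \text{since } c! \geq (c/e)^{c} \\
    &= \sum_{i=1}^{k} |C_i| \log |C_i| - n \log e
       && \text{since } \sum_{i=1}^{k} |C_i| = n .
\end{align*}
```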

Lemma 3.2 gives us a lower bound on $\log T$. We need to relate this lower bound to the time taken by the delete-min operations. We need one more definition. If $C$ is any clique of $\mathcal{I}$ (not just a $C_i$), the primary vertex of $C$ is the vertex in $C$ inserted into $H$ the earliest. Using this definition, we can reformulate the definition of working-set size given in Section 2 as follows: The working-set size $w(v)$ of a vertex $v$ is the maximum number of vertices in a clique of $\mathcal{I}$ whose primary vertex is $v$. This definition coincides with the original definition for the original $\mathcal{I}$, but $w(v)$ can decrease as vertices are deleted from $\mathcal{I}$.

The next lemma is the heart of our analysis. It is a variant of Lemmas 3.14 and 3.15 in [Hae+23].

Lemma 3.3.

$\sum_{v\in\mathcal{I}}\log w(v)\leq 2\sum_{i=1}^{k}|C_i|\log|C_i|$, where $w(v)$ is the working-set size of $v$ in $\mathcal{I}$.

Proof.

We prove the lemma by induction on $k$. Suppose the lemma is true for $k-1$. Let $C$ be the first clique selected when building the clique partition, let $\mathcal{I}'$ be $\mathcal{I}$ after the vertices in $C$ and their incident edges are deleted, and let $C'_j$ for $1\leq j\leq k-1$ be a renumbering of the cliques $C_i$, not including $C$. (The order does not matter.) By the induction hypothesis, $\sum_{v\in\mathcal{I}'}\log w'(v)\leq 2\sum_{i=1}^{k-1}|C'_i|\log|C'_i|$, where $w'(v)$ is the working-set size of $v$ in $\mathcal{I}'$.

To prove the bound for $\mathcal{I}$, we must account for the terms for the clique $C$ and its vertices, and also for any increases in the working-set sizes of vertices in $\mathcal{I}'$ caused by adding the vertices in $C$. The first is easy; the second is the technical part of the proof.

Since $C$ is a largest clique in $\mathcal{I}$, $w(v)\leq|C|$ for every vertex $v$ in $C$. Hence $\sum_{v\in C}\log w(v)\leq|C|\log|C|$.

Consider the increase in working-set sizes that results from adding one vertex $v$ in $C$ to $\mathcal{I}'$. This can increase the working-set size of $u\in\mathcal{I}'$ only if $t(u)<t(v)<t'(u)$, and then by at most one. Let $u_1,u_2,\dots,u_j$ be the vertices $u\in\mathcal{I}'$ such that $t(u)<t(v)<t'(u)$, ordered in decreasing order by $t(u_i)$. Then the increase in $\log w(u_i)$ caused by adding $v$ is at most $\log(i+1)-\log i$ by the concavity of the $\log$ function. Summing over $i$, the sum telescopes, and we find that the total increase is at most $\log(j+1)$. The set containing $v$ and all the vertices $u_i$ is a clique in $\mathcal{I}$. By the greedy choice of $C$, $j+1\leq|C|$.
This argument applies sequentially to each addition of a vertex $v$ in $C$ to $\mathcal{I}'$. Thus $\sum_{u\in\mathcal{I}'}(\log w(u)-\log w'(u))\leq|C|\log|C|$.
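
To spell out the telescoping step for a single added vertex $v$ (a worked restatement of the bounds above, not an additional claim):

$$\sum_{i=1}^{j}\bigl(\log(i+1)-\log i\bigr)=\log(j+1)-\log 1=\log(j+1)\leq\log|C|.$$

Repeating this for each of the $|C|$ vertices of $C$ gives a total increase of at most $|C|\log|C|$ in the sum of $\log w(u)$ over $u\in\mathcal{I}'$.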

Adding our two bounds and the bound given by the induction hypothesis gives us the desired bound for $\mathcal{I}$. The lemma follows by induction. ∎

Theorem 3.1 follows immediately from Lemmas 3.2 and 3.3, since the running time of topological heapsort is $\mathrm{O}(m+n)$ plus the time to do the heap operations, the time to do the heap operations is $\mathrm{O}(n+\log T)$ by the two lemmas and the definition of the working-set bound, and all the comparisons are in the heap operations.

4 Topological Heapsort with Insertion

The bound on comparisons in Theorem 3.1 includes an additive term linear in $n$. This term is significant if the number of topological orders of the DAG is small, specifically sub-exponential in $n$. In this section, we augment topological heapsort to eliminate this term in the bound.

The first step is to determine when the additive $n$ term is significant. This is only when the input DAG contains a long path.

Lemma 4.1.

Let $\ell$ be the number of vertices on a longest path in a DAG $G$. Then $G$ has at least $2^{(n-\ell)/2}$ topological orders.

Proof.

Consider the partition of the vertices of $G$ into layers $L_1,L_2,L_3,\dots,L_\ell$, where layer $L_i$ contains all vertices $v$ such that the longest path ending in $v$ contains $i$ vertices. If $vw$ is an arc with $v\in L_i$ and $w\in L_j$, then $i<j$. In particular, there are no arcs with both ends in the same layer. It follows as in the proof of Lemma 3.2 that the number of topological orders of $G$ is at least $\prod_{i=1}^{\ell}|L_i|!$: the vertices can be ordered layer by layer, and those within a layer can be ordered arbitrarily. A layer that contains $2k$ or $2k+1$ vertices contributes at least $(2k)!\geq 2^k$ to this product. This means that each layer $L_i$ contributes to the product at least $2^{(|L_i|-1)/2}$. The number of topological orders is thus at least

$$\prod_{i=1}^{\ell}2^{(|L_i|-1)/2}=2^{\sum_{i=1}^{\ell}(|L_i|-1)/2}=2^{(n-\ell)/2}.$$ ∎
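
As a sanity check of Lemma 4.1 on a small instance, the following sketch computes the layers, the product of the layer factorials, and the exact number of topological orders by brute force. The DAG, the integer vertex labels, and the helper names `layers` and `count_topological_orders` are hypothetical choices made for illustration; the brute-force count is only feasible for very small $n$.

```python
from itertools import permutations
from math import factorial

def layers(n, arcs):
    """layer[v] = number of vertices on a longest path ending in v."""
    succs = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in arcs:
        succs[u].append(v)
        indeg[v] += 1
    order = [v for v in range(n) if indeg[v] == 0]
    layer = [1] * n
    for u in order:  # Kahn-style scan; `order` grows while we iterate
        for v in succs[u]:
            layer[v] = max(layer[v], layer[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                order.append(v)
    return layer

def count_topological_orders(n, arcs):
    """Brute force over all n! permutations; only sensible for tiny n."""
    total = 0
    for perm in permutations(range(n)):
        pos = {v: i for i, v in enumerate(perm)}
        if all(pos[u] < pos[v] for u, v in arcs):
            total += 1
    return total

# Hypothetical 6-vertex DAG chosen for illustration.
n = 6
arcs = [(0, 1), (1, 2), (0, 3), (3, 4), (5, 2)]
layer = layers(n, arcs)
ell = max(layer)  # number of vertices on a longest path
layer_product = 1
for i in range(1, ell + 1):
    layer_product *= factorial(layer.count(i))  # prod of |L_i|!
T = count_topological_orders(n, arcs)
print(ell, layer_product, T, 2 ** ((n - ell) / 2))
```

On this instance the three quantities should satisfy $T\geq\prod_i|L_i|!\geq 2^{(n-\ell)/2}$, mirroring the chain of inequalities in the proof.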

By Lemma 4.1, topological heapsort uses the desired bound of $\mathrm{O}(\log T)$ comparisons to sort the vertices of any DAG whose maximum-length path contains at most $(1-\epsilon)n$ vertices, where $\epsilon$ is any positive constant, since in this case $\log T=\Omega(n)$. Thus we only need a way to handle any DAG that has a path containing at least a large constant fraction of the vertices. To handle such a DAG, we find a longest path and run topological heapsort without inserting any of the vertices on this path into the heap. Instead, we insert each vertex returned by a delete-min into the long path using an exponential search followed by a binary search on the relevant part of the path to find the insertion position.
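
The arithmetic behind the first claim of the paragraph above can be written out as follows (a worked restatement, with $\epsilon$ treated as a constant). If $\ell\leq(1-\epsilon)n$, Lemma 4.1 gives

$$\log T\geq\frac{n-\ell}{2}\geq\frac{\epsilon n}{2},\qquad\text{so}\qquad n\leq\frac{2}{\epsilon}\log T,$$

and hence a bound of the form $\mathrm{O}(n+\log T)$ is $\mathrm{O}((2/\epsilon+1)\log T)=\mathrm{O}(\log T)$.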

Topological heapsort with insertion

The resulting sorting algorithm, which we call topological heapsort with insertion, is as follows:

Figure 1: This picture shows how the input DAG $G$ is transformed into a DAG $G'$ in Item 1 of topological heapsort with insertion.
  1. Item 1: Find a longest path $P$ in the given DAG $G$. Mark the last vertex of $P$. For each vertex $v$ not on $P$, mark the last vertex $u$ on $P$ such that $uv$ is an arc and the first vertex $w$ on $P$ such that $vw$ is an arc. Delete all arcs between $v$ and $P$ other than $uv$ and $vw$. Form DAG $G'$ from $G$ by adding an arc $xy$ from each marked vertex $x$ on $P$ other than the last one to the next marked vertex $y$, and deleting all unmarked vertices on $P$ and their incident arcs. Save the vertices that were originally on $P$ in an array $L$.

  2. Item 2: Run topological heapsort on $G'$. Each time a vertex $v$ is deleted from $H$, do the following: Delete from $L$ all vertices less than or equal to $v$, and add these vertices to the sorted list $Q$ in their order in $L$. If $v$ itself was not among them, add $v$ to $Q$ as well. Then continue executing topological heapsort.
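
The construction of $G'$ in Item 1 can be sketched as follows, taking the longest path as given. This is a minimal illustration under stated assumptions, not the authors' implementation: vertices are hashable labels, arcs are `(tail, head)` pairs, and the function name `build_g_prime` and the example DAG are hypothetical.

```python
def build_g_prime(arcs, path):
    """Return (marked, arcs_prime) for the transformation in Item 1.

    arcs: iterable of (u, v) pairs of the DAG G.
    path: list of the vertices of a longest path P, in path order.
    """
    pos = {v: i for i, v in enumerate(path)}
    marked = {path[-1]}  # the last vertex of P is always marked
    last_pred = {}       # v off P -> last u on P with arc uv
    first_succ = {}      # v off P -> first w on P with arc vw
    kept = set()         # arcs with both ends on P or both ends off P
    for u, v in arcs:
        if u in pos and v not in pos:
            if v not in last_pred or pos[u] > pos[last_pred[v]]:
                last_pred[v] = u
        elif u not in pos and v in pos:
            if u not in first_succ or pos[v] < pos[first_succ[u]]:
                first_succ[u] = v
        else:
            kept.add((u, v))
    # Keep only the arcs uv and vw for each off-path vertex v; mark u and w.
    for v, u in last_pred.items():
        marked.add(u)
        kept.add((u, v))
    for v, w in first_succ.items():
        marked.add(w)
        kept.add((v, w))

    def alive(x):
        # Off-path vertices and marked path vertices survive in G'.
        return x not in pos or x in marked

    arcs_prime = {(u, v) for u, v in kept if alive(u) and alive(v)}
    # Link each marked path vertex to the next marked path vertex.
    chain = [x for x in path if x in marked]
    arcs_prime.update(zip(chain, chain[1:]))
    return marked, arcs_prime

# Hypothetical example: path 0-1-2-3-4 plus one off-path vertex 5.
example_arcs = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 5), (1, 5), (5, 3), (5, 4)]
marked, arcs_prime = build_g_prime(example_arcs, [0, 1, 2, 3, 4])
print(sorted(marked), sorted(arcs_prime))
```

In the example, vertex 5 keeps only its arc from vertex 1 and its arc to vertex 3, so vertices 0 and 2 are unmarked and disappear from $G'$.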

To do Item 1, find any topological order of the vertices of $G$ and process the vertices in topological order as follows: Compute for each vertex $v$ the length $\ell(v)$ of the longest path ending in $v$, using the recurrence $\ell(v)=\max(\{0\}\cup\{\,\ell(u)+1\mid uv\in E\,\})$, where $E$ is the set of arcs of $G$. Find a longest path $P$ by starting at a vertex $v$ of largest $\ell(v)$ and proceeding backward, always to a predecessor $u$ of largest $\ell(u)$. Finding a longest path takes $\mathrm{O}(m+n)$ time and no vertex comparisons.
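
The longest-path computation just described can be sketched as below. The function name `longest_path`, the integer vertex labels `0..n-1`, and the example DAG are assumptions made for illustration; the topological order is obtained with a standard in-degree scan, which is one possible choice.

```python
def longest_path(n, arcs):
    """Return the vertices of a longest path in a DAG on vertices 0..n-1."""
    succs = [[] for _ in range(n)]
    preds = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in arcs:
        succs[u].append(v)
        preds[v].append(u)
        indeg[v] += 1
    # Any topological order will do; no vertex comparisons are needed.
    topo = [v for v in range(n) if indeg[v] == 0]
    for u in topo:  # `topo` grows while we iterate
        for v in succs[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                topo.append(v)
    # ell[v] = number of arcs on a longest path ending in v.
    ell = [0] * n
    for v in topo:
        for u in preds[v]:
            ell[v] = max(ell[v], ell[u] + 1)
    # Walk backward from a vertex of largest ell, always to a predecessor
    # of largest ell, until a source is reached.
    v = max(range(n), key=lambda x: ell[x])
    path = [v]
    while preds[v]:
        v = max(preds[v], key=lambda u: ell[u])
        path.append(v)
    path.reverse()
    return path

demo_arcs = [(0, 1), (1, 2), (0, 3), (3, 4), (5, 2)]
demo_path = longest_path(6, demo_arcs)
print(demo_path)
```

Each arc is examined a constant number of times, which matches the stated $\mathrm{O}(m+n)$ bound.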

In Item 2, find the set of vertices in $L$ less than or equal to $v$ as follows: If $v$ is in $L$, then this set is the prefix of $L$ ending with $v$. Testing this requires no vertex comparisons. If $v$ is not in $L$, use exponential search followed by binary search: Compare $v$ with the first, second, fourth, eighth, … vertex in $L$ until finding a pair of vertices $y$ and $z$ in $L$ such that $y<v<z$ (possibly $y=\emptyset$ if $v$ is less than the first vertex in $L$). Then do a binary search on the set of vertices in $L$ between $y$ and $z$ to find the last vertex $x$ in $L$ that is less than $v$ (if any); the desired set is the prefix of $L$ ending with $x$. A search that returns the $k$-th vertex on $P$ takes $\mathrm{O}(\log k)$ time and $\mathrm{O}(\log k)$ comparisons.
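
The search and the surrounding loop of Item 2 can be sketched as follows. This is an illustration under several assumptions: `rank` is a hypothetical stand-in for the unknown total order, Python's `heapq` (a binary heap) replaces the heap with the working-set bound that the analysis requires, and only the comparisons made by the searches are counted. The names `count_less` and `heapsort_with_insertion` are not from the paper.

```python
import heapq

def count_less(L, start, v, less):
    """Number of items in the sorted suffix L[start:] that are less than v.

    Exponential search probes the 1st, 2nd, 4th, 8th, ... item of the suffix;
    binary search then finishes inside the last doubling interval.
    """
    n = len(L) - start
    p = 1
    while p <= n and less(L[start + p - 1], v):
        p *= 2
    lo, up = p // 2, min(p - 1, n)  # the answer k satisfies lo <= k <= up
    while lo < up:
        mid = (lo + up + 1) // 2
        if less(L[start + mid - 1], v):
            lo = mid
        else:
            up = mid - 1
    return lo

def heapsort_with_insertion(vertices, arcs, L, rank):
    """Item 2 on G' = (vertices, arcs); L holds the original path vertices."""
    comparisons = 0

    def less(a, b):
        nonlocal comparisons
        comparisons += 1
        return rank[a] < rank[b]

    pos = {v: i for i, v in enumerate(L)}
    succs = {v: [] for v in vertices}
    indeg = {v: 0 for v in vertices}
    for u, v in arcs:
        succs[u].append(v)
        indeg[v] += 1
    heap = [(rank[v], v) for v in vertices if indeg[v] == 0]
    heapq.heapify(heap)
    Q, front = [], 0  # L[front:] is the part of the path not yet output
    while heap:
        _, v = heapq.heappop(heap)
        if v in pos:                       # v lies on the path: no comparisons
            end = pos[v] + 1
        else:                              # v is off the path: search for it
            end = front + count_less(L, front, v, less)
        Q.extend(L[front:end])
        front = end
        if v not in pos:
            Q.append(v)
        for w in succs[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                heapq.heappush(heap, (rank[w], w))
    Q.extend(L[front:])
    return Q, comparisons

# Hypothetical G' for the path 0-1-2-3-4 with one off-path vertex 5.
rank = {0: 0, 1: 1, 2: 2, 5: 3, 3: 4, 4: 5}  # hidden order 0 < 1 < 2 < 5 < 3 < 4
Q, comparisons = heapsort_with_insertion(
    [1, 3, 4, 5], [(1, 3), (3, 4), (1, 5), (5, 3)], [0, 1, 2, 3, 4], rank)
print(Q, comparisons)
```

Keeping an index `front` into `L` instead of physically deleting a prefix is a design choice of this sketch; it avoids copying and keeps each search proportional to the logarithm of the prefix it removes.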

The running time of topological heapsort with insertion is $\mathrm{O}(m+n+\log T)$. We shall show that the number of comparisons it does is $\mathrm{O}(\log T)$. We need the following simple monotonicity lemma:

Lemma 4.2.

Let $G$ be a DAG. Suppose that a DAG $G'$ is formed by (i) deleting an arc $vw$ such that there is a path from $v$ to $w$ avoiding this arc, (ii) adding an arc to $G$, (iii) deleting a vertex $v$ with one incoming arc $uv$ and one outgoing arc $vw$ and adding an arc $uw$, or (iv) removing a source vertex. Then $G$ has at least as many topological orders as $G'$.

Proof.

(i) If $G'$ is formed by deleting an arc $vw$ such that there is a path from $v$ to $w$ avoiding $vw$, then the set of topological orders of $G$ is the same as that of $G'$. (ii) If $G'$ is formed from $G$ by adding an arc, then any topological order of $G'$ is a topological order of $G$. (iii) If $G'$ is formed from $G$ by deleting a vertex $v$ with incident arcs $uv$ and $vw$ and adding $uw$, then any topological order of $G'$ can be extended to one of $G$ by inserting $v$ anywhere between $u$ and $w$. (iv) If $G'$ is formed from $G$ by removing a source vertex $v$, then any topological order of $G'$ may be extended to a topological order of $G$ by inserting $v$ before all the vertices of $G'$. ∎
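
The four cases of Lemma 4.2 can be checked by brute force on a tiny instance. The DAG below and the helper name `count_orders` are hypothetical and chosen only for illustration; the check enumerates all permutations and is not meant to scale.

```python
from itertools import permutations

def count_orders(vertices, arcs):
    """Count topological orders by brute force (tiny inputs only)."""
    total = 0
    for perm in permutations(vertices):
        pos = {v: i for i, v in enumerate(perm)}
        if all(pos[u] < pos[v] for u, v in arcs):
            total += 1
    return total

# Hypothetical DAG G: 0 -> 1 -> 2 and 0 -> 3.
V = [0, 1, 2, 3]
G = [(0, 1), (1, 2), (0, 3)]
T_G = count_orders(V, G)

# (i) The arc 0 -> 2 is implied by the path 0 -> 1 -> 2, so a DAG that
#     contains it has the same orders after the arc is deleted.
T_redundant = count_orders(V, G + [(0, 2)])
# (ii) Adding the arc 3 -> 2 can only remove orders.
T_added = count_orders(V, G + [(3, 2)])
# (iii) Delete vertex 1 (one arc in, one arc out) and add the arc 0 -> 2.
T_contracted = count_orders([0, 2, 3], [(0, 2), (0, 3)])
# (iv) Remove the source vertex 0.
T_no_source = count_orders([1, 2, 3], [(1, 2)])

print(T_G, T_redundant, T_added, T_contracted, T_no_source)
```

In each case the count for the modified DAG should not exceed the count for the DAG it was derived from.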

Theorem 4.3.

Topological heapsort with insertion sorts the vertices of a DAG $G$ in $\mathrm{O}(m+n+\log T)$ time and $\mathrm{O}(\log T)$ comparisons.

Proof.

The time that the algorithm spends outside of Algorithm 1 is $\mathrm{O}(n+m)$. Hence the running time is $\mathrm{O}(m+n+\log T)$ by Theorem 3.1. It remains to prove that the algorithm does $\mathrm{O}(\log T)$ comparisons.

Let $\ell$ be the number of vertices on $P$. By Lemma 4.1, $(n-\ell)/2\leq\log T$. The number of vertices in $G'$ is at most $3(n-\ell)$, and by Lemma 4.2 $G'$ has at most $T$ topological orders, so the number of comparisons needed by topological heapsort when run on $G'$ is $\mathrm{O}((n-\ell)+\log T)=\mathrm{O}(\log T)$ by Theorem 3.1.

Consider the search that Item 2 does for a vertex $v$ not in $P$. We use $P(v)$ to denote the set of vertices removed from $P$ when processing $v$ in Item 2. The number of comparisons the search takes is $\mathrm{O}(1+\log(|P(v)|+1))$. For any correct order of the vertices of $G$ and any node $v$, consider the $|P(v)|+1$ topological orders obtained by moving $v$ by $0$, $1$, $\ldots$, $|P(v)|$ positions to the left in the current topological order. Note that these are indeed valid topological orders, since the $|P(v)|$ vertices of $G$ immediately preceding $v$ have to be part of $P$ by the definition of $P(v)$. We may shift $v$ in this fashion independently for every vertex $v$. Therefore, $T\geq\prod_{v\in V(G')}(|P(v)|+1)$.
We can rewrite this inequality as $\sum_{v\in V(G')}\log(|P(v)|+1)\leq\log T$; and, finally, conclude that $\sum_{v\in V(G')}\left(1+\log(|P(v)|+1)\right)\leq\log T+(n-\ell)=\mathrm{O}(\log T)$. Thus our algorithm does at most $\mathrm{O}(\log T)$ comparisons during Item 2.

Combining the bounds for running topological heapsort on $G'$ and inserting vertices into $P$ gives the theorem. ∎

Remark.

If Item 1 finds that the longest path has length at most $(1-\epsilon)n$ for some fixed positive $\epsilon$, the algorithm can skip the construction of $G'$ and just run topological heapsort on the original graph $G$. Also, if the problem must be solved repeatedly for a fixed DAG with different total orders, Item 1 needs to be run only once.

References

  • [Ban+10] Jacqueline Banks, Scott Garrabrant, Mark L Huber and Anne Perizzolo “Using TPA to count linear extensions” In arXiv preprint arXiv:1010.4981, 2010
  • [BHM13] Prosenjit Bose, John Howat and Pat Morin “A history of distribution-sensitive data structures” In Space-Efficient Data Structures, Streams, and Algorithms: Papers in Honor of J. Ian Munro on the Occasion of His 66th Birthday Springer, 2013, pp. 133–149
  • [Bri99] Graham Brightwell “Balanced pairs in partial orders” In Discrete Mathematics 201.1-3 Elsevier, 1999, pp. 25–52
  • [BW91] Graham Brightwell and Peter Winkler “Counting linear extensions is #P-complete” In Proceedings of the twenty-third annual ACM symposium on Theory of computing, 1991, pp. 175–181
  • [BFT95] Graham R Brightwell, Stefan Felsner and William T Trotter “Balancing pairs and the cross product conjecture” In Order 12 Springer, 1995, pp. 327–349
  • [BD99] Russ Bubley and Martin Dyer “Faster random generation of linear extensions” In Discrete mathematics 201.1-3 Elsevier, 1999, pp. 81–88
  • [CF13] Jean Cardinal and Samuel Fiorini “On generalized comparison-based sorting problems” In Space-Efficient Data Structures, Streams, and Algorithms: Papers in Honor of J. Ian Munro on the Occasion of His 66th Birthday Springer, 2013, pp. 164–175
  • [Car+10] Jean Cardinal et al. “Sorting under partial information (without the ellipsoid algorithm)” In Proceedings of the forty-second ACM symposium on Theory of computing, 2010, pp. 359–368
  • [Car+09] Jean Cardinal et al. “An efficient algorithm for partial order production” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009 ACM, 2009, pp. 93–100
  • [DM18] Eyal Dushkin and Tova Milo “Top-k sorting under partial order information” In Proceedings of the 2018 International Conference on Management of Data, 2018, pp. 1007–1019
  • [DFK91] Martin Dyer, Alan Frieze and Ravi Kannan “A random polynomial-time algorithm for approximating the volume of convex bodies” In Journal of the ACM (JACM) 38.1 ACM New York, NY, USA, 1991, pp. 1–17
  • [Elm06] Amr Elmasry “A priority queue with the working-set property” In International Journal of Foundations of Computer Science 17.06 World Scientific, 2006, pp. 1455–1465
  • [EFI13] Amr Elmasry, Arash Farzan and John Iacono “On the hierarchy of distribution-sensitive properties for data structures” In Acta informatica 50.4 Springer, 2013, pp. 289–295
  • [Flo64] Robert W Floyd “Algorithm 245: treesort” In Communications of the ACM 7.12 ACM New York, NY, USA, 1964, pp. 701
  • [Fre76] Michael L Fredman “How good is the information theory bound in sorting?” In Theoretical Computer Science 1.4 Elsevier, 1976, pp. 355–361
  • [Fre+86] Michael L Fredman, Robert Sedgewick, Daniel D Sleator and Robert E Tarjan “The pairing heap: A new form of self-adjusting heap” In Algorithmica 1.1-4 Springer, 1986, pp. 111–129
  • [Fre+86a] Michael L. Fredman, Robert Sedgewick, Daniel Dominic Sleator and Robert Endre Tarjan “The Pairing Heap: A New Form of Self-Adjusting Heap” In Algorithmica 1.1, 1986, pp. 111–129 DOI: 10.1007/BF01840439
  • [Hae+24] Bernhard Haeupler et al. “Heaps with the Working-Set Bound” Preprint, 2024
  • [Hae+23] Bernhard Haeupler et al. “Universal Optimality of Dijkstra via Beyond-Worst-Case Heaps”, 2023 arXiv:2311.11793 [cs.DS]
  • [Hub06] Mark Huber “Fast perfect sampling from linear extensions” In Discrete Mathematics 306.4 Elsevier, 2006, pp. 420–428
  • [Iac00] John Iacono “Improved upper bounds for pairing heaps” In Scandinavian Workshop on Algorithm Theory, 2000, pp. 32–45 Springer
  • [Kah62] Arthur B Kahn “Topological sorting of large networks” In Communications of the ACM 5.11 ACM New York, NY, USA, 1962, pp. 558–562
  • [KK92] Jeff Kahn and Jeong Han Kim “Entropy and sorting” In Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, 1992, pp. 178–187
  • [KL91] Jeff Kahn and Nathan Linial “Balancing extensions via Brunn-Minkowski” In Combinatorica 11.4, 1991, pp. 363–368
  • [KS84] Jeff Kahn and Michael Saks “Balancing poset extensions” In Order 1 Springer, 1984, pp. 113–126
  • [KK91] Alexander Karzanov and Leonid Khachiyan “On the conductance of order Markov chains” In Order 8 Springer, 1991, pp. 7–15
  • [Knu97] Donald E Knuth “The Art of Computer Programming: Fundamental Algorithms, volume 1” Addison-Wesley Professional, 1997
  • [KS18] László Kozma and Thatchaphol Saranurak “Smooth heaps and a dual view of self-adjusting data structures” In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, 2018, pp. 801–814
  • [LP93] Christos Levcopoulos and Ola Petersson “Adaptive heapsort” In Journal of Algorithms 14.3 Elsevier, 1993, pp. 395–413
  • [Lin84] Nathan Linial “The information-theoretic bound is good for merging” In SIAM Journal on Computing 13.4 SIAM, 1984, pp. 795–801
  • [Mat91] Peter Matthews “Generating a random linear extension of a partial order” In The Annals of Probability 19.3 Institute of Mathematical Statistics, 1991, pp. 1367–1392
  • [Mun+19] J Ian Munro, Richard Peng, Sebastian Wild and Lingyi Zhang “Dynamic Optimality Refuted–For Tournament Heaps” In arXiv preprint arXiv:1908.00563, 2019
  • [Sch76] A. Schönhage “The Production of Partial Orders” In Astérisque 38-39 Soc. Math. France, Paris, 1976, pp. 229–246
  • [Sha48] Claude Elwood Shannon “A mathematical theory of communication” In The Bell system technical journal 27.3 Nokia Bell Labs, 1948, pp. 379–423
  • [ST85] Daniel Dominic Sleator and Robert Endre Tarjan “Self-adjusting binary search trees” In Journal of the ACM (JACM) 32.3 ACM New York, NY, USA, 1985, pp. 652–686
  • [Wil64] J Williams “Heapsort” In Commun. ACM 7.6, 1964, pp. 347–348
  • [Yao89] Andrew Chi-Chih Yao “On the Complexity of Partial Order Productions” In SIAM J. Comput. 18.4, 1989, pp. 679–689

Appendix A Sampling and Counting Topological Orders

Given a DAG $G$ and its corresponding number of topological orders $T$, our algorithm yields a simple way of estimating the value of $\log T$ to within a constant factor. The idea is that the DAG sorting problem with an unknown total order selected uniformly at random takes $\Omega(\log T)$ comparisons with high probability. Thus if the algorithm is run on one sample selected uniformly at random, we obtain a good approximation of $\log T$.

Theorem A.1.

Let $G$ be a directed acyclic graph with $T$ topological orders. Assume that there is an algorithm that returns a topological order $\mathrm{O}(1)$-pointwise close to uniform in time $t_{\mathrm{sample}}$. (We say that a distribution $p$ is $c$-pointwise close to $q$ if for every element $x$ we have $q_x/c\leq p_x\leq c\cdot q_x$.) Then there is an algorithm that runs in time $\mathrm{O}(t_{\mathrm{sample}}+\log T)$, performs $\mathrm{O}(\log T)$ comparisons, and returns a constant-factor approximation of the value $\log T$ with error probability $\mathrm{O}(1/T^{0.9})$.

Proof.

The algorithm merely samples a topological order, runs topological heapsort with insertion on the sample, and returns the number of comparisons made by the algorithm. The running time of the algorithm is $\mathrm{O}(n+m+\log T+t_{\mathrm{sample}})$. Since in order to sample the order, we have to read the whole input DAG, this is equal to the desired $\mathrm{O}(t_{\mathrm{sample}}+\log T)$. The number of comparisons it does is $\mathrm{O}(\log T)$ (note that we do not use any comparisons when generating the sample).

The number of comparisons $X$ done by topological heapsort with insertion is $X=\mathrm{O}(\log T)$ by Theorem 4.3. Thus the value our algorithm returns is at most a constant factor larger than $\log T$.

It remains to give a similar lower bound. We prove that with high probability $X=\Omega(\log T)$. Suppose $\mathcal{O}$ is a sample $c$-pointwise close to uniform. Consider the event $\mathcal{E}$ that $X\leq\frac{\log(T/c)}{10}$. For any topological order $\mathcal{O}'$, we have

$$P[\mathcal{O}=\mathcal{O}'\mid\mathcal{E}]=\frac{P[\mathcal{O}=\mathcal{O}'\;\cap\;\mathcal{E}]}{P[\mathcal{E}]}\leq\frac{P[\mathcal{O}=\mathcal{O}']}{P[\mathcal{E}]}\leq\frac{c/T}{P[\mathcal{E}]}.$$

The conditional entropy of $\mathcal{O}$ is then

$$H(\mathcal{O}\mid\mathcal{E})=\sum_{\mathcal{O}'\in\mathcal{E}}P[\mathcal{O}=\mathcal{O}'\mid\mathcal{E}]\cdot\log\frac{1}{P[\mathcal{O}=\mathcal{O}'\mid\mathcal{E}]}\geq\log\left(\frac{T\cdot P[\mathcal{E}]}{c}\right).$$

The comparisons performed by topological heapsort with insertion uniquely determine the topological order $\mathcal{O}$. Thus by Shannon's source coding theorem for symbol codes [Sha48], we have $E[X\mid\mathcal{E}]\geq H(\mathcal{O}\mid\mathcal{E})$, and we can write

$$\frac{\log(T/c)}{10}\geq E[X\mid\mathcal{E}]\geq H(\mathcal{O}\mid\mathcal{E})\geq\log\left(\frac{T\cdot P[\mathcal{E}]}{c}\right).$$

Solving for $P[\mathcal{E}]$, we get

$$P[\mathcal{E}]\leq\left(\frac{c}{T}\right)^{9/10},$$

which concludes the proof. ∎
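
For completeness, the last step can be unpacked as follows (a routine manipulation of the chain of inequalities above, not an additional claim). Exponentiating both ends of the chain gives

$$\left(\frac{T}{c}\right)^{1/10}\geq\frac{T\cdot P[\mathcal{E}]}{c}\quad\Longrightarrow\quad P[\mathcal{E}]\leq\frac{c}{T}\left(\frac{T}{c}\right)^{1/10}=\left(\frac{c}{T}\right)^{9/10},$$

which is $\mathrm{O}(1/T^{0.9})$ because $c=\mathrm{O}(1)$.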