
Compact Parallel Hash Tables on the GPU

Authors: Steef Hegeman, Daan Wöltgens, Anton Wijs, Alfons Laarman

Abstract: On the GPU, hash table operation speed is determined in large part by cache line efficiency, and state-of-the-art hashing schemes thus divide tables into cache line-sized buckets. This raises the question whether performance can be further improved by increasing the number of entries that fit in such buckets. Known compact hashing techniques have not yet been adapted to the massively parallel setting, nor have they been evaluated on the GPU. We consider a compact version of bucketed cuckoo hashing, and a version of compact iceberg hashing suitable for the GPU. We discuss the tables from a theoretical perspective, and provide an open source implementation of both schemes in CUDA for comparative benchmarking. In terms of performance, the state-of-the-art cuckoo hashing benefits from compactness on lookups and insertions (most experiments show at least 10-20% increase in throughput), and the iceberg table benefits significantly, to the point of being comparable to compact cuckoo hashing--while supporting performant dynamic operation.

Submitted 13 June, 2024; originally announced June 2024.
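
Note on the entry above: the abstract does not spell out what makes a hash table "compact". A minimal sketch of the general quotienting idea behind compact hashing follows, assuming a bijective hash so that the bucket address already encodes part of the key and only the remaining bits need to be stored. The constants and the names `split` and `restore` are hypothetical, chosen for illustration; this is not the paper's CUDA implementation.

```python
# Quotienting sketch: a 16-bit key is permuted by a bijective hash, the low
# ADDR_BITS bits select the bucket, and only the remaining bits are stored.
# All constants and names here are illustrative assumptions.
KEY_BITS = 16
ADDR_BITS = 6
MASK = (1 << KEY_BITS) - 1
MULT = 0x9E37                               # odd, so key -> key * MULT is a bijection mod 2**16
MULT_INV = pow(MULT, -1, 1 << KEY_BITS)     # modular inverse (Python 3.8+)


def split(key):
    """Return (bucket address, remainder) for a key in [0, 2**KEY_BITS)."""
    h = (key * MULT) & MASK
    return h & ((1 << ADDR_BITS) - 1), h >> ADDR_BITS


def restore(addr, rem):
    """Recover the original key from the bucket address and stored remainder."""
    h = (rem << ADDR_BITS) | addr
    return (h * MULT_INV) & MASK
```

In this sketch each stored entry needs only `KEY_BITS - ADDR_BITS` bits instead of `KEY_BITS`, which is the effect that lets more entries fit in a cache line-sized bucket.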

arXiv:2403.18813 [pdf, ps, other]

Equivalence Checking of Quantum Circuits by Model Counting

Authors: Jingyi Mei, Tim Coopmans, Marcello Bonsangue, Alfons Laarman

Abstract: Verifying equivalence between two quantum circuits is a hard problem that is nonetheless crucial in compiling and optimizing quantum algorithms for real-world devices. This paper gives a Turing reduction of the (universal) quantum circuits equivalence problem to weighted model counting (WMC). Our starting point is a folklore theorem showing that equivalence checking of quantum circuits can be done in the so-called Pauli-basis. We combine this insight with a WMC encoding of quantum circuit simulation, which we extend with support for the Toffoli gate. Finally, we prove that the weights computed by the model counter indeed realize the reduction. With an open-source implementation, we demonstrate that this novel approach can outperform a state-of-the-art equivalence-checking tool based on ZX calculus and decision diagrams.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.07197 [pdf, ps, other]

Simulating Quantum Circuits by Model Counting

Authors: Jingyi Mei, Marcello Bonsangue, Alfons Laarman

Abstract: Quantum circuit compilation comprises many computationally hard reasoning tasks that nonetheless lie inside #$\mathbf{P}$ and its decision counterpart in $\mathbf{PP}$. The classical simulation of general quantum circuits is a core example. We show for the first time that a strong simulation of universal quantum circuits can be efficiently tackled through weighted model counting by providing a linear encoding of Clifford+T circuits. To achieve this, we exploit the stabilizer formalism by Knill, Gottesman, and Aaronson and the fact that stabilizer states form a basis for density operators. With an open-source simulator implementation, we demonstrate empirically that model counting often outperforms state-of-the-art simulation techniques based on the ZX calculus and decision diagrams. Our work paves the way to apply the existing array of powerful classical reasoning tools to realize efficient quantum circuit compilation, one of the obstacles on the road towards quantum supremacy.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2401.10579 [pdf, other]

Trade-offs between classical and quantum space using spooky pebbling

Authors: Arend-Jan Quist, Alfons Laarman

Abstract: Pebble games are used to study space/time trade-offs. Recently, spooky pebble games were introduced to study classical space / quantum space / time trade-offs for simulation of classical circuits on quantum computers. In this paper, the spooky pebble game framework is applied for the first time to general circuits. Using this framework we prove an upper bound for quantum space in the spooky pebble game. We also prove that solving the spooky pebble game is PSPACE-complete. Moreover, we present a solver for the spooky pebble game based on satisfiability combined with heuristic solvers. This spooky pebble game solver was empirically evaluated by calculating optimal classical space / quantum space / time trade-offs. Within limited runtime, the solver could find a strategy reducing quantum space when classical space is taken into account, showing that the spooky pebble model is useful to reduce quantum space.

Submitted 19 January, 2024; originally announced January 2024.

arXiv:2401.01322 [pdf, ps, other]

A Knowledge Compilation Map for Quantum Information

Authors: Lieuwe Vinkhuijzen, Tim Coopmans, Alfons Laarman

Abstract: Quantum computing is finding promising applications in optimization, machine learning and physics, leading to the development of various models for representing quantum information. Because these representations are often studied in different contexts (many-body physics, machine learning, formal verification, simulation), little is known about fundamental trade-offs between their succinctness and the runtime of operations to update them. We therefore analytically investigate three widely-used quantum state representations: matrix product states (MPS), decision diagrams (DDs), and restricted Boltzmann machines (RBMs). We map the relative succinctness of these data structures and provide the complexity for relevant query and manipulation operations. Further, to chart the balance between succinctness and operation efficiency, we extend the concept of rapidity with support for the non-canonical data structures studied in this work, showing in particular that MPS is at least as rapid as some DDs. By providing a knowledge compilation map for quantum state representations, this paper contributes to the understanding of the inherent time and space efficiency trade-offs in this area.

Submitted 2 January, 2024; originally announced January 2024.

MSC Class: 68P05 ACM Class: E.1; F.2

arXiv:2310.11288 [pdf, ps, other]

Enriching Diagrams with Algebraic Operations

Authors: Alejandro Villoria, Henning Basold, Alfons Laarman

Abstract: In this paper, we extend diagrammatic reasoning in monoidal categories with algebraic operations and equations. We achieve this by considering monoidal categories that are enriched in the category of Eilenberg-Moore algebras for a monad. Under the condition that this monad is monoidal and affine, we construct an adjunction between symmetric monoidal categories and symmetric monoidal categories enriched over algebras for the monad. This allows us to devise an extension, and its semantics, of the ZX-calculus with probabilistic choices by freely enriching over convex algebras, which are the algebras of the finite distribution monad. We show how this construction can be used for diagrammatic reasoning of noise in quantum systems.

Submitted 29 January, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: 19 pages, 8 appendix pages

arXiv:2309.03593 [pdf, other]

Quantum Graph-State Synthesis with SAT

Authors: Sebastiaan Brand, Tim Coopmans, Alfons Laarman

Abstract: In quantum computing and quantum information processing, graph states are a specific type of quantum states which are commonly used in quantum networking and quantum error correction. A recurring problem is finding a transformation from a given source graph state to a desired target graph state using only local operations. Recently it has been shown that deciding transformability is already NP-hard. In this paper, we present a CNF encoding for both local and non-local graph state operations, corresponding to one- and two-qubit Clifford gates and single-qubit Pauli measurements. We use this encoding in a bounded-model-checking set-up to synthesize the desired transformation. For a completeness threshold, we provide an upper bound on the length of the transformation if it exists. We evaluate the approach in two settings: the first is the synthesis of the ubiquitous GHZ state from a random graph state where we can vary the number of qubits, while the second is based on a proposed 14-node quantum network. We find that the approach is able to synthesize transformations for graphs up to 17 qubits in under 30 minutes.

Submitted 7 September, 2023; originally announced September 2023.

arXiv:2308.12162 [pdf, ps, other]

Incremental Property Directed Reachability

Authors: Max Blankestijn, Alfons Laarman

Abstract: Property Directed Reachability (PDR) is a widely used technique for formal verification of hardware and software systems. This paper presents an incremental version of PDR (IPDR), which enables the automatic verification of system instances of incremental complexity. The proposed algorithm leverages the concept of incremental SAT solvers to reuse verification results from previously verified system instances, thereby accelerating the verification process. The new algorithm supports both incremental constraining and relaxing; i.e., starting from an over-constrained instance that is gradually relaxed. To validate the effectiveness of the proposed algorithm, we implemented IPDR and experimentally evaluated it on two different problem domains. First, we consider a circuit pebbling problem, where the number of pebbles is both constrained and relaxed. Second, we explore parallel program instances, progressively increasing the allowed number of interleavings. The experimental results demonstrate significant performance improvements compared to Z3's PDR implementation SPACER. Experiments also show that the incremental approach succeeds in reusing a substantial number of clauses between instances, for both the constraining and relaxing algorithm.

Submitted 23 August, 2023; originally announced August 2023.

ACM Class: B.8.1; I.1.2

arXiv:2308.01206 [pdf, other]

Fast equivalence checking of quantum circuits of Clifford gates

Authors: Dimitrios Thanos, Tim Coopmans, Alfons Laarman

Abstract: Checking whether two quantum circuits are equivalent is important for the design and optimization of quantum-computer applications with real-world devices. We consider quantum circuits consisting of Clifford gates, a practically-relevant subset of all quantum operations which is large enough to exhibit quantum features such as entanglement and forms the basis of, for example, quantum-error correction and many quantum-network applications. We present a deterministic algorithm that is based on a folklore mathematical result and demonstrate that it is capable of outperforming previously considered state-of-the-art methods. In particular, given two Clifford circuits as sequences of single- and two-qubit Clifford gates, the algorithm checks their equivalence in $O(n \cdot m)$ time in the number of qubits $n$ and number of elementary Clifford gates $m$. Using the performant Stim simulator as backend, our implementation checks equivalence of quantum circuits with 1,000 qubits (and a circuit depth of 10,000 gates) in $\sim$22 seconds and circuits with 100,000 qubits (depth 10) in $\sim$15 minutes, outperforming the existing SAT-based and path-integral based approaches by orders of magnitude. This approach shows that the correctness of application-relevant subsets of quantum operations can be verified up to large circuits in practice.

Submitted 2 August, 2023; originally announced August 2023.

arXiv:2212.03684 [pdf, other]

doi 10.1007/978-3-031-27481-7_29

A Decision Diagram Operation for Reachability

Authors: Sebastiaan Brand, Thomas Bäck, Alfons Laarman

Abstract: Saturation is considered the state-of-the-art method for computing fixpoints with decision diagrams. We present a relatively simple decision diagram operation called REACH that also computes fixpoints. In contrast to saturation, it does not require a partitioning of the transition relation. We give sequential algorithms implementing the new operation for both binary and multi-valued decision diagrams, and moreover provide parallel counterparts. We implement these algorithms and experimentally compare their performance against saturation on 692 model checking benchmarks in different languages. The results show that the REACH operation often outperforms saturation, especially on transition relations with low locality. In a comparison between parallelized versions of REACH and saturation we find that REACH obtains comparable speedups up to 16 cores, although it falls behind saturation at 64 cores. Finally, in a comparison with the state-of-the-art model checking tool ITS-tools we find that REACH outperforms ITS-tools on 29% of models, suggesting that REACH can be useful as a complementary method in an ensemble tool.

Submitted 7 December, 2022; originally announced December 2022.

arXiv:2108.00931 [pdf, other]

doi 10.22331/q-2023-09-11-1108

LIMDD: A Decision Diagram for Simulation of Quantum Computing Including Stabilizer States

Authors: Lieuwe Vinkhuijzen, Tim Coopmans, David Elkouss, Vedran Dunjko, Alfons Laarman

Abstract: Efficient methods for the representation and simulation of quantum states and quantum operations are crucial for the optimization of quantum circuits. Decision diagrams (DDs), a well-studied data structure originally used to represent Boolean functions, have proven capable of capturing relevant aspects of quantum systems, but their limits are not well understood. In this work, we investigate and bridge the gap between existing DD-based structures and the stabilizer formalism, an important tool for simulating quantum circuits in the tractable regime. We first show that although DDs were suggested to succinctly represent important quantum states, they actually require exponential space for certain stabilizer states. To remedy this, we introduce a more powerful decision diagram variant, called Local Invertible Map-DD (LIMDD). We prove that the set of quantum states represented by poly-sized LIMDDs strictly contains the union of stabilizer states and other decision diagram variants. Finally, there exist circuits which LIMDDs can efficiently simulate, while their output states cannot be succinctly represented by two state-of-the-art simulation paradigms: the stabilizer decomposition techniques for Clifford + $T$ circuits and Matrix-Product States. By uniting two successful approaches, LIMDDs thus pave the way for fundamentally more powerful solutions for simulation and analysis of quantum computing.

Submitted 6 September, 2023; v1 submitted 2 August, 2021; originally announced August 2021.

ACM Class: E.1

Journal ref: Quantum 7, 1108 (2023)

arXiv:2007.07040 [pdf, other]

doi 10.22331/q-2023-03-23-959

Hybrid divide-and-conquer approach for tree search algorithms

Authors: Mathys Rennela, Sebastiaan Brand, Alfons Laarman, Vedran Dunjko

Abstract: One of the challenges of quantum computers in the near and mid term is the limited number of qubits we can use for computations. Finding methods that achieve useful quantum improvements under size limitations is thus a key question in the field. In this vein, it was recently shown that a hybrid classical-quantum method can help provide polynomial speed-ups to classical divide-and-conquer algorithms, even when only given access to a quantum computer much smaller than the problem itself. In this work, we study the hybrid divide-and-conquer method in the context of tree search algorithms, and extend it by including quantum backtracking, which allows better results than previous Grover-based methods. Further, we provide general criteria for polynomial speed-ups in the tree search context, and provide a number of examples where polynomial speed-ups, using arbitrarily smaller quantum computers, can be obtained. We provide conditions for speed-ups for the well-known DPLL algorithm, and we prove threshold-free speed-ups for the PPSZ algorithm (the core of the fastest exact Boolean satisfiability solver) for well-behaved classes of formulas. We also provide a simple example where speed-ups can be obtained in an algorithm-independent fashion, under certain well-studied complexity-theoretical assumptions. Finally, we briefly discuss the fundamental limitations of hybrid methods in providing speed-ups for larger problems.

Submitted 20 March, 2023; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 48 pages, 16 figures

Journal ref: Quantum 7, 959 (2023)

arXiv:1802.02685 [pdf, ps, other]

Stubborn Transaction Reduction (with Proofs)

Authors: Alfons Laarman

Abstract: The exponential explosion of parallel interleavings remains a fundamental challenge to model checking of concurrent programs. Both partial-order reduction (POR) and transaction reduction (TR) decrease the number of interleavings in a concurrent system. Unlike POR, transactions also reduce the number of intermediate states. Modern POR techniques, on the other hand, offer more dynamic ways of identifying commutative behavior, a crucial task for obtaining good reductions. We show that transaction reduction can use the same dynamic commutativity as found in stubborn set POR. We also compare reductions obtained by POR and TR, demonstrating with several examples that these techniques complement each other. With an implementation of the dynamic transactions in the model checker LTSmin, we compare its effectiveness with the original static TR and two POR approaches. Several inputs, including realistic case studies, demonstrate that the new dynamic TR can surpass POR in practice.

Submitted 7 February, 2018; originally announced February 2018.

Comments: 24 pages

MSC Class: 68Q60

arXiv:1611.09318 [pdf, other]

Dynamic Reductions for Model Checking Concurrent Software

Authors: Henning Günther, Alfons Laarman, Ana Sokolova, Georg Weissenbacher

Abstract: Symbolic model checking of parallel programs stands and falls with effective methods of dealing with the explosion of interleavings. We propose a dynamic reduction technique to avoid unnecessary interleavings. By extending Lipton's original work with a notion of bisimilarity, we accommodate dynamic transactions, and thereby reduce dependence on the accuracy of static analysis, which is a severe bottleneck in other reduction techniques. The combination of symbolic model checking and dynamic reduction techniques has proven to be challenging in the past. Our generic reduction theorem nonetheless enables us to derive an efficient symbolic encoding, which we implemented for IC3 and BMC. The experiments demonstrate the power of dynamic reduction on several case studies and a large set of SVCOMP benchmarks.

Submitted 28 November, 2016; originally announced November 2016.

Comments: 38 pages

arXiv:1111.0369 [pdf, other]

doi 10.4204/EPTCS.72.2

Variations on Multi-Core Nested Depth-First Search

Authors: Alfons Laarman, Jaco van de Pol

Abstract: Recently, two new parallel algorithms for on-the-fly model checking of LTL properties were presented at the same conference: Automated Technology for Verification and Analysis, 2011. Both approaches extend Swarmed NDFS, which runs several sequential NDFS instances in parallel. While parallel random search already speeds up detection of bugs, the workers must share some global information in order to speed up full verification of correct models. The two algorithms differ considerably in the global information shared between workers, and in the way they synchronize. Here, we provide a thorough experimental comparison between the two algorithms, by measuring the runtime of their implementations on a multi-core machine. Both algorithms were implemented in the same framework of the model checker LTSmin, using similar optimizations, and have been subjected to the full BEEM model database. Because both algorithms have complementary advantages, we constructed an algorithm that combines both ideas. This combination clearly has an improved speedup. We also compare the results with the alternative parallel algorithm for accepting cycle detection OWCTY-MAP. Finally, we study a simple statistical model for input models that do contain accepting cycles. The goal is to distinguish the speedup due to parallel random search from the speedup that can be attributed to clever work sharing schemes.

Submitted 1 November, 2011; originally announced November 2011.

Comments: In Proceedings PDMC 2011, arXiv:1111.0064

Journal ref: EPTCS 72, 2011, pp. 13-28
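
Note on the entry above: the abstract builds on sequential nested depth-first search (NDFS) for accepting cycle detection. As a reference point, a minimal sketch of the classic sequential algorithm is given below. It is not either of the parallel variants compared in the paper, and the function name and the `successors` / `accepting` callbacks are hypothetical.

```python
def ndfs_has_accepting_cycle(initial, successors, accepting):
    """Classic sequential NDFS sketch: True iff an accepting cycle is reachable.

    successors(s) returns an iterable of next states; accepting(s) returns a
    bool. Recursive for brevity, so only suitable for small graphs.
    """
    blue, red = set(), set()

    def dfs_red(s, seed):
        # Inner search: look for a path from the accepting seed back to itself.
        red.add(s)
        for t in successors(s):
            if t == seed:
                return True
            if t not in red and dfs_red(t, seed):
                return True
        return False

    def dfs_blue(s):
        # Outer search: explore all states, start the inner search in postorder.
        blue.add(s)
        for t in successors(s):
            if t not in blue and dfs_blue(t):
                return True
        return accepting(s) and dfs_red(s, s)

    return dfs_blue(initial)
```

The red set is shared across all inner searches; this is sound in the sequential setting because inner searches start in postorder of the outer search, and it is that ordering which a multi-core version cannot take for granted.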

arXiv:1104.3119 [pdf, other]

Parallel Recursive State Compression for Free

Authors: Alfons Laarman, Jaco van de Pol, Michael Weber

Abstract: This paper focuses on reducing memory usage in enumerative model checking, while maintaining the multi-core scalability obtained in earlier work. We present a tree-based multi-core compression method, which works by leveraging sharing among sub-vectors of state vectors. An algorithmic analysis of both worst-case and optimal compression ratios shows the potential to compress even large states to a small constant on average (8 bytes). Our experiments demonstrate that this holds up in practice: the median compression ratio of 279 measured experiments is within 17% of the optimum for tree compression, and five times better than the median compression ratio of SPIN's COLLAPSE compression. Our algorithms are implemented in the LTSmin tool, and our experiments show that for model checking, multi-core tree compression pays its own way: it comes virtually without overhead compared to the fastest hash table-based methods.

Submitted 14 May, 2011; v1 submitted 15 April, 2011; originally announced April 2011.

Comments: 19 pages
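
Note on the entry above: the phrase "sharing among sub-vectors of state vectors" can be illustrated with a small sequential sketch. It assumes fixed-length state vectors of integers that are split recursively into halves, with every pair stored once in a table and referred to by index. The class and method names are hypothetical, and the sketch omits the multi-core aspects that the paper is about.

```python
class TreeCompressor:
    """Toy sequential tree compression for fixed-length integer state vectors."""

    def __init__(self):
        self.index_of = {}   # (left, right) pair -> index
        self.entries = []    # index -> (left, right) pair

    def _intern(self, pair):
        # Store each distinct pair once and return its index.
        idx = self.index_of.get(pair)
        if idx is None:
            idx = len(self.entries)
            self.index_of[pair] = idx
            self.entries.append(pair)
        return idx

    def compress(self, vec):
        """Return a single integer standing for the whole vector."""
        vec = tuple(vec)
        if len(vec) == 1:
            return vec[0]
        mid = len(vec) // 2
        return self._intern((self.compress(vec[:mid]), self.compress(vec[mid:])))

    def decompress(self, root, length):
        """Rebuild a vector of the given length from its root integer."""
        if length == 1:
            return (root,)
        left, right = self.entries[root]
        mid = length // 2
        return self.decompress(left, mid) + self.decompress(right, length - mid)
```

In this sketch, two vectors that differ in one slot share every sub-vector pair not on the path to that slot, so the second vector adds only a logarithmic number of new table entries.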

arXiv:1004.2772 [pdf, other]

Boosting Multi-Core Reachability Performance with Shared Hash Tables

Authors: Alfons Laarman, Jaco van de Pol, Michael Weber

Abstract: This paper focuses on data structures for multi-core reachability, which is a key component in model checking algorithms and other verification methods. A cornerstone of an efficient solution is the storage of visited states. In related work, static partitioning of the state space was combined with thread-local storage and resulted in reasonable speedups, but left open whether improvements are possible. In this paper, we present a scaling solution for shared state storage which is based on a lockless hash table implementation. The solution is specifically designed for the cache architecture of modern CPUs. Because model checking algorithms impose loose requirements on the hash table operations, their design can be streamlined substantially compared to related work on lockless hash tables. Still, an implementation of the hash table presented here has dozens of sensitive performance parameters (bucket size, cache line size, data layout, probing sequence, etc.). We analyzed their impact and compared the resulting speedups with related tools. Our implementation outperforms two state-of-the-art multi-core model checkers (SPIN and DiVinE) by a substantial margin, while placing fewer constraints on the load balancing and search algorithms.

Submitted 4 May, 2010; v1 submitted 16 April, 2010; originally announced April 2010.

Comments: preliminary report

Showing 1–17 of 17 results for author: Laarman, A