-
A Pattern Language for Machine Learning Tasks
Authors:
Benjamin Rodatz,
Ian Fan,
Tuomas Laakkonen,
Neil John Ortega,
Thomas Hoffman,
Vincent Wang-Mascianica
Abstract:
Idealised as universal approximators, learners such as neural networks can be viewed as "variable functions" that may become one of a range of concrete functions after training. In the same way that equations constrain the possible values of variables in algebra, we may view objective functions as constraints on the behaviour of learners. We extract the equivalences perfectly optimised objective functions impose, calling them "tasks". For these tasks, we develop a formal graphical language that allows us to: (1) separate the core tasks of a behaviour from its implementation details; (2) reason about and design behaviours model-agnostically; and (3) simply describe and unify approaches in machine learning across domains.
As proof-of-concept, we design a novel task that enables converting classifiers into generative models we call "manipulators", which we implement by directly translating task specifications into code. The resulting models exhibit capabilities such as style transfer and interpretable latent-space editing, without the need for custom architectures, adversarial training or random sampling. We formally relate the behaviour of manipulators to GANs, and empirically demonstrate their competitive performance with VAEs. We report on experiments across vision and language domains aiming to characterise manipulators as approximate Bayesian inversions of discriminative classifiers.
Submitted 2 July, 2024;
originally announced July 2024.
-
On the Anatomy of Attention
Authors:
Nikhil Khatri,
Tuomas Laakkonen,
Jonathon Liu,
Vincent Wang-Maścianica
Abstract:
We introduce a category-theoretic diagrammatic formalism in order to systematically relate and reason about machine learning models. Our diagrams present architectures intuitively but without loss of essential detail, where natural relationships between models are captured by graphical transformations, and important differences and similarities can be identified at a glance. In this paper, we focus on attention mechanisms: translating folklore into mathematical derivations, and constructing a taxonomy of attention variants in the literature. As a first example of an empirical investigation underpinned by our formalism, we identify recurring anatomical components of attention, which we exhaustively recombine to explore a space of variations on the attention mechanism.
Submitted 2 July, 2024;
originally announced July 2024.
-
Quantum Circuit Optimization with AlphaTensor
Authors:
Francisco J. R. Ruiz,
Tuomas Laakkonen,
Johannes Bausch,
Matej Balog,
Mohammadamin Barekatain,
Francisco J. H. Heras,
Alexander Novikov,
Nathan Fitzpatrick,
Bernardino Romera-Paredes,
John van de Wetering,
Alhussein Fawzi,
Konstantinos Meichanetzidis,
Pushmeet Kohli
Abstract:
A key challenge in realizing fault-tolerant quantum computers is circuit optimization. Focusing on the most expensive gates in fault-tolerant quantum computation (namely, the T gates), we address the problem of T-count optimization, i.e., minimizing the number of T gates that are needed to implement a given circuit. To achieve this, we develop AlphaTensor-Quantum, a method based on deep reinforcement learning that exploits the relationship between optimizing T-count and tensor decomposition. Unlike existing methods for T-count optimization, AlphaTensor-Quantum can incorporate domain-specific knowledge about quantum computation and leverage gadgets, which significantly reduces the T-count of the optimized circuits. AlphaTensor-Quantum outperforms the existing methods for T-count optimization on a set of arithmetic benchmarks (even when compared without making use of gadgets). Remarkably, it discovers an efficient algorithm akin to Karatsuba's method for multiplication in finite fields. AlphaTensor-Quantum also finds the best human-designed solutions for relevant arithmetic computations used in Shor's algorithm and for quantum chemistry simulation, thus demonstrating it can save hundreds of hours of research by optimizing relevant quantum circuits in a fully automated way.
Submitted 5 March, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Picturing Counting Reductions with the ZH-Calculus
Authors:
Tuomas Laakkonen,
Konstantinos Meichanetzidis,
John van de Wetering
Abstract:
Counting the solutions to Boolean formulae defines the problem #SAT, which is complete for the complexity class #P. We use the ZH-calculus, a universal and complete graphical language for linear maps which naturally encodes counting problems in terms of diagrams, to give graphical reductions from #SAT to several related counting problems. Some of these graphical reductions, like to #2SAT, are substantially simpler than known reductions via the matrix permanent. Additionally, our approach allows us to consider the case of counting solutions modulo an integer on equal footing. Finally, since the ZH-calculus was originally introduced to reason about quantum computing, we show that the problem of evaluating ZH-diagrams in the fragment corresponding to the Clifford+T gateset is in FP^#P. Our results show that graphical calculi represent an intuitive and useful framework for reasoning about counting problems.
Submitted 31 August, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
A Graphical #SAT Algorithm for Formulae with Small Clause Density
Authors:
Tuomas Laakkonen,
Konstantinos Meichanetzidis,
John van de Wetering
Abstract:
We study the counting version of the Boolean satisfiability problem #SAT using the ZH-calculus, a graphical language originally introduced to reason about quantum circuits. Using this we find a natural extension of #SAT which we call $\#SAT_\pm$, where variables are additionally labeled by phases, which is GapP-complete. Using graphical reasoning, we find a reduction from #SAT to $\#2SAT_\pm$ in the ZH-calculus. We observe that the DPLL algorithm for #2SAT can be adapted to $\#2SAT_\pm$ directly and hence that Wahlström's $O^*(1.2377^n)$ upper bound applies to $\#2SAT_\pm$ as well. Combining this with our reduction from #SAT to $\#2SAT_\pm$ gives us novel upper bounds in terms of clauses and variables that are better than $O^*(2^n)$ for small clause densities of $\frac{m}{n} < 2.25$. This is to our knowledge the first non-trivial upper bound for #SAT that is independent of clause size. Our algorithm improves on Dubois' upper bound for $\#kSAT$ whenever $\frac{m}{n} < 1.85$ and $k \geq 4$, and on Williams' average-case analysis whenever $\frac{m}{n} < 1.21$ and $k \geq 6$. We also obtain an unconditional upper bound of $O^*(1.88^m)$ for $\#4SAT$ in terms of clauses only, and find an improved bound on $\#3SAT$ for $1.2577 < \frac{m}{n} \leq \frac{7}{3}$. Our results demonstrate that graphical reasoning can lead to new algorithmic insights, even outside the domain of quantum computing that the calculus was intended for.
Submitted 15 December, 2022;
originally announced December 2022.