Search | arXiv e-print repository

Graph2Tac: Online Representation Learning of Formal Math Concepts

Authors: Lasse Blaauwbroek, Miroslav Olšák, Jason Rute, Fidel Ivan Schaposnik Massolo, Jelle Piepenbrock, Vasily Pestun

Abstract: In proof assistants, the physical proximity between two formal mathematical concepts is a strong predictor of their mutual relevance. Furthermore, lemmas with close proximity regularly exhibit similar proof structures. We show that this locality property can be exploited through online learning techniques to obtain solving agents that far surpass offline learners when asked to prove theorems in an unseen mathematical setting. We extensively benchmark two such online solvers implemented in the Tactician platform for the Coq proof assistant: First, Tactician's online $k$-nearest neighbor solver, which can learn from recent proofs, shows a $1.72\times$ improvement in theorems proved over an offline equivalent. Second, we introduce a graph neural network, Graph2Tac, with a novel approach to build hierarchical representations for new definitions. Graph2Tac's online definition task realizes a $1.5\times$ improvement in theorems solved over an offline baseline. The $k$-NN and Graph2Tac solvers rely on orthogonal online data, making them highly complementary. Their combination improves $1.27\times$ over their individual performances. Both solvers outperform all other general-purpose provers for Coq, including CoqHammer, Proverbot9001, and a transformer baseline by at least $1.48\times$ and are available for practical use by end-users.

Submitted 23 June, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

Comments: 31 pages

MSC Class: 68T07 (Primary) 68V15 (Secondary) ACM Class: I.2.3; I.2.6

arXiv:2102.06203

Proof Artifact Co-training for Theorem Proving with Language Models

Authors: Jesse Michael Han, Jason Rute, Yuhuai Wu, Edward W. Ayers, Stanislas Polu

Abstract: Labeled data for imitation learning of theorem proving in large libraries of formalized mathematics is scarce as such libraries require years of concentrated effort by human specialists to be built. This is particularly challenging when applying large Transformer language models to tactic prediction, because the scaling of performance with respect to model size is quickly disrupted in the data-scarce, easily-overfitted regime. We propose PACT (Proof Artifact Co-Training), a general methodology for extracting abundant self-supervised data from kernel-level proof terms for co-training alongside the usual tactic prediction objective. We apply this methodology to Lean, an interactive proof assistant which hosts some of the most sophisticated formalized mathematics to date. We instrument Lean with a neural theorem prover driven by a Transformer language model and show that PACT improves theorem proving success rate on a held-out suite of test theorems from 32% to 48%.

Submitted 15 March, 2022; v1 submitted 11 February, 2021; originally announced February 2021.

arXiv:1812.03375

On the close interaction between algorithmic randomness and constructive/computable measure theory

Authors: Jason Rute

Abstract: This is a survey of constructive and computable measure theory with an emphasis on the close connections with algorithmic randomness. We give a brief history of constructive measure theory from Brouwer to the present, emphasizing how Schnorr randomness is the randomness notion implicit in the work of Brouwer, Bishop, Demuth, and others. We survey a number of recent results showing that classical almost everywhere convergence theorems can be used to characterize many of the common randomness notions including Schnorr randomness, computable randomness, and Martin-Löf randomness. Last, we go into more detail about computable measure theory, showing how all the major approaches are basically equivalent (even though the definitions can vary greatly).

Submitted 16 March, 2019; v1 submitted 8 December, 2018; originally announced December 2018.

MSC Class: 03D32; 03F60

arXiv:1801.10387

On the computability of graphons

Authors: Nathanael L. Ackerman, Jeremy Avigad, Cameron E. Freer, Daniel M. Roy, Jason M. Rute

Abstract: We investigate the relative computability of exchangeable binary relational data when presented in terms of the distribution of an invariant measure on graphs, or as a graphon in either $L^1$ or the cut distance. We establish basic computable equivalences, and show that $L^1$ representations contain fundamentally more computable information than the other representations, but that $0'$ suffices to move between computable such representations. We show that $0'$ is necessary in general, but that in the case of random-free graphons, no oracle is necessary. We also provide an example of an $L^1$-computable random-free graphon that is not weakly isomorphic to any graphon with an a.e. continuous version.

Submitted 31 January, 2018; originally announced January 2018.

Comments: 24 pages, 1 figure

arXiv:1509.00524

Energy randomness

Authors: Joseph S. Miller, Jason Rute

Abstract: Energy randomness is a notion of partial randomness introduced by Diamondstone and Kjos-Hanssen to characterize the sequences that can be elements of a Martin-Löf random closed set (in the sense of Barmpalias, Brodhead, Cenzer, Dashti, and Weber). It has also been applied by Allen, Bienvenu, and Slaman to the characterization of the possible zero times of a Martin-Löf random Brownian motion. In this paper, we show that $X \in 2^\omega$ is $s$-energy random if and only if $\sum_{n\in\omega} 2^{sn - KM(X\upharpoonright n)} < \infty$, providing a characterization of energy randomness via a priori complexity $KM$. This is related to a question of Allen, Bienvenu, and Slaman.

Submitted 1 September, 2015; originally announced September 2015.

MSC Class: 03D32 (Primary); 68Q30; 31C15 (Secondary)

arXiv:1501.02155

A formal proof of the Kepler conjecture

Authors: Thomas Hales, Mark Adams, Gertrud Bauer, Dat Tat Dang, John Harrison, Truong Le Hoang, Cezary Kaliszyk, Victor Magron, Sean McLaughlin, Thang Tat Nguyen, Truong Quang Nguyen, Tobias Nipkow, Steven Obua, Joseph Pleso, Jason Rute, Alexey Solovyev, An Hoai Thi Ta, Trung Nam Tran, Diep Thi Trieu, Josef Urban, Ky Khac Vu, Roland Zumkeller

Abstract: This article describes a formal proof of the Kepler conjecture on dense sphere packings in a combination of the HOL Light and Isabelle proof assistants. This paper constitutes the official published account of the now completed Flyspeck project.

Submitted 9 January, 2015; originally announced January 2015.

Comments: 21 pages

arXiv:1411.0186

doi 10.2168/LMCS-10(4:12)2014

Algorithmic randomness for Doob's martingale convergence theorem in continuous time

Authors: Bjørn Kjos-Hanssen, Paul Kim Long V. Nguyen, Jason Rute

Abstract: We study Doob's martingale convergence theorem for computable continuous time martingales on Brownian motion, in the context of algorithmic randomness. A characterization of the class of sample points for which the theorem holds is given. Such points are given the name of Doob random points. It is shown that a point is Doob random if its tail is computably random in a certain sense. Moreover, Doob randomness is strictly weaker than computable randomness and is incomparable with Schnorr randomness.

Submitted 16 December, 2014; v1 submitted 1 November, 2014; originally announced November 2014.

Journal ref: Logical Methods in Computer Science, Volume 10, Issue 4 (December 18, 2014) lmcs:978

Showing 1–7 of 7 results for author: Rute, J