-
Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation
Authors:
Shamane Siriwardhana,
Mark McQuade,
Thomas Gauthier,
Lucas Atkins,
Fernando Fernandes Neto,
Luke Meyers,
Anneketh Vij,
Tyler Odenthal,
Charles Goddard,
Mary MacCarthy,
Jacob Solawetz
Abstract:
We conducted extensive experiments on domain adaptation of the Meta-Llama-3-70B-Instruct model on SEC data, exploring its performance on both general and domain-specific benchmarks. Our focus included continual pre-training (CPT) and model merging, aiming to enhance the model's domain-specific capabilities while mitigating catastrophic forgetting. Through this study, we evaluated the impact of int…
▽ More
We conducted extensive experiments on domain adaptation of the Meta-Llama-3-70B-Instruct model on SEC data, exploring its performance on both general and domain-specific benchmarks. Our focus included continual pre-training (CPT) and model merging, aiming to enhance the model's domain-specific capabilities while mitigating catastrophic forgetting. Through this study, we evaluated the impact of integrating financial regulatory data into a robust language model and examined the effectiveness of our model merging techniques in preserving and improving the model's instructive abilities. The model is accessible at hugging face: https://huggingface.co/arcee-ai/Llama-3-SEC-Base, arcee-ai/Llama-3-SEC-Base. This is an intermediate checkpoint of our final model, which has seen 20B tokens so far. The full model is still in the process of training. This is a preprint technical report with thorough evaluations to understand the entire process.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
A Formal Proof of R(4,5)=25
Authors:
Thibault Gauthier,
Chad E. Brown
Abstract:
In 1995, McKay and Radziszowski proved that the Ramsey number R(4,5) is equal to 25. Their proof relies on a combination of high-level arguments and computational steps. The authors have performed the computational parts of the proof with different implementations in order to reduce the possibility of an error in their programs. In this work, we prove this theorem in the interactive theorem prover…
▽ More
In 1995, McKay and Radziszowski proved that the Ramsey number R(4,5) is equal to 25. Their proof relies on a combination of high-level arguments and computational steps. The authors have performed the computational parts of the proof with different implementations in order to reduce the possibility of an error in their programs. In this work, we prove this theorem in the interactive theorem prover HOL4 limiting the uncertainty to the small HOL4 kernel. Instead of verifying their algorithms directly, we rely on the HOL4 interface to MiniSat SAT to prove gluing lemmas. To reduce the number of such lemmas and thus make the computational part of the proof feasible, we implement a generalization algorithm. We verify that its output covers all the possible cases by implementing a custom SAT-solver extended with a graph isomorphism checker.
△ Less
Submitted 12 June, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Learning Guided Automated Reasoning: A Brief Survey
Authors:
Lasse Blaauwbroek,
David Cerna,
Thibault Gauthier,
Jan Jakubův,
Cezary Kaliszyk,
Martin Suda,
Josef Urban
Abstract:
Automated theorem provers and formal proof assistants are general reasoning systems that are in theory capable of proving arbitrarily hard theorems, thus solving arbitrary problems reducible to mathematics and logical reasoning. In practice, such systems however face large combinatorial explosion, and therefore include many heuristics and choice points that considerably influence their performance…
▽ More
Automated theorem provers and formal proof assistants are general reasoning systems that are in theory capable of proving arbitrarily hard theorems, thus solving arbitrary problems reducible to mathematics and logical reasoning. In practice, such systems however face large combinatorial explosion, and therefore include many heuristics and choice points that considerably influence their performance. This is an opportunity for trained machine learning predictors, which can guide the work of such reasoning systems. Conversely, deductive search supported by the notion of logically valid proof allows one to train machine learning systems on large reasoning corpora. Such bodies of proof are usually correct by construction and when combined with more and more precise trained guidance they can be boostrapped into very large corpora, with increasingly long reasoning chains and possibly novel proof ideas. In this paper we provide an overview of several automated reasoning and theorem proving domains and the learning and AI methods that have been so far developed for them. These include premise selection, proof guidance in several settings, AI systems and feedback loops iterating between reasoning and learning, and symbolic classification problems.
△ Less
Submitted 6 March, 2024;
originally announced March 2024.
-
A Mathematical Benchmark for Inductive Theorem Provers
Authors:
Thibault Gauthier,
Chad E. Brown,
Mikolas Janota,
Josef Urban
Abstract:
We present a benchmark of 29687 problems derived from the On-Line Encyclopedia of Integer Sequences (OEIS). Each problem expresses the equivalence of two syntactically different programs generating the same OEIS sequence. Such programs were conjectured by a learning-guided synthesis system using a language with loo** operators. The operators implement recursion, and thus many of the proofs requi…
▽ More
We present a benchmark of 29687 problems derived from the On-Line Encyclopedia of Integer Sequences (OEIS). Each problem expresses the equivalence of two syntactically different programs generating the same OEIS sequence. Such programs were conjectured by a learning-guided synthesis system using a language with loo** operators. The operators implement recursion, and thus many of the proofs require induction on natural numbers. The benchmark contains problems of varying difficulty from a wide area of mathematical domains. We believe that these characteristics will make it an effective judge for the progress of inductive theorem provers in this domain for years to come.
△ Less
Submitted 6 April, 2023;
originally announced April 2023.
-
Alien Coding
Authors:
Thibault Gauthier,
Miroslav Olšák,
Josef Urban
Abstract:
We introduce a self-learning algorithm for synthesizing programs for OEIS sequences. The algorithm starts from scratch initially generating programs at random. Then it runs many iterations of a self-learning loop that interleaves (i) training neural machine translation to learn the correspondence between sequences and the programs discovered so far, and (ii) proposing many new programs for each OE…
▽ More
We introduce a self-learning algorithm for synthesizing programs for OEIS sequences. The algorithm starts from scratch initially generating programs at random. Then it runs many iterations of a self-learning loop that interleaves (i) training neural machine translation to learn the correspondence between sequences and the programs discovered so far, and (ii) proposing many new programs for each OEIS sequence by the trained neural machine translator. The algorithm discovers on its own programs for more than 78000 OEIS sequences, sometimes develo** unusual programming methods. We analyze its behavior and the invented programs in several experiments.
△ Less
Submitted 30 August, 2023; v1 submitted 26 January, 2023;
originally announced January 2023.
-
Self-Supervised 3D Monocular Object Detection by Recycling Bounding Boxes
Authors:
Sugirtha T,
Sridevi M,
Khailash Santhakumar,
Hao Liu,
B Ravi Kiran,
Thomas Gauthier,
Senthil Yogamani
Abstract:
Modern object detection architectures are moving towards employing self-supervised learning (SSL) to improve performance detection with related pretext tasks. Pretext tasks for monocular 3D object detection have not yet been explored yet in literature. The paper studies the application of established self-supervised bounding box recycling by labeling random windows as the pretext task. The classif…
▽ More
Modern object detection architectures are moving towards employing self-supervised learning (SSL) to improve performance detection with related pretext tasks. Pretext tasks for monocular 3D object detection have not yet been explored yet in literature. The paper studies the application of established self-supervised bounding box recycling by labeling random windows as the pretext task. The classifier head of the 3D detector is trained to classify random windows containing different proportions of the ground truth objects, thus handling the foreground-background imbalance. We evaluate the pretext task using the RTM3D detection model as baseline, with and without the application of data augmentation. We demonstrate improvements of between 2-3 % in mAP 3D and 0.9-1.5 % BEV scores using SSL over the baseline scores. We propose the inverse class frequency re-weighted (ICFW) mAP score that highlights improvements in detection for low frequency classes in a class imbalanced dataset with long tails. We demonstrate improvements in ICFW both mAP 3D and BEV scores to take into account the class imbalance in the KITTI validation dataset. We see 4-5 % increase in ICFW metric with the pretext task.
△ Less
Submitted 25 June, 2022;
originally announced June 2022.
-
Topos: A Secure, Trustless, and Decentralized Interoperability Protocol
Authors:
Théo Gauthier,
Sébastien Dan,
Monir Hadji,
Antonella Del Pozzo,
Yackolley Amoussou-Guenou
Abstract:
Topos is an open interoperability protocol designed to reduce as much as possible trust assumptions by replacing them with cryptographic constructions and decentralization while exhibiting massive scalability. The protocol does not make use of a central blockchain, nor uses consensus to ensure consistent delivery of messages across a heterogeneous ecosystem of public and private blockchains, named…
▽ More
Topos is an open interoperability protocol designed to reduce as much as possible trust assumptions by replacing them with cryptographic constructions and decentralization while exhibiting massive scalability. The protocol does not make use of a central blockchain, nor uses consensus to ensure consistent delivery of messages across a heterogeneous ecosystem of public and private blockchains, named subnets, but instead relies on a weak causal reliable broadcast implemented by a distributed network which we call $\textit{Transmission Control Engine}$ (TCE). The validity of cross-subnet messages is ensured by the $\textit{Universal Certificate Interface}$ (UCI) and stems from zkSTARK proofs asserting the validity of subnets' state transitions executed by the Topos zkVM. Such proofs of computational integrity are publicly verifiable by any other participants in and out the protocol such as other subnets or audit companies. The interface between the TCE and subnets leverages the ICE-FROST protocol, an innovative threshold signature scheme, whose static public key allows for uniquely identifying subnets after they register in the protocol. The Topos protocol is designed to provide $\textit{uniform security}$ to the ecosystem and to handle any type of subnets (e.g., permissioned, permissionless) in order to fit any business use cases and pave the way for global adoption and a new standard for the Internet base layer.
△ Less
Submitted 8 February, 2023; v1 submitted 7 June, 2022;
originally announced June 2022.
-
Learning Program Synthesis for Integer Sequences from Scratch
Authors:
Thibault Gauthier,
Josef Urban
Abstract:
We present a self-learning approach for synthesizing programs from integer sequences. Our method relies on a tree search guided by a learned policy. Our system is tested on the On-Line Encyclopedia of Integer Sequences. There, it discovers, on its own, solutions for 27987 sequences starting from basic operators and without human-written training examples.
We present a self-learning approach for synthesizing programs from integer sequences. Our method relies on a tree search guided by a learned policy. Our system is tested on the On-Line Encyclopedia of Integer Sequences. There, it discovers, on its own, solutions for 27987 sequences starting from basic operators and without human-written training examples.
△ Less
Submitted 29 November, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
Simulation-to-Reality domain adaptation for offline 3D object annotation on pointclouds with correlation alignment
Authors:
Weishuang Zhang,
B Ravi Kiran,
Thomas Gauthier,
Yanis Mazouz,
Theo Steger
Abstract:
Annotating objects with 3D bounding boxes in LiDAR pointclouds is a costly human driven process in an autonomous driving perception system. In this paper, we present a method to semi-automatically annotate real-world pointclouds collected by deployment vehicles using simulated data. We train a 3D object detector model on labeled simulated data from CARLA jointly with real world pointclouds from ou…
▽ More
Annotating objects with 3D bounding boxes in LiDAR pointclouds is a costly human driven process in an autonomous driving perception system. In this paper, we present a method to semi-automatically annotate real-world pointclouds collected by deployment vehicles using simulated data. We train a 3D object detector model on labeled simulated data from CARLA jointly with real world pointclouds from our target vehicle. The supervised object detection loss is augmented with a CORAL loss term to reduce the distance between labeled simulated and unlabeled real pointcloud feature representations. The goal here is to learn representations that are invariant to simulated (labeled) and real-world (unlabeled) target domains. We also provide an updated survey on domain adaptation methods for pointclouds.
△ Less
Submitted 26 February, 2022; v1 submitted 5 February, 2022;
originally announced February 2022.
-
Learned Provability Likelihood for Tactical Search
Authors:
Thibault Gauthier
Abstract:
We present a method to estimate the provability of a mathematical formula. We adapt the tactical theorem prover TacticToe to factor in these estimations. Experiments over the HOL4 library show an increase in the number of theorems re-proven by TacticToe thanks to this additional guidance. This amelioration in performance together with concurrent updates to the TacticToe framework lead to an improv…
▽ More
We present a method to estimate the provability of a mathematical formula. We adapt the tactical theorem prover TacticToe to factor in these estimations. Experiments over the HOL4 library show an increase in the number of theorems re-proven by TacticToe thanks to this additional guidance. This amelioration in performance together with concurrent updates to the TacticToe framework lead to an improved user experience.
△ Less
Submitted 6 September, 2021;
originally announced September 2021.
-
Exploring 2D Data Augmentation for 3D Monocular Object Detection
Authors:
Sugirtha T,
Sridevi M,
Khailash Santhakumar,
B Ravi Kiran,
Thomas Gauthier,
Senthil Yogamani
Abstract:
Data augmentation is a key component of CNN based image recognition tasks like object detection. However, it is relatively less explored for 3D object detection. Many standard 2D object detection data augmentation techniques do not extend to 3D box. Extension of these data augmentations for 3D object detection requires adaptation of the 3D geometry of the input scene and synthesis of new viewpoint…
▽ More
Data augmentation is a key component of CNN based image recognition tasks like object detection. However, it is relatively less explored for 3D object detection. Many standard 2D object detection data augmentation techniques do not extend to 3D box. Extension of these data augmentations for 3D object detection requires adaptation of the 3D geometry of the input scene and synthesis of new viewpoints. This requires accurate depth information of the scene which may not be always available. In this paper, we evaluate existing 2D data augmentations and propose two novel augmentations for monocular 3D detection without a requirement for novel view synthesis. We evaluate these augmentations on the RTM3D detection model firstly due to the shorter training times . We obtain a consistent improvement by 4% in the 3D AP (@IoU=0.7) for cars, ~1.8% scores 3D AP (@IoU=0.25) for pedestrians & cyclists, over the baseline on KITTI car detection dataset. We also demonstrate a rigorous evaluation of the mAP scores by re-weighting them to take into account the class imbalance in the KITTI validation dataset.
△ Less
Submitted 21 April, 2021;
originally announced April 2021.
-
Tree Neural Networks in HOL4
Authors:
Thibault Gauthier
Abstract:
We present an implementation of tree neural networks within the proof assistant HOL4. Their architecture makes them naturally suited for approximating functions whose domain is a set of formulas. We measure the performance of our implementation and compare it with other machine learning predictors on the tasks of evaluating arithmetical expressions and estimating the truth of propositional formula…
▽ More
We present an implementation of tree neural networks within the proof assistant HOL4. Their architecture makes them naturally suited for approximating functions whose domain is a set of formulas. We measure the performance of our implementation and compare it with other machine learning predictors on the tasks of evaluating arithmetical expressions and estimating the truth of propositional formulas.
△ Less
Submitted 3 September, 2020;
originally announced September 2020.
-
Self-Learned Formula Synthesis in Set Theory
Authors:
Chad E. Brown,
Thibault Gauthier
Abstract:
A reinforcement learning algorithm accomplishes the task of synthesizing a set-theoretical formula that evaluates to given truth values for given assignments.
A reinforcement learning algorithm accomplishes the task of synthesizing a set-theoretical formula that evaluates to given truth values for given assignments.
△ Less
Submitted 3 December, 2019;
originally announced December 2019.
-
Deep Reinforcement Learning for Synthesizing Functions in Higher-Order Logic
Authors:
Thibault Gauthier
Abstract:
The paper describes a deep reinforcement learning framework based on self-supervised learning within the proof assistant HOL4. A close interaction between the machine learning modules and the HOL4 library is achieved by the choice of tree neural networks (TNNs) as machine learning models and the internal use of HOL4 terms to represent tree structures of TNNs. Recursive improvement is possible when…
▽ More
The paper describes a deep reinforcement learning framework based on self-supervised learning within the proof assistant HOL4. A close interaction between the machine learning modules and the HOL4 library is achieved by the choice of tree neural networks (TNNs) as machine learning models and the internal use of HOL4 terms to represent tree structures of TNNs. Recursive improvement is possible when a task is expressed as a search problem. In this case, a Monte Carlo Tree Search (MCTS) algorithm guided by a TNN can be used to explore the search space and produce better examples for training the next TNN. As an illustration, term synthesis tasks on combinators and Diophantine equations are specified and learned. We achieve a success rate of 65% on combinator synthesis problems outperforming state-of-the-art ATPs run with their best general set of strategies. We set a precedent for statistically guided synthesis of Diophantine equations by solving 78.5% of the generated test problems.
△ Less
Submitted 24 April, 2020; v1 submitted 25 October, 2019;
originally announced October 2019.
-
GRUNGE: A Grand Unified ATP Challenge
Authors:
Chad E. Brown,
Thibault Gauthier,
Cezary Kaliszyk,
Geoff Sutcliffe,
Josef Urban
Abstract:
This paper describes a large set of related theorem proving problems obtained by translating theorems from the HOL4 standard library into multiple logical formalisms. The formalisms are in higher-order logic (with and without type variables) and first-order logic (possibly with multiple types, and possibly with type variables). The resultant problem sets allow us to run automated theorem provers t…
▽ More
This paper describes a large set of related theorem proving problems obtained by translating theorems from the HOL4 standard library into multiple logical formalisms. The formalisms are in higher-order logic (with and without type variables) and first-order logic (possibly with multiple types, and possibly with type variables). The resultant problem sets allow us to run automated theorem provers that support different logical formats on corresponding problems, and compare their performances. This also results in a new "grand unified" large theory benchmark that emulates the ITP/ATP hammer setting, where systems and metasystems can use multiple ATP formalisms in complementary ways, and jointly learn from the accumulated knowledge.
△ Less
Submitted 19 November, 2019; v1 submitted 6 March, 2019;
originally announced March 2019.
-
TacticToe: Learning to Prove with Tactics
Authors:
Thibault Gauthier,
Cezary Kaliszyk,
Josef Urban,
Ramana Kumar,
Michael Norrish
Abstract:
We implement a automated tactical prover TacticToe on top of the HOL4 interactive theorem prover. TacticToe learns from human proofs which mathematical technique is suitable in each proof situation. This knowledge is then used in a Monte Carlo tree search algorithm to explore promising tactic-level proof paths. On a single CPU, with a time limit of 60 seconds, TacticToe proves 66.4 percent of the…
▽ More
We implement a automated tactical prover TacticToe on top of the HOL4 interactive theorem prover. TacticToe learns from human proofs which mathematical technique is suitable in each proof situation. This knowledge is then used in a Monte Carlo tree search algorithm to explore promising tactic-level proof paths. On a single CPU, with a time limit of 60 seconds, TacticToe proves 66.4 percent of the 7164 theorems in HOL4's standard library, whereas E prover with auto-schedule solves 34.5 percent. The success rate rises to 69.0 percent by combining the results of TacticToe and E prover.
△ Less
Submitted 1 December, 2021; v1 submitted 2 April, 2018;
originally announced April 2018.
-
Learning to Reason with HOL4 tactics
Authors:
Thibault Gauthier,
Cezary Kaliszyk,
Josef Urban
Abstract:
Techniques combining machine learning with translation to automated reasoning have recently become an important component of formal proof assistants. Such "hammer" tech- niques complement traditional proof assistant automation as implemented by tactics and decision procedures. In this paper we present a unified proof assistant automation approach which attempts to automate the selection of appropr…
▽ More
Techniques combining machine learning with translation to automated reasoning have recently become an important component of formal proof assistants. Such "hammer" tech- niques complement traditional proof assistant automation as implemented by tactics and decision procedures. In this paper we present a unified proof assistant automation approach which attempts to automate the selection of appropriate tactics and tactic-sequences com- bined with an optimized small-scale hammering approach. We implement the technique as a tactic-level automation for HOL4: TacticToe. It implements a modified A*-algorithm directly in HOL4 that explores different tactic-level proof paths, guiding their selection by learning from a large number of previous tactic-level proofs. Unlike the existing hammer methods, TacticToe avoids translation to FOL, working directly on the HOL level. By combining tactic prediction and premise selection, TacticToe is able to re-prove 39 percent of 7902 HOL4 theorems in 5 seconds whereas the best single HOL(y)Hammer strategy solves 32 percent in the same amount of time.
△ Less
Submitted 2 April, 2018;
originally announced April 2018.
-
Premise Selection and External Provers for HOL4
Authors:
Thibault Gauthier,
Cezary Kaliszyk
Abstract:
Learning-assisted automated reasoning has recently gained popularity among the users of Isabelle/HOL, HOL Light, and Mizar. In this paper, we present an add-on to the HOL4 proof assistant and an adaptation of the HOLyHammer system that provides machine learning-based premise selection and automated reasoning also for HOL4. We efficiently record the HOL4 dependencies and extract features from the t…
▽ More
Learning-assisted automated reasoning has recently gained popularity among the users of Isabelle/HOL, HOL Light, and Mizar. In this paper, we present an add-on to the HOL4 proof assistant and an adaptation of the HOLyHammer system that provides machine learning-based premise selection and automated reasoning also for HOL4. We efficiently record the HOL4 dependencies and extract features from the theorem statements, which form a basis for premise selection. HOLyHammer transforms the HOL4 statements in the various TPTP-ATP proof formats, which are then processed by the ATPs. We discuss the different evaluation settings: ATPs, accessible lemmas, and premise numbers. We measure the performance of HOLyHammer on the HOL4 standard library. The results are combined accordingly and compared with the HOL Light experiments, showing a comparably high quality of predictions. The system directly benefits HOL4 users by automatically finding proofs dependencies that can be reconstructed by Metis.
△ Less
Submitted 11 September, 2015;
originally announced September 2015.
-
Sharing HOL4 and HOL Light proof knowledge
Authors:
Thibault Gauthier,
Cezary Kaliszyk
Abstract:
New proof assistant developments often involve concepts similar to already formalized ones. When proving their properties, a human can often take inspiration from the existing formalized proofs available in other provers or libraries. In this paper we propose and evaluate a number of methods, which strengthen proof automation by learning from proof libraries of different provers. Certain conjectur…
▽ More
New proof assistant developments often involve concepts similar to already formalized ones. When proving their properties, a human can often take inspiration from the existing formalized proofs available in other provers or libraries. In this paper we propose and evaluate a number of methods, which strengthen proof automation by learning from proof libraries of different provers. Certain conjectures can be proved directly from the dependencies induced by similar proofs in the other library. Even if exact correspondences are not found, learning-reasoning systems can make use of the association between proved theorems and their characteristics to predict the relevant premises. Such external help can be further combined with internal advice. We evaluate the proposed knowledge-sharing methods by reproving the HOL Light and HOL4 standard libraries. The learning-reasoning system HOL(y)Hammer, whose single best strategy could automatically find proofs for 30% of the HOL Light problems, can prove 40% with the knowledge from HOL4.
△ Less
Submitted 11 September, 2015;
originally announced September 2015.
-
Matching concepts across HOL libraries
Authors:
Thibault Gauthier,
Cezary Kaliszyk
Abstract:
Many proof assistant libraries contain formalizations of the same mathematical concepts. The concepts are often introduced (defined) in different ways, but the properties that they have, and are in turn formalized, are the same. For the basic concepts, like natural numbers, matching them between libraries is often straightforward, because of mathematical naming conventions. However, for more advan…
▽ More
Many proof assistant libraries contain formalizations of the same mathematical concepts. The concepts are often introduced (defined) in different ways, but the properties that they have, and are in turn formalized, are the same. For the basic concepts, like natural numbers, matching them between libraries is often straightforward, because of mathematical naming conventions. However, for more advanced concepts, finding similar formalizations in different libraries is a non-trivial task even for an expert.
In this paper we investigate automatic discovery of similar concepts across libraries of proof assistants. We propose an approach for normalizing properties of concepts in formal libraries and a number of similarity measures. We evaluate the approach on HOL based proof assistants HOL4, HOL Light and Isabelle/HOL, discovering 398 pairs of isomorphic constants and types.
△ Less
Submitted 15 May, 2014;
originally announced May 2014.