Search | arXiv e-print repository

Can Github issues be solved with Tree Of Thoughts?

Authors: Ricardo La Rosa, Corey Hulse, Bangdi Liu

Abstract: While there have been extensive studies in code generation by large language models (LLM), where benchmarks like HumanEval have been surpassed with an impressive 96.3% success rate, these benchmarks predominantly judge a model's performance on basic function-level code generation and lack the critical thinking and concept of scope required of real-world scenarios such as solving GitHub issues. Thi… ▽ More While there have been extensive studies in code generation by large language models (LLM), where benchmarks like HumanEval have been surpassed with an impressive 96.3% success rate, these benchmarks predominantly judge a model's performance on basic function-level code generation and lack the critical thinking and concept of scope required of real-world scenarios such as solving GitHub issues. This research introduces the application of the Tree of Thoughts (ToT) language model reasoning framework for enhancing the decision-making and problem-solving abilities of LLMs for this complex task. Compared to traditional input-output (IO) prompting and Retrieval Augmented Generation (RAG) techniques, ToT is designed to improve performance by facilitating a structured exploration of multiple reasoning trajectories and enabling self-assessment of potential solutions. We experimentally deploy ToT in tackling a Github issue contained within an instance of the SWE-bench. However, our results reveal that the ToT framework alone is not enough to give LLMs the critical reasoning capabilities to outperform existing methods. In this paper we analyze the potential causes of these shortcomings and identify key areas for improvement such as deepening the thought process and introducing agentic capabilities. The insights of this research are aimed at informing future directions for refining the application of ToT and better harnessing the potential of LLMs in real-world problem-solving scenarios. △ Less

Submitted 20 May, 2024; originally announced May 2024.

Comments: 8 pages, 2 figures, 7 tables

arXiv:2205.09185 [pdf, other]

doi 10.1016/j.nima.2022.167748

AI-assisted Optimization of the ECCE Tracking System at the Electron Ion Collider

Authors: C. Fanelli, Z. Papandreou, K. Suresh, J. K. Adkins, Y. Akiba, A. Albataineh, M. Amaryan, I. C. Arsene, C. Ayerbe Gayoso, J. Bae, X. Bai, M. D. Baker, M. Bashkanov, R. Bellwied, F. Benmokhtar, V. Berdnikov, J. C. Bernauer, F. Bock, W. Boeglin, M. Borysova, E. Brash, P. Brindza, W. J. Briscoe, M. Brooks, S. Bueltmann , et al. (258 additional authors not shown)

Abstract: The Electron-Ion Collider (EIC) is a cutting-edge accelerator facility that will study the nature of the "glue" that binds the building blocks of the visible matter in the universe. The proposed experiment will be realized at Brookhaven National Laboratory in approximately 10 years from now, with detector design and R&D currently ongoing. Notably, EIC is one of the first large-scale facilities to… ▽ More The Electron-Ion Collider (EIC) is a cutting-edge accelerator facility that will study the nature of the "glue" that binds the building blocks of the visible matter in the universe. The proposed experiment will be realized at Brookhaven National Laboratory in approximately 10 years from now, with detector design and R&D currently ongoing. Notably, EIC is one of the first large-scale facilities to leverage Artificial Intelligence (AI) already starting from the design and R&D phases. The EIC Comprehensive Chromodynamics Experiment (ECCE) is a consortium that proposed a detector design based on a 1.5T solenoid. The EIC detector proposal review concluded that the ECCE design will serve as the reference design for an EIC detector. Herein we describe a comprehensive optimization of the ECCE tracker using AI. The work required a complex parametrization of the simulated detector system. Our approach dealt with an optimization problem in a multidimensional design space driven by multiple objectives that encode the detector performance, while satisfying several mechanical constraints. We describe our strategy and show results obtained for the ECCE tracking system. The AI-assisted design is agnostic to the simulation framework and can be extended to other sub-detectors or to a system of sub-detectors to further optimize the performance of the EIC detector. △ Less

Submitted 19 May, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

Comments: 16 pages, 18 figures, 2 appendices, 3 tables

arXiv:2111.07959 [pdf, other]

Neural Normalized Min-Sum Message-Passing vs. Viterbi Decoding for the CCSDS Line Product Code

Authors: Jonathan Nguyen, Linfang Wang, Chester Hulse, Sahil Dani, Amaael Antonini, Todd Chauvin, Divsalar Dariush, Richard Wesel

Abstract: The Consultative Committee for Space Data Systems (CCSDS) 141.11-O-1 Line Product Code (LPC) provides a rare opportunity to compare maximum-likelihood decoding and message passing. The LPC considered in this paper is intended to serve as the inner code in conjunction with a (255,239) Reed Solomon (RS) code whose symbols are bytes of data. This paper represents the 141.11-O-1 LPC as a bipartite gra… ▽ More The Consultative Committee for Space Data Systems (CCSDS) 141.11-O-1 Line Product Code (LPC) provides a rare opportunity to compare maximum-likelihood decoding and message passing. The LPC considered in this paper is intended to serve as the inner code in conjunction with a (255,239) Reed Solomon (RS) code whose symbols are bytes of data. This paper represents the 141.11-O-1 LPC as a bipartite graph and uses that graph to formulate both maximum likelihood (ML) and message passing algorithms. ML decoding must, of course, have the best frame error rate (FER) performance. However, a fixed point implementation of a Neural-Normalized MinSum (N-NMS) message passing decoder closely approaches ML performance with a significantly lower complexity. △ Less

Submitted 15 November, 2021; originally announced November 2021.

Comments: This paper has been submitted to ICC 2022

arXiv:2104.09480 [pdf, other]

FPGA Implementations of Layered MinSum LDPC Decoders Using RCQ Message Passing

Authors: Caleb Terrill, Linfang Wang, Sean Chen, Chester Hulse, Calvin Kuo, Richard Wesel, Dariush Divsalar

Abstract: Non-uniform message quantization techniques such as reconstruction-computation-quantization (RCQ) improve error-correction performance and decrease hardware complexity of low-density parity-check (LDPC) decoders that use a flooding schedule. Layered MinSum RCQ (L-msRCQ) enables message quantization to be utilized for layered decoders and irregular LDPC codes. We investigate field-programmable gate… ▽ More Non-uniform message quantization techniques such as reconstruction-computation-quantization (RCQ) improve error-correction performance and decrease hardware complexity of low-density parity-check (LDPC) decoders that use a flooding schedule. Layered MinSum RCQ (L-msRCQ) enables message quantization to be utilized for layered decoders and irregular LDPC codes. We investigate field-programmable gate array (FPGA) implementations of L-msRCQ decoders. Three design methods for message quantization are presented, which we name the Lookup, Broadcast, and Dribble methods. The decoding performance and hardware complexity of these schemes are compared to a layered offset MinSum (OMS) decoder. Simulation results on a (16384, 8192) protograph-based raptor-like (PBRL) LDPC code show that a 4-bit L-msRCQ decoder using the Broadcast method can achieve a 0.03 dB improvement in error-correction performance while using 12% fewer registers than the OMS decoder. A Broadcast-based 3-bit L-msRCQ decoder uses 15% fewer lookup tables, 18% fewer registers, and 13% fewer routed nets than the OMS decoder, but results in a 0.09 dB loss in performance. △ Less

Submitted 19 April, 2021; originally announced April 2021.

Comments: 6 pages, 5 figures, submitted to GLOBECOM 2021

Showing 1–4 of 4 results for author: Hulse, C