Showing 1–5 of 5 results for author: Olausson, T X

  1. arXiv:2402.19475  [pdf, other]

    cs.SE cs.AI cs.LG

    The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?

    Authors: Alex Gu, Wen-Ding Li, Naman Jain, Theo X. Olausson, Celine Lee, Koushik Sen, Armando Solar-Lezama

    Abstract: While language models are increasingly more proficient at code generation, they still frequently generate incorrect programs. Many of these programs are obviously wrong, but others are more subtle and pass weaker correctness checks such as being able to compile. In this work, we focus on these counterfeit samples: programs sampled from a language model that 1) have a high enough log-probability to…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 54 pages, 25 figures

  2. arXiv:2310.19791  [pdf, other]

    cs.CL cs.AI cs.LG cs.PL

    LILO: Learning Interpretable Libraries by Compressing and Documenting Code

    Authors: Gabriel Grand, Lionel Wong, Maddy Bowers, Theo X. Olausson, Muxin Liu, Joshua B. Tenenbaum, Jacob Andreas

    Abstract: While large language models (LLMs) now excel at code generation, a key aspect of software development is the art of refactoring: consolidating code into libraries of reusable and readable programs. In this paper, we introduce LILO, a neurosymbolic framework that iteratively synthesizes, compresses, and documents code to build libraries tailored to particular problem domains. LILO combines LLM-guid…

    Submitted 15 March, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 camera-ready

  3. LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers

    Authors: Theo X. Olausson, Alex Gu, Benjamin Lipkin, Cedegao E. Zhang, Armando Solar-Lezama, Joshua B. Tenenbaum, Roger Levy

    Abstract: Logical reasoning, i.e., deductively inferring the truth value of a conclusion from a set of premises, is an important task for artificial intelligence with wide potential impacts on science, mathematics, and society. While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing…

    Submitted 14 February, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP Main 2023 (Outstanding Paper Award)

    Journal ref: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5153-5176, Singapore. Association for Computational Linguistics

  4. arXiv:2306.09896  [pdf, other]

    cs.CL cs.AI cs.PL cs.SE

    Is Self-Repair a Silver Bullet for Code Generation?

    Authors: Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama

    Abstract: Large language models have shown remarkable aptitude in code generation, but still struggle to perform complex tasks. Self-repair -- in which the model debugs and repairs its own code -- has recently become a popular way to boost performance in these settings. However, despite its increasing popularity, existing studies of self-repair have been limited in scope; in many settings, its efficacy thus…

    Submitted 2 February, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted to ICLR 2024. Added additional Code Llama experiments and fixed a data processing error harming Code Llama's reported self-repair performance on HumanEval

  5. arXiv:2211.16605  [pdf, other]

    cs.PL cs.AI

    Top-Down Synthesis for Library Learning

    Authors: Matthew Bowers, Theo X. Olausson, Lionel Wong, Gabriel Grand, Joshua B. Tenenbaum, Kevin Ellis, Armando Solar-Lezama

    Abstract: This paper introduces corpus-guided top-down synthesis as a mechanism for synthesizing library functions that capture common functionality from a corpus of programs in a domain specific language (DSL). The algorithm builds abstractions directly from initial DSL primitives, using syntactic pattern matching of intermediate abstractions to intelligently prune the search space and guide the algorithm…

    Submitted 15 January, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Published at POPL 2023

    Journal ref: Proc. ACM Program. Lang. 7, POPL, Article 41 (January 2023), pp 1182-1213