Showing 1–13 of 13 results for author: Xia, C S

  1. arXiv:2407.01489

    cs.SE cs.AI cs.CL cs.LG

    Agentless: Demystifying LLM-based Software Engineering Agents

    Authors: Chunqiu Steven Xia, Yinlin Deng, Soren Dunn, Lingming Zhang

    Abstract: Recent advancements in large language models (LLMs) have significantly advanced the automation of software development tasks, including code synthesis, program repair, and test generation. More recently, researchers and industry practitioners have developed various autonomous LLM agents to perform end-to-end software development tasks. These agents are equipped with the ability to use tools, run c…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2404.17153

    cs.SE

    A Unified Debugging Approach via LLM-Based Multi-Agent Synergy

    Authors: Cheryl Lee, Chunqiu Steven Xia, Jen-tse Huang, Zhouruixin Zhu, Lingming Zhang, Michael R. Lyu

    Abstract: Tremendous efforts have been devoted to automating software debugging, a time-consuming process involving fault localization and repair generation. Recently, Large Language Models (LLMs) have shown great potential in automated debugging. However, we identified three challenges posed to traditional and LLM-based debugging tools: 1) the upstream imperfection of fault localization affects the downstr…

    Submitted 26 April, 2024; originally announced April 2024.

  3. arXiv:2403.19114

    cs.SE cs.CL cs.LG cs.PL

    Top Leaderboard Ranking = Top Coding Proficiency, Always? EvoEval: Evolving Coding Benchmarks via LLM

    Authors: Chunqiu Steven Xia, Yinlin Deng, Lingming Zhang

    Abstract: LLMs have become the go-to choice for code generation tasks, with an exponential increase in the training, development, and usage of LLMs specifically for code generation. To evaluate the ability of LLMs on code, both academic and industry practitioners rely on popular handcrafted benchmarks. However, prior benchmarks contain only a very limited set of problems, both in quantity and variety. Furth…

    Submitted 27 March, 2024; originally announced March 2024.

  4. arXiv:2309.00608

    cs.SE cs.LG cs.PL

    Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair

    Authors: Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang

    Abstract: During Automated Program Repair (APR), it can be challenging to synthesize correct patches for real-world systems in general-purpose programming languages. Recent Large Language Models (LLMs) have been shown to be helpful "copilots" in assisting developers with various coding tasks, and have also been directly applied for patch synthesis. However, most LLMs treat programs as sequences of tokens, m…

    Submitted 8 November, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Paper accepted at ESEC/FSE 2023

    ACM Class: I.2.2; D.2.5

  5. arXiv:2308.04748

    cs.SE cs.LG

    Fuzz4All: Universal Fuzzing with Large Language Models

    Authors: Chunqiu Steven Xia, Matteo Paltenghi, Jia Le Tian, Michael Pradel, Lingming Zhang

    Abstract: Fuzzing has achieved tremendous success in discovering bugs and vulnerabilities in various software systems. Systems under test (SUTs) that take in programming or formal language as inputs, e.g., compilers, runtime engines, constraint solvers, and software libraries with accessible APIs, are especially important as they are fundamental building blocks of software development. However, existing fuz…

    Submitted 15 January, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted at ICSE 2024

  6. arXiv:2305.01210

    cs.SE cs.CL cs.LG

    Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation

    Authors: Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, Lingming Zhang

    Abstract: Program synthesis has been long studied with recent approaches focused on directly using the power of Large Language Models (LLMs) to generate code. Programming benchmarks, with curated synthesis problems and test-cases, are used to measure the performance of various LLMs on code synthesis. However, these test-cases can be limited in both quantity and quality for fully assessing the functional cor…

    Submitted 30 October, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

  7. arXiv:2304.02014

    cs.SE

    Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT

    Authors: Yinlin Deng, Chunqiu Steven Xia, Chenyuan Yang, Shizhuo Dylan Zhang, Shujing Yang, Lingming Zhang

    Abstract: Deep Learning (DL) library bugs affect downstream DL applications, emphasizing the need for reliable systems. Generating valid input programs for fuzzing DL libraries is challenging due to the need for satisfying both language syntax/semantics and constraints for constructing valid computational graphs. Recently, the TitanFuzz work demonstrates that modern Large Language Models (LLMs) can be direc…

    Submitted 4 April, 2023; originally announced April 2023.

  8. arXiv:2304.00385

    cs.SE cs.LG

    Keep the Conversation Going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT

    Authors: Chunqiu Steven Xia, Lingming Zhang

    Abstract: Automated Program Repair (APR) aims to automatically generate patches for buggy programs. Recent APR work has been focused on leveraging modern Large Language Models (LLMs) to directly generate patches for APR. Such LLM-based APR tools work by first constructing an input prompt built using the original buggy code and then queries the LLM to generate patches. While the LLM-based APR tools are able…

    Submitted 1 April, 2023; originally announced April 2023.

  9. arXiv:2303.10494

    cs.SE cs.LG

    Revisiting the Plastic Surgery Hypothesis via Large Language Models

    Authors: Chunqiu Steven Xia, Yifeng Ding, Lingming Zhang

    Abstract: Automated Program Repair (APR) aspires to automatically generate patches for an input buggy program. Traditional APR tools typically focus on specific bug types and fixes through the use of templates, heuristics, and formal specifications. However, these techniques are limited in terms of the bug types and patch variety they can produce. As such, researchers have designed various learning-based AP…

    Submitted 18 March, 2023; originally announced March 2023.

  10. arXiv:2301.13246

    cs.SE cs.LG

    Conversational Automated Program Repair

    Authors: Chunqiu Steven Xia, Lingming Zhang

    Abstract: Automated Program Repair (APR) can help developers automatically generate patches for bugs. Due to the impressive performance obtained using Large Pre-Trained Language Models (LLMs) on many code related tasks, researchers have started to directly use LLMs for APR. However, prior approaches simply repeatedly sample the LLM given the same constructed input/prompt created from the original buggy code…

    Submitted 30 January, 2023; originally announced January 2023.

  11. arXiv:2212.14834

    cs.SE

    Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models

    Authors: Yinlin Deng, Chunqiu Steven Xia, Haoran Peng, Chenyuan Yang, Lingming Zhang

    Abstract: Detecting bugs in Deep Learning (DL) libraries (e.g., TensorFlow/PyTorch) is critical for almost all downstream DL systems in ensuring effectiveness/safety for end users. Meanwhile, traditional fuzzing techniques can be hardly effective for such a challenging domain since the input DL programs need to satisfy both the input language (e.g., Python) syntax/semantics and the DL API input/shape constr…

    Submitted 7 March, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

    Comments: Accepted at ISSTA 2023

  12. arXiv:2210.14179

    cs.SE

    Practical Program Repair in the Era of Large Pre-trained Language Models

    Authors: Chunqiu Steven Xia, Yuxiang Wei, Lingming Zhang

    Abstract: Automated Program Repair (APR) aims to help developers automatically patch software bugs. However, current state-of-the-art traditional and learning-based APR techniques face the problem of limited patch variety, failing to fix complicated bugs. This is mainly due to the reliance on bug-fixing datasets to craft fix templates or directly predict potential patches. Large Pre-Trained Language Models…

    Submitted 25 October, 2022; originally announced October 2022.

  13. arXiv:2207.08281

    cs.SE

    Less Training, More Repairing Please: Revisiting Automated Program Repair via Zero-shot Learning

    Authors: Chunqiu Steven Xia, Lingming Zhang

    Abstract: Due to the promising future of Automated Program Repair (APR), researchers have proposed various APR techniques, including heuristic-based, template-based, and constraint-based techniques. Among such classic APR techniques, template-based techniques have been widely recognized as state of the art. However, such template-based techniques require predefined templates to perform repair, and their eff…

    Submitted 25 July, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: Accepted at ESEC/FSE 2022