Skip to main content

Showing 1–4 of 4 results for author: Feldman, M Q

Searching in archive cs. Search in all archives.
.
  1. arXiv:2401.15232  [pdf, other

    cs.HC

    How Beginning Programmers and Code LLMs (Mis)read Each Other

    Authors: Sydney Nguyen, Hannah McLean Babe, Yangtian Zi, Arjun Guha, Carolyn Jane Anderson, Molly Q Feldman

    Abstract: Generative AI models, specifically large language models (LLMs), have made strides towards the long-standing goal of text-to-code generation. This progress has invited numerous studies of user interaction. However, less is known about the struggles and strategies of non-experts, for whom each step of the text-to-code problem presents challenges: describing their intent in natural language, evaluat… ▽ More

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: Conditionally Accepted to CHI 2024

  2. arXiv:2308.09895  [pdf, other

    cs.PL cs.LG

    Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs

    Authors: Federico Cassano, John Gouwar, Francesca Lucchetti, Claire Schlesinger, Anders Freeman, Carolyn Jane Anderson, Molly Q Feldman, Michael Greenberg, Abhinav Jangda, Arjun Guha

    Abstract: Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as building blocks for research in programming languages and software engineering. However, Code LLMs produce impressive results on programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript)… ▽ More

    Submitted 10 February, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

  3. arXiv:2306.04556  [pdf, other

    cs.LG cs.HC cs.SE

    StudentEval: A Benchmark of Student-Written Prompts for Large Language Models of Code

    Authors: Hannah McLean Babe, Sydney Nguyen, Yangtian Zi, Arjun Guha, Molly Q Feldman, Carolyn Jane Anderson

    Abstract: Code LLMs are being rapidly deployed and there is evidence that they can make professional programmers more productive. Current benchmarks for code generation measure whether models generate correct programs given an expert prompt. In this paper, we present a new benchmark containing multiple prompts per problem, written by a specific population of non-expert prompters: beginning programmers. Stud… ▽ More

    Submitted 7 June, 2023; originally announced June 2023.

  4. arXiv:2208.08227  [pdf, other

    cs.LG cs.PL

    MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation

    Authors: Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q Feldman, Arjun Guha, Michael Greenberg, Abhinav Jangda

    Abstract: Large language models have demonstrated the ability to generate both natural language and programming language text. Such models open up the possibility of multi-language code generation: could code generation models generalize knowledge from one language to another? Although contemporary code generation models can generate semantically correct Python code, little is known about their abilities wi… ▽ More

    Submitted 19 December, 2022; v1 submitted 17 August, 2022; originally announced August 2022.