Search | arXiv e-print repository

Amortizing Pragmatic Program Synthesis with Rankings

Authors: Yewen Pu, Saujas Vaduguru, Priyan Vaithilingam, Elena Glassman, Daniel Fried

Abstract: The usage of Rational Speech Acts (RSA) framework has been successful in building \emph{pragmatic} program synthesizers that return programs which, in addition to being logically consistent with user-generated examples, account for the fact that a user chooses their examples informatively. We present a general method of amortizing the slow, exact RSA synthesizer. Our method first query the exact R… ▽ More The usage of Rational Speech Acts (RSA) framework has been successful in building \emph{pragmatic} program synthesizers that return programs which, in addition to being logically consistent with user-generated examples, account for the fact that a user chooses their examples informatively. We present a general method of amortizing the slow, exact RSA synthesizer. Our method first query the exact RSA synthesizer to compile a communication dataset. The dataset contains a number of example-dependent rankings of subsets of programs. It then distills a \textit{single} global ranking of all programs as an approximation to every ranking in the dataset. This global ranking is then used at inference time to rank multiple logically consistent candidate programs generated from a fast, non-pragmatic synthesizer. Experiments on two program synthesis domains using our ranking method resulted in orders of magnitudes of speed ups compared to the exact RSA synthesizer, while being more accurate than a non-pragmatic synthesizer when communicating with humans. Finally, we prove that in the special case of synthesis from a single example, this approximation is exact. △ Less

Submitted 1 June, 2024; originally announced July 2024.

Comments: icml 2024. arXiv admin note: substantial text overlap with arXiv:2309.03225

arXiv:2402.07342 [pdf, other]

Imagining a Future of Designing with AI: Dynamic Grounding, Constructive Negotiation, and Sustainable Motivation

Authors: Priyan Vaithilingam, Ian Arawjo, Elena L. Glassman

Abstract: We ideate a future design workflow that involves AI technology. Drawing from activity and communication theory, we attempt to isolate the new value large AI models can provide design compared to past technologies. We arrive at three affordances -- dynamic grounding, constructive negotiation, and sustainable motivation -- that summarize latent qualities of natural language-enabled foundation models… ▽ More We ideate a future design workflow that involves AI technology. Drawing from activity and communication theory, we attempt to isolate the new value large AI models can provide design compared to past technologies. We arrive at three affordances -- dynamic grounding, constructive negotiation, and sustainable motivation -- that summarize latent qualities of natural language-enabled foundation models that, if explicitly designed for, can support the process of design. Through design fiction, we then imagine a future interface as a diegetic prototype, the story of Squirrel Game, that demonstrates each of our three affordances in a realistic usage scenario. Our design process, terminology, and diagrams aim to contribute to future discussions about the relative affordances of AI technology with regard to collaborating with human designers. △ Less

Submitted 11 February, 2024; originally announced February 2024.

Comments: 12 pages, 4 figures

ACM Class: J.6; I.2.0; H.5.2

arXiv:2401.10880 [pdf, other]

DynaVis: Dynamically Synthesized UI Widgets for Visualization Editing

Authors: Priyan Vaithilingam, Elena L. Glassman, Jeevana Priya Inala, Chenglong Wang

Abstract: Users often rely on GUIs to edit and interact with visualizations - a daunting task due to the large space of editing options. As a result, users are either overwhelmed by a complex UI or constrained by a custom UI with a tailored, fixed subset of options with limited editing flexibility. Natural Language Interfaces (NLIs) are emerging as a feasible alternative for users to specify edits. However,… ▽ More Users often rely on GUIs to edit and interact with visualizations - a daunting task due to the large space of editing options. As a result, users are either overwhelmed by a complex UI or constrained by a custom UI with a tailored, fixed subset of options with limited editing flexibility. Natural Language Interfaces (NLIs) are emerging as a feasible alternative for users to specify edits. However, NLIs forgo the advantages of traditional GUI: the ability to explore and repeat edits and see instant visual feedback. We introduce DynaVis, which blends natural language and dynamically synthesized UI widgets. As the user describes an editing task in natural language, DynaVis performs the edit and synthesizes a persistent widget that the user can interact with to make further modifications. Study participants (n=24) preferred DynaVis over the NLI-only interface citing ease of further edits and editing confidence due to immediate visual feedback. △ Less

Submitted 19 January, 2024; originally announced January 2024.

arXiv:2309.09128 [pdf, other]

doi 10.1145/3613904.3642016

ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing

Authors: Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman

Abstract: Evaluating outputs of large language models (LLMs) is challenging, requiring making -- and making sense of -- many responses. Yet tools that go beyond basic prompting tend to require knowledge of programming APIs, focus on narrow domains, or are closed-source. We present ChainForge, an open-source visual toolkit for prompt engineering and on-demand hypothesis testing of text generation LLMs. Chain… ▽ More Evaluating outputs of large language models (LLMs) is challenging, requiring making -- and making sense of -- many responses. Yet tools that go beyond basic prompting tend to require knowledge of programming APIs, focus on narrow domains, or are closed-source. We present ChainForge, an open-source visual toolkit for prompt engineering and on-demand hypothesis testing of text generation LLMs. ChainForge provides a graphical interface for comparison of responses across models and prompt variations. Our system was designed to support three tasks: model selection, prompt template design, and hypothesis testing (e.g., auditing). We released ChainForge early in its development and iterated on its design with academics and online users. Through in-lab and interview studies, we find that a range of people could use ChainForge to investigate hypotheses that matter to them, including in real-world settings. We identify three modes of prompt engineering and LLM hypothesis testing: opportunistic exploration, limited evaluation, and iterative refinement. △ Less

Submitted 3 May, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

Comments: 18 pages, 7 figures, published at CHI 2024

ACM Class: H.5.2; I.2

arXiv:2309.03225 [pdf, other]

Amortizing Pragmatic Program Synthesis with Rankings

Authors: Yewen Pu, Saujas Vaduguru, Priyan Vaithilingam, Elena Glassman, Daniel Fried

Abstract: In program synthesis, an intelligent system takes in a set of user-generated examples and returns a program that is logically consistent with these examples. The usage of Rational Speech Acts (RSA) framework has been successful in building \emph{pragmatic} program synthesizers that return programs which -- in addition to being logically consistent -- account for the fact that a user chooses their… ▽ More In program synthesis, an intelligent system takes in a set of user-generated examples and returns a program that is logically consistent with these examples. The usage of Rational Speech Acts (RSA) framework has been successful in building \emph{pragmatic} program synthesizers that return programs which -- in addition to being logically consistent -- account for the fact that a user chooses their examples informatively. However, the computational burden of running the RSA algorithm has restricted the application of pragmatic program synthesis to domains with a small number of possible programs. This work presents a novel method of amortizing the RSA algorithm by leveraging a \emph{global pragmatic ranking} -- a single, total ordering of all the hypotheses. We prove that for a pragmatic synthesizer that uses a single demonstration, our global ranking method exactly replicates RSA's ranked responses. We further empirically show that global rankings effectively approximate the full pragmatic synthesizer in an online, multi-demonstration setting. Experiments on two program synthesis domains using our pragmatic ranking method resulted in orders of magnitudes of speed ups compared to the RSA synthesizer, while outperforming the standard, non-pragmatic synthesizer. △ Less

Submitted 1 September, 2023; originally announced September 2023.

ACM Class: I.2.2; D.3.0

arXiv:2308.06656 [pdf, other]

The Usability of Pragmatic Communication in Regular Expression Synthesis

Authors: Priyan Vaithilingam, Yewen Pu, Elena L. Glassman

Abstract: Programming-by-example (PBE) systems aim to alleviate the burden of programming. However, user-specified examples are often ambiguous, leaving multiple programs to satisfy the specification. Consequently, in most prior work, users have had to provide additional examples, particularly negative ones, to further constrain the search over compatible programs. Recent work resolves additional ambiguity… ▽ More Programming-by-example (PBE) systems aim to alleviate the burden of programming. However, user-specified examples are often ambiguous, leaving multiple programs to satisfy the specification. Consequently, in most prior work, users have had to provide additional examples, particularly negative ones, to further constrain the search over compatible programs. Recent work resolves additional ambiguity by modeling program synthesis tasks as pragmatic communication, showing promising results on a graphics domain using a rudimentary user-study. We adapt pragmatic reasoning to a sub-domain of regular expressions and rigorously study its usability as a means of communication both with and without the ability to provide negative examples. Our user study (N=30) demonstrates that, with a pragmatic synthesizer, end-users can more successfully communicate a target regex using positive examples alone (95%) compared to using a non-pragmatic synthesizer (51%). Further, users can communicate more efficiently (57% fewer examples) with a pragmatic synthesizer compared to a non-pragmatic one. △ Less

Submitted 12 August, 2023; originally announced August 2023.

Showing 1–6 of 6 results for author: Vaithilingam, P