Solving Data-centric Tasks using Large Language Models
Authors:
Shraddha Barke,
Christian Poelitz,
Carina Suzana Negreanu,
Benjamin Zorn,
José Cambronero,
Andrew D. Gordon,
Vu Le,
Elnaz Nouri,
Nadia Polikarpova,
Advait Sarkar,
Brian Slininger,
Neil Toronto,
Jack Williams
Abstract:
Large language models (LLMs) are rapidly replacing help forums like StackOverflow, and are especially helpful for non-professional programmers and end users. These users are often interested in data-centric tasks, such as spreadsheet manipulation and data wrangling, which are hard to solve if the intent is only communicated using a natural-language description, without including the data. But how do we decide how much data and which data to include in the prompt? This paper makes two contributions towards answering this question. First, we create a dataset of real-world NL-to-code tasks manipulating tabular data, mined from StackOverflow posts. Second, we introduce a cluster-then-select prompting technique, which adds the most representative rows from the input data to the LLM prompt. Our experiments show that LLM performance is indeed sensitive to the amount of data passed in the prompt, and that for tasks with a lot of syntactic variation in the input table, our cluster-then-select technique outperforms a random selection baseline.
Submitted 24 March, 2024; v1 submitted 18 February, 2024;
originally announced February 2024.
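Illustration:
The cluster-then-select idea in the abstract above can be sketched roughly as follows. This is not the authors' implementation: the abstract does not say how rows are clustered, so the sketch assumes rows are grouped by a coarse syntactic "shape" of their cells. The names shape and cluster_then_select are hypothetical, as is the sample table.

```python
import re
from collections import OrderedDict


def shape(value):
    """Map a cell to a coarse syntactic pattern (assumed clustering feature).

    Runs of letters become 'a' and runs of digits become 'd'; punctuation and
    whitespace are kept, so '2024-03-24' and '2021-04-05' share one shape.
    """
    text = str(value)
    text = re.sub(r"[A-Za-z]+", "a", text)  # letters first, so the 'd' below survives
    text = re.sub(r"[0-9]+", "d", text)
    return text


def cluster_then_select(rows, k):
    """Pick up to k rows, covering as many syntactic clusters as possible."""
    # Cluster: rows whose cells all have the same shape fall into one group.
    clusters = OrderedDict()
    for row in rows:
        key = tuple(shape(cell) for cell in row)
        clusters.setdefault(key, []).append(row)

    # Select: visit clusters round-robin, largest first (stable sort keeps
    # first-seen order among equally sized clusters).
    ordered = sorted(clusters.values(), key=len, reverse=True)
    selected = []
    depth = 0
    while len(selected) < k and any(depth < len(c) for c in ordered):
        for cluster in ordered:
            if depth < len(cluster) and len(selected) < k:
                selected.append(cluster[depth])
        depth += 1
    return selected


# Hypothetical input table with three different date formats.
rows = [
    ("Alice Smith", "2024-03-24"),
    ("Bob Jones", "2024-02-18"),
    ("Carol", "24/03/2024"),
    ("Dan Brown", "2021-04-05"),
    ("Eve", "March 2024"),
]

prompt_rows = cluster_then_select(rows, 3)
```

With a budget of three rows, this selection takes one row from each syntactic variant, whereas a random sample of three could miss one of the rarer date formats.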
Perturbations and the Future Conformal Boundary
Authors:
A. N. Lasenby,
W. J. Handley,
D. J. Bartlett,
C. S. Negreanu
Abstract:
The concordance model of cosmology predicts a universe which finishes in a finite amount of conformal time at a future conformal boundary. We show that for particular cases we study, the background variables and perturbations may be analytically continued beyond this boundary and that the "end of the universe" is not necessarily the end of their physical development. Remarkably, these theoretical considerations of the end of the universe might have observable consequences today: perturbation modes consistent with these boundary conditions have a quantised power spectrum which may be relevant to features seen in the large scale cosmic microwave background. Mathematically these cosmological models may either be interpreted as a palindromic universe mirrored in time, a reflecting boundary condition, or a double cover, but are identical with respect to their observational predictions and stand in contrast to the predictions of conformal cyclic cosmologies.
Submitted 20 April, 2022; v1 submitted 5 April, 2021;
originally announced April 2021.
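Illustration:
The claim in the abstract above that the universe ends in a finite amount of conformal time can be checked with a short standard estimate. This is a sketch, not taken from the paper. It assumes units with c = 1 and that, from some late cosmic time t_0 onward, the cosmological constant Λ dominates so the expansion is close to de Sitter. The symbols a_0, t_0 and H_∞ are introduced here for illustration.

```latex
% Conformal time is defined by d\eta = dt / a(t).
% Assumed late-time (Lambda-dominated) scale factor:
%   a(t) \approx a_0 \, e^{H_\infty (t - t_0)}, \qquad H_\infty = \sqrt{\Lambda/3}.
\begin{align}
  \eta_\infty - \eta_0
    &= \int_{t_0}^{\infty} \frac{dt}{a(t)}
     \approx \frac{1}{a_0} \int_{t_0}^{\infty} e^{-H_\infty (t - t_0)} \, dt
     = \frac{1}{a_0 H_\infty} .
\end{align}
```

The integral converges even though cosmic time t runs to infinity, so the future conformal boundary sits at a finite value of η.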