
Showing 1–50 of 334 results for author: Clark, P

  1. arXiv:2407.01725  [pdf, other]

    cs.CL cs.AI cs.LG

    DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

    Authors: Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Bhavana Dalvi Mishra, Abhijeetsingh Meena, Aryan Prakhar, Tirth Vora, Tushar Khot, Ashish Sabharwal, Peter Clark

    Abstract: Can the rapid advances in code generation, function calling, and data analysis using large language models (LLMs) help automate the search and verification of hypotheses purely from a set of provided datasets? To evaluate this question, we present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery. The benchmark is designed to systemat…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Website: https://github.com/allenai/discoverybench

  2. arXiv:2406.06769  [pdf, other]

    cs.AI cs.CL

    DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents

    Authors: Peter Jansen, Marc-Alexandre Côté, Tushar Khot, Erin Bransom, Bhavana Dalvi Mishra, Bodhisattwa Prasad Majumder, Oyvind Tafjord, Peter Clark

    Abstract: Automated scientific discovery promises to accelerate progress across scientific domains. However, developing and evaluating an AI agent's capacity for end-to-end scientific reasoning is challenging as running real-world experiments is often prohibitively expensive or infeasible. In this work we introduce DISCOVERYWORLD, the first virtual environment for developing and benchmarking an agent's abil…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures. Preprint, under review

  3. arXiv:2406.06702  [pdf, other]

    astro-ph.GA

    NEATH III: a molecular line survey of a simulated star-forming cloud

    Authors: F. D. Priestley, P. C. Clark, S. C. O. Glover, S. E. Ragan, O. Fehér, L. R. Prole, R. S. Klessen

    Abstract: We present synthetic line observations of a simulated molecular cloud, utilising a self-consistent treatment of the dynamics and time-dependent chemical evolution. We investigate line emission from the three most common CO isotopologues ($^{12}$CO, $^{13}$CO, C$^{18}$O) and six supposed tracers of dense gas (NH$_3$, HCN, N$_2$H$^+$, HCO$^+$, CS, HNC). Our simulation produces a range of line intens…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 14 pages, 14 figures. MNRAS accepted

  4. arXiv:2406.06485  [pdf, other]

    cs.CL cs.AI

    Can Language Models Serve as Text-Based World Simulators?

    Authors: Ruoyao Wang, Graham Todd, Ziang Xiao, Xingdi Yuan, Marc-Alexandre Côté, Peter Clark, Peter Jansen

    Abstract: Virtual environments play a key role in benchmarking advances in complex planning and decision-making tasks but are expensive and complicated to build by hand. Can current language models themselves serve as world simulators, correctly predicting how actions change different world states, thus bypassing the need for extensive manual coding? Our goal is to answer this question in the context of tex…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: ACL 2024

  5. arXiv:2406.02334  [pdf, other]

    astro-ph.IM astro-ph.HE

    $\textit{Kilonova Seekers}$: the GOTO project for real-time citizen science in time-domain astrophysics

    Authors: T. L. Killestein, L. Kelsey, E. Wickens, L. Nuttall, J. Lyman, C. Krawczyk, K. Ackley, M. J. Dyer, F. Jiménez-Ibarra, K. Ulaczyk, D. O'Neill, A. Kumar, D. Steeghs, D. K. Galloway, V. S. Dhillon, P. O'Brien, G. Ramsay, K. Noysena, R. Kotak, R. P. Breton, E. Pallé, D. Pollacco, S. Awiphan, S. Belkin, P. Chote , et al. (29 additional authors not shown)

    Abstract: Time-domain astrophysics continues to grow rapidly, with the inception of new surveys drastically increasing data volumes. Democratised, distributed approaches to training sets for machine learning classifiers are crucial to make the most of this torrent of discovery -- with citizen science approaches proving effective at meeting these requirements. In this paper, we describe the creation of and t…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 20 pages, 15 figures. Submitted to MNRAS

  6. arXiv:2405.19793  [pdf, other]

    cs.CL

    PDDLEGO: Iterative Planning in Textual Environments

    Authors: Li Zhang, Peter Jansen, Tianyi Zhang, Peter Clark, Chris Callison-Burch, Niket Tandon

    Abstract: Planning in textual environments has been shown to be a long-standing challenge even for current models. A recent, promising line of work uses LLMs to generate a formal representation of the environment that can be solved by a symbolic planner. However, existing methods rely on a fully-observed environment where all entity states are initially known, so a one-off representation can be constructed…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: In *SEM 2024

  7. arXiv:2405.16337  [pdf, other]

    cs.CL cs.AI

    Learning to Reason via Program Generation, Emulation, and Search

    Authors: Nathaniel Weir, Muhammad Khalifa, Linlu Qiu, Orion Weller, Peter Clark

    Abstract: Program synthesis with language models (LMs) has unlocked a large set of reasoning abilities; code-tuned LMs have proven adept at generating programs that solve a wide variety of algorithmic symbolic manipulation tasks (e.g. word concatenation). However, not all reasoning tasks are easily expressible as code, e.g. tasks involving commonsense reasoning, moral decision-making, and sarcasm understand…

    Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: 16 pages, 10 figures

  8. arXiv:2405.09503  [pdf, other]

    astro-ph.GA

    Self-consistent modelling of the Milky Way structure using live potentials

    Authors: Eva Durán-Camacho, Ana Duarte-Cabral, Alex R. Pettitt, Robin G. Treß, Paul C. Clark, Ralf S. Klessen, Kamran R. J. Bogue, Rowan J. Smith, Mattia C. Sormani

    Abstract: To advance our understanding of the evolution of the interstellar medium (ISM) of our Galaxy, numerical models of Milky Way (MW) type galaxies are widely used. However, most models only vaguely resemble the MW (e.g. in total mass), and often use imposed analytic potentials (which cannot evolve dynamically). This poses a problem in asserting their applicability for the interpretation of observation…

    Submitted 11 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted for publication in MNRAS. 24 pages, 23 figures, 3 tables

  9. arXiv:2405.00095  [pdf, ps, other]

    astro-ph.GA astro-ph.SR

    Assessing the accuracy of the star formation rate measurements by direct star count in molecular clouds

    Authors: Sami Dib, Jian Wen Zhou, Sébastien Comerón, Luis E. Garduño, Valery V. Kravtsov, Paul C. Clark, Guang-Xing Li, Maritza A. Lara-López, Tie Liu, Mohsen Shadmehri, James R. Doughty

    Abstract: Star formation estimates based on the counting of YSOs are commonly applied to nearby star-forming regions in the Galaxy. With this method, the SFRs are measured using the counts of YSOs in a particular protostellar Class, a typical protostellar mass, and the lifetime associated with this Class. However, the assumptions underlying the validity of the method such as that of a constant star formation…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: Submitted. Comments are welcome

  10. arXiv:2403.20302  [pdf, other]

    astro-ph.GA astro-ph.HE physics.pop-ph

    I'm in AGNi: A new standard for AGN pluralisation

    Authors: Andrew D. Gow, Peter Clark, Dan Rycanowski

    Abstract: We present a new standard acronym for Active Galactic Nuclei, finally settling the argument of AGN vs. AGNs. Our new standard is not only etymologically superior (following the consensus set by SNe), but also boasts other linguistic opportunities, connecting strongly with relevant theology and streamlining descriptions of AGN properties.

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 4 pages, 3 figures, accepted for publication in Acta Prima Aprilia

  11. arXiv:2403.19269  [pdf, other]

    astro-ph.GA

    N$_2$H$^+$(1-0) as a tracer of dense gas in and between spiral arms

    Authors: O. Feher, S. E. Ragan, F. D. Priestley, P. C. Clark, T. J. T. Moore

    Abstract: Recent advances in identifying giant molecular filaments in galactic surveys allow us to study the interstellar material and its dense, potentially star forming phase on scales comparable to resolved extragalactic clouds. Two large filaments detected in the CHIMPS $^{13}$CO(3-2) survey, one in the Sagittarius-arm and one in an inter-arm region, were mapped with dense gas tracers inside a 0.06 deg…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 18 pages, 9 figures, accepted by MNRAS

  12. arXiv:2403.00092  [pdf, other]

    cs.CL

    PROC2PDDL: Open-Domain Planning Representations from Texts

    Authors: Tianyi Zhang, Li Zhang, Zhaoyi Hou, Ziyu Wang, Yuling Gu, Peter Clark, Chris Callison-Burch, Niket Tandon

    Abstract: Planning in a text-based environment continues to be a major challenge for AI systems. Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representation…

    Submitted 2 July, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: In NLRSE 2024, the 2nd Natural Language Reasoning and Structured Explanations Workshop

  13. arXiv:2402.16951  [pdf, other]

    astro-ph.HE astro-ph.GA

    The rate of extreme coronal line emitting galaxies in the Sloan Digital Sky Survey and their relation to tidal disruption events

    Authors: Joseph Callow, Or Graur, Peter Clark, Antonella Palmese, Jessica Aguilar, Steven Ahlen, Segev BenZvi, David Brooks, Todd Claybaugh, Axel de la Macorra, Peter Doel, Jaime E. Forero-Romero, Enrique Gaztañaga, Satya Gontcho A Gontcho, Andrew Lambert, Martin Landriau, Marc Manera, Aaron Meisner, Ramon Miquel, John Moustakas, Jundan Nie, Claire Poppett, Francisco Prada, Mehdi Rezaie, Graziano Rossi , et al. (5 additional authors not shown)

    Abstract: Strong high-ionization iron coronal lines (CLs) are a rare phenomenon observed in galaxy and quasi-stellar object spectra that are thought to be created as a result of tidal disruption event (TDE) flares. To test whether these CLs are the result of TDE activity, we search for extreme coronal line emitting galaxies (ECLEs) in the Sloan Digital Sky Survey (SDSS), measure their rate, and compare it t…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Submitted to MNRAS. 19 pages, 12 figures

  14. arXiv:2402.14798  [pdf, other]

    cs.CL cs.AI

    Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic

    Authors: Nathaniel Weir, Kate Sanders, Orion Weller, Shreya Sharma, Dongwei Jiang, Zhengping Jiang, Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Jansen, Peter Clark, Benjamin Van Durme

    Abstract: Contemporary language models enable new opportunities for structured reasoning with text, such as the construction and evaluation of intuitive, proof-like textual entailment trees without relying on brittle formal logic. However, progress in this direction has been hampered by a long-standing lack of a clear protocol for determining what valid compositional entailment is. This absence causes noisy…

    Submitted 27 February, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  15. arXiv:2402.13610  [pdf, other]

    cs.CL cs.AI cs.LG

    Data-driven Discovery with Large Generative Models

    Authors: Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Sanchaita Hazra, Ashish Sabharwal, Peter Clark

    Abstract: With the accumulation of data at an unprecedented rate, its potential to fuel scientific discovery is growing exponentially. This position paper urges the Machine Learning (ML) community to exploit the capabilities of large generative models (LGMs) to develop automated systems for end-to-end data-driven discovery -- a paradigm encompassing the search and verification of hypotheses purely from a se…

    Submitted 21 February, 2024; originally announced February 2024.

  16. arXiv:2402.03244  [pdf, other]

    cs.LG cs.CL

    Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

    Authors: Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Sameer Singh, Peter Clark, Roy Fox

    Abstract: Large language models (LLMs) have recently been used for sequential decision making in interactive environments. However, leveraging environment reward signals for continual LLM actor improvement is not straightforward. We propose Skill Set Optimization (SSO) for improving LLM actor performance through constructing and refining sets of transferable skills. SSO constructs skills by extracting commo…

    Submitted 22 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  17. arXiv:2401.06751  [pdf, other]

    cs.CL cs.AI cs.LG

    The Unreasonable Effectiveness of Easy Training Data for Hard Tasks

    Authors: Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe

    Abstract: How can we train models to perform well on hard test data when hard training data is by definition difficult to label correctly? This question has been termed the scalable oversight problem and has drawn increasing attention as language models have continually improved. In this paper, we present the surprising conclusion that current pretrained language models often generalize relatively well from…

    Submitted 5 June, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: ACL 2024. 23 pages, 20 figures

  18. arXiv:2312.07527  [pdf, other]

    cs.CL cs.AI

    BaRDa: A Belief and Reasoning Dataset that Separates Factual Accuracy and Reasoning Ability

    Authors: Peter Clark, Bhavana Dalvi Mishra, Oyvind Tafjord

    Abstract: While there are numerous benchmarks comparing the performance of modern language models (LMs), end-task evaluations often conflate notions of *factual accuracy* ("truth") and *reasoning ability* ("rationality", or "honesty" in the sense of correctly reporting implications of beliefs). Our goal is a dataset that clearly distinguishes these two notions. Our approach is to leverage and extend a colle…

    Submitted 23 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Added note about how dataset sampling was performed

  19. arXiv:2312.06769  [pdf, other]

    astro-ph.GA astro-ph.CO

    Heavy Black Hole Seed Formation in High-z Atomic Cooling Halos

    Authors: Lewis R. Prole, John A. Regan, Simon C. O. Glover, Ralf S. Klessen, Felix D. Priestley, Paul C. Clark

    Abstract: Halos with masses in excess of the atomic limit are believed to be ideal environments in which to form heavy black hole seeds with masses above $10^3$ M$_\odot$. In cases where the H$_2$ fraction is suppressed this is expected to lead to reduced fragmentation of the gas and the generation of a top heavy initial mass function. In extreme cases this can result in the formation of massive black hole seeds. Re…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Submitted to A&A, comments welcome

  20. arXiv:2312.03842  [pdf, other]

    astro-ph.HE

    Light-Curve Structure and Halpha Line Formation in the Tidal Disruption Event AT 2019azh

    Authors: Sara Faris, Iair Arcavi, Lydia Makrygianni, Daichi Hiramatsu, Giacomo Terreran, Joseph Farah, D. Andrew Howell, Curtis McCully, Megan Newsome, Estefania Padilla Gonzalez, Craig Pellegrino, K. Azalee Bostroem, Wiam Abojanb, Marco C. Lam, Lina Tomasella, Thomas G. Brink, Alexei V. Filippenko, K. Decker French, Peter Clark, Or Graur, Giorgos Leloudas, Mariusz Gromadzki, Joseph P. Anderson, Matt Nicholl, Claudia P. Gutierrez , et al. (11 additional authors not shown)

    Abstract: AT 2019azh is a H+He tidal disruption event (TDE) with one of the most extensive ultraviolet and optical datasets available to date. We present our photometric and spectroscopic observations of this event starting several weeks before and out to approximately two years after g-band peak brightness and combine them with public photometric data. This extensive dataset robustly reveals a change in th…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Submitted to ApJ

  21. arXiv:2311.13981  [pdf, other]

    astro-ph.IM

    Overview of the distributed image processing infrastructure to produce the Legacy Survey of Space and Time

    Authors: Fabio Hernandez, George Beckett, Peter Clark, Matt Doidge, Tim Jenness, Edward Karavakis, Quentin Le Boulc'h, Peter Love, Gabriele Mainetti, Timothy Noble, Brandon White, Wei Yang

    Abstract: The Vera C. Rubin Observatory is preparing to execute the most ambitious astronomical survey ever attempted, the Legacy Survey of Space and Time (LSST). Currently the final phase of construction is under way in the Chilean Andes, with the Observatory's ten-year science mission scheduled to begin in 2025. Rubin's 8.4-meter telescope will nightly scan the southern hemisphere collecting imagery in th…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: 8 pages, 2 figures, 26th International Conference on Computing in High Energy & Nuclear Physics

  22. arXiv:2311.10527  [pdf, other]

    math.NT math.GR

    Functional degrees and arithmetic applications III: Beyond Prime Exponent

    Authors: Pete L. Clark, Uwe Schauz

    Abstract: Continuing our work on group-theoretic generalizations of the prime Ax-Katz Theorem, we give a lower bound on the $p$-adic divisibility of the cardinality of the set of simultaneous zeros $Z(f_1,f_2,\ldots,f_r)$ of $r$ maps $f_j:A\rightarrow B_j$ between arbitrary finite commutative groups $A$ and $B_j$ in terms of the invariant factors of $A, B_1,B_2,\dotsc,B_r$ and the \emph{functional degrees}…

    Submitted 4 July, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: 27 pages

    MSC Class: 20K01; 13F20; 20C05

  23. arXiv:2311.09613  [pdf, other]

    cs.CL cs.AI

    Digital Socrates: Evaluating LLMs through Explanation Critiques

    Authors: Yuling Gu, Oyvind Tafjord, Peter Clark

    Abstract: While LLMs can provide reasoned explanations along with their answers, the nature and quality of those explanations are still poorly understood. In response, our goal is to define a detailed way of characterizing the explanation capabilities of modern models and to create a nuanced, interpretable explanation evaluation tool that can generate such characterizations automatically, without relying on…

    Submitted 16 February, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  24. arXiv:2311.09519  [pdf, other]

    cs.CL

    Leveraging Code to Improve In-context Learning for Semantic Parsing

    Authors: Ben Bogin, Shivanshu Gupta, Peter Clark, Ashish Sabharwal

    Abstract: In-context learning (ICL) is an appealing approach for semantic parsing due to its few-shot nature and improved generalization. However, learning to parse to rare domain-specific languages (DSLs) from just a few demonstrations is challenging, limiting the performance of even the most capable LLMs. In this work, we improve the effectiveness of ICL for semantic parsing by (1) using general-purpose p…

    Submitted 27 March, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL 2024

  25. arXiv:2311.09510  [pdf, other]

    cs.CL

    Tailoring with Targeted Precision: Edit-Based Agents for Open-Domain Procedure Customization

    Authors: Yash Kumar Lal, Li Zhang, Faeze Brahman, Bodhisattwa Prasad Majumder, Peter Clark, Niket Tandon

    Abstract: How-to procedures, such as how to plant a garden, are now used by millions of users, but sometimes need customizing to meet a user's specific needs, e.g., planting a garden without pesticides. Our goal is to measure and improve an LLM's ability to perform such customization. Our approach is to test several simple multi-LLM-agent architectures for customization, as well as an end-to-end LLM, using…

    Submitted 30 May, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Camera ready version accepted to Findings of ACL 2024

  26. arXiv:2311.08602  [pdf, other]

    astro-ph.IM astro-ph.EP physics.ao-ph

    Data downloaded via parachute from a NASA super-pressure balloon

    Authors: Ellen L. Sirks, Richard Massey, Ajay S. Gill, Jason Anderson, Steven J. Benton, Anthony M. Brown, Paul Clark, Joshua English, Spencer W. Everett, Aurelien A. Fraisse, Hugo Franco, John W. Hartley, David Harvey, Bradley Holder, Andrew Hunter, Eric M. Huff, Andrew Hynous, Mathilde Jauzac, William C. Jones, Nikky Joyce, Duncan Kennedy, David Lagattuta, Jason S. -Y. Leung, Lun Li, Stephen Lishman , et al. (18 additional authors not shown)

    Abstract: In April to May 2023, the superBIT telescope was lifted to the Earth's stratosphere by a helium-filled super-pressure balloon, to acquire astronomical imaging from above (99.5% of) the Earth's atmosphere. It was launched from New Zealand then, for 40 days, circumnavigated the globe five times at a latitude 40 to 50 degrees South. Attached to the telescope were four 'DRS' (Data Recovery System) cap…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 12 pages

    Journal ref: Aerospace 2023, 10, 960

  27. arXiv:2311.05772  [pdf, other]

    cs.AI cs.CL cs.LG

    ADaPT: As-Needed Decomposition and Planning with Language Models

    Authors: Archiki Prasad, Alexander Koller, Mareike Hartmann, Peter Clark, Ashish Sabharwal, Mohit Bansal, Tushar Khot

    Abstract: Large Language Models (LLMs) are increasingly being used for interactive decision-making tasks requiring planning and adapting to the environment. Recent works employ LLMs-as-agents in broadly two ways: iteratively determining the next action (iterative executors) or generating plans and executing sub-tasks using LLMs (plan-and-execute). However, these methods struggle with task complexity, as the…

    Submitted 8 April, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: NAACL 2024 (findings) camera-ready. Project Page: https://allenai.github.io/adaptllm

  28. arXiv:2311.04892  [pdf, other]

    cs.CL

    Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

    Authors: Shashank Gupta, Vaishnavi Shrivastava, Ameet Deshpande, Ashwin Kalyan, Peter Clark, Ashish Sabharwal, Tushar Khot

    Abstract: Recent works have showcased the ability of LLMs to embody diverse personas in their responses, exemplified by prompts like 'You are Yoda. Explain the Theory of Relativity.' While this ability allows personalization of LLMs and enables human behavior simulation, its effect on LLMs' capabilities remains unclear. To fill this gap, we present the first extensive study of the unintended side-effects of…

    Submitted 27 January, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Project page: https://allenai.github.io/persona-bias. Paper to appear at ICLR 2024. Added results for other LLMs in v2 (similar findings)

  29. arXiv:2311.02807  [pdf, other]

    cs.LG cs.AI cs.CL

    QualEval: Qualitative Evaluation for Model Improvement

    Authors: Vishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan

    Abstract: Quantitative evaluation metrics have traditionally been pivotal in gauging the advancements of artificial intelligence systems, including large language models (LLMs). However, these metrics have inherent limitations. Given the intricate nature of real-world tasks, a single scalar to quantify and compare is insufficient to capture the fine-grained nuances of model behavior. Metrics serve only as a…

    Submitted 5 May, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  30. arXiv:2310.10730  [pdf, other]

    astro-ph.GA astro-ph.CO astro-ph.SR

    Population III star formation: multiple gas phases prevent the use of an equation of state at high densities

    Authors: Lewis R. Prole, Paul C. Clark, Felix D. Priestley, Simon C. O. Glover, John A. Regan

    Abstract: Advanced primordial chemistry networks have been developed to model the collapse of metal-free baryonic gas within the gravitational well of dark matter (DM) halos and its subsequent collapse into Population III stars. At the low densities of $10^{-26}$-$10^{-21}$ g cm$^{-3}$ ($10^{-3}$-$10^{2}$ cm$^{-3}$) the collapse is dependent on H$_2$ production, which is a function of the compressional heating provided by the DM potenti…

    Submitted 19 January, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted in the Open Journal of Astrophysics

  31. arXiv:2310.10134  [pdf, other]

    cs.CL cs.AI cs.LG

    CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization

    Authors: Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Peter Jansen, Oyvind Tafjord, Niket Tandon, Li Zhang, Chris Callison-Burch, Peter Clark

    Abstract: Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning. However, despite their zero-shot capabilities, these agents to date do not continually improve over time beyond performance refinement on a specific task. Here we present C…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Project page: https://allenai.github.io/clin/

  32. arXiv:2310.06037  [pdf, other]

    astro-ph.GA

    NEATH II: N$_2$H$^+$ as a tracer of imminent star formation in quiescent high-density gas

    Authors: F. D. Priestley, P. C. Clark, S. C. O. Glover, S. E. Ragan, O. Fehér, L. R. Prole, R. S. Klessen

    Abstract: Star formation activity in molecular clouds is often found to be correlated with the amount of material above a column density threshold of $\sim 10^{22} \, {\rm cm^{-2}}$. Attempts to connect this column density threshold to a ${\it volume}$ density above which star formation can occur are limited by the fact that the volume density of gas is difficult to reliably measure from observations. We po…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: 10 pages, 10 figures. MNRAS accepted

  33. arXiv:2309.11340  [pdf, other]

    astro-ph.HE astro-ph.SR

    GW190425: Pan-STARRS and ATLAS coverage of the skymap and limits on optical emission associated with FRB190425

    Authors: S. J. Smartt, M. Nicholl, S. Srivastav, M. E. Huber, K. C. Chambers, K. W. Smith, D. R. Young, M. D. Fulton, J. L. Tonry, C. W. Stubbs, L. Denneau, A. J. Cooper, A. Aamer, J. P. Anderson, A. Andersson, J. Bulger, T. -W Chen, P. Clark, T. de Boer, H. Gao, J. H. Gillanders, A. Lawrence, C. C. Lin, T. B. Lowe, E. A. Magnier , et al. (10 additional authors not shown)

    Abstract: GW190425 is the second of only two binary neutron star (BNS) merger events to be significantly detected by the LIGO-Virgo-Kagra gravitational wave detectors. With a detection only in LIGO Livingston, the skymap containing the source was large and no plausible electromagnetic counterpart was found in real time searching in 2019. Here we summarise our ATLAS and Pan-STARRS wide-field optical coverag…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Submitted to MNRAS, 20th Sept 2023, 9 pages

  34. Non-Equilibrium Abundances Treated Holistically (NEATH): the molecular composition of star-forming clouds

    Authors: F. D. Priestley, P. C. Clark, S. C. O. Glover, S. E. Ragan, O. Fehér, L. R. Prole, R. S. Klessen

    Abstract: Much of what we know about molecular clouds, and by extension star formation, comes from molecular line observations. Interpreting these correctly requires knowledge of the underlying molecular abundances. Simulations of molecular clouds typically only model species that are important for the gas thermodynamics, which tend to be poor tracers of the denser material where stars form. We construct a…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 14 pages, 13 figures. MNRAS accepted

  35. arXiv:2307.03295  [pdf, other]

    astro-ph.IM astro-ph.GA

    Lensing in the Blue II: Estimating the Sensitivity of Stratospheric Balloons to Weak Gravitational Lensing

    Authors: Jacqueline E. McCleary, Spencer W. Everett, Mohamed M. Shaaban, Ajay S. Gill, Georgios N. Vassilakis, Eric M. Huff, Richard J. Massey, Steven J. Benton, Anthony M. Brown, Paul Clark, Bradley Holder, Aurelien A. Fraisse, Mathilde Jauzac, William C. Jones, David Lagattuta, Jason S. -Y. Leung, Lun Li, Thuy Vy T. Luu, Johanna M. Nagy, C. Barth Netterfield, Emaad Paracha, Susan F. Redmond, Jason D. Rhodes, Jürgen Schmoll, Ellen Sirks , et al. (1 additional author not shown)

    Abstract: The Superpressure Balloon-borne Imaging Telescope (SuperBIT) is a diffraction-limited, wide-field, 0.5 m, near-infrared to near-ultraviolet observatory designed to exploit the stratosphere's space-like conditions. SuperBIT's 2023 science flight will deliver deep, blue imaging of galaxy clusters for gravitational lensing analysis. In preparation, we have developed a weak lensing measurement pipelin…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: Submitted to Astronomical Journal

  36. Long-term follow-up observations of extreme coronal line emitting galaxies

    Authors: Peter Clark, Or Graur, Joseph Callow, Jessica Aguilar, Steven Ahlen, Joseph P. Anderson, Edo Berger, Thomas Brink, David Brooks, Ting-Wan Chen, Todd Claybaugh, Axel de la Macorra, Peter Doel, Alexei Filippenko, Jamie Forero-Romero, Sebastian Gomez, Mariusz Gromadzki, Klaus Honscheid, Cosimo Inserra, Theodore Kisner, Martin Landriau, Lydia Makrygianni, Marc Manera, Aaron Meisner, Ramon Miquel , et al. (18 additional authors not shown)

    Abstract: We present new spectroscopic and photometric follow-up observations of the known sample of extreme coronal line emitting galaxies (ECLEs) identified in the Sloan Digital Sky Survey (SDSS). With these new data, observations of the ECLE sample now span a period of two decades following their initial SDSS detections. We confirm the nonrecurrence of the iron coronal line signatures in five of the seve…

    Submitted 4 March, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: This is a pre-copyedited, author-produced PDF of an article accepted for publication in Monthly Notices of the Royal Astronomical Society following peer review. Note the corrected caption of Figure 1 continued, which in this version correctly refers to 'SDSS J124' rather than the erroneous 'SDSS J1341' in the published version. 29 Pages, 14 Figures

    Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 528, Issue 4, March 2024, Pages 7076-7102

  37. arXiv:2305.14596  [pdf, other]

    cs.CL cs.LG

    Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy

    Authors: Sarah Wiegreffe, Matthew Finlayson, Oyvind Tafjord, Peter Clark, Ashish Sabharwal

    Abstract: When pretrained language models (LMs) are applied to discriminative tasks such as multiple-choice questions, they place probability mass on vocabulary tokens that aren't among the given answer choices. Spreading probability mass across multiple surface forms with identical meaning (such as "bath" and "bathtub") is thought to cause an underestimation of a model's true performance, referred to as th…

    Submitted 31 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023

  38. arXiv:2305.14386  [pdf, other]

    cs.LG cs.AI cs.CL

    Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation

    Authors: Zhenwen Liang, Wenhao Yu, Tanmay Rajpurohit, Peter Clark, Xiangliang Zhang, Ashwin Kaylan

    Abstract: In this paper, we present a novel approach for distilling math word problem solving capabilities from large language models (LLMs) into smaller, more efficient student models. Our approach is designed to consider the student model's weaknesses and foster a tailored learning experience by generating targeted exercises aligned with educational science principles, such as knowledge tracing and person…

    Submitted 22 May, 2023; originally announced May 2023.

  39. arXiv:2305.14250  [pdf, other]

    cs.CL cs.AI

    Language Models with Rationality

    Authors: Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schuetze, Peter Clark

    Abstract: While large language models (LLMs) are proficient at question-answering (QA), it is not always clear how (or even if) an answer follows from their latent "beliefs". This lack of interpretability is a growing impediment to widespread use of LLMs. To address this, our goals are to make model beliefs and their inferential relationships explicit, and to resolve inconsistencies that may exist, so that…

    Submitted 29 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  40. arXiv:2305.14010  [pdf, other]

    cs.CL

    IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions

    Authors: Wenhao Yu, Meng Jiang, Peter Clark, Ashish Sabharwal

    Abstract: Although counterfactual reasoning is a fundamental aspect of intelligence, the lack of large-scale counterfactual open-domain question-answering (QA) benchmarks makes it difficult to evaluate and improve models on this ability. To address this void, we introduce the first such dataset, named IfQA, where each question is based on a counterfactual presupposition via an "if" clause. For example, if L…

    Submitted 23 May, 2023; originally announced May 2023.

  41. arXiv:2305.08844  [pdf, other]

    cs.CL

    RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs

    Authors: Afra Feyza Akyürek, Ekin Akyürek, Aman Madaan, Ashwin Kalyan, Peter Clark, Derry Wijaya, Niket Tandon

    Abstract: Despite their unprecedented success, even the largest language models make mistakes. Similar to how humans learn and improve using feedback, previous work proposed providing language models with natural language feedback to guide them in repairing their outputs. Because human-generated critiques are expensive to obtain, researchers have devised learned critique generators in lieu of human critics…

    Submitted 11 July, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  42. arXiv:2305.01304  [pdf, ps, other]

    math.GR math.AC

    Functional degrees and arithmetic applications II: The Group-Theoretic Prime Ax-Katz Theorem

    Authors: Pete L. Clark, Uwe Schauz

    Abstract: We give a version of Ax-Katz's $p$-adic congruences and Moreno-Moreno's $p$-weight refinement that holds over any finite commutative ring of prime characteristic. We deduce this from a purely group-theoretic result that gives a lower bound on the $p$-adic divisibility of the number of simultaneous zeros of a system of maps $f_j: A\to B_j$ from a fixed ``source'' finite commutative group $A$ of exp…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: 29 pages

    MSC Class: 20K01; 13F20; 20C05

  43. arXiv:2304.07399  [pdf, ps, other]

    math.NT

    Densities of integer sets represented by quadratic forms

    Authors: Pete L. Clark, Paul Pollack, Jeremy Rouse, Katherine Thompson

    Abstract: Let $f(t_1,\ldots,t_n)$ be a nondegenerate integral quadratic form. We analyze the asymptotic behavior of the function $D_f(X)$, the number of integers of absolute value up to $X$ represented by $f$. When $f$ is isotropic or $n$ is at least $3$, we show that there is a $δ(f) \in \mathbb{Q} \cap (0,1)$ such that $D_f(X) \sim δ(f) X$ and call $δ(f)$ the density of $f$. We consider the inverse proble…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: 25 pages

    MSC Class: Primary 11E12; Secondary 11E20

  44. arXiv:2303.17651  [pdf, other]

    cs.CL cs.AI cs.LG

    Self-Refine: Iterative Refinement with Self-Feedback

    Authors: Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark

    Abstract: Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; then, the same LLM provides feedback for its output and uses it…

    Submitted 25 May, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: Code, data, and demo at https://selfrefine.info/

  45. Multiwavelength observations of the extraordinary accretion event AT2021lwx

    Authors: P. Wiseman, Y. Wang, S. Hönig, N. Castro-Segura, P. Clark, C. Frohmaier, M. D. Fulton, G. Leloudas, M. Middleton, T. E. Müller-Bravo, A. Mummery, M. Pursiainen, S. J. Smartt, K. Smith, M. Sullivan, J. P. Anderson, J. A. Acosta Pulido, P. Charalampopoulos, M. Banerji, M. Dennefeld, L. Galbany, M. Gromadzki, C. P. Gutiérrez, N. Ihanec, E. Kankare, et al. (21 additional authors not shown)

    Abstract: We present observations from X-ray to mid-infrared wavelengths of the most energetic non-quasar transient ever observed, AT2021lwx. Our data show a single optical brightening by a factor $>100$ to a luminosity of $7\times10^{45}$ erg s$^{-1}$, and a total radiated energy of $1.5\times10^{53}$ erg, both greater than any known optical transient. The decline is smooth and exponential and the ultra-vi…

    Submitted 31 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: 11 pages, 5 figures, Accepted for publication in MNRAS

  46. Do simulated molecular clouds look like real ones?

    Authors: F. D. Priestley, P. C. Clark, A. P. Whitworth

    Abstract: Simulations of molecular clouds often begin from highly idealised initial conditions, such as a uniform-density sphere with an artificially imposed turbulent velocity field. While the resulting structures may appear qualitatively similar to those detected in continuum and line observations, it is unclear whether they are genuinely representative of real molecular clouds. Recent observational work…

    Submitted 12 January, 2023; originally announced January 2023.

    Comments: 10 pages, 10 figures. MNRAS accepted. Data publicly available at http://cloudzoo.astro.cf.ac.uk/Downloads/202211/

  47. arXiv:2301.00828  [pdf, other]

    astro-ph.GA astro-ph.CO astro-ph.SR

    From dark matter halos to pre-stellar cores: High resolution follow-up of cosmological Lyman-Werner simulations

    Authors: Lewis R. Prole, Anna T. P. Schauer, Paul C. Clark, Simon C. O. Glover, Felix D. Priestley, Ralf S. Klessen

    Abstract: Molecular hydrogen allows cooling in primordial gas, facilitating its collapse into Population III stars within primordial halos. Lyman-Werner (LW) radiation from these stars can escape the halo and delay further star formation by destroying H$_2$ in other halos. As cosmological simulations show that increasing the background LW field strength increases the average halo mass required for star form…

    Submitted 19 January, 2023; v1 submitted 2 January, 2023; originally announced January 2023.

    Comments: MNRAS Accepted 2023 January 16 ref. MN-22-5075-MJ.R2

  48. arXiv:2212.13327  [pdf, other]

    math.NT

    CM Elliptic Curves: Volcanoes, Reality and Applications, Part II

    Authors: Pete L. Clark, Frederick Saia

    Abstract: Let $M \mid N$ be positive integers, and let $Δ$ be the discriminant of an order in an imaginary quadratic field $K$. When $Δ_K < -4$, the first author determined the fiber of the morphism $X_0(M,N) \rightarrow X(1)$ over the closed point $J_Δ$ corresponding to $Δ$ and showed that all fibers of the map $X_1(M,N) \rightarrow X_0(M,N)$ over $J_Δ$ were connected. Here we complement this prior work by…

    Submitted 26 December, 2022; originally announced December 2022.

    Comments: 37 pages

  49. arXiv:2212.13316  [pdf, other]

    math.NT

    CM Elliptic Curves: Volcanoes, Reality and Applications

    Authors: Pete L. Clark

    Abstract: For positive integers $M \mid N$ and an order of discriminant $Δ$ in an imaginary quadratic field $K$ with discriminant $Δ_K < -4$, we determine the fiber of the morphism $X_0(M,N) \rightarrow X(1)$ over the closed point $J_Δ$ corresponding to $Δ$. We also show that the fiber of the natural map $X_1(M,N) \rightarrow X_0(M,N)$ over $J_Δ$ is connected. Putting this together we deduce the number of p…

    Submitted 26 December, 2022; originally announced December 2022.

    Comments: 105 pages

  50. arXiv:2212.10029  [pdf, other]

    cs.CL cs.AI

    Do language models have coherent mental models of everyday things?

    Authors: Yuling Gu, Bhavana Dalvi Mishra, Peter Clark

    Abstract: When people think of everyday things like an egg, they typically have a mental image associated with it. This allows them to correctly judge, for example, that "the yolk surrounds the shell" is a false statement. Do language models similarly have a coherent picture of such everyday things? To investigate this, we propose a benchmark dataset consisting of 100 everyday things, their parts, and the r…

    Submitted 8 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023