
Showing 1–21 of 21 results for author: Dasigi, P

Searching in archive cs.
  1. arXiv:2402.00838  [pdf, other]

    cs.CL

    OLMo: Accelerating the Science of Language Models

    Authors: Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, et al. (18 additional authors not shown)

    Abstract: Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models…

    Submitted 7 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  2. arXiv:2311.10702  [pdf, other]

    cs.CL

    Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

    Authors: Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

    Abstract: Since the release of TÜLU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques. We test and incorporate a number of these advances into TÜLU, resulting in TÜLU 2, a suite of improved TÜLU models for advancing the understanding and best practices of adapting pretrained language models to downstream tasks and user pr…

    Submitted 19 November, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: technical report; fixed zephyr numbers

  3. arXiv:2311.09635  [pdf, other]

    cs.CL

    Evaluating In-Context Learning of Libraries for Code Generation

    Authors: Arkil Patel, Siva Reddy, Dzmitry Bahdanau, Pradeep Dasigi

    Abstract: Contemporary Large Language Models (LLMs) exhibit a high degree of code generation and comprehension capability. A particularly promising area is their ability to interpret code modules from unfamiliar libraries for solving user-instructed tasks. Recent work has shown that large proprietary LLMs can learn novel library usage in-context from demonstrations. These results raise several open question…

    Submitted 4 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  4. arXiv:2310.03646  [pdf, other]

    cs.LG cs.CL

    TRAM: Bridging Trust Regions and Sharpness Aware Minimization

    Authors: Tom Sherborne, Naomi Saphra, Pradeep Dasigi, Hao Peng

    Abstract: Sharpness-aware minimization (SAM) reports improving domain generalization by reducing the loss surface curvature in the parameter space. However, generalization during fine-tuning is often more dependent on the transferability of representations in the function space. Trust-region methods (TR) target this goal by regularizing representation curvature to reduce catastrophic forgetting of pre-train…

    Submitted 12 March, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: Camera Ready for ICLR 2024 (Accepted as Spotlight). 21 pages, 14 tables, 2 figures

  5. arXiv:2306.04751  [pdf, other]

    cs.CL

    How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

    Authors: Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

    Abstract: In this work we explore recent advances in instruction-tuning language models on a range of open instruction-following datasets. Despite recent claims that open models can be on par with state-of-the-art proprietary models, these claims are often accompanied by limited evaluation, making it difficult to compare models across the board and determine the utility of various resources. We provide a la…

    Submitted 30 October, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 18 pages, 6 figures, 10 tables. NeurIPS 2023 Datasets and Benchmarks Track Camera Ready

  6. arXiv:2305.11744  [pdf, other]

    cs.IR cs.CL

    ReFIT: Relevance Feedback from a Reranker during Inference

    Authors: Revanth Gangi Reddy, Pradeep Dasigi, Md Arafat Sultan, Arman Cohan, Avirup Sil, Heng Ji, Hannaneh Hajishirzi

    Abstract: Retrieve-and-rerank is a prevalent framework in neural information retrieval, wherein a bi-encoder network initially retrieves a pre-defined number of candidates (e.g., K=100), which are then reranked by a more powerful cross-encoder model. While the reranker often yields improved candidate scores compared to the retriever, its scope is confined to only the top K retrieved candidates. As a result,…

    Submitted 28 May, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Preprint

  7. arXiv:2301.13298  [pdf, other]

    cs.CL

    LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization

    Authors: Kalpesh Krishna, Erin Bransom, Bailey Kuehl, Mohit Iyyer, Pradeep Dasigi, Arman Cohan, Kyle Lo

    Abstract: While human evaluation remains best practice for accurately judging the faithfulness of automatically-generated summaries, few solutions exist to address the increased difficulty and workload when evaluating long-form summaries. Through a survey of 162 papers on long-form summarization, we first shed light on current human evaluation practices surrounding long-form summaries. We find that 73% of t…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: EACL 2023 camera ready. Code and data can be found at https://github.com/martiansideofthemoon/longeval-summarization

  8. arXiv:2212.00921  [pdf, other]

    cs.LG cs.AI cs.CL

    AGRO: Adversarial Discovery of Error-prone groups for Robust Optimization

    Authors: Bhargavi Paranjape, Pradeep Dasigi, Vivek Srikumar, Luke Zettlemoyer, Hannaneh Hajishirzi

    Abstract: Models trained via empirical risk minimization (ERM) are known to rely on spurious correlations between labels and task-independent input features, resulting in poor generalization to distributional shifts. Group distributionally robust optimization (G-DRO) can alleviate this problem by minimizing the worst-case loss over a set of pre-defined groups over training data. G-DRO successfully improves…

    Submitted 8 December, 2022; v1 submitted 1 December, 2022; originally announced December 2022.

  9. arXiv:2212.00196  [pdf, other]

    cs.CL

    Data-Efficient Finetuning Using Cross-Task Nearest Neighbors

    Authors: Hamish Ivison, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi

    Abstract: Obtaining labeled data to train a model for a task of interest is often expensive. Prior work shows training models on multitask data augmented with task descriptions (prompts) effectively transfers knowledge to new tasks. Towards efficiently building task-specific models, we assume access to a small number (32-1000) of unlabeled target-task examples and use those to retrieve the most similar labe…

    Submitted 24 May, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: Findings of ACL 2023

  10. arXiv:2203.12942  [pdf, other]

    cs.CL cs.AI cs.CY

    Generating Data to Mitigate Spurious Correlations in Natural Language Inference Datasets

    Authors: Yuxiang Wu, Matt Gardner, Pontus Stenetorp, Pradeep Dasigi

    Abstract: Natural language processing models often exploit spurious correlations between task-independent features and labels in datasets to perform well only within the distributions they are trained on, while not generalising to different task distributions. We propose to tackle this problem by generating a debiased version of a dataset, which can then be used to train a debiased, off-the-shelf model, by…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022 main conference

  11. arXiv:2105.03011  [pdf, other]

    cs.CL

    A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers

    Authors: Pradeep Dasigi, Kyle Lo, Iz Beltagy, Arman Cohan, Noah A. Smith, Matt Gardner

    Abstract: Readers of academic research papers often read with the goal of answering specific questions. Question Answering systems that can answer those questions can make consumption of the content much more efficient. However, building such tools requires data that reflect the difficulty of the task arising from complex reasoning about claims made in multiple parts of a paper. In contrast, existing inform…

    Submitted 6 May, 2021; originally announced May 2021.

    Comments: Accepted at NAACL 2021; Project page: https://allenai.org/project/qasper

  12. arXiv:2104.08735  [pdf, other]

    cs.CL

    Learning with Instance Bundles for Reading Comprehension

    Authors: Dheeru Dua, Pradeep Dasigi, Sameer Singh, Matt Gardner

    Abstract: When training most modern reading comprehension models, all the questions associated with a context are treated as being independent from each other. However, closely related questions and their corresponding answers are not independent, and leveraging these relationships could provide a strong supervision signal to a model. Drawing on ideas from contrastive estimation, we introduce several new su…

    Submitted 18 April, 2021; originally announced April 2021.

  13. arXiv:2103.12235  [pdf, other]

    cs.CL

    Mitigating False-Negative Contexts in Multi-document Question Answering with Retrieval Marginalization

    Authors: Ansong Ni, Matt Gardner, Pradeep Dasigi

    Abstract: Question Answering (QA) tasks requiring information from multiple documents often rely on a retrieval model to identify relevant information for reasoning. The retrieval model is typically trained to maximize the likelihood of the labeled supporting evidence. However, when retrieving from large text corpora such as Wikipedia, the correct answer can often be obtained from multiple evidence candidat…

    Submitted 8 September, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

    Comments: Accepted to EMNLP 2021 (main conference)

  14. arXiv:2011.07127  [pdf, other]

    cs.CL

    IIRC: A Dataset of Incomplete Information Reading Comprehension Questions

    Authors: James Ferguson, Matt Gardner, Hannaneh Hajishirzi, Tushar Khot, Pradeep Dasigi

    Abstract: Humans often have to read multiple documents to address their information needs. However, most existing reading comprehension (RC) tasks only focus on questions for which the contexts provide all the information required to answer them, thus not evaluating a system's performance at identifying a potential lack of sufficient information and locating sources for that information. To fill this gap, w…

    Submitted 13 November, 2020; originally announced November 2020.

    Comments: EMNLP 2020

  15. arXiv:2010.06694  [pdf, other]

    cs.HC

    Easy, Reproducible and Quality-Controlled Data Collection with Crowdaq

    Authors: Qiang Ning, Hao Wu, Pradeep Dasigi, Dheeru Dua, Matt Gardner, Robert L. Logan IV, Ana Marasovic, Zhen Nie

    Abstract: High-quality and large-scale data are key to success for AI systems. However, large-scale data annotation efforts are often confronted with a set of common challenges: (1) designing a user-friendly annotation interface; (2) training enough annotators efficiently; and (3) reproducibility. To address these problems, we introduce Crowdaq, an open-source platform that standardizes the data collection…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted to the demo track of EMNLP 2020

  16. arXiv:2004.02709  [pdf, other]

    cs.CL

    Evaluating Models' Local Decision Boundaries via Contrast Sets

    Authors: Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, et al. (1 additional author not shown)

    Abstract: Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities. We propose a new annotation paradigm for NLP that helps to close systemati…

    Submitted 1 October, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

  17. arXiv:1908.05803  [pdf, other]

    cs.CL

    Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning

    Authors: Pradeep Dasigi, Nelson F. Liu, Ana Marasović, Noah A. Smith, Matt Gardner

    Abstract: Machine comprehension of texts longer than a single sentence often requires coreference resolution. However, most current reading comprehension benchmarks do not contain complex coreferential phenomena and hence fail to evaluate the ability of models to resolve coreference. We present a new crowdsourced dataset containing more than 24K span-selection questions that require resolving coreference am…

    Submitted 4 September, 2019; v1 submitted 15 August, 2019; originally announced August 2019.

    Comments: 8 pages including appendix; EMNLP 2019 accepted paper camera ready version

  18. arXiv:1903.00161  [pdf, other]

    cs.CL

    DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

    Authors: Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, Matt Gardner

    Abstract: Reading comprehension has recently seen rapid progress, with systems matching humans on the most popular datasets for the task. However, a large body of work has highlighted the brittleness of these systems, showing that there is much work left to be done. We introduce a new English reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs. In this cro…

    Submitted 16 April, 2019; v1 submitted 1 March, 2019; originally announced March 2019.

  19. arXiv:1803.07640  [pdf, ps, other]

    cs.CL

    AllenNLP: A Deep Semantic Natural Language Processing Platform

    Authors: Matt Gardner, Joel Grus, Mark Neumann, Oyvind Tafjord, Pradeep Dasigi, Nelson Liu, Matthew Peters, Michael Schmitz, Luke Zettlemoyer

    Abstract: This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily. It is built on top of PyTorch, allowing for dynamic computation graphs, and provides (1) a flexible data API that handles intelligent batching and padding, (2) high-le…

    Submitted 31 May, 2018; v1 submitted 20 March, 2018; originally announced March 2018.

    Comments: Describes the initial version of AllenNLP. Many features and models have been added since the first release. This is the paper to cite if you use AllenNLP in your research. Updated 5/31/2018 with version accepted to the NLP OSS workshop held at ACL 2018

  20. arXiv:1705.02925  [pdf, other]

    cs.CL

    Ontology-Aware Token Embeddings for Prepositional Phrase Attachment

    Authors: Pradeep Dasigi, Waleed Ammar, Chris Dyer, Eduard Hovy

    Abstract: Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language. Instead, we embed semantic concepts (or synsets) as defined in WordNet and represent a word token in a particular context by estimating a distribution over relevant semantic concepts. We use the new, context-sensitive embeddi…

    Submitted 8 May, 2017; originally announced May 2017.

    Comments: ACL 2017

  21. arXiv:1702.05398  [pdf, other]

    cs.CL

    Experiment Segmentation in Scientific Discourse as Clause-level Structured Prediction using Recurrent Neural Networks

    Authors: Pradeep Dasigi, Gully A. P. C. Burns, Eduard Hovy, Anita de Waard

    Abstract: We propose a deep learning model for identifying structure within experiment narratives in scientific literature. We take a sequence labeling approach to this problem, and label clauses within experiment narratives to identify the different parts of the experiment. Our dataset consists of paragraphs taken from open access PubMed papers labeled with rhetorical information as a result of our pilot a…

    Submitted 17 February, 2017; originally announced February 2017.