
Showing 1–12 of 12 results for author: Dugan, L

  1. arXiv:2405.07940 [pdf, other]

    cs.CL

    RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors

    Authors: Liam Dugan, Alyssa Hwang, Filip Trhlik, Josh Magnus Ludan, Andrew Zhu, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch

    Abstract: Many commercial and open-source models claim to detect machine-generated text with extremely high accuracy (99% or more). However, very few of these detectors are evaluated on shared benchmark datasets and even when they are, the datasets used for evaluation are insufficiently challenging, lacking variations in sampling strategy, adversarial attacks, and open-source generative models. In this work…

    Submitted 10 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: ACL 2024

    ACM Class: I.2.7

  2. arXiv:2402.14116 [pdf, other]

    cs.CL cs.AI

    FanOutQA: A Multi-Hop, Multi-Document Question Answering Benchmark for Large Language Models

    Authors: Andrew Zhu, Alyssa Hwang, Liam Dugan, Chris Callison-Burch

    Abstract: One type of question that is commonly found in day-to-day scenarios is "fan-out" questions, complex multi-hop, multi-document reasoning questions that require finding information about a large number of entities. However, there exist few resources to evaluate this type of question-answering capability among large language models. To evaluate complex reasoning in LLMs more fully, we present FanOu…

    Submitted 6 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: 18 pages, 2 figures. ACL 2024

  3. arXiv:2310.19660 [pdf, other]

    cs.CL

    Interpretable-by-Design Text Understanding with Iteratively Generated Concept Bottleneck

    Authors: Josh Magnus Ludan, Qing Lyu, Yue Yang, Liam Dugan, Mark Yatskar, Chris Callison-Burch

    Abstract: Black-box deep neural networks excel in text classification, yet their application in high-stakes domains is hindered by their lack of interpretability. To address this, we propose Text Bottleneck Models (TBM), an intrinsically interpretable text classification framework that offers both global and local explanations. Rather than directly predicting the output label, TBM predicts categorical value…

    Submitted 3 April, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

  4. arXiv:2309.05542 [pdf, other]

    cs.SE cs.AI cs.CL

    Kani: A Lightweight and Highly Hackable Framework for Building Language Model Applications

    Authors: Andrew Zhu, Liam Dugan, Alyssa Hwang, Chris Callison-Burch

    Abstract: Language model applications are becoming increasingly popular and complex, often including features like tool usage and retrieval augmentation. However, existing frameworks for such applications are often opinionated, deciding for developers how their prompts ought to be formatted and imposing limitations on customizability and reproducibility. To solve this we present Kani: a lightweight, flexibl…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: In submission to NLP-OSS

    ACM Class: I.2.7

  5. arXiv:2306.01201 [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models

    Authors: Liam Dugan, Anshul Wadhawan, Kyle Spence, Chris Callison-Burch, Morgan McGuire, Victor Zordan

    Abstract: Recent work in speech-to-speech translation (S2ST) has focused primarily on offline settings, where the full input utterance is available before any output is given. This, however, is not reasonable in many real-world scenarios. In latency-sensitive applications, rather than waiting for the full utterance, translations should be spoken as soon as the information in the input is present. In this wo…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: To appear at INTERSPEECH 2023

  6. arXiv:2304.13250 [pdf, other]

    cs.CL

    Exploring the Curious Case of Code Prompts

    Authors: Li Zhang, Liam Dugan, Hainiu Xu, Chris Callison-Burch

    Abstract: Recent work has shown that prompting language models with code-like representations of natural language leads to performance improvements on structured reasoning tasks. However, such tasks comprise only a small subset of all natural language tasks. In our work, we seek to answer whether or not code-prompting is the preferred way of interacting with language models in general. We compare code and t…

    Submitted 25 April, 2023; originally announced April 2023.

  7. arXiv:2212.12672 [pdf, other]

    cs.CL cs.AI cs.HC

    Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text

    Authors: Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-Burch

    Abstract: As text generated by large language models proliferates, it becomes vital to understand how humans engage with such text, and whether or not they are able to detect when the text they are reading did not originate with a human writer. Prior work on human detection of generated text focuses on the case where an entire passage is either human-written or machine-generated. In this paper, we study a m…

    Submitted 24 December, 2022; originally announced December 2022.

    Comments: AAAI 2023 Long Paper. Code is available at https://github.com/liamdugan/human-detection

    ACM Class: I.2.7

  8. arXiv:2206.04812 [pdf, other]

    cs.CL

    The Case for a Single Model that can Both Generate Continuations and Fill in the Blank

    Authors: Daphne Ippolito, Liam Dugan, Emily Reif, Ann Yuan, Andy Coenen, Chris Callison-Burch

    Abstract: The task of inserting text into a specified position in a passage, known as fill in the blank (FitB), is useful for a variety of applications where writers interact with a natural language generation (NLG) system to craft text. While previous work has tackled this problem with models trained specifically to do the fill-in-the-blank task, a more useful model is one that can effectively perform _bot…

    Submitted 30 June, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: This version: fixed bug in the headers of Table 2

    Journal ref: NAACL 2022 Findings

  9. arXiv:2206.04615 [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  10. arXiv:2203.08685 [pdf, other]

    cs.CL cs.AI cs.HC

    A Feasibility Study of Answer-Agnostic Question Generation for Education

    Authors: Liam Dugan, Eleni Miltsakaki, Shriyash Upadhyay, Etan Ginsberg, Hannah Gonzalez, Dayheon Choi, Chuning Yuan, Chris Callison-Burch

    Abstract: We conduct a feasibility study into the applicability of answer-agnostic question generation models to textbook passages. We show that a significant portion of errors in such systems arise from asking irrelevant or uninterpretable questions and that such errors can be ameliorated by providing summarized input. We find that giving these models human-written summaries instead of the original text re…

    Submitted 29 March, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: To be published in 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022)

    ACM Class: I.2.7

  11. arXiv:2010.03070 [pdf, other]

    cs.CL cs.AI cs.HC

    RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text

    Authors: Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Chris Callison-Burch

    Abstract: In recent years, large neural networks for natural language generation (NLG) have made leaps and bounds in their ability to generate fluent text. However, the tasks of evaluating quality differences between NLG systems and understanding how humans perceive the generated text remain both crucial and difficult. In this system demonstration, we present Real or Fake Text (RoFT), a website that tackles…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: To be published in Annual Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)

    ACM Class: I.2.7

  12. Cloud Chaser: Real Time Deep Learning Computer Vision on Low Computing Power Devices

    Authors: Zhengyi Luo, Austin Small, Liam Dugan, Stephen Lane

    Abstract: Internet of Things (IoT) devices, mobile phones, and robotic systems are often denied the power of deep learning algorithms due to their limited computing power. However, to provide time-critical services such as emergency response, home assistance, surveillance, etc., these devices often need real-time analysis of their camera data. This paper strives to offer a viable approach to integrate high-pe…

    Submitted 8 November, 2020; v1 submitted 2 October, 2018; originally announced October 2018.

    Comments: Accepted to The 11th International Conference on Machine Vision (ICMV 2018). Project site: https://zhengyiluo.github.io/projects/cloudchaser/