Skip to main content

Showing 1–4 of 4 results for author: Dunagan, L

.
  1. arXiv:2311.09718  [pdf, other

    cs.CL cs.AI

    You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments

    Authors: Bangzhao Shu, Lechen Zhang, Minje Choi, Lavinia Dunagan, Lajanugen Logeswaran, Moontae Lee, Dallas Card, David Jurgens

    Abstract: The versatility of Large Language Models (LLMs) on natural language understanding tasks has made them popular for research in social sciences. To properly understand the properties and innate personas of LLMs, researchers have performed studies that involve using prompts in the form of questions that ask LLMs about particular opinions. In this study, we take a cautionary step back and examine whet… ▽ More

    Submitted 1 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Camera-ready version for NAACL 2024. First two authors contributed equally

  2. arXiv:2307.02758  [pdf, other

    cs.CL

    Exploring Linguistic Style Matching in Online Communities: The Role of Social Context and Conversation Dynamics

    Authors: Aparna Ananthasubramaniam, Hong Chen, Jason Yan, Kenan Alkiek, Jiaxin Pei, Agrima Seth, Lavinia Dunagan, Minje Choi, Benjamin Litterer, David Jurgens

    Abstract: Linguistic style matching (LSM) in conversations can be reflective of several aspects of social influence such as power or persuasion. However, how LSM relates to the outcomes of online communication on platforms such as Reddit is an unknown question. In this study, we analyze a large corpus of two-party conversation threads in Reddit where we identify all occurrences of LSM using two types of sty… ▽ More

    Submitted 26 August, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: Equal contributions from authors 1-9 (AA, HC, JY, KA, JP, AS, LD, MC, BL)

  3. arXiv:2112.04139  [pdf, other

    cs.CL

    Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Morrison, Alexander R. Fabbri, Ye** Choi, Noah A. Smith

    Abstract: Natural language processing researchers have identified limitations of evaluation methodology for generation tasks, with new questions raised about the validity of automatic metrics and of crowdworker judgments. Meanwhile, efforts to improve generation models tend to depend on simple n-gram overlap metrics (e.g., BLEU, ROUGE). We argue that new advances on models and metrics should each more direc… ▽ More

    Submitted 18 May, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Proc. of NAACL 2022

  4. arXiv:2111.08940  [pdf, other

    cs.CL cs.CV

    Transparent Human Evaluation for Image Captioning

    Authors: Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Ye** Choi, Noah A. Smith

    Abstract: We establish THumB, a rubric-based human evaluation protocol for image captioning models. Our scoring rubrics and their definitions are carefully developed based on machine- and human-generated captions on the MSCOCO dataset. Each caption is evaluated along two main dimensions in a tradeoff (precision and recall) as well as other aspects that measure the text quality (fluency, conciseness, and inc… ▽ More

    Submitted 18 May, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: Proc. of NAACL 2022