Showing 1–16 of 16 results for author: Garimella, A

  1. arXiv:2406.16152  [pdf, other]

    cs.CL

    Towards Region-aware Bias Evaluation Metrics

    Authors: Angana Borah, Aparna Garimella, Rada Mihalcea

    Abstract: When exposed to human-generated data, language models are known to learn and amplify societal biases. While previous works introduced benchmarks that can be used to assess the bias in these models, they rely on assumptions that may not be universally true. For instance, a gender bias dimension commonly used by these metrics is that of family--career, but this may not be the only common bias in cer…

    Submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.14829  [pdf, other]

    cs.CL

    Is this a bad table? A Closer Look at the Evaluation of Table Generation from Text

    Authors: Pritika Ramu, Aparna Garimella, Sambaran Bandyopadhyay

    Abstract: Understanding whether a generated table is of good quality is important to be able to use it in creating or editing documents using automatic methods. In this work, we underline that existing measures for table quality evaluation fail to capture the overall semantics of the tables, and sometimes unfairly penalize good tables and reward bad ones. We propose TabEval, a novel table evaluation strateg…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2405.13095  [pdf, other]

    cs.CL cs.AI

    Presentations are not always linear! GNN meets LLM for Document-to-Presentation Transformation with Attribution

    Authors: Himanshu Maheshwari, Sambaran Bandyopadhyay, Aparna Garimella, Anandhavelu Natarajan

    Abstract: Automatically generating a presentation from the text of a long document is a challenging and useful problem. In contrast to a flat summary, a presentation needs to have a better and non-linear narrative, i.e., the content of a slide can come from different and non-contiguous parts of the given document. However, it is difficult to incorporate such non-linear mapping of content to slides and ensur…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: This paper is under review in a conference

  4. arXiv:2404.01261  [pdf, other]

    cs.CL cs.AI

    FABLES: Evaluating faithfulness and content selection in book-length summarization

    Authors: Yekyung Kim, Yapei Chang, Marzena Karpinska, Aparna Garimella, Varun Manjunatha, Kyle Lo, Tanya Goyal, Mohit Iyyer

    Abstract: While long-context large language models (LLMs) can technically summarize book-length documents (>100K tokens), the length and complexity of the documents have so far prohibited evaluations of input-dependent aspects like faithfulness. In this paper, we conduct the first large-scale human evaluation of faithfulness and content selection on LLM-generated summaries of fictional books. Our study miti…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: preprint - 39 pages

  5. arXiv:2403.20147  [pdf, other]

    cs.CL

    IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context

    Authors: Nihar Ranjan Sahoo, Pranamya Prashant Kulkarni, Narjis Asad, Arif Ahmad, Tanu Goyal, Aparna Garimella, Pushpak Bhattacharyya

    Abstract: The pervasive influence of social biases in language data has sparked the need for benchmark datasets that capture and evaluate these biases in Large Language Models (LLMs). Existing efforts predominantly focus on English language and the Western context, leaving a void for a reliable dataset that encapsulates India's unique socio-cultural nuances. To bridge this gap, we introduce IndiBias, a comp…

    Submitted 3 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

  6. arXiv:2310.09219  [pdf, other]

    cs.CL cs.AI

    "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters

    Authors: Yixin Wan, George Pu, Jiao Sun, Aparna Garimella, Kai-Wei Chang, Nanyun Peng

    Abstract: Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content, including professional documents such as recommendation letters. Though bringing convenience, this application also introduces unprecedented fairness concerns. Model-generated reference letters might be directly used by users in professional scenarios. If underlying bi…

    Submitted 1 December, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

  7. arXiv:2305.14659  [pdf, other]

    cs.CL

    InteractiveIE: Towards Assessing the Strength of Human-AI Collaboration in Improving the Performance of Information Extraction

    Authors: Ishani Mondal, Michelle Yuan, Anandhavelu N, Aparna Garimella, Francis Ferraro, Andrew Blair-Stanek, Benjamin Van Durme, Jordan Boyd-Graber

    Abstract: Learning template based information extraction from documents is a crucial yet difficult task. Prior template-based IE approaches assume foreknowledge of the domain templates; however, real-world IE do not have pre-defined schemas and it is a figure-out-as you go phenomena. To quickly bootstrap templates in a real-world setting, we need to induce template slots from documents with zero or minimal…

    Submitted 17 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Version 2

  8. arXiv:2305.14625  [pdf, other]

    cs.CL

    KNN-LM Does Not Improve Open-ended Text Generation

    Authors: Shufan Wang, Yixiao Song, Andrew Drozdov, Aparna Garimella, Varun Manjunatha, Mohit Iyyer

    Abstract: In this paper, we study the generation quality of interpolation-based retrieval-augmented language models (LMs). These methods, best exemplified by the KNN-LM, interpolate the LM's predicted distribution of the next word with a distribution formed from the most relevant retrievals for a given prefix. While the KNN-LM and related methods yield impressive decreases in perplexity, we discover that th…

    Submitted 23 May, 2023; originally announced May 2023.

  9. Investigating Strategies for Clause Recommendation

    Authors: Sagar Joshi, Sumanth Balaji, Jerrin Thomas, Aparna Garimella, Vasudeva Varma

    Abstract: Clause recommendation is the problem of recommending a clause to a legal contract, given the context of the contract in question and the clause type to which the clause should belong. With not much prior work being done toward the generation of legal contracts, this problem was proposed as a first step toward the bigger problem of contract generation. As an open-ended text generation problem, the…

    Submitted 21 January, 2023; originally announced January 2023.

    Comments: Published in Legal Knowledge and Information Systems (JURIX) 2022. (10 pages, 4 figures)

    ACM Class: I.2.7

    Journal ref: Volume 362: Legal Knowledge and Information Systems (2022), Frontiers in Artificial Intelligence and Applications

  10. arXiv:2301.06901  [pdf, other]

    cs.CL cs.AI

    Graph-based Keyword Planning for Legal Clause Generation from Topics

    Authors: Sagar Joshi, Sumanth Balaji, Aparna Garimella, Vasudeva Varma

    Abstract: Generating domain-specific content such as legal clauses based on minimal user-provided information can be of significant benefit in automating legal contract generation. In this paper, we propose a controllable graph-based mechanism that can generate legal clauses using only the topic or type of the legal clauses. Our pipeline consists of two stages involving a graph-based planner followed by a c…

    Submitted 7 January, 2023; originally announced January 2023.

    Comments: To be published in the Natural Legal Language Processing Workshop, EMNLP 2022 (11 pages, 7 figures)

    ACM Class: I.2.7

  11. arXiv:2212.09825  [pdf, other]

    cs.CL

    What to Read in a Contract? Party-Specific Summarization of Legal Obligations, Entitlements, and Prohibitions

    Authors: Abhilasha Sancheti, Aparna Garimella, Balaji Vasan Srinivasan, Rachel Rudinger

    Abstract: Reviewing and comprehending key obligations, entitlements, and prohibitions in legal contracts can be a tedious task due to their length and domain-specificity. Furthermore, the key rights and duties requiring review vary for each contracting party. In this work, we propose a new task of party-specific extractive summarization for legal contracts to facilitate faster reviewing and improved compreh…

    Submitted 24 October, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: EMNLP 2023

  12. arXiv:2211.12752  [pdf, other]

    cs.CL

    Agent-Specific Deontic Modality Detection in Legal Language

    Authors: Abhilasha Sancheti, Aparna Garimella, Balaji Vasan Srinivasan, Rachel Rudinger

    Abstract: Legal documents are typically long and written in legalese, which makes it particularly difficult for laypeople to understand their rights and duties. While natural language understanding technologies can be valuable in supporting such understanding in the legal domain, the limited availability of datasets annotated for deontic modalities in the legal domain, due to the cost of hiring experts and…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: Accepted at EMNLP 2022

  13. arXiv:2110.15794  [pdf, other]

    cs.CL cs.AI

    CLAUSEREC: A Clause Recommendation Framework for AI-aided Contract Authoring

    Authors: Vinay Aggarwal, Aparna Garimella, Balaji Vasan Srinivasan, Anandhavelu N, Rajiv Jain

    Abstract: Contracts are a common type of legal document that frequent in several day-to-day business workflows. However, there has been very limited NLP research in processing such documents, and even lesser in generating them. These contracts are made up of clauses, and the unique nature of these clauses calls for specific methods to understand and generate such documents. In this paper, we introduce the t…

    Submitted 26 October, 2021; originally announced October 2021.

  14. arXiv:2102.00272  [pdf, other]

    cs.LG cs.CL

    EmpathBERT: A BERT-based Framework for Demographic-aware Empathy Prediction

    Authors: Bhanu Prakash Reddy Guda, Aparna Garimella, Niyati Chhaya

    Abstract: Affect preferences vary with user demographics, and tapping into demographic information provides important cues about the users' language preferences. In this paper, we utilize the user demographics, and propose EmpathBERT, a demographic-aware framework for empathy prediction based on BERT. Through several comparative experiments, we show that EmpathBERT surpasses traditional machine learning and…

    Submitted 30 January, 2021; originally announced February 2021.

    Comments: Accepted in EACL 2019, 5 pages

  15. arXiv:2101.11836  [pdf, other]

    cs.CL cs.AI cs.LG

    DRAG: Director-Generator Language Modelling Framework for Non-Parallel Author Stylized Rewriting

    Authors: Hrituraj Singh, Gaurav Verma, Aparna Garimella, Balaji Vasan Srinivasan

    Abstract: Author stylized rewriting is the task of rewriting an input text in a particular author's style. Recent works in this area have leveraged Transformer-based language models in a denoising autoencoder setup to generate author stylized text without relying on a parallel corpus of data. However, these approaches are limited by the lack of explicit control of target attributes and being entirely data-d…

    Submitted 28 January, 2021; originally announced January 2021.

    Comments: Accepted as Long Paper to EACL 2021

  16. arXiv:2006.00578  [pdf, other]

    cs.CL

    "Judge me by my size (noun), do you?" YodaLib: A Demographic-Aware Humor Generation Framework

    Authors: Aparna Garimella, Carmen Banea, Nabil Hossain, Rada Mihalcea

    Abstract: The subjective nature of humor makes computerized humor generation a challenging task. We propose an automatic humor generation framework for filling the blanks in Mad Libs stories, while accounting for the demographic backgrounds of the desired audience. We collect a dataset consisting of such stories, which are filled in and judged by carefully selected workers on Amazon Mechanical Turk. We buil…

    Submitted 31 May, 2020; originally announced June 2020.