Showing 1–26 of 26 results for author: Nenkova, A

  1. arXiv:2404.12386  [pdf, other]

    cs.CV cs.LG

    SOHES: Self-supervised Open-world Hierarchical Entity Segmentation

    Authors: Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang

    Abstract: Open-world entity segmentation, as an emerging computer vision task, aims at segmenting entities in images without being restricted by pre-defined classes, offering impressive generalization capabilities on unseen images and concepts. Despite its promise, existing entity segmentation methods like Segment Anything Model (SAM) rely heavily on costly expert annotators. This work presents Self-supervi…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: ICLR 2024

  2. arXiv:2403.00553  [pdf, other]

    cs.CL

    Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores

    Authors: Chantal Shaib, Joe Barrow, Jiuding Sun, Alexa F. Siu, Byron C. Wallace, Ani Nenkova

    Abstract: The diversity across outputs generated by large language models shapes the perception of their quality and utility. Prompt leaks, templated answer structure, and canned responses across different interactions are readily noticed by people, but there is no standard score to measure this aspect of model behavior. In this work we empirically investigate diversity scores on English texts. We find that…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: Preprint

  3. arXiv:2402.18756  [pdf, other]

    cs.CL

    How Much Annotation is Needed to Compare Summarization Models?

    Authors: Chantal Shaib, Joe Barrow, Alexa F. Siu, Byron C. Wallace, Ani Nenkova

    Abstract: Modern instruction-tuned models have become highly capable in text generation tasks such as summarization, and are expected to be released at a steady pace. In practice one may now wish to choose confidently, but with minimal effort, the best performing summarization model when applied to a new domain or purpose. In this work, we empirically investigate the test sample size necessary to select a p…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: Preprint

  4. arXiv:2310.16790  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances

    Authors: Zhendong Chu, Ruiyi Zhang, Tong Yu, Rajiv Jain, Vlad I Morariu, Jiuxiang Gu, Ani Nenkova

    Abstract: To achieve state-of-the-art performance, one still needs to train NER models on large-scale, high-quality annotated data, an asset that is both costly and time-intensive to accumulate. In contrast, real-world applications often resort to massive low-quality labeled data through non-expert annotators via crowdsourcing and external knowledge bases via distant supervision as a cost-effective alternat…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 14 pages

  5. arXiv:2310.15140  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models

    Authors: Sicheng Zhu, Ruiyi Zhang, Bang An, Gang Wu, Joe Barrow, Zichao Wang, Furong Huang, Ani Nenkova, Tong Sun

    Abstract: Safety alignment of Large Language Models (LLMs) can be compromised with manual jailbreak attacks and (automatic) adversarial attacks. Recent studies suggest that defending against these attacks is possible: adversarial attacks generate unlimited but unreadable gibberish prompts, detectable by perplexity-based filters; manual jailbreak attacks craft readable prompts, but their limited number due t…

    Submitted 14 December, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Version 2 updates: Added comparison of three more evaluation methods and their reliability check using human labeling. Added results for jailbreaking Llama2 (individual behavior) and included complexity and hyperparameter analysis. Revised objectives for prompt leaking. Other minor changes made

  6. arXiv:2309.08872  [pdf, other]

    cs.CL cs.AI cs.LG

    PDFTriage: Question Answering over Long, Structured Documents

    Authors: Jon Saad-Falcon, Joe Barrow, Alexa Siu, Ani Nenkova, David Seunghyun Yoon, Ryan A. Rossi, Franck Dernoncourt

    Abstract: Large Language Models (LLMs) have issues with document question answering (QA) in situations where the document is unable to fit in the small context length of an LLM. To overcome this issue, most existing works focus on retrieving the relevant context from the document, representing them as plain text. However, documents such as PDFs, web pages, and presentations are naturally structured with dif…

    Submitted 8 November, 2023; v1 submitted 16 September, 2023; originally announced September 2023.

  7. arXiv:2306.10555  [pdf, other]

    cs.CL

    Summarization from Leaderboards to Practice: Choosing A Representation Backbone and Ensuring Robustness

    Authors: David Demeter, Oshin Agarwal, Simon Ben Igeri, Marko Sterbentz, Neil Molino, John M. Conroy, Ani Nenkova

    Abstract: Academic literature does not give much guidance on how to build the best possible customer-facing summarization system from existing research components. Here we present analyses to inform the selection of a system backbone from popular models; we find that in both automatic and human evaluation, BART performs better than PEGASUS and T5. We also find that when applied cross-domain, summarizers exh…

    Submitted 18 June, 2023; originally announced June 2023.

  8. arXiv:2305.12077  [pdf, other]

    cs.CL

    Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer in Prompt Tuning

    Authors: Kaige Xie, Tong Yu, Haoliang Wang, Junda Wu, Handong Zhao, Ruiyi Zhang, Kanak Mahadik, Ani Nenkova, Mark Riedl

    Abstract: In real-world scenarios, labeled samples for dialogue summarization are usually limited (i.e., few-shot) due to high annotation costs for high-quality dialogue summaries. To efficiently learn from few-shot samples, previous works have utilized massive annotated data from other downstream tasks and then performed prompt transfer in prompt tuning so as to enable cross-task knowledge transfer. Howeve…

    Submitted 26 February, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted to EACL 2024

  9. arXiv:2305.10434  [pdf, other]

    cs.CL cs.AI cs.LG

    Learning the Visualness of Text Using Large Vision-Language Models

    Authors: Gaurav Verma, Ryan A. Rossi, Christopher Tensmeyer, Jiuxiang Gu, Ani Nenkova

    Abstract: Visual text evokes an image in a person's mind, while non-visual text fails to do so. A method to automatically detect visualness in text will enable text-to-image retrieval and generation models to augment text with relevant images. This is particularly challenging with long-form text as text-to-image generation and retrieval models are often triggered for text that is designed to be explicitly v…

    Submitted 22 October, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted at EMNLP 2023 (Main, long); 9 pages, 5 figures

  10. arXiv:2302.12324  [pdf, other]

    cs.CL

    Summaries as Captions: Generating Figure Captions for Scientific Documents with Automated Text Summarization

    Authors: Chieh-Yang Huang, Ting-Yao Hsu, Ryan Rossi, Ani Nenkova, Sungchul Kim, Gromit Yeuk-Yin Chan, Eunyee Koh, Clyde Lee Giles, Ting-Hao 'Kenneth' Huang

    Abstract: Good figure captions help paper readers understand complex scientific figures. Unfortunately, even published papers often have poorly written captions. Automatic caption generation could aid paper writers by providing good starting captions that can be refined for better quality. Prior work often treated figure caption generation as a vision-to-language task. In this paper, we show that it can be…

    Submitted 11 August, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: Accepted by INLG-2023

  11. arXiv:2211.14958  [pdf, other]

    cs.CV

    MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding

    Authors: Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu

    Abstract: Document images are a ubiquitous source of data where the text is organized in a complex hierarchical structure ranging from fine granularity (e.g., words), medium granularity (e.g., regions such as paragraphs or figures), to coarse granularity (e.g., the whole page). The spatial hierarchical relationships between content at different levels of granularity are crucial for document image understand…

    Submitted 27 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022

  12. arXiv:2210.14177  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Influence Functions for Sequence Tagging Models

    Authors: Sarthak Jain, Varun Manjunatha, Byron C. Wallace, Ani Nenkova

    Abstract: Many language tasks (e.g., Named Entity Recognition, Part-of-Speech tagging, and Semantic Role Labeling) are naturally framed as sequence tagging problems. However, there has been comparatively little work on interpretability methods for sequence tagging models. In this paper, we extend influence functions - which aim to trace predictions back to the training points that informed them - to sequenc…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: Accepted to Findings of EMNLP 2022

  13. arXiv:2210.08145  [pdf, other]

    cs.CL

    Self-Repetition in Abstractive Neural Summarizers

    Authors: Nikita Salkar, Thomas Trikalinos, Byron C. Wallace, Ani Nenkova

    Abstract: We provide a quantitative and qualitative analysis of self-repetition in the output of neural summarizers. We measure self-repetition as the number of n-grams of length four or longer that appear in multiple outputs of the same system. We analyze the behavior of three popular architectures (BART, T5, and Pegasus), fine-tuned on five datasets. In a regression analysis, we find that the three archit…

    Submitted 14 October, 2022; originally announced October 2022.

  14. arXiv:2204.10939  [pdf, other]

    cs.CL cs.CV

    Unified Pretraining Framework for Document Understanding

    Authors: Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Nikolaos Barmpalios, Rajiv Jain, Ani Nenkova, Tong Sun

    Abstract: Document intelligence automates the extraction of information from documents and supports many business applications. Recent self-supervised learning methods on large-scale unlabeled document datasets have opened up promising directions towards reducing annotation efforts by training models with self-supervised objectives. However, most of the existing document pretraining methods are still langua…

    Submitted 28 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: 12 pages, 4 figures, NeurIPS 2021 (Updated Camera Ready)

  15. arXiv:2111.12790  [pdf, other]

    cs.CL

    Temporal Effects on Pre-trained Models for Language Processing Tasks

    Authors: Oshin Agarwal, Ani Nenkova

    Abstract: Keeping the performance of language technologies optimal as time passes is of great practical interest. We study temporal effects on model performance on downstream language tasks, establishing a nuanced terminology for such discussion and identifying factors essential to conduct a robust study. We present experiments for several tasks in English where the label correctness is not dependent on tim…

    Submitted 6 June, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: TACL, pre-MIT press publication version

  16. arXiv:2102.03671  [pdf, other]

    cs.CL

    From Toxicity in Online Comments to Incivility in American News: Proceed with Caution

    Authors: Anushree Hede, Oshin Agarwal, Linda Lu, Diana C. Mutz, Ani Nenkova

    Abstract: The ability to quantify incivility online, in news and in congressional debates, is of great interest to political scientists. Computational tools for detecting online incivility for English are now fairly accessible and potentially could be applied more broadly. We test the Jigsaw Perspective API for its ability to detect the degree of incivility on a corpus that we developed, consisting of manua…

    Submitted 6 February, 2021; originally announced February 2021.

    Comments: Accepted at EACL 2021

  17. arXiv:2010.03550  [pdf, other]

    cs.CL

    Understanding Clinical Trial Reports: Extracting Medical Entities and Their Relations

    Authors: Benjamin E. Nye, Jay DeYoung, Eric Lehman, Ani Nenkova, Iain J. Marshall, Byron C. Wallace

    Abstract: The best evidence concerning comparative treatment effectiveness comes from clinical trials, the results of which are reported in unstructured articles. Medical experts must manually extract information from articles to inform decision-making, which is time-consuming and expensive. Here we consider the end-to-end task of both (a) extracting treatments and outcomes from full-text articles describin…

    Submitted 7 January, 2022; v1 submitted 7 October, 2020; originally announced October 2020.

  18. arXiv:2005.10865  [pdf, other]

    cs.IR cs.CL cs.HC cs.LG

    Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time

    Authors: Benjamin E. Nye, Ani Nenkova, Iain J. Marshall, Byron C. Wallace

    Abstract: We introduce Trialstreamer, a living database of clinical trial reports. Here we mainly describe the evidence extraction component; this extracts from biomedical abstracts key pieces of information that clinicians need when appraising the literature, and also the relations between these. Specifically, the system extracts descriptions of trial participants, the treatments compared in each arm (the…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: 6 pages, 4 figures

    ACM Class: I.2.7; J.3

  19. arXiv:2004.04564  [pdf, other]

    cs.CL

    Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve

    Authors: Oshin Agarwal, Yinfei Yang, Byron C. Wallace, Ani Nenkova

    Abstract: Named Entity Recognition systems achieve remarkable performance on domains such as English news. It is natural to ask: What are these models actually learning to achieve this? Are they merely memorizing the names themselves? Or are they capable of interpreting the text and inferring the correct entity type from the linguistic context? We examine these questions by contrasting the performance of se…

    Submitted 3 January, 2021; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: Computational Linguistics Journal

  20. arXiv:2004.04123  [pdf, ps, other]

    cs.CL

    Entity-Switched Datasets: An Approach to Auditing the In-Domain Robustness of Named Entity Recognition Models

    Authors: Oshin Agarwal, Yinfei Yang, Byron C. Wallace, Ani Nenkova

    Abstract: Named entity recognition systems perform well on standard datasets comprising English news. But given the paucity of data, it is difficult to draw conclusions about the robustness of systems with respect to recognizing a diverse set of entities. We propose a method for auditing the in-domain robustness of systems, focusing specifically on differences in performance due to the national origin of en…

    Submitted 13 January, 2021; v1 submitted 8 April, 2020; originally announced April 2020.

  21. arXiv:1905.07791  [pdf, other]

    cs.CL

    Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction

    Authors: Yinfei Yang, Oshin Agarwal, Chris Tar, Byron C. Wallace, Ani Nenkova

    Abstract: Modern NLP systems require high-quality annotated data. In specialized domains, expert annotations may be prohibitively expensive. An alternative is to rely on crowdsourcing to reduce costs at the risk of introducing noise. In this paper we demonstrate that directly modeling instance difficulty can be used to improve model performance, and to route instances to appropriate annotators. Our difficul…

    Submitted 19 May, 2019; originally announced May 2019.

    Comments: NAACL2019

  22. arXiv:1810.11476  [pdf, ps, other]

    cs.CL

    Named Person Coreference in English News

    Authors: Oshin Agarwal, Sanjay Subramanian, Ani Nenkova, Dan Roth

    Abstract: People are often entities of interest in tasks such as search and information extraction. In these tasks, the goal is to find as much information as possible about people specified by their name. However in text, some of the references to people are by pronouns (she, his) or generic descriptions (the professor, the German chancellor). It is therefore important that coreference resolution systems a…

    Submitted 20 November, 2018; v1 submitted 26 October, 2018; originally announced October 2018.

  23. arXiv:1806.04185  [pdf, other]

    cs.CL

    A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature

    Authors: Benjamin Nye, Junyi Jessy Li, Roma Patel, Yinfei Yang, Iain J. Marshall, Ani Nenkova, Byron C. Wallace

    Abstract: We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the 'PICO' elements). These spans are further annotated at a more granular level, e.g., individ…

    Submitted 11 June, 2018; originally announced June 2018.

    Comments: ACL 2018

  24. arXiv:1805.00097  [pdf, other]

    cs.CL

    Syntactic Patterns Improve Information Extraction for Medical Search

    Authors: Roma Patel, Yinfei Yang, Iain Marshall, Ani Nenkova, Byron Wallace

    Abstract: Medical professionals search the published literature by specifying the type of patients, the medical intervention(s) and the outcome measure(s) of interest. In this paper we demonstrate how features encoding syntactic patterns improve the performance of state-of-the-art sequence tagging models (both linear and neural) for information extraction of these medically relevant categories. We present a…

    Submitted 30 April, 2018; originally announced May 2018.

  25. arXiv:1704.00440  [pdf, other]

    cs.CL

    Combining Lexical and Syntactic Features for Detecting Content-dense Texts in News

    Authors: Yinfei Yang, Ani Nenkova

    Abstract: Content-dense news report important factual information about an event in direct, succinct manner. Information seeking applications such as information extraction, question answering and summarization normally assume all text they deal with is content-dense. Here we empirically test this assumption on news articles from the business, U.S. international relations, sports and science journalism doma…

    Submitted 3 April, 2017; originally announced April 2017.

    Comments: In submission to JAIR

  26. arXiv:1702.07998  [pdf, ps, other]

    cs.CL

    Detecting (Un)Important Content for Single-Document News Summarization

    Authors: Yinfei Yang, Forrest Sheng Bao, Ani Nenkova

    Abstract: We present a robust approach for detecting intrinsic sentence importance in news, by training on two corpora of document-summary pairs. When used for single-document summarization, our approach, combined with the "beginning of document" heuristic, outperforms a state-of-the-art summarizer and the beginning-of-article baseline in both automatic and manual evaluations. These results represent an imp…

    Submitted 26 February, 2017; originally announced February 2017.

    Comments: Accepted By EACL 2017