Showing 1–31 of 31 results for author: Demberg, V

Searching in archive cs.
  1. arXiv:2406.18776

    cs.CL

    Implicit Discourse Relation Classification For Nigerian Pidgin

    Authors: Muhammed Saeed, Peter Bourgonje, Vera Demberg

    Abstract: Despite attempts to make Large Language Models multi-lingual, many of the world's languages are still severely under-resourced. This widens the performance gap between NLP and AI applications aimed at well-financed, and those aimed at less-resourced languages. In this paper, we focus on Nigerian Pidgin (NP), which is spoken by nearly 100 million people, but has comparatively very few NLP resources…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2405.00657

    cs.CL cs.AI cs.LG

    RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive Summarization

    Authors: Dongqi Pu, Vera Demberg

    Abstract: For long document summarization, discourse structure is important to discern the key content of the text and the differences in importance level between sentences. Unfortunately, the integration of rhetorical structure theory (RST) into parameter-efficient fine-tuning strategies for long document summarization remains unexplored. Therefore, this paper introduces RST-LoRA and proposes four RST-awar…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: NAACL 2024 Main & Long Conference Paper (Oral Presentation)

  3. arXiv:2404.18264

    cs.CL cs.AI

    Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin

    Authors: Pin-Jie Lin, Merel Scholman, Muhammed Saeed, Vera Demberg

    Abstract: Nigerian Pidgin is an English-derived contact language and is traditionally an oral language, spoken by approximately 100 million people. No orthographic standard has yet been adopted, and thus the few available Pidgin datasets that exist are characterised by noise in the form of orthographic variations. This contributes to under-performance of models in critical NLP tasks. The current work is the…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Accepted to LREC-COLING 2024 Main Conference

  4. arXiv:2403.17768

    cs.CL cs.AI cs.LG

    SciNews: From Scholarly Complexities to Public Narratives -- A Dataset for Scientific News Report Generation

    Authors: Dongqi Pu, Yifan Wang, Jia Loy, Vera Demberg

    Abstract: Scientific news reports serve as a bridge, adeptly translating complex research articles into reports that resonate with the broader public. The automated generation of such narratives enhances the accessibility of scholarly insights. In this paper, we present a new corpus to facilitate this paradigm development. Our corpus comprises a parallel compilation of academic publications and their corres…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024 Main Conference Paper

  5. arXiv:2402.04918

    cs.CL cs.AI

    Prompting Implicit Discourse Relation Annotation

    Authors: Frances Yung, Mansoor Ahmad, Merel Scholman, Vera Demberg

    Abstract: Pre-trained large language models, such as ChatGPT, achieve outstanding performance in various reasoning tasks without supervised training and were found to have outperformed crowdsourcing workers. Nonetheless, ChatGPT's performance in the task of implicit discourse relation classification, prompted by a standard multiple-choice question, is still far from satisfactory and considerably inferior to…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: To appear at the Linguistic Annotation Workshop 2024

  6. arXiv:2311.09325

    cs.CL cs.AI

    Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"?

    Authors: Tong Liu, Iza Škrjanec, Vera Demberg

    Abstract: A wide body of evidence shows that human language processing difficulty is predicted by the information-theoretic measure surprisal, a word's negative log probability in context. However, it is still unclear how to best estimate these probabilities needed for predicting human processing difficulty -- while a long-standing belief held that models with lower perplexity would provide more accurate es…

    Submitted 3 July, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: ACL 2024

  7. arXiv:2311.07311

    cs.CL cs.AI

    Do large language models and humans have similar behaviors in causal inference with script knowledge?

    Authors: Xudong Hong, Margarita Ryzhova, Daniel Adrian Biondi, Vera Demberg

    Abstract: Recently, large pre-trained language models (LLMs) have demonstrated superior language understanding abilities, including zero-shot causal reasoning. However, it is unclear to what extent their capabilities are similar to human ones. We here study the processing of an event $B$ in a script-based story, which causally depends on a previous event $A$. In our manipulation, event $A$ is stated, negate…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 15 pages, 3 figures

    ACM Class: I.2.7; I.2.0

  8. arXiv:2308.00399

    cs.CL cs.LG

    Tackling Hallucinations in Neural Chart Summarization

    Authors: Saad Obaid ul Islam, Iza Škrjanec, Ondřej Dušek, Vera Demberg

    Abstract: Hallucinations in text generation occur when the system produces text that is not grounded in the input. In this work, we tackle the problem of hallucinations in neural chart summarization. Our analysis shows that the target side of chart summarization training datasets often contains additional information, leading to hallucinations. We propose a natural language inference (NLI) based method to p…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: To be presented at INLG 2023

  9. arXiv:2307.00374

    cs.CL

    Revisiting Sample Size Determination in Natural Language Understanding

    Authors: Ernie Chang, Muhammad Hassan Rashid, Pin-Jie Lin, Changsheng Zhao, Vera Demberg, Yangyang Shi, Vikas Chandra

    Abstract: Knowing exactly how many data points need to be labeled to achieve a certain model performance is a hugely beneficial step towards reducing the overall budgets for annotation. It pertains to both active learning and traditional data annotation, and is particularly beneficial for low resource scenarios. Nevertheless, it remains a largely under-explored area of research in NLP. We therefore explored…

    Submitted 1 July, 2023; originally announced July 2023.

    Comments: Accepted to ACL 2023

  10. arXiv:2306.07799

    cs.CL cs.AI cs.LG

    ChatGPT vs Human-authored Text: Insights into Controllable Text Summarization and Sentence Style Transfer

    Authors: Dongqi Pu, Vera Demberg

    Abstract: Large-scale language models, like ChatGPT, have garnered significant media attention and stunned the public with their remarkable capacity for generating coherent text from short natural language prompts. In this paper, we aim to conduct a systematic inspection of ChatGPT's performance in two controllable generation tasks, with respect to ChatGPT's ability to adapt its output to different target a…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: ACL-SRW 2023

  11. arXiv:2305.16784

    cs.CL cs.AI cs.LG

    Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization

    Authors: Dongqi Pu, Yifan Wang, Vera Demberg

    Abstract: For text summarization, the role of discourse structure is pivotal in discerning the core content of a text. Regrettably, prior studies on incorporating Rhetorical Structure Theory (RST) into transformer-based summarization models only consider the nuclearity annotation, thereby overlooking the variety of discourse relation types. This paper introduces the 'RSTformer', a novel summarization model…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023 (Main conference)

  12. arXiv:2304.00815

    cs.CL

    Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases Introduced by Task Design

    Authors: Valentina Pyatkin, Frances Yung, Merel C. J. Scholman, Reut Tsarfaty, Ido Dagan, Vera Demberg

    Abstract: Disagreement in natural language annotation has mostly been studied from a perspective of biases introduced by the annotators and the annotation frameworks. Here, we propose to analyze another source of bias: task design bias, which has a particularly strong impact on crowdsourced linguistic annotations where natural language is used to elicit the interpretation of laymen annotators. For this purp…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: Accepted to TACL, pre-MIT Press publication version

  13. arXiv:2301.08571

    cs.CL cs.CV cs.LG

    Visual Writing Prompts: Character-Grounded Story Generation with Curated Image Sequences

    Authors: Xudong Hong, Asad Sayeed, Khushboo Mehra, Vera Demberg, Bernt Schiele

    Abstract: Current work on image-based story generation suffers from the fact that the existing image sequence collections do not have coherent plots behind them. We improve visual story generation by producing a new image-grounded dataset, Visual Writing Prompts (VWP). VWP contains almost 2K selected sequences of movie shots, each including 5-10 images. The image sequences are aligned with a total of 12K st…

    Submitted 20 January, 2023; originally announced January 2023.

    Comments: Paper accepted by Transactions of the Association for Computational Linguistics (TACL). This is a pre-MIT Press publication version. 15 pages, 6 figures

  14. arXiv:2210.10252

    cs.CL cs.SD eess.AS

    A Data-Driven Investigation of Noise-Adaptive Utterance Generation with Linguistic Modification

    Authors: Anupama Chingacham, Vera Demberg, Dietrich Klakow

    Abstract: In noisy environments, speech can be hard to understand for humans. Spoken dialog systems can help to enhance the intelligibility of their output, either by modifying the speech synthesis (e.g., imitate Lombard speech) or by optimizing the language generation. We here focus on the second type of approach, by which an intended message is realized with words that are more intelligible in a specific…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: Accepted to SLT 2022

  15. arXiv:2206.04615

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  16. arXiv:2108.06614

    cs.CL

    The SelectGen Challenge: Finding the Best Training Samples for Few-Shot Neural Text Generation

    Authors: Ernie Chang, Xiaoyu Shen, Alex Marin, Vera Demberg

    Abstract: We propose a shared task on training instance selection for few-shot neural text generation. Large-scale pretrained language models have led to dramatic improvements in few-shot text generation. Nonetheless, almost all previous work simply applies random sampling to select the few-shot training instances. Little to no attention has been paid to the selection strategies and how they would affect mo…

    Submitted 14 August, 2021; originally announced August 2021.

    Comments: Accepted at GenChal @ INLG 2021. arXiv admin note: text overlap with arXiv:2107.03176

  17. arXiv:2107.08337

    cs.CL cs.SD eess.AS

    Exploring the Potential of Lexical Paraphrases for Mitigating Noise-Induced Comprehension Errors

    Authors: Anupama Chingacham, Vera Demberg, Dietrich Klakow

    Abstract: Listening in noisy environments can be difficult even for individuals with normal hearing thresholds. The speech signal can be masked by noise, which may lead to word misperceptions on the side of the listener, and overall difficulty in understanding the message. To mitigate hearing difficulties for listeners, a co-operative speaker utilizes voice modulation strategies like Lombard speech to generat…

    Submitted 17 July, 2021; originally announced July 2021.

    Comments: Accepted at Interspeech 2021

  18. arXiv:2107.03179

    cs.CL

    Time-Aware Ancient Chinese Text Translation and Inference

    Authors: Ernie Chang, Yow-Ting Shiue, Hui-Syuan Yeh, Vera Demberg

    Abstract: In this paper, we aim to address the challenges surrounding the translation of ancient Chinese text: (1) The linguistic gap due to the difference in eras results in translations that are poor in quality, and (2) most translations are missing the contextual information that is often very crucial to understanding the text. To this end, we improve upon past translation techniques by proposing the fol…

    Submitted 7 July, 2021; originally announced July 2021.

    Comments: Accepted at LChange at ACL 2021

  19. arXiv:2107.03176

    cs.CL cs.LG

    On Training Instance Selection for Few-Shot Neural Text Generation

    Authors: Ernie Chang, Xiaoyu Shen, Hui-Syuan Yeh, Vera Demberg

    Abstract: Large-scale pretrained language models have led to dramatic improvements in text generation. Impressive performance can be achieved by finetuning only on a small number of instances (few-shot setting). Nonetheless, almost all previous work simply applies random sampling to select the few-shot training instances. Little to no attention has been paid to the selection strategies and how they would af…

    Submitted 7 July, 2021; originally announced July 2021.

    Comments: Accepted at ACL 2021

  20. arXiv:2102.03556

    cs.CL

    Neural Data-to-Text Generation with LM-based Text Augmentation

    Authors: Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su

    Abstract: For many new application domains for data-to-text generation, the main obstacle in training neural models consists of a lack of training data. While usually large numbers of instances are available on the data side, often only very few text samples are available. To address this problem, we here propose a novel few-shot approach for this setting. Our approach automatically augments the data availa…

    Submitted 6 February, 2021; originally announced February 2021.

    Comments: Accepted at EACL 2021

  21. arXiv:2102.03554

    cs.CL

    Does the Order of Training Samples Matter? Improving Neural Data-to-Text Generation with Curriculum Learning

    Authors: Ernie Chang, Hui-Syuan Yeh, Vera Demberg

    Abstract: Recent advancements in data-to-text generation largely take on the form of neural end-to-end systems. Efforts have been dedicated to improving text generation systems by changing the order of training samples in a process known as curriculum learning. Past research on sequence-to-sequence learning showed that curriculum learning helps to improve both the performance and convergence speed. In this…

    Submitted 6 February, 2021; originally announced February 2021.

    Comments: Accepted at EACL 2021

  22. arXiv:2102.03551

    cs.CL cs.AI

    Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling

    Authors: Ernie Chang, Vera Demberg, Alex Marin

    Abstract: Neural natural language generation (NLG) and understanding (NLU) models are data-hungry and require massive amounts of annotated data to be competitive. Recent frameworks address this bottleneck with generative models that synthesize weak labels at scale, where a small amount of training labels are expert-curated and the rest of the data is automatically annotated. We follow that approach, by auto…

    Submitted 6 February, 2021; originally announced February 2021.

    Comments: Accepted at EACL 2021

  23. arXiv:2010.10967

    cs.HC

    Safe Handover in Mixed-Initiative Control for Cyber-Physical Systems

    Authors: Frederik Wiehr, Anke Hirsch, Florian Daiber, Antonio Krüger, Alisa Kovtunova, Stefan Borgwardt, Ernie Chang, Vera Demberg, Marcel Steinmetz, Jörg Hoffmann

    Abstract: For mixed-initiative control between cyber-physical systems (CPS) and their users, it is still an open question how machines can safely hand over control to humans. In this work, we propose a concept to provide technological support that uses formal methods from AI -- description logic (DL) and automated planning -- to predict more reliably when a hand-over is necessary, and to increase the advance…

    Submitted 21 October, 2020; originally announced October 2020.

    Comments: In Proceedings of Workshop at CHI

  24. arXiv:2010.04141

    cs.CL

    DART: A Lightweight Quality-Suggestive Data-to-Text Annotation Tool

    Authors: Ernie Chang, Jeriah Caplinger, Alex Marin, Xiaoyu Shen, Vera Demberg

    Abstract: We present a lightweight annotation tool, the Data AnnotatoR Tool (DART), for the general task of labeling structured data with textual descriptions. The tool is implemented as an interactive application that reduces human efforts in annotating large quantities of structured data, e.g. in the format of a table or tree structure. By using a backend sequence-to-sequence model, our system iteratively…

    Submitted 1 December, 2020; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: Accepted to COLING 2020 (selected as outstanding paper)

  25. arXiv:2003.08272

    cs.CL

    Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training

    Authors: Ernie Chang, David Ifeoluwa Adelani, Xiaoyu Shen, Vera Demberg

    Abstract: West African Pidgin English is a language that is widely spoken in West Africa, with at least 75 million speakers. Nevertheless, proper machine translation systems and relevant NLP datasets for Pidgin English are virtually absent. In this work, we develop techniques targeted at bridging the gap between Pidgin English and English in the context of natural language generation…

    Submitted 27 April, 2021; v1 submitted 18 March, 2020; originally announced March 2020.

    Comments: Accepted to Workshop at ICLR 2020

  26. arXiv:1811.01697

    cs.CL

    Learning to Explicitate Connectives with Seq2Seq Network for Implicit Discourse Relation Classification

    Authors: Wei Shi, Vera Demberg

    Abstract: Implicit discourse relation classification is one of the most difficult steps in discourse parsing. The difficulty stems from the fact that the coherence relation must be inferred based on the content of the discourse relational arguments. Therefore, an effective encoding of the relational arguments is of crucial importance. We here propose a new model for implicit discourse relation classificatio…

    Submitted 2 April, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: To appear at IWCS 2019

  27. arXiv:1808.10290

    cs.CL

    Acquiring Annotated Data with Cross-lingual Explicitation for Implicit Discourse Relation Classification

    Authors: Wei Shi, Frances Yung, Vera Demberg

    Abstract: Implicit discourse relation classification is one of the most challenging and important tasks in discourse parsing, due to the lack of connectives as strong linguistic cues. A principal bottleneck to further improvement is the shortage of training data (ca. 16k instances in the PDTB). Shi et al. (2017) proposed to acquire additional data by exploiting connectives in translation: human translators m…

    Submitted 15 April, 2019; v1 submitted 30 August, 2018; originally announced August 2018.

    Comments: To appear at DISRPT@NAACL 2019

  28. arXiv:1802.02032

    cs.CL cs.AI cs.LG

    Improving Variational Encoder-Decoders in Dialogue Generation

    Authors: Xiaoyu Shen, Hui Su, Shuzi Niu, Vera Demberg

    Abstract: Variational encoder-decoders (VEDs) have shown promising results in dialogue generation. However, the latent variable distributions are usually approximated by a much simpler model than the powerful RNN structure used for encoding and decoding, yielding the KL-vanishing problem and inconsistent training objective. In this paper, we separate the training step into two phases: The first phase learns…

    Submitted 6 February, 2018; originally announced February 2018.

    Comments: Accepted by AAAI 2018

  29. arXiv:1704.08893

    cs.CL

    How compatible are our discourse annotations? Insights from mapping RST-DT and PDTB annotations

    Authors: Vera Demberg, Fatemeh Torabi Asr, Merel Scholman

    Abstract: Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks. This makes comparison of the annotations difficult, thereby also preventing researchers from searching the corpora in a unified way, or using all annotated data jointly to train computational systems. Several theoretical proposals have recently been made for mapp…

    Submitted 15 March, 2018; v1 submitted 28 April, 2017; originally announced April 2017.

  30. arXiv:1702.03121

    cs.CL cs.AI stat.ML

    Modeling Semantic Expectation: Using Script Knowledge for Referent Prediction

    Authors: Ashutosh Modi, Ivan Titov, Vera Demberg, Asad Sayeed, Manfred Pinkal

    Abstract: Recent research in psycholinguistics has provided increasing evidence that humans predict upcoming content. Prediction also affects perception and might be a key to robustness in human language processing. In this paper, we investigate the factors that affect human prediction by building a computational model that can predict upcoming discourse referents based on linguistic knowledge alone vs. lin…

    Submitted 10 February, 2017; originally announced February 2017.

    Comments: 14 pages, published in TACL, Volume 5, pp. 31-44, 2017

    Journal ref: Transactions of ACL, Volume-5, Pg 31-44 (2017)

  31. arXiv:1606.01990

    cs.CL

    Neural Network Models for Implicit Discourse Relation Classification in English and Chinese without Surface Features

    Authors: Attapol T. Rutherford, Vera Demberg, Nianwen Xue

    Abstract: Inferring implicit discourse relations in natural language text is the most difficult subtask in discourse parsing. Surface features achieve good performance, but they are not readily applicable to other languages without semantic lexicons. Previous neural models require parses, surface features, or a small label set to work well. Here, we propose neural network models that are based on feedforwar…

    Submitted 6 June, 2016; originally announced June 2016.