Showing 1–27 of 27 results for author: Bisazza, A

  1. arXiv:2406.19999  [pdf, other]

    cs.CL

    The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models

    Authors: Xinyi Chen, Baohao Liao, Jirui Qi, Panagiotis Eustratiadis, Christof Monz, Arianna Bisazza, Maarten de Rijke

    Abstract: Following multiple instructions is a crucial ability for large language models (LLMs). Evaluating this ability comes with significant challenges: (i) limited coherence between multiple instructions, (ii) positional bias where the order of instructions affects model performance, and (iii) a lack of objectively verifiable tasks. To address these issues, we introduce a benchmark designed to evaluate…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2406.13663  [pdf, other]

    cs.CL cs.AI cs.LG

    Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

    Authors: Jirui Qi, Gabriele Sarti, Raquel Fernández, Arianna Bisazza

    Abstract: Ensuring the verifiability of model answers is a fundamental challenge for retrieval-augmented generation (RAG) in the question answering (QA) domain. Recently, self-citation prompting was proposed to make large language models (LLMs) generate citations to supporting documents along with their answers. However, self-citing LLMs often struggle to match the required format, refer to non-existent sou…

    Submitted 1 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: Under review. Code and data released at https://github.com/Betswish/MIRAGE

  3. arXiv:2406.07243  [pdf, other]

    cs.CL

    MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs

    Authors: Vera Neplenbroek, Arianna Bisazza, Raquel Fernández

    Abstract: Generative large language models (LLMs) have been shown to exhibit harmful biases and stereotypes. While safety fine-tuning typically takes place in English, if at all, these models are being used by speakers of many different languages. There is existing evidence that the performance of these models is inconsistent across languages and that they discriminate based on demographic factors of the us…

    Submitted 18 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  4. arXiv:2405.00208  [pdf, other]

    cs.CL

    A Primer on the Inner Workings of Transformer-based Language Models

    Authors: Javier Ferrando, Gabriele Sarti, Arianna Bisazza, Marta R. Costa-jussà

    Abstract: The rapid progress of research aimed at interpreting the inner workings of advanced language models has highlighted a need for contextualizing the insights gained from years of work in this area. This primer provides a concise technical introduction to the current techniques used to interpret the inner workings of Transformer-based language models, focusing on the generative decoder-only architect…

    Submitted 1 May, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

  5. arXiv:2403.16865  [pdf, other]

    cs.CL eess.AS

    Encoding of lexical tone in self-supervised models of spoken language

    Authors: Gaofei Shen, Michaela Watkins, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupała

    Abstract: Interpretability research has shown that self-supervised Spoken Language Models (SLMs) encode a wide variety of features in human speech from the acoustic, phonetic, phonological, syntactic and semantic levels, to speaker characteristics. The bulk of prior research on representations of phonology has focused on segmental features such as phonemes; the encoding of suprasegmental phonology (such as…

    Submitted 3 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted to NAACL 2024

  6. arXiv:2310.10378  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models

    Authors: Jirui Qi, Raquel Fernández, Arianna Bisazza

    Abstract: Multilingual large-scale Pretrained Language Models (PLMs) have been shown to store considerable amounts of factual knowledge, but large variations are observed across languages. With the ultimate goal of ensuring that users with different language backgrounds obtain consistent feedback from the same model, we study the cross-lingual consistency (CLC) of factual knowledge in various multilingual P…

    Submitted 9 November, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023 main conference. All code and data are released at https://github.com/Betswish/Cross-Lingual-Consistency

  7. arXiv:2310.01188  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Quantifying the Plausibility of Context Reliance in Neural Machine Translation

    Authors: Gabriele Sarti, Grzegorz Chrupała, Malvina Nissim, Arianna Bisazza

    Abstract: Establishing whether language models can use contextual information in a human-plausible way is important to ensure their trustworthiness in real-world settings. However, the questions of when and which parts of the context affect model generations are typically tackled separately, with current plausibility evaluations being practically limited to a handful of artificial benchmarks. To address thi…

    Submitted 13 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 Camera Ready. Code: https://github.com/gsarti/pecore. Artifacts: https://huggingface.co/collections/gsarti/pecore-iclr-2024-65edab42e28439e21b612c2e

    ACM Class: I.2.7

  8. Wave to Syntax: Probing spoken language models for syntax

    Authors: Gaofei Shen, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupała

    Abstract: Understanding which information is encoded in deep models of spoken and written language has been the focus of much research in recent years, as it is crucial for debugging and improving these architectures. Most previous work has focused on probing for speaker characteristics, acoustic and phonological information in models of spoken language, and for syntactic information in models of written la…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to Interspeech 2023

    Journal ref: Proceedings of Interspeech 2023

  9. Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation

    Authors: Lukas Edman, Gabriele Sarti, Antonio Toral, Gertjan van Noord, Arianna Bisazza

    Abstract: Pretrained character-level and byte-level language models have been shown to be competitive with popular subword models across a range of Natural Language Processing (NLP) tasks. However, there has been little research on their effectiveness for neural machine translation (NMT), particularly within the popular pretrain-then-finetune paradigm. This work performs an extensive comparison across multi…

    Submitted 26 January, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: This version of our work is a pre-MIT Press publication version

  10. arXiv:2302.13942  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Inseq: An Interpretability Toolkit for Sequence Generation Models

    Authors: Gabriele Sarti, Nils Feldhus, Ludwig Sickert, Oskar van der Wal, Malvina Nissim, Arianna Bisazza

    Abstract: Past work in natural language processing interpretability focused mainly on popular classification tasks while largely overlooking generation settings, partly due to a lack of dedicated tools. In this work, we introduce Inseq, a Python library to democratize access to interpretability analyses of sequence generation models. Inseq enables intuitive and optimized extraction of models' internal infor…

    Submitted 27 May, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: ACL 2023 Demo Track. Library: https://github.com/inseq-team/inseq, Docs: https://inseq.readthedocs.io, v0.4

    Journal ref: Proceedings of ACL: System Demonstrations (2023) 421-435

  11. arXiv:2301.13083  [pdf, other]

    cs.CL cs.AI

    Communication Drives the Emergence of Language Universals in Neural Agents: Evidence from the Word-order/Case-marking Trade-off

    Authors: Yuchen Lian, Arianna Bisazza, Tessa Verhoef

    Abstract: Artificial learners often behave differently from human learners in the context of neural agent-based simulations of language emergence and change. A common explanation is the lack of appropriate cognitive biases in these learners. However, it has also been proposed that more naturalistic settings of language learning and use could lead to more human-like results. We investigate this latter accoun…

    Submitted 31 May, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted to TACL, pre-MIT Press publication version

  12. DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages

    Authors: Gabriele Sarti, Arianna Bisazza, Ana Guerberof Arenas, Antonio Toral

    Abstract: We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. Using a strictly controlled setup, 18 professional translators were instructed to translate or post-edit the same set of English documents into Arabic, Dutch, Italian, Turkish, Ukrainian, and Vietnamese. During the process, their edits, keys…

    Submitted 18 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022, materials: https://github.com/gsarti/divemt

    Journal ref: Proceedings of EMNLP (2022) 7795-7816

  13. arXiv:2205.12148  [pdf, other]

    cs.CL

    Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer

    Authors: Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder

    Abstract: Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. This model ge…

    Submitted 25 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Accepted at EMNLP 2022 (Main Conference)

  14. arXiv:2107.06055  [pdf, other]

    cs.CL

    On the Difficulty of Translating Free-Order Case-Marking Languages

    Authors: Arianna Bisazza, Ahmet Üstün, Stephan Sportel

    Abstract: Identifying factors that make certain languages harder to model than others is essential to reach language equality in future Natural Language Processing technologies. Free-order case-marking languages, such as Russian, Latin or Tamil, have proved more challenging than fixed-order languages for the tasks of syntactic parsing and subject-verb agreement prediction. In this work, we investigate wheth…

    Submitted 13 July, 2021; originally announced July 2021.

    Comments: Accepted to TACL, pre-MIT Press publication version

  15. arXiv:2104.07637  [pdf, other]

    cs.CL cs.AI cs.LG

    The Effect of Efficient Messaging and Input Variability on Neural-Agent Iterated Language Learning

    Authors: Yuchen Lian, Arianna Bisazza, Tessa Verhoef

    Abstract: Natural languages display a trade-off among different strategies to convey syntactic structure, such as word order or inflection. This trade-off, however, has not appeared in recent simulations of iterated language learning with neural network agents (Chaabouni et al., 2019b). We re-evaluate this result in light of three factors that play an important role in comparable experiments from the Langua…

    Submitted 10 September, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: To appear at EMNLP 2021

  16. arXiv:2004.14327  [pdf, other]

    cs.CL

    UDapter: Language Adaptation for Truly Universal Dependency Parsing

    Authors: Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord

    Abstract: Recent advances in multilingual dependency parsing have brought the idea of a truly universal parser closer to reality. However, cross-language interference and restrained model capacity remain major obstacles. To address this, we propose a novel multilingual task adaptation approach based on contextual parameter generation and adapter modules. This approach enables to learn adapters via language…

    Submitted 6 October, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: In EMNLP 2020

  17. arXiv:2003.14056  [pdf, other]

    cs.CL cs.LG

    Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks

    Authors: Prajit Dhar, Arianna Bisazza

    Abstract: It is now established that modern neural language models can be successfully trained on multiple languages simultaneously without changes to the underlying architecture. But what kind of knowledge is really shared among languages within these models? Does multilingual training mostly lead to an alignment of the lexical representation spaces or does it also enable the sharing of purely grammatical…

    Submitted 14 April, 2021; v1 submitted 31 March, 2020; originally announced March 2020.

    Comments: v3: Submitted to NaDaLiDa 2021. 9 pages, two columns and with 6 figures and 1 table

  18. arXiv:1912.09582  [pdf, other]

    cs.CL

    BERTje: A Dutch BERT Model

    Authors: Wietse de Vries, Andreas van Cranenburgh, Arianna Bisazza, Tommaso Caselli, Gertjan van Noord, Malvina Nissim

    Abstract: The transformer-based pre-trained language model BERT has helped to improve state-of-the-art performance on many natural language processing (NLP) tasks. Using the same architecture and parameters, we developed and evaluated a monolingual Dutch BERT model called BERTje. Compared to the multilingual BERT model, which includes Dutch but is only based on Wikipedia text, BERTje is based on a large and…

    Submitted 19 December, 2019; originally announced December 2019.

  19. arXiv:1910.05479  [pdf, other]

    cs.CL

    Zero-shot Dependency Parsing with Pre-trained Multilingual Sentence Representations

    Authors: Ke Tran, Arianna Bisazza

    Abstract: We investigate whether off-the-shelf deep bidirectional sentence representations trained on a massively multilingual corpus (multilingual BERT) enable the development of an unsupervised universal dependency parser. This approach only leverages a mix of monolingual corpora in many languages and does not require any translation data making it applicable to low-resource languages. In our experiments…

    Submitted 11 October, 2019; originally announced October 2019.

    Comments: DeepLo workshop, EMNLP 2019

  20. arXiv:1803.03585  [pdf, other]

    cs.CL

    The Importance of Being Recurrent for Modeling Hierarchical Structure

    Authors: Ke Tran, Arianna Bisazza, Christof Monz

    Abstract: Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks such as language modeling (Linzen et al., 2016) and neural machine translation (Shi et al., 2016). In contrast, the ability to model structured data with non-recurrent neural networks has received little attention des…

    Submitted 28 August, 2018; v1 submitted 9 March, 2018; originally announced March 2018.

    Comments: EMNLP 2018

  21. arXiv:1802.04681  [pdf, other]

    cs.CL

    Examining the Tip of the Iceberg: A Data Set for Idiom Translation

    Authors: Marzieh Fadaee, Arianna Bisazza, Christof Monz

    Abstract: Neural Machine Translation (NMT) has been widely used in recent years with significant improvements for many language pairs. Although state-of-the-art NMT systems are generating progressively better translations, idiom translation remains one of the open challenges in this field. Idioms, a category of multiword expressions, are an interesting language phenomenon where the overall meaning of the ex…

    Submitted 13 February, 2018; originally announced February 2018.

    Comments: Accepted at LREC 2018

  22. arXiv:1708.00712  [pdf, other]

    cs.CL

    Dynamic Data Selection for Neural Machine Translation

    Authors: Marlies van der Wees, Arianna Bisazza, Christof Monz

    Abstract: Intelligent selection of training data has proven a successful technique to simultaneously increase training efficiency and translation performance for phrase-based machine translation (PBMT). With the recent increase in popularity of neural machine translation (NMT), we explore in this paper to what extent and how NMT can also benefit from data selection. While state-of-the-art data selection (Ax…

    Submitted 2 August, 2017; originally announced August 2017.

    Comments: Accepted at EMNLP 2017

  23. Learning Topic-Sensitive Word Representations

    Authors: Marzieh Fadaee, Arianna Bisazza, Christof Monz

    Abstract: Distributed word representations are widely used for modeling words in NLP tasks. Most of the existing models generate one representation per word and do not consider different meanings of a word. We present two approaches to learn multiple topic-sensitive representations per word by using Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for ea…

    Submitted 1 May, 2017; originally announced May 2017.

    Comments: 5 pages, 1 figure, Accepted at ACL 2017

  24. Data Augmentation for Low-Resource Neural Machine Translation

    Authors: Marzieh Fadaee, Arianna Bisazza, Christof Monz

    Abstract: The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora. For low-resource language pairs this is not the case, resulting in poor translation quality. Inspired by work in computer vision, we propose a novel data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, synthe…

    Submitted 1 May, 2017; originally announced May 2017.

    Comments: 5 pages, 2 figures, Accepted at ACL 2017

  25. arXiv:1608.04631  [pdf, other]

    cs.CL

    Neural versus Phrase-Based Machine Translation Quality: a Case Study

    Authors: Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, Marcello Federico

    Abstract: Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT). In particular, at the IWSLT 2015 evaluation campaign, NMT outperformed well established state-of-the-art PBMT systems on English-German, a language pair known to be particularly hard becaus…

    Submitted 9 October, 2016; v1 submitted 16 August, 2016; originally announced August 2016.

    Comments: Conference on Empirical Methods in Natural Language Processing (EMNLP), November 1-5, 2016, Austin, Texas, USA

  26. arXiv:1601.01272  [pdf, other]

    cs.CL

    Recurrent Memory Networks for Language Modeling

    Authors: Ke Tran, Arianna Bisazza, Christof Monz

    Abstract: Recurrent Neural Networks (RNN) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose Recurrent Memory Network (RMN), a novel RNN architecture, that not only amplifies the power of RNN but also facilitates our understanding of its internal functioning and allo…

    Submitted 22 April, 2016; v1 submitted 6 January, 2016; originally announced January 2016.

    Comments: 8 pages, 6 figures. Accepted at NAACL 2016

  27. A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

    Authors: Arianna Bisazza, Marcello Federico

    Abstract: Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new…

    Submitted 14 March, 2016; v1 submitted 17 February, 2015; originally announced February 2015.

    Comments: 44 pages, to appear in Computational Linguistics

    Journal ref: Computational Linguistics, Vol. 42, No. 2: 163-205, MIT Press (June 2016)