Showing 1–8 of 8 results for author: Stap, D

  1. arXiv:2405.20089

    cs.CL

    The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities

    Authors: David Stap, Eva Hasler, Bill Byrne, Christof Monz, Ke Tran

    Abstract: Fine-tuning large language models (LLMs) for machine translation has shown improvements in overall translation quality. However, it is unclear what is the impact of fine-tuning on desirable LLM behaviors that are not present in neural machine translation models, such as steerability, inherent document-level translation abilities, and the ability to produce less literal translations. We perform an…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL 2024 (long, main)

  2. arXiv:2401.12413

    cs.CL cs.LG

    How Far Can 100 Samples Go? Unlocking Overall Zero-Shot Multilingual Translation via Tiny Multi-Parallel Data

    Authors: Di Wu, Shaomu Tan, Yan Meng, David Stap, Christof Monz

    Abstract: Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is largely considered an open problem. A common, albeit resource-consuming, solution is to add as many related translation directions as possible to the training corpus. In this paper, we show that for an English-centric model, surprisingly large zero-shot improveme…

    Submitted 26 February, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 15 pages, 5 figures

  3. arXiv:2310.14644

    cs.CL

    Multilingual k-Nearest-Neighbor Machine Translation

    Authors: David Stap, Christof Monz

    Abstract: k-nearest-neighbor machine translation has demonstrated remarkable improvements in machine translation quality by creating a datastore of cached examples. However, these improvements have been limited to high-resource language pairs, with large datastores, and remain a challenge for low-resource languages. In this paper, we address this issue by combining representations from multiple languages in…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP

  4. arXiv:2310.09946

    cs.CL cs.LG

    UvA-MT's Participation in the WMT23 General Translation Shared Task

    Authors: Di Wu, Shaomu Tan, David Stap, Ali Araabi, Christof Monz

    Abstract: This paper describes the UvA-MT's submission to the WMT 2023 shared task on general machine translation. We participate in the constrained track in two directions: English <-> Hebrew. In this competition, we show that by using one model to handle bidirectional tasks, as a minimal setting of Multilingual Machine Translation (MMT), it is possible to achieve comparable results with that of traditiona…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: This paper has been accepted by the WMT2023 Conference

  5. arXiv:2305.11550

    cs.CL

    Viewing Knowledge Transfer in Multilingual Machine Translation Through a Representational Lens

    Authors: David Stap, Vlad Niculae, Christof Monz

    Abstract: We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation. To support this claim, we introduce Representational Transfer Potential (RTP), which measures representational similarities between languages. We show that RTP can measure both positive and negative transfer (interference), and find that RTP is strongly co…

    Submitted 4 December, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023 Findings

  6. arXiv:2212.06383

    cs.CL

    Towards a general purpose machine translation system for Sranantongo

    Authors: Just Zwennicker, David Stap

    Abstract: Machine translation for Sranantongo (Sranan, srn), a low-resource Creole language spoken predominantly in Surinam, is virgin territory. In this study we create a general purpose machine translation system for srn. In order to facilitate this research, we introduce the SRNcorpus, a collection of parallel Dutch (nl) to srn and monolingual srn data. We experiment with a wide range of proven machine t…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: Accepted to WiNLP (EMNLP). 2 pages

  7. arXiv:2204.07705

    cs.CL cs.AI

    Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

    Authors: Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, et al. (15 additional authors not shown)

    Abstract: How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting,…

    Submitted 24 October, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: Accepted to EMNLP 2022, 25 pages

  8. arXiv:2005.04909

    cs.CV

    Conditional Image Generation and Manipulation for User-Specified Content

    Authors: David Stap, Maurits Bleeker, Sarah Ibrahimi, Maartje ter Hoeve

    Abstract: In recent years, Generative Adversarial Networks (GANs) have improved steadily towards generating increasingly impressive real-world images. It is useful to steer the image generation process for purposes such as content creation. This can be done by conditioning the model on additional information. However, when conditioning on additional information, there still exists a large set of images that…

    Submitted 11 May, 2020; originally announced May 2020.

    Comments: Accepted to the AI for content creation workshop at CVPR 2020