Showing 1–38 of 38 results for author: Fraser, A

Searching in archive cs.
  1. arXiv:2406.01771  [pdf, other]

    cs.CL

    LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback

    Authors: Wen Lai, Mohsen Mesgar, Alexander Fraser

    Abstract: To democratize large language models (LLMs) to most natural languages, it is imperative to make these models capable of understanding and generating texts in many languages, in particular low-resource ones. While recent multilingual LLMs demonstrate remarkable performance in such capabilities, these LLMs still support a limited number of human languages due to the lack of training data for low-res…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted to Findings of ACL 2024. The code, datasets, and models are publicly available at https://github.com/boschresearch/ACL24-MLLM

  2. arXiv:2405.18308  [pdf, other]

    cs.CL

    Joint Lemmatization and Morphological Tagging with LEMMING

    Authors: Thomas Muller, Ryan Cotterell, Alexander Fraser, Hinrich Schütze

    Abstract: We present LEMMING, a modular log-linear model that jointly models lemmatization and tagging and supports the integration of arbitrary global features. It is trainable on corpora annotated with gold standard tags and lemmata and does not rely on morphological dictionaries or analyzers. LEMMING sets the new state of the art in token-based statistical lemmatization on six languages; e.g., for Czech…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: EMNLP 2015; Honorable Mention for Best Short Paper

  3. Labeled Morphological Segmentation with Semi-Markov Models

    Authors: Ryan Cotterell, Thomas Müller, Alexander Fraser, Hinrich Schütze

    Abstract: We present labeled morphological segmentation, an alternative view of morphological processing that unifies several tasks. From an annotation standpoint, we additionally introduce a new hierarchy of morphotactic tagsets. Finally, we develop CHIPMUNK, a discriminative morphological segmentation system that, contrary to previous work, explicitly models morphotactics. We show that CHIPMUNK…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: CoNLL 2015

  4. arXiv:2404.06228  [pdf, other]

    cs.CL

    Understanding Cross-Lingual Alignment -- A Survey

    Authors: Katharina Hämmerl, Jindřich Libovický, Alexander Fraser

    Abstract: Cross-lingual alignment, the meaningful similarity of representations across languages in multilingual language models, has been an active field of research in recent years. We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field. We present different understandings of cross-lingual alignment and…

    Submitted 11 June, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: Camera-ready version, ACL Findings 2024

  5. arXiv:2401.16092  [pdf, other]

    cs.CL cs.CY cs.LG

    Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You

    Authors: Felix Friedrich, Katharina Hämmerl, Patrick Schramowski, Manuel Brack, Jindrich Libovicky, Kristian Kersting, Alexander Fraser

    Abstract: Text-to-image generation models have recently achieved astonishing results in image quality, flexibility, and text alignment, and are consequently employed in a fast-growing number of applications. Through improvements in multilingual abilities, a larger community now has access to this technology. However, our results show that multilingual models suffer from significant gender biases just as mon…

    Submitted 15 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  6. arXiv:2311.12489  [pdf, other]

    cs.CL

    Multilingual Word Embeddings for Low-Resource Languages using Anchors and a Chain of Related Languages

    Authors: Viktor Hangya, Silvia Severini, Radoslav Ralev, Alexander Fraser, Hinrich Schütze

    Abstract: Very low-resource languages, having only a few million tokens worth of data, are not well-supported by multilingual NLP approaches due to poor quality cross-lingual word representations. Recent work showed that good cross-lingual performance can be achieved if a source language is related to the low-resource target language. However, not all language pairs are related. In this paper, we propose to…

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: Accepted at the MRL 2023 workshop

  7. arXiv:2311.08538  [pdf, other]

    cs.CL

    Extending Multilingual Machine Translation through Imitation Learning

    Authors: Wen Lai, Viktor Hangya, Alexander Fraser

    Abstract: Despite the growing variety of languages supported by existing multilingual neural machine translation (MNMT) models, most of the world's languages are still being left behind. We aim to extend large-scale MNMT models to a new language, allowing for translation between the newly added and all of the already supported languages in a challenging scenario: using only a parallel corpus between the new…

    Submitted 14 November, 2023; originally announced November 2023.

  8. arXiv:2306.00458  [pdf, other]

    cs.CL

    Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity

    Authors: Katharina Hämmerl, Alina Fastowski, Jindřich Libovický, Alexander Fraser

    Abstract: Previous work has shown that the representations output by contextual language models are more anisotropic than static type embeddings, and typically display outlier dimensions. This seems to be true for both monolingual and multilingual models, although much less work has been done on the multilingual context. Why these outliers occur and how they affect the representations is still an active are…

    Submitted 7 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: To appear in ACL Findings 2023. Fixed a citation in this version

  9. arXiv:2305.17182  [pdf, other]

    cs.CL cs.AI

    On the Copying Problem of Unsupervised NMT: A Training Schedule with a Language Discriminator Loss

    Authors: Yihong Liu, Alexandra Chronopoulou, Hinrich Schütze, Alexander Fraser

    Abstract: Although unsupervised neural machine translation (UNMT) has achieved success in many language pairs, the copying problem, i.e., directly copying some parts of the input sentence as the translation, is common among distant language pairs, especially when low-resource languages are involved. We find this issue is closely related to an unexpected copying behavior during online back-translation (BT).…

    Submitted 4 June, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: IWSLT 2023

  10. arXiv:2305.14081  [pdf, other]

    cs.CL

    How to Solve Few-Shot Abusive Content Detection Using the Data We Actually Have

    Authors: Viktor Hangya, Alexander Fraser

    Abstract: Due to the broad range of social media platforms, the requirements of abusive language detection systems are varied and ever-changing. Already a large set of annotated corpora with different properties and label sets were created, such as hate or misogyny detection, but the form and targets of abusive speech are constantly evolving. Since the annotation of new corpora is expensive, in this work w…

    Submitted 6 May, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at LREC-COLING 2024

  11. arXiv:2305.12786  [pdf, other]

    cs.CL

    Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation

    Authors: Wen Lai, Alexandra Chronopoulou, Alexander Fraser

    Abstract: Despite advances in multilingual neural machine translation (MNMT), we argue that there are still two major challenges in this area: data imbalance and representation degeneration. The data imbalance problem refers to the imbalance in the amount of parallel corpora for all language pairs, especially for long-tail languages (i.e., very low-resource languages). The representation degeneration proble…

    Submitted 24 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of EMNLP 2023, add statistical significance tests. code available at https://github.com/lavine-lmu/Bi-ACL

  12. arXiv:2302.07027  [pdf, other]

    cs.CL

    AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models

    Authors: Alexandra Chronopoulou, Matthew E. Peters, Alexander Fraser, Jesse Dodge

    Abstract: Pretrained language models (PLMs) are trained on massive corpora, but often need to specialize to specific domains. A parameter-efficient adaptation method suggests training an adapter for each domain on the task of language modeling. This leads to good in-domain scores but can be impractical for domain- or resource-restricted settings. A solution is to use a related-domain adapter for the novel d…

    Submitted 28 March, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: Accepted at EACL 2023; camera-ready version; fixed typo in related work

  13. arXiv:2211.07733  [pdf, other]

    cs.CL

    Speaking Multiple Languages Affects the Moral Bias of Language Models

    Authors: Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Constantin A. Rothkopf, Alexander Fraser, Kristian Kersting

    Abstract: Pre-trained multilingual language models (PMLMs) are commonly used when dealing with data from multiple languages and cross-lingual transfer. However, PMLMs are trained on varying amounts of data for each language. In practice this means their performance is often much better on English than many other languages. We explore to what extent this also applies to moral norms. Do the models capture mor…

    Submitted 1 June, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: To appear in ACL Findings 2023

  14. arXiv:2210.11912  [pdf, other]

    cs.CL

    $m^4Adapter$: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter

    Authors: Wen Lai, Alexandra Chronopoulou, Alexander Fraser

    Abstract: Multilingual neural machine translation models (MNMT) yield state-of-the-art performance when evaluated on data from a domain and language pair seen at training time. However, when a MNMT model is used to translate under domain shift or to a new language pair, performance drops dramatically. We consider a very challenging scenario: adapting the MNMT model both to a new domain and to a new language…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: Accepted to Findings of EMNLP 2022

  15. arXiv:2210.04675  [pdf, other]

    cs.CL cs.AI

    A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing

    Authors: Sophie Henning, William Beluch, Alexander Fraser, Annemarie Friedrich

    Abstract: Many natural language processing (NLP) tasks are naturally imbalanced, as some target categories occur much more frequently than others in the real world. In such scenarios, current NLP models still tend to perform poorly on less frequent classes. Addressing class imbalance in NLP is an active research topic, yet, finding a good approach for a particular task and imbalance scenario is difficult.…

    Submitted 22 February, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: Camera-ready version for EACL 2023

  16. arXiv:2209.15236  [pdf, other]

    cs.CL

    Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation

    Authors: Alexandra Chronopoulou, Dario Stojanovski, Alexander Fraser

    Abstract: Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks. Self-supervised pretrained models are often fine-tuned on parallel data from one or multiple language pairs for machine translation. Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be p…

    Submitted 29 March, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: LoResMT (@EACL 2023) camera-ready version

  17. arXiv:2205.15713  [pdf, other]

    cs.CL

    Don't Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings

    Authors: Silvia Severini, Viktor Hangya, Masoud Jalili Sabet, Alexander Fraser, Hinrich Schütze

    Abstract: Bilingual Word Embeddings (BWEs) are one of the cornerstones of cross-lingual transfer of NLP models. They can be built using only monolingual corpora without supervision leading to numerous works focusing on unsupervised BWEs. However, most of the current approaches to build unsupervised BWEs do not compare their results with methods based on easy-to-access cross-lingual signals. In this paper, w…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: BUCC@LREC 2022

  18. arXiv:2205.06857  [pdf, ps, other]

    cs.DS cs.GT

    Parameterized Complexity of Gerrymandering

    Authors: Andrew Fraser, Brian Lavallee, Blair D. Sullivan

    Abstract: In a representative democracy, the electoral process involves partitioning geographical space into districts which each elect a single representative. These representatives craft and vote on legislation, incentivizing political parties to win as many districts as possible (ideally a plurality). Gerrymandering is the process by which district boundaries are manipulated to the advantage of a desired…

    Submitted 7 December, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

  19. arXiv:2203.14144  [pdf, other]

    cs.DB cs.CL

    Demonstrating CAT: Synthesizing Data-Aware Conversational Agents for Transactional Databases

    Authors: Marius Gassen, Benjamin Hättasch, Benjamin Hilprecht, Nadja Geisler, Alexander Fraser, Carsten Binnig

    Abstract: Databases for OLTP are often the backbone for applications such as hotel room or cinema ticket booking applications. However, developing a conversational agent (i.e., a chatbot-like interface) to allow end-users to interact with an application using natural language requires both immense amounts of training data and NLP expertise. This motivates CAT, which can be used to easily create conversation…

    Submitted 26 March, 2022; originally announced March 2022.

    Comments: Submitted as demonstration proposal to VLDB 2022

  20. arXiv:2203.13550  [pdf, ps, other]

    cs.CL

    Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies

    Authors: Marion Weller-Di Marco, Matthias Huck, Alexander Fraser

    Abstract: Morphologically rich languages pose difficulties to machine translation. Machine translation engines that rely on statistical learning from parallel training data, such as state-of-the-art neural systems, face challenges especially with rich morphology on the output language side. Key challenges of rich target-side morphology in data-driven machine translation include: (1) A large amount of differ…

    Submitted 25 March, 2022; originally announced March 2022.

  21. arXiv:2203.09904  [pdf, ps, other]

    cs.CL

    Do Multilingual Language Models Capture Differing Moral Norms?

    Authors: Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Alexander Fraser, Kristian Kersting

    Abstract: Massively multilingual sentence representations are trained on large corpora of uncurated data, with a very imbalanced proportion of languages included in the training. This may cause the models to grasp cultural values including moral judgments from the high-resource languages and impose them on the low-resource languages. The lack of data in certain languages can also lead to developing random a…

    Submitted 18 March, 2022; originally announced March 2022.

  22. arXiv:2203.09326  [pdf, other]

    cs.CL

    Combining Static and Contextualised Multilingual Embeddings

    Authors: Katharina Hämmerl, Jindřich Libovický, Alexander Fraser

    Abstract: Static and contextual multilingual embeddings have complementary strengths. Static embeddings, while less expressive than contextual language models, can be more straightforwardly aligned across multiple languages. We combine the strengths of static and contextual models to improve multilingual representations. We extract static embeddings for 40 languages from XLM-R, validate those embeddings wit…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted to Findings of ACL 2022

  23. arXiv:2201.05922  [pdf, ps, other]

    cs.CL

    Addressing the Challenges of Cross-Lingual Hate Speech Detection

    Authors: Irina Bigoulaeva, Viktor Hangya, Iryna Gurevych, Alexander Fraser

    Abstract: The goal of hate speech detection is to filter negative online content aiming at certain groups of people. Due to the easy accessibility of social media platforms it is crucial to protect everyone which requires building hate speech detection systems for a wide range of languages. However, the available labeled hate speech datasets are limited making it problematic to build systems for many langua…

    Submitted 15 January, 2022; originally announced January 2022.

  24. arXiv:2112.08288  [pdf, other]

    cs.CL

    Improving Both Domain Robustness and Domain Adaptability in Machine Translation

    Authors: Wen Lai, Jindřich Libovický, Alexander Fraser

    Abstract: We consider two problems of NMT domain adaptation using meta-learning. First, we want to reach domain robustness, i.e., we want to reach high quality on both domains seen in the training data and unseen domains. Second, we want our systems to be adaptive, i.e., making it possible to finetune systems with just hundreds of in-domain parallel sentences. We study the domain adaptability of meta-learni…

    Submitted 4 October, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Accepted to COLING 2022

  25. arXiv:2110.08191  [pdf, other]

    cs.CL

    Why don't people use character-level machine translation?

    Authors: Jindřich Libovický, Helmut Schmid, Alexander Fraser

    Abstract: We present a literature and empirical survey that critically assesses the state of the art in character-level modeling for machine translation (MT). Despite evidence in the literature that character-level systems are comparable with subword systems, they are virtually never used in competitive setups in WMT competitions. We empirically show that even with recent modeling innovations in character-l…

    Submitted 27 April, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: 16 pages, 4 figures; Findings of ACL 2022, camera-ready

  26. arXiv:2104.08388  [pdf, other]

    cs.CL cs.LG

    Neural String Edit Distance

    Authors: Jindřich Libovický, Alexander Fraser

    Abstract: We propose the neural string edit distance model for string-pair matching and string transduction based on learnable string edit distance. We modify the original expectation-maximization learned edit distance algorithm into a differentiable loss function, allowing us to integrate it into a neural network providing a contextual representation of the input. We evaluate on cognate detection, translit…

    Submitted 27 April, 2022; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: 14 pages, 5 figures; Workshop on Structured Prediction for NLP @ACL 2022, camera-ready

  27. arXiv:2103.10531  [pdf, other]

    cs.CL

    Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation

    Authors: Alexandra Chronopoulou, Dario Stojanovski, Alexander Fraser

    Abstract: Successful methods for unsupervised neural machine translation (UNMT) employ crosslingual pretraining via self-supervision, often in the form of a masked language modeling or a sequence generation task, which requires the model to align the lexical- and high-level representations of the two languages. While cross-lingual pretraining works for similar languages with abundant corpora, it performs po…

    Submitted 14 April, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

    Comments: Accepted at NAACL 2021

  28. arXiv:2010.13192  [pdf, other]

    cs.CL

    The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task

    Authors: Alexandra Chronopoulou, Dario Stojanovski, Viktor Hangya, Alexander Fraser

    Abstract: This paper describes the submission of LMU Munich to the WMT 2020 unsupervised shared task, in two language directions, German<->Upper Sorbian. Our core unsupervised neural machine translation (UNMT) system follows the strategy of Chronopoulou et al. (2020), using a monolingual pretrained language generation model (on German) and fine-tuning it on both German and Upper Sorbian, before initializing…

    Submitted 25 October, 2020; originally announced October 2020.

    Comments: WMT Unsupervised Shared Task 2020

  29. arXiv:2010.12627  [pdf, other]

    cs.CL

    Anchor-based Bilingual Word Embeddings for Low-Resource Languages

    Authors: Tobias Eder, Viktor Hangya, Alexander Fraser

    Abstract: Good quality monolingual word embeddings (MWEs) can be built for languages which have large amounts of unlabeled text. MWEs can be aligned to bilingual spaces using only a few thousand word translation pairs. For low resource languages training MWEs monolingually results in MWEs of poor quality, and thus poor bilingual word embeddings (BWEs) as well. This paper proposes a new approach for building…

    Submitted 27 July, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

  30. arXiv:2009.07610  [pdf, other]

    cs.CL

    Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT

    Authors: Alexandra Chronopoulou, Dario Stojanovski, Alexander Fraser

    Abstract: Using a language model (LM) pretrained on two languages with large monolingual data in order to initialize an unsupervised neural machine translation (UNMT) system yields state-of-the-art results. When limited data is available for one language, however, this method leads to poor translations. We present an effective approach that reuses an LM that is pretrained only on the high-resource language.…

    Submitted 6 October, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: EMNLP 2020, main conference

  31. arXiv:2007.05234  [pdf, other]

    cs.CL cs.AI

    Pragmatic information in translation: a corpus-based study of tense and mood in English and German

    Authors: Anita Ramm, Ekaterina Lapshinova-Koltunski, Alexander Fraser

    Abstract: Grammatical tense and mood are important linguistic phenomena to consider in natural language processing (NLP) research. We consider the correspondence between English and German tense and mood in translation. Human translators do not find this correspondence easy, and as we will show through careful analysis, there are no simplistic ways to map tense and mood from one language to another. Our obs…

    Submitted 10 July, 2020; originally announced July 2020.

    Comments: Technical Report of CIS, LMU Munich. September 19th, 2019

    ACM Class: I.1.2

  32. arXiv:2004.14927  [pdf, other]

    cs.CL

    Addressing Zero-Resource Domains Using Document-Level Context in Neural Machine Translation

    Authors: Dario Stojanovski, Alexander Fraser

    Abstract: Achieving satisfying performance in machine translation on domains for which there is no training data is challenging. Traditional supervised domain adaptation is not suitable for addressing such zero-resource domains because it relies on in-domain parallel data. We show that when in-domain parallel data is not available, access to document-level context enables better capturing of domain generali…

    Submitted 19 April, 2021; v1 submitted 30 April, 2020; originally announced April 2020.

  33. arXiv:2004.14280  [pdf, other]

    cs.CL

    Towards Reasonably-Sized Character-Level Transformer NMT by Finetuning Subword Systems

    Authors: Jindřich Libovický, Alexander Fraser

    Abstract: Applying the Transformer architecture on the character level usually requires very deep architectures that are difficult and slow to train. These problems can be partially overcome by incorporating a segmentation into tokens in the model. We show that by initially training a subword model and then finetuning it on characters, we can obtain a neural machine translation model that works at the chara…

    Submitted 29 September, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: 8 pages, 1 figure; Accepted to EMNLP 2020

  34. arXiv:2004.05160  [pdf, other]

    cs.CL

    On the Language Neutrality of Pre-trained Multilingual Representations

    Authors: Jindřich Libovický, Rudolf Rosa, Alexander Fraser

    Abstract: Multilingual contextual embeddings, such as multilingual BERT and XLM-RoBERTa, have proved useful for many multi-lingual tasks. Previous work probed the cross-linguality of the representations indirectly using zero-shot transfer learning on morphological and syntactic tasks. We instead investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical se…

    Submitted 29 September, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: 12 pages, 3 figures. arXiv admin note: text overlap with arXiv:1911.03310. Accepted to Findings of EMNLP 2020

  35. arXiv:1911.03310  [pdf, other]

    cs.CL

    How Language-Neutral is Multilingual BERT?

    Authors: Jindřich Libovický, Rudolf Rosa, Alexander Fraser

    Abstract: Multilingual BERT (mBERT) provides sentence representations for 104 languages, which are useful for many multi-lingual tasks. Previous work probed the cross-linguality of mBERT using zero-shot transfer learning on morphological and syntactic tasks. We instead focus on the semantic properties of mBERT. We show that mBERT representations can be split into a language-specific component and a language…

    Submitted 8 November, 2019; originally announced November 2019.

    Comments: 6 pages, 3 figures

  36. arXiv:1801.06807  [pdf, other]

    cs.CL

    Embedding Learning Through Multilingual Concept Induction

    Authors: Philipp Dufter, Mengjie Zhao, Martin Schmitt, Alexander Fraser, Hinrich Schütze

    Abstract: We present a new method for estimating vector space representations of words: embedding learning by concept induction. We test this method on a highly parallel corpus and learn semantic representations of words in 1259 different languages in a single common space. An extensive experimental evaluation on crosslingual word similarity and sentiment analysis indicates that concept-based multilingual e…

    Submitted 27 June, 2018; v1 submitted 21 January, 2018; originally announced January 2018.

    Comments: ACL 2018

  37. arXiv:1707.06012  [pdf, other]

    cs.CL

    Modeling Target-Side Inflection in Neural Machine Translation

    Authors: Aleš Tamchyna, Marion Weller-Di Marco, Alexander Fraser

    Abstract: NMT systems have problems with large vocabulary sizes. Byte-pair encoding (BPE) is a popular approach to solving this problem, but while BPE allows the system to generate any target-side word, it does not enable effective generalization over the rich vocabulary in morphologically rich languages with strong inflectional phenomena. We introduce a simple approach to overcome this problem by training…

    Submitted 5 September, 2017; v1 submitted 19 July, 2017; originally announced July 2017.

    Comments: Accepted as a research paper at WMT17. (Updated version with corrected references.)

  38. arXiv:1607.01149  [pdf, other]

    cs.CL

    Target-Side Context for Discriminative Models in Statistical Machine Translation

    Authors: Aleš Tamchyna, Alexander Fraser, Ondřej Bojar, Marcin Junczys-Dowmunt

    Abstract: Discriminative translation models utilizing source context have been shown to help statistical machine translation performance. We propose a novel extension of this work using target context information. Surprisingly, we show that this model can be efficiently integrated directly in the decoding process. Our approach scales to large training data sizes and results in consistent improvements in tra…

    Submitted 5 July, 2016; originally announced July 2016.

    Comments: Accepted as a long paper for ACL 2016