Showing 1–13 of 13 results for author: Baziotis, C

  1. arXiv:2305.14124  [pdf, other]

    cs.CL

    When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale

    Authors: Christos Baziotis, Biao Zhang, Alexandra Birch, Barry Haddow

    Abstract: Multilingual machine translation (MMT), trained on a mixture of parallel and monolingual data, is key for improving translation in low-resource language pairs. However, the literature offers conflicting results on the performance of different methods of including monolingual data. To resolve this, we examine how denoising autoencoding (DAE) and backtranslation (BT) impact MMT under different data…

    Submitted 30 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to NAACL 2024 (Main conference)

  2. arXiv:2210.04545  [pdf, other]

    cs.CL

    Automatic Evaluation and Analysis of Idioms in Neural Machine Translation

    Authors: Christos Baziotis, Prashant Mathur, Eva Hasler

    Abstract: A major open problem in neural machine translation (NMT) is the translation of idiomatic expressions, such as "under the weather". The meaning of these expressions is not composed by the meaning of their constituent words, and NMT models tend to translate them literally (i.e., word-by-word), which leads to confusing and nonsensical translations. Research on idioms in NMT is limited and obstructed…

    Submitted 10 October, 2022; originally announced October 2022.

  3. arXiv:2205.10835  [pdf, other]

    cs.CL

    Multilingual Machine Translation with Hyper-Adapters

    Authors: Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale

    Abstract: Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to transfer information, and their total number of parameters becomes prohibitively expensive as the number of languages grows. In this work, we overcome these drawbacks…

    Submitted 5 December, 2022; v1 submitted 22 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022 camera-ready version. Code at github.com/cbaziotis/fairseq under the "hyperadapters" branch (see instructions at https://github.com/cbaziotis/fairseq/tree/hyperadapters/examples/adapters)

  4. arXiv:2106.05634  [pdf, other]

    cs.CL

    Exploring Unsupervised Pretraining Objectives for Machine Translation

    Authors: Christos Baziotis, Ivan Titov, Alexandra Birch, Barry Haddow

    Abstract: Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT), by drastically reducing the need for large parallel data. Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder. In this work, we systematically compare masking with alternative objectives…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: Findings of ACL 2021

  5. arXiv:2004.14928  [pdf, other]

    cs.CL

    Language Model Prior for Low-Resource Neural Machine Translation

    Authors: Christos Baziotis, Barry Haddow, Alexandra Birch

    Abstract: The scarcity of large parallel corpora is an important obstacle for neural machine translation. A common solution is to exploit the knowledge of language models (LM) trained on abundant monolingual data. In this work, we propose a novel approach to incorporate a LM as prior in a neural translation model (TM). Specifically, we add a regularization term, which pushes the output distributions of the…

    Submitted 26 October, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

    Comments: Accepted at EMNLP 2020. Camera-ready version

  6. arXiv:1906.03674  [pdf, other]

    cs.LG cs.CL stat.ML

    Attention-based Conditioning Methods for External Knowledge Integration

    Authors: Katerina Margatina, Christos Baziotis, Alexandros Potamianos

    Abstract: In this paper, we present a novel approach for incorporating external knowledge in Recurrent Neural Networks (RNNs). We propose the integration of lexicon features into the self-attention mechanism of RNN-based architectures. This form of conditioning on the attention distribution enforces the contribution of the most salient words for the task at hand. We introduce three methods, namely attentio…

    Submitted 9 June, 2019; originally announced June 2019.

    Comments: ACL 2019

  7. arXiv:1904.03651  [pdf, other]

    cs.CL

    SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression

    Authors: Christos Baziotis, Ion Androutsopoulos, Ioannis Konstas, Alexandros Potamianos

    Abstract: Neural sequence-to-sequence models are currently the dominant approach in several natural language processing tasks, but require large parallel corpora. We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting of two chained encoder-decoder pairs, with words used as a sequence of discrete latent variables. We apply the proposed model to unsupervised abstractive sentence compre…

    Submitted 9 June, 2019; v1 submitted 7 April, 2019; originally announced April 2019.

    Comments: Accepted to NAACL 2019

  8. arXiv:1902.10547  [pdf, other]

    cs.CL cs.LG

    An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models

    Authors: Alexandra Chronopoulou, Christos Baziotis, Alexandros Potamianos

    Abstract: A growing number of state-of-the-art transfer learning methods employ language models pretrained on large generic corpora. In this paper we present a conceptually simple and effective transfer learning approach that addresses the problem of catastrophic forgetting. Specifically, we combine the task-specific optimization function with an auxiliary language model objective, which is adjusted during…

    Submitted 31 May, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: NAACL 2019

  9. arXiv:1811.04133  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Integrating Recurrence Dynamics for Speech Emotion Recognition

    Authors: Efthymios Tzinis, Georgios Paraskevopoulos, Christos Baziotis, Alexandros Potamianos

    Abstract: We investigate the performance of features that can capture nonlinear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER). Reconstruction of the phase space of each speech frame and the computation of its respective Recurrence Plot (RP) reveals complex structures which can be measured by performing Recurrence Quantification Analysis (RQA). These measu…

    Submitted 9 November, 2018; originally announced November 2018.

    Journal ref: Proc. Interspeech 2018, pp. 927-931

  10. arXiv:1809.00717  [pdf, other]

    cs.CL

    NTUA-SLP at IEST 2018: Ensemble of Neural Transfer Methods for Implicit Emotion Classification

    Authors: Alexandra Chronopoulou, Aikaterini Margatina, Christos Baziotis, Alexandros Potamianos

    Abstract: In this paper we present our approach to tackle the Implicit Emotion Shared Task (IEST) organized as part of WASSA 2018 at EMNLP 2018. Given a tweet, from which a certain word has been removed, we are asked to predict the emotion of the missing word. In this work, we experiment with neural Transfer Learning (TL) methods. Our models are based on LSTM networks, augmented with a self-attention mechan…

    Submitted 3 September, 2018; originally announced September 2018.

  11. arXiv:1804.06659  [pdf, other]

    cs.CL

    NTUA-SLP at SemEval-2018 Task 3: Tracking Ironic Tweets using Ensembles of Word and Character Level Attentive RNNs

    Authors: Christos Baziotis, Nikos Athanasiou, Pinelopi Papalampidi, Athanasia Kolovou, Georgios Paraskevopoulos, Nikolaos Ellinas, Alexandros Potamianos

    Abstract: In this paper we present two deep-learning systems that competed at SemEval-2018 Task 3 "Irony detection in English tweets". We design and ensemble two independent models, based on recurrent neural networks (Bi-LSTM), which operate at the word and character level, in order to capture both the semantic and syntactic information in tweets. Our models are augmented with a self-attention mechanism, in…

    Submitted 18 April, 2018; originally announced April 2018.

    Comments: SemEval-2018, Task 3 "Irony detection in English tweets"

  12. arXiv:1804.06658  [pdf, other]

    cs.CL

    NTUA-SLP at SemEval-2018 Task 1: Predicting Affective Content in Tweets with Deep Attentive RNNs and Transfer Learning

    Authors: Christos Baziotis, Nikos Athanasiou, Alexandra Chronopoulou, Athanasia Kolovou, Georgios Paraskevopoulos, Nikolaos Ellinas, Shrikanth Narayanan, Alexandros Potamianos

    Abstract: In this paper we present deep-learning models that we submitted to the SemEval-2018 Task 1 competition: "Affect in Tweets". We participated in all subtasks for English tweets. We propose a Bi-LSTM architecture equipped with a multi-layer self-attention mechanism. The attention mechanism improves the model performance and allows us to identify salient words in tweets, as well as gain insight into the…

    Submitted 18 April, 2018; originally announced April 2018.

    Comments: SemEval-2018, Task 1 "Affect in Tweets"

  13. arXiv:1804.06657  [pdf, other]

    cs.CL

    NTUA-SLP at SemEval-2018 Task 2: Predicting Emojis using RNNs with Context-aware Attention

    Authors: Christos Baziotis, Nikos Athanasiou, Georgios Paraskevopoulos, Nikolaos Ellinas, Athanasia Kolovou, Alexandros Potamianos

    Abstract: In this paper we present a deep-learning model that competed at SemEval-2018 Task 2 "Multilingual Emoji Prediction". We participated in subtask A, in which we are called to predict the most likely associated emoji in English tweets. The proposed architecture relies on a Long Short-Term Memory network, augmented with an attention mechanism that conditions the weight of each word on a "context vec…

    Submitted 18 April, 2018; originally announced April 2018.

    Comments: SemEval-2018, Task 2 "Multilingual Emoji Prediction"