
Showing 1–13 of 13 results for author: Di Gangi, M A

  1. arXiv:2012.04964  [pdf, ps, other]

    cs.CL

    On Knowledge Distillation for Direct Speech Translation

    Authors: Marco Gaido, Mattia A. Di Gangi, Matteo Negri, Marco Turchi

    Abstract: Direct speech translation (ST) has shown to be a complex task requiring knowledge transfer from its sub-tasks: automatic speech recognition (ASR) and machine translation (MT). For MT, one of the most promising techniques to transfer knowledge is knowledge distillation. In this paper, we compare the different solutions to distill knowledge in a sequence-to-sequence task like ST. Moreover, we analyz…

    Submitted 9 December, 2020; originally announced December 2020.

    Comments: Accepted at CLiC-IT 2020

  2. arXiv:2009.04707  [pdf, other]

    cs.CL

    On Target Segmentation for Direct Speech Translation

    Authors: Mattia Antonino Di Gangi, Marco Gaido, Matteo Negri, Marco Turchi

    Abstract: Recent studies on direct speech translation show continuous improvements by means of data augmentation techniques and bigger deep learning models. While these methods are helping to close the gap between this new approach and the more traditional cascaded one, there are many incongruities among different studies that make it difficult to assess the state of the art. Surprisingly, one point of disc…

    Submitted 10 September, 2020; originally announced September 2020.

    Comments: 14 pages single column, 4 figures, accepted for presentation at the AMTA2020 research track

  3. arXiv:2008.02270  [pdf, other]

    cs.CL

    Contextualized Translation of Automatically Segmented Speech

    Authors: Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi

    Abstract: Direct speech-to-text translation (ST) models are usually trained on corpora segmented at sentence level, but at inference time they are commonly fed with audio split by a voice activity detector (VAD). Since VAD segmentation is not syntax-informed, the resulting segments do not necessarily correspond to well-formed sentences uttered by the speaker but, most likely, to fragments of one or more sen…

    Submitted 5 August, 2020; originally announced August 2020.

    Comments: Interspeech 2020

  4. arXiv:2006.05754  [pdf, ps, other]

    cs.CL cs.AI cs.SD eess.AS

    Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus

    Authors: Luisa Bentivogli, Beatrice Savoldi, Matteo Negri, Mattia Antonino Di Gangi, Roldano Cattoni, Marco Turchi

    Abstract: Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines. This difficulty is also due to the fact that the training data on which models are built typically reflect the asymmetries of natural languages, gender bias included. Exclusively fed with textual data, machine translation is intrinsically constrained b…

    Submitted 10 June, 2020; originally announced June 2020.

    Comments: 9 pages of content, accepted at ACL 2020

  5. arXiv:2006.02965  [pdf, other]

    cs.CL cs.SD eess.AS

    End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020

    Authors: Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

    Abstract: This paper describes FBK's participation in the IWSLT 2020 offline speech translation (ST) task. The task evaluates systems' ability to translate English TED talks audio into German texts. The test talks are provided in two versions: one contains the data already segmented with automatic tools and the other is the raw data without any segmentation. Participants can decide whether to work on custom…

    Submitted 4 June, 2020; originally announced June 2020.

    Comments: Accepted at IWSLT2020

  6. arXiv:1910.10663  [pdf, ps, other]

    cs.CL eess.AS

    Instance-Based Model Adaptation For Direct Speech Translation

    Authors: Mattia Antonino Di Gangi, Viet-Nhat Nguyen, Matteo Negri, Marco Turchi

    Abstract: Despite recent technology advancements, the effectiveness of neural approaches to end-to-end speech-to-text translation is still limited by the paucity of publicly available training corpora. We tackle this limitation with a method to improve data exploitation and boost the system's performance at inference time. Our approach allows us to customize "on the fly" an existing model to each incoming t…

    Submitted 23 October, 2019; originally announced October 2019.

    Comments: 6 pages, under review at ICASSP 2020

  7. arXiv:1910.10238  [pdf, ps, other]

    cs.CL cs.LG

    Robust Neural Machine Translation for Clean and Noisy Speech Transcripts

    Authors: Mattia Antonino Di Gangi, Robert Enyedi, Alessandra Brusadin, Marcello Federico

    Abstract: Neural machine translation models have shown to achieve high quality when trained and fed with well structured and punctuated input texts. Unfortunately, the latter condition is not met in spoken language translation, where the input is generated by an automatic speech recognition (ASR) system. In this paper, we study how to adapt a strong NMT system to make it robust to typical ASR errors. As in…

    Submitted 22 October, 2019; originally announced October 2019.

    Comments: 6 pages, accepted at IWSLT 2019

  8. arXiv:1910.06753  [pdf, other]

    cs.CL

    On the Importance of Word Boundaries in Character-level Neural Machine Translation

    Authors: Duygu Ataman, Orhan Firat, Mattia A. Di Gangi, Marcello Federico, Alexandra Birch

    Abstract: Neural Machine Translation (NMT) models generally perform translation using a fixed-size lexical vocabulary, which is an important bottleneck on their generalization capability and overall translation quality. The standard approach to overcome this limitation is to segment words into subword units, typically using some external tools with arbitrary heuristics, resulting in vocabulary units not opt…

    Submitted 21 October, 2019; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: To appear at the 3rd Workshop on Neural Generation and Translation (WNGT 2019)

  9. arXiv:1910.03320  [pdf, other]

    cs.CL eess.AS

    One-To-Many Multilingual End-to-end Speech Translation

    Authors: Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

    Abstract: Nowadays, training end-to-end neural models for spoken language translation (SLT) still has to confront with extreme data scarcity conditions. The existing SLT parallel corpora are indeed orders of magnitude smaller than those available for the closely related tasks of automatic speech recognition (ASR) and machine translation (MT), which usually comprise tens of millions of instances. To cope wit…

    Submitted 8 October, 2019; originally announced October 2019.

    Comments: 8 pages, one figure, version accepted at ASRU 2019

  10. Assessing the Tolerance of Neural Machine Translation Systems Against Speech Recognition Errors

    Authors: Nicholas Ruiz, Mattia Antonino Di Gangi, Nicola Bertoldi, Marcello Federico

    Abstract: Machine translation systems are conventionally trained on textual resources that do not model phenomena that occur in spoken language. While the evaluation of neural machine translation systems on textual inputs is actively researched in the literature, little has been discovered about the complexities of translating spoken language data with neural models. We introduce and motivate interesting p…

    Submitted 24 April, 2019; originally announced April 2019.

    Comments: Interspeech 2017

  11. arXiv:1904.04019  [pdf, other]

    cs.CL cs.LG stat.ML

    Effectiveness of Data-Driven Induction of Semantic Spaces and Traditional Classifiers for Sarcasm Detection

    Authors: Mattia Antonino Di Gangi, Giosué Lo Bosco, Giovanni Pilato

    Abstract: Irony and sarcasm are two complex linguistic phenomena that are widely used in everyday language and especially over the social media, but they represent two serious issues for automated text understanding. Many labeled corpora have been extracted from several sources to accomplish this task, and it seems that sarcasm is conveyed in different ways for different domains. Nonetheless, very little wo…

    Submitted 6 December, 2019; v1 submitted 2 April, 2019; originally announced April 2019.

    Comments: 37 pages, 7 figures, version 4

    Journal ref: Natural Language Engineering, 25(2), 257-285 (2019)

  12. arXiv:1810.07652  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018

    Authors: Mattia Antonino Di Gangi, Roberto Dessì, Roldano Cattoni, Matteo Negri, Marco Turchi

    Abstract: This paper describes FBK's submission to the end-to-end English-German speech translation task at IWSLT 2018. Our system relies on a state-of-the-art model based on LSTMs and CNNs, where the CNNs are used to reduce the temporal dimension of the audio input, which is in general much higher than machine translation input. Our model was trained only on the audio-to-text parallel data released for the…

    Submitted 16 October, 2018; originally announced October 2018.

    Comments: 6 pages, 2 figures, system description at the 15th International Workshop on Spoken Language Translation (IWSLT) 2018

  13. arXiv:1805.04185  [pdf, other]

    cs.CL

    Deep Neural Machine Translation with Weakly-Recurrent Units

    Authors: Mattia Antonino Di Gangi, Marcello Federico

    Abstract: Recurrent neural networks (RNNs) have represented for years the state of the art in neural machine translation. Recently, new architectures have been proposed, which can leverage parallel computation on GPUs better than classical RNNs. Faster training and inference combined with different sequence-to-sequence modeling also lead to performance improvements. While the new models completely depart fr…

    Submitted 10 May, 2018; originally announced May 2018.

    Comments: 10 pages, 3 figures, accepted as a conference paper at the 21st Annual Conference of the European Association for Machine Translation (EAMT) 2018