
Showing 1–11 of 11 results for author: Setiawan, H

  1. arXiv:2309.02553  [pdf, other]

    cs.CL cs.AI

    Automating Behavioral Testing in Machine Translation

    Authors: Javier Ferrando, Matthias Sperber, Hendra Setiawan, Dominic Telaar, Saša Hasan

    Abstract: Behavioral testing in NLP allows fine-grained evaluation of systems by examining their linguistic capabilities through the analysis of input-output behavior. Unfortunately, existing work on behavioral testing in Machine Translation (MT) is currently restricted to largely handcrafted tests covering a limited range of capabilities and languages. To address this limitation, we propose to use Large La…

    Submitted 2 November, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

  2. arXiv:2309.01826  [pdf, other]

    cs.CL cs.AI

    One Wide Feedforward is All You Need

    Authors: Telmo Pessoa Pires, António V. Lopes, Yannick Assogba, Hendra Setiawan

    Abstract: The Transformer architecture has two main non-embedding components: Attention and the Feed Forward Network (FFN). Attention captures interdependencies between words regardless of their position, while the FFN non-linearly transforms each input token independently. In this work we explore the role of the FFN, and find that despite taking up a significant fraction of the model's parameters, it is hi…

    Submitted 21 October, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: Accepted at WMT23 (EMNLP 2023)

  3. arXiv:2305.12057  [pdf, other]

    cs.CL

    Accurate Knowledge Distillation with n-best Reranking

    Authors: Hendra Setiawan

    Abstract: We propose utilizing n-best reranking to enhance Sequence-Level Knowledge Distillation (Kim and Rush, 2016) where we extract pseudo-labels for student model's training data from top n-best hypotheses and leverage a diverse set of models with different inductive biases, objective functions or architectures, including some publicly-available large language models, to pick the highest-quality hypothe…

    Submitted 12 June, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

  4. arXiv:2212.09982  [pdf, other]

    cs.CL cs.SD eess.AS

    Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data

    Authors: Mozhdeh Gheini, Tatiana Likhomanenko, Matthias Sperber, Hendra Setiawan

    Abstract: Self-training has been shown to be helpful in addressing data scarcity for many domains, including vision, speech, and language. Specifically, self-training, or pseudo-labeling, labels unsupervised data and adds that to the training pool. In this work, we investigate and use pseudo-labeling for a recently proposed novel setup: joint transcription and translation of speech, which suffers from an ab…

    Submitted 19 December, 2022; originally announced December 2022.

  5. arXiv:2204.05076  [pdf, other]

    cs.CL cs.SD eess.AS

    End-to-End Speech Translation for Code Switched Speech

    Authors: Orion Weller, Matthias Sperber, Telmo Pires, Hendra Setiawan, Christian Gollan, Dominic Telaar, Matthias Paulik

    Abstract: Code switching (CS) refers to the phenomenon of interchangeably using words and phrases from different languages. CS can pose significant accuracy challenges to NLP, due to the often monolingual nature of the underlying systems. In this work, we focus on CS in the context of English/Spanish conversations for the task of speech translation (ST), generating and evaluating both transcript and transla…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: Accepted to Findings of ACL 2022

  6. arXiv:2007.12741  [pdf, other]

    cs.CL

    Consistent Transcription and Translation of Speech

    Authors: Matthias Sperber, Hendra Setiawan, Christian Gollan, Udhyakumar Nallasamy, Matthias Paulik

    Abstract: The conventional paradigm in speech translation starts with a speech recognition step to generate transcripts, followed by a translation step with the automatic transcripts as input. To address various shortcomings of this paradigm, recent work explores end-to-end trainable direct models that translate without transcribing. However, transcripts can be an indispensable output in practical applicati…

    Submitted 28 August, 2020; v1 submitted 24 July, 2020; originally announced July 2020.

    Comments: Accepted at TACL (pre-MIT Press publication version); added dataset link

  7. arXiv:2005.13978  [pdf, ps, other]

    cs.CL

    Variational Neural Machine Translation with Normalizing Flows

    Authors: Hendra Setiawan, Matthias Sperber, Udhay Nallasamy, Matthias Paulik

    Abstract: Variational Neural Machine Translation (VNMT) is an attractive framework for modeling the generation of target translations, conditioned not only on the source sentence but also on some latent random variables. The latent variable modeling may introduce useful statistical dependencies that can improve translation accuracy. Unfortunately, learning informative latent variables is non-trivial, as the…

    Submitted 28 May, 2020; originally announced May 2020.

    Comments: To appear in 2020 Association for Computational Linguistics (ACL) as a short paper

  8. arXiv:2005.10806  [pdf, other]

    physics.ins-det nucl-ex

    The S$π$RIT Time Projection Chamber

    Authors: J. Barney, J. Estee, W. G. Lynch, T. Isobe, G. Jhang, M. Kurata-Nishimura, A. B. McIntosh, T. Murakami, R. Shane, S. Tangwancharoen, M. B. Tsang, G. Cerizza, M. Kaneko, J. W. Lee, C. Y. Tsang, R. Wang, C. Anderson, H. Baba, Z. Chajecki, M. Famiano, R. Hodges-Showalter, B. Hong, T. Kobayashi, P. Lasko, J. Łukasik, et al. (15 additional authors not shown)

    Abstract: The SAMURAI Pion Reconstruction and Ion-Tracker Time Projection Chamber (S$π$RIT TPC) was designed to enable measurements of heavy ion collisions with the SAMURAI spectrometer at the RIKEN Radioactive Isotope Beam Factory and provide constraints on the Equation of State of neutron-rich nuclear matter. The S$π$RIT TPC has a 50.5 cm drift length and an 86.4 cm $\times$ 134.4 cm pad plane with 12,096…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: 12 pages, 20 figures

  9. arXiv:1801.06181  [pdf, ps, other]

    physics.ins-det nucl-ex

    On Determining Dead Layer and Detector Thicknesses for a Position-Sensitive Silicon Detector

    Authors: J. Manfredi, Jenny Lee, W. G. Lynch, C. Y. Niu, M. B. Tsang, C. Anderson, J. Barney, K. W. Brown, Z. Chajecki, K. P. Chan, G. Chen, J. Estee, Z. Li, C. Pruitt, A. M. Rogers, A. Sanetullaev, H. Setiawan, R. Showalter, C. Y. Tsang, J. R. Winkelbauer, Z. Xiao, Z. Xu

    Abstract: In this work, two particular properties of the position-sensitive, thick silicon detectors (known as the "E" detectors) in the High Resolution Array (HiRA) are investigated: the thickness of the dead layer on the front of the detector, and the overall thickness of the detector itself. The dead layer thickness for each E detector in HiRA is extracted using a measurement of alpha particles emitted f…

    Submitted 18 January, 2018; originally announced January 2018.

    Comments: Accepted for publication in Nuclear Instruments and Methods in Physics Research

  10. Pion Production in Rare Isotope Collisions

    Authors: M. B. Tsang, J. Estee, H. Setiawan, W. G. Lynch, J. Barney, M. B. Chen, G. Cerizza, P. Danielewicz, J. Hong, P. Morfouace, R. Shane, S. Tangwancharoen, K. Zhu, T. Isobe, M. Kurata-Nishimura, J. Lukasik, T. Murakami, the SπRIT collaboration

    Abstract: Pion energy spectra are presented for central collisions of neutron-rich 132Sn+124Sn and neutron-deficient 108Sn+112Sn systems using simulations with Boltzmann-Uehling-Uhlenbeck transport model. These calculations, which incorporate isospin-dependent mean field potentials for relevant baryons and mesons, display a sensitivity to the pion spectra that could allow significant constraints on the dens…

    Submitted 12 March, 2017; v1 submitted 20 December, 2016; originally announced December 2016.

    Journal ref: Phys. Rev. C 95, 044614 (2017)

  11. arXiv:1506.00698  [pdf, other]

    cs.CL

    Statistical Machine Translation Features with Multitask Tensor Networks

    Authors: Hendra Setiawan, Zhongqiang Huang, Jacob Devlin, Thomas Lamar, Rabih Zbib, Richard Schwartz, John Makhoul

    Abstract: We present a three-pronged approach to improving Statistical Machine Translation (SMT), building on recent success in the application of neural networks to SMT. First, we propose new features based on neural networks to model various non-local translation phenomena. Second, we augment the architecture of the neural network with tensor layers that capture important higher-order interaction among th…

    Submitted 1 June, 2015; originally announced June 2015.

    Comments: 11 pages (9 content + 2 references), 2 figures, accepted to ACL 2015 as a long paper