Showing 1–10 of 10 results for author: Lam, T K

  1. arXiv:2402.19333  [pdf, other]

    cs.CL cs.SD eess.AS

    Compact Speech Translation Models via Discrete Speech Units Pretraining

    Authors: Tsz Kin Lam, Alexandra Birch, Barry Haddow

    Abstract: We propose a pretraining method that uses a Self-Supervised Speech (SSS) model to create more compact Speech-to-text Translation models. In contrast to using the SSS model for initialization, our method is better suited to memory-constrained scenarios such as on-device deployment. Our method is based on Discrete Speech Units (DSU) extracted from the SSS model. In the first step, our method pretrains two smal…

    Submitted 26 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: 11 pages, accepted at IWSLT 2024

  2. arXiv:2402.00632  [pdf, other]

    cs.CL

    Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases

    Authors: Giulio Zhou, Tsz Kin Lam, Alexandra Birch, Barry Haddow

    Abstract: Speech-to-Text Translation (S2TT) has typically been addressed with cascade systems, where speech recognition systems generate a transcription that is subsequently passed to a translation model. While there has been a growing interest in developing direct speech translation systems to avoid propagating errors and losing non-verbal content, prior work in direct S2TT has struggled to conclusively es…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted at Findings of EACL 2024

  3. arXiv:2302.03839  [pdf, other]

    eess.IV cs.CV cs.LG

    Futuristic Variations and Analysis in Fundus Images Corresponding to Biological Traits

    Authors: Muhammad Hassan, Hao Zhang, Ahmed Fateh Ameen, Home Wu Zeng, Shuye Ma, Wen Liang, Dingqi Shang, Jiaming Ding, Ziheng Zhan, Tsz Kwan Lam, Ming Xu, Qiming Huang, Dongmei Wu, Can Yang Zhang, Zhou You, Awiwu Ain, Pei Wu Qin

    Abstract: Fundus images capture the rear of the eye and have been studied for disease identification, classification, segmentation, generation, and biological trait association using handcrafted, conventional, and deep learning methods. In biological trait estimation, most studies have addressed age prediction and gender classification, with convincing results. However, the curr…

    Submitted 7 February, 2023; originally announced February 2023.

    Comments: 10 pages, 4 figures, 3 tables

  4. Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation

    Authors: Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler

    Abstract: Data augmentation is a technique to generate new training data based on existing data. We evaluate the simple and cost-effective method of concatenating the original data examples to build new training instances. Continued training with such augmented data is able to improve off-the-shelf Transformer and Conformer models that were optimized on the original data only. We demonstrate considerable im…

    Submitted 14 April, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: Accepted at ICASSP 2023

  5. arXiv:2210.13281  [pdf, other]

    cs.CL

    Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation

    Authors: Tsz Kin Lam, Eva Hasler, Felix Hieber

    Abstract: Customer feedback can be an important signal for improving commercial machine translation systems. One solution for fixing specific translation errors is to remove the related erroneous training instances followed by re-training of the machine translation system, which we refer to as instance-specific data filtering. Influence functions (IF) have been shown to be effective in finding such relevant…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted at WMT 2022

  6. Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation

    Authors: Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler

    Abstract: End-to-end speech translation relies on data that pair source-language speech inputs with corresponding translations into a target language. Such data are notoriously scarce, making synthetic data augmentation by back-translation or knowledge distillation a necessary ingredient of end-to-end training. In this paper, we present a novel approach to data augmentation that leverages audio alignments,…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted at ACL 2022

  7. On-the-Fly Aligned Data Augmentation for Sequence-to-Sequence ASR

    Authors: Tsz Kin Lam, Mayumi Ohta, Shigehiko Schamoni, Stefan Riezler

    Abstract: We propose an on-the-fly data augmentation method for automatic speech recognition (ASR) that uses alignment information to generate effective training samples. Our method, called Aligned Data Augmentation (ADA) for ASR, replaces transcribed tokens and the speech representations in an aligned manner to generate previously unseen training pairs. The speech representations are sampled from an audio…

    Submitted 9 June, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Comments: Accepted at INTERSPEECH 2021

  8. Cascaded Models With Cyclic Feedback For Direct Speech Translation

    Authors: Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler

    Abstract: Direct speech translation describes a scenario where only speech inputs and corresponding translations are available. Such data are notoriously limited. We present a technique that allows cascades of automatic speech recognition (ASR) and machine translation (MT) to exploit in-domain direct speech translation data in addition to out-of-domain MT and ASR data. After pre-training MT and ASR, we use…

    Submitted 11 February, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: Accepted at ICASSP 2021

  9. arXiv:1907.02326  [pdf, other]

    cs.CL

    Interactive-Predictive Neural Machine Translation through Reinforcement and Imitation

    Authors: Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler

    Abstract: We propose an interactive-predictive neural machine translation framework for easier model personalization using reinforcement and imitation learning. During the interactive translation process, the user is asked for feedback on uncertain locations identified by the system. Responses are weak feedback in the form of "keep" and "delete" edits, and expert demonstrations in the form of "substitute" e…

    Submitted 5 July, 2019; v1 submitted 4 July, 2019; originally announced July 2019.

    Comments: Machine Translation Summit 2019 (MTSUMMIT XVII), Dublin, Ireland

  10. arXiv:1805.01553  [pdf, other]

    cs.CL stat.ML

    A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation

    Authors: Tsz Kin Lam, Julia Kreutzer, Stefan Riezler

    Abstract: We present an approach to interactive-predictive neural machine translation that attempts to reduce human effort from three directions: Firstly, instead of requiring humans to select, correct, or delete segments, we employ the idea of learning from human reinforcements in the form of judgments on the quality of partial translations. Secondly, human effort is further reduced by using the entropy of wor…

    Submitted 5 June, 2018; v1 submitted 3 May, 2018; originally announced May 2018.

    Comments: Published at EAMT 2018; Updated algorithm