Skip to main content

Showing 1–4 of 4 results for author: Nguyen, L T

Searching in archive eess. Search in all archives.
.
  1. arXiv:2406.02555  [pdf, ps, other

    eess.AS cs.CL

    PhoWhisper: Automatic Speech Recognition for Vietnamese

    Authors: Thanh-Thien Le, Linh The Nguyen, Dat Quoc Nguyen

    Abstract: We introduce PhoWhisper in five versions for Vietnamese automatic speech recognition. PhoWhisper's robustness is achieved through fine-tuning the Whisper model on an 844-hour dataset that encompasses diverse Vietnamese accents. Our experimental study demonstrates state-of-the-art performances of PhoWhisper on benchmark Vietnamese ASR datasets. We have open-sourced PhoWhisper at: https://github.com… ▽ More

    Submitted 27 March, 2024; originally announced June 2024.

    Comments: Accepted to ICLR 2024 Tiny Papers Track

  2. arXiv:2402.01198  [pdf, other

    cs.IT eess.SP

    Physical Layer Location Privacy in SIMO Communication Using Fake Paths Injection

    Authors: Trong Duy Tran, Maxime Ferreira Da Costa, Linh Trung Nguyen

    Abstract: Fake path injection is an emerging paradigm for inducing privacy over wireless networks. In this paper, fake paths are injected by the transmitter into a SIMO multipath communication channel to preserve her physical location from an eavesdropper. A novel statistical privacy metric is defined as the ratio between the largest (resp. smallest) eigenvalues of Bob's (resp. Eve's) Cramér-Rao lower bound… ▽ More

    Submitted 2 February, 2024; originally announced February 2024.

  3. arXiv:2305.19709  [pdf, other

    cs.CL cs.SD eess.AS

    XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech

    Authors: Linh The Nguyen, Thinh Pham, Dat Quoc Nguyen

    Abstract: We present XPhoneBERT, the first multilingual model pre-trained to learn phoneme representations for the downstream text-to-speech (TTS) task. Our XPhoneBERT has the same model architecture as BERT-base, trained using the RoBERTa pre-training approach on 330M phoneme-level sentences from nearly 100 languages and locales. Experimental results show that employing XPhoneBERT as an input phoneme encod… ▽ More

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: In Proceedings of INTERSPEECH 2023 (to appear)

  4. arXiv:2109.03219  [pdf, other

    cs.SD cs.LG cs.NE eess.AS

    Fruit-CoV: An Efficient Vision-based Framework for Speedy Detection and Diagnosis of SARS-CoV-2 Infections Through Recorded Cough Sounds

    Authors: Long H. Nguyen, Nhat Truong Pham, Van Huong Do, Liu Tai Nguyen, Thanh Tin Nguyen, Van Dung Do, Hai Nguyen, Ngoc Duy Nguyen

    Abstract: SARS-CoV-2 is colloquially known as COVID-19 that had an initial outbreak in December 2019. The deadly virus has spread across the world, taking part in the global pandemic disease since March 2020. In addition, a recent variant of SARS-CoV-2 named Delta is intractably contagious and responsible for more than four million deaths over the world. Therefore, it is vital to possess a self-testing serv… ▽ More

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: 4 pages