Showing 1–15 of 15 results for author: Pham, N

  1. arXiv:2406.19077  [pdf, other]

    eess.SY

    Parameter Dependent Chen--Fliess Series and Their Nonrecursive Interconnections

    Authors: W. Steven Gray, Natalie Pham

    Abstract: A class of parameter dependent Chen--Fliess series is introduced where the series coefficients are taken from a noncommutative ring of multivariable differential operators. Such series are shown in the linear case to represent formal solutions to Cauchy initial value problems for nonhomogeneous PDEs and thus are useful for characterizing the input-output maps of distributed control systems. It is…

    Submitted 27 June, 2024; originally announced June 2024.

    MSC Class: 41A58; 93C10; 35C10

  2. arXiv:2401.05425  [pdf, other]

    eess.SP cs.LG

    An Unobtrusive and Lightweight Ear-worn System for Continuous Epileptic Seizure Detection

    Authors: Abdul Aziz, Nhat Pham, Neel Vora, Cody Reynolds, Jaime Lehnen, Pooja Venkatesh, Zhuoran Yao, Jay Harvey, Tam Vu, Kan Ding, Phuc Nguyen

    Abstract: Epilepsy is one of the most common neurological diseases globally, affecting around 50 million people worldwide. Fortunately, up to 70 percent of people with epilepsy could live seizure-free if properly diagnosed and treated, and a reliable technique to monitor the onset of seizures could improve the quality of life of patients who are constantly facing the fear of random seizure attacks. The scal…

    Submitted 1 January, 2024; originally announced January 2024.

  3. arXiv:2311.11096  [pdf, other]

    eess.IV cs.CV

    On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation

    Authors: Duy Minh Ho Nguyen, Tan Ngoc Pham, Nghiem Tuong Diep, Nghi Quoc Phan, Quang Pham, Vinh Tong, Binh T. Nguyen, Ngan Hoang Le, Nhat Ho, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Constructing a robust model that can effectively generalize to test samples under distribution shifts remains a significant challenge in the field of medical imaging. The foundational models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach. It showcases impressive learning abilities across different tasks with the need for…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS) 2023, Workshop on robustness of zero/few-shot learning in foundation models

  4. arXiv:2211.11703   

    cs.CL cs.SD eess.AS

    Towards continually learning new languages

    Authors: Ngoc-Quan Pham, Jan Niehues, Alexander Waibel

    Abstract: Multilingual speech recognition with neural networks is often implemented with batch-learning, when all of the languages are available before training. An ability to add new languages after the prior training sessions can be economically beneficial, but the main challenge is catastrophic forgetting. In this work, we combine the qualities of weight factorization and elastic weight consolidation in…

    Submitted 1 March, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Work in progress

  5. arXiv:2211.02592  [pdf]

    eess.SY

    A Large-Scale Study of a Sleep Tracking and Improving Device with Closed-loop and Personalized Real-time Acoustic Stimulation

    Authors: Anh Nguyen, Galen Pogoncheff, Ban Xuan Dong, Nam Bui, Hoang Truong, Nhat Pham, Linh Nguyen, Hoang Huu Nguyen, Sy Duong-Quy, Sangtae Ha, Tam Vu

    Abstract: Various intervention therapies ranging from pharmaceutical to hi-tech tailored solutions have been available to treat difficulty in falling asleep commonly caused by insomnia in modern life. However, current techniques largely remain ill-suited, ineffective, and unreliable due to their lack of precise real-time sleep tracking, in-time feedback on the therapies, an ability to keep people asleep dur…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: 33 pages, 8 figures

  6. arXiv:2205.12304  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Adaptive multilingual speech recognition with pretrained models

    Authors: Ngoc-Quan Pham, Alex Waibel, Jan Niehues

    Abstract: Multilingual speech recognition with supervised learning has achieved great results as reflected in recent research. With the development of pretraining methods on audio and text data, it is imperative to transfer the knowledge from unsupervised multilingual models to facilitate recognition, especially in many languages with limited data. Our work investigated the effectiveness of using two pretra…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Submitted to INTERSPEECH 2022

  7. arXiv:2109.09026  [pdf, other]

    cs.SD cs.HC cs.LG eess.AS

    Hybrid Data Augmentation and Deep Attention-based Dilated Convolutional-Recurrent Neural Networks for Speech Emotion Recognition

    Authors: Nhat Truong Pham, Duc Ngoc Minh Dang, Sy Dzung Nguyen

    Abstract: Speech emotion recognition (SER) has been one of the significant tasks in Human-Computer Interaction (HCI) applications. However, it is hard to choose the optimal features and deal with imbalanced labeled data. In this article, we investigate hybrid data augmentation (HDA) methods to generate and balance data based on traditional and generative adversarial networks (GAN) methods. To evaluate the ef…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: 12 pages, 16 figures, 6 tables

  8. arXiv:2109.03219  [pdf, other]

    cs.SD cs.LG cs.NE eess.AS

    Fruit-CoV: An Efficient Vision-based Framework for Speedy Detection and Diagnosis of SARS-CoV-2 Infections Through Recorded Cough Sounds

    Authors: Long H. Nguyen, Nhat Truong Pham, Van Huong Do, Liu Tai Nguyen, Thanh Tin Nguyen, Van Dung Do, Hai Nguyen, Ngoc Duy Nguyen

    Abstract: SARS-CoV-2 is colloquially known as COVID-19 that had an initial outbreak in December 2019. The deadly virus has spread across the world, taking part in the global pandemic disease since March 2020. In addition, a recent variant of SARS-CoV-2 named Delta is intractably contagious and responsible for more than four million deaths over the world. Therefore, it is vital to possess a self-testing serv…

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: 4 pages

  9. arXiv:2108.11089  [pdf, other]

    cs.SD eess.AS

    Detecting Drill Failure in the Small Short-sound Drill Dataset

    Authors: Thanh Tran, Nhat Truong Pham, Jan Lundgren

    Abstract: Monitoring the conditions of machines is vital in the manufacturing industry. Early detection of faulty components in machines for stopping and repairing the failed components can minimize the downtime of the machine. This article presents an approach to detect the failure occurring in drill machines based on drill sounds from Valmet AB. The drill dataset includes three classes: anomalous sounds,…

    Submitted 9 November, 2021; v1 submitted 25 August, 2021; originally announced August 2021.

    Comments: 8 pages, 10 figures, journal

  10. arXiv:2105.03010  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Efficient Weight factorization for Multilingual Speech Recognition

    Authors: Ngoc-Quan Pham, Tuan-Nam Nguyen, Sebastian Stueker, Alexander Waibel

    Abstract: End-to-end multilingual speech recognition involves using a single model training on a compositional speech corpus including many languages, resulting in a single neural network to handle transcribing different languages. Due to the fact that each language in the training data has different characteristics, the shared network may struggle to optimize for all various languages simultaneously. In th…

    Submitted 6 May, 2021; originally announced May 2021.

    Comments: Submitted to Interspeech 2021

  11. arXiv:2005.09940  [pdf, other]

    eess.AS cs.CL cs.SD

    Relative Positional Encoding for Speech Recognition and Direct Translation

    Authors: Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel

    Abstract: Transformer models are powerful sequence-to-sequence architectures that are capable of directly mapping speech inputs to transcriptions or translations. However, the mechanism for modeling positions in this model was tailored for text modeling, and thus is less ideal for acoustic inputs. In this work, we adapt the relative position encoding scheme to the Speech Transformer, where the key addition…

    Submitted 20 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020

  12. arXiv:2003.10022  [pdf, other]

    eess.AS cs.CL cs.SD

    High Performance Sequence-to-Sequence Model for Streaming Speech Recognition

    Authors: Thai-Son Nguyen, Ngoc-Quan Pham, Sebastian Stueker, Alex Waibel

    Abstract: Recently sequence-to-sequence models have started to achieve state-of-the-art performance on standard speech recognition tasks when processing audio data in batch mode, i.e., the complete audio data is available when starting processing. However, when it comes to performing run-on recognition on an input stream of audio data while producing recognition results in real-time and with low word-based…

    Submitted 26 July, 2020; v1 submitted 22 March, 2020; originally announced March 2020.

    Comments: To appear in Interspeech 2020

  13. arXiv:1910.05603  [pdf, other]

    cs.CL cs.SD eess.AS

    VAIS ASR: Building a conversational speech recognition system using language model combination

    Authors: Quang Minh Nguyen, Thai Binh Nguyen, Ngoc Phuong Pham, The Loc Nguyen

    Abstract: Automatic Speech Recognition (ASR) systems have been evolving quickly and reaching human parity in certain cases. The systems usually perform pretty well on reading-style and clean speech; however, most of the available systems suffer in situations where the speaking style is conversational and the environment is noisy. It is not straightforward to tackle such problems due to difficulties in data col…

    Submitted 12 October, 2019; originally announced October 2019.

    Comments: 3 pages, 1 figure, Vietnamese Language and Speech Processing conference

  14. arXiv:1908.09766  [pdf]

    cs.NI eess.SY

    A Hybrid of Adaptation and Dynamic Routing based on SDN for Improving QoE in HTTP Adaptive VBR Video Streaming

    Authors: Hong Thinh Pham, Ngoc Nam Pham, Huu Thanh Nguyen, Alan Marshall, Thu Huong Truong

    Abstract: Recently, HTTP Adaptive Streaming (HAS) has received significant attention from both industry and academia based on its ability to enhance media streaming services over the Internet. Recent research solutions that have tried to improve HAS by adaptation at the client side only may not be completely effective without interacting with routing decisions in the upper layers. In this paper, we address…

    Submitted 26 August, 2019; originally announced August 2019.

    Comments: 14 pages, 17 figures, IJCSNS International Journal of Computer Science and Network Security, http://paper.ijcsns.org/07_book/201907/20190708.pdf

    Journal ref: VOL.19 No.7, July 2019

  15. arXiv:1904.13377  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Very Deep Self-Attention Networks for End-to-End Speech Recognition

    Authors: Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Sebastian Stüker, Alexander Waibel

    Abstract: Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community. While previous architecture choices revolve around time-delay neural networks (TDNN) and long short-term memory (LSTM) recurrent neural networks, we propose to use self-attention via the Transformer architecture as an alternative. Our analysis shows that deep Transfor…

    Submitted 3 May, 2019; v1 submitted 30 April, 2019; originally announced April 2019.

    Comments: Submitted to INTERSPEECH 2019