Showing 1–50 of 78 results for author: Pham, N

Searching in archive cs.
  1. arXiv:2406.16777  [pdf, other]

    cs.CL cs.AI

    Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024

    Authors: Sai Koneru, Thai-Binh Nguyen, Ngoc-Quan Pham, Danni Liu, Zhaolin Li, Alexander Waibel, Jan Niehues

    Abstract: Large Language Models (LLMs) are currently under exploration for various tasks, including Automatic Speech Recognition (ASR), Machine Translation (MT), and even End-to-End Speech Translation (ST). In this paper, we present KIT's offline submission in the constrained + LLM track by incorporating recently proposed techniques that can be added to any cascaded speech translation. Specifically, we inte…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2402.15679  [pdf, ps, other]

    cs.LG cs.CV

    Scalable Density-based Clustering with Random Projections

    Authors: Haochuan Xu, Ninh Pham

    Abstract: We present sDBSCAN, a scalable density-based clustering algorithm in high dimensions with cosine distance. Utilizing the neighborhood-preserving property of random projections, sDBSCAN can quickly identify core points and their neighborhoods, the primary hurdle of density-based clustering. Theoretically, sDBSCAN outputs a clustering structure similar to DBSCAN under mild conditions with high proba…

    Submitted 23 February, 2024; originally announced February 2024.

  3. arXiv:2402.09264  [pdf, other]

    cs.LG cs.HC

    UR2M: Uncertainty and Resource-Aware Event Detection on Microcontrollers

    Authors: Hong Jia, Young D. Kwon, Dong Ma, Nhat Pham, Lorena Qendro, Tam Vu, Cecilia Mascolo

    Abstract: Traditional machine learning techniques are prone to generating inaccurate predictions when confronted with shifts in the distribution of data between the training and testing phases. This vulnerability can lead to severe consequences, especially in applications such as mobile healthcare. Uncertainty estimation has the potential to mitigate this issue by assessing the reliability of a model's outp…

    Submitted 12 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  4. arXiv:2401.11487  [pdf, other]

    cs.CL cs.CY

    Towards Better Inclusivity: A Diverse Tweet Corpus of English Varieties

    Authors: Nhi Pham, Lachlan Pham, Adam L. Meyers

    Abstract: The prevalence of social media presents a growing opportunity to collect and analyse examples of English varieties. Whilst usage of these varieties was - and, in many cases, still is - used only in spoken contexts or hard-to-access private messages, social media sites like Twitter provide a platform for users to communicate informally in a scrapeable format. Notably, Indian English (Hinglish), Sin…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: 10 pages (including limitations, references and appendices), 2 figures

  5. arXiv:2401.05425  [pdf, other]

    eess.SP cs.LG

    An Unobtrusive and Lightweight Ear-worn System for Continuous Epileptic Seizure Detection

    Authors: Abdul Aziz, Nhat Pham, Neel Vora, Cody Reynolds, Jaime Lehnen, Pooja Venkatesh, Zhuoran Yao, Jay Harvey, Tam Vu, Kan Ding, Phuc Nguyen

    Abstract: Epilepsy is one of the most common neurological diseases globally, affecting around 50 million people worldwide. Fortunately, up to 70 percent of people with epilepsy could live seizure-free if properly diagnosed and treated, and a reliable technique to monitor the onset of seizures could improve the quality of life of patients who are constantly facing the fear of random seizure attacks. The scal…

    Submitted 1 January, 2024; originally announced January 2024.

  6. arXiv:2401.01108  [pdf, other]

    cs.CL

    Unveiling Comparative Sentiments in Vietnamese Product Reviews: A Sequential Classification Framework

    Authors: Ha Le, Bao Tran, Phuong Le, Tan Nguyen, Dac Nguyen, Ngoan Pham, Dang Huynh

    Abstract: Comparative opinion mining is a specialized field of sentiment analysis that aims to identify and extract sentiments expressed comparatively. To address this task, we propose an approach that consists of solving three sequential sub-tasks: (i) identifying comparative sentence, i.e., if a sentence has a comparative meaning, (ii) extracting comparative elements, i.e., what are comparison subjects, o…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Accepted manuscript at VLSP 2023

  7. arXiv:2312.09877  [pdf, other]

    cs.LG cs.AI cs.DC stat.ML

    Distributed Learning of Mixtures of Experts

    Authors: Faïcel Chamroukhi, Nhat Thien Pham

    Abstract: In modern machine learning problems we deal with datasets that are either distributed by nature or potentially large for which distributing the computations is usually a standard way to proceed, since centralized algorithms are in general ineffective. We propose a distributed learning approach for mixtures of experts (MoE) models with an aggregation strategy to construct a reduction estimator from…

    Submitted 15 December, 2023; originally announced December 2023.

  8. arXiv:2311.11096  [pdf, other]

    eess.IV cs.CV

    On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation

    Authors: Duy Minh Ho Nguyen, Tan Ngoc Pham, Nghiem Tuong Diep, Nghi Quoc Phan, Quang Pham, Vinh Tong, Binh T. Nguyen, Ngan Hoang Le, Nhat Ho, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Constructing a robust model that can effectively generalize to test samples under distribution shifts remains a significant challenge in the field of medical imaging. The foundational models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach. It showcases impressive learning abilities across different tasks with the need for…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS) 2023, Workshop on robustness of zero/few-shot learning in foundation models

  9. arXiv:2310.14434  [pdf, other]

    cs.CR

    Enhancing Accuracy-Privacy Trade-off in Differentially Private Split Learning

    Authors: Ngoc Duy Pham, Khoa Tran Phan, Naveen Chilamkurti

    Abstract: Split learning (SL) aims to protect user data privacy by distributing deep models between client-server and keeping private data locally. Only processed or `smashed' data can be transmitted from the clients to the server during the SL process. However, recently proposed model inversion attacks can recover the original data from the smashed data. In order to enhance privacy protection against such…

    Submitted 22 October, 2023; originally announced October 2023.

  10. arXiv:2309.11506  [pdf, other]

    cs.IR cs.AI cs.CL

    Matching Table Metadata with Business Glossaries Using Large Language Models

    Authors: Elita Lobo, Oktie Hassanzadeh, Nhan Pham, Nandana Mihindukulasooriya, Dharmashankar Subramanian, Horst Samulowitz

    Abstract: Enterprises often own large collections of structured data in the form of large databases or an enterprise data lake. Such data collections come with limited metadata and strict access policies that could limit access to the data contents and, therefore, limit the application of classic retrieval and analysis solutions. As a result, there is a need for solutions that can effectively utilize the av…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: This paper is a work in progress with findings based on limited evidence. Please exercise discretion when interpreting the findings

  11. arXiv:2308.03415  [pdf, other]

    cs.CL cs.AI

    End-to-End Evaluation for Low-Latency Simultaneous Speech Translation

    Authors: Christian Huber, Tu Anh Dinh, Carlos Mullov, Ngoc Quan Pham, Thai Binh Nguyen, Fabian Retkowski, Stefan Constantin, Enes Yavuz Ugan, Danni Liu, Zhaolin Li, Sai Koneru, Jan Niehues, Alexander Waibel

    Abstract: The challenge of low-latency speech translation has recently drawn significant interest in the research community as shown by several publications and shared tasks. Therefore, it is essential to evaluate these different approaches in realistic scenarios. However, currently only specific aspects of the systems are evaluated and often it is not possible to compare different approaches. In this work…

    Submitted 23 October, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  12. arXiv:2306.11925  [pdf, other]

    cs.CV

    LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching

    Authors: Duy M. H. Nguyen, Hoang Nguyen, Nghiem T. Diep, Tan N. Pham, Tri Cao, Binh T. Nguyen, Paul Swoboda, Nhat Ho, Shadi Albarqouni, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Obtaining large pre-trained models that can be fine-tuned to new tasks with limited annotated samples has remained an open challenge for medical imaging data. While pre-trained deep networks on ImageNet and vision-language foundation models trained on web-scale data are prevailing approaches, their effectiveness on medical tasks is limited due to the significant domain shift between natural and me…

    Submitted 18 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023

  13. arXiv:2306.05320  [pdf, other]

    cs.CL cs.SD

    KIT's Multilingual Speech Translation System for IWSLT 2023

    Authors: Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues

    Abstract: Many existing speech translation benchmarks focus on native-English speech in high-quality recording conditions, which often do not match the conditions in real-life use-cases. In this paper, we describe our speech translation system for the multilingual track of IWSLT 2023, which evaluates translation quality on scientific conference talks. The test condition features accented input speech and te…

    Submitted 12 July, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: IWSLT 2023

  14. arXiv:2305.06044  [pdf, other]

    cs.LG stat.ML

    Correlation visualization under missing values: a comparison between imputation and direct parameter estimation methods

    Authors: Nhat-Hao Pham, Khanh-Linh Vo, Mai Anh Vu, Thu Nguyen, Michael A. Riegler, Pål Halvorsen, Binh T. Nguyen

    Abstract: Correlation matrix visualization is essential for understanding the relationships between variables in a dataset, but missing data can pose a significant challenge in estimating correlation coefficients. In this paper, we compare the effects of various missing data methods on the correlation plot, focusing on two common missing patterns: random and monotone. We aim to provide practical strategies…

    Submitted 5 September, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

  15. arXiv:2304.08252  [pdf, other]

    cs.RO

    PaaS: Planning as a Service for reactive driving in CARLA Leaderboard

    Authors: Nhat Hao Truong, Huu Thien Mai, Tuan Anh Tran, Minh Quang Tran, Duc Duy Nguyen, Ngoc Viet Phuong Pham

    Abstract: End-to-end deep learning approaches have been proven to be efficient in autonomous driving and robotics. By using deep learning techniques for decision-making, those systems are often referred to as a black box, and the result is driven by data. In this paper, we propose PaaS (Planning as a Service), a vanilla module to generate local trajectory planning for autonomous driving in CARLA simulation.…

    Submitted 14 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: accepted on 05.06.2023, revised on 15.06.2023, to be published on ICSSE 2023

  16. arXiv:2301.10439  [pdf, other]

    cs.CL cs.LG

    ViDeBERTa: A powerful pre-trained language model for Vietnamese

    Authors: Cong Dao Tran, Nhut Huy Pham, Anh Nguyen, Truong Son Hy, Tu Vu

    Abstract: This paper presents ViDeBERTa, a new pre-trained monolingual language model for Vietnamese, with three versions - ViDeBERTa_xsmall, ViDeBERTa_base, and ViDeBERTa_large, which are pre-trained on a large-scale corpus of high-quality and diverse Vietnamese texts using DeBERTa architecture. Although many successful pre-trained language models based on Transformer have been widely proposed for the Engl…

    Submitted 10 February, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

  17. arXiv:2212.00250  [pdf, other]

    cs.CR cs.DC

    Split Learning without Local Weight Sharing to Enhance Client-side Data Privacy

    Authors: Ngoc Duy Pham, Tran Khoa Phan, Alsharif Abuadbba, Yansong Gao, Doan Nguyen, Naveen Chilamkurti

    Abstract: Split learning (SL) aims to protect user data privacy by distributing deep models between client-server and keeping private data locally. In SL training with multiple clients, the local model weights are shared among the clients for local model update. This paper first reveals data privacy leakage exacerbated from local weight sharing among the clients in SL through model inversion attacks. Then,…

    Submitted 20 July, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

  18. arXiv:2211.11703   

    cs.CL cs.SD eess.AS

    Towards continually learning new languages

    Authors: Ngoc-Quan Pham, Jan Niehues, Alexander Waibel

    Abstract: Multilingual speech recognition with neural networks is often implemented with batch-learning, when all of the languages are available before training. An ability to add new languages after the prior training sessions can be economically beneficial, but the main challenge is catastrophic forgetting. In this work, we combine the qualities of weight factorization and elastic weight consolidation in…

    Submitted 1 March, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Work in progress

  19. arXiv:2209.14494  [pdf, other]

    cs.CL

    Multi-stage Information Retrieval for Vietnamese Legal Texts

    Authors: Nhat-Minh Pham, Ha-Thanh Nguyen, Trong-Hop Do

    Abstract: This study deals with the problem of information retrieval (IR) for Vietnamese legal texts. Despite being well researched in many languages, information retrieval has still not received much attention from the Vietnamese research community. This is especially true for the case of legal documents, which are hard to process. This study proposes a new approach for information retrieval for Vietnamese…

    Submitted 11 November, 2022; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: Presented at PKAW 2022 (arXiv:2211.03888)

    Report number: PKAW/2022/01

  20. arXiv:2209.09649  [pdf, other]

    q-fin.ST cs.LG

    Predicting Mutual Funds' Performance using Deep Learning and Ensemble Techniques

    Authors: Nghia Chu, Binh Dao, Nga Pham, Huy Nguyen, Hien Tran

    Abstract: Predicting fund performance is beneficial to both investors and fund managers, and yet is a challenging task. In this paper, we have tested whether deep learning models can predict fund performance more accurately than traditional statistical techniques. Fund performance is typically evaluated by the Sharpe ratio, which represents the risk-adjusted performance to ensure meaningful comparability ac…

    Submitted 31 July, 2023; v1 submitted 18 September, 2022; originally announced September 2022.

    Comments: 16 pages, 4 figures, 4 tables

  21. vieCap4H-VLSP 2021: Vietnamese Image Captioning for Healthcare Domain using Swin Transformer and Attention-based LSTM

    Authors: Thanh Tin Nguyen, Long H. Nguyen, Nhat Truong Pham, Liu Tai Nguyen, Van Huong Do, Hai Nguyen, Ngoc Duy Nguyen

    Abstract: This study presents our approach on the automatic Vietnamese image captioning for healthcare domain in text processing tasks of Vietnamese Language and Speech Processing (VLSP) Challenge 2021, as shown in Figure 1. In recent years, image captioning often employs a convolutional neural network-based architecture as an encoder and a long short-term memory (LSTM) as a decoder to generate sentences. T…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: Accepted for publication in the VNU Journal of Science: Computer Science and Communication Engineering

    Journal ref: VNU Journal of Science: Computer Science and Communication Engineering, 38(2), 2022

  22. arXiv:2206.04864  [pdf, other]

    cs.LG cs.CR

    Binarizing Split Learning for Data Privacy Enhancement and Computation Reduction

    Authors: Ngoc Duy Pham, Alsharif Abuadbba, Yansong Gao, Tran Khoa Phan, Naveen Chilamkurti

    Abstract: Split learning (SL) enables data privacy preservation by allowing clients to collaboratively train a deep learning model with the server without sharing raw data. However, SL still has limitations such as potential data privacy leakage and high computation at clients. In this study, we propose to binarize the SL local layers for faster computation (up to 17.5 times less forward-propagation time in…

    Submitted 10 June, 2022; originally announced June 2022.

  23. arXiv:2206.01382  [pdf, ps, other]

    cs.DS cs.CV

    Falconn++: A Locality-sensitive Filtering Approach for Approximate Nearest Neighbor Search

    Authors: Ninh Pham, Tao Liu

    Abstract: We present Falconn++, a novel locality-sensitive filtering approach for approximate nearest neighbor search on angular distance. Falconn++ can filter out potential far away points in any hash bucket before querying, which results in higher quality candidates compared to other hashing-based solutions. Theoretically, Falconn++ asymptotically achieves lower query time complexity than Falconn…

    Submitted 22 October, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: To appear in NeurIPS 2022

  24. arXiv:2205.12304  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Adaptive multilingual speech recognition with pretrained models

    Authors: Ngoc-Quan Pham, Alex Waibel, Jan Niehues

    Abstract: Multilingual speech recognition with supervised learning has achieved great results as reflected in recent research. With the development of pretraining methods on audio and text data, it is imperative to transfer the knowledge from unsupervised multilingual models to facilitate recognition, especially in many languages with limited data. Our work investigated the effectiveness of using two pretra…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Submitted to INTERSPEECH 2022

  25. arXiv:2202.13934  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Functional mixture-of-experts for classification

    Authors: Nhat Thien Pham, Faicel Chamroukhi

    Abstract: We develop a mixtures-of-experts (ME) approach to the multiclass classification where the predictors are univariate functions. It consists of a ME model in which both the gating network and the experts network are constructed upon multinomial logistic activation functions with functional inputs. We perform a regularized maximum likelihood estimation in which the coefficient functions enjoy interpr…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: Submitted to the 53èmes Journées de la Société Française de Statistique

  26. arXiv:2202.03558  [pdf, other]

    cs.LG cs.AI

    Attacking c-MARL More Effectively: A Data Driven Approach

    Authors: Nhan H. Pham, Lam M. Nguyen, Jie Chen, Hoang Thanh Lam, Subhro Das, Tsui-Wei Weng

    Abstract: In recent years, a proliferation of methods were developed for cooperative multi-agent reinforcement learning (c-MARL). However, the robustness of c-MARL agents against adversarial attacks has been rarely explored. In this paper, we propose to evaluate the robustness of c-MARL agents via a model-based approach, named c-MBA. Our proposed formulation can craft much stronger adversarial state perturb…

    Submitted 10 September, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

  27. arXiv:2202.02249  [pdf, other]

    stat.ME cs.LG stat.CO stat.ML

    Functional Mixtures-of-Experts

    Authors: Faïcel Chamroukhi, Nhat Thien Pham, Van Hà Hoang, Geoffrey J. McLachlan

    Abstract: We consider the statistical analysis of heterogeneous data for prediction in situations where the observations include functions, typically time series. We extend the modeling with Mixtures-of-Experts (ME), as a framework of choice in modeling heterogeneity in data for prediction with vectorial observations, to this functional data analysis context. We first present a new family of ME models, name…

    Submitted 20 December, 2023; v1 submitted 4 February, 2022; originally announced February 2022.

    MSC Class: 62-XX; 62R10 ACM Class: G.3

  28. arXiv:2201.06806  [pdf, ps, other]

    cs.LG

    An Efficient Hashing-based Ensemble Method for Collaborative Outlier Detection

    Authors: Kitty Li, Ninh Pham

    Abstract: In collaborative outlier detection, multiple participants exchange their local detectors trained on decentralized devices without exchanging their own data. A key problem of collaborative outlier detection is efficiently aggregating multiple local detectors to form a global detector without breaching the privacy of participants' data and degrading the detection accuracy. We study locality-sensitiv…

    Submitted 18 January, 2022; originally announced January 2022.

  29. arXiv:2201.03019  [pdf, other]

    cs.LG cs.AI

    Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay

    Authors: Kuluhan Binici, Shivam Aggarwal, Nam Trung Pham, Karianto Leman, Tulika Mitra

    Abstract: Data-Free Knowledge Distillation (KD) allows knowledge transfer from a trained neural network (teacher) to a more compact one (student) in the absence of original training data. Existing works use a validation set to monitor the accuracy of the student over real data and report the highest performance throughout the entire process. However, validation data may not be available at distillation time…

    Submitted 21 March, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

    Comments: Accepted by the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

  30. arXiv:2110.11293  [pdf, other]

    cs.CV

    An Empirical Study on GANs with Margin Cosine Loss and Relativistic Discriminator

    Authors: Cuong V. Nguyen, Tien-Dung Cao, Tram Truong-Huu, Khanh N. Pham, Binh T. Nguyen

    Abstract: Generative Adversarial Networks (GANs) have emerged as useful generative models, which are capable of implicitly learning data distributions of arbitrarily complex dimensions. However, the training of GANs is empirically well-known for being highly unstable and sensitive. The loss functions of both the discriminator and generator concerning their parameters tend to oscillate wildly during training…

    Submitted 21 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: 16 pages, 5 figures

  31. arXiv:2109.09026  [pdf, other]

    cs.SD cs.HC cs.LG eess.AS

    Hybrid Data Augmentation and Deep Attention-based Dilated Convolutional-Recurrent Neural Networks for Speech Emotion Recognition

    Authors: Nhat Truong Pham, Duc Ngoc Minh Dang, Sy Dzung Nguyen

    Abstract: Speech emotion recognition (SER) has been one of the significant tasks in Human-Computer Interaction (HCI) applications. However, it is hard to choose the optimal features and deal with imbalance labeled data. In this article, we investigate hybrid data augmentation (HDA) methods to generate and balance data based on traditional and generative adversarial networks (GAN) methods. To evaluate the ef…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: 12 pages, 16 figures, 6 tables

  32. arXiv:2109.08860   

    cs.GT

    Groups Influence with Minimum Cost in Social Networks

    Authors: Phuong N. H. Pham, Canh V. Pham, Hieu V. Duong, Thanh T. Nguyen, My T. Thai

    Abstract: This paper studies a Group Influence with Minimum cost which aims to find a seed set with smallest cost that can influence all target groups, where each user is associated with a cost and a group is influenced if the total score of the influenced users belonging to the group is at least a certain threshold. As the group-influence function is neither submodular nor supermodular, theoretical bounds…

    Submitted 14 December, 2022; v1 submitted 18 September, 2021; originally announced September 2021.

    Comments: The paper contains some errors

  33. arXiv:2109.03219  [pdf, other]

    cs.SD cs.LG cs.NE eess.AS

    Fruit-CoV: An Efficient Vision-based Framework for Speedy Detection and Diagnosis of SARS-CoV-2 Infections Through Recorded Cough Sounds

    Authors: Long H. Nguyen, Nhat Truong Pham, Van Huong Do, Liu Tai Nguyen, Thanh Tin Nguyen, Van Dung Do, Hai Nguyen, Ngoc Duy Nguyen

    Abstract: SARS-CoV-2 is colloquially known as COVID-19 that had an initial outbreak in December 2019. The deadly virus has spread across the world, taking part in the global pandemic disease since March 2020. In addition, a recent variant of SARS-CoV-2 named Delta is intractably contagious and responsible for more than four million deaths over the world. Therefore, it is vital to possess a self-testing serv…

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: 4 pages

  34. arXiv:2108.11089  [pdf, other]

    cs.SD eess.AS

    Detecting Drill Failure in the Small Short-sound Drill Dataset

    Authors: Thanh Tran, Nhat Truong Pham, Jan Lundgren

    Abstract: Monitoring the conditions of machines is vital in the manufacturing industry. Early detection of faulty components in machines for stopping and repairing the failed components can minimize the downtime of the machine. This article presents an approach to detect the failure occurring in drill machines based on drill sounds from Valmet AB. The drill dataset includes three classes: anomalous sounds,…

    Submitted 9 November, 2021; v1 submitted 25 August, 2021; originally announced August 2021.

    Comments: 8 pages, 10 figures, journal

  35. arXiv:2108.05698  [pdf, other]

    cs.LG cs.CV

    Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data

    Authors: Kuluhan Binici, Nam Trung Pham, Tulika Mitra, Karianto Leman

    Abstract: With the increasing popularity of deep learning on edge devices, compressing large neural networks to meet the hardware requirements of resource-constrained devices became a significant research direction. Numerous compression methodologies are currently being used to reduce the memory sizes and energy consumption of neural networks. Knowledge distillation (KD) is among such methodologies and it f…

    Submitted 5 November, 2021; v1 submitted 11 August, 2021; originally announced August 2021.

    Comments: Accepted by the 2022 Winter Conference on Applications of Computer Vision (WACV 2022)

    Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 663-671

  36. arXiv:2108.01808  [pdf, other]

    cs.CV cs.AI cs.LG

    Leaf Recognition Using Convolutional Neural Networks Based Features

    Authors: Boi M. Quach, Dinh V. Cuong, Nhung Pham, Dang Huynh, Binh T. Nguyen

    Abstract: There is a warning light for the loss of plant habitats worldwide that entails concerted efforts to conserve plant biodiversity. Thus, plant species classification is of crucial importance to address this environmental challenge. In recent years, there is a considerable increase in the number of studies related to plant taxonomy. While some researchers try to improve their recognition performance…

    Submitted 2 September, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: 20 pages; 9 figures; 5 tables

  37. arXiv:2105.03010  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Efficient Weight factorization for Multilingual Speech Recognition

    Authors: Ngoc-Quan Pham, Tuan-Nam Nguyen, Sebastian Stueker, Alexander Waibel

    Abstract: End-to-end multilingual speech recognition involves using a single model training on a compositional speech corpus including many languages, resulting in a single neural network to handle transcribing different languages. Due to the fact that each language in the training data has different characteristics, the shared network may struggle to optimize for all various languages simultaneously. In th…

    Submitted 6 May, 2021; originally announced May 2021.

    Comments: Submitted to Interspeech 2021

  38. arXiv:2103.06689  [pdf, other]

    cs.CL

    Unsupervised Transfer Learning in Multilingual Neural Machine Translation with Cross-Lingual Word Embeddings

    Authors: Carlos Mullov, Ngoc-Quan Pham, Alexander Waibel

    Abstract: In this work we look into adding a new language to a multilingual NMT system in an unsupervised fashion. Under the utilization of pre-trained cross-lingual word embeddings we seek to exploit a language independent multilingual sentence representation to easily generalize to a new language. While using cross-lingual embeddings for word lookup we decode from a yet entirely unseen source language in…

    Submitted 11 March, 2021; originally announced March 2021.

  39. arXiv:2103.03452  [pdf, other]

    stat.ML cs.DC cs.LG

    FedDR -- Randomized Douglas-Rachford Splitting Algorithms for Nonconvex Federated Composite Optimization

    Authors: Quoc Tran-Dinh, Nhan H. Pham, Dzung T. Phan, Lam M. Nguyen

    Abstract: We develop two new algorithms, called, FedDR and asyncFedDR, for solving a fundamental nonconvex composite optimization problem in federated learning. Our algorithms rely on a novel combination between a nonconvex Douglas-Rachford splitting method, randomized block-coordinate strategies, and asynchronous implementation. They can also handle convex regularizers. Unlike recent methods in the literat…

    Submitted 28 October, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

    Comments: 39 pages, and 12 figures

    Report number: UNC-STOR-June 2021

    Journal ref: NeurIPS 2021

  40. arXiv:2012.11098  [pdf, ps, other]

    cs.DS

    Sublinear Maximum Inner Product Search using Concomitants of Extreme Order Statistics

    Authors: Ninh Pham

    Abstract: We propose a novel dimensionality reduction method for maximum inner product search (MIPS), named CEOs, based on the theory of concomitants of extreme order statistics. Utilizing the asymptotic behavior of these concomitants, we show that a few dimensions associated with the extreme values of the query signature are enough to estimate inner products. Since CEOs only uses the sign of a small subset…

    Submitted 18 August, 2021; v1 submitted 20 December, 2020; originally announced December 2020.

    Comments: A short version with a new title "Simple Yet Efficient Algorithms for Maximum Inner Product Search via Extreme Order Statistics" appears in KDD 2021

  41. arXiv:2012.02950  [pdf, other]

    cs.LG cs.AI cs.CY

    Deep Depression Prediction on Longitudinal Data via Joint Anomaly Ranking and Classification

    Authors: Guansong Pang, Ngoc Thien Anh Pham, Emma Baker, Rebecca Bentley, Anton van den Hengel

    Abstract: A wide variety of methods have been developed for identifying depression, but they focus primarily on measuring the degree to which individuals are suffering from depression currently. In this work we explore the possibility of predicting future depression using machine learning applied to longitudinal socio-demographic data. In doing so we show that data such as housing status, and the details of…

    Submitted 20 March, 2022; v1 submitted 5 December, 2020; originally announced December 2020.

    Comments: Accepted to PAKDD 2022

  42. arXiv:2006.08748  [pdf, other]

    cs.CL

    DynE: Dynamic Ensemble Decoding for Multi-Document Summarization

    Authors: Chris Hokamp, Demian Gholipour Ghalandari, Nghia The Pham, John Glover

    Abstract: Sequence-to-sequence (s2s) models are the basis for extensive work in natural language processing. However, some applications, such as multi-document summarization, multi-modal machine translation, and the automatic post-editing of machine translation, require mapping a set of multiple distinct inputs into a single output sequence. Recent work has introduced bespoke architectures for these multi-i…

    Submitted 15 June, 2020; originally announced June 2020.

  43. arXiv:2005.10070  [pdf, other]

    cs.CL

    A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal

    Authors: Demian Gholipour Ghalandari, Chris Hokamp, Nghia The Pham, John Glover, Georgiana Ifrim

    Abstract: Multi-document summarization (MDS) aims to compress the content in large document collections into short summaries and has important applications in story clustering for newsfeeds, presentation of search results, and timeline generation. However, there is a lack of datasets that realistically address such use cases at a scale large enough for training supervised models for this task. This work pre…

    Submitted 20 May, 2020; originally announced May 2020.

    Comments: Camera-ready version for ACL 2020

  44. arXiv:2005.09940  [pdf, other]

    eess.AS cs.CL cs.SD

    Relative Positional Encoding for Speech Recognition and Direct Translation

    Authors: Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel

    Abstract: Transformer models are powerful sequence-to-sequence architectures that are capable of directly mapping speech inputs to transcriptions or translations. However, the mechanism for modeling positions in this model was tailored for text modeling, and thus is less ideal for acoustic inputs. In this work, we adapt the relative position encoding scheme to the Speech Transformer, where the key addition…

    Submitted 20 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020

  45. arXiv:2003.12347  [pdf]

    cs.CY

    Mobile phone data and COVID-19: Missing an opportunity?

    Authors: Nuria Oliver, Emmanuel Letouzé, Harald Sterly, Sébastien Delataille, Marco De Nadai, Bruno Lepri, Renaud Lambiotte, Richard Benjamins, Ciro Cattuto, Vittoria Colizza, Nicolas de Cordes, Samuel P. Fraiberger, Till Koebe, Sune Lehmann, Juan Murillo, Alex Pentland, Phuong N Pham, Frédéric Pivetta, Albert Ali Salah, Jari Saramäki, Samuel V. Scarpino, Michele Tizzoni, Stefaan Verhulst, Patrick Vinck

    Abstract: This paper describes how mobile phone data can guide government and public health authorities in determining the best course of action to control the COVID-19 pandemic and in assessing the effectiveness of control measures such as physical distancing. It identifies key gaps and reasons why this kind of data is only scarcely used, although their value in similar epidemics has proven in a number of…

    Submitted 27 March, 2020; originally announced March 2020.

  46. arXiv:2003.10973  [pdf, ps, other]

    math.OC cs.LG

    Finite-Time Analysis of Stochastic Gradient Descent under Markov Randomness

    Authors: Thinh T. Doan, Lam M. Nguyen, Nhan H. Pham, Justin Romberg

    Abstract: Motivated by broad applications in reinforcement learning and machine learning, this paper considers the popular stochastic gradient descent (SGD) when the gradients of the underlying objective function are sampled from Markov processes. This Markov sampling leads to the gradient samples being biased and not independent. The existing results for the convergence of SGD under Markov randomness are o…

    Submitted 1 April, 2020; v1 submitted 24 March, 2020; originally announced March 2020.

  47. arXiv:2003.10022  [pdf, other]

    eess.AS cs.CL cs.SD

    High Performance Sequence-to-Sequence Model for Streaming Speech Recognition

    Authors: Thai-Son Nguyen, Ngoc-Quan Pham, Sebastian Stueker, Alex Waibel

    Abstract: Recently sequence-to-sequence models have started to achieve state-of-the-art performance on standard speech recognition tasks when processing audio data in batch mode, i.e., the complete audio data is available when starting processing. However, when it comes to performing run-on recognition on an input stream of audio data while producing recognition results in real-time and with low word-based…

    Submitted 26 July, 2020; v1 submitted 22 March, 2020; originally announced March 2020.

    Comments: To appear in Interspeech 2020

  48. arXiv:2003.00430  [pdf, other]

    cs.LG math.OC

    A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning

    Authors: Nhan H. Pham, Lam M. Nguyen, Dzung T. Phan, Phuong Ha Nguyen, Marten van Dijk, Quoc Tran-Dinh

    Abstract: We propose a novel hybrid stochastic policy gradient estimator by combining an unbiased policy gradient estimator, the REINFORCE estimator, with another biased one, an adapted SARAH estimator for policy optimization. The hybrid policy gradient estimator is shown to be biased, but has variance reduced property. Using this estimator, we develop a new Proximal Hybrid Stochastic Policy Gradient Algori…

    Submitted 21 September, 2020; v1 submitted 1 March, 2020; originally announced March 2020.

    Comments: Accepted for publication at the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)

    Journal ref: Proceedings of the International Conference on Artificial Intelligence and Statistics, PMLR 108:374-385, 2020

  49. arXiv:1911.02628  [pdf, ps, other]

    cs.DS cs.DC

    Distributed MST: A Smoothed Analysis

    Authors: Soumyottam Chatterjee, Gopal Pandurangan, Nguyen Dinh Pham

    Abstract: We study smoothed analysis of distributed graph algorithms, focusing on the fundamental minimum spanning tree (MST) problem. With the goal of studying the time complexity of distributed MST as a function of the "perturbation" of the input graph, we posit a {\em smoothing model} that is parameterized by a smoothing parameter $0 \leq ε(n) \leq 1$ which controls the amount of {\em random} edges that…

    Submitted 6 November, 2019; originally announced November 2019.

  50. arXiv:1910.05608  [pdf, other]

    cs.CL

    VAIS Hate Speech Detection System: A Deep Learning based Approach for System Combination

    Authors: Thai Binh Nguyen, Quang Minh Nguyen, Thu Hien Nguyen, Ngoc Phuong Pham, The Loc Nguyen, Quoc Truong Do

    Abstract: Nowadays, Social network sites (SNSs) such as Facebook, Twitter are common places where people show their opinions, sentiments and share information with others. However, some people use SNSs to post abuse and harassment threats in order to prevent other SNSs users from expressing themselves as well as seeking different opinions. To deal with this problem, SNSs have to use a lot of resources inclu…

    Submitted 12 October, 2019; originally announced October 2019.

    Comments: 5 pages, 6 figures, Vietnamese Language and Speech Processing conference