Showing 1–21 of 21 results for author: Do, P

Searching in archive cs.
  1. arXiv:2406.06403  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Meta Learning Text-to-Speech Synthesis in over 7000 Languages

    Authors: Florian Lux, Sarina Meyer, Lyonel Behringer, Frank Zalkow, Phat Do, Matt Coler, Emanuël A. P. Habets, Ngoc Thang Vu

    Abstract: In this work, we take on the challenging task of building a single text-to-speech synthesis system that is capable of generating speech in over 7000 languages, many of which lack sufficient data for traditional TTS development. By leveraging a novel integration of massively multilingual pretraining and meta learning to approximate language representations, our approach enables zero-shot speech syn…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: accepted at Interspeech 2024

  2. arXiv:2404.09951  [pdf, other]

    cs.CV

    Unifying Global and Local Scene Entities Modelling for Precise Action Spotting

    Authors: Kim Hoang Tran, Phuc Vuong Do, Ngoc Quoc Ly, Ngan Le

    Abstract: Sports videos pose complex challenges, including cluttered backgrounds, camera angle changes, small action-representing objects, and imbalanced action class distribution. Existing methods for detecting actions in sports videos heavily rely on global features, utilizing a backbone network as a black box that encompasses the entire spatial frame. However, these approaches tend to overlook the nuance…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted to IJCNN 2024

  3. arXiv:2403.15882  [pdf, other]

    cs.CL

    VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding

    Authors: Phong Nguyen-Thuan Do, Son Quoc Tran, Phu Gia Hoang, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks. To establish a standardized set of benchmarks for Vietnamese NLU, we introduce the first Vietnamese Language Understanding Evaluation (VLUE) benchm…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: Accepted at NAACL 2024 (Findings)

  4. arXiv:2403.09359  [pdf, other]

    cs.CV cs.AI

    D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection

    Authors: Dinh Phat Do, Taehoon Kim, Jaemin Na, Jiwon Kim, Keonho Lee, Kyunghwan Cho, Wonjun Hwang

    Abstract: Domain adaptation for object detection typically entails transferring knowledge from one visible domain to another visible domain. However, there are limited studies on adapting from the visible to the thermal domain, because the domain gap between the visible and thermal domains is much larger than expected, and traditional domain adaptation can not successfully facilitate learning in this situat…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024. Link: https://github.com/EdwardDo69/D3T

  5. arXiv:2309.05103  [pdf, other]

    cs.CL cs.AI

    AGent: A Novel Pipeline for Automatically Creating Unanswerable Questions

    Authors: Son Quoc Tran, Gia-Huy Do, Phong Nguyen-Thuan Do, Matt Kretchmar, Xinya Du

    Abstract: The development of large high-quality datasets and high-performing models have led to significant advancements in the domain of Extractive Question Answering (EQA). This progress has sparked considerable interest in exploring unanswerable questions within the EQA domain. Training EQA models with unanswerable questions helps them avoid extracting misleading or incorrect answers for queries that lac…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: 16 pages, 10 tables, 3 figures

  6. arXiv:2306.12040  [pdf, other]

    cs.CL eess.AS

    Strategies in Transfer Learning for Low-Resource Speech Synthesis: Phone Mapping, Features Input, and Source Language Selection

    Authors: Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers

    Abstract: We compare using a PHOIBLE-based phone mapping method and using phonological features input in transfer learning for TTS in low-resource languages. We use diverse source languages (English, Finnish, Hindi, Japanese, and Russian) and target languages (Bulgarian, Georgian, Kazakh, Swahili, Urdu, and Uzbek) to test the language-independence of the methods and enhance the findings' applicability. We u…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: Accepted at the Speech Synthesis Workshop 2023

  7. arXiv:2306.00535  [pdf, other]

    cs.CL eess.AS

    The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech

    Authors: Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers

    Abstract: We compare phone labels and articulatory features as input for cross-lingual transfer learning in text-to-speech (TTS) for low-resource languages (LRLs). Experiments with FastSpeech 2 and the LRL West Frisian show that using articulatory features outperformed using phone labels in both intelligibility and naturalness. For LRLs without pronunciation dictionaries, we propose two novel approaches: a)…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted at INTERSPEECH 2023

  8. arXiv:2305.19396  [pdf, other]

    eess.AS cs.CL

    Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages

    Authors: Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers

    Abstract: We train a MOS prediction model based on wav2vec 2.0 using the open-access data sets BVCC and SOMOS. Our test with neural TTS data in the low-resource language (LRL) West Frisian shows that pre-training on BVCC before fine-tuning on SOMOS leads to the best accuracy for both fine-tuned and zero-shot prediction. Further fine-tuning experiments show that using more than 30 percent of the total data d…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted at INTERSPEECH 2023

  9. arXiv:2303.13355  [pdf, other]

    cs.CL cs.AI

    Revealing Weaknesses of Vietnamese Language Models Through Unanswerable Questions in Machine Reading Comprehension

    Authors: Son Quoc Tran, Phong Nguyen-Thuan Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: Although the curse of multilinguality significantly restricts the language abilities of multilingual models in monolingual settings, researchers now still have to rely on multilingual models to develop state-of-the-art systems in Vietnamese Machine Reading Comprehension. This difficulty in researching is because of the limited number of high-quality works in developing Vietnamese language models.…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted at The 2023 EACL Student Research Workshop

  10. arXiv:2302.00094  [pdf, other]

    cs.AI

    The Impacts of Unanswerable Questions on the Robustness of Machine Reading Comprehension Models

    Authors: Son Quoc Tran, Phong Nguyen-Thuan Do, Uyen Le, Matt Kretchmar

    Abstract: Pretrained language models have achieved super-human performances on many Machine Reading Comprehension (MRC) benchmarks. Nevertheless, their relative inability to defend against adversarial attacks has spurred skepticism about their natural language understanding. In this paper, we ask whether training with unanswerable questions in SQuAD 2.0 can help improve the robustness of MRC models against…

    Submitted 31 January, 2023; originally announced February 2023.

    Comments: Accepted at the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)

  11. A Deep Reinforcement Learning-based Adaptive Charging Policy for WRSNs

    Authors: Ngoc Bui, Phi Le Nguyen, Viet Anh Nguyen, Phan Thuan Do

    Abstract: Wireless sensor networks consist of randomly distributed sensor nodes for monitoring targets or areas of interest. Maintaining the network for continuous surveillance is a challenge due to the limited battery capacity in each sensor. Wireless power transfer technology is emerging as a reliable solution for energizing the sensors by deploying a mobile charger (MC) to recharge the sensor. However, d…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 9 pages

  12. arXiv:2204.07002  [pdf, other]

    cs.CL

    XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-based Textual Knowledge Source

    Authors: Kiet Van Nguyen, Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction that has attracted much attention from the computational linguistics and artificial intelligence research community in recent years because of the strong development of machine reading comprehension-based models. A reader-based QA system is a high-level search engi…

    Submitted 13 August, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted by ACIIDS 2022

  13. Efficient algorithms for maximum induced matching problem in permutation and trapezoid graphs

    Authors: Viet Dung Nguyen, Ba Thai Pham, Phan Thuan Do

    Abstract: We first design an $\mathcal{O}(n^2)$ solution for finding a maximum induced matching in permutation graphs given their permutation models, based on a dynamic programming algorithm with the aid of the sweep line technique. With the support of the disjoint-set data structure, we improve the complexity to $\mathcal{O}(m + n)$. Consequently, we extend this result to give an $\mathcal{O}(m + n)$ algor…

    Submitted 4 November, 2021; v1 submitted 18 July, 2021; originally announced July 2021.

    Journal ref: Fundamenta Informaticae, Volume 182, Issue 3 (November 18, 2021) fi:7684

  14. arXiv:2105.09043  [pdf, other]

    cs.CL

    Sentence Extraction-Based Machine Reading Comprehension for Vietnamese

    Authors: Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: The development of natural language processing (NLP) in general and machine reading comprehension in particular has attracted the great attention of the research community. In recent years, there are a few datasets for machine reading comprehension tasks in Vietnamese with large sizes, such as UIT-ViQuAD and UIT-ViNewsQA. However, the datasets are not diverse in answers to serve the research. In t…

    Submitted 11 June, 2021; v1 submitted 19 May, 2021; originally announced May 2021.

    Comments: Accepted by KSEM 2021 (International Conference on Knowledge Science, Engineering and Management)

  15. arXiv:2103.10357  [pdf, ps, other]

    cs.DM math.CO

    The equidistribution of some Mahonian statistics over permutations avoiding a pattern of length three

    Authors: Phan Thuan Do, Thi Thu Huong Tran, Vincent Vajnovszki

    Abstract: We prove the equidistribution of several multistatistics over some classes of permutations avoiding a $3$-length pattern. We deduce the equidistribution, on the one hand of inv and foze" statistics, and on the other hand that of maj and makl statistics, over these classes of pattern avoiding permutations. Here inv and maj are the celebrated Mahonian statistics, foze" is one of the statistics defin…

    Submitted 11 August, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

  16. arXiv:2002.03301  [pdf, other]

    cs.NI

    A Virtual Network Customization Framework for Multicast Services in NFV-enabled Core Networks

    Authors: Omar Alhussein, Phu Thinh Do, Qiang Ye, Junling Li, Weisen Shi, Weihua Zhuang, Xuemin, Shen, Xu Li, Jaya Rao

    Abstract: The paradigm of network function virtualization (NFV) with the support of software defined networking (SDN) emerges as a promising approach for customizing network services in fifth generation (5G) networks. In this paper, a multicast service orchestration framework is presented, where joint traffic routing and virtual network function (NF) placement are studied for accommodating multicast service…

    Submitted 9 February, 2020; originally announced February 2020.

    Comments: Accepted to IEEE Journal on Selected Areas in Communications

  17. arXiv:1902.08658  [pdf, ps, other]

    cs.NI

    An SDN-Based Transmission Protocol with In-Path Packet Caching and Retransmission

    Authors: Jiayin Chen, Si Yan, Qiang Ye, Wei Quan, Phu Thinh Do, Weihua Zhuang, Xuemin, Shen, Xu Li, Jaya Rao

    Abstract: In this paper, a comprehensive software-defined networking (SDN) based transmission protocol (SDTP) is presented for fifth generation (5G) communication networks, where an SDN controller gathers network state information from the physical network to improve data transmission efficiency between end hosts, with in-path packet retransmission. In the SDTP, we first develop a new two-way handshake mech…

    Submitted 22 February, 2019; originally announced February 2019.

    Comments: 6 pages, 8 figures, 20 references. Accepted by IEEE International Conference on Communications (ICC), 2019

  18. arXiv:1812.03974  [pdf]

    cs.CV

    Accuracy, Uncertainty, and Adaptability of Automatic Myocardial ASL Segmentation using Deep CNN

    Authors: Hung P. Do, Yi Guo, Andrew J. Yoon, Krishna S. Nayak

    Abstract: PURPOSE: To apply deep CNN to the segmentation task in myocardial arterial spin labeled (ASL) perfusion imaging and to develop methods that measure uncertainty and that adapt the CNN model to a specific false positive vs. false negative tradeoff. METHODS: The Monte Carlo dropout (MCD) U-Net was trained on data from 22 subjects and tested on data from 6 heart transplant recipients. Manual segment…

    Submitted 4 November, 2019; v1 submitted 10 December, 2018; originally announced December 2018.

  19. arXiv:1810.12557  [pdf, other]

    cs.CL

    Machine Translation between Vietnamese and English: an Empirical Study

    Authors: Hong-Hai Phan-Vu, Viet-Trung Tran, Van-Nam Nguyen, Hoang-Vu Dang, Phan-Thuan Do

    Abstract: Machine translation is shifting to an end-to-end approach based on deep neural networks. The state of the art achieves impressive results for popular language pairs such as English - French or English - Chinese. However for English - Vietnamese the shortage of parallel corpora and expensive hyper-parameter search present practical challenges to neural-based approaches. This paper highlights our ef…

    Submitted 30 October, 2018; originally announced October 2018.

  20. arXiv:1809.00742  [pdf, ps, other]

    cs.DM math.CO

    Exhaustive generation for permutations avoiding a (colored) regular sets of patterns

    Authors: Phan Thuan Do, Thi Thu Huong Tran, Vincent Vajnovszki

    Abstract: Despite the fact that the field of pattern avoiding permutations has been skyrocketing over the last two decades, there are very few exhaustive generating algorithms for such classes of permutations. In this paper we introduce the notions of regular and colored regular set of forbidden patterns, which are particular cases of right-justified sets of forbidden patterns. We show the (colored) regular…

    Submitted 15 September, 2018; v1 submitted 3 September, 2018; originally announced September 2018.

  21. arXiv:1703.05320  [pdf, other]

    cs.CL cs.AI

    Legal Question Answering using Ranking SVM and Deep Convolutional Neural Network

    Authors: Phong-Khac Do, Huy-Tien Nguyen, Chien-Xuan Tran, Minh-Tien Nguyen, Minh-Le Nguyen

    Abstract: This paper presents a study of employing Ranking SVM and Convolutional Neural Network for two missions: legal information retrieval and question answering in the Competition on Legal Information Extraction/Entailment. For the first task, our proposed model used a triple of features (LSI, Manhattan, Jaccard), and is based on paragraph level instead of article level as in previous studies. In fact,…

    Submitted 15 March, 2017; originally announced March 2017.

    Comments: 15 pages, 2 figures, Tenth International Workshop on Juris-informatics (JURISIN 2016) associated with JSAI International Symposia on AI 2016 (IsAI-2016)

    MSC Class: 14J30 (Primary)

    ACM Class: H.3; H.3.3; I.2.7