Showing 1–48 of 48 results for author: Nguyen, P

Searching in archive eess.
  1. arXiv:2403.04578  [pdf, other]

    eess.SY

    Tensor Power Flow Formulations for Multidimensional Analyses in Distribution Systems

    Authors: Edgar Mauricio Salazar Duque, Juan S. Giraldo, Pedro P. Vergara, Phuong H. Nguyen, Han Slootweg

    Abstract: In this paper, we present two multidimensional power flow formulations based on a fixed-point iteration (FPI) algorithm to efficiently solve hundreds of thousands of power flows in distribution systems. The presented algorithms are the base for a new TensorPowerFlow (TPF) tool and shine for their simplicity, benefiting from multicore CPU and GPU parallelization. We also focus on the ma…

    Submitted 7 March, 2024; originally announced March 2024.

  2. arXiv:2401.05425  [pdf, other]

    eess.SP cs.LG

    An Unobtrusive and Lightweight Ear-worn System for Continuous Epileptic Seizure Detection

    Authors: Abdul Aziz, Nhat Pham, Neel Vora, Cody Reynolds, Jaime Lehnen, Pooja Venkatesh, Zhuoran Yao, Jay Harvey, Tam Vu, Kan Ding, Phuc Nguyen

    Abstract: Epilepsy is one of the most common neurological diseases globally, affecting around 50 million people worldwide. Fortunately, up to 70 percent of people with epilepsy could live seizure-free if properly diagnosed and treated, and a reliable technique to monitor the onset of seizures could improve the quality of life of patients who are constantly facing the fear of random seizure attacks. The scal…

    Submitted 1 January, 2024; originally announced January 2024.

  3. arXiv:2312.12587  [pdf, other]

    eess.SP cs.DC q-bio.TO

    Real-Time Diagnostic Integrity Meets Efficiency: A Novel Platform-Agnostic Architecture for Physiological Signal Compression

    Authors: Neel R Vora, Amir Hajighasemi, Cody T. Reynolds, Amirmohammad Radmehr, Mohamed Mohamed, Jillur Rahman Saurav, Abdul Aziz, Jai Prakash Veerla, Mohammad S Nasr, Hayden Lotspeich, Partha Sai Guttikonda, Thuong Pham, Aarti Darji, Parisa Boodaghi Malidarreh, Helen H Shang, Jay Harvey, Kan Ding, Phuc Nguyen, Jacob M Luber

    Abstract: Head-based signals such as EEG, EMG, EOG, and ECG collected by wearable systems will play a pivotal role in clinical diagnosis, monitoring, and treatment of important brain disorder diseases. However, the real-time transmission of the significant corpus physiological signals over extended periods consumes substantial power and time, limiting the viability of battery-dependent physiological monit…

    Submitted 4 January, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  4. arXiv:2312.09445  [pdf, other]

    eess.SP cs.CV cs.LG

    IncepSE: Leveraging InceptionTime's performance with Squeeze and Excitation mechanism in ECG analysis

    Authors: Tue Minh Cao, Nhat Hong Tran, Le Phi Nguyen, Hieu Huy Pham, Hung Thanh Nguyen

    Abstract: Our study focuses on the potential for modifications of Inception-like architecture within the electrocardiogram (ECG) domain. To this end, we introduce IncepSE, a novel network characterized by strategic architectural incorporation that leverages the strengths of both InceptionTime and channel attention mechanisms. Furthermore, we propose a training setup that employs stabilization techniques tha…

    Submitted 16 November, 2023; originally announced December 2023.

  5. arXiv:2311.01715  [pdf, other]

    cs.SD eess.AS eess.SP

    Acousto-optic reconstruction of exterior sound field based on concentric circle sampling with circular harmonic expansion

    Authors: Phuc Duc Nguyen, Kenji Ishikawa, Noboru Harada, Takehiro Moriya

    Abstract: Acousto-optic sensing provides an alternative approach to traditional microphone arrays by shedding light on the interaction of light with an acoustic field. Sound field reconstruction is a fascinating and advanced technique used in acousto-optics sensing. Current challenges in sound-field reconstruction methods pertain to scenarios in which the sound source is located within the reconstruction ar…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  6. Bearing-Based Network Localization Under Randomized Gossip Protocol

    Authors: Nhat-Minh Le-Phan, Minh Hoang Trinh, Phuoc Doan Nguyen

    Abstract: In this paper, we consider a randomized gossip algorithm for the bearing-based network localization problem. Let each sensor node be able to obtain the bearing vectors and communicate its position estimates with several neighboring agents. Each update involves two agents, and the update sequence follows a stochastic process. Under the assumption that the network is infinitesimally bearing rigid an…

    Submitted 17 January, 2024; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: preprint, 6 pages, 2 figures. Published in the Proceeding of the 12th International Conference on Control, Automation and Information Sciences (ICCAIS). arXiv admin note: text overlap with arXiv:2303.14733

  7. arXiv:2304.11080  [pdf, other]

    eess.SP cs.LG

    Multimodal contrastive learning for diagnosing cardiovascular diseases from electrocardiography (ECG) signals and patient metadata

    Authors: Tue M. Cao, Nhat H. Tran, Phi Le Nguyen, Hieu Pham

    Abstract: This work discusses the use of contrastive learning and deep learning for diagnosing cardiovascular diseases from electrocardiography (ECG) signals. While the ECG signals usually contain 12 leads (channels), many healthcare facilities and devices lack access to all these 12 leads. This raises the problem of how to use only fewer ECG leads to produce meaningful diagnoses with high performance. We i…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: Accepted for presentation at the Midwest Machine Learning Symposium (MMLS 2023), Chicago, IL, USA

  8. Randomized Matrix Weighted Consensus

    Authors: Nhat-Minh Le-Phan, Minh Hoang Trinh, Phuoc Doan Nguyen

    Abstract: In this paper, randomized gossip-type matrix-weighted consensus algorithms are proposed for both leaderless and leader-follower topologies. First, we introduce the notion of expected matrix-weighted network, which captures the multi-dimensional interactions between any two agents in a probabilistic sense. Under some mild assumptions on the distribution of the expected matrix weights and the upper…

    Submitted 6 February, 2024; v1 submitted 26 March, 2023; originally announced March 2023.

    Comments: 32 pages, 6 figures, preprint

  9. arXiv:2212.03228  [pdf, other]

    cs.LG cs.RO eess.SY

    ISAACS: Iterative Soft Adversarial Actor-Critic for Safety

    Authors: Kai-Chieh Hsu, Duy Phuong Nguyen, Jaime Fernández Fisac

    Abstract: The deployment of robots in uncontrolled environments requires them to operate robustly under previously unseen scenarios, like irregular terrain and wind conditions. Unfortunately, while rigorous safety frameworks from robust optimal control theory scale poorly to high-dimensional nonlinear dynamics, control policies computed by more tractable "deep" methods lack guarantees and tend to exhibit li…

    Submitted 7 June, 2024; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: Accepted in 5th Annual Learning for Dynamics & Control Conference (L4DC), University of Pennsylvania

  10. arXiv:2210.08610  [pdf, other]

    cs.SD cs.AI eess.AS

    Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context

    Authors: Lam Pham, Dusan Salovic, Anahid Jalali, Alexander Schindler, Khoa Tran, Canh Vu, Phu X. Nguyen

    Abstract: In this paper, we present a comprehensive analysis of Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. In particular, we firstly propose an inception-based and low footprint ASC model, referred to as the ASC baseline. The proposed ASC baseline is then compared with benchmark and high-complexity network architectures of Mobile…

    Submitted 16 October, 2022; originally announced October 2022.

  11. A Deep Reinforcement Learning-based Adaptive Charging Policy for WRSNs

    Authors: Ngoc Bui, Phi Le Nguyen, Viet Anh Nguyen, Phan Thuan Do

    Abstract: Wireless sensor networks consist of randomly distributed sensor nodes for monitoring targets or areas of interest. Maintaining the network for continuous surveillance is a challenge due to the limited battery capacity in each sensor. Wireless power transfer technology is emerging as a reliable solution for energizing the sensors by deploying a mobile charger (MC) to recharge the sensor. However, d…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 9 pages

  12. A Soft-Bodied Aerial Robot for Collision Resilience and Contact-Reactive Perching

    Authors: Pham H. Nguyen, Karishma Patnaik, Shatadal Mishra, Panagiotis Polygerinos, Wenlong Zhang

    Abstract: Current aerial robots demonstrate limited interaction capabilities in unstructured environments when compared with their biological counterparts. Some examples include their inability to tolerate collisions and to successfully land or perch on objects of unknown shapes, sizes, and texture. Efforts to include compliance have introduced designs that incorporate external mechanical impact protection…

    Submitted 4 January, 2023; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: Accepted for Publication, Soft Robotics Journal - Mary Ann Liebert Inc., Manuscript Details - 20 pages, 17 Figures, 2 Tables

  13. SHREC 2021: Classification in cryo-electron tomograms

    Authors: Ilja Gubins, Marten L. Chaillet, Gijs van der Schot, M. Cristina Trueba, Remco C. Veltkamp, Friedrich Förster, Xiao Wang, Daisuke Kihara, Emmanuel Moebel, Nguyen P. Nguyen, Tommi White, Filiz Bunyak, Giorgos Papoulias, Stavros Gerolymatos, Evangelia I. Zacharaki, Konstantinos Moustakas, Xiangrui Zeng, Sinuo Liu, Min Xu, Yaoyu Wang, Cheng Chen, Xuefeng Cui, Fa Zhang

    Abstract: Cryo-electron tomography (cryo-ET) is an imaging technique that allows three-dimensional visualization of macro-molecular assemblies under near-native conditions. Cryo-ET comes with a number of challenges, mainly low signal-to-noise and inability to obtain images from all angles. Computational methods are key to analyze cryo-electron tomograms. To promote innovation in computational methods, we…

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: Workshop version of the paper can be found here: https://diglib.eg.org/handle/10.2312/3dor20211307

  14. arXiv:2201.04581  [pdf, other]

    cs.SD eess.AS

    Sound-Dr: Reliable Sound Dataset and Baseline Artificial Intelligence System for Respiratory Illnesses

    Authors: Truong V. Hoang, Quang H. Nguyen, Cuong Q. Nguyen, Phong X. Nguyen, Hoang D. Nguyen

    Abstract: As the burden of respiratory diseases continues to fall on society worldwide, this paper proposes a high-quality and reliable dataset of human sounds for studying respiratory illnesses, including pneumonia and COVID-19. It consists of coughing, mouth breathing, and nose breathing sounds together with metadata on related clinical characteristics. We also develop a proof-of-concept system for establ…

    Submitted 4 August, 2023; v1 submitted 12 January, 2022; originally announced January 2022.

    Comments: 9 pages, PHMAP2023, PHM

    MSC Class: 68-11; 92-XX ACM Class: E.0; I.2.1

    Journal ref: IJPHM (2023)

  15. arXiv:2112.09172  [pdf, ps, other]

    cs.CV cs.LG eess.IV

    An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification

    Authors: Lam Pham, Dat Ngo, Phu X. Nguyen, Truong Hoang, Alexander Schindler

    Abstract: This paper presents a task of audio-visual scene classification (SC) where input videos are classified into one of five real-life crowded scenes: 'Riot', 'Noise-Street', 'Firework-Event', 'Music-Event', and 'Sport-Atmosphere'. To this end, we firstly collect an audio-visual dataset (videos) of these five crowded contexts from Youtube (in-the-wild scenes). Then, a wide range of deep learning framew…

    Submitted 16 December, 2021; originally announced December 2021.

  16. arXiv:2110.01605  [pdf, other]

    eess.IV cs.CV cs.LG

    CCS-GAN: COVID-19 CT-scan classification with very few positive training images

    Authors: Sumeet Menon, Jayalakshmi Mangalagiri, Josh Galita, Michael Morris, Babak Saboury, Yaacov Yesha, Yelena Yesha, Phuong Nguyen, Aryya Gangopadhyay, David Chapman

    Abstract: We present a novel algorithm that is able to classify COVID-19 pneumonia from CT Scan slices using a very small sample of training images exhibiting COVID-19 pneumonia in tandem with a larger number of normal images. This algorithm is able to achieve high classification accuracy using as few as 10 positive training slices (from 10 positive cases), which to the best of our knowledge is one order of…

    Submitted 1 October, 2021; originally announced October 2021.

    Comments: 10 pages, 9 figures, 1 table, submitted to IEEE Transactions on Medical Imaging

  17. Automated Workers Ergonomic Risk Assessment in Manual Material Handling using sEMG Wearable Sensors and Machine Learning

    Authors: Srimantha E. Mudiyanselage, Phuong H. D. Nguyen, Mohammad Sadra Rajabi, Reza Akhavian

    Abstract: Manual material handling tasks have the potential to be highly unsafe from an ergonomic viewpoint. Safety inspections to monitor body postures can help mitigate ergonomic risks of material handling. However, the real effect of awkward muscle movements, strains, and excessive forces that may result in an injury may not be identified by external cues. This paper evaluates the ability of surface elec…

    Submitted 27 September, 2021; originally announced September 2021.

    Journal ref: Electronics. 2021; 10(20):2558

  18. arXiv:2109.07673  [pdf, other]

    eess.SY cs.MA cs.RO

    Back to the Future: Efficient, Time-Consistent Solutions in Reach-Avoid Games

    Authors: Dennis R. Anthony, Duy P. Nguyen, David Fridovich-Keil, Jaime F. Fisac

    Abstract: We study the class of reach-avoid dynamic games in which multiple agents interact noncooperatively, and each wishes to satisfy a distinct target criterion while avoiding a failure criterion. Reach-avoid games are commonly used to express safety-critical optimal control problems found in mobile robot motion planning. Here, we focus on finding time-consistent solutions, in which future motion plans…

    Submitted 2 March, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

    Comments: accepted to ICRA 2022

  19. arXiv:2107.06488  [pdf]

    math.NA cs.CE eess.SP

    Behavior Analysis and Design of Concrete-Filled Steel Circular-Tube Short Columns Subjected to Axial Compression

    Authors: Duc-Duy Pham, Phu-Cuong Nguyen

    Abstract: In this paper, a new finite element (FE) model using ABAQUS software was developed to investigate the compressive behavior of Concrete-Filled Steel Circular-Tube (CFSCT) columns. Experimental studies indicated that the confinement offered by the circular steel tube in a CFSCT column increased both the strength and ductility of the filled concrete. Based on the database of 663 test results CFSCT col…

    Submitted 14 July, 2021; originally announced July 2021.

    Comments: 44 pages, 20 figures, an international paper, 7 tables, 2 authors, Phu-Cuong Nguyen is the corresponding author, Duc-Duy Pham is Master student

    Report number: E2019.02.2

  20. arXiv:2104.02060  [pdf]

    eess.IV cs.CV cs.LG

    Toward Generating Synthetic CT Volumes using a 3D-Conditional Generative Adversarial Network

    Authors: Jayalakshmi Mangalagiri, David Chapman, Aryya Gangopadhyay, Yaacov Yesha, Joshua Galita, Sumeet Menon, Yelena Yesha, Babak Saboury, Michael Morris, Phuong Nguyen

    Abstract: We present a novel conditional Generative Adversarial Network (cGAN) architecture that is capable of generating 3D Computed Tomography scans in voxels from noisy and/or pixelated approximations and with the potential to generate full synthetic 3D scan volumes. We believe conditional cGAN to be a tractable approach to generate 3D CT volumes, even though the problem of generating full resolution dee…

    Submitted 2 April, 2021; originally announced April 2021.

    Comments: It is a short paper accepted in CSCI 2020 conference and is accepted to publication in the IEEE CPS proceedings

  21. arXiv:2102.03893  [pdf, other]

    eess.SY

    Enhancement of Distribution System State Estimation Using Pruned Physics-Aware Neural Networks

    Authors: Minh-Quan Tran, Ahmed S. Zamzam, Phuong H. Nguyen

    Abstract: Realizing complete observability in the three-phase distribution system remains a challenge that hinders the implementation of classic state estimation algorithms. In this paper, a new method, called the pruned physics-aware neural network (P2N2), is developed to improve the voltage estimation accuracy in the distribution system. The method relies on the physical grid topology, which is used to de…

    Submitted 15 October, 2021; v1 submitted 7 February, 2021; originally announced February 2021.

  22. arXiv:2010.11682  [pdf]

    eess.IV cs.CV cs.LG

    Lung Nodule Classification Using Biomarkers, Volumetric Radiomics and 3D CNNs

    Authors: Kushal Mehta, Arshita Jain, Jayalakshmi Mangalagiri, Sumeet Menon, Phuong Nguyen, David R. Chapman

    Abstract: We present a hybrid algorithm to estimate lung nodule malignancy that combines imaging biomarkers from Radiologist's annotation with image classification of CT scans. Our algorithm employs a 3D Convolutional Neural Network (CNN) as well as a Random Forest in order to combine CT imagery with biomarker annotation and volumetric radiomic features. We analyze and compare the performance of the algorit…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: This paper has been submitted to the Journal of Digital Imaging (JDI 2020). The poster of this paper has received the 2nd prize for the Research Poster Award. Link: https://siim.org/page/20m_p_lung_node_malignancy

  23. arXiv:2009.12478  [pdf, other]

    cs.LG cs.CV eess.IV

    Generating Realistic COVID19 X-rays with a Mean Teacher + Transfer Learning GAN

    Authors: Sumeet Menon, Joshua Galita, David Chapman, Aryya Gangopadhyay, Jayalakshmi Mangalagiri, Phuong Nguyen, Yaacov Yesha, Yelena Yesha, Babak Saboury, Michael Morris

    Abstract: COVID-19 is a novel infectious disease responsible for over 800K deaths worldwide as of August 2020. The need for rapid testing is a high priority and alternative testing strategies including X-ray image classification are a promising area of research. However, at present, public datasets for COVID19 x-ray images have low data volumes, making it challenging to develop accurate image classifiers. S…

    Submitted 25 September, 2020; originally announced September 2020.

    Comments: 10 pages, 11 figures, 2 tables; Submitted to IEEE BigData 2020 conference

  24. arXiv:2009.07520  [pdf, other]

    stat.ML cs.LG eess.IV math.ST

    PCA Reduced Gaussian Mixture Models with Applications in Superresolution

    Authors: Johannes Hertrich, Dang Phoung Lan Nguyen, Jean-Fancois Aujol, Dominique Bernard, Yannick Berthoumieu, Abdellatif Saadaldin, Gabriele Steidl

    Abstract: Despite the rapid development of computational hardware, the treatment of large and high dimensional data sets is still a challenging problem. This paper provides a twofold contribution to the topic. First, we propose a Gaussian Mixture Model in conjunction with a reduction of the dimensionality of the data in each component of the model by principal component analysis, called PCA-GMM. To learn th…

    Submitted 6 May, 2021; v1 submitted 16 September, 2020; originally announced September 2020.

    Journal ref: Inverse Problems and Imaging, vol. 16, pp. 341-366, 2022

  25. arXiv:2008.06828  [pdf, other]

    cs.CV cs.LG eess.IV

    A novel approach to remove foreign objects from chest X-ray images

    Authors: Hieu X. Le, Phuong D. Nguyen, Thang H. Nguyen, Khanh N. Q. Le, Thanh T. Nguyen

    Abstract: We initially proposed a deep learning approach for foreign objects inpainting in smartphone-camera captured chest radiographs utilizing the cheXphoto dataset. Foreign objects which can significantly affect the quality of a computer-aided diagnostic prediction are captured under various settings. In this paper, we used multi-method to tackle both removal and inpainting chest radiographs. Firstly, a…

    Submitted 15 August, 2020; originally announced August 2020.

    Comments: 9 pages, 7 figures, 7 tables

  26. arXiv:2005.03271  [pdf, other]

    eess.AS cs.CL

    RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions

    Authors: Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu

    Abstract: In recent years, all-neural end-to-end approaches have obtained state-of-the-art results on several challenging automatic speech recognition (ASR) tasks. However, most existing works focus on building ASR models where train and test data are drawn from the same domain. This results in poor generalization characteristics on mismatched-domains: e.g., end-to-end models trained on short segments perfo…

    Submitted 23 December, 2020; v1 submitted 7 May, 2020; originally announced May 2020.

    Comments: SLT camera-ready version

  27. arXiv:2003.10822  [pdf, other]

    eess.IV cs.CV

    Pre-processing Image using Brightening, CLAHE and RETINEX

    Authors: Thi Phuoc Hanh Nguyen, Zinan Cai, Khanh Nguyen, Sokuntheariddh Keth, Ningyuan Shen, Mira Park

    Abstract: This paper focuses on finding the most optimal pre-processing methods considering three common algorithms for image enhancement: Brightening, CLAHE and Retinex. For the purpose of image training in general, these methods will be combined to find out the most optimal method for image enhancement. We have carried out the research on the different permutation of three methods: Brightening, CLAHE and…

    Submitted 22 March, 2020; originally announced March 2020.

  28. arXiv:2003.09677  [pdf, ps, other]

    eess.SP

    UAV-Assisted Secure Communications in Terrestrial Cognitive Radio Networks: Joint Power Control and 3D Trajectory Optimization

    Authors: Phu X. Nguyen, Van-Dinh Nguyen, Hieu V. Nguyen, Oh-Soon Shin

    Abstract: This paper considers secure communications for an underlay cognitive radio network (CRN) in the presence of an external eavesdropper (Eve). The secrecy performance of CRNs is usually limited by the primary receiver's interference power constraint. To overcome this issue, we propose to use an unmanned aerial vehicle (UAV) as a friendly jammer to interfere with Eve in decoding the confidential messa…

    Submitted 25 March, 2020; v1 submitted 21 March, 2020; originally announced March 2020.

  29. arXiv:1911.10229  [pdf]

    eess.IV eess.SP q-bio.QM

    Improved motion correction for functional MRI using an omnibus regression model

    Authors: Vyom Raval, Kevin P. Nguyen, Albert Montillo

    Abstract: Head motion during functional Magnetic Resonance Imaging acquisition can significantly contaminate the neural signal and introduce spurious, distance-dependent changes in signal correlations. This can heavily confound studies of development, aging, and disease. Previous approaches to suppress head motion artifacts have involved sequential regression of nuisance covariates, but this has been shown…

    Submitted 21 January, 2020; v1 submitted 22 November, 2019; originally announced November 2019.

    Comments: 4 pages, 2 figures, accepted for IEEE ISBI 2020 conference Updated following ISBI reviewer suggestions

  30. arXiv:1911.10227  [pdf]

    eess.SP cs.LG q-bio.NC q-bio.QM

    Prediction of individual progression rate in Parkinson's disease using clinical measures and biomechanical measures of gait and postural stability

    Authors: Vyom Raval, Kevin P. Nguyen, Ashley Gerald, Richard B. Dewey Jr., Albert Montillo

    Abstract: Parkinson's disease (PD) is a common neurological disorder characterized by gait impairment. PD has no cure, and an impediment to developing a treatment is the lack of any accepted method to predict disease progression rate. The primary aim of this study was to develop a model using clinical measures and biomechanical measures of gait and postural stability to predict an individual's PD progressio…

    Submitted 22 November, 2019; originally announced November 2019.

    Comments: 5 pages, 4 figures, IEEE ICASSP conference submission

  31. arXiv:1911.02242  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    A comparison of end-to-end models for long-form speech recognition

    Authors: Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara Sainath, Yonghui Wu

    Abstract: End-to-end automatic speech recognition (ASR) models, including both attention-based models and the recurrent neural network transducer (RNN-T), have shown superior performance compared to conventional systems. However, previous studies have focused primarily on short utterances that typically last for just a few seconds or, at most, a few tens of seconds. Whether such architectures are practical…

    Submitted 6 November, 2019; originally announced November 2019.

    Comments: ASRU camera-ready version

  32. arXiv:1910.08112  [pdf, other]

    cs.LG eess.IV q-bio.NC stat.AP stat.ML

    Anatomically-Informed Data Augmentation for functional MRI with Applications to Deep Learning

    Authors: Kevin P. Nguyen, Cherise Chin Fatt, Alex Treacher, Cooper Mellema, Madhukar H. Trivedi, Albert Montillo

    Abstract: The application of deep learning to build accurate predictive models from functional neuroimaging data is often hindered by limited dataset sizes. Though data augmentation can help mitigate such training obstacles, most data augmentation methods have been developed for natural images as in computer vision tasks such as CIFAR, not for medical images. This work helps to fill in this gap by proposin…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: SPIE Medical Imaging 2020

  33. arXiv:1910.02785  [pdf, other]

    cs.LG cs.CR eess.IV stat.ML

    BUZz: BUffer Zones for defending adversarial examples in image classification

    Authors: Kaleel Mahmood, Phuong Ha Nguyen, Lam M. Nguyen, Thanh Nguyen, Marten van Dijk

    Abstract: We propose a novel defense against all existing gradient based adversarial attacks on deep neural networks for image classification problems. Our defense is based on a combination of deep neural networks and simple image transformations. While straightforward in implementation, this defense yields a unique security property which we term buffer zones. We argue that our defense based on buffer zone…

    Submitted 16 June, 2020; v1 submitted 3 October, 2019; originally announced October 2019.

  34. arXiv:1909.13055  [pdf, other]

    cs.CV cs.LG eess.IV

    DeepUSPS: Deep Robust Unsupervised Saliency Prediction With Self-Supervision

    Authors: Duc Tam Nguyen, Maximilian Dax, Chaithanya Kumar Mummadi, Thi Phuong Nhung Ngo, Thi Hoai Phuong Nguyen, Zhongyu Lou, Thomas Brox

    Abstract: Deep neural network (DNN) based salient object detection in images based on high-quality labels is expensive. Alternative unsupervised approaches rely on careful selection of multiple handcrafted saliency methods to generate noisy pseudo-ground-truth labels. In this work, we propose a two-stage mechanism for robust unsupervised object saliency prediction, where the first stage involves refinement…

    Submitted 15 March, 2021; v1 submitted 28 September, 2019; originally announced September 2019.

    Comments: NeuRIPS-2019 (Vancouver, Canada): camera ready version

  35. arXiv:1906.09548  [pdf, ps, other]

    cs.DC cs.NI eess.SY

    Computation Offloading and Resource Allocation for Backhaul Limited Cooperative MEC Systems

    Authors: Phuong-Duy Nguyen, Vu Nguyen Ha, Long Bao Le

    Abstract: In this paper, we jointly optimize computation offloading and resource allocation to minimize the weighted sum of energy consumption of all mobile users in a backhaul limited cooperative MEC system with multiple fog servers. Considering the partial offloading strategy and TDMA transmission at each base station, the underlying optimization problem with constraints on maximum task latency and limite…

    Submitted 22 June, 2019; originally announced June 2019.

  36. arXiv:1810.07217  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Hierarchical Generative Modeling for Controllable Speech Synthesis

    Authors: Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang

    Abstract: This paper proposes a neural sequence-to-sequence text-to-speech (TTS) model which can control latent attributes in the generated speech that are rarely annotated in the training data, such as speaking style, accent, background noise, and recording conditions. The model is formulated as a conditional generative model based on the variational autoencoder (VAE) framework, with two levels of hierarch…

    Submitted 27 December, 2018; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: 27 pages, accepted to ICLR 2019

  37. arXiv:1806.04558  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

    Authors: Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno, Yonghui Wu

    Abstract: We describe a neural network-based system for text-to-speech (TTS) synthesis that is able to generate speech audio in the voice of many different speakers, including those unseen during training. Our system consists of three independently trained components: (1) a speaker encoder network, trained on a speaker verification task using an independent dataset of noisy speech from thousands of speakers…

    Submitted 2 January, 2019; v1 submitted 12 June, 2018; originally announced June 2018.

    Comments: NeurIPS 2018

    Journal ref: Advances in Neural Information Processing Systems 31 (2018), 4485-4495

  38. arXiv:1712.08335  [pdf]

    eess.SP

    An Efficient Spectral Leakage Filtering for IEEE 802.11af in TV White Space

    Authors: Phu Xuan Nguyen, Thinh Hung Pham, Trang Hoang, Oh-Soon Shin

    Abstract: Orthogonal frequency division multiplexing (OFDM) has been widely adopted for modern wireless standards and become a key enabling technology for cognitive radios. However, one of its main drawbacks is significant spectral leakage due to the accumulation of multiple sinc-shaped subcarriers. In this paper, we present a novel pulse shaping scheme for efficient spectral leakage suppression in OFDM bas…

    Submitted 22 December, 2017; originally announced December 2017.

  39. arXiv:1712.01996  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    An analysis of incorporating an external language model into a sequence-to-sequence model

    Authors: Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Tara N. Sainath, Zhifeng Chen, Rohit Prabhavalkar

    Abstract: Attention-based sequence-to-sequence models for automatic speech recognition jointly train an acoustic model, language model, and alignment mechanism. Thus, the language model component is only trained on transcribed audio-text pairs. This leads to the use of shallow fusion with an external language model at inference time. Shallow fusion refers to log-linear interpolation with a separately traine…

    Submitted 5 December, 2017; originally announced December 2017.

  40. arXiv:1712.01864  [pdf, other]

    cs.CL cs.SD eess.AS stat.ML

    No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models

    Authors: Tara N. Sainath, Rohit Prabhavalkar, Shankar Kumar, Seungji Lee, Anjuli Kannan, David Rybach, Vlad Schogol, Patrick Nguyen, Bo Li, Yonghui Wu, Zhifeng Chen, Chung-Cheng Chiu

    Abstract: For decades, context-dependent phonemes have been the dominant sub-word unit for conventional acoustic modeling systems. This status quo has begun to be challenged recently by end-to-end models which seek to combine acoustic, pronunciation, and language model components into a single neural network. Such systems, which typically predict graphemes or words, simplify the recognition process since th…

    Submitted 5 December, 2017; originally announced December 2017.

  41. arXiv:1712.01818  [pdf, other]

    cs.CL eess.AS stat.ML

    Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models

    Authors: Rohit Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Kannan

    Abstract: Sequence-to-sequence models, such as attention-based models in automatic speech recognition (ASR), are typically trained to optimize the cross-entropy criterion which corresponds to improving the log-likelihood of the data. However, system performance is usually measured in terms of word error rate (WER), not log-likelihood. Traditional ASR systems benefit from discriminative sequence training whi…

    Submitted 5 December, 2017; originally announced December 2017.

  42. arXiv:1712.01807  [pdf, other]

    cs.CL eess.AS stat.ML

    Improving the Performance of Online Neural Transducer Models

    Authors: Tara N. Sainath, Chung-Cheng Chiu, Rohit Prabhavalkar, Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Zhifeng Chen

    Abstract: Having a sequence-to-sequence model which can operate in an online fashion is important for streaming applications such as Voice Search. The neural transducer (NT) is a streaming sequence-to-sequence model, but it has shown a significant degradation in performance compared to non-streaming models such as Listen, Attend and Spell (LAS). In this paper, we present various improvements to NT. Specifically, we loo…

    Submitted 5 December, 2017; originally announced December 2017.

  43. arXiv:1712.01769  [pdf, other]

    cs.CL cs.SD eess.AS stat.ML

    State-of-the-art Speech Recognition With Sequence-to-Sequence Models

    Authors: Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Ekaterina Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, Michiel Bacchiani

    Abstract: Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS) subsume the acoustic, pronunciation and language model components of a traditional automatic speech recognition (ASR) system into a single neural network. In previous work, we have shown that such architectures are comparable to state-of-the-art ASR systems on dictation tasks, but it was not clear if such archite…

    Submitted 23 February, 2018; v1 submitted 5 December, 2017; originally announced December 2017.

    Comments: ICASSP camera-ready version

  44. arXiv:1712.01541  [pdf, other]

    eess.AS cs.SD

    Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model

    Authors: Bo Li, Tara N. Sainath, Khe Chai Sim, Michiel Bacchiani, Eugene Weinstein, Patrick Nguyen, Zhifeng Chen, Yonghui Wu, Kanishka Rao

    Abstract: Sequence-to-sequence models provide a simple and elegant solution for building speech recognition systems by folding separate components of a typical system, namely acoustic (AM), pronunciation (PM) and language (LM) models, into a single neural network. In this work, we look at one such sequence-to-sequence model, namely listen, attend and spell (LAS), and explore the possibility of training a sin…

    Submitted 5 December, 2017; originally announced December 2017.

    Comments: submitted to ICASSP 2018

  45. arXiv:1711.07274  [pdf, ps, other]

    cs.CL cs.SD eess.AS stat.ML

    Speech recognition for medical conversations

    Authors: Chung-Cheng Chiu, Anshuman Tripathi, Katherine Chou, Chris Co, Navdeep Jaitly, Diana Jaunzeikare, Anjuli Kannan, Patrick Nguyen, Hasim Sak, Ananth Sankar, Justin Tansuwan, Nathan Wan, Yonghui Wu, Xuedong Zhang

    Abstract: In this work we explored building automatic speech recognition models for transcribing doctor-patient conversations. We collected a large-scale dataset of clinical conversations (14,000 hr), designed the task to represent the real-world scenario, and explored several alignment approaches to iteratively improve data quality. We explored both CTC and LAS systems for building speech recognition model…

    Submitted 20 June, 2018; v1 submitted 20 November, 2017; originally announced November 2017.

    Comments: Interspeech 2018 camera ready

  46. arXiv:1710.02928  [pdf, ps, other]

    eess.SP

    Range-Spread Targets Detection in Unknown Doppler Shift via Semi-Definite Programming

    Authors: Mai P. T. Nguyen, I. Song, S. Lee, S. Yoon

    Abstract: Based on the technique of generalized likelihood ratio test, we address detection schemes for Doppler-shifted range-spread targets in Gaussian noise. First, a detection scheme is derived by solving the maximization associated with the estimation of unknown Doppler frequency with semi-definite programming. To lower the computational complexity of the detector, we then consider a simplification of t…

    Submitted 8 October, 2017; originally announced October 2017.

    Comments: First author is Mai P. T. Nguyen

  47. arXiv:1710.02656  [pdf, ps, other]

    eess.SP

    Robust Radar Detection of a Mismatched Steering Vector Embedded in Compound Gaussian Clutter

    Authors: Mai P. T. Nguyen, I. Song

    Abstract: The problem of radar detection in compound Gaussian clutter when a radar signature is not completely known has not been considered yet and is addressed in this paper. We propose a robust technique to detect, based on the generalized likelihood ratio test, a point-like target embedded in compound Gaussian clutter. Employing an array of antennas, we assume that the actual steering vector departs fr…

    Submitted 7 October, 2017; originally announced October 2017.

    Comments: 7 pages, 5 figures

  48. arXiv:1602.06667  [pdf, other]

    cs.RO cs.AI eess.SY

    A Motion Planning Strategy for the Active Vision-Based Mapping of Ground-Level Structures

    Authors: Manikandasriram Srinivasan Ramanagopal, André Phu-Van Nguyen, Jerome Le Ny

    Abstract: This paper presents a strategy to guide a mobile ground robot equipped with a camera or depth sensor, in order to autonomously map the visible part of a bounded three-dimensional structure. We describe motion planning algorithms that determine appropriate successive viewpoints and attempt to fill holes automatically in a point cloud produced by the sensing and perception layer. The emphasis is on…

    Submitted 10 November, 2017; v1 submitted 22 February, 2016; originally announced February 2016.

    Comments: Accepted for publication in IEEE Transactions on Automation Science and Engineering. Available in IEEE Xplore at http://ieeexplore.ieee.org/document/8093664