Search | arXiv e-print repository

arXiv:2405.02563 [pdf, other]

Deep Representation Learning-Based Dynamic Trajectory Phenotyping for Acute Respiratory Failure in Medical Intensive Care Units

Authors: Alan Wu, Tilendra Choudhary, Pulakesh Upadhyaya, Ayman Ali, Philip Yang, Rishikesan Kamaleswaran

Abstract: Sepsis-induced acute respiratory failure (ARF) is a serious complication with a poor prognosis. This paper presents a deep representation learning-based phenotyping method to identify distinct groups of clinical trajectories of septic patients with ARF. For this retrospective study, we created a dataset from electronic medical records (EMR) consisting of data from sepsis patients admitted to medical intensive care units who required at least 24 hours of invasive mechanical ventilation at a quaternary care academic hospital in southeast USA for the years 2016-2021. A total of N=3349 patient encounters were included in this study. Clustering Representation Learning on Incomplete Time Series Data (CRLI) algorithm was applied to a parsimonious set of EMR variables in this data set. To validate the optimal number of clusters, the K-means algorithm was used in conjunction with dynamic time warping. Our model yielded four distinct patient phenotypes that were characterized as liver dysfunction/heterogeneous, hypercapnia, hypoxemia, and multiple organ dysfunction syndrome by a critical care expert. A Kaplan-Meier analysis to compare the 28-day mortality trends exhibited significant differences (p < 0.005) between the four phenotypes. The study demonstrates the utility of our deep representation learning-based approach in unraveling phenotypes that reflect the heterogeneity in sepsis-induced ARF in terms of different mortality outcomes and severity. These phenotypes might reveal important clinical insights into an effective prognosis and tailored treatment strategies.

Submitted 4 May, 2024; originally announced May 2024.

Comments: 9 pages

arXiv:2403.13601 [pdf, other]

Lattice piecewise affine approximation of explicit model predictive control with application to satellite attitude control

Authors: Zhengqi Xu, Jun Xu, Ai-Guo Wu, Shuning Wang

Abstract: Satellite attitude control is a crucial part of aerospace technology, and model predictive control (MPC) is one of the most promising controllers in this area, which will be less effective if real-time online optimization cannot be achieved. Explicit MPC converts the online calculation into a table lookup process; however, the solution is difficult to obtain if the system dimension is high or the constraints are complex. The lattice piecewise affine (PWA) function has been used to represent the control law of explicit MPC; although the online calculation complexity is reduced, the offline calculation is still prohibitive for complex problems. In this paper, we use the sample points in the feasible region with their corresponding affine functions to construct the lattice PWA approximation of the optimal MPC controller designed for satellite attitude control. The asymptotic stability of the satellite attitude control system under the lattice PWA approximation has been proven, and simulations are executed to verify that the proposed method can achieve almost the same performance as linear online MPC with much lower online computational complexity, and that it uses less fuel than the LQR method.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2401.12974 [pdf, other]

SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI

Authors: Hanxue Gu, Roy Colglazier, Haoyu Dong, Jikai Zhang, Yaqian Chen, Zafer Yildiz, Yuwen Chen, Lin Li, Jichen Yang, Jay Willhite, Alex M. Meyer, Brian Guo, Yashvi Atul Shah, Emily Luo, Shipra Rajput, Sally Kuehn, Clark Bulleit, Kevin A. Wu, Jisoo Lee, Brandon Ramirez, Darui Lu, Jay M. Levin, Maciej A. Mazurowski

Abstract: Magnetic Resonance Imaging (MRI) is pivotal in radiology, offering non-invasive and high-quality insights into the human body. Precise segmentation of MRIs into different organs and tissues would be highly beneficial since it would allow for a higher level of understanding of the image content and enable important measurements, which are essential for accurate diagnosis and effective treatment planning. Specifically, segmenting bones in MRI would allow for more quantitative assessments of musculoskeletal conditions, while such assessments are largely absent in current radiological practice. The difficulty of bone MRI segmentation is illustrated by the fact that limited algorithms are publicly available for use, and those contained in the literature typically address a specific anatomic area. In our study, we propose a versatile, publicly available deep-learning model for bone segmentation in MRI across multiple standard MRI locations. The proposed model can operate in two modes: fully automated segmentation and prompt-based segmentation.
Our contributions include (1) collecting and annotating a new MRI dataset across various MRI protocols, encompassing over 300 annotated volumes and 8485 annotated slices across diverse anatomic regions; (2) investigating several standard network architectures and strategies for automated segmentation; (3) introducing SegmentAnyBone, an innovative foundational model-based approach that extends Segment Anything Model (SAM); (4) comparative analysis of our algorithm and previous approaches; and (5) generalization analysis of our algorithm across different anatomical locations and MRI sequences, as well as an external dataset. We publicly release our model at https://github.com/mazurowski-lab/SegmentAnyBone.

Submitted 23 January, 2024; originally announced January 2024.

Comments: 15 pages, 15 figures

arXiv:2311.08840 [pdf, other]

An MRL-Based Design Solution for RIS-Assisted MU-MIMO Wireless System under Time-Varying Channels

Authors: Meng-Qian Alexander Wu, Tzu-Hsien Sang, Luisa Schuhmacher, Ming-Jie Guo, Khodr Hammoud, Sofie Pollin

Abstract: Utilizing Deep Reinforcement Learning (DRL) for Reconfigurable Intelligent Surface (RIS) assisted wireless communication has been extensively researched. However, existing DRL methods either act as a simple optimizer or only solve problems with concurrent Channel State Information (CSI) represented in the training data set. Consequently, solutions for RIS-assisted wireless communication systems under time-varying environments are relatively unexplored. However, communication problems should be considered with realistic assumptions; for instance, in scenarios where the channel is time-varying, the policy obtained by reinforcement learning should be applicable for situations where CSI is not well represented in the training data set. In this paper, we apply Meta-Reinforcement Learning (MRL) to the joint optimization problem of active beamforming at the Base Station (BS) and phase shift at the RIS, motivated by MRL's ability to extend the DRL concept of solving one Markov Decision Problem (MDP) to multiple MDPs. We provide simulation results to compare the average sum rate of the proposed approach with those of selected forerunners in the literature. Our approach improves the sum rate by more than 60% under time-varying CSI assumption while maintaining the advantages of typical DRL-based solutions. Our study's results emphasize the possibility of utilizing MRL-based designs in RIS-assisted wireless communication systems while considering realistic environment assumptions.

Submitted 15 November, 2023; originally announced November 2023.

Comments: To be published in proceedings of the 2023 IEEE Conference on Global Communications (GLOBECOM)

arXiv:2308.02724 [pdf, other]

A Tracking prior to Localization workflow for Ultrasound Localization Microscopy

Authors: Alexis Leconte, Jonathan Porée, Brice Rauby, Alice Wu, Nin Ghigo, Paul Xing, Chloé Bourquin, Gerardo Ramos-Palacios, Abbas F. Sadikot, Jean Provost

Abstract: Ultrasound Localization Microscopy (ULM) has proven effective in resolving microvascular structures and local mean velocities at sub-diffraction-limited scales, offering high-resolution imaging capabilities. Dynamic ULM (DULM) enables the creation of angiography or velocity movies throughout cardiac cycles. Currently, these techniques rely on a Localization-and-Tracking (LAT) workflow, which consists of detecting microbubbles (MB) in the frames before pairing them to generate tracks. While conventional LAT methods perform well at low concentrations, they suffer from longer acquisition times and degraded localization and tracking accuracy at higher concentrations, leading to biased angiogram reconstruction and velocity estimation. In this study, we propose a novel approach to address these challenges by reversing the current workflow. The proposed method, Tracking-and-Localization (TAL), relies on first tracking the MB and then performing localization. Through comprehensive benchmarking using both in silico and in vivo experiments and employing various metrics to quantify ULM angiography and velocity maps, we demonstrate that the TAL method consistently outperforms the reference LAT workflow. Moreover, when applied to DULM, TAL successfully extracts velocity variations along the cardiac cycle with improved repeatability. The findings of this work highlight the effectiveness of the TAL approach in overcoming the limitations of conventional LAT methods, providing enhanced ULM angiography and velocity imaging.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2303.08105 [pdf]

Image Guidance for Robot-Assisted Ankle Fracture Repair

Authors: Asef Islam, Anthony Wu, Jay Mandavilli, Wojtek Zbijewski, Jeff Siewerdsen

Abstract: This project concerns developing and validating an image guidance framework for application to a robotic-assisted fibular reduction in ankle fracture surgery. The aim is to produce and demonstrate proper functioning of software for automatic determination of directions for fibular repositioning with the ultimate goal of application to a robotic reduction procedure that can reduce the time and complexity of the procedure as well as provide the benefits of reduced error in ideal final fibular position, improved syndesmosis restoration and reduced incidence of post-traumatic osteoarthritis. The focus of this product will be developing and testing the image guidance software, from the input of preoperative images through the steps of automated segmentation and registration until the output of a final transformation that can be used as instructions to a robot on how to reposition the fibula, but will not involve developing or implementing the hardware of the robot itself.

Submitted 18 March, 2023; v1 submitted 31 January, 2023; originally announced March 2023.

arXiv:2110.14174 [pdf, other]

doi 10.1109/TAC.2022.3169582

On the Dynamics of the Tavis-Cummings Model

Authors: Zhiyuan Dong, Guofeng Zhang, Ai-Guo Wu, Re-Bing Wu

Abstract: The purpose of this paper is to present a comprehensive study of the Tavis-Cummings model from a system-theoretic perspective. A typical form of the Tavis-Cummings model is composed of an ensemble of non-interacting two-level systems (TLSs) that are collectively coupled to a common cavity resonator. The associated quantum linear passive system is proposed, whose canonical form reveals typical features of the Tavis-Cummings model, including $\sqrt{N}$-scaling, dark states, bright states, single-excitation superradiant and subradiant states. The passivity of this linear system is related to the vacuum Rabi mode splitting phenomenon in Tavis-Cummings systems. On the basis of the linear model, an analytic form is presented for the steady-state output state of the Tavis-Cummings model driven by a single-photon state. Master equations are used to study the excitation properties of the Tavis-Cummings model in the multi-excitation scenario. Finally, in terms of the transition matrix for a linear time-varying system, a computational framework is proposed for calculating the state of the Tavis-Cummings model, which is applicable to the multi-excitation case.

Submitted 9 May, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: 16 pages, 8 figures, IEEE Transactions on Automatic Control, to appear

Journal ref: IEEE Transactions on Automatic Control, 2022

arXiv:2110.08828 [pdf]

Compression-aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations

Authors: Yu-Shan Tai, Chieh-Fang Teng, Cheng-Yang Chang, An-Yeu Wu

Abstract: Convolutional neural networks (CNNs) achieve remarkable performance in a wide range of fields. However, intensive memory access of activations introduces considerable energy consumption, impeding deployment of CNNs on resource-constrained edge devices. Existing works in activation compression propose to transform feature maps for higher compressibility, thus enabling dimension reduction. Nevertheless, in the case of aggressive dimension reduction, these methods lead to severe accuracy drop. To improve the trade-off between classification accuracy and compression ratio, we propose a compression-aware projection system, which employs a learnable projection to compensate for the reconstruction loss. In addition, a greedy selection metric is introduced to optimize the layer-wise compression ratio allocation by considering both accuracy and #bits reduction simultaneously. Our test results show that the proposed methods effectively reduce 2.91x~5.97x memory access with negligible accuracy drop on MobileNetV2/ResNet18/VGG16.

Submitted 17 October, 2021; originally announced October 2021.

Comments: 5 pages, 5 figures, submitted to 2022 ICASSP

arXiv:2105.09930 [pdf, other]

Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries

Authors: Sukhdeep S. Sodhi, Ellie Ka-In Chio, Ambarish Jash, Santiago Ontañón, Ajit Apte, Ankit Kumar, Ayooluwakunmi Jeje, Dima Kuzmin, Harry Fung, Heng-Tze Cheng, Jon Effrat, Tarush Bali, Nitin Jindal, Pei Cao, Sarvjeet Singh, Senqiang Zhou, Tameen Khan, Amol Wankhede, Moustafa Alzantot, Allen Wu, Tushar Chandra

Abstract: As more and more online search queries come from voice, automatic speech recognition becomes a key component to deliver relevant search results. Errors introduced by automatic speech recognition (ASR) lead to irrelevant search results returned to the user, thus causing user dissatisfaction. In this paper, we introduce an approach, Mondegreen, to correct voice queries in text space without depending on audio signals, which may not always be available due to system constraints or privacy or bandwidth (for example, some ASR systems run on-device) considerations. We focus on voice queries transcribed via several proprietary commercial ASR systems. These queries come from users making internet, or online service search queries. We first present an analysis showing how different the language distribution coming from user voice queries is from that in traditional text corpora used to train off-the-shelf ASR systems. We then demonstrate that Mondegreen can achieve significant improvements in increased user interaction by correcting user voice queries in one of the largest search systems in Google. Finally, we see Mondegreen as complementing existing highly-optimized production ASR systems, which may not be frequently retrained and thus lag behind due to vocabulary drifts.

Submitted 20 May, 2021; originally announced May 2021.

Comments: Accepted in KDD 2021

arXiv:2101.00390 [pdf, other]

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Authors: Changhan Wang, Morgane Rivière, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux

Abstract: We introduce VoxPopuli, a large-scale multilingual corpus providing 100K hours of unlabelled speech data in 23 languages. It is the largest open data to date for unsupervised representation learning as well as semi-supervised learning. VoxPopuli also contains 1.8K hours of transcribed speeches in 16 languages and their aligned oral interpretations into 5 other languages totaling 5.1K hours. We provide speech recognition baselines and validate the versatility of VoxPopuli unlabelled data in semi-supervised learning under challenging out-of-domain settings. We will release the corpus at https://github.com/facebookresearch/voxpopuli under an open license.

Submitted 27 July, 2021; v1 submitted 2 January, 2021; originally announced January 2021.

Comments: Accepted to ACL 2021 (long paper)

arXiv:2010.05171 [pdf, other]

fairseq S2T: Fast Speech-to-Text Modeling with fairseq

Authors: Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Sravya Popuri, Dmytro Okhonko, Juan Pino

Abstract: We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such as end-to-end speech recognition and speech-to-text translation. It follows fairseq's careful design for scalability and extensibility. We provide end-to-end workflows from data pre-processing, model training to offline (online) inference. We implement state-of-the-art RNN-based, Transformer-based as well as Conformer-based models and open-source detailed training recipes. Fairseq's machine translation models and language models can be seamlessly integrated into S2T workflows for multi-task learning or transfer learning. Fairseq S2T documentation and examples are available at https://github.com/pytorch/fairseq/tree/master/examples/speech_to_text.

Submitted 14 June, 2022; v1 submitted 11 October, 2020; originally announced October 2020.

Comments: Post-conference updates (accepted to AACL 2020 Demo)

arXiv:2007.10310 [pdf, ps, other]

CoVoST 2 and Massively Multilingual Speech-to-Text Translation

Authors: Changhan Wang, Anne Wu, Juan Pino

Abstract: Speech translation has recently become an increasingly popular topic of research, partly due to the development of benchmark datasets. Nevertheless, current datasets cover a limited number of languages. With the aim to foster research in massive multilingual speech translation and speech translation for low resource language pairs, we release CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. This represents the largest open dataset available to date from total volume and language coverage perspective. Data sanity checks provide evidence about the quality of the data, which is released under CC0 license. We also provide extensive speech recognition, bilingual and multilingual machine translation and speech translation baselines with open-source implementation.

Submitted 24 October, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

arXiv:2006.12124 [pdf, ps, other]

Self-Supervised Representations Improve End-to-End Speech Translation

Authors: Anne Wu, Changhan Wang, Juan Pino, Jiatao Gu

Abstract: End-to-end speech-to-text translation can provide a simpler and smaller system but is facing the challenge of data scarcity. Pre-training methods can leverage unlabeled data and have been shown to be effective in data-scarce settings. In this work, we explore whether self-supervised pre-trained speech representations can benefit the speech translation task in both high- and low-resource settings, whether they can transfer well to other languages, and whether they can be effectively combined with other common methods that help improve low-resource end-to-end speech translation such as using a pre-trained high-resource speech recognition system. We demonstrate that self-supervised pre-trained features can consistently improve the translation performance, and that cross-lingual transfer allows extension to a variety of languages with little or no tuning.

Submitted 24 October, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: Accepted to INTERSPEECH 2020

arXiv:2006.01125 [pdf]

Neural Network-Aided BCJR Algorithm for Joint Symbol Detection and Channel Decoding

Authors: Wen-Chiao Tsai, Chieh-Fang Teng, Han-Mo Ou, An-Yeu Wu

Abstract: Recently, deep learning-assisted communication systems have achieved many eye-catching results and attracted more and more researchers in this emerging field. Instead of completely replacing the functional blocks of communication systems with neural networks, a hybrid manner of BCJRNet symbol detection is proposed to combine the advantages of the BCJR algorithm and neural networks. However, its separate block design not only degrades the system performance but also results in additional hardware complexity. In this work, we propose a BCJR receiver for joint symbol detection and channel decoding. It can simultaneously utilize the trellis diagram and channel state information for a more accurate calculation of branch probability and thus achieve global optimum with 2.3 dB gain over separate block design. Furthermore, a dedicated neural network model is proposed to replace the channel-model-based computation of the BCJR receiver, which can avoid the requirements of perfect CSI and is more robust under CSI uncertainty with 1.0 dB gain.

Submitted 21 July, 2020; v1 submitted 30 May, 2020; originally announced June 2020.

Comments: 6 pages, six figures, accepted by 2020 IEEE International Workshop on Signal Processing Systems (SiPS)

arXiv:2005.12076 [pdf]

An Effective Entropy-assisted Mind-wandering Detection System with EEG Signals based on MM-SART Database

Authors: Yi-Ta Chen, Hsing-Hao Lee, Ching-Yen Shih, Zih-Ling Chen, Win-Ken Beh, Su-Ling Yeh, An-Yeu Wu

Abstract: Mind-wandering (MW), which is usually defined as a lapse of attention, occurs 20%-40% of the time and has negative effects on our daily life. Therefore, detecting when MW occurs can protect us from the negative outcomes resulting from MW, such as failing to keep track of a course during learning. In this work, we first collect a multi-modal Sustained Attention to Response Task (MM-SART) database for detecting MW. Eighty-two participants' data are collected in our experiments. For each participant, we collect measures of 32-channel electroencephalogram (EEG) signals, photoplethysmography (PPG) signals, galvanic skin response (GSR) signals, eye tracker signals, and several questionnaires for detailed analyses. Then, we propose an effective MW detection system based on the collected EEG signals. To explore the non-linear characteristics of EEG signals, we utilize the entropy-based features in time, frequency, and wavelet domains. The experimental results show that we can reach 0.712 AUC score by using the random forest (RF) classifier with the leave-one-subject-out cross-validation. Moreover, to lower the overall computational complexity of the MW detection system, we apply techniques of channel selection and feature selection. By using only the two most significant EEG channels, we can reduce the training time of the classifier by 44.16%.
By performing correlation importance feature elimination (CIFE) on the feature set, we can further improve the AUC score to 0.725 but with only 14.6% of the selection time compared with the recursive feature elimination (RFE) method. With the proposed MW detection engine, the current work can be applied to educational scenarios, especially in the present era of remote learning.

Submitted 27 November, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

Comments: 15 pages, Journal version

arXiv:2004.14252 [pdf]

Task-Projected Hyperdimensional Computing for Multi-Task Learning

Authors: Cheng-Yang Chang, Yu-Chuan Chuang, An-Yeu Wu

Abstract: Brain-inspired Hyperdimensional (HD) computing is an emerging technique for cognitive tasks in the field of low-power design. As a fast-learning and energy-efficient computational paradigm, HD computing has shown great success in many real-world applications. However, an HD model incrementally trained on multiple tasks suffers from the negative impacts of catastrophic forgetting. The model forgets the knowledge learned from previous tasks and only focuses on the current one. To the best of our knowledge, no study has been conducted to investigate the feasibility of applying multi-task learning to HD computing. In this paper, we propose Task-Projected Hyperdimensional Computing (TP-HDC) to make the HD model simultaneously support multiple tasks by exploiting the redundant dimensionality in the hyperspace. To mitigate the interferences between different tasks, we project each task into a separate subspace for learning. Compared with the baseline method, our approach efficiently utilizes the unused capacity in the hyperspace and shows a 12.8% improvement in averaged accuracy with negligible memory overhead.

Submitted 29 April, 2020; originally announced April 2020.

Comments: To be published in 16th International Conference on Artificial Intelligence Applications and Innovations

arXiv:2004.10021 [pdf, other]

Mask R-CNN Based Object Detection for Intelligent Wireless Power Transfer

Authors: Aozhou Wu, Qingqing Zhang, Wen Fang, Hao Deng, Sai Jiang, Qingwen Liu

Abstract: Resonant Beam Charging (RBC) is a promising multi-Watt and multi-meter wireless power transfer method with safety, mobility and simultaneous-charging capability. However, RBC system operation relies on information availability including power receiver location, class label and the receiver number. Since the smartphone is the most widely used mobile device, we propose a Mask R-CNN based smartphone detection model in the RBC system. Experiments illustrate that our model reduces the smartphone scanning time to one third. Thus, this machine learning detection approach provides an intelligent way to improve the user experience in wireless power transfer for mobile and Internet of Things (IoT) devices.

Submitted 25 April, 2020; v1 submitted 19 April, 2020; originally announced April 2020.

arXiv:2004.07929 [pdf, ps, other]

doi 10.1109/TAES.2022.3218277

Sliding Mode Attitude Maneuver Control for Rigid Spacecraft without Unwinding

Authors: Rui-Qi Dong, Ai-Guo Wu, Ying Zhang

Abstract: In this paper, attitude maneuver control without unwinding phenomenon is investigated for rigid spacecraft. First, a novel switching function is constructed by a hyperbolic sine function. It is shown that the spacecraft system possesses the unwinding-free performance when the system states are on the sliding surface. Based on the designed switching function, a sliding mode controller is developed to ensure the robustness of the attitude maneuver control system. Another essential feature of the presented attitude control law is that a dynamic parameter is introduced to guarantee the unwinding-free performance when the system states are outside the sliding surface. The simulation results demonstrate that the unwinding phenomenon is avoided during the attitude maneuver of a rigid spacecraft by adopting the constructed switching function and the proposed attitude control scheme.

Submitted 15 April, 2020; originally announced April 2020.

Comments: 8 Pages, 8 figures. arXiv admin note: text overlap with arXiv:2004.07001

arXiv:2004.07001 [pdf, ps, other]

doi 10.1109/TAC.2021.3079220

Anti-Unwinding Sliding Mode Attitude Maneuver Control for Rigid Spacecraft

Authors: Rui-Qi Dong, Ai-Guo Wu, Ying Zhang

Abstract: In this paper, anti-unwinding attitude maneuver control for rigid spacecraft is considered. First, in order to avoid the unwinding phenomenon when the system states are restricted to the switching surface, a novel switching function is constructed by hyperbolic sine functions such that the switching surface contains two equilibriums. Then, a sliding mode attitude maneuver controller is designed based on the constructed switching function to ensure the robustness of the closed-loop attitude maneuver control system to disturbance. Another important feature of the developed attitude control law is that a dynamic parameter is introduced to guarantee the anti-unwinding performance before the system states reach the switching surface. The simulation results demonstrate that the unwinding problem is settled during attitude maneuver for rigid spacecraft by adopting the newly constructed switching function and proposed attitude control scheme.

Submitted 15 April, 2020; originally announced April 2020.

Comments: 8 pages, 8 figures

arXiv:2001.01395 [pdf]

Accumulated Polar Feature-based Deep Learning for Efficient and Lightweight Automatic Modulation Classification with Channel Compensation Mechanism

Authors: Chieh-Fang Teng, Ching-Yao Chou, Chun-Hsiang Chen, An-Yeu Wu

Abstract: In next-generation communications, massive machine-type communications (mMTC) induce severe burden on base stations. To address such an issue, automatic modulation classification (AMC) can help to reduce signaling overhead by blindly recognizing the modulation types without handshaking. Thus, it plays an important role in future intelligent modems. The emerging deep learning (DL) technique stores intelligence in the network, resulting in superior performance over traditional approaches. However, conventional DL-based approaches suffer from heavy training overhead, memory overhead, and computational complexity, which severely hinder practical applications for resource-limited scenarios, such as Vehicle-to-Everything (V2X) applications. Furthermore, the overhead of online retraining under time-varying fading channels has not been studied in the prior arts. In this work, an accumulated polar feature-based DL with a channel compensation mechanism is proposed to cope with the aforementioned issues. Firstly, the simulation results show that learning features from the polar domain with historical data information can approach near-optimal performance while reducing training overhead by 99.8 times. Secondly, the proposed neural network-based channel estimator (NN-CE) can learn the channel response and compensate for the distorted channel with 13% improvement. Moreover, in applying this lightweight NN-CE in a time-varying fading channel, two efficient mechanisms of online retraining are proposed, which can reduce transmission overhead and retraining overhead by 90% and 76%, respectively. Finally, the performance of the proposed approach is evaluated and compared with prior arts on a public dataset to demonstrate its great efficiency and lightness.

Submitted 7 February, 2020; v1 submitted 5 January, 2020; originally announced January 2020.

Comments: 13 pages, 13 figures, 8 tables

arXiv:1912.05158 [pdf]

Low-Complexity LSTM-Assisted Bit-Flipping Algorithm for Successive Cancellation List Polar Decoder

Authors: Chun-Hsiang Chen, Chieh-Fang Teng, An-Yeu Wu

Abstract: Polar codes have attracted much attention in the past decade due to their capacity-achieving performance. The higher decoding capacity is required for 5G and beyond 5G (B5G). Although the cyclic redundancy check (CRC)-assisted successive cancellation list bit-flipping (CA-SCLF) decoders have been developed to obtain a better performance, the solution to error bit correction (bit-flipping) problem is still imperfect and hard to design. In this work, we leverage the expert knowledge in communication systems and adopt deep learning (DL) technique to obtain the better solution. A low-complexity long short-term memory network (LSTM)-assisted CA-SCLF decoder is proposed to further improve the performance of conventional CA-SCLF and avoid complexity and memory overhead. Our test results show that we can effectively improve the BLER performance by 0.11dB compared to prior work and reduce the complexity and memory overhead by over 30% of the network.

Submitted 11 December, 2019; originally announced December 2019.

Comments: 5 pages, 5 figures

arXiv:1911.01710 [pdf]

Unsupervised Learning for Neural Network-based Polar Decoder via Syndrome Loss

Authors: Chieh-Fang Teng, An-Yeu Wu

Abstract: With the rapid growth of deep learning in many fields, machine learning-assisted communication systems had attracted lots of researches with many eye-catching initial results. At the present stage, most of the methods still have great demand of massive labeled data for supervised learning. However, obtaining labeled data in the practical applications is not feasible, which may result in severe performance degradation due to channel variations. To overcome such a constraint, syndrome loss has been proposed to penalize non-valid decoded codewords and achieve unsupervised learning for neural network-based decoder. However, it cannot be applied to polar decoder directly. In this work, by exploiting the nature of polar codes, we propose a modified syndrome loss. From simulation results, the proposed method demonstrates that domain-specific knowledge and know-how in code structure can enable unsupervised learning for neural network-based polar decoder.

Submitted 5 November, 2019; originally announced November 2019.

Comments: four pages, six figures

arXiv:1911.01704 [pdf]

Convolutional Neural Network-aided Bit-flipping for Belief Propagation Decoding of Polar Codes

Authors: Chieh-Fang Teng, Kuan-Shiuan Ho, Chen-Hsi Wu, Sin-Sheng Wong, An-Yeu Wu

Abstract: Known for their capacity-achieving abilities, polar codes have been selected as the control channel coding scheme for 5G communications. To satisfy the needs of high throughput and low latency, belief propagation (BP) is chosen as the decoding algorithm. However, in general, the error performance of BP is worse than that of enhanced successive cancellation (SC). Recently, critical-set bit-flipping (CS-BF) is applied to BP decoding to lower the error rate. However, its trial and error process result in even longer latency. In this work, we propose a convolutional neural network-assisted bit-flipping (CNN-BF) mechanism to further enhance BP decoding of polar codes. With carefully designed input data and model architecture, our proposed CNN-BF can achieve much higher prediction accuracy and better error correction capability than CS-BF but with only half latency. It also achieves a lower block error rate (BLER) than SC list (CA-SCL).

Submitted 5 February, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

Comments: 5 pages, 6 figures

arXiv:1907.04980 [pdf]

Neural Network-based Equalizer by Utilizing Coding Gain in Advance

Authors: Chieh-Fang Teng, Han-Mo Ou, An-Yeu Wu

Abstract: Recently, deep learning has been exploited in many fields with revolutionary breakthroughs. In the light of this, deep learning-assisted communication systems have also attracted much attention in recent years and have potential to break down the conventional design rule for communication systems. In this work, we propose two kinds of neural network-based equalizers to exploit different characteristics between convolutional neural networks and recurrent neural networks. The equalizer in conventional block-based design may destroy the code structure and degrade the capacity of coding gain for decoder. On the contrary, our proposed approach not only eliminates channel fading, but also exploits the code structure with utilization of coding gain in advance, which can effectively increase the overall utilization of coding gain with more than 1.5 dB gain.

Submitted 31 August, 2019; v1 submitted 10 July, 2019; originally announced July 2019.

Comments: 5 pages, 4 figures, accepted by the 2019 Seventh IEEE Global Conference on Signal and Information Processing

arXiv:1810.12154 [pdf]

Low-complexity Recurrent Neural Network-based Polar Decoder with Weight Quantization Mechanism

Authors: Chieh-Fang Teng, Chen-Hsi Wu, Kuan-Shiuan Ho, An-Yeu Wu

Abstract: Polar codes have drawn much attention and been adopted in 5G New Radio (NR) due to their capacity-achieving performance. Recently, as the emerging deep learning (DL) technique has breakthrough achievements in many fields, neural network decoder was proposed to obtain faster convergence and better performance than belief propagation (BP) decoding. However, neural networks are memory-intensive and hinder the deployment of DL in communication systems. In this work, a low-complexity recurrent neural network (RNN) polar decoder with codebook-based weight quantization is proposed. Our test results show that we can effectively reduce the memory overhead by 98% and alleviate computational complexity with slight performance loss.

Submitted 1 February, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

Comments: 5 pages, accepted by the 2019 International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

arXiv:1810.02027 [pdf]

Polar Feature Based Deep Architectures for Automatic Modulation Classification Considering Channel Fading

Authors: Chieh-Fang Teng, Ching-Chun Liao, Chun-Hsiang Chen, An-Yeu Wu

Abstract: To develop intelligent receivers, automatic modulation classification (AMC) plays an important role for better spectrum utilization. The emerging deep learning (DL) technique has received much attention in AMC due to its superior performance in classifying data with deep structure. In this work, a novel polar-based deep learning architecture with channel compensation network (CCN) is proposed. Our test results show that learning features from polar domain (r-theta) can improve recognition accuracy by 5% and reduce training overhead by 48%. Besides, the proposed CCN is also robust to channel fading, such as amplitude and phase offsets, and can improve the recognition accuracy by 14% under practical channel environments.

Submitted 7 October, 2018; v1 submitted 3 October, 2018; originally announced October 2018.

Comments: 5 pages, accepted by the 2018 Sixth IEEE Global Conference on Signal and Information Processing

arXiv:1809.08410 [pdf, other]

Entropy-Assisted Multi-Modal Emotion Recognition Framework Based on Physiological Signals

Authors: Kuan Tung, Po-Kang Liu, Yu-Chuan Chuang, Sheng-Hui Wang, An-Yeu Wu

Abstract: As the result of the growing importance of the Human Computer Interface system, understanding human's emotion states has become a consequential ability for the computer. This paper aims to improve the performance of emotion recognition by conducting the complexity analysis of physiological signals. Based on AMIGOS dataset, we extracted several entropy-domain features such as Refined Composite Multi-Scale Entropy (RCMSE), Refined Composite Multi-Scale Permutation Entropy (RCMPE) from ECG and GSR signals, and Multivariate Multi-Scale Entropy (MMSE), Multivariate Multi-Scale Permutation Entropy (MMPE) from EEG, respectively. The statistical results show that RCMSE in GSR has a dominating performance in arousal, while RCMPE in GSR would be the excellent feature in valence. Furthermore, we selected XGBoost model to predict emotion and get 68% accuracy in arousal and 84% in valence.

Submitted 22 September, 2018; originally announced September 2018.

arXiv:1301.2722 [pdf, other]

Distributed Consensus Formation Through Unconstrained Gossiping

Authors: Christopher D. Hollander, Annie S. Wu

Abstract: Gossip algorithms are widely used to solve the distributed consensus problem, but issues can arise when nodes receive multiple signals either at the same time or before they are able to finish processing their current work load. Specifically, a node may assume a new state that represents a linear combination of all received signals; even if such a state makes no sense in the problem domain. As a solution to this problem, we introduce the notion of conflict resolution for gossip algorithms and prove that their application leads to a valid consensus state when the underlying communication network possesses certain properties. We also introduce a methodology based on absorbing Markov chains for analyzing gossip algorithms that make use of these conflict resolution algorithms. This technique allows us to calculate both the probabilities of converging to a specific consensus state and the time that such convergence is expected to take. Finally, we make use of simulation to validate our methodology and explore the temporal behavior of gossip algorithms as the size of the network, the number of states per node, and the network density increase.

Submitted 12 January, 2013; originally announced January 2013.

Showing 1–28 of 28 results for author: Wu, A