Showing 1–50 of 66 results for author: Wang, N

Searching in archive eess.

  1. arXiv:2406.16928  [pdf, other]

    eess.SP cs.LG

    A Multi-Resolution Mutual Learning Network for Multi-Label ECG Classification

    Authors: Wei Huang, Ning Wang, Panpan Feng, Haiyan Wang, Zongmin Wang, Bing Zhou

    Abstract: Electrocardiograms (ECG), which record the electrophysiological activity of the heart, have become a crucial tool for diagnosing these diseases. In recent years, the application of deep learning techniques has significantly improved the performance of ECG signal classification. Multi-resolution feature analysis, which captures and processes information at different time scales, can extract subtle…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2406.01802  [pdf, other]

    cs.RO eess.SY

    Motion Planning for Hybrid Dynamical Systems: Framework, Algorithm Template, and a Sampling-based Approach

    Authors: Nan Wang, Ricardo G. Sanfelice

    Abstract: This paper focuses on the motion planning problem for the systems exhibiting both continuous and discrete behaviors, which we refer to as hybrid dynamical systems. Firstly, the motion planning problem for hybrid systems is formulated using the hybrid equation framework, which is general to capture most hybrid systems. Secondly, a propagation algorithm template is proposed that describes a general…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 33 pages, 8 figures. Submitted to IJRR. arXiv admin note: text overlap with arXiv:2210.15082

  3. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  4. arXiv:2404.13388  [pdf]

    eess.IV cs.CV cs.LG

    Diagnosis of Multiple Fundus Disorders Amidst a Scarcity of Medical Experts Via Self-supervised Machine Learning

    Authors: Yong Liu, Mengtian Kang, Shuo Gao, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Arokia Nathan, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Luigi Occhipinti

    Abstract: Fundus diseases are major causes of visual impairment and blindness worldwide, especially in underdeveloped regions, where the shortage of ophthalmologists hinders timely diagnosis. AI-assisted fundus image analysis has several advantages, such as high accuracy, reduced workload, and improved accessibility, but it requires a large amount of expert-annotated data to build reliable models. To addres…

    Submitted 23 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  5. arXiv:2404.13386  [pdf]

    eess.IV cs.CV cs.LG

    SSVT: Self-Supervised Vision Transformer For Eye Disease Diagnosis Based On Fundus Images

    Authors: Jiaqi Wang, Mengtian Kang, Yong Liu, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Shuo Gao, Luigi G. Occhipinti

    Abstract: Machine learning-based fundus image diagnosis technologies trigger worldwide interest owing to their benefits such as reducing medical resource power and providing objective evaluation results. However, current methods are commonly based on supervised methods, bringing in a heavy workload to biomedical staff and hence suffering in expanding effective databases. To address this issue, in this artic…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ISBI 2024

  6. arXiv:2403.18413  [pdf, ps, other]

    cs.RO eess.SY

    HyRRT-Connect: A Bidirectional Rapidly-Exploring Random Trees Motion Planning Algorithm for Hybrid Systems

    Authors: Nan Wang, Ricardo G. Sanfelice

    Abstract: This paper proposes a bidirectional rapidly-exploring random trees (RRT) algorithm to solve the motion planning problem for hybrid systems. The proposed algorithm, called HyRRT-Connect, propagates in both forward and backward directions in hybrid time until an overlap between the forward and backward propagation results is detected. Then, HyRRT-Connect constructs a motion plan through the reversal…

    Submitted 29 March, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted by the 8th IFAC International Conference on Analysis and Design of Hybrid Systems (ADHS 2024)

  7. arXiv:2402.04171  [pdf, other]

    eess.IV cs.CV

    3D Volumetric Super-Resolution in Radiology Using 3D RRDB-GAN

    Authors: Juhyung Ha, Nian Wang, Surendra Maharjan, Xuhong Zhang

    Abstract: This study introduces the 3D Residual-in-Residual Dense Block GAN (3D RRDB-GAN) for 3D super-resolution for radiology imagery. A key aspect of 3D RRDB-GAN is the integration of a 2.5D perceptual loss function, which contributes to improved volumetric image quality and realism. The effectiveness of our model was evaluated through 4x super-resolution experiments across diverse datasets, including Mi…

    Submitted 6 February, 2024; originally announced February 2024.

  8. arXiv:2402.02699  [pdf, other]

    cs.SD cs.LG eess.AS

    Adversarial Data Augmentation for Robust Speaker Verification

    Authors: Zhenyu Zhou, Junhui Chen, Namin Wang, Lantian Li, Dong Wang

    Abstract: Data augmentation (DA) has gained widespread popularity in deep speaker models due to its ease of implementation and significant effectiveness. It enriches training data by simulating real-life acoustic variations, enabling deep neural networks to learn speaker-related representations while disregarding irrelevant acoustic variations, thereby improving robustness and generalization. However, a pot…

    Submitted 4 February, 2024; originally announced February 2024.

  9. arXiv:2401.03623  [pdf]

    eess.IV

    A Video Coding Method Based on Neural Network for CLIC2024

    Authors: Zhengang Li, Jingchi Zhang, Yonghua Wang, Xing Zeng, Zhen Zhang, Yunlin Long, Menghu Jia, Ning Wang

    Abstract: This paper presents a video coding scheme that combines traditional optimization methods with deep learning methods based on the Enhanced Compression Model (ECM). In this paper, the traditional optimization methods adaptively adjust the quantization parameter (QP). The key frame QP offset is set according to the video content characteristics, and the coding tree unit (CTU) level QP of all frames i…

    Submitted 7 January, 2024; originally announced January 2024.

  10. arXiv:2401.00153  [pdf, other]

    eess.IV

    USFM: A Universal Ultrasound Foundation Model Generalized to Tasks and Organs towards Label Efficient Image Analysis

    Authors: Jing Jiao, Jin Zhou, Xiaokang Li, Menghua Xia, Yi Huang, Lihong Huang, Na Wang, Xiaofan Zhang, Shichong Zhou, Yuanyuan Wang, Yi Guo

    Abstract: Inadequate generality across different organs and tasks constrains the application of ultrasound (US) image analysis methods in smart healthcare. Building a universal US foundation model holds the potential to address these issues. Nevertheless, the development of such foundational models encounters intrinsic challenges in US analysis, i.e., insufficient databases, low quality, and ineffective fea…

    Submitted 2 January, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

    Comments: Submitted to MedIA, 17 pages, 11 figures

  11. arXiv:2312.13523  [pdf]

    physics.med-ph eess.IV

    High-resolution myelin-water fraction and quantitative relaxation mapping using 3D ViSTa-MR fingerprinting

    Authors: Congyu Liao, Xiaozhi Cao, Siddharth Srinivasan Iyer, Sophie Schauman, Zihan Zhou, Xiaoqian Yan, Quan Chen, Zhitao Li, Nan Wang, Ting Gong, Zhe Wu, Hongjian He, Jianhui Zhong, Yang Yang, Adam Kerr, Kalanit Grill-Spector, Kawin Setsompop

    Abstract: Purpose: This study aims to develop a high-resolution whole-brain multi-parametric quantitative MRI approach for simultaneous mapping of myelin-water fraction (MWF), T1, T2, and proton-density (PD), all within a clinically feasible scan time. Methods: We developed 3D ViSTa-MRF, which combined Visualization of Short Transverse relaxation time component (ViSTa) technique with MR Fingerprinting (MR…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 38 pages, 12 figures and 1 table

    Journal ref: Magnetic Resonance in Medicine 2023

  12. Throughput Maximization for Intelligent Refracting Surface Assisted mmWave High-Speed Train Communications

    Authors: Jing Li, Yong Niu, Hao Wu, Bo Ai, Ruisi He, Ning Wang, Sheng Chen

    Abstract: With the increasing demands from passengers for data-intensive services, millimeter-wave (mmWave) communication is considered as an effective technique to release the transmission pressure on high speed train (HST) networks. However, mmWave signals encounter severe losses when passing through the carriage, which decreases the quality of services on board. In this paper, we investigate an intelligen…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 13 pages, 7 figures, IEEE Internet of Things Journal

  13. Sum Rate Maximization under AoI Constraints for RIS-Assisted mmWave Communications

    Authors: Ziqi Guo, Yong Niu, Shiwen Mao, Changming Zhang, Ning Wang, Zhangdui Zhong, Bo Ai

    Abstract: The concept of age of information (AoI) has been proposed to quantify information freshness, which is crucial for time-sensitive applications. However, in millimeter wave (mmWave) communication systems, the link blockage caused by obstacles and the severe path loss greatly impair the freshness of information received by the user equipments (UEs). In this paper, we focus on reconfigurable intellige…

    Submitted 13 November, 2023; originally announced November 2023.

  14. arXiv:2310.04992  [pdf, other]

    eess.IV cs.CV

    VisionFM: a Multi-Modal Multi-Task Vision Foundation Model for Generalist Ophthalmic Artificial Intelligence

    Authors: Jianing Qiu, Jian Wu, Hao Wei, Peilun Shi, Minqing Zhang, Yunyun Sun, Lin Li, Hanruo Liu, Hongyi Liu, Simeng Hou, Yuyang Zhao, Xuehui Shi, Junfang Xian, Xiaoxia Qu, Sirui Zhu, Lijie Pan, Xiaoniao Chen, Xiaojia Zhang, Shuai Jiang, Kebing Wang, Chenlong Yang, Mingqiang Chen, Sujie Fan, Jianhua Hu, Aiguo Lv, et al. (17 additional authors not shown)

    Abstract: We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassifi…

    Submitted 7 October, 2023; originally announced October 2023.

  15. arXiv:2309.14158  [pdf, other]

    cs.SD eess.AS

    An Investigation of Distribution Alignment in Multi-Genre Speaker Recognition

    Authors: Zhenyu Zhou, Junhui Chen, Namin Wang, Lantian Li, Dong Wang

    Abstract: Multi-genre speaker recognition is becoming increasingly popular due to its ability to better represent the complexities of real-world applications. However, a major challenge is the significant shift in the distribution of speaker vectors across different genres. While distribution alignment is a common approach to address this challenge, previous studies have mainly focused on aligning a source…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: submitted to ICASSP 2024

  16. arXiv:2308.09929  [pdf, other]

    eess.SP cs.IT cs.NI

    RIS-assisted High-Speed Railway Integrated Sensing and Communication System

    Authors: Panpan Li, Yong Niu, Hao Wu, Zhu Han, Guiqi Sun, Ning Wang, Zhangdui Zhong, Bo Ai

    Abstract: One technology that has the potential to improve wireless communications in years to come is integrated sensing and communication (ISAC). In this study, we take advantage of reconfigurable intelligent surface's (RIS) potential advantages to achieve ISAC while using the same frequency and resources. Specifically, by using the reflecting elements, the RIS dynamically modifies the radio waves' streng…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: 12 pages

  17. arXiv:2307.12845  [pdf, other]

    eess.IV cs.CV

    Multi-View Vertebra Localization and Identification from CT Images

    Authors: Han Wu, Jiadong Zhang, Yu Fang, Zhentao Liu, Nizhuan Wang, Zhiming Cui, Dinggang Shen

    Abstract: Accurately localizing and identifying vertebrae from CT images is crucial for various clinical applications. However, most existing efforts are performed on 3D with cropping patch operation, suffering from the large computation costs and limited global information. In this paper, we propose a multi-view vertebra localization and identification from CT images, converting the 3D problem into a 2D lo…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: MICCAI 2023

  18. arXiv:2307.12634  [pdf, other]

    eess.IV cs.CV

    Automatic lobe segmentation using attentive cross entropy and end-to-end fissure generation

    Authors: Qi Su, Na Wang, Jiawen Xie, Yinan Chen, Xiaofan Zhang

    Abstract: The automatic lung lobe segmentation algorithm is of great significance for the diagnosis and treatment of lung diseases, however, which has great challenges due to the incompleteness of pulmonary fissures in lung CT images and the large variability of pathological features. Therefore, we propose a new automatic lung lobe segmentation framework, in which we urge the model to pay attention to the a…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 5 pages, 3 figures, published to 'IEEE International Symposium on Biomedical Imaging (ISBI) 2023'

  19. arXiv:2306.10548  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    MARBLE: Music Audio Representation Benchmark for Universal Evaluation

    Authors: Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu

    Abstract: In the era of extensive intersection between art and Artificial Intelligence (AI), such as image generation and fiction co-creation, AI for music remains relatively nascent, particularly in music understanding. This is evident in the limited work on deep music representations, the scarcity of large-scale datasets, and the absence of a universal and community-driven benchmark. To address this issue…

    Submitted 23 November, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: camera-ready version for NeurIPS 2023

  20. arXiv:2306.02894  [pdf, ps, other]

    eess.IV

    Recyclable Semi-supervised Method Based on Multi-model Ensemble for Video Scene Parsing

    Authors: Biao Wu, Shaoli Liu, Diankai Zhang, Chengjian Zheng, Si Gao, Xiaofeng Zhang, Ning Wang

    Abstract: Pixel-level Scene Understanding is one of the fundamental problems in computer vision, which aims at recognizing object classes, masks and semantics of each pixel in the given image. Since the real-world is actually video-based rather than a static state, learning to perform video semantic segmentation is more reasonable and practical for realistic applications. In this paper, we adopt Mask2Former…

    Submitted 5 June, 2023; originally announced June 2023.

  21. arXiv:2305.18649  [pdf, ps, other]

    cs.RO eess.SY

    HySST: A Stable Sparse Rapidly-Exploring Random Trees Optimal Motion Planning Algorithm for Hybrid Dynamical Systems

    Authors: Nan Wang, Ricardo G. Sanfelice

    Abstract: This paper proposes a stable sparse rapidly-exploring random trees (SST) algorithm to solve the optimal motion planning problem for hybrid systems. At each iteration, the proposed algorithm, called HySST, selects a vertex with the lowest cost among all the vertices within the neighborhood of a randomly selected sample and then extends the search tree by flow or jump, which is also chosen randomly…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: This paper has been submitted to the 2023 Conference of Decision and Control (CDC). arXiv admin note: substantial text overlap with arXiv:2210.15082

    ACM Class: I.2.9

  22. arXiv:2305.16043  [pdf, other]

    cs.SD cs.LG eess.AS

    Ordered and Binary Speaker Embedding

    Authors: Jiaying Wang, Xianglong Wang, Namin Wang, Lantian Li, Dong Wang

    Abstract: Modern speaker recognition systems represent utterances by embedding vectors. Conventional embedding vectors are dense and non-structural. In this paper, we propose an ordered binary embedding approach that sorts the dimensions of the embedding vector via a nested dropout and converts the sorted vectors to binary codes via Bernoulli sampling. The resultant ordered binary codes offer some important…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: to be published in INTERSPEECH 2023

  23. arXiv:2304.03708  [pdf, other]

    eess.IV cs.CV

    Efficient automatic segmentation for multi-level pulmonary arteries: The PARSE challenge

    Authors: Gongning Luo, Kuanquan Wang, Jun Liu, Shuo Li, Xinjie Liang, Xiangyu Li, Shaowei Gan, Wei Wang, Suyu Dong, Wenyi Wang, Pengxin Yu, Enyou Liu, Hongrong Wei, Na Wang, Jia Guo, Huiqi Li, Zhao Zhang, Ziwei Zhao, Na Gao, Nan An, Ashkan Pakzad, Bojidar Rangelov, Jiaqi Dou, Song Tian, Zeyu Liu, et al. (5 additional authors not shown)

    Abstract: Efficient automatic segmentation of multi-level (i.e. main and branch) pulmonary arteries (PA) in CTPA images plays a significant role in clinical applications. However, most existing methods concentrate only on main PA or branch PA segmentation separately and ignore segmentation efficiency. Besides, there is no public large-scale dataset focused on PA segmentation, which makes it highly challengi…

    Submitted 7 April, 2023; originally announced April 2023.

  24. arXiv:2303.13631  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    In-depth analysis of music structure as a text network

    Authors: Ping-Rui Tsai, Yen-Ting Chou, Nathan-Christopher Wang, Hui-Ling Chen, Hong-Yue Huang, Zih-Jia Luo, Tzay-Ming Hong

    Abstract: Music, enchanting and poetic, permeates every corner of human civilization. Although music is not unfamiliar to people, our understanding of its essence remains limited, and there is still no universally accepted scientific description. This is primarily due to music being regarded as a product of both reason and emotion, making it difficult to define. In this article, we focus on the fundamental…

    Submitted 2 January, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: 7 pages, 8 figures

  25. arXiv:2303.09232  [pdf]

    eess.IV cs.CV cs.LG

    Generative Adversarial Network for Personalized Art Therapy in Melanoma Disease Management

    Authors: Lennart Jütte, Ning Wang, Bernhard Roth

    Abstract: Melanoma is the most lethal type of skin cancer. Patients are vulnerable to mental health illnesses which can reduce the effectiveness of the cancer treatment and the patients' adherence to drug plans. It is crucial to preserve the mental health of patients while they are receiving treatment. However, current art therapy approaches are not personal and unique to the patient. We aim to provide a wel…

    Submitted 20 March, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

  26. arXiv:2211.16694  [pdf, ps, other]

    eess.AS cs.SD

    MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages

    Authors: Yue Li, Li Zhang, Namin Wang, Jie Liu, Lei Xie

    Abstract: This report describes the NPU-HC speaker verification system submitted to the O-COCOSDA Multi-lingual Speaker Verification (MSV) Challenge 2022, which focuses on developing speaker verification systems for low-resource Asian languages. We participate in the I-MSV track, which aims to develop speaker verification systems for various Indian languages. In this challenge, we first explore different ne…

    Submitted 1 December, 2022; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: 6 pages, submitted to the 9th International Workshop on Vietnamese Language and Speech Processing

  27. arXiv:2211.05256  [pdf, other]

    eess.IV cs.CV

    Power Efficient Video Super-Resolution on Mobile NPUs with Deep Learning, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Cheng-Ming Chiang, Hsien-Kai Kuo, Yu-Syuan Xu, Man-Yu Lee, Allen Lu, Chia-Ming Cheng, Chih-Cheng Chen, Jia-Ying Yong, Hong-Han Shuai, Wen-Huang Cheng, Zhuang Jia, Tianyu Xu, Yijian Zhang, Long Bao, Heng Sun, Diankai Zhang, Si Gao, Shaoli Liu, Biao Wu, Xiaofeng Zhang, Chengjian Zheng, Kaidi Lu, Ning Wang, et al. (29 additional authors not shown)

    Abstract: Video super-resolution is one of the most popular tasks on mobile devices, being widely used for an automatic improvement of low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices. In this Mobile AI challenge, we address this prob…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.08826, arXiv:2105.07809, arXiv:2211.04470, arXiv:2211.03885

  28. arXiv:2211.03038  [pdf, other]

    eess.AS cs.CR cs.SD

    Distinguishable Speaker Anonymization based on Formant and Fundamental Frequency Scaling

    Authors: Jixun Yao, Qing Wang, Yi Lei, Pengcheng Guo, Lei Xie, Namin Wang, Jie Liu

    Abstract: Speech data on the Internet are proliferating exponentially because of the emergence of social media, and the sharing of such personal data raises obvious security and privacy concerns. One solution to mitigate these concerns involves concealing speaker identities before sharing speech data, also referred to as speaker anonymization. In our previous work, we have developed an automatic speaker ver…

    Submitted 6 November, 2022; originally announced November 2022.

    Comments: Submitted to ICASSP 2023

  29. arXiv:2208.03953  [pdf, ps, other]

    eess.SP cs.IT

    Intelligent MIMO Detection Using Meta Learning

    Authors: Haomiao Huo, Jindan Xu, Gege Su, Wei Xu, Ning Wang

    Abstract: In a K-best detector for multiple-input-multiple-output (MIMO) systems, the value of K needs to be sufficiently large to achieve near-maximum-likelihood (ML) performance. By treating K as a variable that can be adjusted according to a fitting function of some learnable coefficients, an intelligent MIMO detection network based on deep neural networks (DNN) is proposed to reduce complexity of the det…

    Submitted 8 August, 2022; originally announced August 2022.

  30. arXiv:2207.03430  [pdf, ps, other]

    eess.IV cs.CV

    A Novel Unified Conditional Score-based Generative Framework for Multi-modal Medical Image Completion

    Authors: Xiangxi Meng, Yuning Gu, Yongsheng Pan, Nizhuan Wang, Peng Xue, Mengkang Lu, Xuming He, Yiqiang Zhan, Dinggang Shen

    Abstract: Multi-modal medical image completion has been extensively applied to alleviate the missing modality issue in a wealth of multi-modal diagnostic tasks. However, for most existing synthesis methods, their inferences of missing modalities can collapse into a deterministic mapping from the available ones, ignoring the uncertainties inherent in the cross-modal relationships. Here, we propose the Unifie…

    Submitted 7 July, 2022; originally announced July 2022.

  31. arXiv:2205.13133  [pdf, other]

    cs.IT eess.SP

    Coverage Probability Analysis of RIS-Assisted High-Speed Train Communications

    Authors: Changzhu Liu, Ruisi He, Yong Niu, Bo Ai, Zhu Han, Zhangfeng Ma, Meilin Gao, Zhangdui Zhong, Ning Wang

    Abstract: Reconfigurable intelligent surface (RIS) has received increasing attention due to its capability of extending cell coverage by reflecting signals toward receivers. This paper considers a RIS-assisted high-speed train (HST) communication system to improve the coverage probability. We derive the closed-form expression of coverage probability. Moreover, we analyze impacts of some key system parameter…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: 6 pages, 6 figures, submitted to GlobeCom 2022

  32. arXiv:2205.02524  [pdf, other]

    cs.SD cs.AI eess.AS

    M2R2: Missing-Modality Robust emotion Recognition framework with iterative data augmentation

    Authors: Ning Wang

    Abstract: This paper deals with the utterance-level modalities missing problem with uncertain patterns on emotion recognition in conversation (ERC) task. Present models generally predict the speaker's emotions by its current utterance and context, which is degraded by modality missing considerably. Our work proposes a framework Missing-Modality Robust emotion Recognition (M2R2), which trains emotion recogni…

    Submitted 5 May, 2022; originally announced May 2022.

  33. Cooperative Reflection and Synchronization Design for Distributed Multiple-RIS Communications

    Authors: Yaqiong Zhao, Wei Xu, Xiaohu You, Ning Wang, Huan Sun

    Abstract: To reap the promised gain achieved by distributed reconfigurable intelligent surfaces (RISs)-enhanced communications in a wireless network, timing synchronization among these metasurfaces is an essential prerequisite in practice. This paper proposes a unified framework for the joint estimation of the unknown timing offsets and the RIS channel parameters, as well as the design of cooperative reflec…

    Submitted 5 May, 2022; originally announced May 2022.

  34. Federated Learning Enables Big Data for Rare Cancer Boundary Detection

    Authors: Sarthak Pati, Ujjwal Baid, Brandon Edwards, Micah Sheller, Shih-Han Wang, G Anthony Reina, Patrick Foley, Alexey Gruzdev, Deepthi Karkada, Christos Davatzikos, Chiharu Sako, Satyam Ghodasara, Michel Bilello, Suyash Mohan, Philipp Vollmuth, Gianluca Brugnara, Chandrakanth J Preetha, Felix Sahm, Klaus Maier-Hein, Maximilian Zenk, Martin Bendszus, Wolfgang Wick, Evan Calabrese, Jeffrey Rudie, Javier Villanueva-Meyer, et al. (254 additional authors not shown)

    Abstract: Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train acc…

    Submitted 25 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: federated learning, deep learning, convolutional neural network, segmentation, brain tumor, glioma, glioblastoma, FeTS, BraTS

  35. arXiv:2204.08987  [pdf]

    cs.LG eess.SP

    Deep learning based closed-loop optimization of geothermal reservoir production

    Authors: Nanzhe Wang, Haibin Chang, Xiangzhao Kong, Martin O. Saar, Dongxiao Zhang

    Abstract: To maximize the economic benefits of geothermal energy production, it is essential to optimize geothermal reservoir management strategies, in which geologic uncertainty should be considered. In this work, we propose a closed-loop optimization framework, based on deep learning surrogates, for the well control optimization of geothermal reservoirs. In this framework, we construct a hybrid convolutio…

    Submitted 15 April, 2022; originally announced April 2022.

    Comments: 37 pages, 24 figures

  36. arXiv:2204.08171  [pdf, other]

    cs.IT eess.SP

    Distributed Neural Precoding for Hybrid mmWave MIMO Communications with Limited Feedback

    Authors: Kai Wei, Jindan Xu, Wei Xu, Ning Wang, Dong Chen

    Abstract: Hybrid precoding is a cost-efficient technique for millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) communications. This paper proposes a deep learning approach by using a distributed neural network for hybrid analog-and-digital precoding design with limited feedback. The proposed distributed neural precoding network, called DNet, is committed to achieving two objectives. Fir…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 13 pages, 4 figures

  37. arXiv:2204.03889  [pdf, other]

    cs.SD cs.CL eess.AS

    Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition

    Authors: Nick J. C. Wang, Zongfeng Quan, Shaojun Wang, Jing Xiao

    Abstract: The Conformer model is an excellent architecture for speech recognition modeling that effectively utilizes the hybrid losses of connectionist temporal classification (CTC) and attention to train model parameters. To improve the decoding efficiency of Conformer, we propose a novel connectionist temporal summarization (CTS) method that reduces the number of frames required for the attention decoder…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: Submitted to INTERSPEECH 2022 (5 pages, 2 figures)

  38. arXiv:2204.03879  [pdf, other]

    cs.CL cs.SD eess.AS

    A Study of Different Ways to Use The Conformer Model For Spoken Language Understanding

    Authors: Nick J. C. Wang, Shaojun Wang, Jing Xiao

    Abstract: SLU combines ASR and NLU capabilities to accomplish speech-to-intent understanding. In this paper, we compare different ways to combine ASR and NLU, in particular using a single Conformer model with different ways to use its components, to better understand the strengths and weaknesses of each approach. We find that it is not necessarily a choice between two-stage decoding and end-to-end systems w…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: Submitted to INTERSPEECH 2022. (5 pages, 1 figure.)

  39. arXiv:2204.03315  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-trained DNN-HMM-Based Acoustic-Phonetic Model

    Authors: Nick J. C. Wang, Lu Wang, Yandan Sun, Haimei Kang, Dejun Zhang

    Abstract: In spoken language understanding (SLU), what the user says is converted to his/her intent. Recent work on end-to-end SLU has shown that accuracy can be improved via pre-training approaches. We revisit ideas presented by Lugosch et al. using speech pre-training and three-module modeling; however, to ease construction of the end-to-end SLU model, we use as our phoneme module an open-source acoustic-…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: Published in INTERSPEECH 2021

  40. arXiv:2203.16005  [pdf, other]

    cs.IT eess.SP

    Deep Joint Source-Channel Coding for CSI Feedback: An End-to-End Approach

    Authors: Jialong Xu, Bo Ai, Ning Wang, Wei Chen

    Abstract: The increased throughput brought by MIMO technology relies on the knowledge of channel state information (CSI) acquired in the base station (BS). To make the CSI feedback overhead affordable for the evolution of MIMO technology (e.g., massive MIMO and ultra-massive MIMO), deep learning (DL) is introduced to deal with the CSI compression task. Based on the separation principle in existing communica…

    Submitted 6 April, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: 12 pages, 11 figures

  41. GATE: Graph CCA for Temporal SElf-supervised Learning for Label-efficient fMRI Analysis

    Authors: Liang Peng, Nan Wang, Jie Xu, Xiaofeng Zhu, Xiaoxiao Li

    Abstract: In this work, we focus on the challenging task, neuro-disease classification, using functional magnetic resonance imaging (fMRI). In population graph-based disease analysis, graph convolutional neural networks (GCNs) have achieved remarkable success. However, these achievements are inseparable from abundant labeled data and sensitive to spurious signals. To improve fMRI representation learning and…

    Submitted 27 August, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Journal ref: IEEE Transactions on Medical Imaging 2022

  42. arXiv:2203.00400  [pdf, other]

    cs.IT eess.SP

    RIS-Assisted Quasi-Static Broad Coverage for Wideband mmWave Massive MIMO Systems

    Authors: Muxin He, Jindan Xu, Wei Xu, Hong Shen, Ning Wang, Chunming Zhao

    Abstract: Reconfigurable intelligent surfaces (RISs) can establish favorable wireless environments to combat the severe attenuation and blockages in millimeter-wave (mmWave) bands. However, to achieve the optimal enhancement of performance, the instantaneous channel state information (CSI) needs to be estimated at the cost of a large overhead that scales with the number of RIS elements and the number of use…

    Submitted 1 March, 2022; originally announced March 2022.

  43. arXiv:2111.10773  [pdf, other]

    eess.IV cs.CV

    One-shot Weakly-Supervised Segmentation in Medical Images

    Authors: Wenhui Lei, Qi Su, Ran Gu, Na Wang, Xinglong Liu, Guotai Wang, Xiaofan Zhang, Shaoting Zhang

    Abstract: Deep neural networks usually require accurate and a large number of annotations to achieve outstanding performance in medical image segmentation. One-shot segmentation and weakly-supervised learning are promising research directions that lower labeling effort by learning a new class from only one annotated image and utilizing coarse labels instead, respectively. Previous works usually fail to leve…

    Submitted 21 November, 2021; originally announced November 2021.

  44. arXiv:2110.08622  [pdf, other]

    eess.IV

    Self-Learned Kernel Low Rank Approach TO Accelerated High Resolution 3D Diffusion MRI

    Authors: Abhijit Baul, Nian Wang, Choyi Zhang, Leslie Ying, Yuchou Chang, Ukash Nakarmi

    Abstract: Diffusion Magnetic Resonance Imaging (dMRI) is a promising method to analyze the subtle changes in the tissue structure. However, the lengthy acquisition time is a major limitation in the clinical application of dMRI. Different image acquisition techniques, such as parallel imaging and compressed sensing, have shortened the prolonged acquisition time, but creating high-resolution 3D dMRI slices still re…

    Submitted 23 November, 2021; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  45. arXiv:2108.12074  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    4-bit Quantization of LSTM-based Speech Recognition Models

    Authors: Andrea Fasoli, Chia-Yu Chen, Mauricio Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan

    Abstract: We investigate the impact of aggressive low-precision representations of weights and activations in two families of large LSTM-based architectures for Automatic Speech Recognition (ASR): hybrid Deep Bidirectional LSTM - Hidden Markov Models (DBLSTM-HMMs) and Recurrent Neural Network - Transducers (RNN-Ts). Using a 4-bit integer representation, a naïve quantization approach applied to the LSTM port…

    Submitted 26 August, 2021; originally announced August 2021.

    Comments: 5 pages, 3 figures, Andrea Fasoli and Chia-Yu Chen equally contributed to this work. Paper accepted to Interspeech 2021

    ACM Class: I.2.6

  46. arXiv:2107.04986  [pdf, other]

    cs.IT eess.SP

    Theoretical Performance Limit for Radar Parameter Estimation

    Authors: Dazhuan Xu, Han Zhang, Nan Wang

    Abstract: In this paper, we employ the thoughts and methodologies of Shannon's information theory to solve the problem of the optimal radar parameter estimation. Based on a general radar system model, the a posteriori probability density function of targets' parameters is derived. Range information (RI) and entropy error (EE) are defined to evaluate the performance. It is proved that acquiring 1 bi…

    Submitted 6 February, 2023; v1 submitted 11 July, 2021; originally announced July 2021.

  47. arXiv:2105.10283  [pdf, other]

    cs.IT eess.SP

    A Lightweight Deep Network for Efficient CSI Feedback in Massive MIMO Systems

    Authors: Yuyao Sun, Wei Xu, Le Liang, Ning Wang, Geoffrey Ye Li, Xiaohu You

    Abstract: To fully exploit the advantages of massive multiple-input multiple-output (m-MIMO), accurate channel state information (CSI) is required at the transmitter. However, excessive CSI feedback for large antenna arrays is inefficient and thus undesirable in practical applications. By exploiting the inherent correlation characteristics of complex-valued channel responses in the angular-delay domain, we…

    Submitted 21 May, 2021; originally announced May 2021.

  48. arXiv:2104.05376  [pdf, other]

    cs.CV eess.IV

    Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer

    Authors: Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao

    Abstract: Artistic style transfer aims at migrating the style from an example image to a content image. Currently, optimization-based methods have achieved great stylization quality, but expensive time cost restricts their practical applications. Meanwhile, feed-forward methods still fail to synthesize complex style, especially when holistic global and local patterns exist. Inspired by the common painting p…

    Submitted 17 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: Accepted by CVPR 2021. Codes will be released soon on https://github.com/PaddlePaddle/PaddleGAN/

  49. Data-Driven Optimized Tracking Control Heuristic for MIMO Structures: A Balance System Case Study

    Authors: Ning Wang, Mohammed Abouheaf, Wail Gueaieb

    Abstract: A data-driven computational heuristic is proposed to control MIMO systems without prior knowledge of their dynamics. The heuristic is illustrated on a two-input two-output balance system. It integrates a self-adjusting nonlinear threshold accepting heuristic with a neural network to compromise between the desired transient and steady state characteristics of the system while optimizing a dynamic c…

    Submitted 31 March, 2021; originally announced April 2021.

    Journal ref: IEEE International Conference on Systems, Man, and Cybernetics (SMC), Toronto, ON, Canada, 2020, pp. 2365-2370

  50. The Level Set Kalman Filter for State Estimation of Continuous-discrete Systems

    Authors: Ningyuan Wang, Daniel B. Forger

    Abstract: We propose a new extension of Kalman filtering for continuous-discrete systems with nonlinear state-space models that we name as the level set Kalman filter (LSKF). The LSKF assumes the probability distribution can be approximated as a Gaussian, and updates the Gaussian distribution through a time-update step and a measurement-update step. The LSKF improves the time-update step when compared to ex…

    Submitted 13 December, 2021; v1 submitted 20 March, 2021; originally announced March 2021.