Showing 1–50 of 76 results for author: Wu, F

Searching in archive eess.
  1. arXiv:2404.18650  [pdf, other]

    eess.SP

    Enhancing RSS-Based Visible Light Positioning by Optimal Calibrating the LED Tilt and Gain

    Authors: Fan Wu, Nobby Stevens, Lieven De Strycker, François Rottenberg

    Abstract: This paper presents an optimal calibration scheme and a weighted least squares (LS) localization algorithm for received signal strength (RSS) based visible light positioning (VLP) systems, focusing on the often overlooked impact of light emitting diode (LED) tilt. By optimally calibrating LED tilt and gain, we significantly enhance VLP localization accuracy. Our algorithm outperforms both machine…

    Submitted 29 April, 2024; originally announced April 2024.

  2. arXiv:2404.03408  [pdf, other]

    eess.SP

    Comparative Efficacy of Commercial Wearables for Circadian Rhythm Home Monitoring from Activity, Heart Rate, and Core Body Temperature

    Authors: Fan Wu, Patrick Langer, Jinjoo Shim, Elgar Fleisch, Filipe Barata

    Abstract: Circadian rhythms govern biological patterns that follow a 24-hour cycle. Dysfunctions in circadian rhythms can contribute to various health problems, such as sleep disorders. Current circadian rhythm assessment methods, often invasive or subjective, limit circadian rhythm monitoring to laboratories. Hence, this study aims to investigate scalable consumer-centric wearables for circadian rhythm mon…

    Submitted 4 April, 2024; originally announced April 2024.

  3. arXiv:2403.11694  [pdf, other]

    eess.IV cs.CV

    Object Segmentation-Assisted Inter Prediction for Versatile Video Coding

    Authors: Zhuoyuan Li, Zikun Yuan, Li Li, Dong Liu, Xiaohu Tang, Feng Wu

    Abstract: In modern video coding standards, block-based inter prediction is widely adopted, which brings high compression efficiency. However, in natural videos, there are usually multiple moving objects of arbitrary shapes, resulting in complex motion fields that are difficult to compactly represent. This problem has been tackled by more flexible block partitioning methods in the Versatile Video Coding (VV…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 22 pages, 15 figures

  4. arXiv:2402.17043  [pdf, other]

    eess.SY

    Traffic Control via Connected and Automated Vehicles: An Open-Road Field Experiment with 100 CAVs

    Authors: Jonathan W. Lee, Han Wang, Kathy Jang, Amaury Hayat, Matthew Bunting, Arwa Alanqary, William Barbour, Zhe Fu, Xiaoqian Gong, George Gunter, Sharon Hornstein, Abdul Rahman Kreidieh, Nathan Lichtlé, Matthew W. Nice, William A. Richardson, Adit Shah, Eugene Vinitsky, Fangyu Wu, Shengquan Xiang, Sulaiman Almatrudi, Fahd Althukair, Rahul Bhadani, Joy Carpio, Raphael Chekroun, Eric Cheng, et al. (39 additional authors not shown)

    Abstract: The CIRCLES project aims to reduce instabilities in traffic flow, which are naturally occurring phenomena due to human driving behavior. These "phantom jams" or "stop-and-go waves" are a significant source of wasted energy. Toward this goal, the CIRCLES project designed a control system, referred to as the MegaController by the CIRCLES team, that could be deployed in real traffic. Our field experim…

    Submitted 26 February, 2024; originally announced February 2024.

  5. arXiv:2401.08835  [pdf, other]

    cs.CL eess.AS

    Improving ASR Contextual Biasing with Guided Attention

    Authors: Jiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar, Shinji Watanabe

    Abstract: In this paper, we propose a Guided Attention (GA) auxiliary training loss, which improves the effectiveness and robustness of automatic speech recognition (ASR) contextual biasing without introducing additional parameters. A common challenge in previous literature is that the word error rate (WER) reduction brought by contextual biasing diminishes as the number of bias phrases increases. To addres…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted at ICASSP 2024

  6. arXiv:2312.01499  [pdf, other]

    eess.SY cs.DC eess.SP

    Towards Decentralized Task Offloading and Resource Allocation in User-Centric Mobile Edge Computing

    Authors: Langtian Qin, Hancheng Lu, Yuang Chen, Baolin Chong, Feng Wu

    Abstract: In traditional cellular-based mobile edge computing (MEC), users at the edge of the cell are prone to severe inter-cell interference and signal attenuation, leading to low throughput or even transmission interruptions. Such an edge effect severely obstructs offloading of tasks to MEC servers. To address this issue, we propose user-centric mobile edge computing (UCMEC), a novel MEC architectur…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: 16 pages, 13 figures

  7. arXiv:2311.18073  [pdf, other]

    eess.IV

    DiffGEPCI: 3D MRI Synthesis from mGRE Signals using 2.5D Diffusion Model

    Authors: Yuyang Hu, Satya V. V. N. Kothapalli, Weijie Gan, Alexander L. Sukstanskii, Gregory F. Wu, Manu Goyal, Dmitriy A. Yablonskiy, Ulugbek S. Kamilov

    Abstract: We introduce a new framework called DiffGEPCI for cross-modality generation in magnetic resonance imaging (MRI) using a 2.5D conditional diffusion model. DiffGEPCI can synthesize high-quality Fluid Attenuated Inversion Recovery (FLAIR) and Magnetization Prepared-Rapid Gradient Echo (MPRAGE) images, without acquiring corresponding measurements, by leveraging multi-Gradient-Recalled Echo (mGRE) MRI…

    Submitted 18 April, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

  8. arXiv:2309.02318  [pdf, other]

    cs.CV eess.IV

    TiAVox: Time-aware Attenuation Voxels for Sparse-view 4D DSA Reconstruction

    Authors: Zhenghong Zhou, Huangxuan Zhao, Jiemin Fang, Dongqiao Xiang, Lei Chen, Lingxia Wu, Feihong Wu, Wenyu Liu, Chuansheng Zheng, Xinggang Wang

    Abstract: Four-dimensional Digital Subtraction Angiography (4D DSA) plays a critical role in the diagnosis of many medical diseases, such as Arteriovenous Malformations (AVM) and Arteriovenous Fistulas (AVF). Despite its significant application value, the reconstruction of 4D DSA demands numerous views to effectively model the intricate vessels and radiocontrast flow, thereby implying a significant radiatio…

    Submitted 19 December, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: 10 pages, 8 figures

  9. arXiv:2308.12203  [pdf, ps, other]

    eess.SP

    A Robust ADMM-Based Optimization Algorithm For Underwater Acoustic Channel Estimation

    Authors: Tian Tian, Agastya Raj, Bruno Missi Xavier, Ying Zhang, Feiyun Wu, Kunde Yang

    Abstract: Accurate estimation of the underwater acoustic (UWA) channel is a key part of underwater communications, especially for coherent systems. The severe multipath effects and large delay spreads make the estimation problem large-scale. The non-stationary, non-Gaussian, and impulsive nature of ocean ambient noise poses further obstacles to the design of estimation algorithms. Under the framework of compressed…

    Submitted 24 August, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: This paper has been accepted and presented at the OCEANS 2023 - Limerick conference. This version is a preprint.

  10. arXiv:2307.09775  [pdf, other]

    cs.IR cs.SD eess.AS

    DisCover: Disentangled Music Representation Learning for Cover Song Identification

    Authors: Jiahao Xun, Shengyu Zhang, Yanting Yang, Jieming Zhu, Liqun Deng, Zhou Zhao, Zhenhua Dong, Ruiqi Li, Lichao Zhang, Fei Wu

    Abstract: In the field of music information retrieval (MIR), cover song identification (CSI) is a challenging task that aims to identify cover versions of a query song from a massive collection. Existing works still suffer from high intra-song variances and inter-song correlations, due to the entangled nature of version-specific and version-invariant factors in their modeling. In this work, we set the goal…

    Submitted 19 July, 2023; originally announced July 2023.

  11. arXiv:2307.01710  [pdf]

    eess.SP astro-ph.CO

    A Scalable Arrangement Method for Aperiodic Array Antennas to Reduce Peak Sidelobe Level

    Authors: Jiao Zhang, Hongtao Zhang, Xuelei Chen, Fengquan Wu, Yufeng Liu, Wenmei Zhang

    Abstract: Peak sidelobe level reduction (PSLR) is crucial in the application of large-scale array antennas, as it directly determines their radiation performance. We study the PSLR of subarray-level aperiodic arrays and propose three array structures: dislocated subarrays with uniform elements (DSUE), uniform subarrays with random elements (USRE), dislocated subarrays with random elements (DSR…

    Submitted 4 July, 2023; originally announced July 2023.

  12. arXiv:2305.19507  [pdf, other]

    cs.CV eess.IV

    Manifold Constraint Regularization for Remote Sensing Image Generation

    Authors: Xingzhe Su, Changwen Zheng, Wenwen Qiang, Fengge Wu, Junsuo Zhao, Fuchun Sun, Hui Xiong

    Abstract: Generative Adversarial Networks (GANs) have shown notable accomplishments in the remote sensing domain. However, this paper reveals that their performance on remote sensing images falls short when compared to their impressive results with natural images. This study identifies a previously overlooked issue: GANs exhibit a heightened susceptibility to overfitting on remote sensing images. To address this…

    Submitted 28 March, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

  13. arXiv:2305.11073  [pdf, other]

    cs.CL cs.SD eess.AS

    A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks

    Authors: Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe

    Abstract: Conformer, a convolution-augmented Transformer variant, has become the de facto encoder architecture for speech processing due to its superior performance in various tasks, including automatic speech recognition (ASR), speech translation (ST) and spoken language understanding (SLU). Recently, a new encoder called E-Branchformer has outperformed Conformer in the LibriSpeech ASR benchmark, making it…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted at INTERSPEECH 2023. Code: https://github.com/espnet/espnet

  14. Close-range Human Following Control on a Cane-type Robot with Multi-camera Fusion

    Authors: Haowen Liu, Fengxian Wu, Bin Zhong, Yijun Zhao, Jiatong Zhang, Wenxin Niu, Mingming Zhang

    Abstract: Cane-type robots have been utilized to assist and supervise the mobility-impaired population. One essential technique for cane-type robots is human following control, which allows the robot to follow the user. However, the limited perceptible information of humans by sensors at close range, combined with the occlusion caused by lower limb swing during normal walking, affects the localization of use…

    Submitted 27 September, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

    Comments: This paper has been accepted for publication in IEEE Robotics and Automation Letters (Volume: 8, Issue: 10, October 2023).

  15. arXiv:2303.05240  [pdf, other]

    cs.CV eess.IV

    Intriguing Property and Counterfactual Explanation of GAN for Remote Sensing Image Generation

    Authors: Xingzhe Su, Wenwen Qiang, Jie Hu, Fengge Wu, Changwen Zheng, Fuchun Sun

    Abstract: Generative adversarial networks (GANs) have achieved remarkable progress in the natural image field. However, when applying GANs in the remote sensing (RS) image generation task, an extraordinary phenomenon is observed: the GAN model is more sensitive to the size of training data for RS image generation than for natural image generation. In other words, the generation quality of RS images will cha…

    Submitted 14 May, 2024; v1 submitted 9 March, 2023; originally announced March 2023.

  16. arXiv:2302.14132  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding

    Authors: Yifan Peng, Kwangyoun Kim, Felix Wu, Prashant Sridhar, Shinji Watanabe

    Abstract: Self-supervised speech representation learning (SSL) has been shown to be effective in various downstream tasks, but SSL models are usually large and slow. Model compression techniques such as pruning aim to reduce the model size and computation without degradation in accuracy. Prior studies focus on the pruning of Transformers; however, speech models not only utilize a stack of Transformer blocks, but…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: Accepted at ICASSP 2023

  17. arXiv:2302.10558  [pdf, other]

    eess.SP

    Joint Optimization of Base Station Clustering and Service Caching in User-Centric MEC

    Authors: Langtian Qin, Hancheng Lu, Yao Lu, Chenwu Zhang, Feng Wu

    Abstract: Edge service caching can effectively reduce the delay or bandwidth overhead for acquiring and initializing applications. To address the single-base station (BS) transmission limitation and the serious edge effect in traditional cellular-based edge service caching networks, in this paper, we propose a novel user-centric edge service caching framework where each user is jointly provided with edge caching a…

    Submitted 21 February, 2023; originally announced February 2023.

  18. arXiv:2302.10515  [pdf, other]

    eess.SP cs.DC cs.PF

    Energy-Efficient Blockchain-enabled User-Centric Mobile Edge Computing

    Authors: Langtian Qin, Hancheng Lu, Yuang Chen, Zhuojia Gu, Dan Zhao, Feng Wu

    Abstract: In the traditional mobile edge computing (MEC) system, the availability of MEC services is greatly limited for the edge users of the cell due to serious signal attenuation and inter-cell interference. User-centric MEC (UC-MEC) can be seen as a promising solution to address this issue. In UC-MEC, each user is served by a dedicated access point (AP) cluster enabled with MEC capability instead of a s…

    Submitted 21 February, 2023; originally announced February 2023.

  19. arXiv:2301.06043  [pdf, other]

    eess.IV cs.CV

    Unsupervised Cardiac Segmentation Utilizing Synthesized Images from Anatomical Labels

    Authors: Sihan Wang, Fuping Wu, Lei Li, Zheyao Gao, Byung-Woo Hong, Xiahai Zhuang

    Abstract: Cardiac segmentation is in great demand for clinical practice. Due to the enormous labor of manual delineation, unsupervised segmentation is desired. The ill-posed optimization problem of this task is inherently challenging, requiring well-designed constraints. In this work, we propose an unsupervised framework for multi-class segmentation with both intensity and shape constraints. Firstly, we ext…

    Submitted 15 January, 2023; originally announced January 2023.

  20. arXiv:2212.10525  [pdf, other]

    cs.CL eess.AS

    SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks

    Authors: Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan Sharma, Wei-Lun Wu, Hung-Yi Lee, Karen Livescu, Shinji Watanabe

    Abstract: Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community, but have not received as much attention as lower-level tasks like speech and speaker recognition. In particular, there are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers. Recent work has begun to introduce suc…

    Submitted 15 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: accepted in ACL 2023 (long paper)

  21. arXiv:2212.08542  [pdf, other]

    eess.AS cs.CL

    Context-aware Fine-tuning of Self-supervised Speech Models

    Authors: Suwon Shon, Felix Wu, Kwangyoun Kim, Prashant Sridhar, Karen Livescu, Shinji Watanabe

    Abstract: Self-supervised pre-trained transformers have improved the state of the art on a variety of speech tasks. Due to the quadratic time and space complexity of self-attention, they usually operate at the level of relatively short (e.g., utterance) segments. In this paper, we study the use of context, i.e., surrounding segments, during fine-tuning and propose a new approach called context-aware fine-tu…

    Submitted 28 March, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  22. arXiv:2210.03402  [pdf]

    eess.SY cs.LG nlin.AO

    Research on Self-adaptive Online Vehicle Velocity Prediction Strategy Considering Traffic Information Fusion

    Authors: Ziyan Zhang, Junhao Shen, Dongwei Yao, Feng Wu

    Abstract: In order to increase the prediction accuracy of the online vehicle velocity prediction (VVP) strategy, a self-adaptive velocity prediction algorithm fused with traffic information was presented for multiple scenarios. Initially, traffic scenarios were established inside the co-simulation environment. In addition, the algorithm of a general regressive neural network (GRNN) paired with datasets…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: 9 pages, 7 figures

  23. arXiv:2210.00077  [pdf, other]

    eess.AS cs.LG

    E-Branchformer: Branchformer with Enhanced merging for speech recognition

    Authors: Kwangyoun Kim, Felix Wu, Yifan Peng, Jing Pan, Prashant Sridhar, Kyu J. Han, Shinji Watanabe

    Abstract: Conformer, combining convolution and self-attention sequentially to capture both local and global information, has shown remarkable performance and is currently regarded as the state-of-the-art for automatic speech recognition (ASR). Several other studies have explored integrating convolution and self-attention but they have not managed to match Conformer's performance. The recently introduced Bra…

    Submitted 14 October, 2022; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: Accepted to SLT 2022

  24. Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction

    Authors: Gang Yang, Li Zhang, Man Zhou, Aiping Liu, Xun Chen, Zhiwei Xiong, Feng Wu

    Abstract: Magnetic resonance imaging (MRI) with high resolution (HR) provides more detailed information for accurate diagnosis and quantitative image analysis. Despite the significant advances, most existing super-resolution (SR) reconstruction networks for medical images have two flaws: 1) all of them are designed on a black-box principle, thus lacking sufficient interpretability and further limiting their p…

    Submitted 14 September, 2022; originally announced September 2022.

    Comments: Accepted to ACMMM 2022, 9 pages

  25. arXiv:2207.05565  [pdf, other]

    eess.IV

    Towards Hybrid-Optimization Video Coding

    Authors: Shuai Huo, Dong Liu, Li Li, Siwei Ma, Feng Wu, Wen Gao

    Abstract: Video coding is essentially a mathematical optimization problem of rate and distortion. To solve this complex optimization problem, two popular video coding frameworks have been developed: block-based hybrid video coding and end-to-end learned video coding. If we rethink video coding from the perspective of optimization, we find that the existing two frameworks represent two directions of optimiza…

    Submitted 12 July, 2022; originally announced July 2022.

  26. arXiv:2206.05284  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Decoupling Predictions in Distributed Learning for Multi-Center Left Atrial MRI Segmentation

    Authors: Zheyao Gao, Lei Li, Fuping Wu, Sihan Wang, Xiahai Zhuang

    Abstract: Distributed learning has shown great potential in medical image analysis. It allows the use of multi-center training data with privacy protection. However, data distributions in local centers can vary from each other due to different imaging vendors and annotation protocols. Such variation degrades the performance of learning-based methods. To mitigate the influence, two groups of methods have been p…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: Accepted by MICCAI 2022

  27. arXiv:2205.14833  [pdf, other]

    cs.LG cs.DC eess.SY

    Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning

    Authors: Chengfei Lv, Chaoyue Niu, Renjie Gu, Xiaotang Jiang, Zhaode Wang, Bin Liu, Ziqi Wu, Qiulin Yao, Congyu Huang, Panos Huang, Tao Huang, Hui Shu, Jinde Song, Bin Zou, Peng Lan, Guohuan Xu, Fei Wu, Shaojie Tang, Fan Wu, Guihai Chen

    Abstract: To break the bottlenecks of the mainstream cloud-based machine learning (ML) paradigm, we adopt device-cloud collaborative ML and build the first end-to-end and general-purpose system, called Walle, as the foundation. Walle consists of a deployment platform, distributing ML tasks to billion-scale devices in time; a data pipeline, efficiently preparing task input; and a compute container, providing a c…

    Submitted 29 May, 2022; originally announced May 2022.

    Comments: Accepted by OSDI 2022

  28. arXiv:2205.10781  [pdf, other]

    eess.SY

    A Hierarchical MPC Approach to Car-Following via Linearly Constrained Quadratic Programming

    Authors: Fangyu Wu, Alexandre Bayen

    Abstract: Single-lane car-following is a fundamental task in autonomous driving. A desirable car-following controller should keep a reasonable range of distances to the preceding vehicle and do so as smoothly as possible. To achieve this, numerous control methods have been proposed: some only rely on local sensing; others also make use of non-local downstream observations. While local methods are capable of…

    Submitted 20 August, 2022; v1 submitted 22 May, 2022; originally announced May 2022.

    Comments: 6 pages, 7 figures

  29. arXiv:2205.01086  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages

    Authors: Felix Wu, Kwangyoun Kim, Shinji Watanabe, Kyu Han, Ryan McDonald, Kilian Q. Weinberger, Yoav Artzi

    Abstract: We introduce Wav2Seq, the first self-supervised approach to pre-train both parts of encoder-decoder models for speech data. We induce a pseudo language as a compact discrete representation, and formulate a self-supervised pseudo speech recognition task -- transcribing audio inputs into pseudo subword sequences. This process stands on its own, or can be applied as low-cost second-stage pre-training…

    Submitted 2 May, 2022; originally announced May 2022.

    Comments: Code available at https://github.com/asappresearch/wav2seq

  30. arXiv:2204.11792  [pdf, other]

    cs.SD eess.AS

    SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech

    Authors: Zhenhui Ye, Zhou Zhao, Yi Ren, Fei Wu

    Abstract: The recent progress in non-autoregressive text-to-speech (NAR-TTS) has made fast and high-quality speech synthesis possible. However, current NAR-TTS models usually use phoneme sequences as input and thus cannot understand the tree-structured syntactic information of the input sequence, which hurts the prosody modeling. To this end, we propose SyntaSpeech, a syntax-aware and light-weight NAR-TTS mo…

    Submitted 25 April, 2022; originally announced April 2022.

    Comments: Accepted by IJCAI-2022. 12 pages

  31. arXiv:2204.10704  [pdf, other]

    cs.CV eess.IV

    SUES-200: A Multi-height Multi-scene Cross-view Image Benchmark Across Drone and Satellite

    Authors: Runzhe Zhu, Ling Yin, Mingze Yang, Fei Wu, Yuncheng Yang, Wenbo Hu

    Abstract: Cross-view image matching aims to match images of the same target scene acquired from different platforms. With the rapid development of drone technology, cross-view matching by neural network models has been a widely accepted choice for drone position or navigation. However, existing public datasets do not include images obtained by drones at different heights, and the types of scenes are relativ…

    Submitted 21 January, 2023; v1 submitted 22 April, 2022; originally announced April 2022.

  32. arXiv:2201.03186  [pdf, other]

    eess.IV cs.CV

    MyoPS: A Benchmark of Myocardial Pathology Segmentation Combining Three-Sequence Cardiac Magnetic Resonance Images

    Authors: Lei Li, Fuping Wu, Sihan Wang, Xinzhe Luo, Carlos Martin-Isla, Shuwei Zhai, Jianpeng Zhang, Yanfei Liu, Zhen Zhang, Markus J. Ankenbrand, Haochuan Jiang, Xiaoran Zhang, Linhong Wang, Tewodros Weldebirhan Arega, Elif Altunok, Zhou Zhao, Feiyan Li, Jun Ma, ** Yang, Elodie Puybareau, Ilkay Oksuz, Stephanie Bricq, Weisheng Li, Kumaradevan Punithakumar, Sotirios A. Tsaftaris, et al. (7 additional authors not shown)

    Abstract: Assessment of myocardial viability is essential in diagnosis and treatment management of patients suffering from myocardial infarction, and classification of pathology on myocardium is the key to this assessment. This work defines a new task of medical image analysis, i.e., to perform myocardial pathology segmentation (MyoPS) combining three-sequence cardiac magnetic resonance (CMR) images, which…

    Submitted 10 January, 2022; originally announced January 2022.

  33. Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images

    Authors: Xiaoqiang Wang, Lei Zhu, Siliang Tang, Huazhu Fu, ** Li, Fei Wu, Yi Yang, Yueting Zhuang

    Abstract: Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images. However, RGB-D data is not easily acquired, which limits the development of RGB-D SOD techniques. To alleviate this issue, we present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection. We first devise a…

    Submitted 31 December, 2021; originally announced January 2022.

    Comments: Accepted by IEEE TIP

  34. arXiv:2112.08655  [pdf, other]

    cs.CV eess.IV

    Feature Distillation Interaction Weighting Network for Lightweight Image Super-Resolution

    Authors: Guangwei Gao, Wenjie Li, Juncheng Li, Fei Wu, Huimin Lu, Yi Yu

    Abstract: Convolutional neural network-based single-image super-resolution (SISR) has made great progress in recent years. However, it is difficult to apply these methods to real-world scenarios due to the computational and memory cost. Meanwhile, how to take full advantage of the intermediate features under the constraints of limited parameters and calculations is also a huge challenge. To alleviate these…

    Submitted 11 April, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: 9 pages, 9 figures, 4 tables, AAAI2022

  35. arXiv:2112.07648  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    On the Use of External Data for Spoken Named Entity Recognition

    Authors: Ankita Pasad, Felix Wu, Suwon Shon, Karen Livescu, Kyu J. Han

    Abstract: Spoken language understanding (SLU) tasks involve mapping from speech audio signals to semantic labels. Given the complexity of such tasks, good performance might be expected to require large labeled datasets, which are difficult to collect for each new task and domain. However, recent advances in self-supervised speech representations have made it feasible to consider learning SLU models with lim…

    Submitted 8 July, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Accepted at NAACL 2022. Codebase available at https://github.com/asappresearch/spoken-ner

  36. arXiv:2112.07238  [pdf, other]

    eess.SY

    Composing MPC with LQR and Neural Network for Amortized Efficiency and Stable Control

    Authors: Fangyu Wu, Guanhua Wang, Siyuan Zhuang, Kehan Wang, Alexander Keimer, Ion Stoica, Alexandre Bayen

    Abstract: Model predictive control (MPC) is a powerful control method that handles dynamical systems with constraints. However, solving MPC iteratively in real time, i.e., implicit MPC, remains a computational challenge. To address this, common solutions include explicit MPC and function approximation. Both methods, whenever applicable, may improve the computational efficiency of the implicit MPC by several…

    Submitted 2 August, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: 13 pages, 10 figures, 2 tables

  37. arXiv:2111.10367  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech

    Authors: Suwon Shon, Ankita Pasad, Felix Wu, Pablo Brusco, Yoav Artzi, Karen Livescu, Kyu J. Han

    Abstract: Progress in speech processing has been facilitated by shared datasets and benchmarks. Historically these have focused on automatic speech recognition (ASR), speaker identification, or other lower-level tasks. Interest has been growing in higher-level spoken language understanding tasks, including using end-to-end models, but there are fewer annotated datasets for such tasks. At the same time, rece…

    Submitted 29 July, 2022; v1 submitted 19 November, 2021; originally announced November 2021.

    Comments: Updated preprint for SLUE Benchmark v0.2; Toolkit link https://github.com/asappresearch/slue-toolkit

  38. arXiv:2111.04736  [pdf, other]

    eess.IV cs.CV

    Multi-Modality Cardiac Image Analysis with Deep Learning

    Authors: Lei Li, Fuping Wu, Sihang Wang, Xiahai Zhuang

    Abstract: Accurate cardiac computing, analysis and modeling from multi-modality images are important for the diagnosis and treatment of cardiac disease. Late gadolinium enhancement magnetic resonance imaging (LGE MRI) is a promising technique to visualize and quantify myocardial infarction (MI) and atrial scars. Automating quantification of MI and atrial scars can be challenging due to the low image quality…

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: Under review as a chapter of book 'Deep Learning for Medical Image Analysis, 2E'

  39. arXiv:2109.14837  [pdf, other]

    eess.IV cs.CV

    End-to-End Image Compression with Probabilistic Decoding

    Authors: Haichuan Ma, Dong Liu, Cunhui Dong, Li Li, Feng Wu

    Abstract: Lossy image compression is a many-to-one process, thus one bitstream corresponds to multiple possible original images, especially at low bit rates. However, this nature was seldom considered in previous studies on image compression, which usually chose one possible image as reconstruction, e.g. the one with the maximal a posteriori probability. We propose a learned image compression framework to n…

    Submitted 30 September, 2021; originally announced September 2021.

  40. arXiv:2109.06870  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition

    Authors: Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi

    Abstract: This paper is a study of performance-efficiency trade-offs in pre-trained models for automatic speech recognition (ASR). We focus on wav2vec 2.0, and formalize several architecture designs that influence both the model performance and its efficiency. Putting together all our observations, we introduce SEW (Squeezed and Efficient Wav2vec), a pre-trained model architecture with significant improveme…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Code available at https://github.com/asappresearch/sew

  41. arXiv:2106.09760  [pdf, other]

    eess.AS cs.CL cs.SD

    Multi-mode Transformer Transducer with Stochastic Future Context

    Authors: Kwangyoun Kim, Felix Wu, Prashant Sridhar, Kyu J. Han, Shinji Watanabe

    Abstract: Automatic speech recognition (ASR) models make fewer errors when more surrounding speech information is presented as context. Unfortunately, acquiring a larger future context leads to higher latency. There exists an inevitable trade-off between speed and accuracy. Naively, to fit different latency requirements, people have to store multiple models and pick the best one under the constraints. Inste…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: Accepted to Interspeech 2021

  42. arXiv:2106.08752  [pdf, other]

    eess.IV cs.CV

    Unsupervised Domain Adaptation with Variational Approximation for Cardiac Segmentation

    Authors: Fuping Wu, Xiahai Zhuang

    Abstract: Unsupervised domain adaptation is useful in medical image segmentation. Particularly, when ground truths of the target images are not available, domain adaptation can train a target-specific model by utilizing the existing labeled images from other modalities. Most of the reported works mapped images of both the source and target domains into a common latent feature space, and then reduced their d…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: accepted by IEEE Transactions on Medical Imaging

  43. arXiv:2105.03678  [pdf, other]

    eess.SP cs.LG stat.ML

    Nearly Minimax-Optimal Rates for Noisy Sparse Phase Retrieval via Early-Stopped Mirror Descent

    Authors: Fan Wu, Patrick Rebeschini

    Abstract: This paper studies early-stopped mirror descent applied to noisy sparse phase retrieval, which is the problem of recovering a $k$-sparse signal $\mathbf{x}^\star\in\mathbb{R}^n$ from a set of quadratic Gaussian measurements corrupted by sub-exponential noise. We consider the (non-convex) unregularized empirical risk minimization problem and show that early-stopped mirror descent, when equipped wit…

    Submitted 8 May, 2021; originally announced May 2021.

    Comments: arXiv admin note: text overlap with arXiv:2010.10168

  44. JDSR-GAN: Constructing An Efficient Joint Learning Network for Masked Face Super-Resolution

    Authors: Guangwei Gao, Lei Tang, Fei Wu, Huimin Lu, Jian Yang

    Abstract: With the growing importance of preventing the COVID-19 virus, face images obtained in most video surveillance scenarios are low resolution with mask simultaneously. However, most of the previous face super-resolution solutions can not handle both tasks in one model. In this work, we treat the mask occlusion as image noise and construct a joint and collaborative learning network, called JDSR-GAN, f…

    Submitted 29 January, 2023; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: IEEE Transactions on Multimedia, 8 pages, 7 figures

  45. arXiv:2011.08001  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep-LIBRA: Artificial intelligence method for robust quantification of breast density with independent validation in breast cancer risk assessment

    Authors: Omid Haji Maghsoudi, Aimilia Gastounioti, Christopher Scott, Lauren Pantalone, Fang-Fang Wu, Eric A. Cohen, Stacey Winham, Emily F. Conant, Celine Vachon, Despina Kontos

    Abstract: Breast density is an important risk factor for breast cancer that also affects the specificity and sensitivity of screening mammography. Current federal legislation mandates reporting of breast density for all women undergoing breast screening. Clinically, breast density is assessed visually using the American College of Radiology Breast Imaging Reporting And Data System (BI-RADS) scale. Here, we…

    Submitted 18 October, 2021; v1 submitted 13 November, 2020; originally announced November 2020.

  46. arXiv:2008.12205  [pdf, other]

    cs.CV eess.IV

    Random Style Transfer based Domain Generalization Networks Integrating Shape and Spatial Information

    Authors: Lei Li, Veronika A. Zimmer, Wangbin Ding, Fuping Wu, Liqin Huang, Julia A. Schnabel, Xiahai Zhuang

    Abstract: Deep learning (DL)-based models have demonstrated good performance in medical image segmentation. However, the models trained on a known dataset often fail when performed on an unseen dataset collected from different centers, vendors and disease populations. In this work, we present a random style transfer network to tackle the domain generalization problem for multi-vendor and center cardiac imag…

    Submitted 3 September, 2020; v1 submitted 27 August, 2020; originally announced August 2020.

    Comments: 11 pages

  47. arXiv:2007.14177  [pdf]

    cs.CV eess.IV

    Generative networks as inverse problems with fractional wavelet scattering networks

    Authors: Jiasong Wu, Jing Zhang, Fuzhi Wu, Youyong Kong, Guanyu Yang, Lotfi Senhadji, Huazhong Shu

    Abstract: Deep learning is a hot research topic in the field of machine learning methods and applications. Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) provide impressive image generations from Gaussian white noise, but both of them are difficult to train since they need to train the generator (or encoder) and the discriminator (or decoder) simultaneously, which is easy to cau…

    Submitted 28 July, 2020; originally announced July 2020.

    Comments: 27 pages, 13 figures, 6 tables

  48. arXiv:2006.02666  [pdf]

    eess.IV cs.CV

    Deep Sequential Feature Learning in Clinical Image Classification of Infectious Keratitis

    Authors: Yesheng Xu, Ming Kong, Wenjia Xie, Runping Duan, Zhengqing Fang, Yuxiao Lin, Qiang Zhu, Siliang Tang, Fei Wu, Yu-Feng Yao

    Abstract: Infectious keratitis is the most common entities of corneal diseases, in which pathogen grows in the cornea leading to inflammation and destruction of the corneal tissues. Infectious keratitis is a medical emergency, for which a rapid and accurate diagnosis is needed for speedy initiation of prompt and precise treatment to halt the disease progress and to limit the extent of corneal damage; otherw…

    Submitted 4 June, 2020; originally announced June 2020.

    Comments: Accepted by Engineering

  49. arXiv:2006.01065  [pdf, other]

    stat.ML cs.LG eess.SP

    Hadamard Wirtinger Flow for Sparse Phase Retrieval

    Authors: Fan Wu, Patrick Rebeschini

    Abstract: We consider the problem of reconstructing an $n$-dimensional $k$-sparse signal from a set of noiseless magnitude-only measurements. Formulating the problem as an unregularized empirical risk minimization task, we study the sample complexity performance of gradient descent with Hadamard parametrization, which we call Hadamard Wirtinger flow (HWF). Provided knowledge of the signal sparsity $k$, we p…

    Submitted 24 February, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

  50. arXiv:2005.12848  [pdf, other]

    eess.SP cs.NI eess.SY

    Group-In: Group Inference from Wireless Traces of Mobile Devices

    Authors: Gürkan Solmaz, Jonathan Fürst, Samet Aytaç, Fang-Jing Wu

    Abstract: This paper proposes Group-In, a wireless scanning system to detect static or mobile people groups in indoor or outdoor environments. Group-In collects only wireless traces from the Bluetooth-enabled mobile devices for group inference. The key problem addressed in this work is to detect not only static groups but also moving groups with a multi-phased approach based only noisy wireless Received Sig…

    Submitted 10 June, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: This work has been funded by the EU Horizon 2020 Programme under Grant Agreements No. 731993 AUTOPILOT and No.871249 LOCUS projects. The content of this paper does not reflect the official opinion of the EU. Responsibility for the information and views expressed therein lies entirely with the authors. Proc. of ACM/IEEE IPSN'20, 2020