
Showing 1–50 of 107 results for author: Yu, L

Searching in archive eess.
  1. arXiv:2406.12447  [pdf, other]

    eess.AS

    Text-aware Speech Separation for Multi-talker Keyword Spotting

    Authors: Haoyu Li, Baochen Yang, Yu Xi, Linfeng Yu, Tian Tan, Hao Li, Kai Yu

    Abstract: For noisy environments, ensuring the robustness of keyword spotting (KWS) systems is essential. While much research has focused on noisy KWS, less attention has been paid to multi-talker mixed speech scenarios. Unlike the usual cocktail party problem where multi-talker speech is separated using speaker clues, the key challenge here is to extract the target speech for KWS based on text clues. To ad…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH2024

  2. arXiv:2406.10677  [pdf, ps, other]

    eess.SY

    Intermittent Encryption Strategies for Anti-Eavesdropping Estimation

    Authors: Zhongyao Hu, Bo Chen, Pindi Weng, Jianzheng Wang, Li Yu

    Abstract: In this paper, an anti-eavesdropping estimation problem is investigated. A linear encryption scheme is utilized, which first linearly transforms innovation via an encryption matrix and then encrypts some components of the transformed innovation. To reduce the computation and energy resources consumed by the linear encryption scheme, both stochastic and deterministic intermittent strategies which p…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 12 pages, 5 figures

    MSC Class: 93E-xx

  3. arXiv:2406.04680  [pdf, other]

    eess.IV cs.CV

    MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome

    Authors: Yixin Huang, Yiqi **, Ke Tao, Kaijian Xia, Jianfeng Gu, Lei Yu, Lan Du, Cunjian Chen

    Abstract: May-Thurner Syndrome (MTS), also known as iliac vein compression syndrome or Cockett's syndrome, is a condition potentially impacting over 20 percent of the population, leading to an increased risk of iliofemoral deep venous thrombosis. In this paper, we present a 3D-based deep learning approach called MTS-Net for diagnosing May-Thurner Syndrome using CT scans. To effectively capture the spatial-t…

    Submitted 7 June, 2024; originally announced June 2024.

  4. arXiv:2405.13762  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

    Authors: Gwanghyun Kim, Alonso Martinez, Yu-Chuan Su, Brendan Jou, José Lezama, Agrim Gupta, Lijun Yu, Lu Jiang, Aren Jansen, Jacob Walker, Krishna Somandepalli

    Abstract: Training diffusion models for audiovisual sequences allows for a range of generation tasks by learning conditional distributions of various input-output combinations of the two modalities. Nevertheless, this strategy often requires training a separate model for each task which is expensive. Here, we propose a novel training approach to effectively learn arbitrary conditional distributions in the a…

    Submitted 22 May, 2024; originally announced May 2024.

  5. arXiv:2405.07905  [pdf, other]

    eess.IV cs.CV

    PLUTO: Pathology-Universal Transformer

    Authors: Dinkar Juyal, Harshith Padigela, Chintan Shah, Daniel Shenker, Natalia Harguindeguy, Yi Liu, Blake Martin, Yibo Zhang, Michael Nercessian, Miles Markey, Isaac Finberg, Kelsey Luu, Daniel Borders, Syed Ashar Javed, Emma Krause, Raymond Biju, Aashish Sood, Allen Ma, Jackson Nyman, John Shamshoian, Guillaume Chhor, Darpan Sanghavi, Marc Thibault, Limin Yu, Fedaa Najdawi, et al. (8 additional authors not shown)

    Abstract: Pathology is the study of microscopic inspection of tissue, and a pathology diagnosis is often the medical gold standard to diagnose disease. Pathology images provide a unique challenge for computer-vision-based analysis: a single pathology Whole Slide Image (WSI) is gigapixel-sized and often contains hundreds of thousands to millions of objects of interest across multiple resolutions. In this wor…

    Submitted 13 May, 2024; originally announced May 2024.

  6. arXiv:2405.02825  [pdf, other]

    eess.SP

    An Enhanced Dynamic Ray Tracing Architecture for Channel Prediction Based on Multipath Bidirectional Geometry and Field Extrapolation

    Authors: Yinghe Miao, Li Yu, Yuxiang Zhang, Hongbo Xing, Jianhua Zhang

    Abstract: With the development of sixth generation (6G) networks toward digitalization and intelligentization of communications, rapid and precise channel prediction is crucial for the network potential release. Interestingly, a dynamic ray tracing (DRT) approach for channel prediction has recently been proposed, which utilizes the results of traditional RT to extrapolate the multipath geometry evolution. H…

    Submitted 5 May, 2024; originally announced May 2024.

  7. arXiv:2404.13550  [pdf, other]

    cs.CV eess.IV

    Pointsoup: High-Performance and Extremely Low-Decoding-Latency Learned Geometry Codec for Large-Scale Point Cloud Scenes

    Authors: Kang You, Kai Liu, Li Yu, Pan Gao, Dandan Ding

    Abstract: Despite considerable progress being achieved in point cloud geometry compression, there still remains a challenge in effectively compressing large-scale scenes with sparse surfaces. Another key challenge lies in reducing decoding latency, a crucial requirement in real-world application. In this paper, we propose Pointsoup, an efficient learning-based geometry codec that attains high-performance an…

    Submitted 21 April, 2024; originally announced April 2024.

  8. arXiv:2404.02185  [pdf, other]

    cs.CV cs.GR eess.IV

    NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation

    Authors: Sicheng Li, Hao Li, Yiyi Liao, Lu Yu

    Abstract: The emergence of Neural Radiance Fields (NeRF) has greatly impacted 3D scene modeling and novel-view synthesis. As a kind of visual media for 3D scene representation, compression with high rate-distortion performance is an eternal target. Motivated by advances in neural compression and neural field representation, we propose NeRFCodec, an end-to-end NeRF compression framework that integrates non-l…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR2024. The source code will be released

  9. arXiv:2403.15418  [pdf, other]

    eess.SP

    Stochastic Analysis of Touch-Tone Frequency Recognition in Two-Way Radio Systems for Dialed Telephone Number Identification

    Authors: Liqiang Yu, Chen Li, Bo Liu, Chang Che

    Abstract: This paper focuses on recognizing dialed numbers in a touch-tone telephone system based on the Dual Tone MultiFrequency (DTMF) signaling technique with analysis of stochastic aspects during the noise and random duration of characters. Each dialed digit's acoustic profile is derived from a composite of two carrier frequencies, distinctly assigned to represent that digit. The identification of each…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: It is accepted by The 7th International Conference on Advanced Algorithms and Control Engineering (ICAACE 2024)

  10. arXiv:2403.14244  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    Isotropic Gaussian Splatting for Real-Time Radiance Field Rendering

    Authors: Yuanhao Gong, Lantao Yu, Guanghui Yue

    Abstract: The 3D Gaussian splatting method has drawn a lot of attention, thanks to its high performance in training and high quality of the rendered image. However, it uses anisotropic Gaussian kernels to represent the scene. Although such anisotropic kernels have advantages in representing the geometry, they lead to difficulties in terms of computation, such as splitting or merging two kernels. In this pap…

    Submitted 21 March, 2024; originally announced March 2024.

  11. arXiv:2403.12852  [pdf, other]

    eess.IV cs.CV

    Generative Enhancement for 3D Medical Images

    Authors: Lingting Zhu, Noel Codella, Dongdong Chen, Zhenchao **, Lu Yuan, Lequan Yu

    Abstract: The limited availability of 3D medical image datasets, due to privacy concerns and high collection or annotation costs, poses significant challenges in the field of medical imaging. While a promising alternative is the use of synthesized medical data, there are few solutions for realistic 3D medical image synthesis due to difficulties in backbone design and fewer 3D training samples compared to 2D…

    Submitted 24 May, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 20 pages, 8 figures

  12. arXiv:2403.12467  [pdf, other]

    eess.SP

    Digital Twin Channel for 6G: Concepts, Architectures and Potential Applications

    Authors: Heng Wang, Jianhua Zhang, Gaofeng Nie, Li Yu, Zhiqiang Yuan, Tongjie Li, Jialin Wang, Guangyi Liu

    Abstract: Digital twin channel (DTC) is the real-time mapping of a wireless channel from the physical world to the digital world, which is expected to provide significant performance enhancements for the sixth-generation (6G) air-interface design. In this work, we first define five evolution levels of channel twins with the progression of wireless communication. The fifth level, autonomous DTC, is elaborate…

    Submitted 31 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 7 pages, 5 figures, 15 references. It is submitted to IEEE journal

  13. arXiv:2402.19286  [pdf, other]

    eess.IV cs.CV

    PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation

    Authors: Ruining Deng, Quan Liu, Can Cui, Tianyuan Yao, Jialin Yue, Juming Xiong, Lining Yu, Yifei Wu, Mengmeng Yin, Yu Wang, Shilin Zhao, Yucheng Tang, Haichun Yang, Yuankai Huo

    Abstract: Understanding the anatomy of renal pathology is crucial for advancing disease diagnostics, treatment evaluation, and clinical research. The complex kidney system comprises various components across multiple levels, including regions (cortex, medulla), functional units (glomeruli, tubules), and cells (podocytes, mesangial cells in glomerulus). Prior studies have predominantly overlooked the intrica…

    Submitted 20 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: IEEE / CVF Computer Vision and Pattern Recognition Conference 2024

  14. arXiv:2402.05582  [pdf]

    eess.IV cs.CV cs.MM

    Joint End-to-End Image Compression and Denoising: Leveraging Contrastive Learning and Multi-Scale Self-ONNs

    Authors: Yuxin Xie, Li Yu, Farhad Pakdaman, Moncef Gabbouj

    Abstract: Noisy images are a challenge to image compression algorithms due to the inherent difficulty of compressing noise. As noise cannot easily be discerned from image details, such as high-frequency signals, its presence leads to extra bits needed for compression. Since the emerging learned image compression paradigm enables end-to-end optimization of codecs, recent efforts were made to integrate denois…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Copyright 2024 IEEE - Submitted to IEEE ICIP 2024

  15. arXiv:2402.04267  [pdf]

    physics.med-ph cs.AI cs.CV eess.IV

    Application analysis of AI technology combined with spiral CT scanning in early lung cancer screening

    Authors: Shulin Li, Liqiang Yu, Bo Liu, Qunwei Lin, Jiaxin Huang

    Abstract: At present, the incidence and fatality rate of lung cancer in China rank first among all malignant tumors. Despite the continuous development and improvement of China's medical level, the overall 5-year survival rate of lung cancer patients is still lower than 20% and is staged. A number of studies have confirmed that early diagnosis and treatment of early stage lung cancer is of great significanc…

    Submitted 26 January, 2024; originally announced February 2024.

    Comments: This article was accepted by Frontiers in Computing and Intelligent Systems https://drpress.org/ojs/index.php/fcis/article/view/15781. arXiv admin note: text overlap with arXiv:nlin/0508031 by other authors

  16. arXiv:2402.03302  [pdf, other]

    eess.IV cs.CV cs.LG

    Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

    Authors: Jiarun Liu, Hao Yang, Hong-Yu Zhou, Yan Xi, Lequan Yu, Yizhou Yu, Yong Liang, Guangming Shi, Shaoting Zhang, Hairong Zheng, Shanshan Wang

    Abstract: Accurate medical image segmentation demands the integration of multi-scale information, spanning from local features to global dependencies. However, it is challenging for existing methods to model long-range global information, where convolutional neural networks (CNNs) are constrained by their local receptive fields, and vision transformers (ViTs) suffer from high quadratic complexity of their a…

    Submitted 6 March, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Code and models of Swin-UMamba are publicly available at: https://github.com/JiarunLiu/Swin-UMamba

  17. arXiv:2402.02936  [pdf, other]

    eess.IV cs.CV cs.LG cs.MM

    Panoramic Image Inpainting With Gated Convolution And Contextual Reconstruction Loss

    Authors: Li Yu, Yanjun Gao, Farhad Pakdaman, Moncef Gabbouj

    Abstract: Deep learning-based methods have demonstrated encouraging results in tackling the task of panoramic image inpainting. However, it is challenging for existing methods to distinguish valid pixels from invalid pixels and find suitable references for corrupted areas, thus leading to artifacts in the inpainted results. In response to these challenges, we propose a panoramic image inpainting framework t…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: Copyright 2024 IEEE - to appear in IEEE ICASSP 2024

    Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

  18. arXiv:2311.13682  [pdf, other]

    cs.CV eess.IV

    Single-Shot Plug-and-Play Methods for Inverse Problems

    Authors: Yanqi Cheng, Lipei Zhang, Zhenda Shen, Shujun Wang, Lequan Yu, Raymond H. Chan, Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero

    Abstract: The utilisation of Plug-and-Play (PnP) priors in inverse problems has become increasingly prominent in recent years. This preference is based on the mathematical equivalence between the general proximal operator and the regularised denoiser, facilitating the adaptation of various off-the-shelf denoiser priors to a wide range of inverse problems. However, existing PnP models predominantly rely on p…

    Submitted 22 November, 2023; originally announced November 2023.

  19. arXiv:2310.19293  [pdf, other]

    eess.IV cs.CV

    FetusMapV2: Enhanced Fetal Pose Estimation in 3D Ultrasound

    Authors: Chaoyu Chen, Xin Yang, Yuhao Huang, Wenlong Shi, Yan Cao, Mingyuan Luo, Xindi Hu, Lei Zhue, Lequan Yu, Kejuan Yue, Yuanji Zhang, Yi Xiong, Dong Ni, Weijun Huang

    Abstract: Fetal pose estimation in 3D ultrasound (US) involves identifying a set of associated fetal anatomical landmarks. Its primary objective is to provide comprehensive information about the fetus through landmark connections, thus benefiting various critical applications, such as biometric measurements, plane localization, and fetal movement monitoring. However, accurately estimating the 3D fetal pose…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 16 pages, 11 figures, accepted by Medical Image Analysis (2023)

  20. arXiv:2309.13874  [pdf, other]

    eess.AS cs.LG cs.SD

    Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction

    Authors: Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Xinkai Wang, Hemin Yang, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng

    Abstract: Target Speech Extraction (TSE) is a crucial task in speech processing that focuses on isolating the clean speech of a specific speaker from complex mixtures. While discriminative methods are commonly used for TSE, they can introduce distortion in terms of speech perception quality. On the other hand, generative approaches, particularly diffusion-based methods, can enhance speech quality perceptual…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  21. arXiv:2309.13292  [pdf, other]

    cs.LG cs.CY cs.SD eess.AS

    Beyond Fairness: Age-Harmless Parkinson's Detection via Voice

    Authors: Yicheng Wang, Xiaotian Han, Leisheng Yu, Na Zou

    Abstract: Parkinson's disease (PD), a neurodegenerative disorder, often manifests as speech and voice dysfunction. While utilizing voice data for PD detection has great potential in clinical applications, the widely used deep learning models currently have fairness issues regarding different ages of onset. These deep models perform well for the elderly group (age $>$ 55) but are less accurate for the young…

    Submitted 23 September, 2023; originally announced September 2023.

  22. arXiv:2309.05674  [pdf, other]

    eess.IV cs.CV

    ConvFormer: Plug-and-Play CNN-Style Transformers for Improving Medical Image Segmentation

    Authors: Xian Lin, Zengqiang Yan, Xianbo Deng, Chuansheng Zheng, Li Yu

    Abstract: Transformers have been extensively studied in medical image segmentation to build pairwise long-range dependence. Yet, relatively limited well-annotated medical image data makes transformers struggle to extract diverse global features, resulting in attention collapse where attention maps become similar or even identical. Comparatively, convolutional neural networks (CNNs) have better convergence p…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: Accepted by MICCAI 2023

  23. arXiv:2309.03806  [pdf, ps, other]

    cs.IT eess.SP

    Novel Power-Imbalanced Dense Codebooks for Reliable Multiplexing in Nakagami Channels

    Authors: Yiming Gui, Zilong Liu, Lisu Yu, Chunlei Li, **zhi Fan

    Abstract: This paper studies enhanced dense code multiple access (DCMA) system design for downlink transmission over the Nakagami-$m$ fading channels. By studying the DCMA pairwise error probability (PEP) in a Nakagami-$m$ channel, a novel design metric called minimum logarithmic sum distance (MLSD) is first derived. With respect to the proposed MLSD, we introduce a new family of power-imbalanced dense code…

    Submitted 7 September, 2023; originally announced September 2023.

  24. arXiv:2308.13205  [pdf, other]

    cs.RO eess.SY

    Design and Control of a Bio-inspired Wheeled Bipedal Robot

    Authors: Haizhou Zhao, Lei Yu, Siying Qin, Yuqing Chen

    Abstract: Wheeled bipedal robots have the capability to execute agile and versatile locomotion tasks in unknown terrains, with balancing being a key criterion in evaluating their dynamic performance. This paper focuses on enhancing the balancing performance of wheeled bipedal robots through innovations in both hardware and software aspects. A bio-inspired mechanical design, inspired by the human barbell squ…

    Submitted 15 January, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

  25. arXiv:2307.10094  [pdf, other]

    eess.IV cs.CV

    Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis

    Authors: Lingting Zhu, Zeyue Xue, Zhenchao **, Xian Liu, **gzhen He, Ziwei Liu, Lequan Yu

    Abstract: Cross-modality medical image synthesis is a critical topic and has the potential to facilitate numerous applications in the medical imaging field. Despite recent successes in deep-learning-based generative models, most current medical image synthesis methods rely on generative adversarial networks and suffer from notorious mode collapse and unstable training. Moreover, the 2D backbone-driven appro…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted by International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023). 10 pages, 4 figures

  26. arXiv:2306.08337  [pdf, other]

    eess.SY cs.NI

    Carbon emissions and sustainability of launching 5G mobile networks in China

    Authors: Tong Li, Li Yu, Yibo Ma, Tong Duan, Wenzhen Huang, Yan Zhou, Depeng **, Yong Li, Tao Jiang

    Abstract: Since 2021, China has deployed more than 2.1 million 5G base stations to increase the network capacity and provide ubiquitous digital connectivity for mobile terminals. However, the launch of 5G networks also exacerbates the misalignment between cellular traffic and energy consumption, which reduces carbon efficiency - the amount of network traffic that can be delivered for each unit of carbon emi…

    Submitted 14 June, 2023; originally announced June 2023.

  27. arXiv:2306.04527  [pdf, other]

    eess.IV cs.CV cs.LG

    ContriMix: Scalable stain color augmentation for domain generalization without domain labels in digital pathology

    Authors: Tan H. Nguyen, Dinkar Juyal, ** Li, Aaditya Prakash, Shima Nofallah, Chintan Shah, Sai Chowdary Gullapally, Limin Yu, Michael Griffin, Anand Sampat, John Abel, Justin Lee, Amaro Taylor-Weiner

    Abstract: Differences in staining and imaging procedures can cause significant color variations in histopathology images, leading to poor generalization when deploying deep-learning models trained from a different data source. Various color augmentation methods have been proposed to generate synthetic images during training to make models more robust, eliminating the need for stain normalization during test…

    Submitted 8 March, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

  28. arXiv:2306.02057  [pdf, ps, other]

    eess.SP

    DataAI-6G: A System Parameters Configurable Channel Dataset for AI-6G Research

    Authors: Zibing Shen, Jianhua Zhang, Li Yu, Yuxiang Zhang, Zhen Zhang, Xidong Hu

    Abstract: With the acceleration of the commercialization of fifth generation (5G) mobile communication technology and the research for 6G communication systems, the communication system has the characteristics of high frequency, multi-band, high speed movement of users and large antenna array. These bring many difficulties to obtain accurate channel state information (CSI), which makes the performance of tr…

    Submitted 3 June, 2023; originally announced June 2023.

  29. arXiv:2305.02597  [pdf, other]

    eess.IV

    "Seeing" Electric Network Frequency from Events

    Authors: Lexuan Xu, Guang Hua, Haijian Zhang, Lei Yu, Ning Qiao

    Abstract: Most of the artificial lights fluctuate in response to the grid's alternating current and exhibit subtle variations in terms of both intensity and spectrum, providing the potential to estimate the Electric Network Frequency (ENF) from conventional frame-based videos. Nevertheless, the performance of Video-based ENF (V-ENF) estimation largely relies on the imaging quality and thus may suffer from s…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted by CVPR 2023

  30. arXiv:2304.07972  [pdf, ps, other]

    eess.SY

    Remote State Estimation with Posterior-Based Stochastic Event-Triggered Schedule

    Authors: Zhongyao Hu, Bo Chen, Rusheng Wang, Li Yu

    Abstract: This paper aims to study the state estimation problem under the stochastic event-triggered (SET) schedule. A posterior-based SET mechanism is proposed, which determines whether to transmit data by the effect of the measurement on the posterior estimate. Since this SET mechanism considers the whole posterior probability density function, it has better information screening capability and utilizatio…

    Submitted 16 April, 2023; originally announced April 2023.

    Comments: 10 pages, 5 figures

  31. arXiv:2304.07018  [pdf, other]

    cs.CV cs.LG eess.IV

    DIPNet: Efficiency Distillation and Iterative Pruning for Image Super-Resolution

    Authors: Lei Yu, Xinpeng Li, Youwei Li, Ting Jiang, Qi Wu, Haoqiang Fan, Shuaicheng Liu

    Abstract: Efficient deep learning-based approaches have achieved remarkable performance in single image super-resolution. However, recent studies on efficient super-resolution have mainly focused on reducing the number of parameters and floating-point operations through various network designs. Although these methods can decrease the number of parameters and floating-point operations, they may not necessari…

    Submitted 14 April, 2023; originally announced April 2023.

  32. arXiv:2303.09119  [pdf, other]

    cs.CV cs.SD eess.AS

    Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation

    Authors: Lingting Zhu, Xian Liu, Xuanyu Liu, Rui Qian, Ziwei Liu, Lequan Yu

    Abstract: Animating virtual avatars to make co-speech gestures facilitates various applications in human-machine interaction. The existing methods mainly rely on generative adversarial networks (GANs), which typically suffer from notorious mode collapse and unstable training, thus making it difficult to learn accurate audio-gesture joint distributions. In this work, we propose a novel diffusion-based framew…

    Submitted 18 March, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted by IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023. 10 pages, 3 figures

  33. arXiv:2303.03793  [pdf]

    physics.optics eess.IV physics.app-ph physics.bio-ph

    Roadmap on Deep Learning for Microscopy

    Authors: Giovanni Volpe, Carolina Wählby, Lei Tian, Michael Hecht, Artur Yakimovich, Kristina Monakhova, Laura Waller, Ivo F. Sbalzarini, Christopher A. Metzler, Mingyang Xie, Kevin Zhang, Isaac C. D. Lenton, Halina Rubinsztein-Dunlop, Daniel Brunner, Bijie Bai, Aydogan Ozcan, Daniel Midtvedt, Hao Wang, Nataša Sladoje, Joakim Lindblad, Jason T. Smith, Marien Ochoa, Margarida Barroso, Xavier Intes, Tong Qiu, et al. (50 additional authors not shown)

    Abstract: Through digital imaging, microscopy has evolved from primarily being a means for visual observation of life at the micro- and nano-scale, to a quantitative tool with ever-increasing resolution and throughput. Artificial intelligence, deep neural networks, and machine learning are all niche terms describing computational methods that have gained a pivotal role in microscopy-based research over the…

    Submitted 7 March, 2023; originally announced March 2023.

  34. arXiv:2301.08046  [pdf, other]

    eess.SY

    Learning stability of partially observed switched linear systems

    Authors: Zheming Wang, Raphaël M. Jungers, Mihály Petreczky, Bo Chen, Li Yu

    Abstract: This paper deals with learning stability of partially observed switched linear systems under arbitrary switching. Such systems are widely used to describe cyber-physical systems which arise by combining physical systems with digital components. In many real-world applications, the internal states cannot be observed directly. It is thus more realistic to conduct system analysis using the outputs of…

    Submitted 19 January, 2023; originally announced January 2023.

  35. arXiv:2301.05955  [pdf, other]

    eess.SP cs.HC cs.LG

    Hand Gesture Recognition through Reflected Infrared Light Wave Signals

    Authors: Md Zobaer Islam, Li Yu, Hisham Abuella, John F. O'Hara, Christopher Crick, Sabit Ekin

    Abstract: In this study, we present a wireless (non-contact) gesture recognition method using only incoherent light wave signals reflected from a human subject. In comparison to existing radar, light shadow, sound and camera-based sensing systems, this technology uses a low-cost ubiquitous light source (e.g., infrared LED) to send light towards the subject's hand performing gestures and the reflected light…

    Submitted 13 June, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

    Comments: 5 pages, 6 figures, 2 tables, Accepted, presented and camera-ready version submitted at ICEEE 2023 at Istanbul, Turkey. arXiv admin note: substantial text overlap with arXiv:2007.08178

  36. arXiv:2212.14755  [pdf, ps, other]

    eess.SY

    Secure Fusion Estimation Against FDI Sensor Attacks in Cyber-Physical Systems

    Authors: Bo Chen, Pindi Weng, Daniel W. C. Ho, Li Yu

    Abstract: This paper is concerned with the problem of secure multi-sensors fusion estimation for cyber-physical systems, where sensor measurements may be tampered with by false data injection (FDI) attacks. In this work, it is considered that the adversary may not be able to attack all sensors. That is, several sensors remain not being attacked. In this case, new local reorganized subsystems including the F…

    Submitted 30 December, 2022; originally announced December 2022.

    Comments: 10 pages, 5 figures; the first version of this manuscript was completed in 2020

  37. arXiv:2211.05910  [pdf, other]

    eess.IV cs.CV

    Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, **gang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, **woo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li, et al. (71 additional authors not shown)

    Abstract: Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.07825, arXiv:2105.08826, arXiv:2211.04470, arXiv:2211.03885, arXiv:2211.05256

  38. arXiv:2211.05309  [pdf]

    eess.SY

    Generic Cryo-CMOS Device Modeling and EDA-Compatible Platform for Reliable Cryogenic IC Design

    Authors: Zhidong Tang, Zewei Wang, Yumeng Yuan, Chang He, Xin Luo, Ao Guo, Renhe Chen, Yongqi Hu, Longfei Yang, Chengwei Cao, Linlin Liu, Liujiang Yu, Ganbing Shang, Yongfeng Cao, Shoumian Chen, Yuhang Zhao, Shaojian Hu, Xufeng Kou

    Abstract: This paper outlines the establishment of a generic cryogenic CMOS database in which key electrical parameters and transfer characteristics of the MOSFETs are quantified as functions of device size, temperature/frequency responses. Meanwhile, comprehensive device statistical study is conducted to evaluate the influence of variation and mismatch effects at low temperatures. Furthermore, by incorpora…

    Submitted 9 February, 2024; v1 submitted 9 November, 2022; originally announced November 2022.

  39. arXiv:2209.10218  [pdf, other]

    eess.IV cs.CV

    HiFuse: Hierarchical Multi-Scale Feature Fusion Network for Medical Image Classification

    Authors: Xiangzuo Huo, Gang Sun, Shengwei Tian, Yan Wang, Long Yu, Jun Long, Wendong Zhang, Aolun Li

    Abstract: Medical image classification has developed rapidly under the impetus of the convolutional neural network (CNN). Due to the fixed size of the receptive field of the convolution kernel, it is difficult to capture the global features of medical images. Although the self-attention-based Transformer can model long-range dependencies, it has high computational complexity and lacks local inductive bias.…

    Submitted 21 September, 2022; originally announced September 2022.

  40. arXiv:2209.08245  [pdf, other]

    eess.SP

    How to Define the Propagation Environment Semantics and Its Application in Scatterer-Based Beam Prediction

    Authors: Yutong Sun, Jianhua Zhang, Li Yu, Zhen Zhang, Ping Zhang

    Abstract: In view of the propagation environment directly determining the channel fading, the application tasks can also be solved with the aid of the environment information. Inspired by task-oriented semantic communication and machine learning (ML) powered environment-channel mapping methods, this work aims to provide a new view of the environment from the semantic level, which defines the propagation env…

    Submitted 17 September, 2022; originally announced September 2022.

    Comments: 5 pages, 5 figures

  41. arXiv:2208.14876  [pdf, other]

    eess.IV cs.CV

    NestedFormer: Nested Modality-Aware Transformer for Brain Tumor Segmentation

    Authors: Zhaohu Xing, Lequan Yu, Liang Wan, Tong Han, Lei Zhu

    Abstract: Multi-modal MR imaging is routinely used in clinical practice to diagnose and investigate brain tumors by providing rich complementary information. Previous multi-modal MRI segmentation methods usually perform modal fusion by concatenating multi-modal MRIs at an early/middle stage of the network, which hardly explores non-linear dependencies between modalities. In this work, we propose a novel Nes…

    Submitted 31 August, 2022; originally announced August 2022.

    Comments: MICCAI2022

  42. arXiv:2208.14635  [pdf, other]

    eess.IV cs.CV cs.LG

    Segmentation-guided Domain Adaptation and Data Harmonization of Multi-device Retinal Optical Coherence Tomography using Cycle-Consistent Generative Adversarial Networks

    Authors: Shuo Chen, Da Ma, Sieun Lee, Timothy T. L. Yu, Gavin Xu, Donghuan Lu, Karteek Popuri, Myeong Jin Ju, Marinko V. Sarunic, Mirza Faisal Beg

    Abstract: Optical Coherence Tomography (OCT) is a non-invasive technique capturing cross-sectional area of the retina in micro-meter resolutions. It has been widely used as an auxiliary imaging reference to detect eye-related pathology and predict longitudinal progression of the disease characteristics. Retina layer segmentation is one of the crucial feature extraction techniques, where the variations of reti…

    Submitted 31 August, 2022; originally announced August 2022.

    Comments: 16 pages, 10 figures

  43. arXiv:2208.11609  [pdf, other]

    eess.IV cs.CV

    Fast Nearest Convolution for Real-Time Efficient Image Super-Resolution

    Authors: Ziwei Luo, Youwei Li, Lei Yu, Qi Wu, Zhihong Wen, Haoqiang Fan, Shuaicheng Liu

    Abstract: Deep learning-based single image super-resolution (SISR) approaches have drawn much attention and achieved remarkable success on modern advanced GPUs. However, most state-of-the-art methods require a huge number of parameters, memories, and computational resources, which usually show inferior inference times when applying them to current mobile device CPUs/NPUs. In this paper, we propose a simple…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: AIM & Mobile AI 2022

  44. arXiv:2208.01812  [pdf, ps, other]

    eess.SY cs.NI

    Distributed Event-Triggered Nonlinear Fusion Estimation under Resource Constraints

    Authors: Rusheng Wang, Bo Chen, Zhongyao Hu, Li Yu

    Abstract: This paper studies the event-triggered distributed fusion estimation problems for a class of nonlinear networked multisensor fusion systems without noise statistical characteristics. When considering the limited resource problems of two kinds of communication channels (i.e., sensor-to-remote estimator channel and smart sensor-to-fusion center channel), an event-triggered strategy and a dimensional…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: 15 pages, 9 figures. The first draft was completed in June 2021, and this is the revised version

    MSC Class: 93DXX

  45. arXiv:2207.04596  [pdf, other]

    eess.SP

    Frequency-Angle Two-Dimensional Reflection Coefficient Modeling Based on Terahertz Channel Measurement

    Authors: Zhaowei Chang, Jianhua Zhang, Pan Tang, Lei Tian, Li Yu, Guangyi Liu, Liang Xia

    Abstract: Terahertz (THz) channel propagation characteristics are vital for the design, evaluation, and optimization for THz communication systems. Moreover, reflection plays a significant role in channel propagation. In this letter, the reflection coefficient of the THz channel is researched based on extensive measurement campaigns. Firstly, we set up the THz channel sounder from 220 to 320 GHz with the in…

    Submitted 10 July, 2022; originally announced July 2022.

  46. arXiv:2206.02850  [pdf, other]

    cs.CV eess.IV

    GLF-CR: SAR-Enhanced Cloud Removal with Global-Local Fusion

    Authors: Fang Xu, Yilei Shi, Patrick Ebel, Lei Yu, Gui-Song Xia, Wen Yang, Xiao Xiang Zhu

    Abstract: The challenge of the cloud removal task can be alleviated with the aid of Synthetic Aperture Radar (SAR) images that can penetrate cloud cover. However, the large domain gap between optical and SAR images as well as the severe speckle noise of SAR images may cause significant interference in SAR-based cloud removal, resulting in performance degeneration. In this paper, we propose a novel global-lo…

    Submitted 9 August, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

  47. arXiv:2206.00221  [pdf, ps, other]

    eess.SY

    Distributed Estimation for Interconnected Systems with Arbitrary Coupling Structures

    Authors: Yuchen Zhang, Bo Chen, Li Yu, Daniel W. C. Ho

    Abstract: This paper is concerned with the problem of distributed estimation for time-varying interconnected dynamic systems with arbitrary coupling structures. To guarantee the robustness of the designed estimators, novel distributed stability conditions are proposed with only local information and the information from neighbors. Then, simplified stability conditions which do not require timely exchange of…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: 11 pages, 5 figures (The first version of this manuscript was completed in June 2021)

    MSC Class: 15-00; ACM Class: G.2

  48. arXiv:2205.14510  [pdf, other]

    eess.IV

    Q-LIC: Quantizing Learned Image Compression with Channel Splitting

    Authors: Heming Sun, Lu Yu, Jiro Katto

    Abstract: Learned image compression (LIC) has reached a comparable coding gain with traditional hand-crafted methods such as VVC intra. However, the large network complexity prohibits the usage of LIC on resource-limited embedded systems. Network quantization is an efficient way to reduce the network burden. This paper presents a quantized LIC (QLIC) by channel splitting. First, we explore that the influenc…

    Submitted 28 May, 2022; originally announced May 2022.

  49. arXiv:2205.05675  [pdf, other]

    cs.CV eess.IV

    NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Yawei Li, Kai Zhang, Radu Timofte, Luc Van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, et al. (86 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of e…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Validation code of the baseline model is available at https://github.com/ofsoundof/IMDN. Validation of all submitted models is available at https://github.com/ofsoundof/NTIRE2022_ESR

  50. arXiv:2205.04723  [pdf, other]

    eess.IV cs.CV

    Robust Medical Image Classification from Noisy Labeled Data with Global and Local Representation Guided Co-training

    Authors: Cheng Xue, Lequan Yu, Pengfei Chen, Qi Dou, Pheng-Ann Heng

    Abstract: Deep neural networks have achieved remarkable success in a wide variety of natural image and medical image computing tasks. However, these achievements indispensably rely on accurately annotated training data. If encountering some noisy-labeled images, the network training procedure would suffer from difficulties, leading to a sub-optimal classifier. This problem is even more severe in the medical…

    Submitted 10 May, 2022; originally announced May 2022.