Showing 1–50 of 71 results for author: Zhong, Y

Searching in archive eess.
  1. arXiv:2406.18018  [pdf, other]

    eess.IV

    A Cross Spatio-Temporal Pathology-based Lung Nodule Dataset

    Authors: Muwei Jian, Haoran Zhang, Mingju Shao, Hongyu Chen, Huihui Huang, Yanjie Zhong, Changlei Zhang, Bin Wang, Penghui Gao

    Abstract: Recently, intelligent analysis of lung nodules with the assistant of computer aided detection (CAD) techniques can improve the accuracy rate of lung cancer diagnosis. However, existing CAD systems and pulmonary datasets mainly focus on Computed Tomography (CT) images from one single period, while ignoring the cross spatio-temporal features associated with the progression of nodules contained in im…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.14069  [pdf, other]

    eess.IV cs.CV

    Towards Multi-modality Fusion and Prototype-based Feature Refinement for Clinically Significant Prostate Cancer Classification in Transrectal Ultrasound

    Authors: Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zou, Jianhua Zhou, Yi Wang

    Abstract: Prostate cancer is a highly prevalent cancer and ranks as the second leading cause of cancer-related deaths in men globally. Recently, the utilization of multi-modality transrectal ultrasound (TRUS) has gained significant traction as a valuable technique for guiding prostate biopsies. In this study, we propose a novel learning framework for clinically significant prostate cancer (csPCa) classifica…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.10514  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    GTR-Voice: Articulatory Phonetics Informed Controllable Expressive Speech Synthesis

    Authors: Zehua Kcriss Li, Meiying Melissa Chen, Yi Zhong, Pinxin Liu, Zhiyao Duan

    Abstract: Expressive speech synthesis aims to generate speech that captures a wide range of para-linguistic features, including emotion and articulation, though current research primarily emphasizes emotional aspects over the nuanced articulatory features mastered by professional voice actors. Inspired by this, we explore expressive speech synthesis through the lens of articulatory phonetics. Specifically,…

    Submitted 15 June, 2024; originally announced June 2024.

  4. arXiv:2405.15542  [pdf, other]

    cs.NI cs.DC cs.LG eess.SP

    SATSense: Multi-Satellite Collaborative Framework for Spectrum Sensing

    Authors: Haoxuan Yuan, Zhe Chen, Zheng Lin, **bo Peng, Zihan Fang, Yuhang Zhong, Zihang Song, Yue Gao

    Abstract: Low Earth Orbit satellite Internet has recently been deployed, providing worldwide service with non-terrestrial networks. With the large-scale deployment of both non-terrestrial and terrestrial networks, limited spectrum resources will not be allocated enough. Consequently, dynamic spectrum sharing is crucial for their coexistence in the same spectrum, where accurate spectrum sensing is essential.…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 13 pages, 16 figures

  5. arXiv:2405.02660  [pdf, other]

    cs.IT eess.SP

    AFDM Channel Estimation in Multi-Scale Multi-Lag Channels

    Authors: Rongyou Cao, Yuheng Zhong, Jiangbin Lyu, Deqing Wang, Liqun Fu

    Abstract: Affine Frequency Division Multiplexing (AFDM) is a brand new chirp-based multi-carrier (MC) waveform for high mobility communications, with promising advantages over Orthogonal Frequency Division Multiplexing (OFDM) and other MC waveforms. Existing AFDM research focuses on wireless communication at high carrier frequency (CF), which typically considers only Doppler frequency shift (DFS) as a resul…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 6 pages, 6 figures. Investigate AFDM under underwater multi-scale multi-lag channels. Derive the new input-output formula with the impact of Doppler time scaling. Propose two new channel estimation methods to tackle different level of Doppler factors. Perform diversity analysis based on CFR overlap probability (COP) and mutual incoherent property (MIP)

  6. arXiv:2404.13929  [pdf, other]

    eess.IV cs.CV

    Exploring Kinetic Curves Features for the Classification of Benign and Malignant Breast Lesions in DCE-MRI

    Authors: Zixian Li, Yuming Zhong, Yi Wang

    Abstract: Breast cancer is the most common malignant tumor among women and the second cause of cancer-related death. Early diagnosis in clinical practice is crucial for timely treatment and prognosis. Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has revealed great usability in the preoperative diagnosis and assessing therapy effects thanks to its capability to reflect the morphology and dy…

    Submitted 10 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 6 pages, 8 figures, conference

  7. arXiv:2404.13277  [pdf, other]

    eess.IV cs.CV

    Beyond Score Changes: Adversarial Attack on No-Reference Image Quality Assessment from Two Perspectives

    Authors: Chenxi Yang, Yujia Liu, Dingquan Li, Yan Zhong, Tingting Jiang

    Abstract: Deep neural networks have demonstrated impressive success in No-Reference Image Quality Assessment (NR-IQA). However, recent researches highlight the vulnerability of NR-IQA models to subtle adversarial perturbations, leading to inconsistencies between model predictions and subjective ratings. Current adversarial attacks, however, focus on perturbing predicted scores of individual images, neglecti…

    Submitted 24 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

    Comments: Submitted to a conference

  8. arXiv:2404.11537  [pdf, other]

    cs.CV eess.IV

    SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening

    Authors: Yu Zhong, Xiao Wu, Liang-Jian Deng, Zihan Cao

    Abstract: Pansharpening is a significant image fusion technique that merges the spatial content and spectral characteristics of remote sensing images to generate high-resolution multispectral images. Recently, denoising diffusion probabilistic models have been gradually applied to visual tasks, enhancing controllable image generation through low-rank adaptation (LoRA). In this paper, we introduce a spatial-…

    Submitted 17 April, 2024; originally announced April 2024.

  9. arXiv:2404.06695  [pdf, other]

    eess.IV physics.med-ph

    Spiral Scanning and Self-Supervised Image Reconstruction Enable Ultra-Sparse Sampling Multispectral Photoacoustic Tomography

    Authors: Yutian Zhong, Xiaoming Zhang, Zongxin Mo, Shuangyang Zhang, Wufan Chen, Li Qi

    Abstract: Multispectral photoacoustic tomography (PAT) is an imaging modality that utilizes the photoacoustic effect to achieve non-invasive and high-contrast imaging of internal tissues. However, the hardware cost and computational demand of a multispectral PAT system consisting of up to thousands of detectors are huge. To address this challenge, we propose an ultra-sparse spiral sampling strategy for mult…

    Submitted 9 April, 2024; originally announced April 2024.

  10. arXiv:2403.06167  [pdf, other]

    eess.SY

    Direct Shooting Method for Numerical Optimal Control: A Modified Transcription Approach

    Authors: Jiawei Tang, Yuxing Zhong, Pengyu Wang, Xingzhou Chen, Shuang Wu, Ling Shi

    Abstract: Direct shooting is an efficient method to solve numerical optimal control. It utilizes the Runge-Kutta scheme to discretize a continuous-time optimal control problem making the problem solvable by nonlinear programming solvers. However, conventional direct shooting raises a contradictory dynamics issue when using an augmented state to handle high-order systems. This paper fills the research gap…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: Accepted by ECC24

  11. arXiv:2402.08987  [pdf, other]

    eess.IV cs.CV

    Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer

    Authors: Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zhou, Jianhua Zhou, Yi Wang

    Abstract: Prostate cancer is the most common noncutaneous cancer in the world. Recently, multi-modality transrectal ultrasound (TRUS) has increasingly become an effective tool for the guidance of prostate biopsies. With the aim of effectively identifying prostate cancer, we propose a framework for the classification of clinically significant prostate cancer (csPCa) from multi-modality TRUS videos. The frame…

    Submitted 17 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  12. arXiv:2402.06841  [pdf]

    eess.IV cs.CV

    Point cloud-based registration and image fusion between cardiac SPECT MPI and CTA

    Authors: Shaojie Tang, Penpen Miao, Xingyu Gao, Yu Zhong, Dantong Zhu, Haixing Wen, Zhihui Xu, Qiuyue Wei, Hong** Yao, Xin Huang, Rui Gao, Chen Zhao, Weihua Zhou

    Abstract: A method was proposed for the point cloud-based registration and image fusion between cardiac single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) and cardiac computed tomography angiograms (CTA). Firstly, the left ventricle (LV) epicardial regions (LVERs) in SPECT and CTA images were segmented by using different U-Net neural networks trained to generate the point c…

    Submitted 9 February, 2024; originally announced February 2024.

  13. arXiv:2401.06149  [pdf, other]

    cs.CV cs.LG eess.IV

    Image Classifier Based Generative Method for Planar Antenna Design

    Authors: Yang Zhong, Wei** Dou, Andrew Cohen, Dia'a Bisharat, Yuandong Tian, Jiang Zhu, Qing Huo Liu

    Abstract: To extend the antenna design on printed circuit boards (PCBs) for more engineers of interest, we propose a simple method that models PCB antennas with a few basic components. By taking two separate steps to decide their geometric dimensions and positions, antenna prototypes can be facilitated with no experience required. Random sampling statistics relate to the quality of dimensions are used in se…

    Submitted 16 December, 2023; originally announced January 2024.

    Comments: 13 pages, 18 figures

  14. arXiv:2312.09576  [pdf, other]

    eess.IV cs.CV

    SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

    Authors: Xiangde Luo, Jia Fu, Yunxin Zhong, Shuolin Liu, Bing Han, Mehdi Astaraki, Simone Bendazzoli, Iuliana Toma-Dasu, Yiwen Ye, Ziyang Chen, Yong Xia, Yanzhou Su, ** Ye, Junjun He, Zhaohu Xing, Hongqiu Wang, Lei Zhu, Kaixiang Yang, Xin Fang, Zhiwei Wang, Chan Woong Lee, Sang Joon Park, Jaehee Chun, Constantin Ulrich, Klaus H. Maier-Hein, et al. (17 additional authors not shown)

    Abstract: Radiation therapy is a primary and effective NasoPharyngeal Carcinoma (NPC) treatment strategy. The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: A challenge report of SegRap2023 (organized in conjunction with MICCAI2023)

  15. arXiv:2312.01727  [pdf]

    eess.IV physics.bio-ph

    Deep learning acceleration of iterative model-based light fluence correction for photoacoustic tomography

    Authors: Zhaoyong Liang, Shuangyang Zhang, Zhichao Liang, Zhongxin Mo, Xiaoming Zhang, Yutian Zhong, Wufan Chen, Li Qi

    Abstract: Photoacoustic tomography (PAT) is a promising imaging technique that can visualize the distribution of chromophores within biological tissue. However, the accuracy of PAT imaging is compromised by light fluence (LF), which hinders the quantification of light absorbers. Currently, model-based iterative methods are used for LF correction, but they require significant computational resources due to r…

    Submitted 7 December, 2023; v1 submitted 4 December, 2023; originally announced December 2023.

  16. arXiv:2311.15082  [pdf, other]

    eess.IV

    Learning graph-Fourier spectra of textured surface images for defect localization

    Authors: Tapan Ganatma Nakkina, Adithyaa Karthikeyan, Yuhao Zhong, Ceyhun Eksin, Satish T. S. Bukkapatnam

    Abstract: In the realm of industrial manufacturing, product inspection remains a significant bottleneck, with only a small fraction of manufactured items undergoing inspection for surface defects. Advances in imaging systems and AI can allow automated full inspection of manufactured surfaces. However, even the most contemporary imaging and machine learning methods perform poorly for detecting defects in ima…

    Submitted 1 December, 2023; v1 submitted 25 November, 2023; originally announced November 2023.

  17. arXiv:2311.12842  [pdf, other]

    eess.IV cs.CV

    Multimodal Identification of Alzheimer's Disease: A Review

    Authors: Guian Fang, Mengsha Liu, Yi Zhong, Zhuolin Zhang, Jiehui Huang, Zhenchao Tang, Calvin Yu-Chian Chen

    Abstract: Alzheimer's disease is a progressive neurological disorder characterized by cognitive impairment and memory loss. With the increasing aging population, the incidence of AD is continuously rising, making early diagnosis and intervention an urgent need. In recent years, a considerable number of teams have applied computer-aided diagnostic techniques to early classification research of AD. Most studi…

    Submitted 6 October, 2023; originally announced November 2023.

  18. arXiv:2310.08303  [pdf, other]

    cs.CV cs.SD eess.AS

    Multimodal Variational Auto-encoder based Audio-Visual Segmentation

    Authors: Yuxin Mao, **g Zhang, Mochu Xiang, Yiran Zhong, Yuchao Dai

    Abstract: We propose an Explicit Conditional Multimodal Variational Auto-Encoder (ECMVAE) for audio-visual segmentation (AVS), aiming to segment sound sources in the video sequence. Existing AVS methods focus on implicit feature fusion strategies, where models are trained to fit the discrete samples in the dataset. With a limited and less diverse dataset, the resulting performance is usually unsatisfactory.…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted by ICCV2023, Project page (https://npucvr.github.io/MMVAE-AVS), Code (https://github.com/OpenNLPLab/MMVAE-AVS)

  19. arXiv:2310.07511  [pdf]

    cs.CV cs.LG eess.IV

    A Unified Remote Sensing Anomaly Detector Across Modalities and Scenes via Deviation Relationship Learning

    Authors: **gtao Li, Xinyu Wang, Hengwei Zhao, Liangpei Zhang, Yanfei Zhong

    Abstract: Remote sensing anomaly detector can find the objects deviating from the background as potential targets. Given the diversity in earth anomaly types, a unified anomaly detector across modalities and scenes should be cost-effective and flexible to new earth observation sources and anomaly types. However, the current anomaly detectors are limited to a single modality and single scene, since they aim…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Journal paper

  20. arXiv:2309.09085  [pdf, other]

    cs.SD cs.IR cs.MM eess.AS eess.SP

    SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription

    Authors: Yongyi Zang, Yi Zhong, Frank Cwitkowitz, Zhiyao Duan

    Abstract: Guitar tablature is a form of music notation widely used among guitarists. It captures not only the musical content of a piece, but also its implementation and ornamentation on the instrument. Guitar Tablature Transcription (GTT) is an important task with broad applications in music education, composition, and entertainment. Existing GTT datasets are quite limited in size and scope, rendering mode…

    Submitted 24 January, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024

  21. arXiv:2307.16579  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Contrastive Conditional Latent Diffusion for Audio-visual Segmentation

    Authors: Yuxin Mao, **g Zhang, Mochu Xiang, Yunqiu Lv, Yiran Zhong, Yuchao Dai

    Abstract: We propose a latent diffusion model with contrastive learning for audio-visual segmentation (AVS) to extensively explore the contribution of audio. We interpret AVS as a conditional generation task, where audio is defined as the conditional variable for sound producer(s) segmentation. With our new interpretation, it is especially necessary to model the correlation between audio and the final segme…

    Submitted 31 July, 2023; originally announced July 2023.

  22. arXiv:2307.03942  [pdf, ps, other]

    eess.IV cs.CV

    Ariadne's Thread: Using Text Prompts to Improve Segmentation of Infected Areas from Chest X-ray images

    Authors: Yi Zhong, Mengqiu Xu, Kongming Liang, Kaixin Chen, Ming Wu

    Abstract: Segmentation of the infected areas of the lung is essential for quantifying the severity of lung disease like pulmonary infections. Existing medical image segmentation methods are almost uni-modal methods based on image. However, these image-only methods tend to produce inaccurate results unless trained with large amounts of annotated data. To overcome this challenge, we propose a language-driven…

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: Provisional Acceptance by MICCAI 2023

  23. arXiv:2306.16714  [pdf, other]

    eess.IV cs.CV

    SimPLe: Similarity-Aware Propagation Learning for Weakly-Supervised Breast Cancer Segmentation in DCE-MRI

    Authors: Yuming Zhong, Yi Wang

    Abstract: Breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) plays an important role in the screening and prognosis assessment of high-risk breast cancer. The segmentation of cancerous regions is essential useful for the subsequent analysis of breast MRI. To alleviate the annotation effort to train the segmentation networks, we propose a weakly-supervised strategy using extreme points as…

    Submitted 29 June, 2023; originally announced June 2023.

  24. arXiv:2305.12107  [pdf, other]

    cs.SD cs.CL eess.AS

    EE-TTS: Emphatic Expressive TTS with Linguistic Information

    Authors: Yi Zhong, Chen Zhang, Xule Liu, Chenxi Sun, Weishan Deng, Haifeng Hu, Zhongqian Sun

    Abstract: While Current TTS systems perform well in synthesizing high-quality speech, producing highly expressive speech remains a challenge. Emphasis, as a critical factor in determining the expressiveness of speech, has attracted more attention nowadays. Previous works usually enhance the emphasis by adding intermediate features, but they can not guarantee the overall expressiveness of the speech. To reso…

    Submitted 14 April, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023, fix some typos

  25. arXiv:2305.02493  [pdf, other]

    cs.LG cs.AI eess.SY

    RCP-RF: A Comprehensive Road-car-pedestrian Risk Management Framework based on Driving Risk Potential Field

    Authors: Shuhang Tan, Zhiling Wang, Yan Zhong

    Abstract: Recent years have witnessed the proliferation of traffic accidents, which led wide researches on Automated Vehicle (AV) technologies to reduce vehicle accidents, especially on risk assessment framework of AV technologies. However, existing time-based frameworks can not handle complex traffic scenarios and ignore the motion tendency influence of each moving objects on the risk distribution, leading…

    Submitted 3 May, 2023; originally announced May 2023.

  26. arXiv:2305.00213  [pdf, other]

    stat.ML cs.LG eess.IV

    EBLIME: Enhanced Bayesian Local Interpretable Model-agnostic Explanations

    Authors: Yuhao Zhong, Anirban Bhattacharya, Satish Bukkapatnam

    Abstract: We propose EBLIME to explain black-box machine learning models and obtain the distribution of feature importance using Bayesian ridge regression models. We provide mathematical expressions of the Bayesian framework and theoretical outcomes including the significance of ridge parameter. Case studies were conducted on benchmark datasets and a real-world industrial application of locating internal de…

    Submitted 29 April, 2023; originally announced May 2023.

    Comments: 10 pages, 5 figures, 2 tables

  27. arXiv:2305.00092  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY math.OC

    Improving Gradient Computation for Differentiable Physics Simulation with Contacts

    Authors: Yaofeng Desmond Zhong, Jiequn Han, Biswadip Dey, Georgia Olympia Brikis

    Abstract: Differentiable simulation enables gradients to be back-propagated through physics simulations. In this way, one can learn the dynamics and properties of a physics system by gradient-based optimization or embed the whole differentiable simulation as a layer in a deep learning model for downstream tasks, such as planning and control. However, differentiable simulation at its current stage is not per…

    Submitted 28 April, 2023; originally announced May 2023.

    Comments: 5th Annual Conference on Learning for Dynamics and Control

    Journal ref: Proceedings of Machine Learning Research vol 211, 2023

  28. arXiv:2302.12434  [pdf, other]

    cs.SD cs.AI eess.AS

    Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion

    Authors: Jiangyi Deng, Yanjiao Chen, Yinan Zhong, Qianhao Miao, Xueluan Gong, Wenyuan Xu

    Abstract: Voice conversion (VC) techniques can be abused by malicious parties to transform their audios to sound like a target speaker, making it hard for a human being or a speaker verification/identification system to trace the source speaker. In this paper, we make the first attempt to restore the source voiceprint from audios synthesized by voice conversion methods with high credit. However, unveiling t…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: Accepted by USENIX Security Symposium 2023. Please cite this paper as "Jiangyi Deng, Yanjiao Chen, Yinan Zhong, Qianhao Miao, Xueluan Gong, Wenyuan Xu. Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion. In 32nd USENIX Security Symposium (USENIX Security 23)."

  29. Multi-Scaling Differential Contraction Integral Method for Inverse Scattering Problems with Inhomogeneous Media

    Authors: Yu Zhong, Francesco Zardi, Marco Salucci, Giacomo Oliveri, Andrea Massa

    Abstract: Practical applications of microwave imaging often require the solution of inverse scattering problems with inhomogeneous backgrounds. Towards this end, a novel inversion strategy, which combines the multi-scaling (MS) regularization scheme and the Difference Contraction Integral Equation (DCIE) formulation, is proposed. Such an integrated approach mitigates the non-linearity and the ill-posedness…

    Submitted 28 November, 2022; originally announced November 2022.

  30. arXiv:2210.02287  [pdf]

    cs.SD cs.LG eess.AS

    TC-SKNet with GridMask for Low-complexity Classification of Acoustic scene

    Authors: Luyuan Xie, Yan Zhong, Lin Yang, Zhaoyu Yan, Zhonghai Wu, Junjie Wang

    Abstract: Convolution neural networks (CNNs) have good performance in low-complexity classification tasks such as acoustic scene classifications (ASCs). However, there are few studies on the relationship between the length of target speech and the size of the convolution kernels. In this paper, we combine Selective Kernel Network with Temporal-Convolution (TC-SKNet) to adjust the receptive field of convolut…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: Accepted to APSIPA ASC 2022

  31. arXiv:2209.01578  [pdf, other]

    eess.IV cs.CV

    Spatial-Temporal Transformer for Video Snapshot Compressive Imaging

    Authors: Lishun Wang, Miao Cao, Yong Zhong, Xin Yuan

    Abstract: Video snapshot compressive imaging (SCI) captures multiple sequential video frames by a single measurement using the idea of computational imaging. The underlying principle is to modulate high-speed frames through different masks and these modulated frames are summed to a single measurement captured by a low-speed 2D sensor (dubbed optical encoder); following this, algorithms are employed to recon…

    Submitted 8 September, 2022; v1 submitted 4 September, 2022; originally announced September 2022.

  32. arXiv:2207.10282  [pdf]

    cs.NI cs.AI eess.SY

    An Evolutionary Game based Secure Clustering Protocol with Fuzzy Trust Evaluation and Outlier Detection for Wireless Sensor Networks

    Authors: Liu Yang, Yinzhi Lu, Simon X. Yang, Yuanchang Zhong, Tan Guo, Zhifang Liang

    Abstract: Trustworthy and reliable data delivery is a challenging task in Wireless Sensor Networks (WSNs) due to unique characteristics and constraints. To acquire secured data delivery and address the conflict between security and energy, in this paper we present an evolutionary game based secure clustering protocol with fuzzy trust evaluation and outlier detection for WSNs. Firstly, a fuzzy trust evaluati…

    Submitted 20 July, 2022; originally announced July 2022.

  33. arXiv:2207.06918  [pdf, ps, other]

    eess.SP cs.LG

    Interference-Limited Ultra-Reliable and Low-Latency Communications: Graph Neural Networks or Stochastic Geometry?

    Authors: Yuhong Liu, Changyang She, Yi Zhong, Wibowo Hardjawana, Fu-Chun Zheng, Branka Vucetic

    Abstract: In this paper, we aim to improve the Quality-of-Service (QoS) of Ultra-Reliability and Low-Latency Communications (URLLC) in interference-limited wireless networks. To obtain time diversity within the channel coherence time, we first put forward a random repetition scheme that randomizes the interference power. Then, we optimize the number of reserved slots and the number of repetitions for each p…

    Submitted 18 July, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: Submitted to IEEE journal for possible publication

  34. arXiv:2207.05042  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS eess.IV

    Audio-Visual Segmentation

    Authors: **xing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, **g Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

    Abstract: We propose to explore a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound at the time of the image frame. To facilitate this research, we construct the first audio-visual segmentation benchmark (AVSBench), providing pixel-wise annotations for the sounding objects in audible videos. Two settings are studied with…

    Submitted 17 February, 2023; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: ECCV 2022; Code is available at https://github.com/OpenNLPLab/AVSBench

  35. EMVLight: a Multi-agent Reinforcement Learning Framework for an Emergency Vehicle Decentralized Routing and Traffic Signal Control System

    Authors: Haoran Su, Yaofeng D. Zhong, Joseph Y. J. Chow, Biswadip Dey, Li **

    Abstract: Emergency vehicles (EMVs) play a crucial role in responding to time-critical calls such as medical emergencies and fire outbreaks in urban areas. Existing methods for EMV dispatch typically optimize routes based on historical traffic-flow data and design traffic signal pre-emption accordingly; however, we still lack a systematic methodology to address the coupling between EMV routing and traffic s…

    Submitted 29 June, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: 19 figures, 10 tables. Manuscript extended on previous work arXiv:2109.05429, arXiv:2111.00278

    Journal ref: Transportation Research Part C: Emerging Technologies Volume 146, January 2023, 103955

  36. arXiv:2205.09285  [pdf, other]

    q-bio.TO eess.IV

    Leveraging mid-infrared spectroscopic imaging and deep learning for tissue subtype classification in ovarian cancer

    Authors: Chalapathi Charan Gajjela, Matthew Brun, Rupali Mankar, Sara Corvigno, Noah Kennedy, Yan** Zhong, **song Liu, Anil K. Sood, David Mayerich, Sebastian Berisha, Rohith Reddy

    Abstract: Mid-infrared spectroscopic imaging (MIRSI) is an emerging class of label-free techniques being leveraged for digital histopathology. Modern histopathologic identification of ovarian cancer involves tissue staining followed by morphological pattern recognition. This process is time-consuming, subjective, and requires extensive expertise. This paper presents the first label-free, quantitative, and a…

    Submitted 5 July, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

  37. Low-Complexity Distributed Precoding in User-Centric Cell-Free mmWave MIMO Systems

    Authors: Yingrong Zhong, Yashuai Cao, Tiejun Lv

    Abstract: User-centric (UC) based cell-free (CF) structures can provide the benefits of coverage enhancement for millimeter wave (mmWave) multiple input multiple output (MIMO) systems, which is regarded as the key technology of the reliable and high-rate services. In this paper, we propose a new beam selection scheme and precoding algorithm for the UC CF mmWave MIMO system, where a weighted sum-rate maximiz…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: This is the final version published in 2022 Wireless Telecommunications Symposium (WTS)

  38. arXiv:2203.15609  [pdf, other]

    cs.SD eess.AS

    Locality Matters: A Locality-Biased Linear Attention for Automatic Speech Recognition

    Authors: **gyu Sun, Gui** Zhong, Dinghao Zhou, Baoxiang Li, Yiran Zhong

    Abstract: Conformer has shown a great success in automatic speech recognition (ASR) on many public benchmarks. One of its crucial drawbacks is the quadratic time-space complexity with respect to the input sequence length, which prohibits the model to scale-up as well as process longer input audio sequences. To solve this issue, numerous linear attention methods have been proposed. However, these methods oft…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: 5 pages, 2 figures, submitted to interspeech 2022

  39. arXiv:2203.03844  [pdf, other]

    eess.IV cs.CV

    Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks

    Authors: Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji

    Abstract: Light-weight super-resolution (SR) models have received considerable attention for their serviceability in mobile devices. Many efforts employ network quantization to compress SR models. However, these methods suffer from severe performance degradation when quantizing the SR models to ultra-low precision (e.g., 2-bit and 3-bit) with the low-cost layer-wise quantizer. In this paper, we identify tha…

    Submitted 3 July, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: ECCV2022

  40. arXiv:2201.05768  [pdf, other]

    eess.IV cs.CV

    Spectral Compressive Imaging Reconstruction Using Convolution and Contextual Transformer

    Authors: Lishun Wang, Zongliang Wu, Yong Zhong, Xin Yuan

    Abstract: Spectral compressive imaging (SCI) is able to encode the high-dimensional hyperspectral image to a 2D measurement, and then uses algorithms to reconstruct the spatio-spectral data-cube. At present, the main bottleneck of SCI is the reconstruction algorithm, and the state-of-the-art (SOTA) reconstruction methods generally face the problem of long reconstruction time and/or poor detail recovery. In…

    Submitted 2 July, 2023; v1 submitted 15 January, 2022; originally announced January 2022.

  41. arXiv:2111.01326  [pdf, other]

    eess.AS cs.CL cs.SD

    Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity

    Authors: Peter Wu, Jiatong Shi, Yifan Zhong, Shinji Watanabe, Alan W Black

    Abstract: Speech processing systems currently do not support the vast majority of languages, in part due to the lack of data in low-resource languages. Cross-lingual transfer offers a compelling way to help bridge this digital divide by incorporating high-resource data into low-resource systems. Current cross-lingual algorithms have shown success in text-based tasks and speech-related tasks over some low-re…

    Submitted 1 November, 2021; originally announced November 2021.

  42. arXiv:2111.00278  [pdf, other]

    cs.AI eess.SY

    A Decentralized Reinforcement Learning Framework for Efficient Passage of Emergency Vehicles

    Authors: Haoran Su, Yaofeng Desmond Zhong, Biswadip Dey, Amit Chakraborty

    Abstract: Emergency vehicles (EMVs) play a critical role in a city's response to time-critical events such as medical emergencies and fire outbreaks. The existing approaches to reduce EMV travel time employ route optimization and traffic signal pre-emption without accounting for the coupling between these two subproblems. As a result, the planned route often becomes suboptimal. In addition, these appr…

    Submitted 20 February, 2022; v1 submitted 30 October, 2021; originally announced November 2021.

    Comments: Artificial Intelligence and Humanitarian Assistance and Disaster Recovery (AI + HADR) workshop, NeurIPS 2021. arXiv admin note: substantial text overlap with arXiv:2109.05429

  43. arXiv:2109.05429  [pdf, other]

    cs.LG eess.SY

    EMVLight: A Decentralized Reinforcement Learning Framework for Efficient Passage of Emergency Vehicles

    Authors: Haoran Su, Yaofeng Desmond Zhong, Biswadip Dey, Amit Chakraborty

    Abstract: Emergency vehicles (EMVs) play a crucial role in responding to time-critical events such as medical emergencies and fire outbreaks in an urban area. The less time EMVs spend traveling through traffic, the more likely they are to help save people's lives and reduce property loss. To reduce the travel time of EMVs, prior work has used route optimization based on historical traffic-flow data and tra…

    Submitted 28 June, 2022; v1 submitted 12 September, 2021; originally announced September 2021.

    Comments: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI-22)

  44. arXiv:2104.00239  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Positive Sample Propagation along the Audio-Visual Event Line

    Authors: Jinxing Zhou, Liang Zheng, Yiran Zhong, Shijie Hao, Meng Wang

    Abstract: Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs). Given a video, we aim to localize video segments containing an AVE and identify its category. In order to learn discriminative features for a classifier, it is pivotal to identify the helpful (or positive) audio-visual segment pairs while filtering out the irrelevant ones, regardless of whether they ar…

    Submitted 5 April, 2021; v1 submitted 31 March, 2021; originally announced April 2021.

    Comments: Accepted to CVPR 2021. Code is available at https://github.com/jasongief/PSP_CVPR_2021

  45. arXiv:2012.13920  [pdf]

    eess.IV cs.CV

    WHU-Hi: UAV-borne hyperspectral with high spatial resolution (H2) benchmark datasets for hyperspectral image classification

    Authors: Xin Hu, Yanfei Zhong, Chang Luo, Xinyu Wang

    Abstract: Classification is an important aspect of hyperspectral image processing and application. At present, researchers mostly use classic airborne hyperspectral imagery as the benchmark dataset. However, existing datasets suffer from three bottlenecks: (1) low spatial resolution; (2) a low proportion of labeled pixels; (3) a low degree of distinction between subclasses. In this paper, a new benchmark dataset…

    Submitted 30 March, 2021; v1 submitted 27 December, 2020; originally announced December 2020.

    Comments: 5 pages, 1 figure

  46. arXiv:2012.02334  [pdf, other]

    cs.LG cs.AI eess.SY math.DS

    Benchmarking Energy-Conserving Neural Networks for Learning Dynamics from Data

    Authors: Yaofeng Desmond Zhong, Biswadip Dey, Amit Chakraborty

    Abstract: The last few years have witnessed an increased interest in incorporating physics-informed inductive bias in deep learning frameworks. In particular, a growing volume of literature has been exploring ways to enforce energy conservation while using neural networks for learning dynamics from observed time-series data. In this work, we survey ten recently proposed energy-conserving neural network mode…

    Submitted 28 April, 2023; v1 submitted 3 December, 2020; originally announced December 2020.

  47. arXiv:2012.00876  [pdf, other]

    cs.CL eess.AS

    Automatically Identifying Language Family from Acoustic Examples in Low Resource Scenarios

    Authors: Peter Wu, Yifan Zhong, Alan W Black

    Abstract: Existing multilingual speech NLP works focus on a relatively small subset of languages, and thus current linguistic understanding of languages predominantly stems from classical approaches. In this work, we propose a method to analyze language similarity using deep learning. Namely, we train a model on the Wilderness dataset and investigate how its latent space compares with classical language fam…

    Submitted 1 December, 2020; originally announced December 2020.

  48. arXiv:2011.09766  [pdf, other]

    cs.CV cs.LG eess.IV

    Foreground-Aware Relation Network for Geospatial Object Segmentation in High Spatial Resolution Remote Sensing Imagery

    Authors: Zhuo Zheng, Yanfei Zhong, Junjue Wang, Ailong Ma

    Abstract: Geospatial object segmentation, as a particular semantic segmentation task, always faces larger-scale variation, larger intra-class variance of background, and foreground-background imbalance in high spatial resolution (HSR) remote sensing imagery. However, general semantic segmentation methods mainly focus on scale variation in the natural scene, with inadequate consideration of the othe…

    Submitted 19 November, 2020; originally announced November 2020.

    Comments: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020

  49. FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification

    Authors: Zhuo Zheng, Yanfei Zhong, Ailong Ma, Liangpei Zhang

    Abstract: Deep learning techniques have provided significant improvements in hyperspectral image (HSI) classification. The current deep learning based HSI classifiers follow a patch-based learning framework by dividing the image into overlapping patches. As such, these methods are local learning methods, which have a high computational cost. In this paper, a fast patch-free global learning (FPGA) framework…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: 16 pages, 15 figures, IEEE Transactions on Geoscience and Remote Sensing, 2020

  50. arXiv:2011.03247  [pdf, other]

    cs.CV eess.IV

    Hi-UCD: A Large-scale Dataset for Urban Semantic Change Detection in Remote Sensing Imagery

    Authors: Shiqi Tian, Ailong Ma, Zhuo Zheng, Yanfei Zhong

    Abstract: With the acceleration of urban expansion, urban change detection (UCD), as a significant and effective approach, can provide change information with respect to geospatial objects for dynamic urban analysis. However, existing datasets suffer from three bottlenecks: (1) lack of high spatial resolution images; (2) lack of semantic annotation; (3) lack of long-range multi-temporal images. In…

    Submitted 27 December, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Presented at NeurIPS 2020 Workshop on Machine Learning for the Developing World