Showing 1–50 of 206 results for author: He, Y

Searching in archive eess.

  1. arXiv:2406.18054  [pdf, other]

    eess.IV cs.CV

    Leveraging Pre-trained Models for FF-to-FFPE Histopathological Image Translation

    Authors: Qilai Zhang, Jiawen Li, Peiran Liao, Jiali Hu, Tian Guan, Anjia Han, Yonghong He

    Abstract: The two primary types of Hematoxylin and Eosin (H&E) slides in histopathology are Formalin-Fixed Paraffin-Embedded (FFPE) and Fresh Frozen (FF). FFPE slides offer high quality histopathological images but require a labor-intensive acquisition process. In contrast, FF slides can be prepared quickly, but the image quality is relatively poor. Our task is to translate FF images into FFPE style, thereb…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.15222  [pdf]

    eess.IV cs.AI cs.CV

    Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study

    Authors: Yujian Hu, Yilang Xiang, Yan-Jie Zhou, Yangyan He, Shifeng Yang, Xiaolong Du, Chunlan Den, Youyao Xu, Gaofeng Wang, Zhengyao Ding, Jingyong Huang, Wenjun Zhao, Xuejun Wu, Donglin Li, Qianqian Zhu, Zhenjiang Li, Chenyang Qiu, Ziheng Wu, Yunjun He, Chen Tian, Yihui Qiu, Zuodong Lin, Xiaolong Zhang, Yuan He, Zhenpeng Yuan, et al. (15 additional authors not shown)

    Abstract: Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed…

    Submitted 24 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: under peer review

  3. arXiv:2406.11006  [pdf, other]

    cs.SD cs.AI eess.AS

    SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field

    Authors: Yuhang He, Shitong Xu, Jia-Xing Zhong, Sangyun Shin, Niki Trigoni, Andrew Markham

    Abstract: We present SPEAR, a continuous receiver-to-receiver acoustic neural warping field for spatial acoustic effects prediction in an acoustic 3D space with a single stationary audio source. Unlike traditional source-to-receiver modelling methods that require prior space acoustic properties knowledge to rigorously model audio propagation from source to receiver, we propose to predict by warping the spat…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures in main paper

  4. arXiv:2406.10932  [pdf, other]

    cs.SD cs.AI eess.AS

    Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition

    Authors: Wenhan Yao, Jiangkun Yang, Yongqiang He, Jia Liu, Weiping Wen

    Abstract: Speech recognition is an essential start ring of human-computer interaction, and recently, deep learning models have achieved excellent success in this task. However, when the model training and private data provider are always separated, some security threats that make deep neural networks (DNNs) abnormal deserve to be researched. In recent years, the typical backdoor attacks have been researched…

    Submitted 16 June, 2024; originally announced June 2024.

  5. arXiv:2406.02640  [pdf, other]

    eess.IV physics.med-ph physics.optics

    Ghost imaging-based Non-contact Heart Rate Detection

    Authors: Jianming Yu, Yuchen He, Bin Li, Hui Chen, Huaibin Zheng, Jianbin Liu, Zhuo Xu

    Abstract: Remote heart rate measurement is an increasingly concerned research field, usually using remote photoplethysmography (rPPG) to collect heart rate information through video data collection. However, in certain specific scenarios (such as low light conditions, intense lighting, and non-line-of-sight situations), traditional imaging methods fail to capture image information effectively, that may lead…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 4 pages, 6 figures

  6. arXiv:2405.17702  [pdf]

    eess.SY

    A Two-sided Model for EV Market Dynamics and Policy Implications

    Authors: Haoxuan Ma, Brian Yueshuai He, Tomas Kaljevic, Jiaqi Ma

    Abstract: The diffusion of Electric Vehicles (EVs) plays a pivotal role in mitigating greenhouse gas emissions, particularly in the U.S., where ambitious zero-emission and carbon neutrality objectives have been set. In pursuit of these goals, many states have implemented a range of incentive policies aimed at stimulating EV adoption and charging infrastructure development, especially public EV charging stat…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Conference preprint, 8 pages, 3 figures

  7. arXiv:2405.12487  [pdf, other]

    cs.CV eess.IV

    3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification

    Authors: Yan He, Bing Tu, Bo Liu, Jun Li, Antonio Plaza

    Abstract: Hyperspectral image (HSI) classification constitutes the fundamental research in remote sensing fields. Convolutional Neural Networks (CNNs) and Transformers have demonstrated impressive capability in capturing spectral-spatial contextual dependencies. However, these architectures suffer from limited receptive fields and quadratic computational complexity, respectively. Fortunately, recent Mamba a…

    Submitted 21 May, 2024; originally announced May 2024.

  8. arXiv:2405.00577  [pdf]

    cs.LG eess.SP q-bio.NC

    Discovering robust biomarkers of neurological disorders from functional MRI using graph neural networks: A Review

    Authors: Yi Hao Chan, Deepank Girish, Sukrit Gupta, Jing Xia, Chockalingam Kasi, Yinan He, Conghao Wang, Jagath C. Rajapakse

    Abstract: Graph neural networks (GNN) have emerged as a popular tool for modelling functional magnetic resonance imaging (fMRI) datasets. Many recent studies have reported significant improvements in disorder classification performance via more sophisticated GNN designs and highlighted salient features that could be potential biomarkers of the disorder. In this review, we provide an overview of how GNN and…

    Submitted 1 May, 2024; originally announced May 2024.

  9. arXiv:2404.15704  [pdf, other]

    cs.LG cs.AI cs.SD eess.AS

    Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning

    Authors: Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Jing Xiao

    Abstract: Single-model systems often suffer from deficiencies in tasks such as speaker verification (SV) and image classification, relying heavily on partial prior knowledge during decision-making, resulting in suboptimal performance. Although multi-model fusion (MMF) can mitigate some of these issues, redundancy in learned representations may limits improvements. To this end, we propose an adversarial comp…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Accepted by the 2024 International Joint Conference on Neural Networks (IJCNN 2024)

  10. arXiv:2404.13892  [pdf, other]

    cs.SD cs.AI eess.AS

    Retrieval-Augmented Audio Deepfake Detection

    Authors: Zuheng Kang, Yayun He, Botao Zhao, Xiaoyang Qu, Junqing Peng, Jing Xiao, Jianzong Wang

    Abstract: With recent advances in speech synthesis including text-to-speech (TTS) and voice conversion (VC) systems enabling the generation of ultra-realistic audio deepfakes, there is growing concern about their potential misuse. However, most deepfake (DF) detection methods rely solely on the fuzzy knowledge learned by a single model, resulting in performance bottlenecks and transparency issues. Inspired…

    Submitted 23 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted by the 2024 International Conference on Multimedia Retrieval (ICMR 2024)

  11. arXiv:2404.13786  [pdf, other]

    eess.SY cs.AI cs.DC cs.LG

    Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving

    Authors: Shuyao Shi, Neiwen Ling, Zhehao Jiang, Xuan Huang, Yuze He, Xiaoguang Zhao, Bufang Yang, Chen Bian, Jingfei Xia, Zhenyu Yan, Raymond Yeung, Guoliang Xing

    Abstract: Recently, smart roadside infrastructure (SRI) has demonstrated the potential of achieving fully autonomous driving systems. To explore the potential of infrastructure-assisted autonomous driving, this paper presents the design and deployment of Soar, the first end-to-end SRI system specifically designed to support autonomous driving systems. Soar consists of both software and hardware components ca…

    Submitted 21 April, 2024; originally announced April 2024.

  12. arXiv:2404.05976  [pdf, other]

    cs.LG eess.SY stat.ME

    A Cyber Manufacturing IoT System for Adaptive Machine Learning Model Deployment by Interactive Causality Enabled Self-Labeling

    Authors: Yutian Ren, Yuqi He, Xuyin Zhang, Aaron Yen, G. P. Li

    Abstract: Machine Learning (ML) has been demonstrated to improve productivity in many manufacturing applications. To host these ML applications, several software and Industrial Internet of Things (IIoT) systems have been proposed for manufacturing applications to deploy ML applications and provide real-time intelligence. Recently, an interactive causality enabled self-labeling method has been proposed to ad…

    Submitted 8 April, 2024; originally announced April 2024.

  13. arXiv:2402.17723  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners

    Authors: Yazhou Xing, Yingqing He, Zeyue Tian, Xintao Wang, Qifeng Chen

    Abstract: Video and audio content creation serves as the core technique for the movie industry and professional users. Recently, existing diffusion-based methods tackle video and audio generation separately, which hinders the technique transfer from academia to industry. In this work, we aim at filling the gap, with a carefully designed optimization-based framework for cross-visual-audio and joint-visual-au…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted to CVPR 2024. Project website: https://yzxing87.github.io/Seeing-and-Hearing/

  14. arXiv:2402.17184  [pdf, other]

    cs.CL cs.SD eess.AS

    Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models

    Authors: Rohit Prabhavalkar, Zhong Meng, Weiran Wang, Adam Stooke, Xingyu Cai, Yanzhang He, Arun Narayanan, Dongseong Hwang, Tara N. Sainath, Pedro J. Moreno

    Abstract: The accuracy of end-to-end (E2E) automatic speech recognition (ASR) models continues to improve as they are scaled to larger sizes, with some now reaching billions of parameters. Widespread deployment and adoption of these models, however, requires computationally efficient strategies for decoding. In the present work, we study one such strategy: applying multiple frame reduction layers in the enc…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted to 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)

  15. arXiv:2402.05847  [pdf, other]

    eess.SP

    Reconfigurable Intelligent Surface-Aided Dual-Function Radar and Communication Systems With MU-MIMO Communication

    Authors: Yasheng Jin, Hong Ren, Cunhua Pan, Zhiyuan Yu, Ruisong Weng, Boshi Wang, Gui Zhou, Yongchao He, Maged Elkashlan

    Abstract: In this paper, we investigate a reconfigurable intelligent surface (RIS)-aided integrated sensing and communication (ISAC) system. Our objective is to maximize the achievable sum rate of the multi-antenna communication users through the joint active and passive beamforming. Specifically, the weighted minimum mean-square error (WMMSE) method is first used to reformulate the original problem i…

    Submitted 8 February, 2024; originally announced February 2024.

  16. arXiv:2402.04532  [pdf, other]

    eess.SP

    Joint Beamforming Design for Double Active RIS-assisted Radar-Communication Coexistence Systems

    Authors: Mengyu Liu, Hong Ren, Cunhua Pan, Boshi Wang, Zhiyuan Yu, Ruisong Weng, Kangda Zhi, Yongchao He

    Abstract: Integrated sensing and communication (ISAC) technology has been considered as one of the key candidate technologies in the next-generation wireless communication systems. However, when radar and communication equipment coexist in the same system, i.e. radar-communication coexistence (RCC), the interference from communication systems to radar can be large and cannot be ignored. Recently, reconfigur…

    Submitted 6 February, 2024; originally announced February 2024.

  17. arXiv:2402.02122  [pdf, other]

    eess.SP

    Secure Wireless Communication in Active RIS-Assisted DFRC System

    Authors: Yang Zhang, Hong Ren, Cunhua Pan, Boshi Wang, Zhiyuan Yu, Ruisong Weng, Tuo Wu, Yongchao He

    Abstract: This work considers a dual-functional radar and communication (DFRC) system with an active reconfigurable intelligent surface (RIS) and a potential eavesdropper. Our purpose is to maximize the secrecy rate (SR) of the system by jointly designing the beamforming matrix at the DFRC base station (BS) and the reflecting coefficients at the active RIS, subject to the signal-to-interference-plus-noise-r…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 13 pages, 9 figures

  18. arXiv:2401.17577  [pdf, other]

    cs.IT eess.SP

    Robustness in Wireless Distributed Learning: An Information-Theoretic Analysis

    Authors: Yangshuo He, Guanding Yu

    Abstract: In this paper, we take an information-theoretic approach to understand the robustness in wireless distributed learning. Upon measuring the difference in loss functions, we provide an upper bound of the performance deterioration due to imperfect wireless channels. Moreover, we characterize the transmission rate under task performance guarantees and propose the channel capacity gain resulting from t…

    Submitted 30 January, 2024; originally announced January 2024.

  19. arXiv:2401.05850  [pdf, other]

    cs.SD eess.AS

    Contrastive Loss Based Frame-wise Feature disentanglement for Polyphonic Sound Event Detection

    Authors: Yadong Guan, Jiqing Han, Hongwei Song, Wenjie Song, Guibin Zheng, Tieran Zheng, Yongjun He

    Abstract: Overlapping sound events are ubiquitous in real-world environments, but existing end-to-end sound event detection (SED) methods still struggle to detect them effectively. A critical reason is that these methods represent overlapping events using shared and entangled frame-wise features, which degrades the feature discrimination. To solve the problem, we propose a disentangled feature learning fram…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: accepted by icassp2024

  20. arXiv:2312.16149  [pdf, other]

    cs.SD eess.AS

    SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network

    Authors: Yuhang He, Zhuangzhuang Dai, Long Chen, Niki Trigoni, Andrew Markham

    Abstract: In this paper, we study an underexplored, yet important and challenging problem: counting the number of distinct sounds in raw audio characterized by a high degree of polyphonicity. We do so by systematically proposing a novel end-to-end trainable neural network (which we call DyDecNet, consisting of a dyadic decomposition front-end and backbone network), and quantifying the difficulty level of co…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: AAAI2024 Paper

  21. arXiv:2312.08553  [pdf, other]

    eess.AS cs.SD

    USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

    Authors: Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, Shivani Agrawal

    Abstract: End-to-end automatic speech recognition (ASR) models have seen revolutionary quality gains with the recent development of large-scale universal speech models (USM). However, deploying these massive USMs is extremely expensive due to the enormous memory usage and computational cost. Therefore, model compression is an important research topic to fit USM-based ASR under budget in real-world scenarios…

    Submitted 16 January, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: Accepted by ICASSP 2024. Preprint

  22. arXiv:2311.12316  [pdf]

    cs.CV cs.AI eess.IV

    Generating Progressive Images from Pathological Transitions via Diffusion Model

    Authors: Zeyu Liu, Tianyi Zhang, Yufang He, Yunlu Feng, Yu Zhao, Guanglei Zhang

    Abstract: Deep learning is widely applied in computer-aided pathological diagnosis, which alleviates the pathologist workload and provide timely clinical analysis. However, most models generally require large-scale annotated data for training, which faces challenges due to the sampling and annotation scarcity in pathological images. The rapid developing generative models shows potential to generate more tra…

    Submitted 9 March, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: 13 pages, 9 figs, 4 tabs

  23. arXiv:2311.06712  [pdf, other]

    eess.IV

    PuzzleTuning: Explicitly Bridge Pathological and Natural Image with Puzzles

    Authors: Tianyi Zhang, Shangqing Lyu, Yanli Lei, Sicheng Chen, Nan Ying, Yufang He, Yu Zhao, Yunlu Feng, Hwee Kuan Lee, Guanglei Zhang

    Abstract: Pathological image analysis is a crucial field in computer vision. Due to the annotation scarcity in the pathological field, pre-training with self-supervised learning (SSL) is widely applied to learn on unlabeled images. However, the current SSL-based pathological pre-training: (1) does not explicitly explore the essential focuses of the pathological field, and (2) does not effectively bridge wit…

    Submitted 22 April, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Comments: 13 pages, 9 figures, 8 tables

  24. arXiv:2310.17346  [pdf, ps, other]

    eess.IV

    Extended Signaling Methods for Reduced Video Decoder Power Consumption Using Green Metadata

    Authors: Christian Herglotz, Matthias Kränzler, Xixue Chu, Edouard Francois, Yong He, André Kaup

    Abstract: In this paper, we discuss one aspect of the latest MPEG standard edition on energy-efficient media consumption, also known as Green Metadata (ISO/IEC 23001-11), which is the interactive signaling for remote decoder-power reduction for peer-to-peer video conferencing. In this scenario, the receiver of a video, e.g., a battery-driven portable device, can send a dedicated request to the sender which…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: 5 pages, 2 figures

  25. arXiv:2310.07869  [pdf, ps, other]

    eess.SP

    Kronecker-structured Sparse Vector Recovery with Application to IRS-MIMO Channel Estimation

    Authors: Yanbin He, Geethu Joseph

    Abstract: This paper studies the problem of Kronecker-structured sparse vector recovery from an underdetermined linear system with a Kronecker-structured dictionary. Such a problem arises in many real-world applications such as the sparse channel estimation of an intelligent reflecting surface-aided multiple-input multiple-output system. The prior art only exploits the Kronecker structure in the support of…

    Submitted 16 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

  26. arXiv:2310.04681  [pdf, other]

    cs.SD cs.AI eess.AS

    VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model

    Authors: Yayun He, Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao

    Abstract: Speaker verification (SV) performance deteriorates as utterances become shorter. To this end, we propose a new architecture called VoiceExtender which provides a promising solution for improving SV performance when handling short-duration speech signals. We use two guided diffusion models, the built-in and the external speaker embedding (SE) guided diffusion model, both of which utilize a diffusio…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: Accepted by the 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023)

  27. arXiv:2310.04114  [pdf, other]

    eess.IV cs.CV

    Aorta Segmentation from 3D CT in MICCAI SEG.A. 2023 Challenge

    Authors: Andriy Myronenko, Dong Yang, Yufan He, Daguang Xu

    Abstract: Aorta provides the main blood supply of the body. Screening of aorta with imaging helps for early aortic disease detection and monitoring. In this work, we describe our solution to the Segmentation of the Aorta (SEG.A.231) from 3D CT challenge. We use automated segmentation method Auto3DSeg available in MONAI. Our solution achieves an average Dice score of 0.920 and 95th percentile of the Hausdorf…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: MICCAI 2023, SEG.A. 2023 challenge 1st place

  28. arXiv:2309.16210  [pdf, other]

    eess.IV cs.CV cs.LG

    Abdominal multi-organ segmentation in CT using Swinunter

    Authors: Mingjin Chen, Yongkang He, Yongyi Lu

    Abstract: Abdominal multi-organ segmentation in computed tomography (CT) is crucial for many clinical applications including disease detection and treatment planning. Deep learning methods have shown unprecedented performance in this perspective. However, it is still quite challenging to accurately segment different organs utilizing a single network due to the vague boundaries of organs, the complex backgro…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 8 pages. arXiv admin note: text overlap with arXiv:2201.01266 by other authors

  29. arXiv:2309.12963  [pdf, ps, other]

    eess.AS cs.SD

    Massive End-to-end Models for Short Search Queries

    Authors: Weiran Wang, Rohit Prabhavalkar, Dongseong Hwang, Qiujia Li, Khe Chai Sim, Bo Li, James Qin, Xingyu Cai, Adam Stooke, Zhong Meng, CJ Zheng, Yanzhang He, Tara Sainath, Pedro Moreno Mengibar

    Abstract: In this work, we investigate two popular end-to-end automatic speech recognition (ASR) models, namely Connectionist Temporal Classification (CTC) and RNN-Transducer (RNN-T), for offline recognition of voice search queries, with up to 2B model parameters. The encoders of our models use the neural architecture of Google's universal speech model (USM), with additional funnel pooling layers to signifi…

    Submitted 22 September, 2023; originally announced September 2023.

  30. arXiv:2309.04780  [pdf, other]

    cs.CV eess.IV

    Latent Degradation Representation Constraint for Single Image Deraining

    Authors: Yuhong He, Long Peng, Lu Wang, Jun Cheng

    Abstract: Since rain streaks show a variety of shapes and directions, learning the degradation representation is extremely challenging for single image deraining. Existing methods are mainly targeted at designing complicated modules to implicitly learn latent degradation representation from coupled rainy images. This way, it is hard to decouple the content-independent degradation representation due to the l…

    Submitted 18 January, 2024; v1 submitted 9 September, 2023; originally announced September 2023.

    Comments: This paper is accepted to ICASSP 2024

  31. arXiv:2308.14602  [pdf]

    eess.SY cs.LG

    Recent Progress in Energy Management of Connected Hybrid Electric Vehicles Using Reinforcement Learning

    Authors: Min Hua, Bin Shuai, Quan Zhou, Jinhai Wang, Yinglong He, Hongming Xu

    Abstract: The growing adoption of hybrid electric vehicles (HEVs) presents a transformative opportunity for revolutionizing transportation energy systems. The shift towards electrifying transportation aims to curb environmental concerns related to fossil fuel consumption. This necessitates efficient energy management systems (EMS) to optimize energy efficiency. The evolution of EMS from HEVs to connected hy…

    Submitted 23 December, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

  32. arXiv:2308.06479  [pdf, other]

    cs.NI eess.SP

    mmHawkeye: Passive UAV Detection with a COTS mmWave Radar

    Authors: Jia Zhang, Xin Na, Rui Xi, Yimiao Sun, Yuan He

    Abstract: Small Unmanned Aerial Vehicles (UAVs) are becoming potential threats to security-sensitive areas and personal privacy. A UAV can shoot photos at height, but how to detect such an uninvited intruder is an open problem. This paper presents mmHawkeye, a passive approach for UAV detection with a COTS millimeter wave (mmWave) radar. mmHawkeye doesn't require prior knowledge of the type, motions, and fl…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: 9 pages, 14 figures, IEEE SECON2023

    ACM Class: C.2; J.3

  33. arXiv:2307.14719  [pdf, ps, other]

    eess.SP

    Bayesian Algorithms for Kronecker-structured Sparse Vector Recovery With Application to IRS-MIMO Channel Estimation

    Authors: Yanbin He, Geethu Joseph

    Abstract: We study the sparse recovery problem with an underdetermined linear system characterized by a Kronecker-structured dictionary and a Kronecker-supported sparse vector. We cast this problem into the sparse Bayesian learning (SBL) framework and rely on the expectation-maximization method for a solution. To this end, we model the Kronecker-structured support with a hierarchical Gaussian prior distribu…

    Submitted 1 August, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

  34. arXiv:2307.08388  [pdf, other]

    cs.CV eess.IV

    Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation

    Authors: Yaolei Qi, Yuting He, Xiaoming Qi, Yuan Zhang, Guanyu Yang

    Abstract: Accurate segmentation of topological tubular structures, such as blood vessels and roads, is crucial in various fields, ensuring accuracy and efficiency in downstream tasks. However, many factors complicate the task, including thin local structures and variable global morphologies. In this work, we note the specificity of tubular structures and use this knowledge to guide our DSCNet to simultaneou…

    Submitted 18 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  35. arXiv:2307.06344  [pdf, other]

    q-bio.QM cs.CV eess.IV

    The Whole Pathological Slide Classification via Weakly Supervised Learning

    Authors: Qiehe Sun, Jiawen Li, Jin Xu, Junru Cheng, Tian Guan, Yonghong He

    Abstract: Due to its superior efficiency in utilizing annotations and addressing gigapixel-sized images, multiple instance learning (MIL) has shown great promise as a framework for whole slide image (WSI) classification in digital pathology diagnosis. However, existing methods tend to focus on advanced aggregators with different structures, often overlooking the intrinsic features of H&E pathological slide…

    Submitted 12 July, 2023; originally announced July 2023.

  36. arXiv:2306.16473  [pdf]

    eess.SY

    Coordinating O&M and Logistical Resources to Enhance Post-Disaster Resilience of Interdependent Power and Natural Gas Distribution Systems

    Authors: Wei Wang, Kaigui Xie, Hongbin Wang, Tao Chen, Hongzhou Chen, Yufei He

    Abstract: Electric power and natural gas systems are becoming increasingly interdependent, driven by the growth of natural gas-fired generation and the electrification of the gas industry. Recent energy crises have underscored the urgent need for enhanced resilience in these interdependent systems. In response to this challenge, this paper focuses on the interdependent electric power and natural gas distrib…

    Submitted 25 June, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: 10 pages, 9 figures

  37. arXiv:2306.11876  [pdf, other]

    eess.IV cs.CV

    BMAD: Benchmarks for Medical Anomaly Detection

    Authors: Jinan Bao, Hanshi Sun, Hanqiu Deng, Yinsheng He, Zhaoxiang Zhang, Xingyu Li

    Abstract: Anomaly detection (AD) is a fundamental research problem in machine learning and computer vision, with practical applications in industrial inspection, video surveillance, and medical diagnosis. In medical imaging, AD is especially vital for detecting and diagnosing anomalies that may indicate rare diseases or conditions. However, there is a lack of a universal and fair benchmark for evaluating AD…

    Submitted 27 April, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

  38. arXiv:2306.09014  [pdf, other]

    eess.IV

    Geometric Wide-Angle Camera Calibration: A Review and Comparative Study

    Authors: Jianzhu Huai, Yuan Zhuang, Yuxin Shao, Grzegorz Jozkow, Binliang Wang, Yijia He, Alper Yilmaz

    Abstract: Wide-angle cameras are widely used in photogrammetry and autonomous systems which rely on the accurate metric measurements derived from images. To find the geometric relationship between incoming rays and image pixels, geometric camera calibration (GCC) has been actively developed. Aiming to provide practical calibration guidelines, this work surveys the existing GCC tools and evaluates the repres…

    Submitted 27 March, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: 18 pages, 12 figures

  39. arXiv:2306.07949  [pdf, other]

    eess.AS cs.AI cs.LG

    Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition

    Authors: Xianzhao Chen, Yist Y. Lin, Kang Wang, Yi He, Zejun Ma

    Abstract: End-to-end (E2E) systems have shown comparable performance to hybrid systems for automatic speech recognition (ASR). Word timings, as a by-product of ASR, are essential in many applications, especially for subtitling and computer-aided pronunciation training. In this paper, we improve the frame-level classifier for word timings in E2E system by introducing label priors in connectionist temporal cl…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: To appear in the proceedings of INTERSPEECH 2023

  40. arXiv:2306.07824  [pdf, ps, other]

    eess.IV

    JCCS-PFGM: A Novel Circle-Supervision based Poisson Flow Generative Model for Multiphase CECT Progressive Low-Dose Reconstruction with Joint Condition

    Authors: Rongjun Ge, Yuting He, Cong Xia, Yang Chen, Daoqiang Zhang, Ge Wang

    Abstract: Multiphase contrast-enhanced computed tomography (CECT) scan is clinically significant to demonstrate the anatomy at different phases. In practice, such a multiphase CECT scan inherently takes longer time and deposits much more radiation dose into a patient body than a regular CT scan, and reduction of the radiation dose typically compromise the CECT image quality and its diagnostic value. With Jo…

    Submitted 13 June, 2023; originally announced June 2023.

  41. arXiv:2306.01553  [pdf, other]

    eess.SP

    Detecting Low Pass Graph Signals via Spectral Pattern: Sampling Complexity and Applications

    Authors: Chenyue Zhang, Yiran He, Hoi-To Wai

    Abstract: This paper proposes a blind detection problem for low pass graph signals. Without assuming knowledge of the exact graph topology, we aim to detect if a set of graph signal observations are generated from a low pass graph filter. Our problem is motivated by the widely adopted assumption of low pass (a.k.a. smooth) signals required by many existing works in graph signal processing (GSP), as well as…

    Submitted 21 June, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 15 pages, 11 figures, accepted by IEEE Transactions on Signal Processing

  42. arXiv:2305.17749  [pdf, other

    cs.SD cs.AI eess.AS physics.data-an

    Bayesian inference and neural estimation of acoustic wave propagation

    Authors: Yongchao Huang, Yuhang He, Hong Ge

    Abstract: In this work, we introduce a novel framework which combines physics and machine learning methods to analyse acoustic signals. Three methods are developed for this task: a Bayesian inference approach for inferring the spectral acoustics characteristics, a neural-physical model which equips a neural network with forward and backward physical losses, and the non-linear least squares approach which se…

    Submitted 28 May, 2023; originally announced May 2023.

    MSC Class: 68T01 ACM Class: J.2

  43. arXiv:2305.16619  [pdf, other

    eess.AS

    2-bit Conformer quantization for automatic speech recognition

    Authors: Oleg Rybakov, Phoenix Meadowlark, Shaojin Ding, David Qiu, Jian Li, David Rim, Yanzhang He

    Abstract: Large speech models are rapidly gaining traction in the research community. As a result, model compression has become an important topic, so that these models can fit in memory and be served at reduced cost. Practical approaches for compressing automatic speech recognition (ASR) models use int8 or int4 weight quantization. In this study, we propose to develop 2-bit ASR models. We explore the impact o…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: submitted to Interspeech

  44. arXiv:2305.15536  [pdf, other

    eess.AS cs.LG

    RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models

    Authors: David Qiu, David Rim, Shaojin Ding, Oleg Rybakov, Yanzhang He

    Abstract: With the rapid increase in the size of neural networks, model compression has become an important area of research. Quantization is an effective technique for decreasing the model size, memory access, and compute load of large models. Despite recent advances in quantization aware training (QAT) techniques, most papers present evaluations that are focused on computer vision tasks, which have differen…

    Submitted 24 May, 2023; originally announced May 2023.

  45. arXiv:2305.13869  [pdf, other

    physics.acc-ph cs.AI cs.LG eess.SY

    Trend-Based SAC Beam Control Method with Zero-Shot in Superconducting Linear Accelerator

    Authors: Xiaolong Chen, Xin Qi, Chunguang Su, Yuan He, Zhijun Wang, Kunxiang Sun, Chao Jin, Weilong Chen, Shuhui Liu, Xiaoying Zhao, Duanyang Jia, Man Yi

    Abstract: The superconducting linear accelerator is a highly flexible facility for modern scientific discoveries, necessitating weekly reconfiguration and tuning. Accordingly, minimizing setup time proves essential in affording users ample experimental time. We propose a trend-based soft actor-critic (TBSAC) beam control method with strong robustness, allowing the agents to be trained in a simulated en…

    Submitted 25 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  46. arXiv:2305.10791  [pdf, other

    cs.CR eess.SY

    BrutePrint: Expose Smartphone Fingerprint Authentication to Brute-force Attack

    Authors: Yu Chen, Yiling He

    Abstract: Fingerprint authentication has been widely adopted on smartphones to complement traditional password authentication, making it a tempting target for attackers. The smartphone industry is fully aware of existing threats, and especially for the presentation attack studied by most prior works, the threats are nearly eliminated by liveness detection and attempt limit. In this paper, we study the seemi…

    Submitted 18 May, 2023; originally announced May 2023.

  47. arXiv:2305.10773  [pdf, ps, other

    eess.SP cs.AI

    Rate-Adaptive Coding Mechanism for Semantic Communications With Multi-Modal Data

    Authors: Yangshuo He, Guanding Yu, Yunlong Cai

    Abstract: Recently, the ever-increasing demand for bandwidth in multi-modal communication systems requires a paradigm shift. Powered by deep learning, semantic communications are applied to multi-modal scenarios to boost communication efficiency and save communication resources. However, the existing end-to-end neural network (NN) based framework without the channel encoder/decoder is incompatible with mode…

    Submitted 18 May, 2023; originally announced May 2023.

  48. arXiv:2304.14302  [pdf

    physics.app-ph eess.SY physics.optics

    In-memory photonic dot-product engine with electrically programmable weight banks

    Authors: Wen Zhou, Bowei Dong, Nikolaos Farmakidis, Xuan Li, Nathan Youngblood, Kairan Huang, Yuhan He, C. David Wright, Wolfram H. P. Pernice, Harish Bhaskaran

    Abstract: Electronically reprogrammable photonic circuits based on phase-change chalcogenides present an avenue to resolve the von-Neumann bottleneck; however, implementation of such hybrid photonic-electronic processing has not achieved computational success. Here, we achieve this milestone by demonstrating an in-memory photonic-electronic dot-product engine, one that decouples electronic programming of ph…

    Submitted 27 April, 2023; originally announced April 2023.

  49. Segment Anything in Medical Images

    Authors: Jun Ma, Yuting He, Feifei Li, Lin Han, Chenyu You, Bo Wang

    Abstract: Medical image segmentation is a critical component in clinical practice, facilitating accurate diagnosis, treatment planning, and disease monitoring. However, existing methods, often tailored to specific modalities or disease types, lack generalizability across the diverse spectrum of medical image segmentation tasks. Here we present MedSAM, a foundation model designed for bridging this gap by ena…

    Submitted 1 April, 2024; v1 submitted 24 April, 2023; originally announced April 2023.

    Journal ref: Nature Communications 15, 654 (2024)

  50. arXiv:2303.10342  [pdf, other

    eess.IV cs.CV

    Whole-slide-imaging Cancer Metastases Detection and Localization with Limited Tumorous Data

    Authors: Yinsheng He, Xingyu Li

    Abstract: Recently, various deep learning methods have shown significant successes in medical image analysis, especially in the detection of cancer metastases in hematoxylin and eosin (H&E) stained whole-slide images (WSIs). However, in order to obtain good performance, these research achievements rely on hundreds of well-annotated WSIs. In this study, we tackle the tumor localization and detection problem…

    Submitted 18 March, 2023; originally announced March 2023.

    Comments: 8 pages, 3 figures, 3 tables, 1 appendix