Showing 1–50 of 439 results for author: Chen, C

Searching in archive eess.
  1. arXiv:2407.00995  [pdf, other]

    cs.CY eess.SY physics.app-ph

    Data on the Move: Traffic-Oriented Data Trading Platform Powered by AI Agent with Common Sense

    Authors: Yi Yu, Shengyue Yao, Tianchen Zhou, Yexuan Fu, Jingru Yu, Ding Wang, Xuhong Wang, Cen Chen, Yilun Lin

    Abstract: In the digital era, data has become a pivotal asset, advancing technologies such as autonomous driving. Despite this, data trading faces challenges like the absence of robust pricing methods and the lack of trustworthy trading mechanisms. To address these challenges, we introduce a traffic-oriented data trading platform named Data on The Move (DTM), integrating traffic simulation, data trading, an…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.18069  [pdf, other]

    eess.SP cs.AI cs.CL

    Large Language Models for Cuffless Blood Pressure Measurement From Wearable Biosignals

    Authors: Zengding Liu, Chen Chen, Jiannong Cao, Minglei Pan, Jikui Liu, Nan Li, Fen Miao, Ye Li

    Abstract: Large language models (LLMs) have captured significant interest from both academia and industry due to their impressive performance across various textual tasks. However, the potential of LLMs to analyze physiological time-series data remains an emerging research field. Particularly, there is a notable gap in the utilization of LLMs for analyzing wearable biosignals to achieve cuffless blood press…

    Submitted 26 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.16910  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG q-bio.NC

    Mind's Eye: Image Recognition by EEG via Multimodal Similarity-Keeping Contrastive Learning

    Authors: Chi-Sheng Chen, Chun-Shu Wei

    Abstract: Decoding images from non-invasive electroencephalographic (EEG) signals has been a grand challenge in understanding how the human brain processes visual information in real-world scenarios. To cope with the issues of signal-to-noise ratio and nonstationarity, this paper introduces a MUltimodal Similarity-keeping contrastivE learning (MUSE) framework for zero-shot EEG-based image classification. We d…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 19 pages, 14 figures

  4. arXiv:2406.15885  [pdf, other]

    cs.SD cs.AI eess.AS

    The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models

    Authors: Jiajia Li, Lu Yang, Mingni Tang, Cong Chen, Zuchao Li, ** Wang, Hai Zhao

    Abstract: Benchmarks play a pivotal role in assessing the advancements of large language models (LLMs). While numerous benchmarks have been proposed to evaluate LLMs' capabilities, there is a notable absence of a dedicated benchmark for assessing their musical abilities. To address this gap, we present ZIQI-Eval, a comprehensive and large-scale music benchmark specifically designed to evaluate the music-rel…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL-Findings 2024

  5. arXiv:2406.12300  [pdf]

    eess.IV cs.CV q-bio.NC

    IR2QSM: Quantitative Susceptibility Mapping via Deep Neural Networks with Iterative Reverse Concatenations and Recurrent Modules

    Authors: Min Li, Chen Chen, Zhuang Xiong, Ying Liu, Pengfei Rong, Shanshan Shan, Feng Liu, Hongfu Sun, Yang Gao

    Abstract: Quantitative susceptibility mapping (QSM) is an MRI phase-based post-processing technique to extract the distribution of tissue susceptibilities, demonstrating significant potential in studying neurological diseases. However, the ill-conditioned nature of dipole inversion makes QSM reconstruction from the tissue field prone to noise and artifacts. In this work, we propose a novel deep learning-bas…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 10 pages, 9 figures

  6. arXiv:2406.09317  [pdf, other]

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, **ming Guo, Xiaolin Chen, **gcheng Wang, Yih Chung Tham , et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. For RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  7. arXiv:2406.09272  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS

    Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos

    Authors: Changan Chen, Puyuan Peng, Ami Baid, Zihui Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman

    Abstract: Generating realistic audio for human interactions is important for many applications, such as creating sound effects for films or virtual reality games. Existing approaches implicitly assume total correspondence between the video and audio during training, yet many sounds happen off-screen and have weak to no correspondence with the visuals -- resulting in uncontrolled ambient sounds or hallucinat…

    Submitted 20 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Project page: https://vision.cs.utexas.edu/projects/action2sound

  8. arXiv:2406.07854  [pdf, other]

    cs.SD cs.MM eess.AS

    Zero-Shot Fake Video Detection by Audio-Visual Consistency

    Authors: Xiaolou Li, Zehua Liu, Chen Chen, Lantian Li, Li Guo, Dong Wang

    Abstract: Recent studies have advocated the detection of fake videos as a one-class detection task, predicated on the hypothesis that the consistency between audio and visual modalities of genuine data is more significant than that of fake data. This methodology, which solely relies on genuine audio-visual data while negating the need for forged counterparts, is thus delineated as a 'zero-shot' detection pa…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: to be published in INTERSPEECH 2024

  9. arXiv:2406.04680  [pdf, other]

    eess.IV cs.CV

    MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome

    Authors: Yixin Huang, Yiqi **, Ke Tao, Kaijian Xia, Jianfeng Gu, Lei Yu, Lan Du, Cunjian Chen

    Abstract: May-Thurner Syndrome (MTS), also known as iliac vein compression syndrome or Cockett's syndrome, is a condition potentially impacting over 20 percent of the population, leading to an increased risk of iliofemoral deep venous thrombosis. In this paper, we present a 3D-based deep learning approach called MTS-Net for diagnosing May-Thurner Syndrome using CT scans. To effectively capture the spatial-t…

    Submitted 7 June, 2024; originally announced June 2024.

  10. arXiv:2406.00654  [pdf, other]

    cs.CL cs.SD eess.AS

    Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback

    Authors: Chen Chen, Yuchen Hu, Wen Wu, Helin Wang, Eng Siong Chng, Chao Zhang

    Abstract: In recent years, text-to-speech (TTS) technology has witnessed impressive advancements, particularly with large-scale training datasets, showcasing human-level speech quality and impressive zero-shot capabilities on unseen speakers. However, despite human subjective evaluations, such as the mean opinion score (MOS), remaining the gold standard for assessing the quality of synthetic speech, even st…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 19 pages, Preprint

  11. arXiv:2406.00329  [pdf, other]

    eess.IV cs.CV cs.LG

    Whole Heart 3D+T Representation Learning Through Sparse 2D Cardiac MR Images

    Authors: Yundi Zhang, Chen Chen, Suprosanna Shit, Sophie Starck, Daniel Rueckert, Jiazhen Pan

    Abstract: Cardiac Magnetic Resonance (CMR) imaging serves as the gold standard for evaluating cardiac morphology and function. Typically, a multi-view CMR stack, covering short-axis (SA) and 2/3/4-chamber long-axis (LA) views, is acquired for a thorough cardiac assessment. However, efficiently streamlining the complex, high-dimensional 3D+T CMR data and distilling compact, coherent representation remains a…

    Submitted 6 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

  12. arXiv:2405.16715  [pdf]

    eess.SP

    Coil Reweighting to Suppress Motion Artifacts in Real-Time Exercise Cine Imaging

    Authors: Chong Chen, Yingmin Liu, Yu Ding, Matthew Tong, Preethi Chandrasekaran, Christopher Crabtree, Syed M. Arshad, Yuchi Han, Rizwan Ahmad

    Abstract: Background: Accelerated real-time cine (RT-Cine) imaging enables cardiac function assessment without the need for breath-holding. However, when performed during in-magnet exercise, RT-Cine images may exhibit significant motion artifacts. Methods: By projecting the time-averaged images to the subspace spanned by the coil sensitivity maps, we propose a coil reweighting (CR) method to automatically s…

    Submitted 26 May, 2024; originally announced May 2024.

  13. arXiv:2405.14300  [pdf, other]

    eess.IV cs.CV

    Automatic diagnosis of cardiac magnetic resonance images based on semi-supervised learning

    Authors: Hejun Huang, Zuguo Chen, Yi Huang, Guangqiang Luo, Chaoyang Chen, Youzhi Song

    Abstract: Cardiac magnetic resonance imaging (MRI) is a pivotal tool for assessing cardiac function. Precise segmentation of cardiac structures is imperative for accurate cardiac functional evaluation. This paper introduces a semi-supervised model for automatic segmentation of cardiac images and auxiliary diagnosis. By harnessing cardiac MRI images and necessitating only a small portion of annotated image d…

    Submitted 23 May, 2024; originally announced May 2024.

  14. arXiv:2405.14161  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

    Authors: Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Chengwei Qin, Pin-Yu Chen, Eng Siong Chng, Chao Zhang

    Abstract: We propose an unsupervised adaptation framework, Self-TAught Recognizer (STAR), which leverages unlabeled data to enhance the robustness of automatic speech recognition (ASR) systems in diverse target domains, such as noise and accents. STAR is developed for prevalent speech foundation models based on Transformer-related architecture with auto-regressive decoding (e.g., Whisper, Canary). Specifica…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 23 pages, Preprint

  15. arXiv:2405.10828  [pdf, other]

    eess.SP cs.LG

    Analysis of Impulsive Interference in Digital Audio Broadcasting Systems in Electric Vehicles

    Authors: Chin-Hung Chen, Wen-Hung Huang, Boris Karanov, Alex Young, Yan Wu, Wim van Houtum

    Abstract: Recently, new types of interference in electric vehicles (EVs), such as converters switching and/or battery chargers, have been found to degrade the performance of wireless digital transmission systems. Measurements show that such interference is characterized by impulsive behavior and is widely varying in time. This paper uses recorded data from our EV testbed to analyze the impulsive interfer…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 44th Symposium on Information Theory and Signal Processing in the Benelux (SITB 2024), Delft, the Netherlands

  16. arXiv:2405.10825  [pdf, other]

    eess.SY cs.LG

    Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

    Authors: Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili **, Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Charlie Zhang, Xianbin Wang, Jiangchuan Liu

    Abstract: Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks bas…

    Submitted 17 May, 2024; originally announced May 2024.

  17. arXiv:2405.10814  [pdf, other]

    cs.IT cs.LG eess.SP

    Data-Driven Symbol Detection for Intersymbol Interference Channels with Bursty Impulsive Noise

    Authors: Boris Karanov, Chin-Hung Chen, Yan Wu, Alex Young, Wim van Houtum

    Abstract: We developed machine learning approaches for data-driven trellis-based soft symbol detection in coded transmission over intersymbol interference (ISI) channels in the presence of bursty impulsive noise (IN), for example encountered in wireless digital broadcasting systems and vehicular communications. This enabled us to obtain optimized detectors based on the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: This work has been submitted to the IEEE for possible publication

  18. arXiv:2405.10025  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models

    Authors: Yuchen Hu, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng, Ruizhe Li

    Abstract: Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which aims to predict the ground-truth transcription from the decoded N-best hypotheses. Thanks to the strong language generation ability of LLMs and rich information in the N-best list, GER shows great effectiveness in enhancing ASR results. However, it still suf…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 14 pages, Accepted by ACL 2024

  19. arXiv:2405.08838  [pdf, other]

    cs.SD cs.AI eess.AS

    PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset

    Authors: Yang Hou, Haitao Fu, Chuankai Chen, Zida Li, Haoyu Zhang, Jianjun Zhao

    Abstract: With the rapid advancement of generative AI, multimodal deepfakes, which manipulate both audio and visual modalities, have drawn increasing public concern. Currently, deepfake detection has emerged as a crucial strategy in countering these growing threats. However, as a key factor in training and validating deepfake detectors, most existing deepfake datasets primarily focus on the visual modality, an…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 13 pages, 4 figures

    MSC Class: 68T45 ACM Class: I.4.9

  20. arXiv:2405.02821  [pdf, other]

    cs.SD cs.AI cs.LG cs.RO eess.AS

    Sim2Real Transfer for Audio-Visual Navigation with Frequency-Adaptive Acoustic Field Prediction

    Authors: Changan Chen, Jordi Ramos, Anshul Tomar, Kristen Grauman

    Abstract: Sim2real transfer has received increasing attention lately due to the success of learning robotic tasks in simulation end-to-end. While there has been a lot of progress in transferring vision-based navigation policies, the existing sim2real strategy for audio-visual navigation performs data augmentation empirically without measuring the acoustic gap. The sound differs from light in that it spans a…

    Submitted 5 May, 2024; originally announced May 2024.

  21. arXiv:2404.17400  [pdf, other]

    cs.CV cs.AI eess.IV

    Spatial-frequency Dual-Domain Feature Fusion Network for Low-Light Remote Sensing Image Enhancement

    Authors: Zishu Yao, Guodong Fan, **fu Fan, Min Gan, C. L. Philip Chen

    Abstract: Low-light remote sensing images generally feature high resolution and high spatial complexity, with continuously distributed surface features in space. This continuity in scenes leads to extensive long-range correlations in spatial domains within remote sensing images. Convolutional Neural Networks, which rely on local correlations for long-distance modeling, struggle to establish long-range corre…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 14 pages

  22. arXiv:2404.16844  [pdf, other]

    cs.CV cs.LG eess.SP

    Sugarcane Health Monitoring With Satellite Spectroscopy and Machine Learning: A Review

    Authors: Ethan Kane Waters, Carla Chia-Ming Chen, Mostafa Rahimi Azghadi

    Abstract: Research into large-scale crop monitoring has flourished due to increased accessibility to satellite imagery. This review delves into previously unexplored and under-explored areas in sugarcane health monitoring and disease/pest detection using satellite-based spectroscopy and Machine Learning (ML). It discusses key considerations in system development, including relevant satellites, vegetation in…

    Submitted 12 February, 2024; originally announced April 2024.

    Comments: 22 pages, 6 figures, 3 tables

  23. arXiv:2404.16216  [pdf, other]

    cs.CV cs.RO cs.SD eess.AS

    ActiveRIR: Active Audio-Visual Exploration for Acoustic Environment Modeling

    Authors: Arjun Somayazulu, Sagnik Majumder, Changan Chen, Kristen Grauman

    Abstract: An environment acoustic model represents how sound is transformed by the physical characteristics of an indoor environment, for any given source/receiver location. Traditional methods for constructing acoustic models involve expensive and time-consuming collection of large quantities of acoustic data at dense spatial locations in the space, or rely on privileged knowledge of scene geometry to inte…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Project page: https://vision.cs.utexas.edu/projects/active_rir/

  24. arXiv:2404.13756  [pdf, other]

    eess.IV cs.CV

    BC-MRI-SEG: A Breast Cancer MRI Tumor Segmentation Benchmark

    Authors: Anthony Bilic, Chen Chen

    Abstract: Binary breast cancer tumor segmentation with Magnetic Resonance Imaging (MRI) data is typically trained and evaluated on private medical data, which makes comparing deep learning approaches difficult. We propose a benchmark (BC-MRI-SEG) for binary breast cancer tumor segmentation based on publicly available MRI datasets. The benchmark consists of four datasets in total, where two datasets are used…

    Submitted 2 June, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

  25. arXiv:2404.12979  [pdf, other]

    cs.SD cs.LG eess.AS

    TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition

    Authors: Chengxin Chen, Pengyuan Zhang

    Abstract: One persistent challenge in Speech Emotion Recognition (SER) is the ubiquitous environmental noise, which frequently results in diminished SER performance in practical use. In this paper, we introduce a Two-level Refinement Network, dubbed TRNet, to address this challenge. Specifically, a pre-trained speech enhancement module is employed for front-end noise reduction and noise level estimation. La…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 13 pages, 3 figures

  26. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  27. arXiv:2404.08607  [pdf, other]

    cs.IT eess.SP

    Learning-Based Joint Antenna Selection and Precoding Design for Cell-Free MIMO Networks

    Authors: Liangzhi Wang, Chen Chen, Carlo Fischione, Jie Zhang

    Abstract: This paper considers a downlink cell-free multiple-input multiple-output (MIMO) network in which multiple multi-antenna base stations (BSs) serve multiple users via coherent joint transmission. In order to reduce the energy consumption by radio frequency components, each BS selects a subset of antennas for downlink data transmission after estimating the channel state information (CSI). We aim to m…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  28. arXiv:2404.05206  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos

    Authors: Changan Chen, Kumar Ashutosh, Rohit Girdhar, David Harwath, Kristen Grauman

    Abstract: We propose a novel self-supervised embedding to learn how actions sound from narrated in-the-wild egocentric videos. Whereas existing methods rely on curated data with known audio-visual correspondence, our multimodal contrastive-consensus coding (MC3) embedding reinforces the associations between audio, language, and vision when all modality pairs agree, while diminishing those associations when…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024. Project page: https://vision.cs.utexas.edu/projects/soundingactions

  29. arXiv:2404.00951  [pdf, other]

    eess.IV

    Adapting CSI-Guided Imaging Across Diverse Environments: An Experimental Study Leveraging Continuous Learning

    Authors: Cheng Chen, Shoki Ohta, Takayuki Nishio, Mohamed Wahib

    Abstract: This study explores the feasibility of adapting CSI-guided imaging across varied environments. Focusing on continuous model learning through continuous updates, we investigate CSI-Imager's adaptability in dynamically changing settings, specifically transitioning from an office to an industrial environment. Unlike traditional approaches that may require retraining for new environments, our experime…

    Submitted 1 April, 2024; originally announced April 2024.

  30. arXiv:2403.14915  [pdf, other]

    math.OC eess.SY

    Network Learning with Directional Sign Patterns

    Authors: Anqi Dong, Can Chen, Tryphon T. Georgiou

    Abstract: Complex systems can be effectively modeled via graphs that encode networked interactions, where relations between entities or nodes are often quantified by signed edge weights, e.g., promotion/inhibition in gene regulatory networks, or encoding political or friendship differences in social networks. However, it is often the case that only an aggregate consequence of such edge weights that characte…

    Submitted 4 April, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    MSC Class: 62F15; 49Q22; 05Cxx; 92C42

  31. A GNN Approach for Cell-Free Massive MIMO

    Authors: Lou Salaun, Hong Yang, Shashwat Mishra, Chung Shue Chen

    Abstract: Beyond 5G wireless technology Cell-Free Massive MIMO (CFmMIMO) downlink relies on carefully designed precoders and power control to attain uniformly high rate coverage. Many such power control problems can be calculated via second order cone programming (SOCP). In practice, a numerical procedure several orders of magnitude faster is required because power control has to be rapidly updated to adapt to…

    Submitted 8 February, 2024; originally announced March 2024.

    Journal ref: GLOBECOM 2022 - 2022 IEEE Global Communications Conference, Dec 2022, Rio de Janeiro, Brazil. pp.3053-3058

  32. arXiv:2403.10581  [pdf, other]

    q-bio.QM cs.AI cs.CL cs.LG eess.SP

    Large Language Model-informed ECG Dual Attention Network for Heart Failure Risk Prediction

    Authors: Chen Chen, Lei Li, Marcel Beetz, Abhirup Banerjee, Ramneek Gupta, Vicente Grau

    Abstract: Heart failure (HF) poses a significant public health challenge, with a rising global mortality rate. Early detection and prevention of HF could significantly reduce its impact. We introduce a novel methodology for predicting HF risk using 12-lead electrocardiograms (ECGs). We present a novel, lightweight dual-attention ECG network designed to capture complex ECG features essential for early HF ris…

    Submitted 22 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Under journal revision

  33. arXiv:2403.08337  [pdf, other]

    eess.SY cs.AI cs.LG

    LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments

    Authors: Maonan Wang, Aoyu Pang, Yuheng Kan, Man-On Pun, Chung Shue Chen, Bo Huang

    Abstract: Traffic congestion in metropolitan areas presents a formidable challenge with far-reaching economic, environmental, and societal ramifications. Therefore, effective congestion management is imperative, with traffic signal control (TSC) systems being pivotal in this endeavor. Conventional TSC systems, designed upon rule-based algorithms or reinforcement learning (RL), frequently exhibit deficiencie…

    Submitted 12 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 20 pages, 11 figures

  34. arXiv:2403.01257  [pdf]

    eess.SY

    Secure and Scalable Network Slicing with Plug-and-Play Support for Power Distribution System Communication Networks

    Authors: Jian Zhong, Chen Chen, Yuqi Qian, Yiheng Bian, Yuxiong Huang, Zhaohong Bie

    Abstract: With the rapid development of power distribution systems (PDSs), the number of terminal devices and the types of delivered services involved are constantly growing. These trends make the operations of PDSs highly dependent on the support of advanced communication networks, which face two related challenges. The first is to provide sufficient flexibility, resilience, and security to meet varying de…

    Submitted 2 March, 2024; originally announced March 2024.

  35. arXiv:2403.01256  [pdf]

    eess.SY

    Resilient Microgrid Formation Considering Communication Interruptions

    Authors: Jian Zhong, Chen Chen, Young-Jin Kim, Yuxiong Huang, Mengjie Teng, Yiheng Bian, Zhaohong Bie

    Abstract: Distribution system (DS) communication failures following extreme events often degrade monitoring and control functions, thus preventing the acquisition of complete global DS component state information, on which existing post-disaster DS restoration methods are based. This letter proposes methods of inferring the states of DS components in the case of incomplete component state information. By us…

    Submitted 2 March, 2024; originally announced March 2024.

  36. arXiv:2403.01253  [pdf]

    eess.SY

    Strategic SDN-based Microgrid Formation for Managing Communication Failures in Distribution System Restoration

    Authors: Jian Zhong, Chen Chen, Zhaohong Bie, Mohammad Shahidehpour

    Abstract: Grid modernization has increased the reliance of power networks on cyber networks within distribution systems (DSs), heightening their vulnerability to disasters. Communication network failures significantly impede DS load recovery by diminishing observation and control. Prior research has largely ignored the need for integrated recovery of DS power and cyber networks' centralized control. Indeed,…

    Submitted 2 March, 2024; originally announced March 2024.

  37. arXiv:2403.01250  [pdf]

    eess.SY

    Resilient Mobile Energy Storage Resources Based Distribution Network Restoration in Interdependent Power-Transportation-Information Networks

    Authors: Jian Zhong, Chen Chen, Qiming Yang, Dafu Liu, Wentao Shen, Chenlin Ji, Zhaohong Bie

    Abstract: The interactions between power, transportation, and information networks (PTIN) are becoming more profound with the advent of smart city technologies. Existing mobile energy storage resource (MESR)-based power distribution network (PDN) restoration schemes often neglect the interdependencies among PTIN, so efficient PDN restoration cannot be achieved. This paper outlines the interacting factor…

    Submitted 2 March, 2024; originally announced March 2024.

  38. arXiv:2402.17877  [pdf, other]

    eess.SP eess.IV

    Accelerated Real-time Cine and Flow under In-magnet Staged Exercise

    Authors: Preethi Chandrasekaran, Chong Chen, Yingmin Liu, Syed Murtaza Arshad, Christopher Crabtree, Matthew Tong, Yuchi Han, Rizwan Ahmad

    Abstract: Background: Cardiovascular magnetic resonance imaging (CMR) is a well-established imaging tool for diagnosing and managing cardiac conditions. The integration of exercise stress with CMR (ExCMR) can enhance its diagnostic capacity. Despite recent advances in CMR technology, quantitative ExCMR during exercise remains technically challenging due to motion artifacts and limited spatial and temporal re…

    Submitted 21 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  39. arXiv:2402.16894  [pdf, other]

    q-bio.NC eess.IV

    Topological Analysis of Mouse Brain Vasculature via 3D Light-sheet Microscopy Images

    Authors: Jiachen Yao, Nina Hagemann, Qiaojie Xiong, Jianxu Chen, Dirk M. Hermann, Chao Chen

    Abstract: Vascular networks play a crucial role in understanding brain functionalities. Brain integrity and function, neuronal activity and plasticity, which are crucial for learning, are actively modulated by their local environments, specifically vascular networks. With recent developments in high-resolution 3D light-sheet microscopy imaging together with tissue processing techniques, it becomes feasible…

    Submitted 23 February, 2024; originally announced February 2024.

  40. arXiv:2402.09463  [pdf]

    eess.IV

    Multi-Center Fetal Brain Tissue Annotation (FeTA) Challenge 2022 Results

    Authors: Kelly Payette, Céline Steger, Roxane Licandro, Priscille de Dumast, Hongwei Bran Li, Matthew Barkovich, Liu Li, Maik Dannecker, Chen Chen, Cheng Ouyang, Niccolò McConnell, Alina Miron, Yongmin Li, Alena Uus, Irina Grigorescu, Paula Ramirez Gilliland, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Haoyu Wang, Ziyan Huang, ** Ye, Mireia Alenyà, Valentin Comte, Oscar Camara , et al. (42 additional authors not shown)

    Abstract: Segmentation is a critical step in analyzing the developing human fetal brain. There have been vast improvements in automatic segmentation methods in the past several years, and the Fetal Brain Tissue Annotation (FeTA) Challenge 2021 helped to establish an excellent standard of fetal brain segmentation. However, FeTA 2021 was a single center study, and the generalizability of algorithms across dif…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Results from FeTA Challenge 2022, held at MICCAI; Manuscript submitted. Supplementary Info (including submission methods descriptions) available here: https://zenodo.org/records/10628648

  41. arXiv:2402.09245  [pdf, other]

    eess.AS cs.LG eess.SP

    Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality

    Authors: Christian Marinoni, Riccardo Fosco Gramaccioni, Changan Chen, Aurelio Uncini, Danilo Comminiello

    Abstract: The primary goal of the L3DAS23 Signal Processing Grand Challenge at ICASSP 2023 is to promote and support collaborative research on machine learning for 3D audio signal processing, with a specific emphasis on 3D speech enhancement and 3D Sound Event Localization and Detection in Extended Reality applications. As part of our latest competition, we provide a brand-new dataset, which maintains the s…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted to 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)

  42. arXiv:2402.06894  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

    Authors: Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng

    Abstract: Recent advances in large language models (LLMs) have stepped forward the development of multilingual speech and machine translation by its reduced representation errors and incorporated external knowledge. However, both translation tasks typically utilize beam search decoding and top-1 hypothesis selection for inference. These techniques struggle to fully exploit the rich information in the divers…

    Submitted 16 May, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: 18 pages, Accepted by ACL 2024. This work is open sourced at: https://github.com/YUCHEN005/GenTranslate

  43. arXiv:2402.05457  [pdf, other]

    cs.CL cs.AI cs.MM cs.SD eess.AS

    It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

    Authors: Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, Ensiong Chng, Chao-Han Huck Yang

    Abstract: Recent studies have successfully shown that large language models (LLMs) can be successfully used for generative error correction (GER) on top of the automatic speech recognition (ASR) output. Specifically, an LLM is utilized to carry out a direct mapping from the N-best hypotheses list generated by an ASR system to the predicted output transcription. However, despite its effectiveness, GER introd…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted to ICLR 2024, 17 pages. This work will be open sourced under MIT license

  44. arXiv:2402.03897  [pdf, other]

    eess.SY

    Robust Data-EnablEd Predictive Leading Cruise Control via Reachability Analysis

    Authors: Shuai Li, Chaoyi Chen, Haotian Zheng, Jiawei Wang, Qing Xu, Keqiang Li

    Abstract: Data-driven predictive control promises model-free wave-dampening strategies for Connected and Autonomous Vehicles (CAVs) in mixed traffic flow. However, its performance relies on data quality, which suffers from unknown noise and disturbances. This paper introduces a Robust Data-EnablEd Predictive Leading Cruise Control (RDeeP-LCC) method based on reachability analysis, aiming to achieve safe and…

    Submitted 14 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: 8 pages, 4 figures

  45. arXiv:2401.15525  [pdf, other]

    eess.SY cs.GT

    Multi-Interval Energy-Reserve Co-Optimization with SoC-Dependent Bids from Battery Storage

    Authors: Cong Chen, Siying Li, Lang Tong

    Abstract: We consider the problem of co-optimized energy-reserve market clearing with state-of-charge (SoC) dependent bids from battery storage participants. While SoC-dependent bidding accurately captures storage's degradation and opportunity costs, such bids result in a non-convex optimization in the market clearing process. More challenging is the regulation reserve capacity clearing, where the SoC-depen…

    Submitted 27 January, 2024; originally announced January 2024.

  46. arXiv:2401.12645  [pdf, ps, other]

    cs.IT cs.LG eess.SP

    On the Robustness of Deep Learning-aided Symbol Detectors to Varying Conditions and Imperfect Channel Knowledge

    Authors: Chin-Hung Chen, Boris Karanov, Wim van Houtum, Wu Yan, Alex Young, Alex Alvarado

    Abstract: Recently, a data-driven Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm tailored to channels with intersymbol interference has been introduced. This so-called BCJRNet algorithm utilizes neural networks to calculate channel likelihoods. BCJRNet has demonstrated resilience against inaccurate channel tap estimations when applied to a time-invariant channel with ideal exponential decay profiles. However, it…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted paper at IEEE Wireless Communications and Networking Conference (WCNC) 2024

  47. arXiv:2401.10446  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

    Authors: Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng

    Abstract: Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which leverages the rich linguistic knowledge and powerful reasoning ability of LLMs to improve recognition results. The latest work proposes a GER benchmark with HyPoradise dataset to learn the mapping from ASR N-best hypotheses to ground-truth transcription by e…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted to ICLR 2024, Spotlight top 5%, 24 pages. This work will be open sourced at: https://github.com/YUCHEN005/RobustGER under MIT license

  48. arXiv:2401.03799  [pdf, other]

    eess.SY

    Safe Chance-constrained Model Predictive Control under Gaussian Mixture Model Uncertainty

    Authors: Kai Ren, Colin Chen, Hyeontae Sung, Heejin Ahn, Ian Mitchell, Maryam Kamgarpour

    Abstract: We present a chance-constrained model predictive control (MPC) framework under Gaussian mixture model (GMM) uncertainty. Specifically, we consider the uncertainty that arises from predicting future behaviors of moving obstacles, which may exhibit multiple modes (for example, turning left or right). To address the multi-modal uncertainty distribution, we propose three MPC formulations: nominal chan…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: 13 pages, 10 figures, submitted to "TCST SI: Intelligent Decision Making, Planning and Control of Automated Vehicles"

  49. arXiv:2401.02023  [pdf, other]

    eess.SY math.NA math.OC

    On Complexity of Stability Analysis in Higher-order Ecological Networks through Tensor Decompositions

    Authors: Anqi Dong, Can Chen

    Abstract: Complex ecological networks are often characterized by intricate interactions that extend beyond pairwise relationships. Understanding the stability of higher-order ecological networks is salient for species coexistence, biodiversity, and community persistence. In this article, we present complexity analyses for determining the linear stability of higher-order ecological networks through tensor de…

    Submitted 3 April, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: 6 pages, 3 figures

    MSC Class: 93Dxx; 05C65; 92D40; 15A69

  50. arXiv:2401.01608  [pdf]

    eess.SP cs.NI math.OC

    Interference Management in 5G and Beyond Networks

    Authors: Nessrine Trabelsi, Lamia Chaari Fourati, Chung Shue Chen

    Abstract: During the last decade, wireless data services have had an incredible impact on people's lives in ways we could never have imagined. The number of mobile devices has increased exponentially and data traffic has almost doubled every year. Undoubtedly, the rate of growth will continue to be rapid with the explosive increase in demands for data rates, latency, massive connectivity, network reliabilit…

    Submitted 3 January, 2024; originally announced January 2024.