Showing 1–50 of 113 results for author: Luo, J

Searching in archive eess.
  1. arXiv:2406.18270  [pdf, other]

    cs.IT cs.NI eess.SY

    Exploiting Data Significance in Remote Estimation of Discrete-State Markov Sources

    Authors: Jiping Luo, Nikolaos Pappas

    Abstract: We consider the semantics-aware remote estimation of a discrete-state Markov source with normal (low-priority) and alarm (high-priority) states. Erroneously announcing a normal state at the destination when the source is actually in an alarm state (i.e., missed alarm error) incurs a significantly higher cost than falsely announcing an alarm state when the source is in a normal state (i.e., false a…

    Submitted 26 June, 2024; originally announced June 2024.

  2. Efficient Beamforming Feedback Information-Based Wi-Fi Sensing by Feature Selection

    Authors: Xin Li, Jingzhi Hu, Jun Luo

    Abstract: Wi-Fi sensing leveraging plain-text beamforming feedback information (BFI) in multiple-input-multiple-output (MIMO) systems attracts increasing attention. However, due to the implicit relationship between BFI and the channel state information (CSI), quantifying the sensing capability of BFI poses a challenge in building efficient BFI-based sensing algorithms. In this letter, we first derive a math…

    Submitted 9 June, 2024; originally announced June 2024.

  3. arXiv:2405.20064  [pdf, other]

    eess.AS cs.SD

    1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem

    Authors: Mingjie Chen, Hezhao Zhang, Yuanchao Li, Jiachen Luo, Wen Wu, Ziyang Ma, Peter Bell, Catherine Lai, Joshua Reiss, Lin Wang, Philip C. Woodland, Xie Chen, Huy Phan, Thomas Hain

    Abstract: Speech emotion recognition is a challenging classification task with natural emotional speech, especially when the distribution of emotion types is imbalanced in the training and test data. In this case, it is more difficult for a model to learn to separate minority classes, resulting in those sometimes being ignored or frequently misclassified. Previous work has utilised class weighted loss for t…

    Submitted 30 May, 2024; originally announced May 2024.

  4. arXiv:2405.06299  [pdf, other]

    eess.SP cs.AI

    Cross-domain Learning Framework for Tracking Users in RIS-aided Multi-band ISAC Systems with Sparse Labeled Data

    Authors: Jingzhi Hu, Dusit Niyato, Jun Luo

    Abstract: Integrated sensing and communications (ISAC) is pivotal for 6G communications and is boosted by the rapid development of reconfigurable intelligent surfaces (RISs). Using the channel state information (CSI) across multiple frequency bands, RIS-aided multi-band ISAC systems can potentially track users' positions with high precision. Though tracking with CSI is desirable as no communication overhead…

    Submitted 10 May, 2024; originally announced May 2024.

  5. arXiv:2404.04848  [pdf, other]

    eess.IV cs.AI cs.CV

    Task-Aware Encoder Control for Deep Video Compression

    Authors: Xingtong Ge, Jixiang Luo, Xinjie Zhang, Tongda Xu, Guo Lu, Dailan He, Jing Geng, Yan Wang, Jun Zhang, Hongwei Qin

    Abstract: Prior research on deep video compression (DVC) for machine tasks typically necessitates training a unique codec for each specific task, mandating a dedicated decoder per task. In contrast, traditional video codecs employ a flexible encoder controller, enabling the adaptation of a single codec to different tasks through mechanisms like mode prediction. Drawing inspiration from this, we introduce an…

    Submitted 20 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  6. arXiv:2403.16855  [pdf, other]

    eess.SY cs.IT cs.LG cs.NI

    Semantic-Aware Remote Estimation of Multiple Markov Sources Under Constraints

    Authors: Jiping Luo, Nikolaos Pappas

    Abstract: This paper studies semantic-aware communication for remote estimation of multiple Markov sources over a lossy and rate-constrained channel. Unlike most existing studies that treat all source states equally, we exploit the semantics of information and consider that the remote actuator has different tolerances for the estimation errors of different states. We aim to find an optimal scheduling policy…

    Submitted 25 March, 2024; originally announced March 2024.

  7. arXiv:2403.13030  [pdf, other]

    eess.IV cs.CV

    Super-High-Fidelity Image Compression via Hierarchical-ROI and Adaptive Quantization

    Authors: Jixiang Luo, Yan Wang, Hongwei Qin

    Abstract: Learned Image Compression (LIC) has achieved dramatic progress regarding objective and subjective metrics. MSE-based models aim to improve objective metrics while generative models are leveraged to improve visual quality measured by subjective metrics. However, they all suffer from blurring or deformation at low bit rates, especially at below $0.2bpp$. Besides, deformation on human faces and text…

    Submitted 20 May, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  8. arXiv:2402.09871  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music

    Authors: Zihao Wang, Shuyu Li, Tao Zhang, Qi Wang, Pengfei Yu, Jinyang Luo, Yan Liu, Ming Xi, Kejun Zhang

    Abstract: The rapidly evolving multimodal Large Language Models (LLMs) urgently require new benchmarks to uniformly evaluate their performance on understanding and textually describing music. However, due to semantic gaps between Music Information Retrieval (MIR) algorithms and human understanding, discrepancies between professionals and the public, and low precision of annotations, existing music descripti…

    Submitted 13 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: Accepted by International Joint Conference on Artificial Intelligence 2024 (IJCAI 2024)

    MSC Class: 68Txx(Primary)14F05; 91Fxx(Secondary) ACM Class: I.2.7; J.5

  9. arXiv:2402.03919  [pdf, other]

    cs.IT eess.SP

    Sensing Mutual Information with Random Signals in Gaussian Channels: Bridging Sensing and Communication Metrics

    Authors: Lei Xie, Fan Liu, Jiajin Luo, Shenghui Song

    Abstract: Sensing performance is typically evaluated by classical radar metrics, such as Cramer-Rao bound and signal-to-clutter-plus-noise ratio. The recent development of the integrated sensing and communication (ISAC) framework motivated the efforts to unify the performance metric for sensing and communication, where mutual information (MI) was proposed as a sensing performance metric with deterministic s…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2311.07081

  10. arXiv:2402.01186  [pdf, other]

    eess.IV cs.CV

    Ambient-Pix2PixGAN for Translating Medical Images from Noisy Data

    Authors: Wentao Chen, Xichen Xu, Jie Luo, Weimin Zhou

    Abstract: Image-to-image translation is a common task in computer vision and has been rapidly increasing the impact on the field of medical imaging. Deep learning-based methods that employ conditional generative adversarial networks (cGANs), such as Pix2PixGAN, have been extensively explored to perform image-to-image translation tasks. However, when noisy medical image data are considered, such methods cann…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: SPIE Medical Imaging 2024

  11. arXiv:2311.07346  [pdf, other]

    eess.SY cs.IT cs.NI

    Goal-oriented Estimation of Multiple Markov Sources in Resource-constrained Systems

    Authors: Jiping Luo, Nikolaos Pappas

    Abstract: This paper investigates goal-oriented communication for remote estimation of multiple Markov sources in resource-constrained networks. An agent decides the updating times of the sources and transmits the packet to a remote destination over an unreliable channel with delay. The destination is tasked with source reconstruction for actuation. We utilize the metric \textit{cost of actuation error} (CA…

    Submitted 3 June, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted to be presented at the IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC) 2024

  12. arXiv:2311.03175  [pdf]

    eess.IV cs.CV

    Frequency Domain Decomposition Translation for Enhanced Medical Image Translation Using GANs

    Authors: Zhuhui Wang, Jianwei Zuo, Xuliang Deng, Jiajia Luo

    Abstract: Medical Image-to-image translation is a key task in computer vision and generative artificial intelligence, and it is highly applicable to medical image analysis. GAN-based methods are the mainstream image translation methods, but they often ignore the variation and distribution of images in the frequency domain, or only take simple measures to align high-frequency information, which can lead to d…

    Submitted 6 November, 2023; originally announced November 2023.

  13. arXiv:2310.06336  [pdf, other]

    eess.SP eess.SY

    HoloFed: Environment-Adaptive Positioning via Multi-band Reconfigurable Holographic Surfaces and Federated Learning

    Authors: Jingzhi Hu, Zhe Chen, Tianyue Zheng, Robert Schober, Jun Luo

    Abstract: Positioning is an essential service for various applications and is expected to be integrated with existing communication infrastructures in 5G and 6G. Though current Wi-Fi and cellular base stations (BSs) can be used to support this integration, the resulting precision is unsatisfactory due to the lack of precise control of the wireless signals. Recently, BSs adopting reconfigurable holographic s…

    Submitted 10 October, 2023; originally announced October 2023.

  14. A Computationally Efficient Bi-level Coordination Framework for CAVs at Unsignalized Intersections

    Authors: Jiping Luo, Tingting Zhang, Qinyu Zhang

    Abstract: In this paper, we investigate cooperative vehicle coordination for connected and automated vehicles (CAVs) at unsignalized intersections. To support high traffic throughput while reducing computational complexity, we present a novel collision region model and decompose the optimal coordination problem into two sub-problems: \textit{centralized} priority scheduling and \textit{distributed} trajecto…

    Submitted 21 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

  15. arXiv:2309.04672  [pdf, other]

    eess.IV cs.CV

    SSHNN: Semi-Supervised Hybrid NAS Network for Echocardiographic Image Segmentation

    Authors: Renqi Chen, Jingjing Luo, Fan Nian, Yuhui Cen, Yiheng Peng, Zekuan Yu

    Abstract: Accurate medical image segmentation especially for echocardiographic images with unmissable noise requires elaborate network design. Compared with manual design, Neural Architecture Search (NAS) realizes better segmentation results due to larger search space and automatic optimization, but most of the existing methods are weak in layer-wise feature aggregation and adopt a ``strong encoder, weak de…

    Submitted 27 December, 2023; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: Accepted by ICASSP2024

  16. arXiv:2308.13789  [pdf]

    eess.SP

    Sensiverse: A dataset for ISAC study

    Authors: Jiajin Luo, Baojian Zhou, Yang Yu, Ping Zhang, Xiaohui Peng, Jianglei Ma, Peiying Zhu, Jianmin Lu, Wen Tong

    Abstract: In order to address the lack of applicable channel models for ISAC research and evaluation, we release Sensiverse, a dataset that can be used for ISAC research. In this paper, we present the method of generating Sensiverse, including the acquisition and formatting of the 3D scene models, the generation of the channel data and associations with Tx/Rx deployment. The file structure and usage of the…

    Submitted 26 August, 2023; originally announced August 2023.

  17. arXiv:2308.13287  [pdf, other]

    eess.IV

    Efficient Learned Lossless JPEG Recompression

    Authors: Lina Guo, Yuanyuan Wang, Tongda Xu, Jixiang Luo, Dailan He, Zhenjun Ji, Shanshan Wang, Yang Wang, Hongwei Qin

    Abstract: JPEG is one of the most popular image compression methods. It is beneficial to compress those existing JPEG files without introducing additional distortion. In this paper, we propose a deep learning based method to further compress JPEG images losslessly. Specifically, we propose a Multi-Level Parallel Conditional Modeling (ML-PCM) architecture, which enables parallel decoding in different granula…

    Submitted 25 August, 2023; originally announced August 2023.

  18. arXiv:2307.11757  [pdf, other]

    eess.IV cs.CV

    Improving Video Colorization by Test-Time Tuning

    Authors: Yaping Zhao, Haitian Zheng, Jiebo Luo, Edmund Y. Lam

    Abstract: With the advancements in deep learning, video colorization by propagating color information from a colorized reference frame to a monochrome video sequence has been well explored. However, the existing approaches often suffer from overfitting the training dataset and sequentially lead to suboptimal performance on colorizing testing samples. To address this issue, we propose an effective method, wh…

    Submitted 25 June, 2023; originally announced July 2023.

    Comments: 5 pages, 4 figures

  19. arXiv:2307.05386  [pdf, other]

    eess.SP physics.optics

    Exploring the Potential of Integrated Optical Sensing and Communication (IOSAC) Systems with Si Waveguides for Future Networks

    Authors: Xiangpeng Ou, Ying Qiu, Ming Luo, Fujun Sun, Peng Zhang, Gang Yang, Junjie Li, Jianfeng Gao, Xiaobin He, Anyan Du, Bo Tang, Bin Li, Zichen Liu, Zhihua Li, Ling Xie, Xi Xiao, Jun Luo, Wenwu Wang, ** Tao, Yan Yang

    Abstract: Advanced silicon photonic technologies enable integrated optical sensing and communication (IOSAC) in real time for the emerging application requirements of simultaneous sensing and communication for next-generation networks. Here, we propose and demonstrate the IOSAC system on the silicon nitride (SiN) photonics platform. The IOSAC devices based on microring resonators are capable of monitoring t…

    Submitted 27 June, 2023; originally announced July 2023.

    Comments: 11 pages, 5 figures

  20. arXiv:2306.16556  [pdf, other]

    eess.IV cs.CV

    Inter-Rater Uncertainty Quantification in Medical Image Segmentation via Rater-Specific Bayesian Neural Networks

    Authors: Qingqiao Hu, Hao Wang, Jing Luo, Yunhao Luo, Zhiheng Zhangg, Jan S. Kirschke, Benedikt Wiestler, Bjoern Menze, Jianguo Zhang, Hongwei Bran Li

    Abstract: Automated medical image segmentation inherently involves a certain degree of uncertainty. One key factor contributing to this uncertainty is the ambiguity that can arise in determining the boundaries of a target region of interest, primarily due to variations in image appearance. On top of this, even among experts in the field, different opinions can emerge regarding the precise definition of spec…

    Submitted 25 August, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: submitted to a journal for review

  21. arXiv:2306.15753  [pdf]

    physics.soc-ph eess.SY

    Integrated Simulation Platform for Quantifying the Traffic-Induced Environmental and Health Impacts

    Authors: Xuanpeng Zhao, Guoyuan Wu, Akula Venkatram, Ji Luo, Peng Hao, Kanok Boriboonsomsin, Shaohua Hu

    Abstract: Air quality and human exposure to mobile source pollutants have become major concerns in urban transportation. Existing studies mainly focus on mitigating traffic congestion and reducing carbon footprints, with limited understanding of traffic-related health impacts from the environmental justice perspective. To address this gap, we present an innovative integrated simulation platform that models…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 35 pages, 11 figures

  22. arXiv:2306.13843  [pdf, other]

    cs.CV eess.IV

    Score-based Generative Models for Photoacoustic Image Reconstruction with Rotation Consistency Constraints

    Authors: Shangqing Tong, Hengrong Lan, Liming Nie, Jianwen Luo, Fei Gao

    Abstract: Photoacoustic tomography (PAT) is a newly emerged imaging modality which enables both high optical contrast and acoustic depth of penetration. Reconstructing images of photoacoustic tomography from a limited amount of sensor data is one of the major challenges in photoacoustic imaging. Previous works based on deep learning were trained in supervised fashion, which directly map the input partia…

    Submitted 23 June, 2023; originally announced June 2023.

  23. arXiv:2305.16789  [pdf, other]

    cs.LG cs.CV eess.SP

    Modulate Your Spectrum in Self-Supervised Learning

    Authors: Xi Weng, Yunhao Ni, Tengwei Song, Jie Luo, Rao Muhammad Anwer, Salman Khan, Fahad Shahbaz Khan, Lei Huang

    Abstract: Whitening loss offers a theoretical guarantee against feature collapse in self-supervised learning (SSL) with joint embedding architectures. Typically, it involves a hard whitening approach, transforming the embedding and applying loss to the whitened output. In this work, we introduce Spectral Transformation (ST), a framework to modulate the spectrum of embedding and to seek for functions beyond…

    Submitted 21 January, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted at ICLR 2024. The code is available at https://github.com/winci-ai/intl

  24. arXiv:2305.06459  [pdf, other]

    eess.SP cs.GR cs.HC eess.IV q-bio.NC

    SlicerTMS: Real-Time Visualization of Transcranial Magnetic Stimulation for Mental Health Treatment

    Authors: Loraine Franke, Tae Young Park, Jie Luo, Yogesh Rathi, Steve Pieper, Lipeng Ning, Daniel Haehn

    Abstract: We present a real-time visualization system for Transcranial Magnetic Stimulation (TMS), a non-invasive neuromodulation technique for treating various brain disorders and mental health diseases. Our solution targets the current challenges of slow and labor-intensive practices in treatment planning. Integrating Deep Learning (DL), our system rapidly predicts electric field (E-field) distributions i…

    Submitted 12 March, 2024; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: 11 pages, 4 figures, 2 tables, MICCAI

  25. arXiv:2304.06883  [pdf, other]

    cs.IT eess.SP

    Intelligent Reflecting Surface Aided Wireless Communication Systems: Joint Location and Passive Beamforming Design

    Authors: Jintao Luo, Sixing Yin

    Abstract: Intelligent reflecting surface (IRS) has been widely studied in recent years, it has emerged as a new technology which can reflect the incident signal by intelligently configuring the reflection elements, thus changing the signal propagation environment, enhancing the signals users desire and suppressing the interference between users. In this paper, we study an IRS aided multi-users wireless comm…

    Submitted 5 May, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: Following the publication of our work, we identified errors in our data analysis process. To uphold the standards of academic integrity and the accuracy of our findings, we feel it necessary to withdraw the current version of our paper. We plan to submit a revised version upon thorough review and correction of these errors

  26. arXiv:2303.15686  [pdf, other]

    eess.SP eess.SY

    Multi-band Reconfigurable Holographic Surface Based ISAC Systems: Design and Optimization

    Authors: Jingzhi Hu, Zhe Chen, Jun Luo

    Abstract: Metamaterial-based reconfigurable holographic surfaces (RHSs) have been proposed as novel cost-efficient antenna arrays, which are promising for improving the positioning and communication performance of integrated sensing and communications (ISAC) systems. However, due to the high frequency selectivity of the metamaterial elements, RHSs face challenges in supporting ultra-wide bandwidth (UWB), wh…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: 6 pages, 4 figures, IEEE ICC

  27. arXiv:2303.07687  [pdf, other]

    cs.SD cs.CL eess.AS

    Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy

    Authors: Xulong Zhang, Haobin Tang, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao

    Abstract: Because of predicting all the target tokens in parallel, the non-autoregressive models greatly improve the decoding efficiency of speech recognition compared with traditional autoregressive models. In this work, we present dynamic alignment Mask CTC, introducing two methods: (1) Aligned Cross Entropy (AXE), finding the monotonic alignment that minimizes the cross-entropy loss through dynamic progr…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  28. arXiv:2303.02666  [pdf, other]

    eess.IV cs.CV cs.LG

    Learned Lossless Compression for JPEG via Frequency-Domain Prediction

    Authors: Jixiang Luo, Shaohui Li, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

    Abstract: JPEG images can be further compressed to enhance the storage and transmission of large-scale image datasets. Existing learned lossless compressors for RGB images cannot be well transferred to JPEG images due to the distinguishing distribution of DCT coefficients and raw pixels. In this paper, we propose a novel framework for learned lossless compression of JPEG images that achieves end-to-end opti…

    Submitted 5 March, 2023; originally announced March 2023.

  29. arXiv:2302.02447  [pdf, other]

    eess.AS cs.SD

    cross-modal fusion techniques for utterance-level emotion recognition from text and speech

    Authors: Jiachen Luo, Huy Phan, Joshua Reiss

    Abstract: Multimodal emotion recognition (MER) is a fundamental complex research problem due to the uncertainty of human emotional expression and the heterogeneity gap between different modalities. Audio and text modalities are particularly important for a human participant in understanding emotions. Although many successful attempts have been designed multimodal representations for MER, there still exist m…

    Submitted 5 February, 2023; originally announced February 2023.

    Comments: 6 pages, 2 figures

  30. arXiv:2302.02419  [pdf, other]

    cs.CL cs.SD eess.AS

    deep learning of segment-level feature representation for speech emotion recognition in conversations

    Authors: Jiachen Luo, Huy Phan, Joshua Reiss

    Abstract: Accurately detecting emotions in conversation is a necessary yet challenging task due to the complexity of emotions and dynamics in dialogues. The emotional state of a speaker can be influenced by many different factors, such as interlocutor stimulus, dialogue scene, and topic. In this work, we propose a conversational speech emotion recognition method to deal with capturing attentive contextual d…

    Submitted 5 February, 2023; originally announced February 2023.

    Comments: 6 pages, 4 figures

  31. arXiv:2301.06681  [pdf]

    eess.IV cs.CV

    Cross-domain Self-supervised Framework for Photoacoustic Computed Tomography Image Reconstruction

    Authors: Hengrong Lan, Lijie Huang, Zhiqiang Li, Jing Lv, Jianwen Luo

    Abstract: Accurate image reconstruction is crucial for photoacoustic (PA) computed tomography (PACT). Recently, deep learning has been used to reconstruct the PA image with a supervised scheme, which requires high-quality images as ground truth labels. In practice, there are inevitable trade-offs between cost and performance since the use of more channels is an expensive strategy to access more measurements…

    Submitted 20 September, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

  32. arXiv:2212.10431  [pdf, other]

    cs.CV cs.LG cs.MM eess.IV

    QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity

    Authors: Siyu Huang, Jie An, Donglai Wei, Jiebo Luo, Hanspeter Pfister

    Abstract: The mechanism of existing style transfer algorithms is by minimizing a hybrid loss function to push the generated image toward high similarities in both content and style. However, this type of approach cannot guarantee visual fidelity, i.e., the generated artworks should be indistinguishable from real ones. In this paper, we devise a new style transfer framework called QuantArt for high visual-fi…

    Submitted 5 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted to CVPR 2023. Code is available at https://github.com/siyuhuang/QuantArt

  33. arXiv:2211.07357  [pdf, other]

    cs.LG cs.AI eess.SY

    Controlling Commercial Cooling Systems Using Reinforcement Learning

    Authors: Jerry Luo, Cosmin Paduraru, Octavian Voicu, Yuri Chervonyi, Scott Munns, Jerry Li, Crystal Qian, Praneet Dutta, Jared Quincy Davis, Ningjia Wu, Xingwei Yang, Chu-Ming Chang, Ted Li, Rob Rose, Mingyan Fan, Hootan Nakhost, Tinglin Liu, Brian Kirkman, Frank Altamura, Lee Cline, Patrick Tonker, Joel Gouker, Dave Uden, Warren Buddy Bryan, Jason Law , et al. (11 additional authors not shown)

    Abstract: This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments ha…

    Submitted 14 December, 2022; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: 27 pages, 11 figures

  34. arXiv:2211.02419  [pdf, other]

    eess.IV cs.CV cs.LG

    High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

    Authors: Yucong Lin, Jinhua Su, Yuhang Li, Yuhao Wei, Hanchao Yan, Saining Zhang, Jiaan Luo, Danni Ai, Hong Song, Jingfan Fan, Tianyu Fu, Deqiang Xiao, Feifei Wang, Jue Hou, Jian Yang

    Abstract: Deep learning methods have contributed substantially to the rapid advancement of medical image segmentation, the quality of which relies on the suitable design of loss functions. Popular loss functions, including the cross-entropy and dice losses, often fall short of boundary detection, thereby limiting high-resolution downstream applications such as automated diagnoses and procedures. We develope…

    Submitted 4 November, 2022; originally announced November 2022.

  35. arXiv:2209.12265  [pdf, other]

    cs.NI eess.SY

    Cooperative Sensing and Heterogeneous Information Fusion in VCPS: A Multi-agent Deep Reinforcement Learning Approach

    Authors: Xincao Xu, Kai Liu, Penglin Dai, Ruitao Xie, Jingjing Cao, Jiangtao Luo

    Abstract: Cooperative sensing and heterogeneous information fusion are critical to realize vehicular cyber-physical systems (VCPSs). This paper makes the first attempt to quantitatively measure the quality of VCPS by designing a new metric called Age of View (AoV). Specifically, we first present the system architecture where heterogeneous information can be cooperatively sensed and uploaded via vehicle-to-i…

    Submitted 27 January, 2023; v1 submitted 25 September, 2022; originally announced September 2022.

  36. arXiv:2209.08112  [pdf, other]

    cs.LG cs.AI cs.MA cs.RO eess.SY

    Optimizing Industrial HVAC Systems with Hierarchical Reinforcement Learning

    Authors: William Wong, Praneet Dutta, Octavian Voicu, Yuri Chervonyi, Cosmin Paduraru, Jerry Luo

    Abstract: Reinforcement learning (RL) techniques have been developed to optimize industrial cooling systems, offering substantial energy savings compared to traditional heuristic policies. A major challenge in industrial control involves learning behaviors that are feasible in the real world due to machinery constraints. For example, certain actions can only be executed every few hours while other actions c…

    Submitted 16 September, 2022; originally announced September 2022.

    Comments: 11 pages, 5 figures

  37. arXiv:2207.14524  [pdf, other]

    eess.IV cs.CV

    Evaluating the Practicality of Learned Image Compression

    Authors: Hongjiu Yu, Qiancheng Sun, ** Hu, Xingyuan Xue, Jixiang Luo, Dailan He, Yilong Li, Pengbo Wang, Yuanyuan Wang, Yaxu Dai, Yan Wang, Hongwei Qin

    Abstract: Learned image compression has achieved extraordinary rate-distortion performance in PSNR and MS-SSIM compared to traditional methods. However, it suffers from intensive computation, which is intolerable for real-world applications and leads to its limited industrial application for now. In this paper, we introduce neural architecture search (NAS) to designing more efficient networks with lower lat…

    Submitted 29 July, 2022; originally announced July 2022.

  38. arXiv:2207.03605  [pdf, other]

    cs.LG cs.IT cs.MA eess.SY

    Learning-based Autonomous Channel Access in the Presence of Hidden Terminals

    Authors: Yulin Shao, Yucheng Cai, Taotao Wang, Ziyang Guo, Peng Liu, Jiajun Luo, Deniz Gunduz

    Abstract: We consider the problem of autonomous channel access (AutoCA), where a group of terminals tries to discover a communication strategy with an access point (AP) via a common wireless channel in a distributed fashion. Due to the irregular topology and the limited communication range of terminals, a practical challenge for AutoCA is the hidden terminal problem, which is notorious in wireless networks…

    Submitted 2 December, 2022; v1 submitted 7 July, 2022; originally announced July 2022.

    Comments: Keywords: multiple channel access, hidden terminal, multi-agent deep reinforcement learning, Wi-Fi, proximal policy optimization

  39. arXiv:2206.13689  [pdf, other]

    cs.SD eess.AS

    Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation

    Authors: Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Xulong Zhang, Jing Xiao

    Abstract: Time-domain Transformer neural networks have proven their superiority in speech separation tasks. However, these models usually have a large number of network parameters, thus often encountering the problem of GPU memory explosion. In this paper, we proposed Tiny-Sepformer, a tiny version of Transformer network for speech separation. We present two techniques to reduce the model parameters and mem…

    Submitted 30 June, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted by Interspeech 2022

  40. A Learning Aided Flexible Gradient Descent Approach to MISO Beamforming

    Authors: Zhixiong Yang, Jing-Yuan Xia, Junshan Luo, Shuanghui Zhang, Deniz Gündüz

    Abstract: This paper proposes a learning aided gradient descent (LAGD) algorithm to solve the weighted sum rate (WSR) maximization problem for multiple-input single-output (MISO) beamforming. The proposed LAGD algorithm directly optimizes the transmit precoder through implicit gradient descent based iterations, at each of which the optimization strategy is determined by a neural network, and thus, is dynami…

    Submitted 25 July, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Journal ref: IEEE Wireless Communications Letters, 2022

  41. arXiv:2205.14501  [pdf, other]

    eess.IV

    PO-ELIC: Perception-Oriented Efficient Learned Image Coding

    Authors: Dailan He, Ziming Yang, Hongjiu Yu, Tongda Xu, Jixiang Luo, Yuan Chen, Chenjian Gao, Xinjie Shi, Hongwei Qin, Yan Wang

    Abstract: In the past years, learned image compression (LIC) has achieved remarkable performance. The recent LIC methods outperform VVC in both PSNR and MS-SSIM. However, the low bit-rate reconstructions of LIC suffer from artifacts such as blurring, color drifting and texture missing. Moreover, those varied artifacts make image quality metrics correlate badly with human perceptual quality. In this paper, w…

    Submitted 28 May, 2022; originally announced May 2022.

    Comments: CVPR2022 Workshop, 5-th CLIC Image Compression Track

  42. arXiv:2205.14329

    cs.SD cs.CL eess.AS

    Speech Augmentation Based Unsupervised Learning for Keyword Spotting

    Authors: Jian Luo, Jianzong Wang, Ning Cheng, Haobin Tang, **g Xiao

    Abstract: In this paper, we investigated a speech augmentation based unsupervised learning approach for keyword spotting (KWS) task. KWS is a useful speech application, yet also heavily depends on the labeled data. We designed a CNN-Attention architecture to conduct the KWS task. CNN layers focus on the local acoustic features, and attention layers model the long-time dependency. To improve the robustness o…

    Submitted 28 May, 2022; originally announced May 2022.

    Comments: accepted by WCCI 2022

  43. arXiv:2205.14326

    cs.CL cs.SD eess.AS

    Adaptive Activation Network For Low Resource Multilingual Speech Recognition

    Authors: Jian Luo, Jianzong Wang, Ning Cheng, Zhenpeng Zheng, **g Xiao

    Abstract: Low resource automatic speech recognition (ASR) is a useful but thorny task, since deep learning ASR models usually need huge amounts of training data. The existing models mostly established a bottleneck (BN) layer by pre-training on a large source language, and transferring to the low resource target language. In this work, we introduced an adaptive activation network to the upper layers of ASR m…

    Submitted 28 May, 2022; originally announced May 2022.

    Comments: accepted by WCCI 2022

  44. arXiv:2205.01278

    eess.SY cs.AI cs.RO

    Real-time Cooperative Vehicle Coordination at Unsignalized Road Intersections

    Authors: Ji** Luo, Tingting Zhang, Rui Hao, Donglin Li, Chunsheng Chen, Zhenyu Na, Qinyu Zhang

    Abstract: Cooperative coordination at unsignalized road intersections, which aims to improve the driving safety and traffic throughput for connected and automated vehicles, has attracted increasing interests in recent years. However, most existing investigations either suffer from computational complexity or cannot harness the full potential of the road infrastructure. To this end, we first present a dedica…

    Submitted 22 March, 2023; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: in IEEE Transactions on Intelligent Transportation Systems

  45. arXiv:2204.05649

    cs.SD eess.AS

    ADFF: Attention Based Deep Feature Fusion Approach for Music Emotion Recognition

    Authors: Zi Huang, Shulei Ji, Zhilan Hu, Chuangjian Cai, **g Luo, Xinyu Yang

    Abstract: Music emotion recognition (MER), a sub-task of music information retrieval (MIR), has developed rapidly in recent years. However, the learning of affect-salient features remains a challenge. In this paper, we propose an end-to-end attention-based deep feature fusion (ADFF) approach for MER. Only taking log Mel-spectrogram as input, this method uses adapted VGGNet as spatial feature learning module…

    Submitted 30 June, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: It has been received by Interspeech2022

  46. arXiv:2203.10645

    eess.IV cs.CV

    Breast Cancer Induced Bone Osteolysis Prediction Using Temporal Variational Auto-Encoders

    Authors: Wei Xiong, Neil Yeung, Shubo Wang, Haofu Liao, Liyun Wang, Jiebo Luo

    Abstract: Objective and Impact Statement. We adopt a deep learning model for bone osteolysis prediction on computed tomography (CT) images of murine breast cancer bone metastases. Given the bone CT scans at previous time steps, the model incorporates the bone-cancer interactions learned from the sequential images and generates future CT images. Its ability of predicting the development of bone lesions in ca…

    Submitted 28 March, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: 18 pages

  47. arXiv:2201.04302

    eess.IV cs.LG

    De-Noising of Photoacoustic Microscopy Images by Deep Learning

    Authors: Da He, Jiasheng Zhou, Xiaoyu Shang, Jiajia Luo, Sung-Liang Chen

    Abstract: As a hybrid imaging technology, photoacoustic microscopy (PAM) imaging suffers from noise due to the maximum permissible exposure of laser intensity, attenuation of ultrasound in the tissue, and the inherent noise of the transducer. De-noising is a post-processing method to reduce noise, and PAM image quality can be recovered. However, previous de-noising techniques usually heavily rely on mathema…

    Submitted 12 January, 2022; originally announced January 2022.

    Comments: 12 pages, 8 figures

  48. arXiv:2112.08744

    eess.SY

    Distributed Nash Equilibrium Seeking for Noncooperative Games of High-Order Nonlinear Multi-Agent Systems Over Weight-Unbalanced Digraphs

    Authors: Zhenhua Deng, ** Luo

    Abstract: In this paper, we investigate the noncooperative games of multi-agent systems. Different from existing noncooperative games, our formulation involves the high-order nonlinear dynamics of players, and the communication topologies among players are weight-unbalanced digraphs. Due to the high-order nonlinear dynamics and the weight-unbalanced digraphs, existing Nash equilibrium seeking algorithms can…

    Submitted 16 December, 2021; originally announced December 2021.

  49. MoRe-Fi: Motion-robust and Fine-grained Respiration Monitoring via Deep-Learning UWB Radar

    Authors: Tianyue Zheng, Zhe Chen, Shujie Zhang, Chao Cai, Jun Luo

    Abstract: Crucial for healthcare and biomedical applications, respiration monitoring often employs wearable sensors in practice, causing inconvenience due to their direct contact with human bodies. Therefore, researchers have been constantly searching for contact-free alternatives. Nonetheless, existing contact-free designs mostly require human subjects to remain static, largely confining their adoptions in…

    Submitted 15 November, 2021; originally announced November 2021.

    Comments: 14 pages

    Journal ref: SenSys '21: Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems November 2021

  50. RF-Net: a Unified Meta-learning Framework for RF-enabled One-shot Human Activity Recognition

    Authors: Shuya Ding, Zhe Chen, Tianyue Zheng, Jun Luo

    Abstract: Radio-Frequency (RF) based device-free Human Activity Recognition (HAR) rises as a promising solution for many applications. However, device-free (or contactless) sensing is often more sensitive to environment changes than device-based (or wearable) sensing. Also, RF datasets strictly require on-line labeling during collection, starkly different from image and text data collections where human int…

    Submitted 28 October, 2021; originally announced November 2021.

    Comments: 14 pages

    Journal ref: SenSys '20: Proceedings of the 18th Conference on Embedded Networked Sensor Systems, November 2020