Showing 1–50 of 182 results for author: Yang, K

Searching in archive eess.
  1. MARLP: Time-series Forecasting Control for Agricultural Managed Aquifer Recharge

    Authors: Yuning Chen, Kang Yang, Zhiyu An, Brady Holder, Luke Paloutzian, Khaled Bali, Wan Du

    Abstract: The rapid decline in groundwater around the world poses a significant challenge to sustainable agriculture. To address this issue, agricultural managed aquifer recharge (Ag-MAR) is proposed to recharge the aquifer by artificially flooding agricultural lands using surface water. Ag-MAR requires a carefully selected flooding schedule to avoid affecting the oxygen absorption of crop roots. However, c…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted by KDD 2024

  2. arXiv:2407.00614  [pdf, other]

    cs.RO cs.CV eess.IV

    Learning Granularity-Aware Affordances from Human-Object Interaction for Tool-Based Functional Grasping in Dexterous Robotics

    Authors: Fan Yang, Wenrui Chen, Kailun Yang, Haoran Lin, DongSheng Luo, Conghui Tang, Zhiyong Li, Yaonan Wang

    Abstract: To enable robots to use tools, the initial step is teaching robots to employ dexterous gestures for touching specific areas precisely where tasks are performed. Affordance features of objects serve as a bridge in the functional interaction between agents and objects. However, leveraging these affordance cues to help robots achieve functional tool grasping remains unresolved. To address this, we pr…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: The source code and the established dataset will be made publicly available at https://github.com/yangfan293/GAAF-DEX

  3. arXiv:2406.04721  [pdf, other]

    cs.IT eess.SP

    End-to-End Design of Polar Coded Integrated Data and Energy Networking

    Authors: Jie Hu, Jingwen Cui, Kun Yang

    Abstract: In order to transmit data and transfer energy to the low-power Internet of Things (IoT) devices, integrated data and energy networking (IDEN) system may be harnessed. In this context, we propose a bitwise end-to-end design for polar coded IDEN systems, where the conventional encoding/decoding, modulation/demodulation, and energy harvesting (EH) modules are replaced by the neural networks (NNs). In…

    Submitted 7 June, 2024; originally announced June 2024.

  4. arXiv:2405.05518  [pdf, other]

    cs.CV cs.RO eess.IV

    DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction

    Authors: Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang

    Abstract: Temporal information plays a pivotal role in Bird's-Eye-View (BEV) driving scene understanding, which can alleviate the visual information sparsity. However, the indiscriminate temporal fusion method will cause the barrier of feature redundancy when constructing vectorized High-Definition (HD) maps. In this paper, we revisit the temporal fusion of vectorized HD maps, focusing on temporal instance…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: The source code will be made publicly available at https://github.com/lynn-yu/DTCLMapper

  5. arXiv:2405.02942  [pdf, other]

    physics.optics cs.CV cs.RO eess.IV

    Design, analysis, and manufacturing of a glass-plastic hybrid minimalist aspheric panoramic annular lens

    Authors: Shaohua Gao, Qi Jiang, Yiqi Liao, Yi Qiu, Wanglei Ying, Kailun Yang, Kaiwei Wang, Benhao Zhang, Jian Bai

    Abstract: We propose a high-performance glass-plastic hybrid minimalist aspheric panoramic annular lens (ASPAL) to solve several major limitations of the traditional panoramic annular lens (PAL), such as large size, high weight, and complex system. The field of view (FoV) of the ASPAL is 360°x(35°~110°) and the imaging quality is close to the diffraction limit. This large FoV ASPAL is composed of only 4 len…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted to Optics & Laser Technology

  6. arXiv:2405.01258  [pdf, other]

    cs.CV cs.RO eess.IV

    Towards Consistent Object Detection via LiDAR-Camera Synergy

    Authors: Kai Luo, Hao Wu, Kefu Yi, Kailun Yang, Wei Hao, Rongdong Hu

    Abstract: As human-machine interaction continues to evolve, the capacity for environmental perception is becoming increasingly crucial. Integrating the two most common types of sensory data, images, and point clouds, can enhance detection accuracy. However, currently, no model exists that can simultaneously detect an object's position in both point clouds and images and ascertain their corresponding relatio…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: The source code will be made publicly available at https://github.com/xifen523/COD

  7. arXiv:2404.19201  [pdf, other]

    eess.IV cs.CV cs.RO physics.optics

    Global Search Optics: Automatically Exploring Optimal Solutions to Compact Computational Imaging Systems

    Authors: Yao Gao, Qi Jiang, Shaohua Gao, Lei Sun, Kailun Yang, Kaiwei Wang

    Abstract: The popularity of mobile vision creates a demand for advanced compact computational imaging systems, which call for the development of both a lightweight optical system and an effective image reconstruction model. Recently, joint design pipelines come to the research forefront, where the two significant components are simultaneously optimized via data-driven learning to realize the optimal system…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: The source code will be made publicly available at https://github.com/wumengshenyou/GSO

  8. arXiv:2404.16346  [pdf, other]

    eess.IV cs.AI cs.CV

    Light-weight Retinal Layer Segmentation with Global Reasoning

    Authors: Xiang He, Weiye Song, Yiming Wang, Fabio Poiesi, Ji Yi, Manishi Desai, Quanqing Xu, Kongzheng Yang, Yi Wan

    Abstract: Automatic retinal layer segmentation with medical images, such as optical coherence tomography (OCT) images, serves as an important tool for diagnosing ophthalmic diseases. However, it is challenging to achieve accurate segmentation due to low contrast and blood flow noises presented in the images. In addition, the algorithm should be light-weight to be deployed for practical clinical applications…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: IEEE Transactions on Instrumentation & Measurement

  9. arXiv:2404.16302  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions

    Authors: Haoyuan Li, Qi Hu, You Yao, Kailun Yang, Peng Chen

    Abstract: Cross-modality images that integrate visible-infrared spectra cues can provide richer complementary information for object detection. Despite this, existing visible-infrared object detection methods severely degrade in severe weather conditions. This failure stems from the pronounced sensitivity of visible images to environmental perturbations, such as rain, haze, and snow, which frequently cause…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: The dataset and source code will be made publicly available at https://github.com/lhy-zjut/CFMW

  10. arXiv:2404.15297  [pdf, ps, other]

    eess.SP cs.IT cs.LG

    Multi-stream Transmission for Directional Modulation Network via Distributed Multi-UAV-aided Multi-active-IRS

    Authors: Ke Yang, Rongen Dong, Wei Gao, Feng Shu, Weiping Shi, Yan Wang, Xuehui Wang, Jiangzhou Wang

    Abstract: Active intelligent reflecting surface (IRS) is a revolutionary technique for the future 6G networks. The conventional far-field single-IRS-aided directional modulation (DM) networks have only one (no direct path) or two (existing direct path) degrees of freedom (DoFs). This means that there are only one or two streams transmitted simultaneously from base station to user and will seriously limit its…

    Submitted 28 April, 2024; v1 submitted 26 March, 2024; originally announced April 2024.

  11. arXiv:2404.14713  [pdf, other]

    eess.SY

    Enhancing High-Speed Cruising Performance of Autonomous Vehicles through Integrated Deep Reinforcement Learning Framework

    Authors: Jinhao Liang, Kaidi Yang, Chaopeng Tan, Jinxiang Wang, Guodong Yin

    Abstract: High-speed cruising scenarios with mixed traffic greatly challenge the road safety of autonomous vehicles (AVs). Unlike existing works that only look at fundamental modules in isolation, this work enhances AV safety in mixed-traffic high-speed cruising scenarios by proposing an integrated framework that synthesizes three fundamental modules, i.e., behavioral decision-making, path-planning, and mot…

    Submitted 22 April, 2024; originally announced April 2024.

  12. arXiv:2404.14132  [pdf, other]

    cs.CV eess.IV

    CRNet: A Detail-Preserving Network for Unified Image Restoration and Enhancement Task

    Authors: Kangzhen Yang, Tao Hu, Kexin Dai, Genggeng Chen, Yu Cao, Wei Dong, Peng Wu, Yanning Zhang, Qingsen Yan

    Abstract: In real-world scenarios, images captured often suffer from blurring, noise, and other forms of image degradation, and due to sensor limitations, people usually can only obtain low dynamic range images. To achieve high-quality images, researchers have attempted various image restoration and enhancement operations on photographs, including denoising, deblurring, and high dynamic range imaging. Howev…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: This paper is accepted by CVPR2024 Workshop, Code: https://github.com/CalvinYang0/CRNet

  13. arXiv:2404.13537  [pdf, other]

    eess.IV cs.CV

    Bracketing Image Restoration and Enhancement with High-Low Frequency Decomposition

    Authors: Genggeng Chen, Kexin Dai, Kangzhen Yang, Tao Hu, Xiangyu Chen, Yongqing Yang, Wei Dong, Peng Wu, Yanning Zhang, Qingsen Yan

    Abstract: In real-world scenarios, due to a series of image degradations, obtaining high-quality, clear content photos is challenging. While significant progress has been made in synthesizing high-quality images, previous methods for image restoration and enhancement often overlooked the characteristics of different degradations. They applied the same structure to address various types of degradation, resul…

    Submitted 24 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: This paper is accepted by CVPR 2024 Workshop, code: https://github.com/chengeng0613/HLNet

  14. arXiv:2404.12794  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model

    Authors: Kang Zeng, Hao Shi, Jiacheng Lin, Siyu Li, Jintao Cheng, Kaiwei Wang, Zhiyong Li, Kailun Yang

    Abstract: LiDAR-based Moving Object Segmentation (MOS) aims to locate and segment moving objects in point clouds of the current scan using motion information from previous scans. Despite the promising results achieved by previous MOS methods, several key issues, such as the weak coupling of temporal and spatial information, still need further study. In this paper, we propose a novel LiDAR-based 3D Moving Ob…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: The source code will be made publicly available at https://github.com/Terminal-K/MambaMOS

  15. arXiv:2403.16225  [pdf, other]

    eess.SY

    Bi-Level Control of Weaving Sections in Mixed Traffic Environments with Connected and Automated Vehicles

    Authors: Longhao Yan, Jinhao Liang, Kaidi Yang

    Abstract: Connected and automated vehicles (CAVs) can be beneficial for improving the operation of highway bottlenecks such as weaving sections. This paper proposes a bi-level control approach based on an upper-level deep reinforcement learning controller and a lower-level model predictive controller to coordinate the lane-changings of a mixed fleet of CAVs and human-driven vehicles (HVs) in weaving section…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: 12 pages, 8 figures

  16. arXiv:2403.10012  [pdf, other]

    cs.CV cs.RO eess.IV physics.optics

    Real-World Computational Aberration Correction via Quantized Domain-Mixing Representation

    Authors: Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang

    Abstract: Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal performance in real-world applications. In this paper, in contrast to improving the simulation pipeline, we deliver a novel insight into real-world CAC from the perspective of Unsupervi…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Codes and datasets will be made publicly available at https://github.com/zju-jiangqi/QDMR

  17. arXiv:2403.09975  [pdf, other]

    cs.CV cs.RO eess.IV

    Skeleton-Based Human Action Recognition with Noisy Labels

    Authors: Yi Xu, Kunyu Peng, Di Wen, Ruiping Liu, Junwei Zheng, Yufan Chen, Jiaming Zhang, Alina Roitberg, Kailun Yang, Rainer Stiefelhagen

    Abstract: Understanding human actions from body poses is critical for assistive robots sharing space with humans in order to make informed and safe decisions about the next interaction. However, precise temporal localization and annotation of activity sequences is time-consuming and the resulting labels are often noisy. If not effectively addressed, label noise negatively affects the model's training, resul…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: The source code will be made accessible at https://github.com/xuyizdby/NoiseEraSAR

  18. arXiv:2403.08504  [pdf, other]

    cs.CV cs.RO eess.IV

    OccFiner: Offboard Occupancy Refinement with Hybrid Propagation

    Authors: Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang

    Abstract: Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision. Previous methods, confined to onboard processing, struggle with simultaneous geometric and semantic estimation, continuity across varying viewpoints, and single-view occlusion. Our paper introduces OccFiner, a novel offboard framework designed to enhance the acc…

    Submitted 15 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  19. arXiv:2403.05007  [pdf, other]

    eess.SY

    Age of Computing: A Metric of Computation Freshness in Communication and Computation Cooperative Networks

    Authors: Xingran Chen, Yusha Liu, Yali Zheng, Kun Yang

    Abstract: In communication and computation cooperative networks (3CNs), timely computation is crucial but not always guaranteed. There is a strong demand for a computational task to be completed within a given time. The time taken involves both processing time and communication time. However, a measure of such timeliness in 3CNs is lacking. In this letter, we introduce the novel concept, Age of Computing (A…

    Submitted 18 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  20. arXiv:2403.01789  [pdf, other]

    cs.CR eess.SY

    DECOR: Enhancing Logic Locking Against Machine Learning-Based Attacks

    Authors: Yinghua Hu, Kaixin Yang, Subhajit Dutta Chowdhury, Pierluigi Nuzzo

    Abstract: Logic locking (LL) has gained attention as a promising intellectual property protection measure for integrated circuits. However, recent attacks, facilitated by machine learning (ML), have shown the potential to predict the correct key in multiple LL schemes by exploiting the correlation of the correct key value with the circuit structure. This paper presents a generic LL enhancement method based…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 8 pages. Accepted at the International Symposium on Quality Electronic Design (ISQED), 2024

  21. arXiv:2402.18302  [pdf, other]

    cs.CV cs.RO eess.AS eess.IV

    EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving

    Authors: Jiacheng Lin, Jiajun Chen, Kunyu Peng, Xuan He, Zhiyong Li, Rainer Stiefelhagen, Kailun Yang

    Abstract: This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and appears as a challenging problem in autonomous driving. Due to the lack of semantic modeling capacity in audio and video, existing works have mainly focused on text-based multi-object tracking, which often comes at the cos…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: The source code and datasets will be made publicly available at https://github.com/lab206/EchoTrack

  22. arXiv:2402.00744  [pdf, other]

    cs.SD cs.CL eess.AS

    BATON: Aligning Text-to-Audio Model with Human Preference Feedback

    Authors: Huan Liao, Haonan Han, Kai Yang, Tianjiao Du, Rui Yang, Zunnan Xu, Qinmei Xu, Jingquan Liu, Jiasheng Lu, Xiu Li

    Abstract: With the development of AI-Generated Content (AIGC), text-to-audio models are gaining widespread attention. However, it is challenging for these models to generate audio aligned with human preference due to the inherent information density of natural language and limited model understanding ability. To alleviate this issue, we formulate the BATON, a framework designed to enhance the alignment betw…

    Submitted 1 February, 2024; originally announced February 2024.

  23. arXiv:2401.17837  [pdf, ps, other]

    eess.SY

    Safe Reinforcement Learning-Based Eco-Driving Control for Mixed Traffic Flows With Disturbances

    Authors: Ke Lu, Dongjun Li, Qun Wang, Kaidi Yang, Lin Zhao, Ziyou Song

    Abstract: This paper presents a safe learning-based eco-driving framework tailored for mixed traffic flows, which aims to optimize energy efficiency while guaranteeing safety during real-system operations. Even though reinforcement learning (RL) is capable of optimizing energy efficiency in intricate environments, it is challenged by safety requirements during the training process. The lack of safety guaran…

    Submitted 31 January, 2024; originally announced January 2024.

  24. arXiv:2401.16923  [pdf, other]

    cs.CV cs.RO eess.IV

    Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation

    Authors: Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen

    Abstract: Integrating information from multiple modalities enhances the robustness of scene perception systems in autonomous vehicles, providing a more comprehensive and reliable sensory framework. However, the modality incompleteness in multi-modal segmentation remains under-explored. In this work, we establish a task called Modality-Incomplete Scene Segmentation (MISS), which encompasses both system-level…

    Submitted 10 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted to IEEE IV 2024. The source code is publicly available at https://github.com/RuipingL/MISS

  25. arXiv:2401.16712  [pdf, other]

    cs.CV cs.RO eess.IV

    LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras

    Authors: Fei Teng, Jiaming Zhang, Jiawei Liu, Kunyu Peng, Xina Cheng, Zhiyong Li, Kailun Yang

    Abstract: Leveraging the rich information extracted from light field (LF) cameras is instrumental for dense prediction tasks. However, adapting light field data to enhance Salient Object Detection (SOD) still follows the traditional RGB methods and remains under-explored in the community. Previous approaches predominantly employ a custom two-stream design to discover the implicit angular feature within ligh…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: The source code will be made publicly available at https://github.com/FeiBryantkit/LF-Tracy

  26. arXiv:2401.16700  [pdf, other]

    cs.CV cs.RO eess.IV

    Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers

    Authors: Jianbin Jiao, Xina Cheng, Weijie Chen, Xiaoting Yin, Hao Shi, Kailun Yang

    Abstract: 3D human pose estimation captures the human joint points in three-dimensional space while keeping the depth information and physical structure. That is essential for applications that require precise pose information, such as human-computer interaction, scene understanding, and rehabilitation training. Due to the challenges in data collection, mainstream datasets of 3D human pose estimation are pr…

    Submitted 25 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted to IJCNN 2024. The source code will be available at https://github.com/WUJINHUAN/3D-human-pose

  27. arXiv:2401.15561  [pdf, other]

    eess.SY cs.RO

    A Parameter Privacy-Preserving Strategy for Mixed-Autonomy Platoon Control

    Authors: Jingyuan Zhou, Kaidi Yang

    Abstract: It has been demonstrated that leading cruise control (LCC) can improve the operation of mixed-autonomy platoons by allowing connected and automated vehicles (CAVs) to make longitudinal control decisions based on the information provided by surrounding vehicles. However, LCC generally requires surrounding human-driven vehicles (HDVs) to share their real-time states, which can be used by adversaries…

    Submitted 27 January, 2024; originally announced January 2024.

  28. arXiv:2401.11836  [pdf, other]

    cs.LG cs.CR eess.SY

    Privacy-Preserving Data Fusion for Traffic State Estimation: A Vertical Federated Learning Approach

    Authors: Qiqing Wang, Kaidi Yang

    Abstract: This paper proposes a privacy-preserving data fusion method for traffic state estimation (TSE). Unlike existing works that assume all data sources to be accessible by a single trusted party, we explicitly address data privacy concerns that arise in the collaboration and data sharing between multiple data owners, such as municipal authorities (MAs) and mobility providers (MPs). To this end, we prop…

    Submitted 22 January, 2024; originally announced January 2024.

  29. arXiv:2401.11409  [pdf, other]

    cs.IT eess.SP

    Robust Beamforming for Downlink Multi-Cell Systems: A Bilevel Optimization Perspective

    Authors: Xingdi Chen, Yu Xiong, Kai Yang

    Abstract: Utilization of inter-base station cooperation for information processing has shown great potential in enhancing the overall quality of communication services (QoS) in wireless communication networks. Nevertheless, such cooperations require the knowledge of channel state information (CSI) at base stations (BSs), which is assumed to be perfectly known. However, CSI errors are inevitable in practice…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: accepted at AAAI2024

  30. Enhancing System-Level Safety in Mixed-Autonomy Platoon via Safe Reinforcement Learning

    Authors: Jingyuan Zhou, Longhao Yan, Kaidi Yang

    Abstract: Connected and automated vehicles (CAVs) have recently gained prominence in traffic research due to advances in communication technology and autonomous driving. Various longitudinal control strategies for CAVs have been developed to enhance traffic efficiency, stability, and safety in mixed-autonomy scenarios. Deep reinforcement learning (DRL) is one promising strategy for mixed-autonomy platoon co…

    Submitted 1 March, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

    Comments: IEEE Transactions on Intelligent Vehicles (2024)

  31. arXiv:2401.05725  [pdf, ps, other]

    cs.IT eess.SP

    Energy-Efficient STAR-RIS Enhanced UAV-Enabled MEC Networks with Bi-Directional Task Offloading

    Authors: Han Xiao, Xiaoyan Hu, Weile Zhang, Wenjie Wang, Kai-Kit Wong, Kun Yang

    Abstract: This paper introduces a novel multi-user mobile edge computing (MEC) scheme facilitated by the simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) and the unmanned aerial vehicle (UAV). Unlike existing MEC approaches, the proposed scheme enables bidirectional offloading, allowing users to concurrently offload tasks to the MEC servers located at the ground base…

    Submitted 9 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  32. arXiv:2312.14392  [pdf, other]

    eess.SP

    Wideband Sample Rate Converter Using Cascaded Parallel-serial Structure for Synthetic Instrumentation

    Authors: Ruiyuan Ming, Peng Ye, Kuojun Yang, Zhixiang Pan, Li chen, Xuetao Liu

    Abstract: A sample rate converter (SRC) is designed to adjust the sampling rate of digital signals flexibly for different application requirements in the broadband signal processing system. In this paper, a novel parallel-serial structure is proposed to improve the bandwidth and flexibility of SRC. The core of this structure is a parallel decimation filter followed by a serial counterpart, the parallel part…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: 12 pages, 15 figures

  33. arXiv:2312.10547  [pdf, other]

    cs.IT cs.LG cs.NI eess.SP

    Advancing RAN Slicing with Offline Reinforcement Learning

    Authors: Kun Yang, Shu-ping Yeh, Menglei Zhang, Jerry Sydir, Jing Yang, Cong Shen

    Abstract: Dynamic radio resource management (RRM) in wireless networks presents significant challenges, particularly in the context of Radio Access Network (RAN) slicing. This technology, crucial for catering to varying user requirements, often grapples with complex optimization scenarios. Existing Reinforcement Learning (RL) approaches, while achieving good performance in RAN slicing, typically rely on onl…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 9 pages. 6 figures

  34. arXiv:2312.09576  [pdf, other]

    eess.IV cs.CV

    SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

    Authors: Xiangde Luo, Jia Fu, Yunxin Zhong, Shuolin Liu, Bing Han, Mehdi Astaraki, Simone Bendazzoli, Iuliana Toma-Dasu, Yiwen Ye, Ziyang Chen, Yong Xia, Yanzhou Su, Jin Ye, Junjun He, Zhaohu Xing, Hongqiu Wang, Lei Zhu, Kaixiang Yang, Xin Fang, Zhiwei Wang, Chan Woong Lee, Sang Joon Park, Jaehee Chun, Constantin Ulrich, Klaus H. Maier-Hein, et al. (17 additional authors not shown)

    Abstract: Radiation therapy is a primary and effective NasoPharyngeal Carcinoma (NPC) treatment strategy. The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: A challenge report of SegRap2023 (organized in conjunction with MICCAI2023)

  35. arXiv:2312.06330  [pdf, other]

    cs.CV cs.AI cs.RO eess.IV

    Navigating Open Set Scenarios for Skeleton-based Action Recognition

    Authors: Kunyu Peng, Cheng Yin, Junwei Zheng, Ruiping Liu, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg

    Abstract: In real-world scenarios, human actions often fall outside the distribution of training data, making it crucial for models to recognize known actions and reject unknown ones. However, using pure skeleton data in such open-set conditions poses challenges due to the lack of visual background cues and the distinct sparse structure of body pose sequences. In this paper, we tackle the unexplored Open-Se…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024. The benchmark, code, and models will be released at https://github.com/KPeng9510/OS-SAR

  36. arXiv:2311.18168  [pdf, other]

    cs.CV cs.LG eess.AS

    Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications

    Authors: Karren D. Yang, Anurag Ranjan, Jen-Hao Rick Chang, Raviteja Vemulapalli, Oncel Tuzel

    Abstract: We consider the task of animating 3D facial geometry from speech signal. Existing works are primarily deterministic, focusing on learning a one-to-one mapping from speech signal to 3D face meshes on small datasets with limited speakers. While these models can achieve high-quality lip articulation for speakers in the training set, they are unable to capture the full and diverse distribution of 3D f…

    Submitted 29 November, 2023; originally announced November 2023.

  37. arXiv:2311.14227  [pdf, other]

    eess.IV cs.CV cs.LG

    Robust and Interpretable COVID-19 Diagnosis on Chest X-ray Images using Adversarial Training

    Authors: Karina Yang, Alexis Bennett, Dominique Duncan

    Abstract: The novel 2019 Coronavirus disease (COVID-19) global pandemic is a defining health crisis. Recent efforts have been increasingly directed towards achieving quick and accurate detection of COVID-19 across symptomatic patients to mitigate the intensity and spread of the disease. Artificial intelligence (AI) algorithms applied to chest X-ray (CXR) images have emerged as promising diagnostic tools, an…

    Submitted 23 November, 2023; originally announced November 2023.

  38. arXiv:2311.11423  [pdf, other]

    cs.IT cs.LG cs.NI eess.SP

    Offline Reinforcement Learning for Wireless Network Optimization with Mixture Datasets

    Authors: Kun Yang, Cong Shen, Jing Yang, Shu-ping Yeh, Jerry Sydir

    Abstract: The recent development of reinforcement learning (RL) has boosted the adoption of online RL for wireless radio resource management (RRM). However, online RL algorithms require direct interactions with the environment, which may be undesirable given the potential performance loss due to the unavoidable exploration in RL. In this work, we first investigate the use of offline RL algorithms in…

    Submitted 19 November, 2023; originally announced November 2023.

    Comments: This paper is the camera ready version for Asilomar 2023

  39. arXiv:2311.08720  [pdf, other]

    eess.SP

    Massive Wireless Energy Transfer without Channel State Information via Imperfect Intelligent Reflecting Surfaces

    Authors: Cheng Luo, Jie Hu, Kun Yang, Kai-Kit Wong

    Abstract: Intelligent Reflecting Surface (IRS) utilizes low-cost, passive reflecting elements to enhance the passive beam gain, improve Wireless Energy Transfer (WET) efficiency, and enable its deployment for numerous Internet of Things (IoT) devices. However, the increasing number of IRS elements presents considerable channel estimation challenges. This is due to the lack of active Radio Frequency (RF) cha…

    Submitted 15 November, 2023; originally announced November 2023.

  40. arXiv:2311.05273  [pdf, other]

    eess.SP

    Few-Shot Recognition and Classification Framework for Jamming Signal: A CGAN-Based Fusion CNN Approach

    Authors: Xuhui Ding, Yue Zhang, Gaoyang Li, Xiaozheng Gao, Neng Ye, Dusit Niyato, Kai Yang

    Abstract: Subject to intricate environmental variables, the precise classification of jamming signals holds paramount significance in the effective implementation of anti-jamming strategies within communication systems. In light of this imperative, we propose an innovative fusion algorithm based on conditional generative adversarial network (CGAN) and convolutional neural network (CNN), which aims to deal w…

    Submitted 26 June, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Required to supplement the experiments in Section VII, enhance the notations in Table I, and make necessary adjustments to Equation 17 to ensure accuracy and completeness

  41. arXiv:2311.04591  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    Rethinking Event-based Human Pose Estimation with 3D Event Representations

    Authors: Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni, Kailun Yang, Kaiwei Wang

    Abstract: Human pose estimation is a fundamental and appealing task in computer vision. Traditional frame-based cameras and videos are commonly applied, yet, they become less reliable in scenarios under high dynamic range or heavy motion blur. In contrast, event cameras offer a robust solution for navigating these challenging contexts. Predominant methodologies incorporate event cameras into learning framew…

    Submitted 1 December, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Extended version of arXiv:2206.04511. The code and dataset are available at https://github.com/MasterHow/EventPointPose

  42. arXiv:2310.15130  [pdf, other]

    cs.SD cs.CV eess.AS

    Novel-View Acoustic Synthesis from 3D Reconstructed Rooms

    Authors: Byeongjoo Ahn, Karren Yang, Brian Hamilton, Jonathan Sheaffer, Anurag Ranjan, Miguel Sarabia, Oncel Tuzel, Jen-Hao Rick Chang

    Abstract: We investigate the benefit of combining blind audio recordings with 3D scene information for novel-view acoustic synthesis. Given audio recordings from 2-4 microphones and the 3D geometry and material of a scene containing multiple unknown sound sources, we estimate the sound anywhere in the scene. We identify the main challenges of novel-view acoustic synthesis as sound source localization, separ…

    Submitted 23 October, 2023; originally announced October 2023.

  43. arXiv:2310.13993  [pdf, other]

    eess.SP

    Green Beamforming Design for Integrated Sensing and Communication Systems: A Practical Approach Using Beam-Matching Error Metrics

    Authors: Ke Xu, Jie Hu, Kun Yang

    Abstract: In this paper, we propose a green beamforming design for the integrated sensing and communication (ISAC) system, using beam-matching error to assess radar performance. The beam-matching error metric, which considers the mean square error between the desired and designed beam patterns, provides a more practical evaluation approach. To tackle the non-convex challenge inherent in beamforming design,…

    Submitted 21 October, 2023; originally announced October 2023.

  44. arXiv:2310.13984  [pdf, other]

    eess.SP

    Robust NOMA-assisted OTFS-ISAC Network Design with 3D Motion Prediction Topology

    Authors: Ke Xu, Jie Hu, Christos Masouros, Kun Yang

    Abstract: This paper proposes a novel non-orthogonal multiple access (NOMA)-assisted orthogonal time-frequency space (OTFS)-integrated sensing and communication (ISAC) network, which uses unmanned aerial vehicles (UAVs) as air base stations to support multiple users. By employing ISAC, the UAV extracts position and velocity information from the user's echo signals, and non-orthogonal power allocation is con…

    Submitted 21 October, 2023; originally announced October 2023.

  45. Reconfigurable Intelligent Sensing Surface aided Wireless Powered Communication Networks: A Sensing-Then-Reflecting Approach

    Authors: Cheng Luo, Jie Hu, Kun Yang

    Abstract: This paper presents a reconfigurable intelligent sensing surface (RISS) that combines passive and active elements to achieve simultaneous reflection and direction of arrival (DOA) estimation tasks. By utilizing DOA information from the RISS instead of conventional channel estimation, the pilot overhead is reduced and the RISS becomes independent of the hybrid access point (HAP), enabling efficient…

    Submitted 20 October, 2023; originally announced October 2023.

  46. arXiv:2310.12551  [pdf, other]

    cs.RO eess.IV

    Iterative PnP and its application in 3D-2D vascular image registration for robot navigation

    Authors: Jingwei Song, Keke Yang, Zheng Zhang, Meng Li, Tuoyu Cao, Maani Ghaffari

    Abstract: This paper reports on a new real-time robot-centered 3D-2D vascular image alignment algorithm, which is robust to outliers and can align nonrigid shapes. Few works have managed to achieve both real-time and accurate performance for vascular intervention robots. This work bridges high-accuracy 3D-2D registration techniques and computational efficiency requirements in intervention robot applications…

    Submitted 11 January, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: Submitted to ICRA 2024. Errors in Eq. 4 and Eq. 6 have been corrected. Updates include some minor improvements in Section II

  47. arXiv:2310.10300  [pdf, other]

    cs.SD cs.IR eess.AS

    BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval

    Authors: Kaixing Yang, Xukun Zhou, Xulong Tang, Ran Diao, Hongyan Liu, Jun He, Zhaoxin Fan

    Abstract: Dance and music are closely related forms of expression, with mutual retrieval between dance videos and music being a fundamental task in various fields like education, art, and sports. However, existing methods often suffer from unnatural generation effects or fail to fully explore the correlation between music and dance. To overcome these challenges, we propose BeatDance, a novel beat-based mode…

    Submitted 16 October, 2023; originally announced October 2023.

  48. arXiv:2310.02815  [pdf, other]

    cs.CV cs.RO eess.IV

    CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

    Authors: Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

    Abstract: Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety. While previous studies have limitations in using only depth or height information, we find both depth and height matter and they are in fact complementary. The depth feature encompasses pre…

    Submitted 17 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: The source code will be made publicly available at https://github.com/MasterHow/CoBEV

  49. arXiv:2309.12029  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    Unveiling the Hidden Realm: Self-supervised Skeleton-based Action Recognition in Occluded Environments

    Authors: Yifei Chen, Kunyu Peng, Alina Roitberg, David Schneider, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

    Abstract: To integrate action recognition methods into autonomous robotic systems, it is crucial to consider adverse situations involving target occlusions. Such a scenario, despite its practical relevance, is rarely addressed in existing self-supervised skeleton-based action recognition methods. To empower robots with the capacity to address occlusion, we propose a simple and effective method. We first pre…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: The source code will be made publicly available at https://github.com/cyfml/OPSTL

  50. arXiv:2309.12009  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision

    Authors: Yiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

    Abstract: Self-supervised representation learning for human action recognition has developed rapidly in recent years. Most of the existing works are based on skeleton data while using a multi-modality setup. These works overlooked the differences in performance among modalities, which led to the propagation of erroneous knowledge between modalities while only three fundamental modalities, i.e., joints, bone…

    Submitted 10 January, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024. The source code will be made publicly available at https://github.com/desehuileng0o0/IKEM