Showing 1–50 of 67 results for author: Zhu, B

  1. arXiv:2405.20055  [pdf, other]

    eess.SY

    Hypergraph-Aided Task-Resource Matching for Maximizing Value of Task Completion in Collaborative IoT Systems

    Authors: Botao Zhu, Xianbin Wang

    Abstract: With the growing scale and intrinsic heterogeneity of Internet of Things (IoT) systems, distributed device collaboration becomes essential for effective task completion by dynamically utilizing limited communication and computing resources. However, the separated design and situation-agnostic operation of computing, communication and application layers create a fundamental challenge for rapid task…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: This paper has been published in IEEE Transactions on Mobile Computing, May 2024

  2. Improved Soft-k-Means Clustering Algorithm for Balancing Energy Consumption in Wireless Sensor Networks

    Authors: Botao Zhu, Ebrahim Bedeer, Ha H. Nguyen, Robert Barton, Jerome Henry

    Abstract: Energy load balancing is an essential issue in designing wireless sensor networks (WSNs). Clustering techniques are utilized as energy-efficient methods to balance the network energy and prolong its lifetime. In this paper, we propose an improved soft-k-means (IS-k-means) clustering algorithm to balance the energy consumption of nodes in WSNs. First, we use the idea of "clustering by fast search…

    Submitted 22 March, 2024; originally announced March 2024.

    Journal ref: Published in IEEE Internet of Things Journal, 2021

  3. arXiv:2403.06423  [pdf, other]

    eess.SP cs.RO

    LiDAR Point Cloud-based Multiple Vehicle Tracking with Probabilistic Measurement-Region Association

    Authors: Guanhua Ding, Jianan Liu, Yuxuan Xia, Tao Huang, Bing Zhu, Jinping Sun

    Abstract: Multiple extended target tracking (ETT) has gained increasing attention due to the development of high-precision LiDAR and radar sensors in automotive applications. For LiDAR point cloud-based vehicle tracking, this paper presents a probabilistic measurement-region association (PMRA) ETT model, which can describe the complex measurement distribution by partitioning the target extent into different…

    Submitted 18 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: 8 pages, 5 figures, accepted by the 27th International Conference on Information Fusion (FUSION 2024)

  4. arXiv:2402.17785  [pdf, other]

    cs.SD cs.AI eess.AS

    ByteComposer: a Human-like Melody Composition Method based on Language Model Agent

    Authors: Xia Liang, Xingjian Du, Jiaju Lin, Pei Zou, Yuan Wan, Bilei Zhu

    Abstract: Large Language Models (LLM) have shown encouraging progress in multimodal understanding and generation tasks. However, how to design a human-aligned and interpretable melody composition system is still under-explored. To solve this problem, we propose ByteComposer, an agent framework emulating a human's creative pipeline in four separate steps: "Conception Analysis - Draft Composition - Self-Eval…

    Submitted 6 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  5. arXiv:2402.07485  [pdf, other]

    cs.SD eess.AS

    MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning

    Authors: Hang Zhao, Yifei Xin, Zhesong Yu, Bilei Zhu, Lu Lu, Zejun Ma

    Abstract: In the realm of audio-language pre-training (ALP), the challenge of achieving cross-modal alignment is significant. Moreover, the integration of audio inputs with diverse distributions and task variations poses challenges in developing generic audio-language models. In this study, we present MINT, a novel ALP framework boosting audio-language models through multi-target pre-training and instructio…

    Submitted 11 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  6. arXiv:2310.18767  [pdf, other]

    eess.SP

    Enhancing Epileptic Seizure Detection with EEG Feature Embeddings

    Authors: Arman Zarei, Bingzhao Zhu, Mahsa Shoaran

    Abstract: Epilepsy is one of the most prevalent brain disorders that disrupts the lives of millions worldwide. For patients with drug-resistant seizures, there exist implantable devices capable of monitoring neural activity, promptly triggering neurostimulation to regulate seizures, or alerting patients of potential episodes. Next-generation seizure detection systems heavily rely on high-accuracy machine le…

    Submitted 28 October, 2023; originally announced October 2023.

  7. arXiv:2310.10159  [pdf, other]

    cs.SD cs.CL eess.AS

    Joint Music and Language Attention Models for Zero-shot Music Tagging

    Authors: Xingjian Du, Zhesong Yu, Jiaju Lin, Bilei Zhu, Qiuqiang Kong

    Abstract: Music tagging is a task to predict the tags of music recordings. However, previous music tagging research primarily focuses on closed-set music tagging tasks, which cannot be generalized to new tags. In this work, we propose a zero-shot music tagging system modeled by a joint music and language attention (JMLA) model to address the open-set music tagging problem. The JMLA model consists of an audio…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Keywords: Music tagging, joint music and language attention models, Music Foundation Model.

  8. arXiv:2309.06036  [pdf, other]

    eess.SP

    Which Framework is Suitable for Online 3D Multi-Object Tracking for Autonomous Driving with Automotive 4D Imaging Radar?

    Authors: Jianan Liu, Guanhua Ding, Yuxuan Xia, Jinping Sun, Tao Huang, Lihua Xie, Bing Zhu

    Abstract: Online 3D multi-object tracking (MOT) has recently received significant research interest due to the expanding demand for 3D perception in advanced driver assistance systems (ADAS) and autonomous driving (AD). Among the existing 3D MOT frameworks for ADAS and AD, the conventional point object tracking (POT) framework using the tracking-by-detection (TBD) strategy has been well studied and accepted for…

    Submitted 25 May, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 8 pages, 5 figures, accepted by IEEE 35th Intelligent Vehicles Symposium (IV 2024), oral presentation (top 5%), code is available at https://github.com/dinggh0817/4D_Radar_MOT

  9. arXiv:2306.04970  [pdf, other]

    cs.RO eess.SY

    Motion Planning for Aerial Pick-and-Place based on Geometric Feasibility Constraints

    Authors: Huazi Cao, Jiahao Shen, Cunjia Liu, Bo Zhu, Shiyu Zhao

    Abstract: This paper studies the motion planning problem of the pick-and-place of an aerial manipulator that consists of a quadcopter flying base and a Delta arm. We propose a novel partially decoupled motion planning framework to solve this problem. Compared to the state-of-the-art approaches, the proposed one has two novel features. First, it does not suffer from increased computation in high-dimensional…

    Submitted 8 June, 2023; originally announced June 2023.

  10. arXiv:2306.02231  [pdf, other]

    cs.CL cs.AI cs.LG eess.SY

    Fine-Tuning Language Models with Advantage-Induced Policy Alignment

    Authors: Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao

    Abstract: Reinforcement learning from human feedback (RLHF) has emerged as a reliable approach to aligning large language models (LLMs) to human preferences. Among the plethora of RLHF techniques, proximal policy optimization (PPO) is one of the most widely used methods. Despite its popularity, however, PPO may suffer from mode collapse, instability, and poor sample efficiency. We show that these issues can be…

    Submitted 2 November, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

  11. arXiv:2306.02003  [pdf, other]

    cs.LG cs.AI cs.PF eess.SY stat.ML

    On Optimal Caching and Model Multiplexing for Large Model Inference

    Authors: Banghua Zhu, Ying Sheng, Lianmin Zheng, Clark Barrett, Michael I. Jordan, Jiantao Jiao

    Abstract: Large Language Models (LLMs) and other large foundation models have achieved noteworthy success, but their size exacerbates existing resource consumption and latency challenges. In particular, the large-scale deployment of these models is hindered by the significant resource requirements during inference. In this paper, we study two approaches for mitigating these challenges: employing a cache to…

    Submitted 28 August, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

  12. arXiv:2306.00265  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV stat.ML

    Doubly Robust Self-Training

    Authors: Banghua Zhu, Mingyu Ding, Philip Jacobson, Ming Wu, Wei Zhan, Michael Jordan, Jiantao Jiao

    Abstract: Self-training is an important technique for solving semi-supervised learning problems. It leverages unlabeled data by generating pseudo-labels and combining them with a limited labeled dataset for training. The effectiveness of self-training heavily relies on the accuracy of these pseudo-labels. In this paper, we introduce doubly robust self-training, a novel semi-supervised algorithm that provabl…

    Submitted 2 November, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

  13. arXiv:2305.08247  [pdf]

    eess.SY

    A Fast and Robust Camera-IMU Online Calibration Method For Localization System

    Authors: Xiaowen Tao, Pengxiang Meng, Bing Zhu, Jian Zhao

    Abstract: Autonomous driving has spurred the development of sensor fusion techniques, which combine data from multiple sensors to improve system performance. In particular, a localization system based on sensor fusion, such as Visual Simultaneous Localization and Mapping (VSLAM), is an important component in environment perception, and is the basis of decision-making and motion control for intelligent vehicl…

    Submitted 14 May, 2023; originally announced May 2023.

  14. arXiv:2305.07618  [pdf]

    cs.CV cs.LG eess.IV

    Uncertainty Estimation and Out-of-Distribution Detection for Deep Learning-Based Image Reconstruction using the Local Lipschitz

    Authors: Danyal F. Bhutto, Bo Zhu, Jeremiah Z. Liu, Neha Koonjoo, Hongwei B. Li, Bruce R. Rosen, Matthew S. Rosen

    Abstract: Accurate image reconstruction is at the heart of diagnostics in medical imaging. Supervised deep learning-based approaches have been investigated for solving inverse problems including image reconstruction. However, these trained models encounter unseen data distributions that are widely shifted from training data during deployment. Therefore, it is essential to assess whether a given input falls…

    Submitted 1 December, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

  15. arXiv:2303.11692  [pdf, other]

    cs.SD cs.IR eess.AS

    ByteCover3: Accurate Cover Song Identification on Short Queries

    Authors: Xingjian Du, Zijie Wang, Xia Liang, Huidong Liang, Bilei Zhu, Zejun Ma

    Abstract: Deep learning based methods have become a paradigm for cover song identification (CSI) in recent years, where the ByteCover systems have achieved state-of-the-art results on all the mainstream datasets of CSI. However, with the rise of short videos, many real-world applications require matching short music excerpts to full-length music tracks in the database, which is still under-explored and w…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  16. arXiv:2301.06784  [pdf, other]

    eess.SP math.ST

    On the Statistical Consistency of a Generalized Cepstral Estimator

    Authors: Bin Zhu, Mattia Zorzi

    Abstract: We consider the problem of estimating the generalized cepstral coefficients of a stationary stochastic process or stationary multidimensional random field. It turns out that a naive version of the periodogram-based estimator for the generalized cepstral coefficients is not consistent. We propose a consistent estimator for those coefficients. Moreover, we show that the latter can be used in order to…

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: 11 pages in IEEE Transactions template, 4 figures. Submitted to IEEE Transactions on Automatic Control

  17. arXiv:2212.05301  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning

    Authors: Chen Chen, Yuchen Hu, Qiang Zhang, Heqing Zou, Beier Zhu, Eng Siong Chng

    Abstract: Audio-visual speech recognition (AVSR) has gained remarkable success for ameliorating the noise-robustness of speech recognition. Mainstream methods focus on fusing audio and visual inputs to obtain modality-invariant representations. However, such representations are prone to over-reliance on audio modality as it is much easier to recognize than video modality in clean conditions. As a result, th…

    Submitted 2 February, 2023; v1 submitted 10 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI2023

  18. arXiv:2212.03540  [pdf, other]

    eess.SY cs.RO

    EASpace: Enhanced Action Space for Policy Transfer

    Authors: Zheng Zhang, Qingrui Zhang, Bo Zhu, Xiaohan Wang, Tianjiang Hu

    Abstract: Formulating expert policies as macro actions promises to alleviate the long-horizon issue via structured exploration and efficient credit assignment. However, traditional option-based multi-policy transfer methods suffer from inefficient exploration of macro action's length and insufficient exploitation of useful long-duration macro actions. In this paper, a novel algorithm named EASpace (Enhanced…

    Submitted 24 July, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: 15 Pages

  19. Control Lyapunov-Barrier Function Based Model Predictive Control for Stochastic Nonlinear Affine Systems

    Authors: Weijiang Zheng, Bing Zhu

    Abstract: A stochastic model predictive control (MPC) framework is presented in this paper for nonlinear affine systems with stability and feasibility guarantees. We first introduce the concept of stochastic control Lyapunov-barrier function (CLBF) and provide a method to construct CLBF by combining an unconstrained control Lyapunov function (CLF) and control barrier functions. The unconstrained CLF is obtai…

    Submitted 26 June, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: 21 pages, 6 figures

    Journal ref: International Journal of Robust and Nonlinear Control, 2024

  20. arXiv:2208.14372  [pdf, ps, other]

    math.OC eess.SY

    Dead-beat model predictive control for discrete-time linear systems

    Authors: Bing Zhu

    Abstract: In this paper, model predictive control (MPC) strategies are proposed for dead-beat control of linear systems with and without state and control constraints. In unconstrained MPC, deadbeat performance can be guaranteed by setting the control horizon to the system dimension, and adding a terminal equality constraint. It is proved that the unconstrained deadbeat MPC is equivalent to linear deadbeat…

    Submitted 30 August, 2022; originally announced August 2022.

  21. arXiv:2208.10059  [pdf, ps, other]

    stat.ME eess.SY

    Sampling Gaussian Stationary Random Fields: A Stochastic Realization Approach

    Authors: Bin Zhu, Jiahao Liu, Zhengshou Lai, Tao Qian

    Abstract: Generating large-scale samples of stationary random fields is of great importance in fields such as geomaterial modeling and uncertainty quantification. Traditional methodologies based on covariance matrix decomposition have the difficulty of being computationally expensive, which is even more serious when the dimension of the random field is large. This paper proposes an efficient stochastic re…

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: 17 pages, 9 figures

  22. arXiv:2206.10255  [pdf, other]

    eess.SY cs.CV

    GNN-PMB: A Simple but Effective Online 3D Multi-Object Tracker without Bells and Whistles

    Authors: Jianan Liu, Liping Bai, Yuxuan Xia, Tao Huang, Bing Zhu, Qing-Long Han

    Abstract: Multi-object tracking (MOT) is among the crucial applications in modern advanced driver assistance systems (ADAS) and autonomous driving (AD) systems. The global nearest neighbor (GNN) filter, as the earliest random vector-based Bayesian tracking framework, has been adopted in most state-of-the-art trackers in the automotive industry. The development of random finite set (RFS) theory facilitates a…

    Submitted 8 February, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: accepted by IEEE Transactions on Intelligent Vehicles

  23. Brachial Plexus Nerve Trunk Segmentation Using Deep Learning: A Comparative Study with Doctors' Manual Segmentation

    Authors: Yu Wang, Binbin Zhu, Lingsi Kong, Jianlin Wang, Bin Gao, Jianhua Wang, Dingcheng Tian, Yudong Yao

    Abstract: Ultrasound-guided nerve block anesthesia (UGNB) is a high-tech visual nerve block anesthesia method that can observe the target nerve and its surrounding structures, the puncture needle's advancement, and local anesthetics spread in real-time. The key in UGNB is nerve identification. With the help of deep learning methods, the automatic identification or segmentation of nerves can be realized, ass…

    Submitted 17 May, 2022; originally announced May 2022.

    Comments: 9 pages

    Journal ref: Ultrasound in Medicine & Biology, 2024, 50(3): 374-383

  24. NeuralTree: A 256-Channel 0.227-$μ$J/Class Versatile Neural Activity Classification and Closed-Loop Neuromodulation SoC

    Authors: Uisub Shin, Cong Ding, Bingzhao Zhu, Yashwanth Vyza, Alix Trouillet, Emilie C. M. Revol, Stéphanie P. Lacour, Mahsa Shoaran

    Abstract: Closed-loop neural interfaces with on-chip machine learning can detect and suppress disease symptoms in neurological disorders or restore lost functions in paralyzed patients. While high-density neural recording can provide rich neural activity information for accurate disease-state detection, existing systems have low channel counts and poor scalability, which could limit their therapeutic effica…

    Submitted 8 December, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Journal ref: IEEE Journal of Solid-State Circuits, vol. 57, no. 11, pp. 3243-3257, Nov. 2022

  25. arXiv:2205.02999  [pdf, ps, other]

    eess.SP

    Fast and Arbitrary Beam Pattern Design for RIS-Assisted Terahertz Wireless Communication

    Authors: Jian Dang, Zaichen Zhang, Yewei Li, Liang Wu, Bingcheng Zhu, Lei Wang

    Abstract: Reconfigurable intelligent surface (RIS) can assist terahertz wireless communication to restore the fragile line-of-sight links and facilitate beam steering. Arbitrary reflection beam patterns are desired to meet diverse requirements in different applications. This paper establishes a relationship between RIS beam pattern design and two-dimensional finite impulse response filter design and proposes…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: 5 pages, 5 figures

  26. arXiv:2204.14057  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast

    Authors: Boqing Zhu, Kele Xu, Changjian Wang, Zheng Qin, Tao Sun, Huaimin Wang, Yuxing Peng

    Abstract: We present an approach to learn voice-face representations from talking face videos, without any identity labels. Previous works employ cross-modal instance discrimination tasks to establish the correlation of voice and face. These methods neglect the semantic content of different videos, introducing false-negative pairs as training noise. Furthermore, the positive pairs are constructed based…

    Submitted 26 May, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Comments: 8 pages, 4 figures. Accepted by IJCAI-2022

  27. arXiv:2203.03863  [pdf, ps, other]

    eess.SP

    Amplitude-Constrained Constellation and Reflection Pattern Designs for Directional Backscatter Communications Using Programmable Metasurface

    Authors: Wei Wang, Bincheng Zhu, Yongming Huang, Wei Zhang

    Abstract: The large scale reflector array of programmable metasurfaces is capable of increasing the power efficiency of backscatter communications via passive beamforming and thus has the potential to revolutionize the low-data-rate nature of backscatter communications. In this paper, we propose to design the power-efficient higher-order constellation and reflection pattern under the amplitude constraint br…

    Submitted 30 March, 2023; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: Accepted in IEEE Transactions on Wireless Communications

  28. arXiv:2202.10139  [pdf, other]

    eess.AS cs.IR cs.SD

    S3T: Self-Supervised Pre-training with Swin Transformer for Music Classification

    Authors: Hang Zhao, Chen Zhang, Belei Zhu, Zejun Ma, Kejun Zhang

    Abstract: In this paper, we propose S3T, a self-supervised pre-training method with Swin Transformer for music classification, aiming to learn meaningful music representations from massive easily accessible unlabeled music data. S3T introduces a momentum-based paradigm, MoCo, with Swin Transformer as its feature extractor to the music time-frequency domain. For better music representation learning, S3T contrib…

    Submitted 21 February, 2022; originally announced February 2022.

    Comments: Accepted by ICASSP2022

  29. arXiv:2202.05267  [pdf, other]

    physics.med-ph cs.CV eess.IV

    On Real-time Image Reconstruction with Neural Networks for MRI-guided Radiotherapy

    Authors: David E. J. Waddington, Nicholas Hindley, Neha Koonjoo, Christopher Chiu, Tess Reynolds, Paul Z. Y. Liu, Bo Zhu, Danyal Bhutto, Chiara Paganelli, Paul J. Keall, Matthew S. Rosen

    Abstract: MRI-guidance techniques that dynamically adapt radiation beams to follow tumor motion in real-time will lead to more accurate cancer treatments and reduced collateral healthy tissue damage. The gold standard for reconstruction of undersampled MR data is compressed sensing (CS), which is computationally slow and limits the rate at which images can be available for real-time adaptation. Here, we demonstr…

    Submitted 18 May, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

    Comments: 12 pages, 6 figures, 1 table. v2 has a typo in eqn 1 corrected and references added to the discussion

  30. arXiv:2202.01269  [pdf, ps, other]

    cs.LG eess.SP math.ST stat.CO stat.ML

    Robust Estimation for Nonparametric Families via Generative Adversarial Networks

    Authors: Banghua Zhu, Jiantao Jiao, Michael I. Jordan

    Abstract: We provide a general framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems, which aim at estimating unknown parameters of the true distribution given adversarially corrupted samples. Prior work focuses on the problem of robust mean and covariance estimation when the true distribution lies in the family of Gaussian distributions or elliptic…

    Submitted 2 February, 2022; originally announced February 2022.

  31. arXiv:2202.00874  [pdf, other]

    cs.SD cs.AI cs.IR cs.LG eess.AS

    HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection

    Authors: Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov

    Abstract: Audio classification is an important task of mapping audio samples into their corresponding labels. Recently, the transformer model with self-attention mechanisms has been adopted in this field. However, existing audio transformers require large GPU memories and long training time, meanwhile relying on pretrained vision models to achieve high performance, which limits the model's scalability in au…

    Submitted 1 February, 2022; originally announced February 2022.

    Comments: Preprint version for ICASSP 2022, Singapore

  32. arXiv:2201.08563  [pdf, other]

    eess.SP

    Performance Analysis of Hybrid RF-Reconfigurable Intelligent Surfaces Assisted FSO Communication

    Authors: Haibo Wang, Zaichen Zhang, Bingcheng Zhu, Yidi Zhang

    Abstract: Optical reconfigurable intelligent surface (ORIS) is an emerging technology that can achieve reconfigurable optical propagation environments by precisely adjusting signal's reflection and shape through a large number of passive reflecting elements. In this paper, we investigate the performance of an ORIS-assisted dual-hop hybrid radio frequency (RF) and free space optics (FSO) communication system…

    Submitted 21 January, 2022; originally announced January 2022.

  33. arXiv:2112.07891  [pdf, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data

    Authors: Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov

    Abstract: Deep learning techniques for separating audio into different sound sources face several challenges. Standard architectures require training separate models for different types of audio sources. Although some universal separators employ a single model to target multiple sources, they have difficulty generalizing to unseen sources. In this paper, we propose a three-component pipeline to train a univ…

    Submitted 12 February, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Preprint version for Association for the Advancement of Artificial Intelligence Conference, AAAI 2022

  34. arXiv:2112.00333  [pdf, ps, other]

    eess.SY cs.LG

    Joint Cluster Head Selection and Trajectory Planning in UAV-Aided IoT Networks by Reinforcement Learning with Sequential Model

    Authors: Botao Zhu, Ebrahim Bedeer, Ha H. Nguyen, Robert Barton, Jerome Henry

    Abstract: Employing unmanned aerial vehicles (UAVs) has attracted growing interest and emerged as the state-of-the-art technology for data collection in Internet-of-Things (IoT) networks. In this paper, with the objective of minimizing the total energy consumption of the UAV-IoT system, we formulate the problem of jointly designing the UAV's trajectory and selecting cluster heads in the IoT network as a co…

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: This paper has been accepted in IEEE IoT-J

  35. Deep Instance Segmentation with Automotive Radar Detection Points

    Authors: Jianan Liu, Weiyi Xiong, Liping Bai, Yuxuan Xia, Tao Huang, Wanli Ouyang, Bing Zhu

    Abstract: Automotive radar provides reliable environmental perception in all-weather conditions at affordable cost, but it hardly supplies semantic and geometry information due to the sparsity of radar detection points. With the development of automotive radar technologies in recent years, instance segmentation becomes possible by using automotive radar. Its data contain contexts such as radar cross secti…

    Submitted 5 February, 2023; v1 submitted 4 October, 2021; originally announced October 2021.

    Comments: 11 pages, 9 figures, 3 tables, accepted by IEEE Transactions on Intelligent Vehicles

  36. arXiv:2109.14926  [pdf, other]

    math.NA eess.SP eess.SY math.OC

    A Fast Robust Numerical Continuation Solver to a Two-Dimensional Spectral Estimation Problem

    Authors: Bin Zhu, Jiahao Liu

    Abstract: This paper presents a fast algorithm to solve a spectral estimation problem for two-dimensional random fields. The latter is formulated as a convex optimization problem with the Itakura-Saito pseudodistance as the objective function subject to the constraints of moment equations. We exploit the structure of the Hessian of the dual objective function in order to make possible a fast Newton solver.…

    Submitted 30 September, 2021; originally announced September 2021.

    Comments: 13 pages, 8 figures

  37. arXiv:2109.05848  [pdf, other]

    eess.SP cs.AR

    Closed-Loop Neural Prostheses with On-Chip Intelligence: A Review and A Low-Latency Machine Learning Model for Brain State Detection

    Authors: Bingzhao Zhu, Uisub Shin, Mahsa Shoaran

    Abstract: The application of closed-loop approaches in systems neuroscience and therapeutic stimulation holds great promise for revolutionizing our understanding of the brain and for developing novel neuromodulation therapies to restore lost functions. Neural prostheses capable of multi-channel neural recording, on-site signal processing, rapid symptom detection, and closed-loop stimulation are critical to…

    Submitted 13 September, 2021; originally announced September 2021.

  38. arXiv:2109.03990  [pdf, other]

    eess.SP

    A Novel Method to Estimate the Coordinates of LEDs in Wireless Optical Positioning Systems

    Authors: Kehan Zhang, Zaichen Zhang, Bingcheng Zhu

    Abstract: Traditional visible light positioning (VLP) systems estimate receivers' coordinates based on the known light-emitting diode (LED) coordinates. However, the LED coordinates are not always known accurately. Because of the structural changes of the buildings due to temperature, humidity or material aging, even measured by highly accurate laser range finders, the LED coordinates may change unpredictab…

    Submitted 8 September, 2021; originally announced September 2021.

    Comments: 5 pages, 4 figures, conference

  39. arXiv:2109.00354  [pdf, ps, other]

    cs.IT eess.SP

    Outage Analysis and Beamwidth Optimization for Positioning-Assisted Beamforming

    Authors: Bingcheng Zhu, Zaichen Zhang, Julian Cheng

    Abstract: Conventional beamforming is based on channel estimation, which can be computationally intensive and inaccurate when the antenna array is large. In this work, we study the outage probability of positioning-assisted beamforming systems. Closed-form outage probability bounds are derived by considering positioning error, link distance and beamwidth. Based on the analytical result, we show that the bea…

    Submitted 9 April, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

  40. arXiv:2108.00354  [pdf, ps, other]

    eess.SY cs.LG

    UAV Trajectory Planning in Wireless Sensor Networks for Energy Consumption Minimization by Deep Reinforcement Learning

    Authors: Botao Zhu, Ebrahim Bedeer, Ha H. Nguyen, Robert Barton, Jerome Henry

    Abstract: Unmanned aerial vehicles (UAVs) have emerged as a promising candidate solution for data collection of large-scale wireless sensor networks (WSNs). In this paper, we investigate a UAV-aided WSN, where cluster heads (CHs) receive data from their member nodes, and a UAV is dispatched to collect data from CHs along the planned trajectory. We aim to minimize the total energy consumption of the UAV-WSN…

    Submitted 31 July, 2021; originally announced August 2021.

    Journal ref: IEEE TVT, 2021

  41. arXiv:2106.11411  [pdf, other]

    cs.SD eess.AS

    Attention-based cross-modal fusion for audio-visual voice activity detection in musical video streams

    Authors: Yuanbo Hou, Zhesong Yu, Xia Liang, Xingjian Du, Bilei Zhu, Zejun Ma, Dick Botteldooren

    Abstract: Many previous audio-visual voice-related works focus on speech, ignoring the singing voice in the growing number of musical video streams on the Internet. For processing diverse musical video data, voice activity detection is a necessary step. This paper attempts to detect the speech and singing voices of target performers in musical video streams using audiovisual information. To integrate inform…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: Accepted by INTERSPEECH 2021

  42. arXiv:2105.12151  [pdf, other]

    cs.CV eess.IV

    AutoReCon: Neural Architecture Search-based Reconstruction for Data-free Compression

    Authors: Baozhou Zhu, Peter Hofstee, Johan Peltenburg, **ho Lee, Zaid Alars

    Abstract: Data-free compression raises a new challenge because the original training dataset for a pre-trained model to be compressed is not available due to privacy or transmission issues. Thus, a common approach is to compute a reconstructed training dataset before compression. The current reconstruction methods compute the reconstructed training dataset with a generator by exploiting information from the…

    Submitted 25 May, 2021; originally announced May 2021.

  43. arXiv:2103.03612  [pdf, other]

    eess.IV cs.MM

    An Optimized H.266/VVC Software Decoder On Mobile Platform

    Authors: Yiming Li, Shan Liu, Yu Chen, Yushan Zheng, Sijia Chen, Bin Zhu, Jian Lou

    Abstract: As the successor of H.265/HEVC, the new versatile video coding standard (H.266/VVC) can provide up to 50% bitrate saving with the same subjective quality, at the cost of increased decoding complexity. To accelerate the application of the new coding standard, a real-time H.266/VVC software decoder that can support various platforms is implemented, where SIMD technologies, parallelism optimization,…

    Submitted 5 March, 2021; originally announced March 2021.

  44. arXiv:2102.08430  [pdf]

    cs.LG eess.SY

    Multi-Stage Transmission Line Flow Control Using Centralized and Decentralized Reinforcement Learning Agents

    Authors: Xiumin Shang, **** Yang, Bingquan Zhu, Lin Ye, **g Zhang, Jian** Xu, Qin Lyu, Ruisheng Diao

    Abstract: Planning future operational scenarios of bulk power systems that meet security and economic constraints typically requires intensive labor efforts in performing massive simulations. To automate this process and relieve engineers' burden, a novel multi-stage control approach is presented in this paper to train centralized and decentralized reinforcement learning agents that can automatically adjust…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: This work is accepted by NeurIPS ML4Eng workshop 2020, please refer to https://ml4eng.github.io/camera_readys/56.pdf

  45. arXiv:2012.15398  [pdf, other]

    eess.SY

    Two New Approaches to Optical IRSs: Schemes and Comparative Analysis

    Authors: Haibo Wang, Zaichen Zhang, Bingcheng Zhu, Jian Dang, Liang Wu

    Abstract: Oriented to the point-to-multipoint free space optical communication (FSO) scenarios, this paper analyzes the micro-mirror array and phased array-type optical intelligent reflecting surface (OIRS) in terms of control mode, power efficiency, and beam splitting. We build the physical models of the two types of OIRSs. Based on the models, the closed form solution of OIRSs' output power density distri…

    Submitted 30 December, 2020; originally announced December 2020.

    Comments: 26 pages, 11 figures

  46. arXiv:2012.02832  [pdf]

    eess.IV

    A software decoder implementation for H.266/VVC video coding standard

    Authors: Bin Zhu, Shan Liu, Yuan Liu, Yi Luo, **g Ye, Haiyan Xu, Ying Huang, Hualong Jiao, Xiaozhong Xu, Xianguo Zhang, Chenchen Gu

    Abstract: Versatile Video Coding Standard (H.266/VVC) was completed by Joint Video Expert Team (JVET) of ITU-T and ISO/IEC, in July 2020. This new ITU recommendation/international standard is a successor to the well-known H.265/HEVC video coding standard with roughly doubled compression efficiency, but also at the cost of an increased computational complexity. The complexity of H.266/VVC decoder processing…

    Submitted 7 December, 2020; v1 submitted 4 December, 2020; originally announced December 2020.

  47. arXiv:2010.14168  [pdf, other]

    cs.SD cs.MM eess.AS

    Rule-embedded network for audio-visual voice activity detection in live musical video streams

    Authors: Yuanbo Hou, Yi Deng, Bilei Zhu, Zejun Ma, Dick Botteldooren

    Abstract: Detecting anchor's voice in live musical streams is an important preprocessing for music and speech signal processing. Existing approaches to voice activity detection (VAD) primarily rely on audio, however, audio-based VAD is difficult to effectively focus on the target voice in noisy environments. With the help of visual information, this paper proposes a rule-embedded network to fuse the audio-v…

    Submitted 31 October, 2020; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: Submitted to ICASSP 2021

  48. arXiv:2010.14022  [pdf, other]

    cs.SD cs.LG eess.AS

    ByteCover: Cover Song Identification via Multi-Loss Training

    Authors: Xingjian Du, Zhesong Yu, Bilei Zhu, Xiaoou Chen, Zejun Ma

    Abstract: We present in this paper ByteCover, which is a new feature learning method for cover song identification (CSI). ByteCover is built based on the classical ResNet model, and two major improvements are designed to further enhance the capability of the model for CSI. In the first improvement, we introduce the integration of instance normalization (IN) and batch normalization (BN) to build IBN blocks,…

    Submitted 23 April, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

  49. arXiv:2010.13540  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Contrastive Unsupervised Learning for Audio Fingerprinting

    Authors: Zhesong Yu, Xingjian Du, Bilei Zhu, Zejun Ma

    Abstract: The rise of video-sharing platforms has attracted more and more people to shoot videos and upload them to the Internet. These videos mostly contain a carefully-edited background audio track, where serious speech change, pitch shifting and various types of audio effects may involve, and existing audio identification systems may fail to recognize the audio. To solve this problem, in this paper, we i…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: 5 pages

  50. arXiv:2007.08165  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Audio Tagging by Cross Filtering Noisy Labels

    Authors: Boqing Zhu, Kele Xu, Qiuqiang Kong, Huaimin Wang, Yuxing Peng

    Abstract: High quality labeled datasets have allowed deep learning to achieve impressive results on many sound analysis tasks. Yet, it is labor-intensive to accurately annotate large amount of audio data, and the dataset may contain noisy labels in the practical settings. Meanwhile, the deep neural networks are susceptive to those incorrect labeled data because of their outstanding memorization ability. In…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing