Showing 1–50 of 131 results for author: Li, A

Searching in archive eess.
  1. arXiv:2406.16870  [pdf, other]

    eess.SY

    Robust Optimal Lane-changing Control for Connected Autonomous Vehicles in Mixed Traffic

    Authors: Anni Li, Andres S. Chavez Armijos, Christos G. Cassandras

    Abstract: We derive time and energy-optimal policies for a Connected Autonomous Vehicle (CAV) to execute lane change maneuvers in mixed traffic, i.e., in the presence of both CAVs and Human Driven Vehicles (HDVs). These policies are also shown to be robust with respect to the unpredictable behavior of HDVs by exploiting CAV cooperation which can eliminate or greatly reduce the interaction between CAVs and H…

    Submitted 15 March, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2303.16948

  2. arXiv:2406.11175  [pdf, other]

    cs.SD eess.AS

    SMRU: Split-and-Merge Recurrent-based UNet for Acoustic Echo Cancellation and Noise Suppression

    Authors: Zhihang Sun, Andong Li, Rilin Chen, Hao Zhang, Meng Yu, Yi Zhou, Dong Yu

    Abstract: The proliferation of deep neural networks has spawned the rapid development of acoustic echo cancellation and noise suppression, and plenty of prior arts have been proposed, which yield promising performance. Nevertheless, they rarely consider the deployment generality in different processing scenarios, such as edge devices, and cloud processing. To this end, this paper proposes a general model, t…

    Submitted 16 June, 2024; originally announced June 2024.

  3. arXiv:2406.00758  [pdf, other]

    eess.IV cs.CV cs.MM

    Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption

    Authors: Anqi Li, Yuxi Liu, Huihui Bai, Feng Li, Runmin Cong, Meng Wang, Yao Zhao

    Abstract: Although recent generative image compression methods have demonstrated impressive potential in optimizing the rate-distortion-perception trade-off, they still face the critical challenge of flexible rate adaption to diverse compression necessities and scenarios. To overcome this challenge, this paper proposes a Controllable Generative Image Compression framework, Control-GIC, the first capable of…

    Submitted 5 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  4. arXiv:2405.17594  [pdf, other]

    eess.SY

    Towards Achieving Cooperation Compliance of Human Drivers in Mixed Traffic

    Authors: Anni Li, Christos G. Cassandras

    Abstract: We consider a mixed-traffic environment in transportation systems, where Connected and Automated Vehicles (CAVs) coexist with potentially non-cooperative Human-Driven Vehicles (HDVs). We develop a cooperation compliance control framework to incentivize HDVs to align their behavior with socially optimal objectives using a "refundable toll" scheme so as to achieve a desired compliance probability…

    Submitted 27 May, 2024; originally announced May 2024.

  5. arXiv:2405.16446  [pdf, ps, other]

    eess.SP

    A New Solution for MU-MISO Symbol-Level Precoding: Extrapolation and Deep Unfolding

    Authors: Mu Liang, Ang Li, Xiaoyan Hu, Christos Masouros

    Abstract: Constructive interference (CI) precoding, which converts the harmful multi-user interference into beneficial signals, is a promising and efficient interference management scheme in multi-antenna communication systems. However, CI-based symbol-level precoding (SLP) experiences high computational complexity as the number of symbol slots increases within a transmission block, rendering it unaffordabl…

    Submitted 26 May, 2024; originally announced May 2024.

  6. arXiv:2405.04167  [pdf, other]

    cs.CV eess.IV

    Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment

    Authors: Aobo Li, Jinjian Wu, Yongxu Liu, Leida Li

    Abstract: The annotation of blind image quality assessment (BIQA) is labor-intensive and time-consuming, especially for authentic images. Training on synthetic data is expected to be beneficial, but synthetically trained models often suffer from poor generalization in real domains due to domain gaps. In this work, we make a key observation that introducing more distortion types in the synthetic dataset may…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR2024

  7. arXiv:2404.15364  [pdf, other]

    eess.SP cs.AI cs.CV cs.LG

    MP-DPD: Low-Complexity Mixed-Precision Neural Networks for Energy-Efficient Digital Predistortion of Wideband Power Amplifiers

    Authors: Yizhuo Wu, Ang Li, Mohammadreza Beikmirza, Gagan Deep Singh, Qinyu Chen, Leo C. N. de Vreede, Morteza Alavi, Chang Gao

    Abstract: Digital Pre-Distortion (DPD) enhances signal quality in wideband RF power amplifiers (PAs). As signal bandwidths expand in modern radio systems, DPD's energy consumption increasingly impacts overall system efficiency. Deep Neural Networks (DNNs) offer promising advancements in DPD, yet their high complexity hinders their practical deployment. This paper introduces open-source mixed-precision (MP)…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted to IEEE Microwave and Wireless Technology Letters (MWTL)

  8. arXiv:2403.09096  [pdf, other]

    eess.IV cs.CV

    Deep unfolding Network for Hyperspectral Image Super-Resolution with Automatic Exposure Correction

    Authors: Yuan Fang, Yipeng Liu, Jie Chen, Zhen Long, Ao Li, Chong-Yung Chi, Ce Zhu

    Abstract: In recent years, the fusion of high spatial resolution multispectral image (HR-MSI) and low spatial resolution hyperspectral image (LR-HSI) has been recognized as an effective method for HSI super-resolution (HSI-SR). However, both HSI and MSI may be acquired under extreme conditions such as night or poorly illuminating scenarios, which may cause different exposure levels, thereby seriously downgr…

    Submitted 14 March, 2024; originally announced March 2024.

  9. arXiv:2402.15944  [pdf, other]

    cs.IT eess.SP

    On A Class of Greedy Sparse Recovery Algorithms -- A High Dimensional Approach

    Authors: Gang Li, Qiuwei Li, Shuang Li, Wu Angela Li

    Abstract: Sparse signal recovery deals with finding the sparsest solution of an under-determined linear system $x = Qs$. In this paper, we propose a novel greedy approach to addressing the challenges from such a problem. Such an approach is based on a characterization of solutions to the system, which allows us to work on the sparse recovery in the $s$-space directly with a given measure. With $l_2$-based me…

    Submitted 24 February, 2024; originally announced February 2024.

  10. arXiv:2402.04882  [pdf, other]

    cs.NE cs.AI cs.LG cs.SD eess.AS

    LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units

    Authors: Zeyu Liu, Gourav Datta, Anni Li, Peter Anthony Beerel

    Abstract: Transformer models have demonstrated high accuracy in numerous applications but have high complexity and lack sequential processing capability making them ill-suited for many streaming applications at the edge where devices are heavily resource-constrained. Thus motivated, many researchers have proposed reformulating the transformer models as RNN modules which modify the self-attention computation…

    Submitted 19 January, 2024; originally announced February 2024.

    Comments: The 12th International Conference on Learning Representations (ICLR 2024)

  11. arXiv:2402.03710  [pdf, other]

    eess.AS cs.CL cs.SD

    Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience

    Authors: Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani

    Abstract: In daily life, we encounter a variety of sounds, both desirable and undesirable, with limited control over their presence and volume. Our work introduces "Listen, Chat, and Edit" (LCE), a novel multimodal sound mixture editor that modifies each sound source in a mixture based on user-provided text instructions. LCE distinguishes itself with a user-friendly chat interface and its unique ability to…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: preprint

  12. arXiv:2401.00166  [pdf, ps, other]

    cs.IT eess.SP

    Block-Level MU-MISO Interference Exploitation Precoding: Optimal Structure and Explicit Duality

    Authors: Junwen Yang, Ang Li, Xuewen Liao, Christos Masouros, A. L. Swindlehurst

    Abstract: This paper investigates block-level interference exploitation (IE) precoding for multi-user multiple-input single-output (MU-MISO) downlink systems. To overcome the need for symbol-level IE precoding to frequently update the precoding matrix, we propose to jointly optimize all the precoders or transmit signals within a transmission block. The resultant precoders only need to be updated once per bl…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: Submitted to IEEE

  13. Patient-Adaptive and Learned MRI Data Undersampling Using Neighborhood Clustering

    Authors: Siddhant Gautam, Angqi Li, Saiprasad Ravishankar

    Abstract: There has been much recent interest in adapting undersampled trajectories in MRI based on training data. In this work, we propose a novel patient-adaptive MRI sampling algorithm based on grouping scans within a training set. Scan-adaptive sampling patterns are optimized together with an image reconstruction network for the training scans. The training optimization alternates between determining th…

    Submitted 31 March, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

  14. arXiv:2311.16456  [pdf, other]

    cs.CV eess.IV

    Spiking Neural Networks with Dynamic Time Steps for Vision Transformers

    Authors: Gourav Datta, Zeyu Liu, Anni Li, Peter A. Beerel

    Abstract: Spiking Neural Networks (SNNs) have emerged as a popular spatio-temporal computing paradigm for complex vision tasks. Recently proposed SNN training algorithms have significantly reduced the number of time steps (down to 1) for improved latency and energy efficiency, however, they target only convolutional neural networks (CNN). These algorithms, when applied on the recently spotlighted vision tra…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Under review

  15. arXiv:2310.05021  [pdf, other]

    eess.SY

    Toward Intelligent Emergency Control for Large-scale Power Systems: Convergence of Learning, Physics, Computing and Control

    Authors: Qiuhua Huang, Renke Huang, Tianzhixi Yin, Sohom Datta, Xueqing Sun, Jason Hou, Jie Tan, Wenhao Yu, Yuan Liu, Xinya Li, Bruce Palmer, Ang Li, Xinda Ke, Marianna Vaiman, Song Wang, Yousu Chen

    Abstract: This paper has delved into the pressing need for intelligent emergency control in large-scale power systems, which are experiencing significant transformations and are operating closer to their limits with more uncertainties. Learning-based control methods are promising and have shown effectiveness for intelligent power system control. However, when they are applied to large-scale power systems, t…

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: submitted to PSCC 2024

  16. arXiv:2310.00534  [pdf, other]

    eess.SY

    Safe Optimal Interactions Between Automated and Human-Driven Vehicles in Mixed Traffic with Event-triggered Control Barrier Functions

    Authors: Anni Li, Christos G. Cassandras, Wei Xiao

    Abstract: This paper studies safe driving interactions between Human-Driven Vehicles (HDVs) and Connected and Automated Vehicles (CAVs) in mixed traffic where the dynamics and control policies of HDVs are unknown and hard to predict. In order to address this challenge, we employ event-triggered Control Barrier Functions (CBFs) to estimate the HDV model online, construct data-driven and state-feedback safety…

    Submitted 30 September, 2023; originally announced October 2023.

  17. arXiv:2309.15938  [pdf, other]

    eess.AS cs.LG cs.SD

    Exploring Self-Supervised Contrastive Learning of Spatial Sound Event Representation

    Authors: Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani

    Abstract: In this study, we present a simple multi-channel framework for contrastive learning (MC-SimCLR) to encode 'what' and 'where' of spatial audios. MC-SimCLR learns joint spectral and spatial representations from unlabeled spatial audios, thereby enhancing both event classification and sound localization in downstream tasks. At its core, we propose a multi-level data augmentation pipeline that augment…

    Submitted 27 September, 2023; originally announced September 2023.

  18. arXiv:2309.14324  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Towards General-Purpose Text-Instruction-Guided Voice Conversion

    Authors: Chun-Yi Kuan, Chen An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-yiin Chang, Hung-yi Lee

    Abstract: This paper introduces a novel voice conversion (VC) model, guided by text instructions such as "articulate slowly with a deep tone" or "speak in a cheerful boyish voice". Unlike traditional methods that rely on reference utterances to determine the attributes of the converted speech, our model adds versatility and specificity to voice conversion. The proposed VC model is a neural codec language mo…

    Submitted 16 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted to ASRU 2023

  19. arXiv:2309.10753  [pdf, ps, other]

    eess.SY

    Generalized Cactus and Structural Controllability of Switched Linear Continuous-Time Systems

    Authors: Yuan Zhang, Yuanqing Xia, Aming Li

    Abstract: This paper explores the structural controllability of switched linear continuous-time systems. It first identifies a gap in the proof for a pivotal criterion for the structural controllability of switched linear systems in the literature. To address this void, we develop novel graph-theoretic concepts, such as multi-layer dynamic graphs, generalized stems/buds, and generalized cacti, and based on…

    Submitted 22 May, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: Under review in IEEE TAC; fixed some typos

  20. arXiv:2309.09493  [pdf, other]

    eess.AS cs.AI cs.SD

    HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

    Authors: Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani

    Abstract: Recent advancements in speech synthesis have leveraged GAN-based networks like HiFi-GAN and BigVGAN to produce high-fidelity waveforms from mel-spectrograms. However, these networks are computationally expensive and parameter-heavy. iSTFTNet addresses these limitations by integrating inverse short-time Fourier transform (iSTFT) into the network, achieving both speed and parameter efficiency. In th…

    Submitted 18 September, 2023; originally announced September 2023.

  21. arXiv:2308.12749  [pdf, other]

    cs.IT eess.SP

    Block-Level Interference Exploitation Precoding for MU-MISO: An ADMM Approach

    Authors: Yiran Wang, Yunsi Wen, Ang Li, Xiaoyan Hu, Christos Masouros

    Abstract: We study constructive interference based block-level precoding (CI-BLP) in the downlink of multi-user multiple-input single-output (MU-MISO) systems. Specifically, our aim is to extend the analysis on CI-BLP to the case where the considered number of symbol slots is smaller than that of the users. To this end, we mathematically prove the feasibility of using the pseudo-inverse to obtain the optima…

    Submitted 30 August, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

  22. arXiv:2308.11636  [pdf, other]

    eess.SP cs.AI cs.DC cs.LG cs.NE

    Aggregating Intrinsic Information to Enhance BCI Performance through Federated Learning

    Authors: Rui Liu, Yuanyuan Chen, Anran Li, Yi Ding, Han Yu, Cuntai Guan

    Abstract: Insufficient data is a long-standing challenge for Brain-Computer Interface (BCI) to build a high-performance deep learning model. Though numerous research groups and institutes collect a multitude of EEG datasets for the same BCI task, sharing EEG data from multiple sites is still challenging due to the heterogeneity of devices. The significance of this challenge cannot be overstated, given the c…

    Submitted 14 August, 2023; originally announced August 2023.

  23. arXiv:2307.14797  [pdf, other]

    eess.SP

    Symbol-Level Precoding for MU-MIMO System with RIRC Receiver

    Authors: Xiao Tong, Ang Li, Lei Lei, Fan Liu, Fuwang Dong

    Abstract: Consider a multiuser multiple-input multiple-output (MU-MIMO) downlink system in which the base station (BS) sends multiple data streams to multi-antenna users via symbol-level precoding (SLP), where the optimization of receive combining matrix becomes crucial, unlike in the single-antenna user scenario. We begin by introducing a joint optimization problem on the symbol-level transmit precoder and…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: 13 pages, 10 figures

  24. arXiv:2307.09435  [pdf, other]

    eess.AS cs.AI cs.SD

    SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs

    Authors: Yinghao Aaron Li, Cong Han, Nima Mesgarani

    Abstract: In recent years, large-scale pre-trained speech language models (SLMs) have demonstrated remarkable advancements in various generative speech modeling applications, such as text-to-speech synthesis, voice conversion, and speech enhancement. These applications typically involve mapping text or speech inputs to pre-trained SLM representations, from which target speech is decoded. This paper introduc…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: WASPAA 2023

  25. arXiv:2307.00535  [pdf, other]

    cs.IT eess.SP

    Goal-oriented Tensor: Beyond Age of Information Towards Semantics-Empowered Goal-Oriented Communications

    Authors: Aimin Li, Shaohua Wu, Sumei Sun, Jie Cao

    Abstract: Optimizations premised on open-loop metrics such as Age of Information (AoI) indirectly enhance the system's decision-making utility. We therefore propose a novel closed-loop metric named Goal-oriented Tensor (GoT) to directly quantify the impact of semantic mismatches on goal-oriented decision-making utility. Leveraging the GoT, we consider a sampler & decision-maker pair that works collaborative…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: 30 pages, 9 figures. arXiv admin note: substantial text overlap with arXiv:2305.04083

  26. arXiv:2306.15561  [pdf, other]

    cs.CV cs.MM eess.IV

    You Can Mask More For Extremely Low-Bitrate Image Compression

    Authors: Anqi Li, Feng Li, Jiaxin Han, Huihui Bai, Runmin Cong, Chunjie Zhang, Meng Wang, Weisi Lin, Yao Zhao

    Abstract: Learned image compression (LIC) methods have experienced significant progress during recent years. However, these methods are primarily dedicated to optimizing the rate-distortion (R-D) performance at medium and high bitrates (> 0.1 bits per pixel (bpp)), while research on extremely low bitrates is limited. Besides, existing methods fail to explicitly explore the image structure and texture compon…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: Under review

  27. arXiv:2306.14509  [pdf, ps, other]

    eess.SP

    Faster-Than-Nyquist Symbol-Level Precoding for Wideband Integrated Sensing and Communications

    Authors: Zihan Liao, Fan Liu, Ang Li, Christos Masouros

    Abstract: In this paper, we present an innovative symbol-level precoding (SLP) approach for a wideband multi-user multi-input multi-output (MU-MIMO) downlink Integrated Sensing and Communications (ISAC) system employing faster-than-Nyquist (FTN) signaling. Our proposed technique minimizes the minimum mean squared error (MMSE) for the sensed parameter estimation while ensuring the communication per-user qual…

    Submitted 26 June, 2023; originally announced June 2023.

  28. arXiv:2306.08454  [pdf, other]

    cs.SD eess.AS

    Gesper: A Restoration-Enhancement Framework for General Speech Reconstruction

    Authors: Wenzhe Liu, Yupeng Shi, Jun Chen, Wei Rao, Shulin He, Andong Li, Yannan Wang, Zhiyong Wu

    Abstract: This paper describes a real-time General Speech Reconstruction (Gesper) system submitted to the ICASSP 2023 Speech Signal Improvement (SSI) Challenge. This novel proposed system is a two-stage architecture, in which the speech restoration is performed, and then cascaded by speech enhancement. We propose a complex spectral mapping-based generative adversarial network (CSM-GAN) as the speech restora…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: Accepted by InterSpeech 2023

  29. arXiv:2306.07691  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

    Authors: Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani

    Abstract: In this paper, we present StyleTTS 2, a text-to-speech (TTS) model that leverages style diffusion and adversarial training with large speech language models (SLMs) to achieve human-level TTS synthesis. StyleTTS 2 differs from its predecessor by modeling styles as a latent random variable through diffusion models to generate the most suitable style for the text without requiring reference speech, a…

    Submitted 19 November, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023

  30. arXiv:2306.02251  [pdf]

    cs.SD eess.AS

    Effects of Tonal Coarticulation and Prosodic Positions on Tonal Contours of Low Rising Tones: In the Case of Xiamen Dialect

    Authors: Yiying Hu, Hui Feng, Qinghua Zhao, Aijun Li

    Abstract: Few studies have worked on the effects of tonal coarticulation and prosodic positions on the low rising tone in Xiamen Dialect. This study addressed such an issue. To do so, a new method, the Tonal Contour Analysis in Tonal Triangle, was proposed to measure the subtle curvature of the tonal contour. Findings are as follows: (1) The low rising tone in Xiamen Dialect has a tendency towards the falli…

    Submitted 3 June, 2023; originally announced June 2023.

    Comments: To be published in InterSpeech 2023

  31. DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes

    Authors: Xilin Jiang, Yinghao Aaron Li, Nima Mesgarani

    Abstract: Lifelong audio feature extraction involves learning new sound classes incrementally, which is essential for adapting to new data distributions over time. However, optimizing the model only on new data can lead to catastrophic forgetting of previously learned tasks, which undermines the model's ability to perform well over the long term. This paper introduces a new approach to continual audio repre…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: INTERSPEECH 2023

    Journal ref: Proc. INTERSPEECH 2023, pp.2818--2822

  32. arXiv:2305.17883  [pdf, other]

    eess.SY

    Maximizing Safety and Efficiency for Cooperative Lane-Changing: A Minimally Disruptive Approach

    Authors: Andres S. Chavez Armijos, Anni Li, Christos G. Cassandras

    Abstract: This paper addresses cooperative lane-changing maneuvers in mixed traffic, aiming to minimize traffic flow disruptions while accounting for uncooperative vehicles. The proposed approach adopts controllers combining Optimal control with Control Barrier Functions (OCBF controllers) which guarantee spatio-temporal constraints through the use of fixed-time convergence. Additionally, we introduce robus…

    Submitted 30 May, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

  33. arXiv:2305.10953  [pdf, other]

    eess.SY

    Detecting the driver nodes of temporal networks

    Authors: Tingting Qin, Gaopeng Duan, Aming Li

    Abstract: Detecting the driver nodes of complex networks has garnered significant attention recently to control complex systems to desired behaviors, where nodes represent system components and edges encode their interactions. Driver nodes, which are directly controlled by external inputs, play a crucial role in controlling all network nodes. While many approaches have been proposed to identify driver nodes…

    Submitted 18 May, 2023; originally announced May 2023.

  34. arXiv:2305.09328  [pdf, ps, other]

    cs.IT eess.SP

    Performance Analysis of NOMA-RIS aided Integrated Navigation and Communication (INAC) Networks

    Authors: Tianwei Hou, Anna Li

    Abstract: Satellite communication constitutes a promising solution for the sixth generation (6G) wireless networks in terms of providing global communication services. In order to provide a cost-effective satellite network, we propose a novel medium-earth-orbit (MEO) satellite aided integrated-navigation-and-communication (INAC) network. To overcome the severe path loss of MEO satellites, we conceive a netw…

    Submitted 16 May, 2023; originally announced May 2023.

  35. arXiv:2305.04083  [pdf, other]

    cs.IT eess.SP

    Goal-oriented Tensor: Beyond AoI Towards Semantics-Empowered Goal-oriented Communications

    Authors: Aimin Li, Shaohua Wu, Sumei Sun

    Abstract: The intricate interplay of source dynamics, unreliable channels, and staleness of information has long been recognized as a significant impediment for the receiver to achieve accurate, timely, and most importantly, goal-oriented decision making. Thus, a plethora of promising metrics, such as Age of Information, Value of Information, and Mean Square Error, have emerged to quantify these underlying…

    Submitted 9 May, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

    Comments: 6 pages, 4 figures, submitted to 2023 Globecom

  36. arXiv:2304.07813  [pdf, other]

    cs.IT eess.SP

    Deep Reinforcement Learning-Assisted Age-optimal Transmission Policy for HARQ-aided NOMA Networks

    Authors: Kunpeng Liu, Aimin Li, Shaohua Wu

    Abstract: The recent interweaving of AI-6G technologies has sparked extensive research interest in further enhancing reliable and timely communications. Age of Information (AoI), as a novel and integrated metric implying the intricate trade-offs among reliability, latency, and update frequency, has been well-researched since its conception. This paper contributes new results in this area by employing…

    Submitted 16 April, 2023; originally announced April 2023.

  37. arXiv:2304.04142  [pdf]

    q-bio.QM cs.CV eess.IV

    Slideflow: Deep Learning for Digital Histopathology with Real-Time Whole-Slide Visualization

    Authors: James M. Dolezal, Sara Kochanny, Emma Dyer, Andrew Srisuwananukorn, Matteo Sacco, Frederick M. Howard, Anran Li, Prajval Mohan, Alexander T. Pearson

    Abstract: Deep learning methods have emerged as powerful tools for analyzing histopathological images, but current methods are often specialized for specific domains and software environments, and few open-source options exist for deploying models in an interactive interface. Experimenting with different deep learning approaches typically requires switching software libraries and reprocessing data, reducing…

    Submitted 8 April, 2023; originally announced April 2023.

  38. arXiv:2303.16948  [pdf, other]

    eess.SY

    Cooperative Lane Changing in Mixed Traffic can be Robust to Human Driver Behavior

    Authors: Anni Li, Andres S. Chavez Armijos, Christos G. Cassandras

    Abstract: We derive time and energy-optimal control policies for a Connected Autonomous Vehicle (CAV) to complete lane change maneuvers in mixed traffic. The interaction between CAVs and Human-Driven Vehicles (HDVs) requires designing the best possible response of a CAV to actions by its neighboring HDVs. This interaction is formulated using a bilevel optimization setting with an appropriate behavioral mode…

    Submitted 29 March, 2023; originally announced March 2023.

  39. arXiv:2303.05991  [pdf, other]

    math.OC cs.MA eess.SY

    Minimally Disruptive Cooperative Lane-change Maneuvers

    Authors: Behdad Chalaki, Vaishnav Tadiparthi, Hossein Nourkhiz Mahjoub, Jovin D'sa, Ehsan Moradi-Pari, Andres S. Chavez Armijos, Anni Li, Christos G. Cassandras

    Abstract: A lane-change maneuver on a congested highway could be severely disruptive or even infeasible without the cooperation of neighboring cars. However, cooperation with other vehicles does not guarantee that the performed maneuver will not have a negative impact on traffic flow unless it is explicitly considered in the cooperative controller design. In this letter, we present a socially compliant fram…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: 6 pages, 2 figures

    Journal ref: IEEE Control Systems Letters, vol. 7, pp. 1766-1771, 2023

  40. arXiv:2303.04432  [pdf, ps, other]

    eess.SP

    Deep Learning-Based Channel Extrapolation for Pattern Reconfigurable Massive MIMO

    Authors: Mu Liang, Ang Li

    Abstract: Reconfigurable antennas that can dynamically change their operation state exhibit excellent adaptivity and flexibility over traditional antennas, and MIMO arrays that consist of multifunctional and reconfigurable antennas (MRAs) are foreseen as one promising solution towards future Holographic MIMO. Specifically, in pattern reconfigurable MIMO (PR-MIMO) communication systems, accurate acquisition…

    Submitted 6 April, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  41. arXiv:2302.05756  [pdf, other]

    eess.AS cs.SD eess.SP

    Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation

    Authors: Cong Han, Vishal Choudhari, Yinghao Aaron Li, Nima Mesgarani

    Abstract: Auditory attention decoding (AAD) is a technique used to identify and amplify the talker that a listener is focused on in a noisy environment. This is done by comparing the listener's brainwaves to a representation of all the sound sources to find the closest match. The representation is typically the waveform or spectrogram of the sounds. The effectiveness of these representations for AAD is unce…

    Submitted 11 February, 2023; originally announced February 2023.

  42. arXiv:2301.08810  [pdf, other]

    cs.CL cs.SD eess.AS

    Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions

    Authors: Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani

    Abstract: Large-scale pre-trained language models have been shown to be helpful in improving the naturalness of text-to-speech (TTS) models by enabling them to produce more naturalistic prosodic patterns. However, these models are usually word-level or sup-phoneme-level and jointly trained with phonemes, making them inefficient for the downstream TTS task where only phonemes are needed. In this work, we pro…

    Submitted 20 January, 2023; originally announced January 2023.

  43. arXiv:2301.01940  [pdf, other]

    eess.IV cs.CV

    Enabling Augmented Segmentation and Registration in Ultrasound-Guided Spinal Surgery via Realistic Ultrasound Synthesis from Diagnostic CT Volume

    Authors: Ang Li, Jiayi Han, Yongjian Zhao, Keyu Li, Li Liu

    Abstract: This paper aims to tackle the issues on unavailable or insufficient clinical US data and meaningful annotation to enable bone segmentation and registration for US-guided spinal surgery. While the US is not a standard paradigm for spinal surgery, the scarcity of intra-operative clinical US data is an insurmountable bottleneck in training a neural network. Moreover, due to the characteristics of US…

    Submitted 5 January, 2023; originally announced January 2023.

    Comments: Submitted to IEEE Transactions on Automation Science and Engineering. Copyright may be transferred without notice, after which this version may no longer be accessible. Note that the abstract is shorter than that in the pdf file due to character limitations

  44. arXiv:2212.14227  [pdf, other]

    eess.AS cs.SD

    StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models

    Authors: Yinghao Aaron Li, Cong Han, Nima Mesgarani

    Abstract: One-shot voice conversion (VC) aims to convert speech from any source speaker to an arbitrary target speaker with only a few seconds of reference speech from the target speaker. This relies heavily on disentangling the speaker's identity and speech content, a task that still remains challenging. Here, we propose a novel approach to learning disentangled speech representation by transfer learning f…

    Submitted 29 December, 2022; originally announced December 2022.

    Comments: SLT 2022

  45. arXiv:2211.16764  [pdf, other]

    cs.SD eess.AS

    A General Unfolding Speech Enhancement Method Motivated by Taylor's Theorem

    Authors: Andong Li, Guochen Yu, Chengshi Zheng, Wenzhe Liu, Xiaodong Li

    Abstract: While deep neural networks have facilitated significant advancements in the field of speech enhancement, most existing methods are developed following either empirical or relatively blind criteria, lacking adequate guidelines in pipeline design. Inspired by Taylor's theorem, we propose a general unfolding framework for both single- and multi-channel speech enhancement tasks. Concretely, we formula…

    Submitted 28 March, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Submitted to TASLP, revised version, 17 pages

  46. arXiv:2211.14818  [pdf, other]

    cs.IT eess.SP

    Speeding-up Symbol-Level Precoding Using Separable and Dual Optimizations

    Authors: Junwen Yang, Ang Li, Xuewen Liao, Christos Masouros

    Abstract: Symbol-level precoding (SLP) manipulates the transmitted signals to accurately exploit the multi-user interference (MUI) in the multi-user downlink. This enables that all the resultant interference contributes to correct detection, which is the so-called constructive interference (CI). Its performance superiority comes at the cost of solving a nonlinear optimization problem on a symbol-by-symbol b…

    Submitted 27 November, 2022; originally announced November 2022.

    Comments: 30 pages, 11 figures

  47. arXiv:2211.12024  [pdf, other]

    cs.SD eess.AS

    TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective

    Authors: Andong Li, Guochen Yu, Wenzhe Liu, Xiaodong Li, Chengshi Zheng

    Abstract: Despite the promising performance of existing frame-wise all-neural beamformers in the speech enhancement field, it remains unclear what the underlying mechanism is. In this paper, we revisit the beamforming behavior from the beam-space dictionary perspective and formulate it into the learning and mixing of different beam-space components. Based on that, we propose an all-neural beamformer cal…

    Submitted 30 November, 2022; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: In submission to ICASSP 2023, 5 pages

  48. arXiv:2211.08636  [pdf, other]

    cs.RO eess.SY

    Cooperative Energy and Time-Optimal Lane Change Maneuvers with Minimal Highway Traffic Disruption

    Authors: Andres S. Chavez Armijos, Anni Li, Christos G. Cassandras, Yasir K. Al-Nadawi, Hidekazu Araki, Behdad Chalaki, Ehsan Moradi-Pari, Hossein Nourkhiz Mahjoub, Vaishnav Tadiparthi

    Abstract: We derive optimal control policies for a Connected Automated Vehicle (CAV) and cooperating neighboring CAVs to carry out a lane change maneuver consisting of a longitudinal phase where the CAV properly positions itself relative to the cooperating neighbors and a lateral phase where it safely changes lanes. In contrast to prior work on this problem, where the CAV "selfishly" only seeks to minimize…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2203.17102

  49. arXiv:2211.05910  [pdf, other]

    eess.IV cs.CV

    Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, Jinwoo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li, et al. (71 additional authors not shown)

    Abstract: Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.07825, arXiv:2105.08826, arXiv:2211.04470, arXiv:2211.03885, arXiv:2211.05256

  50. arXiv:2211.01661  [pdf, other]

    cs.DS eess.SY math.OC

    Pairing optimization via statistics: Algebraic structure in pairing problems and its application to performance enhancement

    Authors: Naoki Fujita, André Röhm, Takatomo Mihana, Ryoichi Horisaki, Aohan Li, Mikio Hasegawa, Makoto Naruse

    Abstract: Fully pairing all elements of a set while attempting to maximize the total benefit is a combinatorically difficult problem. Such pairing problems naturally appear in various situations in science, technology, economics, and other fields. In our previous study, we proposed an efficient method to infer the underlying compatibilities among the entities, under the constraint that only the total compat…

    Submitted 3 November, 2022; originally announced November 2022.