Showing 1–50 of 135 results for author: Yin, X

Searching in archive eess.

  1. arXiv:2406.16326  [pdf, other]

    eess.AS

    RefXVC: Cross-Lingual Voice Conversion with Enhanced Reference Leveraging

    Authors: Mingyang Zhang, Yi Zhou, Yi Ren, Chen Zhang, Xiang Yin, Haizhou Li

    Abstract: This paper proposes RefXVC, a method for cross-lingual voice conversion (XVC) that leverages reference information to improve conversion performance. Previous XVC works generally take an average speaker embedding to condition the speaker identity, which does not account for the changing timbre of speech that occurs with different pronunciations. To address this, our method uses both global and loc…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Manuscript under review by TASLP

  2. arXiv:2406.05966  [pdf, other]

    eess.SY

    Approximating arrival costs in distributed moving horizon estimation: A recursive method

    Authors: Xiaojie Li, Xunyuan Yin

    Abstract: In this paper, we present a new approach to distributed moving horizon estimation for constrained nonlinear processes. The method involves approximating the arrival costs of local estimators through a recursive framework. First, distributed full-information estimation for linear unconstrained systems is presented, which serves as the foundation for deriving the analytical expression of the arrival…

    Submitted 9 June, 2024; originally announced June 2024.

  3. arXiv:2405.12478  [pdf, other]

    eess.SY

    Efficient Economic Model Predictive Control of Water Treatment Process with Learning-based Koopman Operator

    Authors: Minghao Han, Jingshi Yao, Adrian Wing-Keung Law, Xunyuan Yin

    Abstract: Used water treatment plays a pivotal role in advancing environmental sustainability. Economic model predictive control holds the promise of enhancing the overall operational performance of the water treatment facilities. In this study, we propose a data-driven economic predictive control approach within the Koopman modeling framework. First, we propose a deep learning-enabled input-output Koopman…

    Submitted 20 May, 2024; originally announced May 2024.

  4. arXiv:2405.06999  [pdf, other]

    eess.SY

    Large Language Model-aided Edge Learning in Distribution System State Estimation

    Authors: Renyou Xie, Xin Yin, Chaojie Li, Nian Liu, Bo Zhao, Zhaoyang Dong

    Abstract: Distribution system state estimation (DSSE) plays a crucial role in the real-time monitoring, control, and operation of distribution networks. Besides intensive computational requirements, conventional DSSE methods need high-quality measurements to obtain accurate states, whereas missing values often occur due to sensor failures or communication delays. To address these challenging issues, a forec…

    Submitted 11 May, 2024; originally announced May 2024.

  5. arXiv:2405.04867  [pdf, other]

    eess.IV cs.CV

    MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

    Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, et al. (24 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: MIPI@CVPR2024. Website: https://mipi-challenge.org/MIPI2024/

  6. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  7. arXiv:2404.16223  [pdf, other]

    cs.CV eess.IV

    Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey

    Authors: Marcos V. Conde, Florin-Alexandru Vasluianu, Radu Timofte, Jianxing Zhang, Jia Li, Fan Wang, Xiaopeng Li, Zikun Liu, Hyunhee Park, Sejun Song, Changho Kim, Zhijuan Huang, Hongyuan Yu, Cheng Wan, Wending Xiang, Jiamin Lin, Hang Zhong, Qiaosong Zhang, Yue Sun, Xuanwu Yin, Kunlong Zuo, Senyan Xu, Siyuan Jiang, Zhijing Sun, Jiaying Zhu, et al. (10 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines; however, this problem is not as explored as in the RGB domain. The goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as nois…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - NTIRE Workshop

  8. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  9. arXiv:2404.06746  [pdf, other]

    eess.SY

    Data-driven parallel Koopman subsystem modeling and distributed moving horizon state estimation for large-scale nonlinear processes

    Authors: Xiaojie Li, Song Bo, Xuewen Zhang, Yan Qin, Xunyuan Yin

    Abstract: In this work, we consider a state estimation problem for large-scale nonlinear processes in the absence of first-principles process models. By exploiting process operation data, both process modeling and state estimation design are addressed within a distributed framework. By leveraging the Koopman operator concept, a parallel subsystem modeling approach is proposed to establish interactive linear…

    Submitted 10 April, 2024; originally announced April 2024.

  10. arXiv:2404.06738  [pdf, other]

    eess.SY

    Partition-based distributed extended Kalman filter for large-scale nonlinear processes with application to chemical and wastewater treatment processes

    Authors: Xiaojie Li, Adrian Wing-Keung Law, Xunyuan Yin

    Abstract: In this paper, we address a partition-based distributed state estimation problem for large-scale general nonlinear processes by proposing a Kalman-based approach. First, we formulate a linear full-information estimation design within a distributed framework as the basis for developing our approach. Second, the analytical solution to the local optimization problems associated with the formulated di…

    Submitted 10 April, 2024; originally announced April 2024.

  11. arXiv:2404.06706  [pdf, ps, other]

    eess.SY

    Iterative distributed moving horizon estimation of linear systems with penalties on both system disturbances and noise

    Authors: Xiaojie Li, Song Bo, Yan Qin, Xunyuan Yin

    Abstract: In this paper, partition-based distributed state estimation of general linear systems is considered. A distributed moving horizon state estimation scheme is developed via decomposing the entire system model into subsystem models and partitioning the global objective function of centralized moving horizon estimation (MHE) into local objective functions. The subsystem estimators of the distributed s…

    Submitted 9 April, 2024; originally announced April 2024.

  12. arXiv:2404.01468  [pdf, other]

    eess.SY math.DS stat.AP

    Performance triggered adaptive model reduction for soil moisture estimation in precision irrigation

    Authors: Sarupa Debnath, Bernard T. Agyeman, Soumya R. Sahoo, Xunyuan Yin, Jinfeng Liu

    Abstract: Accurate soil moisture information is crucial for developing precise irrigation control strategies to enhance water use efficiency. Soil moisture estimation based on limited soil moisture sensors is crucial for obtaining comprehensive soil moisture information when dealing with large-scale agricultural fields. The major challenge in soil moisture estimation lies in the high dimensionality of the s…

    Submitted 1 April, 2024; originally announced April 2024.

  13. Reduced-order Koopman modeling and predictive control of nonlinear processes

    Authors: Xuewen Zhang, Minghao Han, Xunyuan Yin

    Abstract: In this paper, we propose an efficient data-driven predictive control approach for general nonlinear processes based on a reduced-order Koopman operator. A Kalman-based sparse identification of nonlinear dynamics method is employed to select lifting functions for Koopman identification. The selected lifting functions are used to project the original nonlinear state-space into a higher-dimensional…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 29 pages, 8 figures

    Journal ref: Computers & Chemical Engineering, 2023, 179, p.108440

  14. arXiv:2403.18632  [pdf, other]

    eess.SY

    Optimal Control Synthesis of Markov Decision Processes for Efficiency with Surveillance Tasks

    Authors: Yu Chen, Xuanyuan Yin, Shaoyuan Li, Xiang Yin

    Abstract: We investigate the problem of optimal control synthesis for Markov Decision Processes (MDPs), addressing both qualitative and quantitative objectives. Specifically, we require the system to fulfill a qualitative surveillance task in the sense that a specific region of interest can be visited infinitely often with probability one. Furthermore, to quantify the performance of the system, we consider…

    Submitted 27 March, 2024; originally announced March 2024.

  15. arXiv:2403.17704   

    eess.SY cs.MA

    Prioritize Team Actions: Multi-Agent Temporal Logic Task Planning with Ordering Constraints

    Authors: Bowen Ye, Jianing Zhao, Shaoyuan Li, Xiang Yin

    Abstract: In this paper, we investigate the problem of linear temporal logic (LTL) path planning for multi-agent systems, introducing the new concept of ordering constraints. Specifically, we consider a generic objective function that is defined for the path of each individual agent. The primary objective is to find a global plan for the team of agents, ensuring they collectively meet the specified L…

    Submitted 8 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: This article is withdrawn due to errors in the methodology section, specifically concerning the insufficient explanation of the data collection process. Upon review, it's clear that the data sampling methods were not adequately described, potentially leading to misinterpretations of the results

  16. arXiv:2403.08504  [pdf, other]

    cs.CV cs.RO eess.IV

    OccFiner: Offboard Occupancy Refinement with Hybrid Propagation

    Authors: Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang

    Abstract: Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision. Previous methods, confined to onboard processing, struggle with simultaneous geometric and semantic estimation, continuity across varying viewpoints, and single-view occlusion. Our paper introduces OccFiner, a novel offboard framework designed to enhance the acc…

    Submitted 15 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  17. arXiv:2403.03390  [pdf, other]

    cs.CV cs.LG eess.IV

    Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection

    Authors: Jiajia Li, Dong Chen, Xunyuan Yin, Zhaojian Li

    Abstract: Effective weed control plays a crucial role in optimizing crop yield and enhancing agricultural product quality. However, the reliance on herbicide application not only poses a critical threat to the environment but also promotes the emergence of resistant weeds. Fortunately, recent advances in precision weed management enabled by ML and DL provide a sustainable alternative. Despite great progress…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 11 pages, 7 figures

  18. arXiv:2402.13075  [pdf, other]

    eess.SY cs.RO

    Formal Synthesis of Controllers for Safety-Critical Autonomous Systems: Developments and Challenges

    Authors: Xiang Yin, Bingzhao Gao, Xiao Yu

    Abstract: In recent years, formal methods have been extensively used in the design of autonomous systems. By employing mathematically rigorous techniques, formal methods can provide fully automated reasoning processes with provable safety guarantees for complex dynamic systems with intricate interactions between continuous dynamics and discrete logics. This paper provides a comprehensive review of formal co…

    Submitted 20 February, 2024; originally announced February 2024.

  19. arXiv:2402.05564   

    eess.SY

    A Game-Theoretical Approach for Optimal Supervisory Control of Discrete Event Systems under Energy Constraints

    Authors: Peng Lv, Shaoyuan Li, Xiang Yin

    Abstract: In this paper, we investigate the problem of optimal supervisory control for discrete event systems under energy constraints. We consider that the execution of events consumes energy and the energy can be replenished at specific reload states. When the energy level drops below zero, the system crashes. To capture the above scenario, we introduce a new model, called consumption discrete…

    Submitted 10 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: We will add richer content

  20. arXiv:2401.16700  [pdf, other]

    cs.CV cs.RO eess.IV

    Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers

    Authors: Jianbin Jiao, Xina Cheng, Weijie Chen, Xiaoting Yin, Hao Shi, Kailun Yang

    Abstract: 3D human pose estimation captures the human joint points in three-dimensional space while keeping the depth information and physical structure. That is essential for applications that require precise pose information, such as human-computer interaction, scene understanding, and rehabilitation training. Due to the challenges in data collection, mainstream datasets of 3D human pose estimation are pr…

    Submitted 25 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted to IJCNN 2024. The source code will be available at https://github.com/WUJINHUAN/3D-human-pose

  21. arXiv:2401.14907  [pdf, other]

    cs.RO cs.LG eess.SY

    Learning Local Control Barrier Functions for Safety Control of Hybrid Systems

    Authors: Shuo Yang, Yu Chen, Xiang Yin, Rahul Mangharam

    Abstract: Hybrid dynamical systems are ubiquitous as practical robotic applications often involve both continuous states and discrete switchings. Safety is a primary concern for hybrid robotic systems. Existing safety-critical control approaches for hybrid systems are either computationally inefficient, detrimental to system performance, or limited to small-scale systems. To amend these drawbacks, in this p…

    Submitted 26 January, 2024; originally announced January 2024.

  22. arXiv:2401.01972  [pdf, other]

    eess.SY

    On Approximate Opacity of Stochastic Control Systems

    Authors: Siyuan Liu, Xiang Yin, Dimos V. Dimarogonas, Majid Zamani

    Abstract: This paper investigates an important class of information-flow security property called opacity for stochastic control systems. Opacity captures whether a system's secret behavior (a subset of the system's behavior that is considered to be critical) can be kept from outside observers. Existing works on opacity for control systems only provide a binary characterization of the system's security leve…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: 16 pages, 6 figures, journal submission

  23. arXiv:2312.17444  [pdf, other]

    cs.ET eess.SP

    Reconfigurable Frequency Multipliers Based on Complementary Ferroelectric Transistors

    Authors: Haotian Xu, Jianyi Yang, Cheng Zhuo, Thomas Kämpfe, Kai Ni, Xunzhao Yin

    Abstract: Frequency multipliers, a class of essential electronic components, play a pivotal role in contemporary signal processing and communication systems. They serve as crucial building blocks for generating high-frequency signals by multiplying the frequency of an input signal. However, traditional frequency multipliers that rely on nonlinear devices often require energy- and area-consuming filtering an…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 6 pages, 8 figures, 1 table. Accepted by Design Automation and Test in Europe (DATE) 2024

  24. arXiv:2312.15946  [pdf, other]

    cs.SD cs.GR eess.AS

    EnchantDance: Unveiling the Potential of Music-Driven Dance Movement

    Authors: Bo Han, Yi Ren, Hao Peng, Teng Zhang, Zeyu Ling, Xiang Yin, Feilin Han

    Abstract: The task of music-driven dance generation involves creating coherent dance movements that correspond to the given music. While existing methods can produce physically plausible dances, they often struggle to generalize to out-of-set data. The challenge arises from three aspects: 1) the high diversity of dance movements and significant differences in the distribution of music modalities, which make…

    Submitted 26 December, 2023; originally announced December 2023.

  25. arXiv:2312.11947  [pdf, other]

    cs.CL cs.SD eess.AS

    Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling

    Authors: Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li

    Abstract: Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting. While recognising the significance of CSS task, the prior studies have not thoroughly investigated the emotional expressiveness problems due to the scarcity of emotional conversational datasets and the difficulty of stateful emotion mo…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 9 pages, 4 figures, Accepted by AAAI'2024, Code and audio samples: https://github.com/walker-hyf/ECSS

  26. arXiv:2312.05764  [pdf, other]

    eess.SY

    Synthesis of Temporally-Robust Policies for Signal Temporal Logic Tasks using Reinforcement Learning

    Authors: Siqi Wang, Shaoyuan Li, Li Yin, Xiang Yin

    Abstract: This paper investigates the problem of designing control policies that satisfy high-level specifications described by signal temporal logic (STL) in unknown, stochastic environments. While many existing works concentrate on optimizing the spatial robustness of a system, our work takes a step further by also considering temporal robustness as a critical metric to quantify the tolerance of time unce…

    Submitted 23 March, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

    Comments: Accepted to ICRA 2024

  27. arXiv:2312.04242  [pdf, other]

    eess.SY

    Signal Temporal Logic Control Synthesis among Uncontrollable Dynamic Agents with Conformal Prediction

    Authors: Xinyi Yu, Yiqi Zhao, Xiang Yin, Lars Lindemann

    Abstract: The control of dynamical systems under temporal logic specifications among uncontrollable dynamic agents is challenging due to the agents' a-priori unknown behavior. Existing works have considered the problem where either all agents are controllable, the agent models are deterministic and known, or no safety guarantees are provided. We propose a predictive control synthesis framework that guarante…

    Submitted 7 December, 2023; originally announced December 2023.

  28. arXiv:2312.01403  [pdf, other]

    eess.SY

    OplixNet: Towards Area-Efficient Optical Split-Complex Networks with Real-to-Complex Data Assignment and Knowledge Distillation

    Authors: Ruidi Qiu, Amro Eldebiky, Grace Li Zhang, Xunzhao Yin, Cheng Zhuo, Ulf Schlichtmann, Bing Li

    Abstract: Having the potential for high speed, high throughput, and low energy cost, optical neural networks (ONNs) have emerged as a promising candidate for accelerating deep learning tasks. In conventional ONNs, light amplitudes are modulated at the input and detected at the output. However, the light phases are still ignored in conventional structures, although they can also carry information for computi…

    Submitted 15 December, 2023; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: Accepted by Design Automation and Test in Europe (DATE) 2024

  29. arXiv:2311.15531  [pdf, other]

    eess.SY

    Sleep When Everything Looks Fine: Self-Triggered Monitoring for Signal Temporal Logic Tasks

    Authors: Chuwei Wang, Xinyi Yu, Jianing Zhao, Lars Lindemann, Xiang Yin

    Abstract: Online monitoring is a widely used technique in assessing if the performance of the system satisfies some desired requirements during run-time operation. Existing works on online monitoring usually assume that the monitor can acquire system information periodically at each time instant. However, such a periodic mechanism may be unnecessarily energy-consuming as it essentially requires to turn on s…

    Submitted 26 November, 2023; originally announced November 2023.

  30. arXiv:2311.12770  [pdf, other]

    eess.IV cs.CV

    Swift Parameter-free Attention Network for Efficient Super-Resolution

    Authors: Cheng Wan, Hongyuan Yu, Zhiqi Li, Yihang Chen, Yajun Zou, Yuqing Liu, Xuanwu Yin, Kunlong Zuo

    Abstract: Single Image Super-Resolution (SISR) is a crucial task in low-level computer vision, aiming to reconstruct high-resolution images from low-resolution counterparts. Conventional attention mechanisms have significantly improved SISR performance but often result in complex network structures and large number of parameters, leading to slow inference speed and large model size. To address this issue, w…

    Submitted 12 May, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: NTIRE2024 ESR winner

  31. arXiv:2311.04591  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    Rethinking Event-based Human Pose Estimation with 3D Event Representations

    Authors: Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni, Kailun Yang, Kaiwei Wang

    Abstract: Human pose estimation is a fundamental and appealing task in computer vision. Traditional frame-based cameras and videos are commonly applied, yet, they become less reliable in scenarios under high dynamic range or heavy motion blur. In contrast, event cameras offer a robust solution for navigating these challenging contexts. Predominant methodologies incorporate event cameras into learning framew…

    Submitted 1 December, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Extended version of arXiv:2206.04511. The code and dataset are available at https://github.com/MasterHow/EventPointPose

  32. arXiv:2310.06553  [pdf, other]

    eess.SY

    Safe-by-Construction Autonomous Vehicle Overtaking using Control Barrier Functions and Model Predictive Control

    Authors: Dingran Yuan, Xinyi Yu, Shaoyuan Li, Xiang Yin

    Abstract: Ensuring safety for vehicle overtaking systems is one of the most fundamental and challenging tasks in autonomous driving. This task is particularly intricate when the vehicle must not only overtake its front vehicle safely but also consider the presence of potential opposing vehicles in the opposite lane that it will temporarily occupy. In order to tackle the overtaking task in such challenging s…

    Submitted 10 October, 2023; originally announced October 2023.

  33. arXiv:2310.03750  [pdf]

    eess.SP cond-mat.mtrl-sci cs.LG physics.app-ph

    Health diagnosis and recuperation of aged Li-ion batteries with data analytics and equivalent circuit modeling

    Authors: Riko I Made, Jing Lin, Jintao Zhang, Yu Zhang, Lionel C. H. Moh, Zhaolin Liu, Ning Ding, Sing Yang Chiam, Edwin Khoo, Xuesong Yin, Guangyuan Wesley Zheng

    Abstract: Battery health assessment and recuperation play a crucial role in the utilization of second-life Li-ion batteries. However, due to ambiguous aging mechanisms and lack of correlations between the recovery effects and operational states, it is challenging to accurately estimate battery health and devise a clear strategy for cell rejuvenation. This paper presents aging and reconditioning experiments…

    Submitted 21 September, 2023; originally announced October 2023.

    Comments: 20 pages, 5 figures, 1 table

    Journal ref: iScience (2024)

  34. arXiv:2310.00919  [pdf, other]

    eess.IV cs.CV cs.LG

    BAAF: A Benchmark Attention Adaptive Framework for Medical Ultrasound Image Segmentation Tasks

    Authors: Gongping Chen, Lei Zhao, Xiaotao Yin, Liang Cui, Jianxun Zhang, Yu Dai

    Abstract: AI-based assisted diagnosis programs have been widely investigated on medical ultrasound images. The complex scenario of ultrasound images, in which the coupled interference of internal and external factors is severe, poses a unique challenge for localizing the object region automatically and precisely in ultrasound images. In this study, we seek to propose a more general and robust Benchmark Attent…

    Submitted 2 October, 2023; originally announced October 2023.

  35. arXiv:2309.14050  [pdf, other]

    eess.SY

    NNgTL: Neural Network Guided Optimal Temporal Logic Task Planning for Mobile Robots

    Authors: Ruijia Liu, Shaoyuan Li, Xiang Yin

    Abstract: In this work, we investigate task planning for mobile robots under linear temporal logic (LTL) specifications. This problem is particularly challenging when robots navigate in continuous workspaces due to the high computational complexity involved. Sampling-based methods have emerged as a promising avenue for addressing this challenge by incrementally constructing random trees, thereby sidesteppin…

    Submitted 25 September, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: submitted

  36. arXiv:2309.12191  [pdf, other]

    eess.SP physics.ins-det

    Exploring the Correlation Between Ultrasound Speed and the State of Health of LiFePO$_4$ Prismatic Cells

    Authors: Shengyuan Zhang, Peng Zuo, Xuesong Yin, Zheng Fan

    Abstract: Electric vehicles (EVs) have become a popular mode of transportation, with their performance depending on the ageing of the Li-ion batteries used to power them. However, it can be challenging and time-consuming to determine the capacity retention of a battery in service. A rapid and reliable testing method for state of health (SoH) determination is desired. Ultrasonic testing techniques are promis…

    Submitted 24 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

  37. arXiv:2308.02345  [pdf, other]

    eess.SY

    Communication-Efficient Decentralized Multi-Agent Reinforcement Learning for Cooperative Adaptive Cruise Control

    Authors: Dong Chen, Kaixiang Zhang, Yongqiang Wang, Xunyuan Yin, Zhaojian Li, Dimitar Filev

    Abstract: Connected and autonomous vehicles (CAVs) promise next-gen transportation systems with enhanced safety, energy efficiency, and sustainability. One typical control strategy for CAVs is the so-called cooperative adaptive cruise control (CACC) where vehicles drive in platoons and cooperate to achieve safe and efficient transportation. In this study, we formulate CACC as a multi-agent reinforcement lea…

    Submitted 18 February, 2024; v1 submitted 4 August, 2023; originally announced August 2023.

    Comments: 14 pages, 11 figures

  38. arXiv:2307.12855  [pdf, other]

    eess.SY

    Efficient STL Control Synthesis under Asynchronous Temporal Robustness Constraints

    Authors: Xinyi Yu, Xiang Yin, Lars Lindemann

    Abstract: In time-critical systems, such as air traffic control systems, it is crucial to design control policies that are robust to timing uncertainty. Recently, the notion of Asynchronous Temporal Robustness (ATR) was proposed to capture the robustness of a system trajectory against individual time shifts in its sub-trajectories. In a multi-robot system, this may correspond to individual robots being dela…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: This paper was accepted to CDC2023

  39. arXiv:2307.08268  [pdf, other]

    eess.IV cs.CV

    Liver Tumor Screening and Diagnosis in CT with Pixel-Lesion-Patient Network

    Authors: Ke Yan, Xiaoli Yin, Yingda Xia, Fakai Wang, Shu Wang, Yuan Gao, Jiawen Yao, Chunli Li, Xiaoyu Bai, Jingren Zhou, Ling Zhang, Le Lu, Yu Shi

    Abstract: Liver tumor segmentation and classification are important tasks in computer aided diagnosis. We aim to address three problems: liver tumor screening and preliminary diagnosis in non-contrast computed tomography (CT), and differential diagnosis in dynamic contrast-enhanced CT. A novel framework named Pixel-Lesion-pAtient Network (PLAN) is proposed. It uses a mask transformer to jointly segment and…

    Submitted 21 October, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: MICCAI 2023, code: https://github.com/alibaba-damo-academy/pixel-lesion-patient-network

  40. arXiv:2307.07218  [pdf, other]

    eess.AS cs.SD

    Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis

    Authors: Ziyue Jiang, Jinglin Liu, Yi Ren, Jinzheng He, Zhenhui Ye, Shengpeng Ji, Qian Yang, Chen Zhang, Pengfei Wei, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao

    Abstract: Zero-shot text-to-speech (TTS) aims to synthesize voices with unseen speech prompts, which significantly reduces the data and computation requirements for voice cloning by skipping the fine-tuning process. However, the prompting mechanisms of zero-shot TTS still face challenges in the following aspects: 1) previous works of zero-shot TTS are typically trained with single-sentence prompts, which si…

    Submitted 10 April, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: Accepted by ICLR 2024

  41. arXiv:2307.05033  [pdf, other]

    cs.CV cs.RO eess.IV

    Towards Anytime Optical Flow Estimation with Event Cameras

    Authors: Yaozu Ye, Hao Shi, Kailun Yang, Ze Wang, Xiaoting Yin, Yining Lin, Mao Liu, Yaonan Wang, Kaiwei Wang

    Abstract: Optical flow estimation is a fundamental task in the field of autonomous driving. Event cameras are capable of responding to log-brightness changes in microseconds. Its characteristic of producing responses only to the changing region is particularly suitable for optical flow estimation. In contrast to the super low-latency response speed of event cameras, existing datasets collected via event cam…

    Submitted 19 October, 2023; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: Code will be available at https://github.com/Yaozhuwa/EVA-Flow

  42. arXiv:2306.15304  [pdf, other]

    eess.AS cs.SD

    GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech

    Authors: Yahuan Cong, Haoyu Zhang, Haopeng Lin, Shichao Liu, Chunfeng Wang, Yi Ren, Xiang Yin, Zejun Ma

    Abstract: Cross-lingual timbre and style generalizable text-to-speech (TTS) aims to synthesize speech with a specific reference timbre or style that is never trained in the target language. It encounters the following challenges: 1) timbre and pronunciation are correlated since multilingual speech of a specific speaker is usually hard to obtain; 2) style and pronunciation are mixed because the speech style…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted by INTERSPEECH 2023

  43. arXiv:2306.13307  [pdf, other]

    eess.AS cs.CL

    Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems

    Authors: Mingyu Cui, Jiawen Kang, Jiajun Deng, Xi Yin, Yutao Xie, Xie Chen, Xunying Liu

    Abstract: Current ASR systems are mainly trained and evaluated at the utterance level. Long range cross utterance context can be incorporated. A key task is to derive a suitable compact representation of the most relevant history contexts. In contrast to previous researches based on either LSTM-RNN encoded histories that attenuate the information from longer range contexts, or frame level concatenation of t…

    Submitted 25 June, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: Accepted by INTERSPEECH 2023

  44. arXiv:2306.08219  [pdf, other]

    cs.IR cs.SD eess.AS

    Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects

    Authors: Xinghua Qu, Hongyang Liu, Zhu Sun, Xiang Yin, Yew Soon Ong, Lu Lu, Zejun Ma

    Abstract: Conversational recommender systems (CRSs) have become crucial emerging research topics in the field of RSs, thanks to their natural advantages of explicitly acquiring user preferences via interactive conversations and revealing the reasons behind recommendations. However, the majority of current CRSs are text-based, which is less user-friendly and may pose challenges for certain users, such as tho…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: Accepted by SIGIR 2023 Resource Track

  45. arXiv:2306.04830  [pdf, other]

    eess.SY

    Extended Neighboring Extremal Optimal Control with State and Preview Perturbations

    Authors: Amin Vahidi-Moghaddam, Kaixiang Zhang, Zhaojian Li, Xunyuan Yin, Ziyou Song, Yan Wang

    Abstract: Optimal control schemes have achieved remarkable performance in numerous engineering applications. However, they typically require high computational cost, which has limited their use in real-world engineering systems with fast dynamics and/or limited computation power. To address this challenge, Neighboring Extremal (NE) has been developed as an efficient optimal adaption strategy to adapt a pre-…

    Submitted 7 June, 2023; originally announced June 2023.

  46. arXiv:2306.03509  [pdf, other]

    eess.AS cs.AI cs.SD

    Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

    Authors: Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao

    Abstract: Scaling text-to-speech to a large and wild dataset has been proven to be highly effective in achieving timbre and speech style generalization, particularly in zero-shot TTS. However, previous works usually encode speech into latent using audio codec and use autoregressive language models or diffusion models to generate it, which ignores the intrinsic nature of speech and may lead to inferior or un…

    Submitted 6 June, 2023; originally announced June 2023.

  47. arXiv:2306.03504  [pdf, other]

    cs.CV cs.SD eess.AS

    Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis

    Authors: Zhenhui Ye, Ziyue Jiang, Yi Ren, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao

    Abstract: We are interested in a novel task, namely low-resource text-to-talking avatar. Given only a few-minute-long talking person video with the audio track as the training data and arbitrary texts as the driving input, we aim to synthesize high-quality talking portrait videos corresponding to the input text. This task has broad application prospects in the digital human industry but has not been technic…

    Submitted 2 August, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by ICML 2023 Workshop, 6 pages, 3 figures

  48. arXiv:2305.18474  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

    Authors: Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao

    Abstract: Large diffusion models have been successful in text-to-audio (T2A) synthesis tasks, but they often suffer from common issues such as semantic misalignment and poor temporal consistency due to limited natural language understanding and data scarcity. Additionally, 2D spatial structures widely used in T2A works lead to unsatisfactory audio quality when generating variable-length audio samples since…

    Submitted 29 May, 2023; originally announced May 2023.

  49. arXiv:2305.17732  [pdf, other]

    cs.SD eess.AS

    StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation

    Authors: Kun Song, Yi Ren, Yi Lei, Chunfeng Wang, Kun Wei, Lei Xie, Xiang Yin, Zejun Ma

    Abstract: Direct speech-to-speech translation (S2ST) has gradually become popular as it has many advantages compared with cascade S2ST. However, current research mainly focuses on the accuracy of semantic translation and ignores the speech style transfer from a source language to a target language. The lack of high-fidelity expressive parallel data makes such style transfer challenging, especially in more p…

    Submitted 25 July, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted to Interspeech 2023

  50. arXiv:2305.16342  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition

    Authors: Zhi-Hao Lai, Tian-Hao Zhang, Qi Liu, Xinyuan Qian, Li-Fang Wei, Song-Lu Chen, Feng Chen, Xu-Cheng Yin

    Abstract: The local and global features are both essential for automatic speech recognition (ASR). Many recent methods have verified that simply combining local and global features can further promote ASR performance. However, these methods pay less attention to the interaction of local and global features, and their series architectures are rigid to reflect local and global relationships. To address these…

    Submitted 29 May, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023