Showing 1–29 of 29 results for author: Cheng, B

Searching in archive eess.
  1. arXiv:2406.17257  [pdf, other]

    cs.CL cs.SD eess.AS

    Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation

    Authors: Yingting Li, Ambuj Mehrish, Bryan Chew, Bo Cheng, Soujanya Poria

    Abstract: Different languages have distinct phonetic systems and vary in their prosodic features making it challenging to develop a Text-to-Speech (TTS) model that can effectively synthesise speech in multilingual settings. Furthermore, TTS architecture needs to be both efficient enough to capture nuances in multiple languages and efficient enough to be practical for deployment. The standard approach is to…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2404.18599  [pdf, other]

    eess.IV cs.CV

    Self-supervised learning for classifying paranasal anomalies in the maxillary sinus

    Authors: Debayan Bhattacharya, Finn Behrendt, Benjamin Tobias Becker, Lennart Maack, Dirk Beyersdorff, Elina Petersen, Marvin Petersen, Bastian Cheng, Dennis Eggert, Christian Betz, Anna Sophie Hoffmann, Alexander Schlaefer

    Abstract: Purpose: Paranasal anomalies, frequently identified in routine radiological screenings, exhibit diverse morphological characteristics. Due to the diversity of anomalies, supervised learning methods require large labelled dataset exhibiting diverse anomaly morphology. Self-supervised learning (SSL) can be used to learn representations from unlabelled data. However, there are no SSL methods designed…

    Submitted 29 April, 2024; originally announced April 2024.

  3. arXiv:2404.09000  [pdf, other]

    eess.IV cs.CV cs.LG

    MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images

    Authors: Yingjie Xi, Boyuan Cheng, Jingyao Cai, Jian Jun Zhang, Xiaosong Yang

    Abstract: The human whole-body X-rays could offer a valuable reference for various applications, including medical diagnostics, digital animation modeling, and ergonomic design. The traditional method of obtaining X-ray information requires the use of CT (Computed Tomography) scan machines, which emit potentially harmful radiation. Thus it faces a significant limitation for realistic applications because it…

    Submitted 13 April, 2024; originally announced April 2024.

  4. arXiv:2404.04645  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    HyperTTS: Parameter Efficient Adaptation in Text to Speech using Hypernetworks

    Authors: Yingting Li, Rishabh Bhardwaj, Ambuj Mehrish, Bo Cheng, Soujanya Poria

    Abstract: Neural speech synthesis, or text-to-speech (TTS), aims to transform a signal from the text domain to the speech domain. While developing TTS architectures that train and test on the same set of speakers has seen significant improvements, out-of-domain speaker performance still faces enormous limitations. Domain adaptation on a new set of speakers can be achieved by fine-tuning the whole model for…

    Submitted 6 April, 2024; originally announced April 2024.

  5. arXiv:2404.00569  [pdf, other]

    cs.SD cs.CL eess.AS

    CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models

    Authors: Xiang Li, Fan Bu, Ambuj Mehrish, Yingting Li, Jiale Han, Bo Cheng, Soujanya Poria

    Abstract: Neural Text-to-Speech (TTS) systems find broad applications in voice assistants, e-learning, and audiobook creation. The pursuit of modern models, like Diffusion Models (DMs), holds promise for achieving high-fidelity, real-time speech synthesis. Yet, the efficiency of multi-step sampling in Diffusion Models presents challenges. Efforts have been made to integrate GANs with DMs, speeding up infere…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted by Findings of NAACL 2024. Code is available at https://github.com/XiangLi2022/CM-TTS

  6. arXiv:2403.00128  [pdf, other]

    cs.RO cs.LG eess.SY

    From Flies to Robots: Inverted Landing in Small Quadcopters with Dynamic Perching

    Authors: Bryan Habas, Bo Cheng

    Abstract: Inverted landing is a routine behavior among a number of animal fliers. However, mastering this feat poses a considerable challenge for robotic fliers, especially to perform dynamic perching with rapid body rotations (or flips) and landing against gravity. Inverted landing in flies have suggested that optical flow senses are closely linked to the precise triggering and control of body flips that l…

    Submitted 29 February, 2024; originally announced March 2024.

    Comments: 17 pages, 19 Figures, Journal paper currently under review

  7. arXiv:2401.05639  [pdf, other]

    eess.SY

    Full-State Prescribed Performance-Based Consensus of Double-Integrator Multi-Agent Systems with Jointly Connected Topologies

    Authors: Yahui Hou, Bin Cheng

    Abstract: This paper addresses the full-state prescribed performance-based consensus problem for double-integrator multi-agent systems with jointly connected topologies. To improve the transient performance, a distributed prescribed performance control protocol consisting of the transformed relative position and the transformed relative velocity is proposed, where the communication topology satisfies the jo…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 5 pages, 3 figures

  8. arXiv:2310.00288  [pdf]

    cs.AR cs.ET eess.SY physics.app-ph

    Parallel in-memory wireless computing

    Authors: Cong Wang, Gong-Jie Ruan, Zai-Zheng Yang, Xing-Jian Yangdong, Yixiang Li, Liang Wu, Yingmeng Ge, Yichen Zhao, Chen Pan, Wei Wei, Li-Bo Wang, Bin Cheng, Zaichen Zhang, Chuan Zhang, Shi-Jun Liang, Feng Miao

    Abstract: Parallel wireless digital communication with ultralow power consumption is critical for emerging edge technologies such as 5G and Internet of Things. However, the physical separation between digital computing units and analogue transmission units in traditional wireless technology leads to high power consumption. Here we report a parallel in-memory wireless computing scheme. The approach combines…

    Submitted 30 September, 2023; originally announced October 2023.

    Journal ref: Nat Electron 6, 381-389 (2023)

  9. arXiv:2303.17915  [pdf, other]

    eess.IV cs.CV

    Multiple Instance Ensembling For Paranasal Anomaly Classification In The Maxillary Sinus

    Authors: Debayan Bhattacharya, Finn Behrendt, Benjamin Tobias Becker, Dirk Beyersdorff, Elina Petersen, Marvin Petersen, Bastian Cheng, Dennis Eggert, Christian Betz, Anna Sophie Hoffmann, Alexander Schlaefer

    Abstract: Paranasal anomalies are commonly discovered during routine radiological screenings and can present with a wide range of morphological features. This diversity can make it difficult for convolutional neural networks (CNNs) to accurately classify these anomalies, especially when working with limited datasets. Additionally, current approaches to paranasal anomaly classification are constrained to ide…

    Submitted 31 March, 2023; originally announced March 2023.

  10. arXiv:2211.01371  [pdf, other]

    eess.IV

    Unsupervised Anomaly Detection of Paranasal Anomalies in the Maxillary Sinus

    Authors: Debayan Bhattacharya, Finn Behrendt, Benjamin Tobias Becker, Dirk Beyersdorff, Elina Petersen, Marvin Petersen, Bastian Cheng, Dennis Eggert, Christian Betz, Anna Sophie Hoffmann, Alexander Schlaefer

    Abstract: Deep learning (DL) algorithms can be used to automate paranasal anomaly detection from Magnetic Resonance Imaging (MRI). However, previous works relied on supervised learning techniques to distinguish between normal and abnormal samples. This method limits the type of anomalies that can be classified as the anomalies need to be present in the training data. Further, many data points from normal an…

    Submitted 1 November, 2022; originally announced November 2022.

  11. arXiv:2210.17313  [pdf, ps, other]

    eess.SY cs.AI math.OC

    Discrete Communication and Control Updating in Event-Triggered Consensus

    Authors: Bin Cheng, Yuezu Lv, Zhongkui Li, Zhisheng Duan

    Abstract: This paper studies the consensus control problem faced with three essential demands, namely, discrete control updating for each agent, discrete-time communications among neighboring agents, and the fully distributed fashion of the controller implementation without requiring any global information of the whole network topology. Noting that the existing related results only meeting one or two demand…

    Submitted 26 October, 2022; originally announced October 2022.

  12. arXiv:2209.11043  [pdf, other]

    cs.RO cs.LG eess.SY

    Inverted Landing in a Small Aerial Robot via Deep Reinforcement Learning for Triggering and Control of Rotational Maneuvers

    Authors: Bryan Habas, Jack W. Langelaan, Bo Cheng

    Abstract: Inverted landing in a rapid and robust manner is a challenging feat for aerial robots, especially while depending entirely on onboard sensing and computation. In spite of this, this feat is routinely performed by biological fliers such as bats, flies, and bees. Our previous work has identified a direct causal connection between a series of onboard visual cues and kinematic actions that allow for r…

    Submitted 25 April, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: 8 pages, 6 Figures, Submitted for ICRA 2023 Conference (Pending Review)

  13. arXiv:2209.01937  [pdf, other]

    eess.IV cs.CV cs.LG

    Supervised Contrastive Learning to Classify Paranasal Anomalies in the Maxillary Sinus

    Authors: Debayan Bhattacharya, Benjamin Tobias Becker, Finn Behrendt, Marcel Bengs, Dirk Beyersdorff, Dennis Eggert, Elina Petersen, Florian Jansen, Marvin Petersen, Bastian Cheng, Christian Betz, Alexander Schlaefer, Anna Sophie Hoffmann

    Abstract: Using deep learning techniques, anomalies in the paranasal sinus system can be detected automatically in MRI images and can be further analyzed and classified based on their volume, shape and other parameters like local contrast. However due to limited training data, traditional supervised learning methods often fail to generalize. Existing deep learning methods in paranasal anomaly classification…

    Submitted 5 September, 2022; originally announced September 2022.

  14. arXiv:2111.03539  [pdf, other]

    cs.RO eess.SY

    Optimal Inverted Landing in a Small Aerial Robot with Varied Approach Velocities and Landing Gear Designs

    Authors: Bryan Habas, Bader AlAttar, Brian Davis, Jack W. Langelaan, Bo Cheng

    Abstract: Inverted landing is a challenging feat to perform in aerial robots, especially without external positioning. However, it is routinely performed by biological fliers such as bees, flies, and bats. Our previous observations of landing behaviors in flies suggest an open-loop causal relationship between their putative visual cues and the kinematics of the aerial maneuvers executed. For example, the de…

    Submitted 3 March, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: 7 pages, 9 figures, Submitted to ICRA 2022 conference

  15. arXiv:2109.05540  [pdf, other]

    cs.RO eess.SY

    Encoding Distributional Soft Actor-Critic for Autonomous Driving in Multi-lane Scenarios

    Authors: Jingliang Duan, Yangang Ren, Fawang Zhang, Yang Guan, Dongjie Yu, Shengbo Eben Li, Bo Cheng, Lin Zhao

    Abstract: In this paper, we propose a new reinforcement learning (RL) algorithm, called encoding distributional soft actor-critic (E-DSAC), for decision-making in autonomous driving. Unlike existing RL-based decision-making methods, E-DSAC is suitable for situations where the number of surrounding vehicles is variable and eliminates the requirement for manually pre-designed sorting rules, resulting in highe…

    Submitted 12 September, 2021; originally announced September 2021.

  16. Fixed-Dimensional and Permutation Invariant State Representation of Autonomous Driving

    Authors: Jingliang Duan, Dongjie Yu, Shengbo Eben Li, Wenxuan Wang, Yangang Ren, Ziyu Lin, Bo Cheng

    Abstract: In this paper, we propose a new state representation method, called encoding sum and concatenation (ESC), for the state representation of decision-making in autonomous driving. Unlike existing state representation methods, ESC is applicable to a variable number of surrounding vehicles and eliminates the need for manually pre-designed sorting rules, leading to higher representation ability and gene…

    Submitted 4 March, 2022; v1 submitted 24 May, 2021; originally announced May 2021.

    Journal ref: IEEE Transactions on Intelligent Transportation Systems, 2021

  17. arXiv:2105.09683  [pdf, other]

    eess.IV cs.CV

    DPN-SENet: A self-attention mechanism neural network for detection and diagnosis of COVID-19 from chest x-ray images

    Authors: Bo Cheng, Ruhui Xue, Hang Yang, Laili Zhu, Wei Xiang

    Abstract: Background and Objective: The new type of coronavirus is also called COVID-19. It began to spread at the end of 2019 and has now spread across the world. Until October 2020, It has infected around 37 million people and claimed about 1 million lives. We propose a deep learning model that can help radiologists and clinicians use chest X-rays to diagnose COVID-19 cases and show the diagnostic feature…

    Submitted 20 May, 2021; originally announced May 2021.

    Comments: 11 pages, 7 figures

  18. arXiv:2102.11736  [pdf, other]

    eess.SY cs.AI

    Recurrent Model Predictive Control

    Authors: Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Qi Sun, Bo Cheng

    Abstract: This paper proposes an off-line algorithm, called Recurrent Model Predictive Control (RMPC), to solve general nonlinear finite-horizon optimal control problems. Unlike traditional Model Predictive Control (MPC) algorithms, it can make full use of the current computing resources and adaptively select the longest model prediction horizon. Our algorithm employs a recurrent function to approximate the…

    Submitted 23 February, 2021; originally announced February 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2102.10289

  19. Recurrent Model Predictive Control: Learning an Explicit Recurrent Controller for Nonlinear Systems

    Authors: Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Bo Cheng

    Abstract: This paper proposes an offline control algorithm, called Recurrent Model Predictive Control (RMPC), to solve large-scale nonlinear finite-horizon optimal control problems. It can be regarded as an explicit solver of traditional Model Predictive Control (MPC) algorithms, which can adaptively select appropriate model prediction horizon according to current computing resources, so as to improve the p…

    Submitted 8 April, 2022; v1 submitted 20 February, 2021; originally announced February 2021.

    Journal ref: IEEE Transactions on Industrial Electronics, 2022

  20. arXiv:2003.11216  [pdf, ps, other]

    eess.SY math.OC

    Event-Triggered Consensus of Homogeneous and Heterogeneous Multi-Agent Systems with Jointly Connected Switching Topologies

    Authors: Bin Cheng, Xiangke Wang, Zhongkui Li

    Abstract: This paper investigates the distributed event-based consensus problem of switching networks satisfying the jointly connected condition. Both the state consensus of homogeneous linear networks and output consensus of heterogeneous networks are studied. Two kinds of event-based protocols based on local sampled information are designed, without the need to solve any matrix equation or inequality. The…

    Submitted 22 September, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

    Comments: This paper is an updated version of our paper published by IEEE Trans. Cybernetics, with corrections in the proof of Theorem 1. 11 pages and 4 figures

    Journal ref: IEEE Transactions on Cybernetics, vol. 49, no. 12, pp. 4421-4430, 2019

  21. arXiv:2003.00848  [pdf, other]

    eess.SY cs.LG cs.RO stat.ML

    Mixed Reinforcement Learning with Additive Stochastic Uncertainty

    Authors: Yao Mu, Shengbo Eben Li, Chang Liu, Qi Sun, Bingbing Nie, Bo Cheng, Baiyu Peng

    Abstract: Reinforcement learning (RL) methods often rely on massive exploration data to search optimal policies, and suffer from poor sampling efficiency. This paper presents a mixed reinforcement learning (mixed RL) algorithm by simultaneously using dual representations of environmental dynamics to search the optimal policy with the purpose of improving both learning accuracy and training speed. The dual r…

    Submitted 28 February, 2020; originally announced March 2020.

  22. Hierarchical Reinforcement Learning for Self-Driving Decision-Making without Reliance on Labeled Driving Data

    Authors: Jingliang Duan, Shengbo Eben Li, Yang Guan, Qi Sun, Bo Cheng

    Abstract: Decision making for self-driving cars is usually tackled by manually encoding rules from drivers' behaviors or imitating drivers' manipulation using supervised learning techniques. Both of them rely on mass driving data to cover all possible driving scenarios. This paper presents a hierarchical reinforcement learning method for decision making of self-driving cars, which does not depend on a large…

    Submitted 27 January, 2020; originally announced January 2020.

    Journal ref: IET Intelligent Transport Systems, 2020, 14(5): 297-305

  23. arXiv:2001.02811  [pdf, other]

    cs.LG cs.AI eess.SY

    Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors

    Authors: Jingliang Duan, Yang Guan, Shengbo Eben Li, Yangang Ren, Bo Cheng

    Abstract: In reinforcement learning (RL), function approximation errors are known to easily lead to the Q-value overestimations, thus greatly reducing policy performance. This paper presents a distributional soft actor-critic (DSAC) algorithm, which is an off-policy RL method for continuous control setting, to improve the policy performance by mitigating Q-value overestimations. We first discover in theory…

    Submitted 11 June, 2021; v1 submitted 8 January, 2020; originally announced January 2020.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2021

  24. arXiv:1911.11397  [pdf, other]

    eess.SY cs.LG math.OC

    Adaptive dynamic programming for nonaffine nonlinear optimal control problem with state constraints

    Authors: Jingliang Duan, Zhengyu Liu, Shengbo Eben Li, Qi Sun, Zhenzhong Jia, Bo Cheng

    Abstract: This paper presents a constrained adaptive dynamic programming (CADP) algorithm to solve general nonlinear nonaffine optimal control problems with known dynamics. Unlike previous ADP algorithms, it can directly deal with problems with state constraints. Firstly, a constrained generalized policy iteration (CGPI) framework is developed to handle state constraints by transforming the traditional poli…

    Submitted 8 April, 2022; v1 submitted 26 November, 2019; originally announced November 2019.

    Journal ref: Neurocomputing 484 (2022) 128-141

  25. arXiv:1910.04751  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Panoptic-DeepLab

    Authors: Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

    Abstract: We present Panoptic-DeepLab, a bottom-up and single-shot approach for panoptic segmentation. Our Panoptic-DeepLab is conceptually simple and delivers state-of-the-art results. In particular, we adopt the dual-ASPP and dual-decoder structures specific to semantic, and instance segmentation, respectively. The semantic segmentation branch is the same as the typical design of any semantic segmentation…

    Submitted 23 October, 2019; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: This work is presented at ICCV 2019 Joint COCO and Mapillary Recognition Challenge Workshop

  26. arXiv:1908.10357  [pdf, other]

    cs.CV cs.LG eess.IV

    HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation

    Authors: Bowen Cheng, Bin Xiao, Jingdong Wang, Honghui Shi, Thomas S. Huang, Lei Zhang

    Abstract: Bottom-up human pose estimation methods have difficulties in predicting the correct pose for small persons due to challenges in scale variation. In this paper, we present HigherHRNet: a novel bottom-up human pose estimation method for learning scale-aware representations using high-resolution feature pyramids. Equipped with multi-resolution supervision for training and multi-resolution aggregation…

    Submitted 12 March, 2020; v1 submitted 27 August, 2019; originally announced August 2019.

    Comments: CVPR 2020

  27. arXiv:1908.02939  [pdf, other]

    eess.IV

    A neural network approach to GOP-level rate control of x265 using Lookahead

    Authors: Boya Cheng, Yuan Zhang

    Abstract: To optimize the perceived quality under a specific bitrate constraint, multi-pass encoding is usually performed with the rate control mode of the average bitrate (ABR) or the constant rate factor (CRF) to distribute bits as reasonably as possible in terms of perceived quality, leading to high computational complexity. In this paper, we propose to utilize the video information generated during the…

    Submitted 24 October, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

    Comments: 5 pages 2019PCS

  28. arXiv:1907.00112  [pdf]

    cs.CL cs.LG cs.SD eess.AS

    Leveraging Acoustic Cues and Paralinguistic Embeddings to Detect Expression from Voice

    Authors: Vikramjit Mitra, Sue Booker, Erik Marchi, David Scott Farrar, Ute Dorothea Peitz, Bridget Cheng, Ermine Teves, Anuj Mehta, Devang Naik

    Abstract: Millions of people reach out to digital assistants such as Siri every day, asking for information, making phone calls, seeking assistance, and much more. The expectation is that such assistants should understand the intent of the users query. Detecting the intent of a query from a short, isolated utterance is a difficult task. Intent cannot always be obtained from speech-recognized transcriptions.…

    Submitted 28 June, 2019; originally announced July 2019.

    Comments: 5 pages, 6 figures

  29. arXiv:1807.05326  [pdf, ps, other]

    eess.SY

    Fully Distributed Event-Triggered Protocols for Linear Multi-Agent Networks

    Authors: Bin Cheng, Zhongkui Li

    Abstract: This paper considers the distributed event-triggered consensus problem for general linear multi-agent networks. Both the leaderless and leader-follower consensus problems are considered. Based on the local sampled state or local output information, distributed adaptive event-triggered protocols are designed, which can ensure that consensus of the agents is achieved and the Zeno behavior is exclude…

    Submitted 13 July, 2018; originally announced July 2018.

    Comments: 8 pages. Accepted for publication by IEEE Transactions on Automatic Control