Showing 1–50 of 57 results for author: Cheng, S

Searching in archive eess.
  1. arXiv:2406.15160  [pdf, other]

    eess.AS eess.SP

    Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios

    Authors: Ya Jiang, Qing Wang, Jun Du, Maocheng Hu, Pengfei Hu, Zeyan Liu, Shi Cheng, Zhaoxu Nian, Yuxuan Dong, Mingqi Cai, Xin Fang, Chin-Hui Lee

    Abstract: This study presents an audio-visual information fusion approach to sound event localization and detection (SELD) in low-resource scenarios. We aim at utilizing audio and video modality information through cross-modal learning and multi-modal fusion. First, we propose a cross-modal teacher-student learning (TSL) framework to transfer information from an audio-only teacher model, trained on a rich c… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by ICME 2024

  2. arXiv:2406.02653  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Pancreatic Tumor Segmentation as Anomaly Detection in CT Images Using Denoising Diffusion Models

    Authors: Reza Babaei, Samuel Cheng, Theresa Thai, Shangqing Zhao

    Abstract: Despite the advances in medicine, cancer has remained a formidable challenge. Particularly in the case of pancreatic tumors, characterized by their diversity and late diagnosis, early detection poses a significant challenge crucial for effective treatment. The advancement of deep learning techniques, particularly supervised algorithms, has significantly propelled pancreatic tumor detection in the… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  3. arXiv:2405.10145  [pdf]

    eess.SY

    Deep Koopman Operator-Informed Safety Command Governor for Autonomous Vehicles

    Authors: Hao Chen, Xiangkun He, Shuo Cheng, Chen Lv

    Abstract: Modeling of nonlinear behaviors with physical-based models poses challenges. However, Koopman operator maps the original nonlinear system into an infinite-dimensional linear space to achieve global linearization of the nonlinear system through input and output data, which derives an absolute equivalent linear representation of the original state space. Due to the impossibility of implementing the… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

  4. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such… ▽ More

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  5. arXiv:2404.08326  [pdf, other]

    eess.SY

    Quaternion-Based Attitude Stabilization Using Synergistic Hybrid Feedback With Minimal Potential Functions

    Authors: Xin Tong, Qingpeng Ding, Haiyang Fang, Shing Shin Cheng

    Abstract: This paper investigates the robust global attitude stabilization problem for a rigid-body system using quaternion-based feedback. We propose a novel synergistic hybrid feedback with the following notable features: (1) It demonstrates central synergism by utilizing a minimal number of potential functions; (2) It ensures consistency with respect to the unit quaternion representation of rigid-body at… ▽ More

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 14 pages, 6 figures, extended version of a paper accepted for publication in Automatica

  6. arXiv:2401.15508  [pdf, other]

    cs.RO cs.LG eess.SY

    Proto-MPC: An Encoder-Prototype-Decoder Approach for Quadrotor Control in Challenging Winds

    Authors: Yuliang Gu, Sheng Cheng, Naira Hovakimyan

    Abstract: Quadrotors are increasingly used in the evolving field of aerial robotics for their agility and mechanical simplicity. However, inherent uncertainties, such as aerodynamic effects coupled with quadrotors' operation in dynamically changing environments, pose significant challenges for traditional, nominal model-based control designs. We propose a multi-task meta-learning method called Encoder-Proto… ▽ More

    Submitted 21 May, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  7. arXiv:2312.15190  [pdf, other]

    cs.SD cs.AI cs.CR eess.AS

    SAIC: Integration of Speech Anonymization and Identity Classification

    Authors: Ming Cheng, Xingjian Diao, Shitong Cheng, Wenjun Liu

    Abstract: Speech anonymization and de-identification have garnered significant attention recently, especially in the healthcare area including telehealth consultations, patient voiceprint matching, and patient real-time monitoring. Speaker identity classification tasks, which involve recognizing specific speakers from audio to learn identity features, are crucial for de-identification. Since rare studies ha… ▽ More

    Submitted 23 December, 2023; originally announced December 2023.

  8. arXiv:2312.13585  [pdf, other]

    cs.CL cs.SD eess.AS

    Speech Translation with Large Language Models: An Industrial Practice

    Authors: Zhichao Huang, Rong Ye, Tom Ko, Qianqian Dong, Shanbo Cheng, Mingxuan Wang, Hang Li

    Abstract: Given the great success of large language models (LLMs) across various tasks, in this paper, we introduce LLM-ST, a novel and effective speech translation model constructed upon a pre-trained LLM. By integrating the large language model (LLM) with a speech encoder and employing multi-task instruction tuning, LLM-ST can produce accurate timestamped transcriptions and translations, even from long au… ▽ More

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Technical report. 13 pages. Demo: https://speechtranslation.github.io/llm-st/

  9. arXiv:2312.12744  [pdf]

    cs.HC cs.LG eess.SP

    3D-CLMI: A Motor Imagery EEG Classification Model via Fusion of 3D-CNN and LSTM with Attention

    Authors: Shiwei Cheng, Yuejiang Hao

    Abstract: Due to the limitations in the accuracy and robustness of current electroencephalogram (EEG) classification algorithms, applying motor imagery (MI) for practical Brain-Computer Interface (BCI) applications remains challenging. This paper proposed a model that combined a three-dimensional convolutional neural network (CNN) with a long short-term memory (LSTM) network with attention to classify MI-EE… ▽ More

    Submitted 19 December, 2023; originally announced December 2023.

  10. arXiv:2312.11384  [pdf, ps, other]

    cs.RO eess.SY

    DiffTune-MPC: Closed-Loop Learning for Model Predictive Control

    Authors: Ran Tao, Sheng Cheng, Xiaofeng Wang, Shenlong Wang, Naira Hovakimyan

    Abstract: Model predictive control (MPC) has been applied to many platforms in robotics and autonomous systems for its capability to predict a system's future behavior while incorporating constraints that a system may have. To enhance the performance of a system with an MPC controller, one can manually tune the MPC's cost function. However, it can be challenging due to the possibly high dimension of the par… ▽ More

    Submitted 30 March, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: The first two authors contributed equally to this work

  11. Synergistic Perception and Control Simplex for Verifiable Safe Vertical Landing

    Authors: Ayoosh Bansal, Yang Zhao, James Zhu, Sheng Cheng, Yuliang Gu, Hyung-Jin Yoon, Hunmin Kim, Naira Hovakimyan, Lui Sha

    Abstract: Perception, Planning, and Control form the essential components of autonomy in advanced air mobility. This work advances the holistic integration of these components to enhance the performance and robustness of the complete cyber-physical system. We adapt Perception Simplex, a system for verifiable collision avoidance amidst obstacle detection faults, to the vertical landing maneuver for autonomou… ▽ More

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: To appear in AIAA SciTech 2024

    ACM Class: C.3; C.4; J.7

    Journal ref: AIAA SCITECH 2024 Forum, p. 1167

  12. arXiv:2309.07925  [pdf, other]

    eess.AS cs.AI cs.MM cs.SD

    Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023

    Authors: Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng

    Abstract: In this paper, we propose a novel framework for recognizing both discrete and dimensional emotions. In our framework, deep features extracted from foundation models are used as robust acoustic and visual representations of raw video. Three different structures based on attention-guided feature gathering (AFG) are designed for deep feature fusion. Then, we introduce a joint decoding structure for e… ▽ More

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: 5 pages, 4 figures

    Journal ref: The 31st ACM International Conference on Multimedia (MM'23), 2023

  13. arXiv:2309.07145  [pdf, other]

    eess.SP cs.AI cs.LG

    ETP: Learning Transferable ECG Representations via ECG-Text Pre-training

    Authors: Che Liu, Zhongwei Wan, Sibo Cheng, Mi Zhang, Rossella Arcucci

    Abstract: In the domain of cardiovascular healthcare, the Electrocardiogram (ECG) serves as a critical, non-invasive diagnostic tool. Although recent strides in self-supervised learning (SSL) have been promising for ECG representation learning, these techniques often require annotated samples and struggle with classes not present in the fine-tuning stages. To address these limitations, we introduce ECG-Text… ▽ More

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: under review

  14. arXiv:2309.04960  [pdf, other]

    eess.IV cs.CV

    SdCT-GAN: Reconstructing CT from Biplanar X-Rays with Self-driven Generative Adversarial Networks

    Authors: Shuangqin Cheng, Qingliang Chen, Qiyi Zhang, Ming Li, Yamuhanmode Alike, Kaile Su, Pengcheng Wen

    Abstract: Computed Tomography (CT) is a medical imaging modality that can generate more informative 3D images than 2D X-rays. However, this advantage comes at the expense of more radiation exposure, higher costs, and longer acquisition time. Hence, the reconstruction of 3D CT images using a limited number of 2D X-rays has gained significant importance as an economical alternative. Nevertheless, existing met… ▽ More

    Submitted 10 September, 2023; originally announced September 2023.

  15. arXiv:2308.02429  [pdf, other]

    physics.optics eess.SP

    Nonconvex optimization for optimum retrieval of the transmission matrix of a multimode fiber

    Authors: Shengfu Cheng, Xuyu Zhang, Tianting Zhong, Huanhao Li, Haoran Li, Lei Gong, Honglin Liu, Puxiang Lai

    Abstract: Transmission matrix (TM) allows light control through complex media such as multimode fibers (MMFs), gaining great attention in areas like biophotonics over the past decade. The measurement of a complex-valued TM is highly desired as it supports full modulation of the light field, yet demanding as the holographic setup is usually entailed. Efforts have been taken to retrieve a TM directly from int… ▽ More

    Submitted 2 August, 2023; originally announced August 2023.

  16. Global Stabilization of Antipodal Points on n-Sphere with Application to Attitude Tracking

    Authors: Xin Tong, Shing Shin Cheng

    Abstract: Existing approaches to robust global asymptotic stabilization of a pair of antipodal points on unit $n$-sphere $\mathbb{S}^n$ typically involve the non-centrally synergistic hybrid controllers for attitude tracking on unit quaternion space. However, when switching faults occur due to parameter errors, the non-centrally synergistic property can lead to the unwinding problem or in some cases, destab… ▽ More

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 8 pages

  17. arXiv:2305.06511  [pdf, other]

    eess.IV cs.CV

    ParamNet: A Parameter-variable Network for Fast Stain Normalization

    Authors: Hongtao Kang, Die Luo, Li Chen, Junbo Hu, Shenghua Cheng, Tingwei Quan, Shaoqun Zeng, Xiuli Liu

    Abstract: In practice, digital pathology images are often affected by various factors, resulting in very large differences in color and brightness. Stain normalization can effectively reduce the differences in color and brightness of digital pathology images, thus improving the performance of computer-aided diagnostic systems. Conventional stain normalization methods rely on one or several reference images,… ▽ More

    Submitted 10 May, 2023; originally announced May 2023.

  18. arXiv:2305.02596  [pdf]

    eess.SY

    A Soft Coordination Method of Heterogeneous Devices in Distribution System Voltage Control

    Authors: Licheng Wang, Tao Wang, Gang Huang, Ruifeng Yan, Kai Wang, Youbing Zhang, Shijie Cheng

    Abstract: With the continuous increase of photovoltaic (PV) penetration, the voltage control interactions between newly installed PV inverters and previously deployed on-load tap-changer (OLTC) transformers become ever more significant. To achieve coordinated voltage regulation, current methods often rely on a decision-making algorithm to fully take over the control of all devices, requiring OLTC to give up… ▽ More

    Submitted 4 May, 2023; originally announced May 2023.

  19. Synergistic Potential Functions from Single Modified Trace Function on SO(3)

    Authors: Xin Tong, Shing Shin Cheng

    Abstract: This paper is about the construction of a family of centrally synergistic potential functions from a single modified trace function on SO(3). First, we demonstrate that it is possible to complete the construction through angular warping with multiple directions, particularly effective in the unresolved cases in the literature. Second, it can be shown that for each potential function in the family,… ▽ More

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: Extended version of the paper accepted for publication in Automatica

  20. arXiv:2303.13819  [pdf, other]

    eess.SY

    Verification of $L_1$ Adaptive Control using Verse Library: A Case Study of Quadrotors

    Authors: Lin Song, Yangge Li, Sheng Cheng, Pan Zhao, Sayan Mitra, Naira Hovakimyan

    Abstract: $L_1$ adaptive control ($L_1$AC) is a control design technique that can handle a broad class of system uncertainties and provide transient performance guarantees. In this work-in-progress abstract, we discuss how existing formal verification tools can be applied to check the performance of $L_1$AC systems. We show that the theoretical transient performance and robustness guarantees of an $L_1… ▽ More

    Submitted 24 March, 2023; originally announced March 2023.

    Comments: accepted to ICCPS-wip 2023

  21. arXiv:2303.10160  [pdf, other]

    eess.AS cs.LG cs.SD

    Visual Information Matters for ASR Error Correction

    Authors: Vanya Bannihatti Kumar, Shanbo Cheng, Ningxin Peng, Yuchen Zhang

    Abstract: Aiming to improve the Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error correction (EC) techniques have been widely developed due to their efficiency in using parallel text data. Previous works mainly focus on using text and/or speech data, which hinders the performance gain when not only text and speech information, but other modalities, such as visual information… ▽ More

    Submitted 26 May, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted at ICASSP 2023

  22. arXiv:2303.06207  [pdf, other]

    cs.CV eess.IV

    A New Super-Resolution Measurement of Perceptual Quality and Fidelity

    Authors: Sheng Cheng

    Abstract: Super-resolution results are usually measured by full-reference image quality metrics or human rating scores. However, these evaluation methods are general image quality measurement, and do not account for the nature of the super-resolution problem. In this work, we analyze the evaluation problem based on the one-to-many mapping nature of super-resolution, and propose a novel distribution-based me… ▽ More

    Submitted 10 March, 2023; originally announced March 2023.

  23. arXiv:2302.07341  [pdf, other]

    eess.SY

    Cooperative Perception for Safe Control of Autonomous Vehicles under LiDAR Spoofing Attacks

    Authors: Hongchao Zhang, Zhouchi Li, Shiyu Cheng, Andrew Clark

    Abstract: Autonomous vehicles rely on LiDAR sensors to detect obstacles such as pedestrians, other vehicles, and fixed infrastructures. LiDAR spoofing attacks have been demonstrated that either create erroneous obstacles or prevent detection of real obstacles, resulting in unsafe driving behaviors. In this paper, we propose an approach to detect and mitigate LiDAR spoofing attacks by leveraging LiDAR scan d… ▽ More

    Submitted 14 February, 2023; originally announced February 2023.

  24. arXiv:2302.07208  [pdf, other]

    eess.SY cs.RO

    $\mathcal{L}_1$Quad: $\mathcal{L}_1$ Adaptive Augmentation of Geometric Control for Agile Quadrotors with Performance Guarantees

    Authors: Zhuohuan Wu, Sheng Cheng, Pan Zhao, Aditya Gahlawat, Kasey A. Ackerman, Arun Lakshmanan, Chengyu Yang, Jiahao Yu, Naira Hovakimyan

    Abstract: Quadrotors that can operate safely in the presence of imperfect model knowledge and external disturbances are crucial in safety-critical applications. We present L1Quad, a control architecture for quadrotors based on the L1 adaptive control. L1Quad enables safe tubes centered around a desired trajectory that the quadrotor is always guaranteed to remain inside. Our design applies to both the rotati… ▽ More

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: The first two authors contributed equally to this work

  25. arXiv:2301.10171  [pdf, other]

    cs.LG cs.AI eess.SP

    Spectral Cross-Domain Neural Network with Soft-adaptive Threshold Spectral Enhancement

    Authors: Che Liu, Sibo Cheng, Weiping Ding, Rossella Arcucci

    Abstract: Electrocardiography (ECG) signals can be considered as multi-variable time-series. The state-of-the-art ECG data classification approaches, based on either feature engineering or deep learning techniques, treat separately spectral and time domains in machine learning systems. No spectral-time domain communication mechanism inside the classifier model can be found in current approaches, leading to… ▽ More

    Submitted 9 November, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

  26. arXiv:2211.15902  [pdf, ps, other]

    cs.RO eess.SY

    Simultaneous Spatial and Temporal Assignment for Fast UAV Trajectory Optimization using Bilevel Optimization

    Authors: Qianzhong Chen, Sheng Cheng, Naira Hovakimyan

    Abstract: In this paper, we propose a framework for fast trajectory planning for unmanned aerial vehicles (UAVs). Our framework is reformulated from an existing bilevel optimization, in which the lower-level problem solves for the optimal trajectory with a fixed time allocation, whereas the upper-level problem updates the time allocation using analytical gradients. The lower-level problem incorporates the s… ▽ More

    Submitted 13 April, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: accepted by IEEE RA-L

  27. arXiv:2210.10879  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS

    G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR

    Authors: Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park

    Abstract: Data augmentation is a ubiquitous technique used to provide robustness to automatic speech recognition (ASR) training. However, even as so much of the ASR training process has become automated and more "end-to-end", the data augmentation policy (what augmentation functions to use, and how to apply them) remains hand-crafted. We present Graph-Augment, a technique to define the augmentation space as… ▽ More

    Submitted 24 October, 2022; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: 6 pages, accepted at SLT 2022. Updated with copyright

  28. arXiv:2209.10024  [pdf, other]

    cs.RO eess.SY

    Geometric Tracking Control of Omnidirectional Multirotors in the Presence of Rotor Dynamics

    Authors: Hyungyu Lee, Sheng Cheng, Zhuohuan Wu, Naira Hovakimyan

    Abstract: An omnidirectional multirotor has the advantageous maneuverability of decoupled translational and rotational motions, drastically superseding the traditional multirotors' motion capability. Such maneuverability requires an omnidirectional multirotor to frequently alter the thrust amplitude and even direction, which is prone to the rotors' settling time induced from the rotors' own dynamics. Furthe… ▽ More

    Submitted 20 September, 2022; originally announced September 2022.

  29. arXiv:2208.05944  [pdf, ps, other]

    eess.SY

    Barrier Certificate based Safe Control for LiDAR-based Systems under Sensor Faults and Attacks

    Authors: Hongchao Zhang, Shiyu Cheng, Luyao Niu, Andrew Clark

    Abstract: Autonomous Cyber-Physical Systems (CPS) fuse proprioceptive sensors such as GPS and exteroceptive sensors including Light Detection and Ranging (LiDAR) and cameras for state estimation and environmental observation. It has been shown that both types of sensors can be compromised by malicious attacks, leading to unacceptable safety violations. We study the problem of safety-critical control of a Li… ▽ More

    Submitted 11 August, 2022; originally announced August 2022.

  30. arXiv:2205.05675  [pdf, other]

    cs.CV eess.IV

    NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Yawei Li, Kai Zhang, Radu Timofte, Luc Van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, et al. (86 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of e… ▽ More

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Validation code of the baseline model is available at https://github.com/ofsoundof/IMDN. Validation of all submitted models is available at https://github.com/ofsoundof/NTIRE2022_ESR

  31. arXiv:2205.03242  [pdf]

    eess.SP cs.AI cs.LG

    Electrocardiographic Deep Learning for Predicting Post-Procedural Mortality

    Authors: David Ouyang, John Theurer, Nathan R. Stein, J. Weston Hughes, Pierre Elias, Bryan He, Neal Yuan, Grant Duffy, Roopinder K. Sandhu, Joseph Ebinger, Patrick Botting, Melvin Jujjavarapu, Brian Claggett, James E. Tooley, Tim Poterucha, Jonathan H. Chen, Michael Nurok, Marco Perez, Adler Perotte, James Y. Zou, Nancy R. Cook, Sumeet S. Chugh, Susan Cheng, Christine M. Albert

    Abstract: Background. Pre-operative risk assessments used in clinical practice are limited in their ability to identify risk for post-operative mortality. We hypothesize that electrocardiograms contain hidden risk markers that can help prognosticate post-operative mortality. Methods. In a derivation cohort of 45,969 pre-operative patients (age 59 ± 19 years, 55 percent women), a deep learning algorithm was… ▽ More

    Submitted 30 April, 2022; originally announced May 2022.

  32. arXiv:2202.11889  [pdf, other]

    eess.IV cs.CV

    A spectral-spatial fusion anomaly detection method for hyperspectral imagery

    Authors: Zengfu Hou, Siyuan Cheng, Ting Hu

    Abstract: In hyperspectral, high-quality spectral signals convey subtle spectral differences to distinguish similar materials, thereby providing unique advantage for anomaly detection. Hence fine spectra of anomalous pixels can be effectively screened out from heterogeneous background pixels. Since the same materials have similar characteristics in spatial and spectral dimension, detection performance can b… ▽ More

    Submitted 23 February, 2022; originally announced February 2022.

  33. arXiv:2112.01288  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA cs.LG eess.SP physics.data-an

    How to quantify fields or textures? A guide to the scattering transform

    Authors: Sihao Cheng, Brice Ménard

    Abstract: Extracting information from stochastic fields or textures is a ubiquitous task in science, from exploratory data analysis to classification and parameter estimation. From physics to biology, it tends to be done either through a power spectrum analysis, which is often too limited, or the use of convolutional neural networks (CNNs), which require large training sets and lack interpretability. In thi… ▽ More

    Submitted 30 November, 2021; originally announced December 2021.

    Comments: 18 pages, 16 figures

  34. $\mathcal{L}_1$ Adaptive Augmentation for Geometric Tracking Control of Quadrotors

    Authors: Zhuohuan Wu, Sheng Cheng, Kasey A. Ackerman, Aditya Gahlawat, Arun Lakshmanan, Pan Zhao, Naira Hovakimyan

    Abstract: This paper introduces an $\mathcal{L}_1$ adaptive control augmentation for geometric tracking control of quadrotors. In the proposed design, the $\mathcal{L}_1$ augmentation handles nonlinear (time- and state-dependent) uncertainties in the quadrotor dynamics without assuming or enforcing parametric structures, while the baseline geometric controller achieves stabilization of the known nonlinear m… ▽ More

    Submitted 2 March, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: accepted by ICRA 2022

  35. arXiv:2109.00339  [pdf, other]

    cs.DM eess.SP

    How likely is a random graph shift-enabled?

    Authors: Liyan Chen, Samuel Cheng, Vladimir Stankovic, Lina Stankovic

    Abstract: The shift-enabled property of an underlying graph is essential in designing distributed filters. This article discusses when a random graph is shift-enabled. In particular, popular graph models ER, WS, BA random graph are used, weighted and unweighted, as well as signed graphs. Our results show that the considered unweighted connected random graphs are shift-enabled with high probability when the… ▽ More

    Submitted 28 August, 2021; originally announced September 2021.

    Comments: 9 pages

  36. arXiv:2106.12511  [pdf]

    eess.IV cs.CV cs.LG

    High-Throughput Precision Phenotyping of Left Ventricular Hypertrophy with Cardiovascular Deep Learning

    Authors: Grant Duffy, Paul P Cheng, Neal Yuan, Bryan He, Alan C. Kwan, Matthew J. Shun-Shin, Kevin M. Alexander, Joseph Ebinger, Matthew P. Lungren, Florian Rader, David H. Liang, Ingela Schnittger, Euan A. Ashley, James Y. Zou, Jignesh Patel, Ronald Witteles, Susan Cheng, David Ouyang

    Abstract: Left ventricular hypertrophy (LVH) results from chronic remodeling caused by a broad range of systemic and cardiovascular disease including hypertension, aortic stenosis, hypertrophic cardiomyopathy, and cardiac amyloidosis. Early detection and characterization of LVH can significantly impact patient care but is limited by under-recognition of hypertrophy, measurement error and variability, and di… ▽ More

    Submitted 23 June, 2021; originally announced June 2021.

  37. Optimal control of a 2D diffusion-advection process with a team of mobile actuators under jointly optimal guidance

    Authors: Sheng Cheng, Derek A. Paley

    Abstract: This paper describes an optimization framework to control a distributed parameter system (DPS) using a team of mobile actuators. The framework simultaneously seeks optimal control of the DPS and optimal guidance of the mobile actuators such that a cost function associated with both the DPS and the mobile actuators is minimized subject to the dynamics of each. The cost incurred from controlling the… ▽ More

    Submitted 18 June, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: Proofs for Lemmas 2.3, 2.5, and D.1 are attached in the supplement at the end

  38. arXiv:2105.08629  [pdf, other]

    eess.IV cs.CV cs.LG

    Fast Camera Image Denoising on Mobile GPUs with Deep Learning, Mobile AI 2021 Challenge: Report

    Authors: Andrey Ignatov, Kim Byeoung-su, Radu Timofte, Angeline Pouget, Fenglong Song, Cheng Li, Shuai Xiao, Zhongqian Fu, Matteo Maggioni, Yibin Huang, Shen Cheng, Xin Lu, Yifeng Zhou, Liangyu Chen, Donghao Liu, Xiangyu Zhang, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Minsu Kwon, Myungje Lee, Jaeyoon Yoo, Changbeom Kang, Shinjo Wang, Bin Huang, et al. (7 additional authors not shown)

    Abstract: Image denoising is one of the most critical problems in mobile photo processing. While many solutions have been proposed for this task, they are usually working with synthetic data and are too computationally expensive to run on mobile devices. To address this problem, we introduce the first Mobile AI challenge, where the target is to develop an end-to-end deep learning-based image denoising solut… ▽ More

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: Mobile AI 2021 Workshop and Challenges: https://ai-benchmark.com/workshops/mai/2021/. arXiv admin note: substantial text overlap with arXiv:2105.07809, arXiv:2105.07825

  39. arXiv:2103.03011  [pdf, other]

    cs.RO eess.SY

    Reinforcement Learning Trajectory Generation and Control for Aggressive Perching on Vertical Walls with Quadrotors

    Authors: Chen-Huan Pi, Kai-Chun Hu, Yu-Ting Huang, Stone Cheng

    Abstract: Micro aerial vehicles are widely being researched and employed due to their relative low operation costs and high flexibility in various applications. We study the under-actuated quadrotor perching problem, designing a trajectory planner and controller which generates feasible trajectories and drives quadrotors to desired state in state space. This paper proposes a trajectory generating and tracki… ▽ More

    Submitted 4 March, 2021; originally announced March 2021.

  40. arXiv:2009.09574  [pdf, other]

    eess.IV cs.CV

    Reconstruct high-resolution multi-focal plane images from a single 2D wide field image

    Authors: Jiabo Ma, Sibo Liu, Shenghua Cheng, Xiuli Liu, Li Cheng, Shaoqun Zeng

    Abstract: High-resolution 3D medical images are important for analysis and diagnosis, but axial scanning to acquire them is very time-consuming. In this paper, we propose a fast end-to-end multi-focal plane imaging network (MFPINet) to reconstruct high-resolution multi-focal plane images from a single 2D low-resolution wide field image without relying on scanning. To acquire realistic MFP images fast, the p… ▽ More

    Submitted 20 September, 2020; originally announced September 2020.

    Comments: 9 pages, 4 figures, 3 tables

  41. arXiv:2007.12951  [pdf, ps, other]

    eess.SP physics.geo-ph

    Comparison of Machine Learning Methods for Predicting Karst Spring Discharge in North China

    Authors: Shu Cheng, Xiaojuan Qiao, Yaolin Shi, Dawei Wang

    Abstract: The quantitative analyses of karst spring discharge typically rely on physical-based models, which are inherently uncertain. To improve the understanding of the mechanism of spring discharge fluctuation and the relationship between precipitation and spring discharge, three machine learning methods were developed to reduce the predictive errors of physical-based groundwater models, simulate the dis…

    Submitted 25 July, 2020; originally announced July 2020.

  42. arXiv:2005.11902  [pdf, other]

    eess.AS

    ASR-Free Pronunciation Assessment

    Authors: Sitong Cheng, Zhixin Liu, Lantian Li, Zhiyuan Tang, Dong Wang, Thomas Fang Zheng

    Abstract: Most of the pronunciation assessment methods are based on local features derived from automatic speech recognition (ASR), e.g., the Goodness of Pronunciation (GOP) score. In this paper, we investigate an ASR-free scoring approach that is derived from the marginal distribution of raw speech signals. The hypothesis is that even if we have no knowledge of the language (so cannot recognize the phones/…

    Submitted 24 May, 2020; originally announced May 2020.

    Comments: submitted to INTERSPEECH 2020

  43. arXiv:2005.10455  [pdf, other]

    eess.IV cs.CV

    Single Image Super-Resolution via Residual Neuron Attention Networks

    Authors: Wenjie Ai, Xiaoguang Tu, Shilei Cheng, Mei Xie

    Abstract: Deep Convolutional Neural Networks (DCNNs) have achieved impressive performance in Single Image Super-Resolution (SISR). To further improve the performance, existing CNN-based methods generally focus on designing deeper network architectures. However, we argue that blindly increasing the network's depth is not the most sensible way. In this paper, we propose a novel end-to-end Residual Neuron Attenti…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: 6 pages, 4 figures, Accepted by IEEE ICIP 2020

  44. arXiv:2005.07318  [pdf]

    physics.optics eess.IV

    Displacement-agnostic coherent imaging through scatter with an interpretable deep neural network

    Authors: Yuzhe Li, Shiyi Cheng, Yujia Xue, Lei Tian

    Abstract: Coherent imaging through scatter is a challenging task in computational imaging. Both model-based and data-driven approaches have been explored to solve the inverse scattering problem. In our previous work, we have shown that a deep learning approach can make high-quality and highly generalizable predictions through unseen diffusers. Here, we propose a new deep neural network (DNN) model that is a…

    Submitted 1 September, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

  45. arXiv:2005.00834  [pdf]

    eess.IV physics.optics

    Learning-based super interpolation and extrapolation for speckled image reconstruction

    Authors: Huanhao Li, Zhipeng Yu, Yunqi Luo, Shengfu Cheng, Lihong V. Wang, Yuanjin Zheng, Puxiang Lai

    Abstract: Speckles arise when coherent light interacts with biological tissues. Information retrieval from speckles is desired yet challenging, requiring understanding or mapping of the multiple scattering process, or reliable capability to reverse or compensate for the scattering-induced phase distortions. In whatever situation, insufficient sampling of speckles undermines the encoded information, impeding…

    Submitted 2 May, 2020; originally announced May 2020.

  46. arXiv:2004.12385  [pdf, other]

    cs.LG cs.CV eess.IV

    Towards Feature Space Adversarial Attack

    Authors: Qiuling Xu, Guanhong Tao, Siyuan Cheng, Xiangyu Zhang

    Abstract: We propose a new adversarial attack on Deep Neural Networks for image classification. Different from most existing attacks that directly perturb input pixels, our attack focuses on perturbing abstract features, more specifically, features that denote styles, including interpretable styles such as vivid colors and sharp outlines, and uninterpretable ones. It induces model misclassification by inject…

    Submitted 15 December, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

    Comments: AAAI 2021

  47. arXiv:2001.00692  [pdf]

    cs.CV eess.IV q-bio.QM

    FFusionCGAN: An end-to-end fusion method for few-focus images using conditional GAN in cytopathological digital slides

    Authors: Xiebo Geng, Sibo Liu, Wei Han, Xu Li, Jiabo Ma, Jingya Yu, Xiuli Liu, Shaoqun Zeng, Li Chen, Shenghua Cheng

    Abstract: Multi-focus image fusion technologies compress different focus depth images into an image in which most objects are in focus. However, although existing image fusion techniques, including traditional algorithms and deep learning-based algorithms, can generate high-quality fused images, they need multiple images with different focus depths in the same field of view. This criterion may not be met in…

    Submitted 2 January, 2020; originally announced January 2020.

  48. arXiv:1911.01799  [pdf, ps, other]

    eess.AS cs.CL cs.SD

    CN-CELEB: a challenging Chinese speaker recognition dataset

    Authors: Yue Fan, Jiawen Kang, Lantian Li, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang, Ziya Zhou, Yunqi Cai, Dong Wang

    Abstract: Recently, researchers set an ambitious goal of conducting speaker recognition in unconstrained conditions where the variations on ambient, channel and emotion could be arbitrary. However, most publicly available datasets are collected under constrained environments, i.e., with little noise and limited channel variation. These datasets tend to deliver overly optimistic performance and do not meet the…

    Submitted 31 October, 2019; originally announced November 2019.

  49. arXiv:1910.13046  [pdf, other]

    eess.IV cs.CV

    Converged Deep Framework Assembling Principled Modules for CS-MRI

    Authors: Risheng Liu, Yuxi Zhang, Shichao Cheng, Zhongxuan Luo, Xin Fan

    Abstract: Compressed Sensing Magnetic Resonance Imaging (CS-MRI) significantly accelerates MR data acquisition at a sampling rate much lower than the Nyquist criterion. A major challenge for CS-MRI lies in solving the severely ill-posed inverse problem to reconstruct aliasing-free MR images from the sparse k-space data. Conventional methods typically optimize an energy function, producing reconstruction of…

    Submitted 28 October, 2019; originally announced October 2019.

  50. arXiv:1909.05184  [pdf]

    eess.IV cs.CV

    Multi-stage domain adversarial style reconstruction for cytopathological image stain normalization

    Authors: Xihao Chen, Jingya Yu, Li Chen, Shaoqun Zeng, Xiuli Liu, Shenghua Cheng

    Abstract: The different stain styles of cytopathological images have a negative effect on the generalization ability of automated image analysis algorithms. This article proposes a new framework that normalizes the stain style for cytopathological images through a stain removal module and a multi-stage domain adversarial style reconstruction module. We convert colorful images into grayscale images with a co…

    Submitted 11 September, 2019; originally announced September 2019.