Search | arXiv e-print repository

arXiv:2406.19718 [pdf, ps, other]

Global Regulation of Feedforward Nonlinear Systems: A Logic-Based Switching Gain Approach

Authors: Debao Fan, Xianfu Zhang, Gang Feng, Hanfeng Li

Abstract: In this article, we investigate the global regulation problem for a class of feedforward nonlinear systems. Notably, the systems under consideration allow unknown input-output-dependent nonlinear growth rates, which has not been considered in existing works. A novel logic-based switching (LBS) gain approach is proposed to counteract system uncertainties and nonlinearities. Furthermore, a tanh-type… ▽ More In this article, we investigate the global regulation problem for a class of feedforward nonlinear systems. Notably, the systems under consideration allow unknown input-output-dependent nonlinear growth rates, which has not been considered in existing works. A novel logic-based switching (LBS) gain approach is proposed to counteract system uncertainties and nonlinearities. Furthermore, a tanh-type speed-regulation function is embedded into the switching mechanism for the first time to improve the convergence speed and transient performance. Then, a switching adaptive output feedback (SAOF) controller is proposed based on the developed switching mechanism, which is of a concise form and low-complexity characteristic. It is shown that the objective of global regulation is achieved with faster convergence speed and better transient performance under the proposed controller. Moreover, by strengthening the switching mechanism, the improved control approach can deal with feedforward nonlinear systems with external disturbances. Finally, representative examples are presented to demonstrate the effectiveness and advantages of our approach in comparison with the existing approaches. △ Less

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.18242 [pdf, other]

ConStyle v2: A Strong Prompter for All-in-One Image Restoration

Authors: Dongqi Fan, Junhao Zhang, Liang Chang

Abstract: This paper introduces ConStyle v2, a strong plug-and-play prompter designed to output clean visual prompts and assist U-Net Image Restoration models in handling multiple degradations. The joint training process of IRConStyle, an Image Restoration framework consisting of ConStyle and a general restoration network, is divided into two stages: first, pre-training ConStyle alone, and then freezing its… ▽ More This paper introduces ConStyle v2, a strong plug-and-play prompter designed to output clean visual prompts and assist U-Net Image Restoration models in handling multiple degradations. The joint training process of IRConStyle, an Image Restoration framework consisting of ConStyle and a general restoration network, is divided into two stages: first, pre-training ConStyle alone, and then freezing its weights to guide the training of the general restoration network. Three improvements are proposed in the pre-training stage to train ConStyle: unsupervised pre-training, adding a pretext task (i.e. classification), and adopting knowledge distillation. Without bells and whistles, we can get ConStyle v2, a strong prompter for all-in-one Image Restoration, in less than two GPU days and doesn't require any fine-tuning. Extensive experiments on Restormer (transformer-based), NAFNet (CNN-based), MAXIM-1S (MLP-based), and a vanilla CNN network demonstrate that ConStyle v2 can enhance any U-Net style Image Restoration models to all-in-one Image Restoration models. Furthermore, models guided by the well-trained ConStyle v2 exhibit superior performance in some specific degradation compared to ConStyle. △ Less

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2405.14251 [pdf, other]

Efficient Navigation of a Robotic Fish Swimming Across the Vortical Flow Field

Authors: Haodong Feng, Dehan Yuan, Jiale Miao, Jie You, Yue Wang, Yi Zhu, Dixia Fan

Abstract: Navigating efficiently across vortical flow fields presents a significant challenge in various robotic applications. The dynamic and unsteady nature of vortical flows often disturbs the control of underwater robots, complicating their operation in hydrodynamic environments. Conventional control methods, which depend on accurate modeling, fail in these settings due to the complexity of fluid-struct… ▽ More Navigating efficiently across vortical flow fields presents a significant challenge in various robotic applications. The dynamic and unsteady nature of vortical flows often disturbs the control of underwater robots, complicating their operation in hydrodynamic environments. Conventional control methods, which depend on accurate modeling, fail in these settings due to the complexity of fluid-structure interactions (FSI) caused by unsteady hydrodynamics. This study proposes a deep reinforcement learning (DRL) algorithm, trained in a data-driven manner, to enable efficient navigation of a robotic fish swimming across vortical flows. Our proposed algorithm incorporates the LSTM architecture and uses several recent consecutive observations as the state to address the issue of partial observation, often due to sensor limitations. We present a numerical study of navigation within a Karman vortex street, created by placing a stationary cylinder in a uniform flow, utilizing the immersed boundary-lattice Boltzmann method (IB-LBM). The aim is to train the robotic fish to discover efficient navigation policies, enabling it to reach a designated target point across the Karman vortex street from various initial positions. After training, the fish demonstrates the ability to rapidly reach the target from different initial positions, showcasing the effectiveness and robustness of our proposed algorithm. Analysis of the results reveals that the robotic fish can leverage velocity gains and pressure differences induced by the vortices to reach the target, underscoring the potential of our proposed algorithm in enhancing navigation in complex hydrodynamic environments. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2403.06066 [pdf]

CausalCellSegmenter: Causal Inference inspired Diversified Aggregation Convolution for Pathology Image Segmentation

Authors: Dawei Fan, Yifan Gao, Jiaming Yu, Yan** Chen, Wencheng Li, Chuancong Lin, Kaibin Li, Changcai Yang, Riqing Chen, Lifang Wei

Abstract: Deep learning models have shown promising performance for cell nucleus segmentation in the field of pathology image analysis. However, training a robust model from multiple domains remains a great challenge for cell nucleus segmentation. Additionally, the shortcomings of background noise, highly overlap** between cell nucleus, and blurred edges often lead to poor performance. To address these ch… ▽ More Deep learning models have shown promising performance for cell nucleus segmentation in the field of pathology image analysis. However, training a robust model from multiple domains remains a great challenge for cell nucleus segmentation. Additionally, the shortcomings of background noise, highly overlap** between cell nucleus, and blurred edges often lead to poor performance. To address these challenges, we propose a novel framework termed CausalCellSegmenter, which combines Causal Inference Module (CIM) with Diversified Aggregation Convolution (DAC) techniques. The DAC module is designed which incorporates diverse downsampling features through a simple, parameter-free attention module (SimAM), aiming to overcome the problems of false-positive identification and edge blurring. Furthermore, we introduce CIM to leverage sample weighting by directly removing the spurious correlations between features for every input sample and concentrating more on the correlation between features and labels. Extensive experiments on the MoNuSeg-2018 dataset achieves promising results, outperforming other state-of-the-art methods, where the mIoU and DSC scores growing by 3.6% and 2.65%. △ Less

Submitted 9 March, 2024; originally announced March 2024.

Comments: 10 pages, 5 figures, 2 tables, MICCAI

arXiv:2308.08269 [pdf, other]

OnUVS: Online Feature Decoupling Framework for High-Fidelity Ultrasound Video Synthesis

Authors: Han Zhou, Dong Ni, Ao Chang, Xinrui Zhou, Rusi Chen, Yanlin Chen, Lian Liu, Jiamin Liang, Yuhao Huang, Tong Han, Zhe Liu, Deng-** Fan, Xin Yang

Abstract: Ultrasound (US) imaging is indispensable in clinical practice. To diagnose certain diseases, sonographers must observe corresponding dynamic anatomic structures to gather comprehensive information. However, the limited availability of specific US video cases causes teaching difficulties in identifying corresponding diseases, which potentially impacts the detection rate of such cases. The synthesis… ▽ More Ultrasound (US) imaging is indispensable in clinical practice. To diagnose certain diseases, sonographers must observe corresponding dynamic anatomic structures to gather comprehensive information. However, the limited availability of specific US video cases causes teaching difficulties in identifying corresponding diseases, which potentially impacts the detection rate of such cases. The synthesis of US videos may represent a promising solution to this issue. Nevertheless, it is challenging to accurately animate the intricate motion of dynamic anatomic structures while preserving image fidelity. To address this, we present a novel online feature-decoupling framework called OnUVS for high-fidelity US video synthesis. Our highlights can be summarized by four aspects. First, we introduced anatomic information into keypoint learning through a weakly-supervised training strategy, resulting in improved preservation of anatomical integrity and motion while minimizing the labeling burden. Second, to better preserve the integrity and textural information of US images, we implemented a dual-decoder that decouples the content and textural features in the generator. Third, we adopted a multiple-feature discriminator to extract a comprehensive range of visual cues, thereby enhancing the sharpness and fine details of the generated videos. Fourth, we constrained the motion trajectories of keypoints during online learning to enhance the fluidity of generated videos. Our validation and user studies on in-house echocardiographic and pelvic floor US videos showed that OnUVS synthesizes US videos with high fidelity. △ Less

Submitted 16 August, 2023; originally announced August 2023.

Comments: 14 pages, 13 figures and 6 tables

arXiv:2307.16262 [pdf, other]

Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges

Authors: Debesh Jha, Vanshali Sharma, Debapriya Banik, Debayan Bhattacharya, Kaushiki Roy, Steven A. Hicks, Nikhil Kumar Tomar, Vajira Thambawita, Adrian Krenzer, Ge-Peng Ji, Sahadev Poudel, George Batchkala, Saruar Alam, Awadelrahman M. A. Ahmed, Quoc-Huy Trinh, Zeshan Khan, Tien-Phat Nguyen, Shruti Shrestha, Sabari Nathan, Jeonghwan Gwak, Ritika K. Jha, Zheyuan Zhang, Alexander Schlaefer, Debotosh Bhattacharjee, M. K. Bhuyan , et al. (8 additional authors not shown)

Abstract: Automatic analysis of colonoscopy images has been an active field of research motivated by the importance of early detection of precancerous polyps. However, detecting polyps during the live examination can be challenging due to various factors such as variation of skills and experience among the endoscopists, lack of attentiveness, and fatigue leading to a high polyp miss-rate. Deep learning has… ▽ More Automatic analysis of colonoscopy images has been an active field of research motivated by the importance of early detection of precancerous polyps. However, detecting polyps during the live examination can be challenging due to various factors such as variation of skills and experience among the endoscopists, lack of attentiveness, and fatigue leading to a high polyp miss-rate. Deep learning has emerged as a promising solution to this challenge as it can assist endoscopists in detecting and classifying overlooked polyps and abnormalities in real time. In addition to the algorithm's accuracy, transparency and interpretability are crucial to explaining the whys and hows of the algorithm's prediction. Further, most algorithms are developed in private data, closed source, or proprietary software, and methods lack reproducibility. Therefore, to promote the development of efficient and transparent methods, we have organized the "Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image Segmentation (MedAI 2021)" competitions. We present a comprehensive summary and analyze each contribution, highlight the strength of the best-performing methods, and discuss the possibility of clinical translations of such methods into the clinic. For the transparency task, a multi-disciplinary team, including expert gastroenterologists, accessed each submission and evaluated the team based on open-source practices, failure case analysis, ablation studies, usability and understandability of evaluations to gain a deeper understanding of the models' credibility for clinical deployment. Through the comprehensive analysis of the challenge, we not only highlight the advancements in polyp and surgical instrument segmentation but also encourage qualitative evaluation for building more transparent and understandable AI-based colonoscopy systems. △ Less

Submitted 6 May, 2024; v1 submitted 30 July, 2023; originally announced July 2023.

arXiv:2307.05383 [pdf]

Human Emotion Recognition Based On Galvanic Skin Response signal Feature Selection and SVM

Authors: Di Fan, Mingyang Liu, Xiaohan Zhang, Xiaopeng Gong

Abstract: A novel human emotion recognition method based on automatically selected Galvanic Skin Response (GSR) signal features and SVM is proposed in this paper. GSR signals were acquired by e-Health Sensor Platform V2.0. Then, the data is de-noised by wavelet function and normalized to get rid of the individual difference. 30 features are extracted from the normalized data, however, directly using of thes… ▽ More A novel human emotion recognition method based on automatically selected Galvanic Skin Response (GSR) signal features and SVM is proposed in this paper. GSR signals were acquired by e-Health Sensor Platform V2.0. Then, the data is de-noised by wavelet function and normalized to get rid of the individual difference. 30 features are extracted from the normalized data, however, directly using of these features will lead to a low recognition rate. In order to gain the optimized features, a covariance based feature selection is employed in our method. Finally, a SVM with input of the optimized features is utilized to achieve the human emotion recognition. The experimental results indicate that the proposed method leads to good human emotion recognition, and the recognition accuracy is more than 66.67%. △ Less

Submitted 3 July, 2023; originally announced July 2023.

arXiv:2306.12817 [pdf, other]

doi 10.1109/CCTA54093.2023.10252460

Physics-guided neural networks for inversion-based feedforward control applied to hybrid stepper motors

Authors: Daiwei Fan, Max Bolderman, Sjirk Koekebakker, Hans Butler, Mircea Lazar

Abstract: Rotary motors, such as hybrid stepper motors (HSMs), are widely used in industries varying from printing applications to robotics. The increasing need for productivity and efficiency without increasing the manufacturing costs calls for innovative control design. Feedforward control is typically used in tracking control problems, where the desired reference is known in advance. In most applications… ▽ More Rotary motors, such as hybrid stepper motors (HSMs), are widely used in industries varying from printing applications to robotics. The increasing need for productivity and efficiency without increasing the manufacturing costs calls for innovative control design. Feedforward control is typically used in tracking control problems, where the desired reference is known in advance. In most applications, this is the case for HSMs, which need to track a periodic angular velocity and angular position reference. Performance achieved by feedforward control is limited by the accuracy of the available model describing the inverse system dynamics. In this work, we develop a physics-guided neural network (PGNN) feedforward controller for HSMs, which can learn the effect of parasitic forces from data and compensate for it, resulting in improved accuracy. Indeed, experimental results on an HSM used in printing industry show that the PGNN outperforms conventional benchmarks in terms of the mean-absolute tracking error. △ Less

Submitted 22 June, 2023; originally announced June 2023.

arXiv:2305.02241 [pdf, other]

A Multi-step Dynamics Modeling Framework For Autonomous Driving In Multiple Environments

Authors: Jason Gibson, Bogdan Vlahov, David Fan, Patrick Spieler, Daniel Pastor, Ali-akbar Agha-mohammadi, Evangelos A. Theodorou

Abstract: Modeling dynamics is often the first step to making a vehicle autonomous. While on-road autonomous vehicles have been extensively studied, off-road vehicles pose many challenging modeling problems. An off-road vehicle encounters highly complex and difficult-to-model terrain/vehicle interactions, as well as having complex vehicle dynamics of its own. These complexities can create challenges for eff… ▽ More Modeling dynamics is often the first step to making a vehicle autonomous. While on-road autonomous vehicles have been extensively studied, off-road vehicles pose many challenging modeling problems. An off-road vehicle encounters highly complex and difficult-to-model terrain/vehicle interactions, as well as having complex vehicle dynamics of its own. These complexities can create challenges for effective high-speed control and planning. In this paper, we introduce a framework for multistep dynamics prediction that explicitly handles the accumulation of modeling error and remains scalable for sampling-based controllers. Our method uses a specially-initialized Long Short-Term Memory (LSTM) over a limited time horizon as the learned component in a hybrid model to predict the dynamics of a 4-person seating all-terrain vehicle (Polaris S4 1000 RZR) in two distinct environments. By only having the LSTM predict over a fixed time horizon, we negate the need for long term stability that is often a challenge when training recurrent neural networks. Our framework is flexible as it only requires odometry information for labels. Through extensive experimentation, we show that our method is able to predict millions of possible trajectories in real-time, with a time horizon of five seconds in challenging off road driving scenarios. △ Less

Submitted 3 May, 2023; originally announced May 2023.

arXiv:2304.14660 [pdf, other]

doi 10.1016/j.media.2023.103061

Segment Anything Model for Medical Images?

Authors: Yuhao Huang, Xin Yang, Lian Liu, Han Zhou, Ao Chang, Xinrui Zhou, Rusi Chen, Junxuan Yu, Jiongquan Chen, Chaoyu Chen, Si**g Liu, Haozhe Chi, Xindi Hu, Kejuan Yue, Lei Li, Vicente Grau, Deng-** Fan, Fa** Dong, Dong Ni

Abstract: The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging because of the complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-range object scales. To fully validate SAM's perfo… ▽ More The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging because of the complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-range object scales. To fully validate SAM's performance on medical data, we collected and sorted 53 open-source datasets and built a large medical segmentation dataset with 18 modalities, 84 objects, 125 object-modality paired targets, 1050K 2D images, and 6033K masks. We comprehensively analyzed different models and strategies on the so-called COSMOS 1050K dataset. Our findings mainly include the following: 1) SAM showed remarkable performance in some specific objects but was unstable, imperfect, or even totally failed in other situations. 2) SAM with the large ViT-H showed better overall performance than that with the small ViT-B. 3) SAM performed better with manual hints, especially box, than the Everything mode. 4) SAM could help human annotation with high labeling quality and less time. 5) SAM was sensitive to the randomness in the center point and tight box prompts, and may suffer from a serious performance drop. 6) SAM performed better than interactive methods with one or a few points, but will be outpaced as the number of points increases. 7) SAM's performance correlated to different factors, including boundary complexity, intensity differences, etc. 8) Finetuning the SAM on specific medical tasks could improve its average DICE performance by 4.39% and 6.68% for ViT-B and ViT-H, respectively. We hope that this comprehensive report can help researchers explore the potential of SAM applications in MIS, and guide how to appropriately use and develop SAM. △ Less

Submitted 17 January, 2024; v1 submitted 28 April, 2023; originally announced April 2023.

Comments: Accepted by Medical Image Analysis. 23 pages, 18 figures, 8 tables

arXiv:2304.11526 [pdf, other]

doi 10.1063/5.0142949

How to Control Hydrodynamic Force on Fluidic Pinball via Deep Reinforcement Learning

Authors: Haodong Feng, Yue Wang, Hui Xiang, Zhiyang **, Dixia Fan

Abstract: Deep reinforcement learning (DRL) for fluidic pinball, three individually rotating cylinders in the uniform flow arranged in an equilaterally triangular configuration, can learn the efficient flow control strategies due to the validity of self-learning and data-driven state estimation for complex fluid dynamic problems. In this work, we present a DRL-based real-time feedback strategy to control th… ▽ More Deep reinforcement learning (DRL) for fluidic pinball, three individually rotating cylinders in the uniform flow arranged in an equilaterally triangular configuration, can learn the efficient flow control strategies due to the validity of self-learning and data-driven state estimation for complex fluid dynamic problems. In this work, we present a DRL-based real-time feedback strategy to control the hydrodynamic force on fluidic pinball, i.e., force extremum and tracking, from cylinders' rotation. By adequately designing reward functions and encoding historical observations, and after automatic learning of thousands of iterations, the DRL-based control was shown to make reasonable and valid control decisions in nonparametric control parameter space, which is comparable to and even better than the optimal policy found through lengthy brute-force searching. Subsequently, one of these results was analyzed by a machine learning model that enabled us to shed light on the basis of decision-making and physical mechanisms of the force tracking process. The finding from this work can control hydrodynamic force on the operation of fluidic pinball system and potentially pave the way for exploring efficient active flow control strategies in other complex fluid dynamic problems. △ Less

Submitted 22 April, 2023; originally announced April 2023.

arXiv:2303.01614 [pdf, other]

doi 10.55417/fr.2024006

STEP: Stochastic Traversability Evaluation and Planning for Risk-Aware Off-road Navigation; Results from the DARPA Subterranean Challenge

Authors: Anushri Dixit, David D. Fan, Kyohei Otsu, Sharmita Dey, Ali-Akbar Agha-Mohammadi, Joel W. Burdick

Abstract: Although autonomy has gained widespread usage in structured and controlled environments, robotic autonomy in unknown and off-road terrain remains a difficult problem. Extreme, off-road, and unstructured environments such as undeveloped wilderness, caves, rubble, and other post-disaster sites pose unique and challenging problems for autonomous navigation. Based on our participation in the DARPA Sub… ▽ More Although autonomy has gained widespread usage in structured and controlled environments, robotic autonomy in unknown and off-road terrain remains a difficult problem. Extreme, off-road, and unstructured environments such as undeveloped wilderness, caves, rubble, and other post-disaster sites pose unique and challenging problems for autonomous navigation. Based on our participation in the DARPA Subterranean Challenge, we propose an approach to improve autonomous traversal of robots in subterranean environments that are perceptually degraded and completely unknown through a traversability and planning framework called STEP (Stochastic Traversability Evaluation and Planning). We present 1) rapid uncertainty-aware map** and traversability evaluation, 2) tail risk assessment using the Conditional Value-at-Risk (CVaR), 3) efficient risk and constraint-aware kinodynamic motion planning using sequential quadratic programming-based (SQP) model predictive control (MPC), 4) fast recovery behaviors to account for unexpected scenarios that may cause failure, and 5) risk-based gait adaptation for quadrupedal robots. We illustrate and validate extensive results from our experiments on wheeled and legged robotic platforms in field studies at the Valentine Cave, CA (cave environment), Kentucky Underground, KY (mine environment), and Louisville Mega Cavern, KY (final competition site for the DARPA Subterranean Challenge with tunnel, urban, and cave environments). △ Less

Submitted 2 March, 2023; originally announced March 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2103.02828

Journal ref: Field Robotics, 4, 2024, 182-210

arXiv:2301.04904 [pdf, other]

Lesion-aware Dynamic Kernel for Polyp Segmentation

Authors: Ruifei Zhang, Peiwen Lai, Xiang Wan, De-Jun Fan, Feng Gao, Xiao-Jian Wu, Guanbin Li

Abstract: Automatic and accurate polyp segmentation plays an essential role in early colorectal cancer diagnosis. However, it has always been a challenging task due to 1) the diverse shape, size, brightness and other appearance characteristics of polyps, 2) the tiny contrast between concealed polyps and their surrounding regions. To address these problems, we propose a lesion-aware dynamic network (LDNet) f… ▽ More Automatic and accurate polyp segmentation plays an essential role in early colorectal cancer diagnosis. However, it has always been a challenging task due to 1) the diverse shape, size, brightness and other appearance characteristics of polyps, 2) the tiny contrast between concealed polyps and their surrounding regions. To address these problems, we propose a lesion-aware dynamic network (LDNet) for polyp segmentation, which is a traditional u-shape encoder-decoder structure incorporated with a dynamic kernel generation and updating scheme. Specifically, the designed segmentation head is conditioned on the global context features of the input image and iteratively updated by the extracted lesion features according to polyp segmentation predictions. This simple but effective scheme endows our model with powerful segmentation performance and generalization capability. Besides, we utilize the extracted lesion representation to enhance the feature contrast between the polyp and background regions by a tailored lesion-aware cross-attention module (LCA), and design an efficient self-attention module (ESA) to capture long-range context relations, further improving the segmentation accuracy. Extensive experiments on four public polyp benchmarks and our collected large-scale polyp dataset demonstrate the superior performance of our method compared with other state-of-the-art approaches. The source code is available at https://github.com/ReaFly/LDNet. △ Less

Submitted 12 January, 2023; originally announced January 2023.

Comments: Accepted by MICCAI2022

arXiv:2203.14291 [pdf, other]

doi 10.1007/s11633-022-1371-y

Video Polyp Segmentation: A Deep Learning Perspective

Authors: Ge-Peng Ji, Guobao Xiao, Yu-Cheng Chou, Deng-** Fan, Kai Zhao, Geng Chen, Luc Van Gool

Abstract: We present the first comprehensive video polyp segmentation (VPS) study in the deep learning era. Over the years, developments in VPS are not moving forward with ease due to the lack of large-scale fine-grained segmentation annotations. To address this issue, we first introduce a high-quality frame-by-frame annotated VPS dataset, named SUN-SEG, which contains 158,690 colonoscopy frames from the we… ▽ More We present the first comprehensive video polyp segmentation (VPS) study in the deep learning era. Over the years, developments in VPS are not moving forward with ease due to the lack of large-scale fine-grained segmentation annotations. To address this issue, we first introduce a high-quality frame-by-frame annotated VPS dataset, named SUN-SEG, which contains 158,690 colonoscopy frames from the well-known SUN-database. We provide additional annotations with diverse types, i.e., attribute, object mask, boundary, scribble, and polygon. Second, we design a simple but efficient baseline, dubbed PNS+, consisting of a global encoder, a local encoder, and normalized self-attention (NS) blocks. The global and local encoders receive an anchor frame and multiple successive frames to extract long-term and short-term spatial-temporal representations, which are then progressively updated by two NS blocks. Extensive experiments show that PNS+ achieves the best performance and real-time inference speed (170fps), making it a promising solution for the VPS task. Third, we extensively evaluate 13 representative polyp/object segmentation models on our SUN-SEG dataset and provide attribute-based comparisons. Finally, we discuss several open issues and suggest possible research directions for the VPS community. △ Less

Submitted 31 August, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

Comments: Accepted by Machine Intelligence Research 2022 (Project Page: https://github.com/GewelsJI/VPS)

Journal ref: Machine Intelligence Research, vol. 19, no. 6, pp.531-549, 2022

arXiv:2203.13278 [pdf, other]

doi 10.1007/s11633-023-1466-0

Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis

Authors: Kai Zhang, Yawei Li, **gyun Liang, Jiezhang Cao, Yulun Zhang, Hao Tang, Deng-** Fan, Radu Timofte, Luc Van Gool

Abstract: While recent years have witnessed a dramatic upsurge of exploiting deep neural networks toward solving image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains unsolved. In this paper, we attempt to solve this problem f… ▽ More While recent years have witnessed a dramatic upsurge of exploiting deep neural networks toward solving image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains unsolved. In this paper, we attempt to solve this problem from the perspective of network architecture design and training data synthesis. Specifically, for the network architecture design, we propose a swin-conv block to incorporate the local modeling ability of residual convolutional layer and non-local modeling ability of swin transformer block, and then plug it as the main building block into the widely-used image-to-image translation UNet architecture. For the training data synthesis, we design a practical noise degradation model which takes into consideration different kinds of noise (including Gaussian, Poisson, speckle, JPEG compression, and processed camera sensor noises) and resizing, and also involves a random shuffle strategy and a double degradation strategy. Extensive experiments on AGWN removal and real image denoising demonstrate that the new network architecture design achieves state-of-the-art performance and the new degradation model can help to significantly improve the practicability. We believe our work can provide useful insights into current denoising research. △ Less

Submitted 1 December, 2023; v1 submitted 24 March, 2022; originally announced March 2022.

Comments: Codes: https://github.com/cszn/SCUNet

Journal ref: Machine Intelligence Research, 2023

arXiv:2202.04074 [pdf, other]

Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image Segmentation

Authors: Xinkai Zhao, Chaowei Fang, De-Jun Fan, Xutao Lin, Feng Gao, Guanbin Li

Abstract: Semi-supervised learning (SSL), which aims at leveraging a few labeled images and a large number of unlabeled images for network training, is beneficial for relieving the burden of data annotation in medical image segmentation. According to the experience of medical imaging experts, local attributes such as texture, luster and smoothness are very important factors for identifying target objects li… ▽ More Semi-supervised learning (SSL), which aims at leveraging a few labeled images and a large number of unlabeled images for network training, is beneficial for relieving the burden of data annotation in medical image segmentation. According to the experience of medical imaging experts, local attributes such as texture, luster and smoothness are very important factors for identifying target objects like lesions and polyps in medical images. Motivated by this, we propose a cross-level contrastive learning scheme to enhance representation capacity for local features in semi-supervised medical image segmentation. Compared to existing image-wise, patch-wise and point-wise contrastive learning algorithms, our devised method is capable of exploring more complex similarity cues, namely the relational characteristics between global and local patch-wise representations. Additionally, for fully making use of cross-level semantic relations, we devise a novel consistency constraint that compares the predictions of patches against those of the full image. With the help of the cross-level contrastive learning and consistency constraint, the unlabelled data can be effectively explored to improve segmentation performance on two medical image datasets for polyp and skin lesion segmentation respectively. Code of our approach is available. △ Less

Submitted 13 February, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

arXiv:2201.02656 [pdf, other]

GPU-Net: Lightweight U-Net with more diverse features

Authors: Heng Yu, Di Fan, Weihu Song

Abstract: Image segmentation is an important task in the medical image field and many convolutional neural networks (CNNs) based methods have been proposed, among which U-Net and its variants show promising performance. In this paper, we propose GP-module and GPU-Net based on U-Net, which can learn more diverse features by introducing Ghost module and atrous spatial pyramid pooling (ASPP). Our method achiev… ▽ More Image segmentation is an important task in the medical image field and many convolutional neural networks (CNNs) based methods have been proposed, among which U-Net and its variants show promising performance. In this paper, we propose GP-module and GPU-Net based on U-Net, which can learn more diverse features by introducing Ghost module and atrous spatial pyramid pooling (ASPP). Our method achieves better performance with more than 4 times fewer parameters and 2 times fewer FLOPs, which provides a new potential direction for future research. Our plug-and-play module can also be applied to existing segmentation methods to further improve their performance. △ Less

Submitted 7 January, 2022; originally announced January 2022.

arXiv:2110.04062 [pdf]

A fast co-simulation approach to vehicle/track interaction with finite element models of S&C

Authors: Demeng Fan, Michel Sebès, Emmanuel Bourgeois, Hugues Chollet, Cédric Pozzolini

Abstract: Simulations of vehicle/track interaction (VTI) in switches and crossings (S\&C) require taking into account the complexity of their geometry. The VTI can be handled via a co-simulation process between a finite element (FE) model of the track and a multibody system (MBS) software. The objective of this paper is to reduce the computing effort in the co-simulation process. In the proposed approach, t… ▽ More Simulations of vehicle/track interaction (VTI) in switches and crossings (S\&C) require taking into account the complexity of their geometry. The VTI can be handled via a co-simulation process between a finite element (FE) model of the track and a multibody system (MBS) software. The objective of this paper is to reduce the computing effort in the co-simulation process. In the proposed approach, the VTI problem is solved inside the MBS software to reduce the computational effort in the track model as well as the flow of input/output between both modules. The FE code is used to supply the matrices of stiffness, dam** and mass at the beginning of the simulation. An explicit time scheme is used with mass scaling. A good agreement is found between both approaches with a reduction of the computing time by a factor of 10. This new approach allows the optimisation of the design of S\&C in further studies. △ Less

Submitted 6 October, 2021; originally announced October 2021.

Comments: IAVSD 2021 -- the 27th IAVSD Symposium on Dynamics of Vehicles on Roads and Tracks, Aug 2021, Saint Petersburg, Russia

arXiv:2109.12176 [pdf, other]

On the Fairness of Swarm Learning in Skin Lesion Classification

Authors: Di Fan, Yifan Wu, Xiaoxiao Li

Abstract: in healthcare. However, the existing AI model may be biased in its decision marking. The bias induced by data itself, such as collecting data in subgroups only, can be mitigated by including more diversified data. Distributed and collaborative learning is an approach to involve training models in massive, heterogeneous, and distributed data sources, also known as nodes. In this work, we target on… ▽ More in healthcare. However, the existing AI model may be biased in its decision marking. The bias induced by data itself, such as collecting data in subgroups only, can be mitigated by including more diversified data. Distributed and collaborative learning is an approach to involve training models in massive, heterogeneous, and distributed data sources, also known as nodes. In this work, we target on examining the fairness issue in Swarm Learning (SL), a recent edge-computing based decentralized machine learning approach, which is designed for heterogeneous illnesses detection in precision medicine. SL has achieved high performance in clinical applications, but no attempt has been made to evaluate if SL can improve fairness. To address the problem, we present an empirical study by comparing the fairness among single (node) training, SL, centralized training. Specifically, we evaluate on large public available skin lesion dataset, which contains samples from various subgroups. The experiments demonstrate that SL does not exacerbate the fairness problem compared to centralized training and improves both performance and fairness compared to single training. However, there still exists biases in SL model and the implementation of SL is more complex than the alternative two strategies. △ Less

Submitted 24 September, 2021; originally announced September 2021.

arXiv:2108.06932 [pdf, other]

doi 10.26599/AIR.2023.9150015

Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers

Authors: Bo Dong, Wenhai Wang, Deng-** Fan, **peng Li, Huazhu Fu, Ling Shao

Abstract: Most polyp segmentation methods use CNNs as their backbone, leading to two key issues when exchanging information between the encoder and decoder: 1) taking into account the differences in contribution between different-level features and 2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and… ▽ More Most polyp segmentation methods use CNNs as their backbone, leading to two key issues when exchanging information between the encoder and decoder: 1) taking into account the differences in contribution between different-level features and 2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the image acquisition influence and elusive properties of polyps, we introduce three standard modules, including a cascaded fusion module (CFM), a camouflage identification module (CIM), and a similarity aggregation module (SAM). Among these, the CFM is used to collect the semantic and location information of polyps from high-level features; the CIM is applied to capture polyp information disguised in low-level features, and the SAM extends the pixel features of the polyp area with high-level semantic position information to the entire polyp area, thereby effectively fusing cross-level features. The proposed model, named Polyp-PVT, effectively suppresses noises in the features and significantly improves their expressive capabilities. Extensive experiments on five widely adopted datasets show that the proposed model is more robust to various challenging situations (e.g., appearance changes, small objects, rotation) than existing representative methods. The proposed model is available at https://github.com/Deng**Fan/Polyp-PVT. △ Less

Submitted 19 February, 2024; v1 submitted 16 August, 2021; originally announced August 2021.

Comments: Accepted to CAAI AIR 2023

Journal ref: CAAI Artificial Intelligence Research, 2023, 2: 9150015

arXiv:2103.13813 [pdf, other]

RA-BNN: Constructing Robust & Accurate Binary Neural Network to Simultaneously Defend Adversarial Bit-Flip Attack and Improve Accuracy

Authors: Adnan Siraj Rakin, Li Yang, **gtao Li, Fan Yao, Chaitali Chakrabarti, Yu Cao, Jae-sun Seo, Deliang Fan

Abstract: Recently developed adversarial weight attack, a.k.a. bit-flip attack (BFA), has shown enormous success in compromising Deep Neural Network (DNN) performance with an extremely small amount of model parameter perturbation. To defend against this threat, we propose RA-BNN that adopts a complete binary (i.e., for both weights and activation) neural network (BNN) to significantly improve DNN model robu… ▽ More Recently developed adversarial weight attack, a.k.a. bit-flip attack (BFA), has shown enormous success in compromising Deep Neural Network (DNN) performance with an extremely small amount of model parameter perturbation. To defend against this threat, we propose RA-BNN that adopts a complete binary (i.e., for both weights and activation) neural network (BNN) to significantly improve DNN model robustness (defined as the number of bit-flips required to degrade the accuracy to as low as a random guess). However, such an aggressive low bit-width model suffers from poor clean (i.e., no attack) inference accuracy. To counter this, we propose a novel and efficient two-stage network growing method, named Early-Growth. It selectively grows the channel size of each BNN layer based on channel-wise binary masks training with Gumbel-Sigmoid function. Apart from recovering the inference accuracy, our RA-BNN after growing also shows significantly higher resistance to BFA. Our evaluation of the CIFAR-10 dataset shows that the proposed RA-BNN can improve the clean model accuracy by ~2-8 %, compared with a baseline BNN, while simultaneously improving the resistance to BFA by more than 125 x. Moreover, on ImageNet, with a sufficiently large (e.g., 5,000) amount of bit-flips, the baseline BNN accuracy drops to 4.3 % from 51.9 %, while our RA-BNN accuracy only drops to 37.1 % from 60.9 % (9 % clean accuracy improvement). △ Less

Submitted 22 March, 2021; originally announced March 2021.

arXiv:2103.02828 [pdf, other]

STEP: Stochastic Traversability Evaluation and Planning for Risk-Aware Off-road Navigation

Authors: David D. Fan, Kyohei Otsu, Yuki Kubo, Anushri Dixit, Joel Burdick, Ali-Akbar Agha-Mohammadi

Abstract: Although ground robotic autonomy has gained widespread usage in structured and controlled environments, autonomy in unknown and off-road terrain remains a difficult problem. Extreme, off-road, and unstructured environments such as undeveloped wilderness, caves, and rubble pose unique and challenging problems for autonomous navigation. To tackle these problems we propose an approach for assessing t… ▽ More Although ground robotic autonomy has gained widespread usage in structured and controlled environments, autonomy in unknown and off-road terrain remains a difficult problem. Extreme, off-road, and unstructured environments such as undeveloped wilderness, caves, and rubble pose unique and challenging problems for autonomous navigation. To tackle these problems we propose an approach for assessing traversability and planning a safe, feasible, and fast trajectory in real-time. Our approach, which we name STEP (Stochastic Traversability Evaluation and Planning), relies on: 1) rapid uncertainty-aware map** and traversability evaluation, 2) tail risk assessment using the Conditional Value-at-Risk (CVaR), and 3) efficient risk and constraint-aware kinodynamic motion planning using sequential quadratic programming-based (SQP) model predictive control (MPC). We analyze our method in simulation and validate its efficacy on wheeled and legged robotic platforms exploring extreme terrains including an abandoned subway and an underground lava tube. △ Less

Submitted 25 June, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

Comments: Accepted to Robotics: Science and Systems (RSS) 2021. Video link: https://youtu.be/N97cv4eH5c8

arXiv:2101.11110 [pdf, other]

Autonomous Off-road Navigation over Extreme Terrains with Perceptually-challenging Conditions

Authors: Rohan Thakker, Nikhilesh Alatur, David D. Fan, Jesus Tordesillas, Michael Paton, Kyohei Otsu, Olivier Toupet, Ali-akbar Agha-mohammadi

Abstract: We propose a framework for resilient autonomous navigation in perceptually challenging unknown environments with mobility-stressing elements such as uneven surfaces with rocks and boulders, steep slopes, negative obstacles like cliffs and holes, and narrow passages. Environments are GPS-denied and perceptually-degraded with variable lighting from dark to lit and obscurants (dust, fog, smoke). Lack… ▽ More We propose a framework for resilient autonomous navigation in perceptually challenging unknown environments with mobility-stressing elements such as uneven surfaces with rocks and boulders, steep slopes, negative obstacles like cliffs and holes, and narrow passages. Environments are GPS-denied and perceptually-degraded with variable lighting from dark to lit and obscurants (dust, fog, smoke). Lack of prior maps and degraded communication eliminates the possibility of prior or off-board computation or operator intervention. This necessitates real-time on-board computation using noisy sensor data. To address these challenges, we propose a resilient architecture that exploits redundancy and heterogeneity in sensing modalities. Further resilience is achieved by triggering recovery behaviors upon failure. We propose a fast settling algorithm to generate robust multi-fidelity traversability estimates in real-time. The proposed approach was deployed on multiple physical systems including skid-steer and tracked robots, a high-speed RC car and legged robots, as a part of Team CoSTAR's effort to the DARPA Subterranean Challenge, where the team won 2nd and 1st place in the Tunnel and Urban Circuits, respectively. △ Less

Submitted 26 January, 2021; originally announced January 2021.

Comments: 12 Pages, 7 Figures, 2020 International Symposium on Experimental Robotics (ISER 2020)

arXiv:2006.11392 [pdf, other]

PraNet: Parallel Reverse Attention Network for Polyp Segmentation

Authors: Deng-** Fan, Ge-Peng Ji, Tao Zhou, Geng Chen, Huazhu Fu, Jianbing Shen, Ling Shao

Abstract: Colonoscopy is an effective technique for detecting colorectal polyps, which are highly related to colorectal cancer. In clinical practice, segmenting polyps from colonoscopy images is of great importance since it provides valuable information for diagnosis and surgery. However, accurate polyp segmentation is a challenging task, for two major reasons: (i) the same type of polyps has a diversity of… ▽ More Colonoscopy is an effective technique for detecting colorectal polyps, which are highly related to colorectal cancer. In clinical practice, segmenting polyps from colonoscopy images is of great importance since it provides valuable information for diagnosis and surgery. However, accurate polyp segmentation is a challenging task, for two major reasons: (i) the same type of polyps has a diversity of size, color and texture; and (ii) the boundary between a polyp and its surrounding mucosa is not sharp. To address these challenges, we propose a parallel reverse attention network (PraNet) for accurate polyp segmentation in colonoscopy images. Specifically, we first aggregate the features in high-level layers using a parallel partial decoder (PPD). Based on the combined feature, we then generate a global map as the initial guidance area for the following components. In addition, we mine the boundary cues using a reverse attention (RA) module, which is able to establish the relationship between areas and boundary cues. Thanks to the recurrent cooperation mechanism between areas and boundaries, our PraNet is capable of calibrating any misaligned predictions, improving the segmentation accuracy. Quantitative and qualitative evaluations on five challenging datasets across six metrics show that our PraNet improves the segmentation accuracy significantly, and presents a number of advantages in terms of generalizability, and real-time segmentation efficiency. △ Less

Submitted 3 July, 2020; v1 submitted 13 June, 2020; originally announced June 2020.

Comments: Accepted to MICCAI 2020

arXiv:2004.14133 [pdf, other]

Inf-Net: Automatic COVID-19 Lung Infection Segmentation from CT Images

Authors: Deng-** Fan, Tao Zhou, Ge-Peng Ji, Yi Zhou, Geng Chen, Huazhu Fu, Jianbing Shen, Ling Shao

Abstract: Coronavirus Disease 2019 (COVID-19) spread globally in early 2020, causing the world to face an existential health crisis. Automated detection of lung infections from computed tomography (CT) images offers a great potential to augment the traditional healthcare strategy for tackling COVID-19. However, segmenting infected regions from CT slices faces several challenges, including high variation in… ▽ More Coronavirus Disease 2019 (COVID-19) spread globally in early 2020, causing the world to face an existential health crisis. Automated detection of lung infections from computed tomography (CT) images offers a great potential to augment the traditional healthcare strategy for tackling COVID-19. However, segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics, and low intensity contrast between infections and normal tissues. Further, collecting a large amount of data is impractical within a short time period, inhibiting the training of a deep model. To address these challenges, a novel COVID-19 Lung Infection Segmentation Deep Network (Inf-Net) is proposed to automatically identify infected regions from chest CT slices. In our Inf-Net, a parallel partial decoder is used to aggregate the high-level features and generate a global map. Then, the implicit reverse attention and explicit edge-attention are utilized to model the boundaries and enhance the representations. Moreover, to alleviate the shortage of labeled data, we present a semi-supervised segmentation framework based on a randomly selected propagation strategy, which only requires a few labeled images and leverages primarily unlabeled data. Our semi-supervised framework can improve the learning ability and achieve a higher performance. Extensive experiments on our COVID-SemiSeg and real CT volumes demonstrate that the proposed Inf-Net outperforms most cutting-edge segmentation models and advances the state-of-the-art performance. △ Less

Submitted 21 May, 2020; v1 submitted 22 April, 2020; originally announced April 2020.

Comments: To appear in IEEE TMI. The code is released in: https://github.com/Deng**Fan/Inf-Net

arXiv:2004.07054 [pdf, other]

doi 10.1109/TIP.2021.3058783

JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation

Authors: Yu-Huan Wu, Shang-Hua Gao, Jie Mei, Jun Xu, Deng-** Fan, Rong-Guo Zhang, Ming-Ming Cheng

Abstract: Recently, the coronavirus disease 2019 (COVID-19) has caused a pandemic disease in over 200 countries, influencing billions of humans. To control the infection, identifying and separating the infected people is the most crucial step. The main diagnostic tool is the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test. Still, the sensitivity of the RT-PCR test is not high enough to effecti… ▽ More Recently, the coronavirus disease 2019 (COVID-19) has caused a pandemic disease in over 200 countries, influencing billions of humans. To control the infection, identifying and separating the infected people is the most crucial step. The main diagnostic tool is the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test. Still, the sensitivity of the RT-PCR test is not high enough to effectively prevent the pandemic. The chest CT scan test provides a valuable complementary tool to the RT-PCR test, and it can identify the patients in the early-stage with high sensitivity. However, the chest CT scan test is usually time-consuming, requiring about 21.5 minutes per case. This paper develops a novel Joint Classification and Segmentation (JCS) system to perform real-time and explainable COVID-19 chest CT diagnosis. To train our JCS system, we construct a large scale COVID-19 Classification and Segmentation (COVID-CS) dataset, with 144,167 chest CT images of 400 COVID-19 patients and 350 uninfected cases. 3,855 chest CT images of 200 patients are annotated with fine-grained pixel-level labels of opacifications, which are increased attenuation of the lung parenchyma. We also have annotated lesion counts, opacification areas, and locations and thus benefit various diagnosis aspects. Extensive experiments demonstrate that the proposed JCS diagnosis system is very efficient for COVID-19 classification and segmentation. It obtains an average sensitivity of 95.0% and a specificity of 93.0% on the classification test set, and 78.5% Dice score on the segmentation test set of our COVID-CS dataset. The COVID-CS dataset and code are available at https://github.com/yuhuan-wu/JCS. △ Less

Submitted 3 August, 2021; v1 submitted 15 April, 2020; originally announced April 2020.

Comments: To appear in IEEE Transactions on Image Processing. Dataset and code are available at https://github.com/yuhuan-wu/JCS

arXiv:2002.01587 [pdf, other]

Deep Learning Tubes for Tube MPC

Authors: David D. Fan, Ali-akbar Agha-mohammadi, Evangelos A. Theodorou

Abstract: Learning-based control aims to construct models of a system to use for planning or trajectory optimization, e.g. in model-based reinforcement learning. In order to obtain guarantees of safety in this context, uncertainty must be accurately quantified. This uncertainty may come from errors in learning (due to a lack of data, for example), or may be inherent to the system. Propagating uncertainty fo… ▽ More Learning-based control aims to construct models of a system to use for planning or trajectory optimization, e.g. in model-based reinforcement learning. In order to obtain guarantees of safety in this context, uncertainty must be accurately quantified. This uncertainty may come from errors in learning (due to a lack of data, for example), or may be inherent to the system. Propagating uncertainty forward in learned dynamics models is a difficult problem. In this work we use deep learning to obtain expressive and flexible models of how distributions of trajectories behave, which we then use for nonlinear Model Predictive Control (MPC). We introduce a deep quantile regression framework for control that enforces probabilistic quantile bounds and quantifies epistemic uncertainty. Using our method we explore three different approaches for learning tubes that contain the possible trajectories of the system, and demonstrate how to use each of them in a Tube MPC scheme. We prove these schemes are recursively feasible and satisfy constraints with a desired margin of probability. We present experiments in simulation on a nonlinear quadrotor system, demonstrating the practical efficacy of these ideas. △ Less

Submitted 4 June, 2020; v1 submitted 4 February, 2020; originally announced February 2020.

Comments: RSS 2020 Camera Ready Version

arXiv:2001.01057 [pdf, other]

Pixel-Semantic Revise of Position Learning A One-Stage Object Detector with A Shared Encoder-Decoder

Authors: Qian Li, Nan Guo, Xiaochun Ye, Dongrui Fan, Zhimin Tang

Abstract: Recently, many methods have been proposed for object detection. They cannot detect objects by semantic features, adaptively. In this work, according to channel and spatial attention mechanisms, we mainly analyze that different methods detect objects adaptively. Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information. Howev… ▽ More Recently, many methods have been proposed for object detection. They cannot detect objects by semantic features, adaptively. In this work, according to channel and spatial attention mechanisms, we mainly analyze that different methods detect objects adaptively. Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information. However, they require more cost. This work addresses that by an anchor-free detector with shared encoder-decoder with attention mechanism, extracting shared features. We consider features of different levels from backbone (e.g., ResNet-50) as the basis features. Then, we feed the features into a simple module, followed by a detector header to detect objects. Meantime, we use the semantic features to revise geometric locations, and the detector is a pixel-semantic revising of position. More importantly, this work analyzes the impact of different pooling strategies (e.g., mean, maximum or minimum) on multi-scale objects, and finds the minimum pooling improve detection performance on small objects better. Compared with state-of-the-art MNC based on ResNet-101 for the standard MSCOCO 2014 baseline, our method improves detection AP of 3.8%. △ Less

Submitted 28 September, 2020; v1 submitted 4 January, 2020; originally announced January 2020.

Comments: Accepted by ICONIP2020(International Conference on Neural Information Processing)

arXiv:1910.02325 [pdf, other]

doi 10.1109/ICRA40945.2020.9196709

Bayesian Learning-Based Adaptive Control for Safety Critical Systems

Authors: David D. Fan, Jennifer Nguyen, Rohan Thakker, Nikhilesh Alatur, Ali-akbar Agha-mohammadi, Evangelos A. Theodorou

Abstract: Deep learning has enjoyed much recent success, and applying state-of-the-art model learning methods to controls is an exciting prospect. However, there is a strong reluctance to use these methods on safety-critical systems, which have constraints on safety, stability, and real-time performance. We propose a framework which satisfies these constraints while allowing the use of deep neural networks… ▽ More Deep learning has enjoyed much recent success, and applying state-of-the-art model learning methods to controls is an exciting prospect. However, there is a strong reluctance to use these methods on safety-critical systems, which have constraints on safety, stability, and real-time performance. We propose a framework which satisfies these constraints while allowing the use of deep neural networks for learning model uncertainties. Central to our method is the use of Bayesian model learning, which provides an avenue for maintaining appropriate degrees of caution in the face of the unknown. In the proposed approach, we develop an adaptive control framework leveraging the theory of stochastic CLFs (Control Lyapunov Functions) and stochastic CBFs (Control Barrier Functions) along with tractable Bayesian model learning via Gaussian Processes or Bayesian neural networks. Under reasonable assumptions, we guarantee stability and safety while adapting to unknown dynamics with probability 1. We demonstrate this architecture for high-speed terrestrial mobility targeting potential applications in safety-critical high-speed Mars rover missions. △ Less

Submitted 4 July, 2021; v1 submitted 5 October, 2019; originally announced October 2019.

Comments: Corrected an error in section II, where previously the problem was introduced in a non-stochastic setting and wrongly assumed the solution to an ODE with Gaussian distributed parametric uncertainty was equivalent to an SDE with a learned diffusion term. See Lew, T et al. "On the Problem of Reformulating Systems with Uncertain Dynamics as a Stochastic Differential Equation"

Journal ref: IEEE International Conference on Robotics and Automation (ICRA) 2020 May 31 (pp. 4093-4099)

arXiv:1701.01882 [pdf, other]

doi 10.1109/CDC.2016.7798330

Differential Dynamic Programming for time-delayed systems

Authors: David D. Fan, Evangelos A. Theodorou

Abstract: Trajectory optimization considers the problem of deciding how to control a dynamical system to move along a trajectory which minimizes some cost function. Differential Dynamic Programming (DDP) is an optimal control method which utilizes a second-order approximation of the problem to find the control. It is fast enough to allow real-time control and has been shown to work well for trajectory optim… ▽ More Trajectory optimization considers the problem of deciding how to control a dynamical system to move along a trajectory which minimizes some cost function. Differential Dynamic Programming (DDP) is an optimal control method which utilizes a second-order approximation of the problem to find the control. It is fast enough to allow real-time control and has been shown to work well for trajectory optimization in robotic systems. Here we extend classic DDP to systems with multiple time-delays in the state. Being able to find optimal trajectories for time-delayed systems with DDP opens up the possibility to use richer models for system identification and control, including recurrent neural networks with multiple timesteps in the state. We demonstrate the algorithm on a two-tank continuous stirred tank reactor. We also demonstrate the algorithm on a recurrent neural network trained to model an inverted pendulum with position information only. △ Less

Submitted 7 January, 2017; originally announced January 2017.

Comments: 7 pages, 6 figures, conference, Decision and Control (CDC), 2016 IEEE 55th Conference on

arXiv:1607.03193 [pdf, other]

Output Observability of Systems Over Finite Alphabets with Linear Internal Dynamics

Authors: Donglei Fan, Danielle C. Tarraf

Abstract: We consider a class of systems over finite alphabets with linear internal dynamics, finite-valued control inputs and finitely quantized outputs. We motivate the need for a new notion of observability and propose three new notions of output observability, thereby shifting our attention to the problem of state estimation for output prediction. We derive necessary and sufficient conditions for a syst… ▽ More We consider a class of systems over finite alphabets with linear internal dynamics, finite-valued control inputs and finitely quantized outputs. We motivate the need for a new notion of observability and propose three new notions of output observability, thereby shifting our attention to the problem of state estimation for output prediction. We derive necessary and sufficient conditions for a system to be output observable, algorithmic procedures to verify these conditions, and a construction of finite memory output observers when certain conditions are met. We conclude with simple illustrative examples. △ Less

Submitted 11 July, 2016; originally announced July 2016.

Comments: 26 pages, 3 figures. Submitted for journal publication

arXiv:1510.04209 [pdf, other]

Finite Uniform Bisimulations for Linear Systems with Finite Input Alphabets

Authors: Donglei Fan, Danielle C. Tarraf

Abstract: We consider a class of systems over finite alphabets, namely discrete-time systems with linear dynamics and a finite input alphabet. We formulate a notion of finite uniform bisimulation, and motivate and propose a notion of regular finite uniform bisimulation. We derive sufficient conditions for the existence of finite uniform bisimulations, and propose and analyze algorithms to compute finite uni… ▽ More We consider a class of systems over finite alphabets, namely discrete-time systems with linear dynamics and a finite input alphabet. We formulate a notion of finite uniform bisimulation, and motivate and propose a notion of regular finite uniform bisimulation. We derive sufficient conditions for the existence of finite uniform bisimulations, and propose and analyze algorithms to compute finite uniform bisimulations when the sufficient conditions are satisfied. We investigate the necessary conditions, and conclude with a set of illustrative examples. △ Less

Submitted 14 October, 2015; originally announced October 2015.

Comments: 19 pages

Showing 1–32 of 32 results for author: Fan, D