Safe Reinforcement Learning for Power System Control: A Review

Authors: Peipei Yu, Zhenyi Wang, Hongcai Zhang, Yonghua Song

Abstract: The large-scale integration of intermittent renewable energy resources introduces increased uncertainty and volatility to the supply side of power systems, thereby complicating system operation and control. Recently, data-driven approaches, particularly reinforcement learning (RL), have shown significant promise in addressing complex control challenges in power systems, because RL can learn from interactive feedback without needing prior knowledge of the system model. However, the training process of model-free RL methods relies heavily on random decisions for exploration, which may result in "bad" decisions that violate critical safety constraints and lead to catastrophic control outcomes. Due to the inability of RL methods to theoretically ensure decision safety in power systems, directly deploying traditional RL algorithms in the real world is deemed unacceptable. Consequently, the safety issue in RL applications, known as safe RL, has garnered considerable attention in recent years, leading to numerous important developments. This paper provides a comprehensive review of the state-of-the-art safe RL techniques and discusses how these techniques can be applied to power system control problems such as frequency regulation, voltage control, and energy management. We then present discussions on key challenges and future research directions, related to convergence and optimality, training efficiency, universality, and real-world deployment.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2402.09871 [pdf, other]

MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music

Authors: Zihao Wang, Shuyu Li, Tao Zhang, Qi Wang, Pengfei Yu, **yang Luo, Yan Liu, Ming Xi, Kejun Zhang

Abstract: The rapidly evolving multimodal Large Language Models (LLMs) urgently require new benchmarks to uniformly evaluate their performance on understanding and textually describing music. However, due to semantic gaps between Music Information Retrieval (MIR) algorithms and human understanding, discrepancies between professionals and the public, and low precision of annotations, existing music description datasets cannot serve as benchmarks. To this end, we present MuChin, the first open-source music description benchmark in Chinese colloquial language, designed to evaluate the performance of multimodal LLMs in understanding and describing music. We established the Caichong Music Annotation Platform (CaiMAP) that employs an innovative multi-person, multi-stage assurance method, and recruited both amateurs and professionals to ensure the precision of annotations and alignment with popular semantics. Utilizing this method, we built a dataset with multi-dimensional, high-precision music annotations, the Caichong Music Dataset (CaiMD), and carefully selected 1,000 high-quality entries to serve as the test set for MuChin. Based on MuChin, we analyzed the discrepancies between professionals and amateurs in terms of music description, and empirically demonstrated the effectiveness of annotated data for fine-tuning LLMs. Ultimately, we employed MuChin to evaluate existing music understanding models on their ability to provide colloquial descriptions of music.
All data related to the benchmark, along with the scoring code and detailed appendices, have been open-sourced (https://github.com/CarlWangChina/MuChin/).

Submitted 13 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: Accepted by International Joint Conference on Artificial Intelligence 2024 (IJCAI 2024)

MSC Class: 68Txx (Primary) 14F05; 91Fxx (Secondary)

ACM Class: I.2.7; J.5

arXiv:2312.13752 [pdf]

Hunting imaging biomarkers in pulmonary fibrosis: Benchmarks of the AIIB23 challenge

Authors: Yang Nan, Xiaodan Xing, Shiyi Wang, Zeyu Tang, Federico N Felder, Sheng Zhang, Roberta Eufrasia Ledda, Xiaoliu Ding, Ruiqi Yu, Wei** Liu, Feng Shi, Tianyang Sun, Zehong Cao, Minghui Zhang, Yun Gu, Hanxiao Zhang, Jian Gao, **yu Wang, Wen Tang, Pengxin Yu, Han Kang, Junqiang Chen, Xing Lu, Boyu Zhang, Michail Mamalakis, et al. (16 additional authors not shown)

Abstract: Airway-related quantitative imaging biomarkers are crucial for examination, diagnosis, and prognosis in pulmonary diseases. However, the manual delineation of airway trees remains prohibitively time-consuming. While significant efforts have been made towards enhancing airway modelling, current publicly available datasets concentrate on lung diseases with moderate morphological variations. The intricate honeycombing patterns present in the lung tissues of fibrotic lung disease patients exacerbate the challenges, often leading to various prediction errors. To address this issue, the 'Airway-Informed Quantitative CT Imaging Biomarker for Fibrotic Lung Disease 2023' (AIIB23) competition was organized in conjunction with the official 2023 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). The airway structures were meticulously annotated by three experienced radiologists. Competitors were encouraged to develop automatic airway segmentation models with high robustness and generalization abilities, followed by exploring the most correlated QIB of mortality prediction. A training set of 120 high-resolution computerised tomography (HRCT) scans was publicly released with expert annotations and mortality status. The online validation set incorporated 52 HRCT scans from patients with fibrotic lung disease and the offline test set included 140 cases from fibrosis and COVID-19 patients.
The results have shown that the capacity of extracting airway trees from patients with fibrotic lung disease could be enhanced by introducing voxel-wise weighted general union loss and continuity loss. In addition to the competitive image biomarkers for prognosis, a strong airway-derived biomarker (hazard ratio > 1.5, p < 0.0001) was revealed for survival prognostication compared with existing clinical measurements, clinician assessment and AI-based biomarkers.

Submitted 16 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: 19 pages

arXiv:2310.01163 [pdf, other]

Trust-Aware Motion Planning for Human-Robot Collaboration under Distribution Temporal Logic Specifications

Authors: Pian Yu, Shuyang Dong, Shili Sheng, Lu Feng, Marta Kwiatkowska

Abstract: Recent work has considered trust-aware decision making for human-robot collaboration (HRC) with a focus on model learning. In this paper, we are interested in enabling the HRC system to complete complex tasks specified using temporal logic that involve human trust. Since human trust in robots is not observable, we adopt the widely used partially observable Markov decision process (POMDP) framework for modelling the interactions between humans and robots. To specify the desired behaviour, we propose to use syntactically co-safe linear distribution temporal logic (scLDTL), a logic that is defined over predicates of states as well as belief states of partially observable systems. The incorporation of belief predicates in scLDTL enhances its expressiveness while simultaneously introducing added complexity. This also presents a new challenge as the belief predicates must be evaluated over the continuous (infinite) belief space. To address this challenge, we present an algorithm for solving the optimal policy synthesis problem. First, we enhance the belief MDP (derived by reformulating the POMDP) with a probabilistic labelling function. Then a product belief MDP is constructed between the probabilistically labelled belief MDP and the automaton translation of the scLDTL formula. Finally, we show that the optimal policy can be obtained by leveraging existing point-based value iteration algorithms with essential modifications. Human subject experiments with 21 participants on a driving simulator demonstrate the effectiveness of the proposed approach.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2309.14347 [pdf, other]

Continuous-time control synthesis under nested signal temporal logic specifications

Authors: Pian Yu, Xiao Tan, Dimos V. Dimarogonas

Abstract: In this work, we propose a novel approach for the continuous-time control synthesis of nonlinear systems under nested signal temporal logic (STL) specifications. While the majority of existing literature focuses on control synthesis for STL specifications without nested temporal operators, addressing nested temporal operators poses a notably more challenging scenario and requires new theoretical advancements. Our approach hinges on the concepts of signal temporal logic tree (sTLT) and control barrier function (CBF). Specifically, we detail the construction of an sTLT from a given STL formula and a continuous-time dynamical system, the sTLT semantics (i.e., satisfaction condition), and the equivalence or under-approximation relation between sTLT and STL. Leveraging the fact that the satisfaction condition of an sTLT is essentially keeping the state within certain sets during certain time intervals, it provides explicit guidelines for the CBF design. The resulting controller is obtained through the utilization of an online CBF-based program coupled with an event-triggered scheme for updating the activation time interval of each CBF online, with which the correctness of the system behavior can be established by construction. We demonstrate the efficacy of the proposed method for single-integrator and unicycle models under nested STL formulas.

Submitted 22 January, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

Comments: Link to accompanying code: https://github.com/xiaotan-git/sTLT

arXiv:2307.06000 [pdf, other]

Reactive and human-in-the-loop planning and control of multi-robot systems under LTL specifications in dynamic environments

Authors: Pian Yu, Gianmarco Fedeli, Dimos V. Dimarogonas

Abstract: This paper investigates the planning and control problems for multi-robot systems under linear temporal logic (LTL) specifications. In contrast to most of the existing literature, which presumes a static and known environment, our study focuses on dynamic environments that can have unknown moving obstacles like humans walking through. Depending on whether local communication is allowed between robots, we consider two different online re-planning approaches. When local communication is allowed, we propose a local trajectory generation algorithm for each robot to resolve conflicts that are detected online. In the other case, i.e., no communication is allowed, we develop a model predictive controller to reactively avoid potential collisions. In both cases, task satisfaction is guaranteed whenever it is feasible. In addition, we consider the human-in-the-loop scenario where humans may additionally take control of one or multiple robots. We design a mixed-initiative controller for each robot to prevent unsafe human behaviors while guaranteeing the LTL satisfaction. Using our previously developed ROS software package, several experiments are conducted to demonstrate the effectiveness and the applicability of the proposed strategies.

Submitted 12 July, 2023; originally announced July 2023.

Comments: Accepted by the 9th International Conference on Control, Decision and Information Technologies (CoDIT 2023)

arXiv:2304.14894 [pdf, other]

Making the Invisible Visible: Toward High-Quality Terahertz Tomographic Imaging via Physics-Guided Restoration

Authors: Weng-Tai Su, Yi-Chun Hung, Po-Jen Yu, Shang-Hua Yang, Chia-Wen Lin

Abstract: Terahertz (THz) tomographic imaging has recently attracted significant attention thanks to its non-invasive, non-destructive, non-ionizing, material-classification, and ultra-fast nature for object exploration and inspection. However, its strong water absorption nature and low noise tolerance lead to undesired blurs and distortions of reconstructed THz images. The diffraction-limited THz signals highly constrain the performance of existing restoration methods. To address the problem, we propose a novel multi-view Subspace-Attention-guided Restoration Network (SARNet) that fuses multi-view and multi-spectral features of THz images for effective image restoration and 3D tomographic reconstruction. To this end, SARNet uses multi-scale branches to extract intra-view spatio-spectral amplitude and phase features and fuse them via shared subspace projection and self-attention guidance. We then perform inter-view fusion to further improve the restoration of individual views by leveraging the redundancies between neighboring views. Here, we experimentally construct a THz time-domain spectroscopy (THz-TDS) system covering a broad frequency range from 0.1 THz to 4 THz for building up a temporal/spectral/spatial/material THz database of hidden 3D objects. Complementary to a quantitative evaluation, we demonstrate the effectiveness of our SARNet model on 3D THz tomographic reconstruction applications.

Submitted 28 April, 2023; originally announced April 2023.

Comments: 34 pages, 13 figures

arXiv:2304.11837 [pdf, other]

Fault-tolerant Control of an Over-actuated UAV Platform Built on Quadcopters and Passive Hinges

Authors: Yao Su, Pengkang Yu, Matthew J. Gerber, Lecheng Ruan, Tsu-Chin Tsao

Abstract: Propeller failure is a major cause of multirotor Unmanned Aerial Vehicle (UAV) crashes. While conventional multirotor systems struggle to address this issue due to underactuation, over-actuated platforms can continue flying with appropriate fault-tolerant control (FTC). This paper presents a robust FTC controller for an over-actuated UAV platform composed of quadcopters mounted on passive joints, offering input redundancy at both the high-level vehicle control and the low-level quadcopter control of vectored thrusts. To maximize the benefits of input redundancy during propeller failure, the proposed FTC controller features a hierarchical control architecture with three key components: (i) a low-level adjustment strategy to prevent propeller-level thrust saturation; (ii) a compensation loop for mitigating introduced disturbances; (iii) a nullspace-based control allocation framework to avoid quadcopter-level thrust saturation. Through reallocating actuator inputs in both the low-level and high-level control loops, the low-level quadcopter control can be maintained with up to two failed propellers, ensuring that the whole platform remains stable and avoids crashing. The proposed controller's superior performance is thoroughly examined through simulations and real-world experiments.

Submitted 14 June, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

arXiv:2304.03708 [pdf, other]

Efficient automatic segmentation for multi-level pulmonary arteries: The PARSE challenge

Authors: Gongning Luo, Kuanquan Wang, Jun Liu, Shuo Li, Xinjie Liang, Xiangyu Li, Shaowei Gan, Wei Wang, Suyu Dong, Wenyi Wang, Pengxin Yu, Enyou Liu, Hongrong Wei, Na Wang, Jia Guo, Huiqi Li, Zhao Zhang, Ziwei Zhao, Na Gao, Nan An, Ashkan Pakzad, Bojidar Rangelov, Jiaqi Dou, Song Tian, Zeyu Liu, et al. (5 additional authors not shown)

Abstract: Efficient automatic segmentation of multi-level (i.e. main and branch) pulmonary arteries (PA) in CTPA images plays a significant role in clinical applications. However, most existing methods concentrate only on main PA or branch PA segmentation separately and ignore segmentation efficiency. Besides, there is no public large-scale dataset focused on PA segmentation, which makes it highly challenging to compare the different methods. To benchmark multi-level PA segmentation algorithms, we organized the first Pulmonary ARtery SEgmentation (PARSE) challenge. On the one hand, we focus on both the main PA and the branch PA segmentation. On the other hand, for better clinical application, we assign the same score weight to segmentation efficiency (mainly running time and GPU memory consumption during inference) while ensuring PA segmentation accuracy. We present a summary of the top algorithms and offer some suggestions for efficient and accurate multi-level PA automatic segmentation. We provide the PARSE challenge as open-access for the community to benchmark future algorithm developments at https://parse2022.grand-challenge.org/Parse2022/.

Submitted 7 April, 2023; originally announced April 2023.

arXiv:2303.05745 [pdf, other]

Multi-site, Multi-domain Airway Tree Modeling (ATM'22): A Public Benchmark for Pulmonary Airway Segmentation

Authors: Minghui Zhang, Yangqian Wu, Hanxiao Zhang, Yulei Qin, Hao Zheng, Wen Tang, Corey Arnold, Chenhao Pei, Pengxin Yu, Yang Nan, Guang Yang, Simon Walsh, Dominic C. Marshall, Matthieu Komorowski, Puyang Wang, Dazhou Guo, Dakai **, Ya'nan Wu, Shuiqing Zhao, Runsheng Chang, Boyu Zhang, Xing Lv, Abdul Qayyum, Moona Mazher, Qi Su, et al. (11 additional authors not shown)

Abstract: Open international challenges are becoming the de facto standard for assessing computer vision and image analysis algorithms. In recent years, new methods have extended the reach of pulmonary airway segmentation that is closer to the limit of image resolution. Since EXACT'09 pulmonary airway segmentation, limited effort has been directed to quantitative comparison of newly emerged algorithms driven by the maturity of deep learning based approaches and clinical drive for resolving finer details of distal airways for early intervention of pulmonary diseases. Thus far, public annotated datasets are extremely limited, hindering the development of data-driven methods and detailed performance evaluation of new algorithms. To provide a benchmark for the medical imaging community, we organized the Multi-site, Multi-domain Airway Tree Modeling (ATM'22), which was held as an official challenge event during the MICCAI 2022 conference. ATM'22 provides large-scale CT scans with detailed pulmonary airway annotation, including 500 CT scans (300 for training, 50 for validation, and 150 for testing). The dataset was collected from different sites and it further included a portion of noisy COVID-19 CTs with ground-glass opacity and consolidation. Twenty-three teams participated in the entire phase of the challenge and the algorithms for the top ten teams are reviewed in this paper. Quantitative and qualitative results revealed that deep learning models embedded with the topological continuity enhancement achieved superior performance in general.
The ATM'22 challenge has an open-call design; the training data and the gold standard evaluation are available upon successful registration via its homepage.

Submitted 27 June, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

Comments: 32 pages, 16 figures. Homepage: https://atm22.grand-challenge.org/. Submitted

arXiv:2302.06611 [pdf, other]

Deep Learning and Medical Imaging for COVID-19 Diagnosis: A Comprehensive Survey

Authors: Song Wu, Yazhou Ren, Aodi Yang, Xinyue Chen, Xiaorong Pu, **g He, Liqiang Nie, Philip S. Yu

Abstract: COVID-19 (Coronavirus disease 2019) has been quickly spreading since its outbreak, impacting financial markets and healthcare systems globally. Countries all around the world have adopted a number of extraordinary steps to restrict the spreading virus, where early COVID-19 diagnosis is essential. Medical images such as X-ray images and Computed Tomography scans are becoming one of the main diagnostic tools to combat COVID-19 with the aid of deep learning-based systems. In this survey, we investigate the main contributions of deep learning applications using medical images in fighting against COVID-19 from the aspects of image classification, lesion localization, and severity quantification, and review different deep learning architectures and some image preprocessing techniques for achieving a more precise diagnosis. We also provide a summary of the X-ray and CT image datasets used in various studies for COVID-19 detection. The key difficulties and potential applications of deep learning in fighting against COVID-19 are finally discussed. This work summarizes the latest methods of deep learning using medical images to diagnose COVID-19, highlighting the challenges and inspiring more studies to keep utilizing the advantages of deep learning to combat COVID-19.

Submitted 12 February, 2023; originally announced February 2023.

arXiv:2211.06770 [pdf, other]

MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

Authors: Andrey Ignatov, Anastasia Sycheva, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc Van Gool

Abstract: While neural network-based photo processing solutions can provide a better image quality compared to the traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity. In this paper, we present a novel MicroISP model designed specifically for edge devices, taking into account their computational and memory limitations. The proposed solution is capable of processing up to 32MP photos on recent smartphones using the standard mobile ML libraries and requiring less than 1 second to perform the inference, while for FullHD images it achieves real-time performance. The architecture of the model is flexible, allowing its complexity to be adjusted to devices of different computational power. To evaluate the performance of the model, we collected a novel Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format FujiFilm GFX100 camera. The experiments demonstrated that, despite its compact size, the MicroISP model is able to provide comparable or better visual results than the traditional mobile ISP systems, while outperforming the previously proposed efficient deep learning based solutions. Finally, this model is also compatible with the latest mobile AI accelerators, achieving good runtime and low power consumption on smartphone NPUs and APUs. The code, dataset and pre-trained models are available on the project website: https://people.ee.ethz.ch/~ihnatova/microisp.html

Submitted 8 November, 2022; originally announced November 2022.

Comments: arXiv admin note: text overlap with arXiv:2211.06263

arXiv:2211.06263 [pdf, other]

PyNet-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks

Authors: Andrey Ignatov, Grigory Malivenko, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc Van Gool

Abstract: The increased importance of mobile photography created a need for fast and performant RAW image processing pipelines capable of producing good visual results in spite of the mobile camera sensor limitations. While deep learning-based approaches can efficiently solve this problem, their computational requirements usually remain too large for high-resolution on-device image processing. To address this limitation, we propose a novel PyNET-V2 Mobile CNN architecture designed specifically for edge devices, being able to process RAW 12MP photos directly on mobile phones in under 1.5 seconds and producing high perceptual photo quality. To train and to evaluate the performance of the proposed solution, we use the real-world Fujifilm UltraISP dataset consisting of thousands of RAW-RGB image pairs captured with a professional medium-format 102MP Fujifilm camera and a popular Sony mobile camera sensor. The results demonstrate that the PyNET-V2 Mobile model can substantially surpass the quality of traditional ISP pipelines, while outperforming the previously introduced neural network-based solutions designed for fast image processing. Furthermore, we show that the proposed architecture is also compatible with the latest mobile AI accelerators such as NPUs or APUs that can be used to further reduce the latency of the model to as little as 0.5 seconds. The dataset, code and pre-trained models used in this paper are available on the project website: https://github.com/gmalivenko/PyNET-v2

Submitted 8 November, 2022; originally announced November 2022.

arXiv:2209.06054 [pdf, other]

doi 10.1145/3503161.3548368

SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

Authors: Zihao Wang, Qihao Liang, Kejun Zhang, Yuxing Wang, Chen Zhang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang

Abstract: Real-time music accompaniment generation has a wide range of applications in the music industry, such as music education and live performances. However, automatic real-time music accompaniment generation is still understudied and often faces a trade-off between logical latency and exposure bias. In this paper, we propose SongDriver, a real-time music accompaniment generation system without logical latency nor exposure bias. Specifically, SongDriver divides one accompaniment generation task into two phases: 1) The arrangement phase, where a Transformer model first arranges chords for input melodies in real-time, and caches the chords for the next phase instead of playing them out. 2) The prediction phase, where a CRF model generates playable multi-track accompaniments for the coming melodies based on previously cached chords. With this two-phase strategy, SongDriver directly generates the accompaniment for the upcoming melody, achieving zero logical latency. Furthermore, when predicting chords for a timestep, SongDriver refers to the cached chords from the first phase rather than its previous predictions, which avoids the exposure bias problem. Since the input length is often constrained under real-time conditions, another potential problem is the loss of long-term sequential information. To make up for this disadvantage, we extract four musical features from a long-term music piece before the current time step as global information.
In the experiment, we train SongDriver on some open-source datasets and an original àiSong Dataset built from Chinese-style modern pop music scores. The results show that SongDriver outperforms existing SOTA (state-of-the-art) models on both objective and subjective metrics, meanwhile significantly reducing the physical latency.

Submitted 13 October, 2022; v1 submitted 13 September, 2022; originally announced September 2022.

Comments: *Both Zihao Wang and Qihao Liang contribute equally to the paper and share the co-first authorship. This paper has been accepted by ACM Multimedia 2022, oral session, full paper (main track)

arXiv:2206.06267 [pdf, other]

doi 10.1109/EMBC48229.2022.9871639

MMMNA-Net for Overall Survival Time Prediction of Brain Tumor Patients

Authors: Wen Tang, Haoyue Zhang, Pengxin Yu, Han Kang, Rongguo Zhang

Abstract: Overall survival (OS) time is one of the most important evaluation indices for gliomas situations. Multimodal Magnetic Resonance Imaging (MRI) scans play an important role in the study of glioma prognosis OS time. Several deep learning-based methods are proposed for the OS time prediction on multi-modal MRI problems. However, these methods usually fuse multi-modal information at the beginning or at the end of the deep learning networks and lack the fusion of features from different scales. In addition, the fusion at the end of networks always adapts global with global (eg. fully connected after concatenation of global average pooling output) or local with local (eg. bilinear pooling), which loses the information of local with global. In this paper, we propose a novel method for multi-modal OS time prediction of brain tumor patients, which contains an improved nonlocal features fusion module introduced on different scales. Our method obtains a relative 8.76% improvement over the current state-of-art method (0.6989 vs. 0.6426 on accuracy). Extensive testing demonstrates that our method could adapt to situations with missing modalities. The code is available at https://github.com/TangWen920812/mmmna-net.

Submitted 13 June, 2022; originally announced June 2022.

Comments: Accepted EMBC 2022

arXiv:2206.06253 [pdf, ps, other]

doi 10.1007/978-3-031-16446-0_33

RPLHR-CT Dataset and Transformer Baseline for Volumetric Super-Resolution from CT Scans

Authors: Pengxin Yu, Haoyue Zhang, Han Kang, Wen Tang, Corey W. Arnold, Rongguo Zhang

Abstract: In clinical practice, anisotropic volumetric medical images with low through-plane resolution are commonly used due to short acquisition time and lower storage cost. Nevertheless, the coarse resolution may lead to difficulties in medical diagnosis by either physicians or computer-aided diagnosis algorithms. Deep learning-based volumetric super-resolution (SR) methods are feasible ways to improve resolution, with convolutional neural networks (CNN) at their core. Despite recent progress, these methods are limited by inherent properties of convolution operators, which ignore content relevance and cannot effectively model long-range dependencies. In addition, most of the existing methods use pseudo-paired volumes for training and evaluation, where pseudo low-resolution (LR) volumes are generated by a simple degradation of their high-resolution (HR) counterparts. However, the domain gap between pseudo- and real-LR volumes leads to the poor performance of these methods in practice. In this paper, we build the first public real-paired dataset RPLHR-CT as a benchmark for volumetric SR, and provide baseline results by re-implementing four state-of-the-art CNN-based methods. Considering the inherent shortcoming of CNN, we also propose a transformer volumetric super-resolution network (TVSRN) based on attention mechanisms, dispensing with convolutions entirely. This is the first research to use a pure transformer for CT volumetric SR. The experimental results show that TVSRN significantly outperforms all baselines on both PSNR and SSIM.
Moreover, the TVSRN method achieves a better trade-off between the image quality, the number of parameters, and the running time. Data and code are available at https://github.com/smilenaxx/RPLHR-CT.

Submitted 13 June, 2022; originally announced June 2022.

Comments: Accepted MICCAI 2022

arXiv:2205.00327 [pdf, other]

Physics-guided Terahertz Computational Imaging

Authors: Weng-Tai Su, Yi-Chun Hung, Po-Jen Yu, Chia-Wen Lin, Shang-Hua Yang

Abstract: Visualizing information inside objects is an ever-lasting need to bridge the world from physics, chemistry, biology to computation. Among all tomographic techniques, terahertz (THz) computational imaging has demonstrated its unique sensing features to digitalize multi-dimensional object information in a non-destructive, non-ionizing, and non-invasive way. Applying modern signal processing and physics-guided modalities, THz computational imaging systems are now launched in various application fields in industrial inspection, security screening, chemical inspection and non-destructive evaluation. In this article, we overview recent advances in THz computational imaging modalities in the aspects of system configuration, wave propagation and interaction models, physics-guided algorithm for digitalizing interior information of imaged objects. Several image restoration and reconstruction issues based on multi-dimensional THz signals are further discussed, which provides a crosslink between material digitalization, functional property extraction, and multi-dimensional imager utilization from a signal processing perspective.

Submitted 30 April, 2022; originally announced May 2022.

arXiv:2205.00324 [pdf, other]

doi 10.1364/OE.461439

Terahertz Spatio-Temporal Deep Learning Computed Tomography

Authors: Yi-Chun Hung, Ta-Hsuan Chao, Pojen Yu, Shang-Hua Yang

Abstract: Terahertz computed tomography (THz CT) has drawn significant attention because of its unique capability to bring multi-dimensional object information from invisible to visible. However, current physics-model-based THz CT modalities present low data use efficiency on time-resolved THz signals and low model fusion extensibility, limiting their application fields' practical use. In this paper, we propose a supervised THz deep learning computed tomography (THz DL-CT) framework based on time-domain information. THz DL-CT restores superior THz tomographic images of 3D objects by extracting features from spatio-temporal THz signals without any prior material information. Compared with conventional and machine learning based methods, THz DL-CT delivers at least 50.2%, and 52.6% superior in root mean square error (RMSE) and structural similarity index (SSIM), respectively. Additionally, we have experimentally demonstrated that the pretrained THz DL-CT model can generalize to reconstruct multi-material systems with no prerequisite information. THz CT through the DL data fusion approach provides a new pathway for non-invasive functional imaging in object investigation.

Submitted 3 May, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

arXiv:2201.06263 [pdf]

Learning-based multiplexed transmission of scattered twisted light through a kilometer-scale standard multimode fiber

Authors: Yifan Liu, Zhisen Zhang, Panpan Yu, Yijing Wu, Ziqiang Wang, Yinmei Li, Wen Liu, Lei Gong

Abstract: Multiplexing multiple orbital angular momentum (OAM) modes of light has the potential to increase data capacity in optical communication. However, the distribution of such modes over long distances remains challenging. Free-space transmission is strongly influenced by atmospheric turbulence and light scattering, while the wave distortion induced by the mode dispersion in fibers disables OAM demultiplexing in fiber-optic communications. Here, a deep-learning-based approach is developed to recover the data from scattered OAM channels without measuring any phase information. Over a 1-km-long standard multimode fiber, the method is able to identify different OAM modes with an accuracy of more than 99.9% in parallel demultiplexing of 24 scattered OAM channels. To demonstrate the transmission quality, color images are encoded in multiplexed twisted light and our method achieves decoding the transmitted data with an error rate of 0.13%. Our work shows the artificial intelligence algorithm could benefit the use of OAM multiplexing in commercial fiber networks and high-performance optical communication in turbulent environments.

Submitted 17 January, 2022; originally announced January 2022.

Comments: 12 pages, 5 figures

arXiv:2112.10949 [pdf, other]

District Cooling System Control for Providing Operating Reserve based on Safe Deep Reinforcement Learning

Authors: Peipei Yu, Hongxun Hui, Hongcai Zhang, Ge Chen, Yonghua Song

Abstract: Heating, ventilation, and air conditioning (HVAC) systems are well proved to be capable to provide operating reserve for power systems. As a type of large-capacity and energy-efficient HVAC system (up to 100 MW), district cooling system (DCS) is emerging in modern cities and has huge potential to be regulated as a flexible load. However, strategically controlling a DCS to provide flexibility is challenging, because one DCS services multiple buildings with complex thermal dynamics and uncertain cooling demands. Improper control may lead to significant thermal discomfort and even deteriorate the power system's operation security. To address the above issues, we propose a model-free control strategy based on the deep reinforcement learning (DRL) without the requirement of accurate system model and uncertainty distribution. To avoid damaging "trial & error" actions that may violate the system's operation security during the training process, we further propose a safe layer combined to the DRL to guarantee the satisfaction of critical constraints, forming a safe-DRL scheme. Moreover, after providing operating reserve, DCS increases power and tries to recover all the buildings' temperature back to set values, which may probably cause an instantaneous peak-power rebound and bring a secondary impact on power systems. Therefore, we design a self-adaption reward function within the proposed safe-DRL scheme to constrain the peak-power effectively. Numerical studies based on a realistic DCS demonstrate the effectiveness of the proposed methods.

Submitted 20 December, 2021; originally announced December 2021.

arXiv:2107.11645 [pdf]

Dual-Attention Enhanced BDense-UNet for Liver Lesion Segmentation

Authors: Wenming Cao, Philip L. H. Yu, Gilbert C. S. Lui, Keith W. H. Chiu, Ho-Ming Cheng, Yanwen Fang, Man-Fung Yuen, Wai-Kay Seto

Abstract: In this work, we propose a new segmentation network by integrating DenseUNet and bidirectional LSTM together with attention mechanism, termed as DA-BDense-UNet. DenseUNet allows learning enough diverse features and enhancing the representative power of networks by regulating the information flow. Bidirectional LSTM is responsible to explore the relationships between the encoded features and the up-sampled features in the encoding and decoding paths. Meanwhile, we introduce attention gates (AG) into DenseUNet to diminish responses of unrelated background regions and magnify responses of salient regions progressively. Besides, the attention in bidirectional LSTM takes into account the contribution differences of the encoded features and the up-sampled features in segmentation improvement, which can in turn adjust proper weights for these two kinds of features. We conduct experiments on liver CT image data sets collected from multiple hospitals by comparing them with state-of-the-art segmentation models. Experimental results indicate that our proposed method DA-BDense-UNet has achieved comparative performance in terms of dice coefficient, which demonstrates its effectiveness.

Submitted 24 July, 2021; originally announced July 2021.

Comments: 9 pages, 3 figures

arXiv:2104.02609 [pdf, other]

I-ODA, Real-World Multi-modal Longitudinal Data for Ophthalmic Applications

Authors: Nooshin Mojab, Vahid Noroozi, Abdullah Aleem, Manoj P. Nallabothula, Joseph Baker, Dimitri T. Azar, Mark Rosenblatt, RV Paul Chan, Darvin Yi, Philip S. Yu, Joelle A. Hallak

Abstract: Data from clinical real-world settings is characterized by variability in quality, machine-type, setting, and source. One of the primary goals of medical computer vision is to develop and validate artificial intelligence (AI) based algorithms on real-world data enabling clinical translations. However, despite the exponential growth in AI based applications in healthcare, specifically in ophthalmology, translations to clinical settings remain challenging. Limited access to adequate and diverse real-world data inhibits the development and validation of translatable algorithms. In this paper, we present a new multi-modal longitudinal ophthalmic imaging dataset, the Illinois Ophthalmic Database Atlas (I-ODA), with the goal of advancing state-of-the-art computer vision applications in ophthalmology, and improving upon the translatable capacity of AI based applications across different clinical settings. We present the infrastructure employed to collect, annotate, and anonymize images from multiple sources, demonstrating the complexity of real-world retrospective data and its limitations. I-ODA includes 12 imaging modalities with a total of 3,668,649 ophthalmic images of 33,876 individuals from the Department of Ophthalmology and Visual Sciences at the Illinois Eye and Ear Infirmary of the University of Illinois Chicago (UIC) over the course of 12 years.

Submitted 29 March, 2021; originally announced April 2021.

arXiv:2103.16932 [pdf, other]

Seeing through a Black Box: Toward High-Quality Terahertz Tomographic Imaging via Multi-Scale Spatio-Spectral Image Fusion

Authors: Weng-tai Su, Yi-Chun Hung, Ta-Hsuan Chao, Po-Jen Yu, Shang-Hua Yang, Chia-Wen Lin

Abstract: Terahertz (THz) imaging has recently attracted significant attention thanks to its non-invasive, non-destructive, non-ionizing, material-classification, and ultra-fast nature for object exploration and inspection. However, its strong water absorption nature and low noise tolerance lead to undesired blurs and distortions of reconstructed THz images. The performances of existing restoration methods are highly constrained by the diffraction-limited THz signals. To address the problem, we propose a novel Subspace-and-Attention-guided Restoration Network (SARNet) that fuses multi-spectral features of a THz image for effective restoration. To this end, SARNet uses multi-scale branches to extract spatio-spectral features of amplitude and phase which are then fused via shared subspace projection and attention guidance. Here, we experimentally construct ultra-fast THz time-domain spectroscopy system covering a broad frequency range from 0.1 THz to 4 THz for building up temporal/spectral/spatial/phase/material THz database of hidden 3D objects. Complementary to a quantitative evaluation, we demonstrate the effectiveness of our SARNet model on 3D THz tomographic reconstruction.

Submitted 29 December, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

Comments: 10 pages, 9 figures

arXiv:2103.09111 [pdf, other]

Distributed motion coordination for multi-robot systems under LTL specifications

Authors: Pian Yu, Dimos V. Dimarogonas

Abstract: This paper investigates the online motion coordination problem for a group of mobile robots moving in a shared workspace, each of which is assigned a linear temporal logic specification. Based on the realistic assumptions that each robot is subject to both state and input constraints and can have only local view and local information, a fully distributed multi-robot motion coordination strategy is proposed. For each robot, the motion coordination strategy consists of three layers. An offline layer pre-computes the braking area for each region in the workspace, the controlled transition system, and a so-called potential function. An initialization layer outputs an initially safely satisfying trajectory. An online coordination layer resolves conflicts when one occurs. The online coordination layer is further decomposed into three steps. Firstly, a conflict detection algorithm is implemented, which detects conflicts with neighboring robots. Whenever conflicts are detected, a rule is designed to assign dynamically a planning order to each pair of neighboring robots. Finally, a sampling-based algorithm is designed to generate local collision-free trajectories for the robot which at the same time guarantees the feasibility of the specification. Safety is proven to be guaranteed for all robots at any time. The effectiveness and the computational tractability of the resulting solution is verified numerically by two case studies.

Submitted 16 March, 2021; originally announced March 2021.

Comments: It was submitted to IEEE Transactions on Robotics

arXiv:2103.09091 [pdf, other]

Online Control Synthesis for Uncertain Systems under Signal Temporal Logic Specifications

Authors: Pian Yu, Yulong Gao, Frank J. Jiang, Karl H. Johansson, Dimos V. Dimarogonas

Abstract: This paper studies the online control synthesis problem for uncertain discrete-time systems subject to signal temporal logic (STL) specifications. Different from existing techniques, this work proposes an approach based on STL, reachability analysis, and temporal logic trees. Firstly, a real-time version of STL semantics and a tube-based temporal logic tree (tTLT) are proposed. We show that the tTLT is an underapproximation for the STL formula, in the sense that a trajectory satisfying an tTLT also satisfies the corresponding STL formula. Secondly, an online control synthesis algorithm is designed. It is shown that when the STL formula is robustly satisfiable and the initial state of the system belongs to the initial root node of the tTLT, it is guaranteed that the trajectory generated by the control synthesis algorithm satisfies the STL formula. The effectiveness of the proposed approach is verified by a simulation example and a practical experiment.

Submitted 17 March, 2023; v1 submitted 16 March, 2021; originally announced March 2021.

arXiv:2103.09024 [pdf, other]

Robust approximate symbolic models for a class of continuous-time uncertain nonlinear systems via a control interface

Authors: Pian Yu, Dimos V. Dimarogonas

Abstract: Discrete abstractions have become a standard approach to assist control synthesis under complex specifications. Most techniques for the construction of a discrete abstraction for a continuous-time system require time-space discretization of the concrete system, which constitutes property satisfaction for the continuous-time system non-trivial. In this work, we aim at relaxing this requirement by introducing a control interface. Firstly, we connect the continuous-time uncertain concrete system with its discrete deterministic state-space abstraction with a control interface. Then, a novel stability notion called $η$-approximate controlled globally practically stable, and a new simulation relation called robust approximate simulation relation are proposed. It is shown that the uncertain concrete system, under the condition that there exists an admissible control interface such that the augmented system (composed of the concrete system and its abstraction) can be made $η$-approximate controlled globally practically stable, robustly approximately simulates its discrete abstraction. The effectiveness of the proposed results is illustrated by two simulation examples.

Submitted 16 March, 2021; originally announced March 2021.

Comments: This paper was submitted to Automatica in October 16, 2019, and is currently under review. arXiv admin note: substantial text overlap with arXiv:1909.09040

arXiv:2004.10437 [pdf, other]

A fully distributed motion coordination strategy for multi-robot systems with local information

Authors: Pian Yu, Dimos V. Dimarogonas

Abstract: This paper investigates the online motion coordination problem for a group of mobile robots moving in a shared workspace. Based on the realistic assumptions that each robot is subject to both velocity and input constraints and can have only local view and local information, a fully distributed multi-robot motion coordination strategy is proposed. Building on top of a cell decomposition, a conflict detection algorithm is presented first. Then, a rule is proposed to assign dynamically a planning order to each pair of neighboring robots, which is deadlock-free. Finally, a two-step motion planning process that combines fixed-path planning and trajectory planning is designed. The effectiveness of the resulting solution is verified by a simulation example.

Submitted 22 April, 2020; originally announced April 2020.

Comments: Accepted by the 2020 American Control Conference

arXiv:1911.03583 [pdf, other]

Community-preserving Graph Convolutions for Structural and Functional Joint Embedding of Brain Networks

Authors: Jiahao Liu, Guixiang Ma, Fei Jiang, Chun-Ta Lu, Philip S. Yu, Ann B. Ragin

Abstract: Brain networks have received considerable attention given the critical significance for understanding human brain organization, for investigating neurological disorders and for clinical diagnostic applications. Structural brain network (e.g. DTI) and functional brain network (e.g. fMRI) are the primary networks of interest. Most existing works in brain network analysis focus on either structural or functional connectivity, which cannot leverage the complementary information from each other. Although multi-view learning methods have been proposed to learn from both networks (or views), these methods aim to reach a consensus among multiple views, and thus distinct intrinsic properties of each view may be ignored. How to jointly learn representations from structural and functional brain networks while preserving their inherent properties is a critical problem. In this paper, we propose a framework of Siamese community-preserving graph convolutional network (SCP-GCN) to learn the structural and functional joint embedding of brain networks. Specifically, we use graph convolutions to learn the structural and functional joint embedding, where the graph structure is defined with structural connectivity and node features are from the functional connectivity. Moreover, we propose to preserve the community structure of brain networks in the graph convolutions by considering the intra-community and inter-community properties in the learning process. Furthermore, we use Siamese architecture which models the pair-wise similarity learning to guide the learning process.
To evaluate the proposed approach, we conduct extensive experiments on two real brain network datasets. The experimental results demonstrate the superior performance of the proposed approach in structural and functional joint embedding for neurological disorder analysis, indicating its promising value for clinical applications.

Submitted 8 November, 2019; originally announced November 2019.

arXiv:1910.01287 [pdf, other]

Exoskeleton-covered soft finger with vision-based proprioception and tactile sensing

Authors: Yu She, Sandra Q. Liu, Peiyu Yu, Edward Adelson

Abstract: Soft robots offer significant advantages in adaptability, safety, and dexterity compared to conventional rigid-body robots. However, it is challenging to equip soft robots with accurate proprioception and tactile sensing due to their high flexibility and elasticity. In this work, we describe the development of a vision-based proprioceptive and tactile sensor for soft robots called GelFlex, which is inspired by previous GelSight sensing techniques. More specifically, we develop a novel exoskeleton-covered soft finger with embedded cameras and deep learning methods that enable high-resolution proprioceptive sensing and rich tactile sensing. To do so, we design features along the axial direction of the finger, which enable high-resolution proprioceptive sensing, and incorporate a reflective ink coating on the surface of the finger to enable rich tactile sensing. We design a highly underactuated exoskeleton with a tendon-driven mechanism to actuate the finger. Finally, we assemble 2 of the fingers together to form a robotic gripper and successfully perform a bar stock classification task, which requires both shape and tactile information. We train neural networks for proprioception and shape (box versus cylinder) classification using data from the embedded sensors. The proprioception CNN had over 99% accuracy on our testing set (all six joint angles were within 1 degree of error) and had an average accumulative distance error of 0.77 mm during live testing, which is better than human finger proprioception.
These proposed techniques offer soft robots the high-level ability to simultaneously perceive their proprioceptive state and peripheral environment, providing potential solutions for soft robots to solve everyday manipulation tasks. We believe the methods developed in this work can be widely applied to different designs and applications.

Submitted 23 June, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

Comments: Accepted to ICRA2020

arXiv:1909.09040 [pdf, other]

Approximately symbolic models for a class of continuous-time nonlinear systems

Authors: Pian Yu, Dimos V. Dimarogonas

Abstract: Discrete abstractions have become a standard approach to assist control synthesis under complex specifications. Most techniques for the construction of discrete abstractions are based on sampling of both the state and time spaces, which may not be able to guarantee safety for continuous-time systems. In this work, we aim at addressing this problem by considering only state-space abstraction. Firstly, we connect the continuous-time concrete system with its discrete (state-space) abstraction with a control interface. Then, a novel stability notion called controlled globally asymptotic/practical stability with respect to a set is proposed. It is shown that every system, under the condition that there exists an admissible control interface such that the augmented system (composed of the concrete system and its abstraction) can be made controlled globally practically stable with respect to the given set, is approximately simulated by its discrete abstraction. The effectiveness of the proposed results is illustrated by a simulation example.

Submitted 19 September, 2019; originally announced September 2019.

Comments: Accepted by the 58th IEEE Conference on Decision and Control, Nice

arXiv:1906.08059 [pdf, other]

Automated Computer Evaluation of Acute Ischemic Stroke and Large Vessel Occlusion

Authors: Jia You, Philip L. H. Yu, Anderson C. O. Tsang, Eva L. H. Tsui, Pauline P. S. Woo, Gilberto K. K. Leung

Abstract: Large vessel occlusion (LVO) plays an important role in the diagnosis of acute ischemic stroke. Identifying LVO in patients at an early stage on admission would significantly lower the probability of severe effects due to stroke, or even save their lives. In this paper, we utilized both structural and imaging data from all recorded acute ischemic stroke patients in Hong Kong. A total of 300 patients (200 training and 100 testing) are used in this study. We established three hierarchical models based on demographic data, clinical data and features obtained from computerized tomography (CT) scans. The first two stages of modeling are based only on demographic and clinical data. The third model additionally utilized CT imaging features obtained from a deep learning model. The optimal cutoff is determined at the maximal Youden index based on 10-fold cross-validation. With both clinical and imaging features, the Level-3 model achieved the best performance on testing data. The sensitivity, specificity, Youden index, accuracy and area under the curve (AUC) are 0.930, 0.684, 0.614, 0.790 and 0.850 respectively.
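As a reading aid for the cutoff criterion named in this abstract: the Youden index is J = sensitivity + specificity - 1, so the reported 0.930 and 0.684 give 0.614. The sketch below is a minimal illustration, not the authors' code; the function names `youden_index` and `best_cutoff` and the toy scores are hypothetical, and the 10-fold cross-validation used in the paper is omitted.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def best_cutoff(scores, labels):
    """Return (cutoff, J) maximizing Youden's J over the observed scores.

    A sample is predicted positive when its score >= cutoff.
    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    best_c, best_j = None, float("-inf")
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        j = youden_index(tp / positives, tn / negatives)
        if j > best_j:  # strict '>' keeps the lowest cutoff on ties
            best_c, best_j = c, j
    return best_c, best_j


# Values reported in the abstract for the Level-3 model.
reported_j = youden_index(0.930, 0.684)

# Hypothetical toy data, only to exercise the cutoff search.
toy_cutoff, toy_j = best_cutoff([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

In a cross-validated setting, one would typically run this search on held-out fold predictions rather than on the training scores.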

Submitted 18 June, 2019; originally announced June 2019.

arXiv:1905.09049 [pdf]

Automated Segmentation for Hyperdense Middle Cerebral Artery Sign of Acute Ischemic Stroke on Non-Contrast CT Images

Authors: Jia You, Philip L. H. Yu, Anderson C. O. Tsang, Eva L. H. Tsui, Pauline P. S. Woo, Gilberto K. K. Leung

Abstract: The hyperdense middle cerebral artery (MCA) dot sign has been reported as an important factor in the diagnosis of acute ischemic stroke due to large vessel occlusion. Interpreting the initial CT brain scan in these patients requires a high level of expertise, and has high inter-observer variability. An automated computerized interpretation of the urgent CT brain image, with an emphasis on picking up early signs of ischemic stroke, will facilitate early patient diagnosis and triage, and shorten the door-to-revascularization time for this group of patients. In this paper, we present an automated method for segmenting the MCA dot sign on non-contrast CT brain scans based on a deep learning technique.

Submitted 22 May, 2019; originally announced May 2019.

arXiv:1804.10018 [pdf, ps, other]

Time-constrained multi-agent task scheduling based on prescribed performance control

Authors: Pian Yu, Dimos V. Dimarogonas

Abstract: The problem of time-constrained multi-agent task scheduling and control synthesis is addressed. We assume the existence of a high level plan which consists of a sequence of cooperative tasks, each of which is associated with a deadline and several Quality-of-Service levels. By taking into account the reward and cost of satisfying each task, a novel scheduling problem is formulated and a path synthesis algorithm is proposed. Based on the obtained plan, a distributed hybrid control law is further designed for each agent. Under the condition that only a subset of the agents are aware of the high level plan, it is shown that the proposed controller guarantees the satisfaction of time constraints for each task. A simulation example is given to verify the theoretical results.

Submitted 19 September, 2018; v1 submitted 26 April, 2018; originally announced April 2018.

Comments: Extended version of the 57th IEEE Conference on Decision and Control (CDC 2018)

arXiv:1801.04745 [pdf, ps, other]

Distributionally Robust Optimization for Sequential Decision Making

Authors: Zhi Chen, Pengqian Yu, William B. Haskell

Abstract: The distributionally robust Markov Decision Process (MDP) approach asks for a distributionally robust policy that achieves the maximal expected total reward under the most adversarial distribution of uncertain parameters. In this paper, we study distributionally robust MDPs where ambiguity sets for the uncertain parameters are of a format that can easily incorporate in its description the uncertainty's generalized moment as well as statistical distance information. In this way, we generalize existing works on distributionally robust MDP with generalized-moment-based and statistical-distance-based ambiguity sets to incorporate information from the former class such as moments and dispersions to the latter class that critically depends on empirical observations of the uncertain parameters. We show that, under this format of ambiguity sets, the resulting distributionally robust MDP remains tractable under mild technical conditions. To be more specific, a distributionally robust policy can be constructed by solving a sequence of one-stage convex optimization subproblems.

Submitted 9 October, 2018; v1 submitted 15 January, 2018; originally announced January 2018.

arXiv:1701.01290 [pdf, other]

Approximate Value Iteration for Risk-aware Markov Decision Processes

Authors: Pengqian Yu, William B. Haskell, Huan Xu

Abstract: We consider large-scale Markov decision processes (MDPs) with a risk measure of variability in cost, under the risk-aware MDPs paradigm. Previous studies showed that risk-aware MDPs, based on a minimax approach to handling risk, can be solved using dynamic programming for small to medium sized problems. However, due to the "curse of dimensionality", MDPs that model real-life problems are typically prohibitively large for such approaches. In this paper, we employ an approximate dynamic programming approach, and develop a family of simulation-based algorithms to approximately solve large-scale risk-aware MDPs. In parallel, we develop a unified convergence analysis technique to derive sample complexity bounds for this new family of algorithms.

Submitted 16 May, 2017; v1 submitted 5 January, 2017; originally announced January 2017.

arXiv:1512.00583 [pdf, ps, other]

Central-limit approach to risk-aware Markov decision processes

Authors: Pengqian Yu, Jia Yuan Yu, Huan Xu

Abstract: Whereas classical Markov decision processes maximize the expected reward, we consider minimizing the risk. We propose to evaluate the risk associated with a given policy over a long-enough time horizon with the help of a central limit theorem. The proposed approach works whether the transition probabilities are known or not. We also provide a gradient-based policy improvement algorithm that converges to a local optimum of the risk objective.

Submitted 2 December, 2015; originally announced December 2015.

Comments: arXiv admin note: text overlap with arXiv:1403.6530 by other authors

arXiv:1501.07418 [pdf, other]

Distributionally Robust Counterpart in Markov Decision Processes

Authors: Pengqian Yu, Huan Xu

Abstract: This paper studies Markov Decision Processes under parameter uncertainty. We adapt the distributionally robust optimization framework, assume that the uncertain parameters are random variables following an unknown distribution, and seek the strategy which maximizes the expected performance under the most adversarial distribution. In particular, we generalize a previous study \cite{xu2012distributionally}, which concentrates on distribution sets with very special structure, to a much more generic class of distribution sets, and show that the optimal strategy can be obtained efficiently under mild technical conditions. This significantly extends the applicability of distributionally robust MDPs to incorporate probabilistic information of uncertainty in a more flexible way.

Submitted 13 May, 2015; v1 submitted 29 January, 2015; originally announced January 2015.

Comments: Added references. Corrected typos. Modified a mistake in Example 2 (Variance). Provided more details of the simulation

Showing 1–37 of 37 results for author: Yu, P