Search | arXiv e-print repository

Centerline Boundary Dice Loss for Vascular Segmentation

Authors: Pengcheng Shi, Jiesi Hu, Yanwu Yang, Zilve Gao, Wei Liu, Ting Ma

Abstract: Vascular segmentation in medical imaging plays a crucial role in analysing morphological and functional assessments. Traditional methods, like the centerline Dice (clDice) loss, ensure topology preservation but falter in capturing geometric details, especially under translation and deformation. The combination of clDice with traditional Dice loss can lead to diameter imbalance, favoring larger ves… ▽ More Vascular segmentation in medical imaging plays a crucial role in analysing morphological and functional assessments. Traditional methods, like the centerline Dice (clDice) loss, ensure topology preservation but falter in capturing geometric details, especially under translation and deformation. The combination of clDice with traditional Dice loss can lead to diameter imbalance, favoring larger vessels. Addressing these challenges, we introduce the centerline boundary Dice (cbDice) loss function, which harmonizes topological integrity and geometric nuances, ensuring consistent segmentation across various vessel sizes. cbDice enriches the clDice approach by including boundary-aware aspects, thereby improving geometric detail recognition. It matches the performance of the boundary difference over union (B-DoU) loss through a mask-distance-based approach, enhancing traslation sensitivity. Crucially, cbDice incorporates radius information from vascular skeletons, enabling uniform adaptation to vascular diameter changes and maintaining balance in branch growth and fracture impacts. Furthermore, we conducted a theoretical analysis of clDice variants (cl-X-Dice). We validated cbDice's efficacy on three diverse vascular segmentation datasets, encompassing both 2D and 3D, and binary and multi-class segmentation. Particularly, the method integrated with cbDice demonstrated outstanding performance on the MICCAI 2023 TopCoW Challenge dataset. Our code is made publicly available at: https://github.com/PengchengShi1220/cbDice. △ Less

Submitted 1 July, 2024; originally announced July 2024.

Comments: accepted by MICCAI 2024

arXiv:2311.10331 [pdf, other]

Leveraging Multimodal Fusion for Enhanced Diagnosis of Multiple Retinal Diseases in Ultra-wide OCTA

Authors: Hao Wei, Peilun Shi, Guitao Bai, Minqing Zhang, Shuangle Li, Wu Yuan

Abstract: Ultra-wide optical coherence tomography angiography (UW-OCTA) is an emerging imaging technique that offers significant advantages over traditional OCTA by providing an exceptionally wide scanning range of up to 24 x 20 $mm^{2}$, covering both the anterior and posterior regions of the retina. However, the currently accessible UW-OCTA datasets suffer from limited comprehensive hierarchical informati… ▽ More Ultra-wide optical coherence tomography angiography (UW-OCTA) is an emerging imaging technique that offers significant advantages over traditional OCTA by providing an exceptionally wide scanning range of up to 24 x 20 $mm^{2}$, covering both the anterior and posterior regions of the retina. However, the currently accessible UW-OCTA datasets suffer from limited comprehensive hierarchical information and corresponding disease annotations. To address this limitation, we have curated the pioneering M3OCTA dataset, which is the first multimodal (i.e., multilayer), multi-disease, and widest field-of-view UW-OCTA dataset. Furthermore, the effective utilization of multi-layer ultra-wide ocular vasculature information from UW-OCTA remains underdeveloped. To tackle this challenge, we propose the first cross-modal fusion framework that leverages multi-modal information for diagnosing multiple diseases. Through extensive experiments conducted on our openly available M3OCTA dataset, we demonstrate the effectiveness and superior performance of our method, both in fixed and varying modalities settings. The construction of the M3OCTA dataset, the first multimodal OCTA dataset encompassing multiple diseases, aims to advance research in the ophthalmic image analysis community. △ Less

Submitted 17 November, 2023; originally announced November 2023.

arXiv:2310.13557 [pdf, other]

Distributed Global Optimal Coverage Control in Multi-agent Systems: Known and Unknown Environments

Authors: Mohammadhasan Faghihi, Meysam Yadegar, Mohammadhosein Bakhtiaridoust, Nader Meskin, Javad Shari, Peng Shi

Abstract: This paper introduces a novel approach to solve the coverage optimization problem in multi-agent systems. The proposed technique offers a solution that not only achieves the global optimality in the agents configuration but also effectively handles the issue of agents remaining stationary in regions void of information. The proposed approach leverages a novel cost function for optimizing the agent… ▽ More This paper introduces a novel approach to solve the coverage optimization problem in multi-agent systems. The proposed technique offers a solution that not only achieves the global optimality in the agents configuration but also effectively handles the issue of agents remaining stationary in regions void of information. The proposed approach leverages a novel cost function for optimizing the agents coverage and the cost function eventually aligns with the conventional Voronoi-based cost function. Theoretical analyses are conducted to assure the asymptotic convergence of agents towards the optimal configuration. A distinguishing feature of this approach lies in its departure from the reliance on geometric methods that are characteristic of Voronoi-based approaches; hence can be implemented more simply. Remarkably, the technique is adaptive and applicable to various environments with both known and unknown information distributions. Lastly, the efficacy of the proposed method is demonstrated through simulations, and the obtained results are compared with those of Voronoi-based algorithms. △ Less

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2310.04992 [pdf, other]

VisionFM: a Multi-Modal Multi-Task Vision Foundation Model for Generalist Ophthalmic Artificial Intelligence

Authors: Jianing Qiu, Jian Wu, Hao Wei, Peilun Shi, Minqing Zhang, Yunyun Sun, Lin Li, Hanruo Liu, Hongyi Liu, Simeng Hou, Yuyang Zhao, Xuehui Shi, Junfang Xian, Xiaoxia Qu, Sirui Zhu, Lijie Pan, Xiaoniao Chen, Xiaojia Zhang, Shuai Jiang, Kebing Wang, Chenlong Yang, Mingqiang Chen, Sujie Fan, Jianhua Hu, Aiguo Lv , et al. (17 additional authors not shown)

Abstract: We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassifi… ▽ More We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassification of disease phenotype, and systemic biomarker and disease prediction, with each application enhanced with expert-level intelligence and accuracy. The generalist intelligence of VisionFM outperformed ophthalmologists with basic and intermediate levels in jointly diagnosing 12 common ophthalmic diseases. Evaluated on a new large-scale ophthalmic disease diagnosis benchmark database, as well as a new large-scale segmentation and detection benchmark database, VisionFM outperformed strong baseline deep neural networks. The ophthalmic image representations learned by VisionFM exhibited noteworthy explainability, and demonstrated strong generalizability to new ophthalmic modalities, disease spectrum, and imaging devices. As a foundation model, VisionFM has a large capacity to learn from diverse ophthalmic imaging data and disparate datasets. To be commensurate with this capacity, in addition to the real data used for pre-training, we also generated and leveraged synthetic ophthalmic imaging data. Experimental results revealed that synthetic data that passed visual Turing tests, can also enhance the representation learning capability of VisionFM, leading to substantial performance gains on downstream ophthalmic AI tasks. Beyond the ophthalmic AI applications developed, validated, and demonstrated in this work, substantial further applications can be achieved in an efficient and cost-effective manner using VisionFM as the foundation. △ Less

Submitted 7 October, 2023; originally announced October 2023.

arXiv:2308.13365 [pdf, ps, other]

Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder

Authors: Xuyuan Li, Zengqiang Shang, Peiyang Shi, Hua Hua, Ta Li, Pengyuan Zhang

Abstract: Neural networks have been able to generate high-quality single-sentence speech. However, it remains a challenge concerning audio-book speech synthesis due to the intra-paragraph correlation of semantic and acoustic features as well as variable styles. In this paper, we propose a highly expressive paragraph speech synthesis system with a multi-step variational autoencoder, called EP-MSTTS. EP-MSTTS… ▽ More Neural networks have been able to generate high-quality single-sentence speech. However, it remains a challenge concerning audio-book speech synthesis due to the intra-paragraph correlation of semantic and acoustic features as well as variable styles. In this paper, we propose a highly expressive paragraph speech synthesis system with a multi-step variational autoencoder, called EP-MSTTS. EP-MSTTS is the first VITS-based paragraph speech synthesis model and models the variable style of paragraph speech at five levels: frame, phoneme, word, sentence, and paragraph. We also propose a series of improvements to enhance the performance of this hierarchical model. In addition, we directly train EP-MSTTS on speech sliced by paragraph rather than sentence. Experiment results on the single-speaker French audiobook corpus released at Blizzard Challenge 2023 show EP-MSTTS obtains better performance than baseline models. △ Less

Submitted 11 June, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

Comments: accepted at Interspeech 2024

arXiv:2307.11783 [pdf]

A novel integrated method of detection-gras** for specific object based on the box coordinate matching

Authors: Zongmin Liu, Jirui Wang, Jie Li, Zufeng Li, Kai Ren, Peng Shi

Abstract: To better care for the elderly and disabled, it is essential for service robots to have an effective fusion method of object detection and grasp estimation. However, limited research has been observed on the combination of object detection and grasp estimation. To overcome this technical difficulty, a novel integrated method of detection-gras** for specific object based on the box coordinate mat… ▽ More To better care for the elderly and disabled, it is essential for service robots to have an effective fusion method of object detection and grasp estimation. However, limited research has been observed on the combination of object detection and grasp estimation. To overcome this technical difficulty, a novel integrated method of detection-gras** for specific object based on the box coordinate matching is proposed in this paper. Firstly, the SOLOv2 instance segmentation model is improved by adding channel attention module (CAM) and spatial attention module (SAM). Then, the atrous spatial pyramid pooling (ASPP) and CAM are added to the generative residual convolutional neural network (GR-CNN) model to optimize grasp estimation. Furthermore, a detection-gras** integrated algorithm based on box coordinate matching (DG-BCM) is proposed to obtain the fusion model of object detection and grasp estimation. For verification, experiments on object detection and grasp estimation are conducted separately to verify the superiority of improved models. Additionally, gras** tasks for several specific objects are implemented on a simulation platform, demonstrating the feasibility and effectiveness of DG-BCM algorithm proposed in this paper. △ Less

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2305.15911 [pdf, other]

NexToU: Efficient Topology-Aware U-Net for Medical Image Segmentation

Authors: Pengcheng Shi, Xutao Guo, Yanwu Yang, Chenfei Ye, Ting Ma

Abstract: Convolutional neural networks (CNN) and Transformer variants have emerged as the leading medical image segmentation backbones. Nonetheless, due to their limitations in either preserving global image context or efficiently processing irregular shapes in visual objects, these backbones struggle to effectively integrate information from diverse anatomical regions and reduce inter-individual variabili… ▽ More Convolutional neural networks (CNN) and Transformer variants have emerged as the leading medical image segmentation backbones. Nonetheless, due to their limitations in either preserving global image context or efficiently processing irregular shapes in visual objects, these backbones struggle to effectively integrate information from diverse anatomical regions and reduce inter-individual variability, particularly for the vasculature. Motivated by the successful breakthroughs of graph neural networks (GNN) in capturing topological properties and non-Euclidean relationships across various fields, we propose NexToU, a novel hybrid architecture for medical image segmentation. NexToU comprises improved Pool GNN and Swin GNN modules from Vision GNN (ViG) for learning both global and local topological representations while minimizing computational costs. To address the containment and exclusion relationships among various anatomical structures, we reformulate the topological interaction (TI) module based on the nature of binary trees, rapidly encoding the topological constraints into NexToU. Extensive experiments conducted on three datasets (including distinct imaging dimensions, disease types, and imaging modalities) demonstrate that our method consistently outperforms other state-of-the-art (SOTA) architectures. All the code is publicly available at https://github.com/PengchengShi1220/NexToU. △ Less

Submitted 25 May, 2023; originally announced May 2023.

Comments: 13 pages, 6 figures

arXiv:2211.11944 [pdf, other]

COVID-Net Assistant: A Deep Learning-Driven Virtual Assistant for COVID-19 Symptom Prediction and Recommendation

Authors: Pengyuan Shi, Yuetong Wang, Saad Abbasi, Alexander Wong

Abstract: As the COVID-19 pandemic continues to put a significant burden on healthcare systems worldwide, there has been growing interest in finding inexpensive symptom pre-screening and recommendation methods to assist in efficiently using available medical resources such as PCR tests. In this study, we introduce the design of COVID-Net Assistant, an efficient virtual assistant designed to provide symptom… ▽ More As the COVID-19 pandemic continues to put a significant burden on healthcare systems worldwide, there has been growing interest in finding inexpensive symptom pre-screening and recommendation methods to assist in efficiently using available medical resources such as PCR tests. In this study, we introduce the design of COVID-Net Assistant, an efficient virtual assistant designed to provide symptom prediction and recommendations for COVID-19 by analyzing users' cough recordings through deep convolutional neural networks. We explore a variety of highly customized, lightweight convolutional neural network architectures generated via machine-driven design exploration (which we refer to as COVID-Net Assistant neural networks) on the Covid19-Cough benchmark dataset. The Covid19-Cough dataset comprises 682 cough recordings from a COVID-19 positive cohort and 642 from a COVID-19 negative cohort. Among the 682 cough recordings labeled positive, 382 recordings were verified by PCR test. Our experimental results show promising, with the COVID-Net Assistant neural networks demonstrating robust predictive performance, achieving AUC scores of over 0.93, with the best score over 0.95 while being fast and efficient in inference. The COVID-Net Assistant models are made available in an open source manner through the COVID-Net open initiative and, while not a production-ready solution, we hope their availability acts as a good resource for clinical scientists, machine learning researchers, as well as citizen scientists to develop innovative solutions. △ Less

Submitted 21 November, 2022; originally announced November 2022.

arXiv:2201.01198 [pdf, ps, other]

Semi-global Periodic Event-triggered Output Regulation for Nonlinear Multi-agent Systems

Authors: Shiqi Zheng, Peng Shi, Huiyan Zhang

Abstract: This study focuses on periodic event-triggered (PET) cooperative output regulation problem for a class of nonlinear multi-agent systems. The key feature of PET mechanism is that event-triggered conditions are required to be monitored only periodically. This approach is beneficial for Zeno behavior exclusion and saving of battery energy of onboard sensors. At first, new PET distributed observers ar… ▽ More This study focuses on periodic event-triggered (PET) cooperative output regulation problem for a class of nonlinear multi-agent systems. The key feature of PET mechanism is that event-triggered conditions are required to be monitored only periodically. This approach is beneficial for Zeno behavior exclusion and saving of battery energy of onboard sensors. At first, new PET distributed observers are proposed to estimate the leader information. We show that the estimation error converges to zero exponentially with a known convergence rate under asynchronous PET communication. Second, a novel PET output feedback controller is designed for the underlying strict feedback nonlinear multi-agent systems. Based on a state transformation technique and a local PET state observer, the cooperative semi-global output regulation problem can be solved by the proposed new control design technique. Simulation results of multiple Lorenz systems illustrate that the developed control scheme is effective. △ Less

Submitted 4 January, 2022; originally announced January 2022.

Comments: 35 pages, 5 figures, Accepted by IEEE Transactions on Automatic Control

Journal ref: IEEE Transactions on Automatic Control, 2022

arXiv:2109.07045 [pdf, ps, other]

Uncertainty Quantification in Medical Image Segmentation with Multi-decoder U-Net

Authors: Yanwu Yang, Xutao Guo, Yiwei Pan, Pengcheng Shi, Haiyan Lv, Ting Ma

Abstract: Accurate medical image segmentation is crucial for diagnosis and analysis. However, the models without calibrated uncertainty estimates might lead to errors in downstream analysis and exhibit low levels of robustness. Estimating the uncertainty in the measurement is vital to making definite, informed conclusions. Especially, it is difficult to make accurate predictions on ambiguous areas and focus… ▽ More Accurate medical image segmentation is crucial for diagnosis and analysis. However, the models without calibrated uncertainty estimates might lead to errors in downstream analysis and exhibit low levels of robustness. Estimating the uncertainty in the measurement is vital to making definite, informed conclusions. Especially, it is difficult to make accurate predictions on ambiguous areas and focus boundaries for both models and radiologists, even harder to reach a consensus with multiple annotations. In this work, the uncertainty under these areas is studied, which introduces significant information with anatomical structure and is as important as segmentation performance. We exploit the medical image segmentation uncertainty quantification by measuring segmentation performance with multiple annotations in a supervised learning manner and propose a U-Net based architecture with multiple decoders, where the image representation is encoded with the same encoder, and segmentation referring to each annotation is estimated with multiple decoders. Nevertheless, a cross-loss function is proposed for bridging the gap between different branches. The proposed architecture is trained in an end-to-end manner and able to improve predictive uncertainty estimates. The model achieves comparable performance with fewer parameters to the integrated training model that ranked the runner-up in the MICCAI-QUBIQ 2020 challenge. △ Less

Submitted 14 September, 2021; originally announced September 2021.

Comments: MICCAI_QUBIQ challenge, conference, Uncertainty qualification

arXiv:2106.10401 [pdf]

Parallel frequency function-deep neural network for efficient complex broadband signal approximation

Authors: Zhi Zeng, Pengpeng Shi, Fulei Ma, Peihan Qi

Abstract: A neural network is essentially a high-dimensional complex map** model by adjusting network weights for feature fitting. However, the spectral bias in network training leads to unbearable training epochs for fitting the high-frequency components in broadband signals. To improve the fitting efficiency of high-frequency components, the PhaseDNN was proposed recently by combining complex frequency… ▽ More A neural network is essentially a high-dimensional complex map** model by adjusting network weights for feature fitting. However, the spectral bias in network training leads to unbearable training epochs for fitting the high-frequency components in broadband signals. To improve the fitting efficiency of high-frequency components, the PhaseDNN was proposed recently by combining complex frequency band extraction and frequency shift techniques [Cai et al. SIAM J. SCI. COMPUT. 42, A3285 (2020)]. Our paper is devoted to an alternative candidate for fitting complex signals with high-frequency components. Here, a parallel frequency function-deep neural network (PFF-DNN) is proposed to suppress computational overhead while ensuring fitting accuracy by utilizing fast Fourier analysis of broadband signals and the spectral bias nature of neural networks. The effectiveness and efficiency of the proposed PFF-DNN method are verified based on detailed numerical experiments for six typical broadband signals. △ Less

Submitted 18 June, 2021; originally announced June 2021.

arXiv:2012.14982 [pdf, other]

Elastic Net based Feature Ranking and Selection

Authors: Shaode Yu, Haobo Chen, Hang Yu, Zhicheng Zhang, Xiaokun Liang, Wenjian Qin, Yaoqin Xie, ** Shi

Abstract: Feature selection is important in data representation and intelligent diagnosis. Elastic net is one of the most widely used feature selectors. However, the features selected are dependant on the training data, and their weights dedicated for regularized regression are irrelevant to their importance if used for feature ranking, that degrades the model interpretability and extension. In this study,… ▽ More Feature selection is important in data representation and intelligent diagnosis. Elastic net is one of the most widely used feature selectors. However, the features selected are dependant on the training data, and their weights dedicated for regularized regression are irrelevant to their importance if used for feature ranking, that degrades the model interpretability and extension. In this study, an intuitive idea is put at the end of multiple times of data splitting and elastic net based feature selection. It concerns the frequency of selected features and uses the frequency as an indicator of feature importance. After features are sorted according to their frequency, linear support vector machine performs the classification in an incremental manner. At last, a compact subset of discriminative features is selected by comparing the prediction performance. Experimental results on breast cancer data sets (BCDR-F03, WDBC, GSE 10810, and GSE 15852) suggest that the proposed framework achieves competitive or superior performance to elastic net and with consistent selection of fewer features. How to further enhance its consistency on high-dimension small-sample-size data sets should be paid more attention in our future work. The proposed framework is accessible online (https://github.com/NicoYuCN/elasticnetFR). △ Less

Submitted 29 December, 2020; originally announced December 2020.

arXiv:2003.03560 [pdf, other]

doi 10.1016/j.automatica.2020.109223

Periodic event-triggered output regulation for linear multi-agent systems

Authors: Shiqi Zheng, Peng Shi, Ramesh K. Agarwal, Chee Peng Lim

Abstract: This study considers the problem of periodic event-triggered (PET) cooperative output regulation for a class of linear multi-agent systems. The advantage of the PET output regulation is that the data transmission and triggered condition are only needed to be monitored at discrete sampling instants. It is assumed that only a small number of agents can have access to the system matrix and states of… ▽ More This study considers the problem of periodic event-triggered (PET) cooperative output regulation for a class of linear multi-agent systems. The advantage of the PET output regulation is that the data transmission and triggered condition are only needed to be monitored at discrete sampling instants. It is assumed that only a small number of agents can have access to the system matrix and states of the leader. Meanwhile, the PET mechanism is considered not only in the communication between various agents, but also in the sensor-to-controller and controller-to-actuator transmission channels for each agent. The above problem set-up will bring some challenges to the controller design and stability analysis. Based on a novel PET distributed observer, a PET dynamic output feedback control method is developed for each follower. Compared with the existing works, our method can naturally exclude the Zeno behavior, and the inter-event time becomes multiples of the sampling period. Furthermore, for every follower, the minimum inter-event time can be determined \textit{a prior}, and computed directly without the knowledge of the leader information. An example is given to verify and illustrate the effectiveness of the new design scheme. △ Less

Submitted 17 July, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

Comments: 17 pages, 13 figures, submitted to Automatica. accepted

Journal ref: Automatica, 2020

Showing 1–13 of 13 results for author: Shi, P