-
Fleet Size and Spill for UAM Operation under Uncertain Demand
Authors:
Shangqing Cao,
Xuan Jiang,
Emin Burak Onat,
Bo Zou,
Mark Hansen,
Raja Sengupta,
Anjan Chakrabarty
Abstract:
Variation and imbalance in demand pose significant challenges to Urban Air Mobility (UAM) operations, affecting strategic decisions such as fleet sizing. To study the implications of demand variation on UAM fleet operations, we propose a stochastic passenger arrival time generation model that uses real-world data to infer demand distributions, and two integer programs that compute the zero-spill fleet size and the spill-minimizing flight schedules and charging policies, respectively. Our numerical experiment on a two-vertiport network shows that spill is relatively inelastic to fleet size and that the driving factor behind spill is the imbalance in demand.
Submitted 1 July, 2024;
originally announced July 2024.
-
SCKansformer: Fine-Grained Classification of Bone Marrow Cells via Kansformer Backbone and Hierarchical Attention Mechanisms
Authors:
Yifei Chen,
Zhu Zhu,
Shenghao Zhu,
Linwei Qiu,
Binfeng Zou,
Fan Jia,
Yunpeng Zhu,
Chenyan Zhang,
Zhaojie Fang,
Feiwei Qin,
Jin Fan,
Changmiao Wang,
Yu Gao,
Gang Yu
Abstract:
The incidence and mortality rates of malignant tumors, such as acute leukemia, have risen significantly. Clinically, hospitals rely on cytological examination of peripheral blood and bone marrow smears to diagnose malignant tumors, with accurate blood cell counting being crucial. Existing automated methods face challenges such as low feature expression capability, poor interpretability, and redundant feature extraction when processing high-dimensional microimage data. We propose a novel fine-grained classification model, SCKansformer, for bone marrow blood cells, which addresses these challenges and enhances classification accuracy and efficiency. The model integrates the Kansformer Encoder, SCConv Encoder, and Global-Local Attention Encoder. The Kansformer Encoder replaces the traditional MLP layer with the KAN, improving nonlinear feature representation and interpretability. The SCConv Encoder, with its Spatial and Channel Reconstruction Units, enhances feature representation and reduces redundancy. The Global-Local Attention Encoder combines Multi-head Self-Attention with a Local Part module to capture both global and local features. We validated our model using the Bone Marrow Blood Cell Fine-Grained Classification Dataset (BMCD-FGCD), comprising over 10,000 samples and nearly 40 classifications, developed with a partner hospital. Comparative experiments on our private dataset, as well as the publicly available PBC and ALL-IDB datasets, demonstrate that SCKansformer outperforms both typical and advanced microcell classification methods across all datasets. Our source code and private BMCD-FGCD dataset are available at https://github.com/JustlfC03/SCKansformer.
Submitted 14 June, 2024;
originally announced June 2024.
-
SCUNet++: Swin-UNet and CNN Bottleneck Hybrid Architecture with Multi-Fusion Dense Skip Connection for Pulmonary Embolism CT Image Segmentation
Authors:
Yifei Chen,
Binfeng Zou,
Zhaoxin Guo,
Yiyu Huang,
Yifan Huang,
Feiwei Qin,
Qinhai Li,
Changmiao Wang
Abstract:
Pulmonary embolism (PE) is a prevalent lung disease that can lead to right ventricular hypertrophy and failure in severe cases, ranking second in severity only to myocardial infarction and sudden death. Pulmonary artery CT angiography (CTPA) is a widely used diagnostic method for PE. However, PE detection presents challenges in clinical practice due to limitations in imaging technology. CTPA can produce noise similar to PE, making confirmation of its presence time-consuming and prone to overdiagnosis. Moreover, traditional PE segmentation methods cannot fully account for the hierarchical structure of features or the local and global spatial features of PE CT images. In this paper, we propose an automatic PE segmentation method called SCUNet++ (Swin Conv UNet++). This method incorporates multiple fusion dense skip connections between the encoder and decoder, utilizing the Swin Transformer as the encoder, and fuses features of different scales in the decoder subnetwork to compensate for spatial information loss caused by the inevitable downsampling in Swin-UNet or other state-of-the-art methods, effectively solving the above problem. We provide a theoretical analysis of this method in detail and validate it on publicly available PE CT image datasets FUMPE and CAD-PE. The experimental results indicate that our proposed method achieved a Dice similarity coefficient (DSC) of 83.47% and a Hausdorff distance 95th percentile (HD95) of 3.83 on the FUMPE dataset, as well as a DSC of 83.42% and an HD95 of 5.10 on the CAD-PE dataset. These findings demonstrate that our method exhibits strong performance in PE segmentation tasks, potentially enhancing the accuracy of automatic segmentation of PE and providing a powerful diagnostic tool for clinical physicians. Our source code and new FUMPE dataset are available at https://github.com/JustlfC03/SCUNet-plusplus.
Submitted 2 January, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
CARE: A Large Scale CT Image Dataset and Clinical Applicable Benchmark Model for Rectal Cancer Segmentation
Authors:
Hantao Zhang,
Weidong Guo,
Chenyang Qiu,
Shouhong Wan,
Bingbing Zou,
Wanqin Wang,
Peiquan Jin
Abstract:
Rectal cancer segmentation of CT images plays a crucial role in timely clinical diagnosis, radiotherapy treatment, and follow-up. Although current segmentation methods have shown promise in delineating cancerous tissues, they still encounter challenges in achieving high segmentation precision. These obstacles arise from the intricate anatomical structures of the rectum and the difficulties in performing differential diagnosis of rectal cancer. Additionally, a major obstacle is the lack of a large-scale, finely annotated CT image dataset for rectal cancer segmentation. To address these issues, this work introduces a novel large-scale rectal cancer CT image dataset, CARE, with pixel-level annotations for both normal and cancerous rectum, which serves as a valuable resource for algorithm research and clinical application development. Moreover, we propose a novel medical cancer lesion segmentation benchmark model named U-SAM. The model is specifically designed to tackle the challenges posed by the intricate anatomical structures of abdominal organs by incorporating prompt information. U-SAM contains three key components: promptable information (e.g., points) to aid in target area localization, a convolution module for capturing low-level lesion details, and skip-connections to preserve and recover spatial information during the encoding-decoding process. To evaluate the effectiveness of U-SAM, we systematically compare its performance with several popular segmentation methods on the CARE dataset. The generalization of the model is further verified on the WORD dataset. Extensive experiments demonstrate that the proposed U-SAM outperforms state-of-the-art methods on these two datasets. These experiments can serve as the baseline for future research and clinical application development.
Submitted 16 August, 2023;
originally announced August 2023.
-
Reversed Image Signal Processing and RAW Reconstruction. AIM 2022 Challenge Report
Authors:
Marcos V. Conde,
Radu Timofte,
Yibin Huang,
Jingyang Peng,
Chang Chen,
Cheng Li,
Eduardo Pérez-Pellitero,
Fenglong Song,
Furui Bai,
Shuai Liu,
Chaoyu Feng,
Xiaotao Wang,
Lei Lei,
Yu Zhu,
Chenghua Li,
Yingying Jiang,
Yong A,
Peisong Wang,
Cong Leng,
Jian Cheng,
Xiaoyu Liu,
Zhicun Yin,
Zhilu Zhang,
Junyi Li,
Ming Liu
, et al. (18 additional authors not shown)
Abstract:
Cameras capture sensor RAW images and transform them into pleasant RGB images, suitable for the human eye, using their integrated Image Signal Processor (ISP). Numerous low-level vision tasks operate in the RAW domain (e.g. image denoising, white balance) due to its linear relationship with the scene irradiance, wide range of information at 12 bits, and sensor designs. Despite this, RAW image datasets are scarce and more expensive to collect than the already large and public RGB datasets.
This paper introduces the AIM 2022 Challenge on Reversed Image Signal Processing and RAW Reconstruction. We aim to recover raw sensor images from the corresponding RGBs without metadata and, by doing this, "reverse" the ISP transformation. The proposed methods and benchmark establish the state-of-the-art for this low-level vision inverse problem, and generating realistic raw sensor readings can potentially benefit other tasks such as denoising and super-resolution.
Submitted 20 October, 2022;
originally announced October 2022.
-
Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning
Authors:
Chengfei Lv,
Chaoyue Niu,
Renjie Gu,
Xiaotang Jiang,
Zhaode Wang,
Bin Liu,
Ziqi Wu,
Qiulin Yao,
Congyu Huang,
Panos Huang,
Tao Huang,
Hui Shu,
Jinde Song,
Bin Zou,
Peng Lan,
Guohuan Xu,
Fei Wu,
Shaojie Tang,
Fan Wu,
Guihai Chen
Abstract:
To break the bottlenecks of the mainstream cloud-based machine learning (ML) paradigm, we adopt device-cloud collaborative ML and build the first end-to-end and general-purpose system, called Walle, as the foundation. Walle consists of a deployment platform, distributing ML tasks to billion-scale devices in time; a data pipeline, efficiently preparing task input; and a compute container, providing a cross-platform and high-performance execution environment, while facilitating daily task iteration. Specifically, the compute container is based on Mobile Neural Network (MNN), a tensor compute engine along with the data processing and model execution libraries, which are exposed through a refined Python thread-level virtual machine (VM) to support diverse ML tasks and concurrent task execution. At the core of MNN are the novel mechanisms of operator decomposition and semi-auto search, which sharply reduce the workload of manually optimizing hundreds of operators for tens of hardware backends and quickly identify the best backend with runtime optimization for a computation graph. The data pipeline introduces an on-device stream processing framework to enable processing user behavior data at source. The deployment platform releases ML tasks with an efficient push-then-pull method and supports multi-granularity deployment policies. We evaluate Walle in practical e-commerce application scenarios to demonstrate its effectiveness, efficiency, and scalability. Extensive micro-benchmarks also highlight the superior performance of MNN and the Python thread-level VM. Walle has been in large-scale production use in Alibaba, while MNN has been open source with a broad impact in the community.
Submitted 29 May, 2022;
originally announced May 2022.
-
NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results
Authors:
Eduardo Pérez-Pellitero,
Sibi Catley-Chandar,
Richard Shaw,
Aleš Leonardis,
Radu Timofte,
Zexin Zhang,
Cen Liu,
Yunbo Peng,
Yue Lin,
Gaocheng Yu,
Jin Zhang,
Zhe Ma,
Hongbin Wang,
Xiangyu Chen,
Xintao Wang,
Haiwei Wu,
Lin Liu,
Chao Dong,
Jiantao Zhou,
Qingsen Yan,
Song Zhang,
Weiye Chen,
Yuhang Liu,
Zhen Zhang,
Yanning Zhang
, et al. (68 additional authors not shown)
Abstract:
This paper reviews the challenge on constrained high dynamic range (HDR) imaging that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2022. This manuscript focuses on the competition set-up, datasets, the proposed methods and their results. The challenge aims at estimating an HDR image from multiple respective low dynamic range (LDR) observations, which might suffer from under- or over-exposed regions and different sources of noise. The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e. solutions cannot exceed a given number of operations). In Track 2, participants are asked to minimize the complexity of their solutions while imposing a constraint on fidelity scores (i.e. solutions are required to obtain a higher fidelity score than the prescribed baseline). Both tracks use the same data and metrics: Fidelity is measured by means of PSNR with respect to a ground-truth HDR image (computed both directly and with a canonical tonemapping operation), while complexity metrics include the number of Multiply-Accumulate (MAC) operations and runtime (in seconds).
Submitted 25 May, 2022;
originally announced May 2022.
-
BEATS: An Open-Source, High-Precision, Multi-Channel EEG Acquisition Tool System
Authors:
Bing Zou,
Yubo Zheng,
Mu Shen,
Yingying Luo,
Lei Li,
Lin Zhang
Abstract:
Stable and accurate electroencephalogram (EEG) signal acquisition is fundamental in non-invasive brain-computer interface (BCI) technology. The hardware and software of commonly used EEG acquisition systems are usually closed-source. Their inability to support flexible expansion and secondary development is a major obstacle to real-time BCI research. This paper presents the Beijing University of Posts and Telecommunications EEG Acquisition Tool System, named BEATS. It implements a comprehensive system from hardware to software, composed of the analog front-end, microprocessor, and software platform. BEATS is capable of collecting 32-channel EEG signals at a guaranteed sampling rate of 4 kHz with wireless transmission. Compared to state-of-the-art systems used in many EEG fields, it offers a higher sampling rate. Using techniques including direct memory access, first-in-first-out buffering, and timers, the precision and stability of the acquisition are ensured at the microsecond level. An evaluation is conducted over 24 hours of continuous acquisition. The data loss is 0 packets and the average maximum delay is only 0.07 s/h. Moreover, as an open-source system, BEATS provides detailed design files and adopts a plug-in structure and easy-to-access materials, which allow it to be quickly reproduced. Schematics, source code, and other materials of BEATS are available at https://github.com/buptantEEG/BEATS.
Submitted 19 December, 2022; v1 submitted 3 March, 2022;
originally announced March 2022.
-
A Dark and Bright Channel Prior Guided Deep Network for Retinal Image Quality Assessment
Authors:
Ziwen Xu,
Beiji Zou,
Qing Liu
Abstract:
Retinal image quality assessment is an essential task in the diagnosis of retinal diseases. Recently, deep models have emerged for grading the quality of retinal images. Current state-of-the-art methods either directly transfer classification networks originally designed for natural images to quality classification of retinal images or introduce extra image quality priors via multiple CNN branches or independent CNNs. This paper proposes a dark and bright channel prior guided deep network for retinal image quality assessment called GuidedNet. Specifically, the dark and bright channel priors are embedded into the start layer of the network to improve the discriminative ability of deep features. In addition, we re-annotate a new retinal image quality dataset called RIQA-RFMiD for further validation. Experimental results on a public retinal image quality dataset, Eye-Quality, and our re-annotated dataset, RIQA-RFMiD, demonstrate the effectiveness of the proposed GuidedNet.
Submitted 20 April, 2021; v1 submitted 25 October, 2020;
originally announced October 2020.