-
Snake with Shifted Window: Learning to Adapt Vessel Pattern for OCTA Segmentation
Authors:
Xinrun Chen,
Mei Shen,
Haojian Ning,
Mengzhan Zhang,
Chengliang Wang,
Shiying Li
Abstract:
Segmenting specific targets or structures in optical coherence tomography angiography (OCTA) images is fundamental for conducting further pathological studies. The retinal vascular layers are rich and intricate, and such vasculature, with its complex shapes, can be captured by the widely studied OCTA images. In this paper, we thus study how to use OCTA images with projection vascular layers to segment retinal structures. To this end, we propose the SSW-OCTA model, which integrates the advantages of deformable convolutions suited for tubular structures and the Swin Transformer for global feature extraction, adapting to the characteristics of OCTA modality images. Our model was tested and compared on the OCTA-500 dataset, achieving state-of-the-art performance. The code is available at: https://github.com/ShellRedia/Snake-SWin-OCTA.
Submitted 28 April, 2024;
originally announced April 2024.
-
MCformer: Multivariate Time Series Forecasting with Mixed-Channels Transformer
Authors:
Wenyong Han,
Tao Zhu,
Liming Chen,
Huansheng Ning,
Yang Luo,
Yaping Wan
Abstract:
The massive generation of time-series data by large-scale Internet of Things (IoT) devices necessitates the exploration of more effective models for multivariate time-series forecasting. In previous models, there was a predominant use of the Channel Dependence (CD) strategy (where each channel represents a univariate sequence). Current state-of-the-art (SOTA) models primarily rely on the Channel Independence (CI) strategy. The CI strategy treats all channels as a single channel, expanding the dataset to improve generalization performance and avoiding inter-channel correlation that disrupts long-term features. However, the CI strategy faces the challenge of inter-channel correlation forgetting. To address this issue, we propose an innovative Mixed Channels strategy, combining the data expansion advantages of the CI strategy with the ability to counteract inter-channel correlation forgetting. Based on this strategy, we introduce MCformer, a multivariate time-series forecasting model with mixed channel features. The model blends a specific number of channels, leveraging an attention mechanism to effectively capture inter-channel correlation information when modeling long-term features. Experimental results demonstrate that the Mixed Channels strategy outperforms the pure CI strategy in multivariate time-series forecasting tasks.
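The contrast between the CI strategy and a mixed-channel grouping can be illustrated with a small sketch. This is not the MCformer implementation: the function names and the fixed-stride rule used to pick which channels are blended are assumptions made purely for illustration.

```python
from typing import List

# A multivariate series stored as rows of time steps: series[t][c].
Series = List[List[float]]


def channel_independent_samples(series: Series) -> List[List[float]]:
    """CI view: each channel becomes its own univariate training sample."""
    if not series:
        return []
    n_channels = len(series[0])
    return [[row[c] for row in series] for c in range(n_channels)]


def mixed_channel_samples(series: Series, m: int) -> List[List[List[float]]]:
    """Hypothetical mixed-channel view.

    For every target channel, group it with m - 1 other channels picked at a
    fixed stride (an assumed selection rule, not taken from the paper).
    The result still has one sample per channel, as in CI, but each sample
    carries m channels so a model could attend across them.
    """
    channels = channel_independent_samples(series)
    n = len(channels)
    if not 1 <= m <= n:
        raise ValueError("m must be between 1 and the number of channels")
    stride = max(n // m, 1)
    return [
        [channels[(c + k * stride) % n] for k in range(m)]
        for c in range(n)
    ]


if __name__ == "__main__":
    toy = [[1.0, 10.0, 100.0], [2.0, 20.0, 200.0]]  # 2 time steps, 3 channels
    print(channel_independent_samples(toy))
    print(mixed_channel_samples(toy, m=2))
```

With `m=1` the mixed view reduces to the CI view (one channel per sample), which is one way to see the Mixed Channels strategy as a generalization of CI under the assumptions above.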
Submitted 14 March, 2024;
originally announced March 2024.
-
Maximizing UAV Fog Deployment Efficiency for Critical Rescue Operations
Authors:
Abdenacer Naouri,
Huansheng Ning,
Nabil Abdelkader Nouri,
Amar Khelloufi,
Abdelkarim Ben Sada,
Salim Naouri,
Attia Qammar,
Sahraoui Dhelim
Abstract:
In disaster scenarios and high-stakes rescue operations, integrating Unmanned Aerial Vehicles (UAVs) as fog nodes has become crucial. This integration ensures a smooth connection between affected populations and essential health monitoring devices, supported by the Internet of Things (IoT). Integrating UAVs in such environments is inherently challenging: the primary objectives involve maximizing network connectivity and coverage while extending the network's lifetime through energy-efficient strategies to serve the maximum number of affected individuals. In this paper, we propose a novel model centred around dynamic UAV-based fog deployment that optimizes the system's adaptability and operational efficacy within the afflicted areas. First, we decompose the problem into two subproblems: a connectivity and coverage subproblem, and a network lifespan optimization subproblem. We shape our UAV fog deployment problem as a uni-objective optimization and introduce a specialized UAV fog deployment algorithm tailored specifically for UAV fog nodes deployed in rescue missions, while the network lifespan optimization subproblem is efficiently solved via a one-dimensional swapping method. Following that, we introduce a novel optimization strategy for UAV fog node placement in dynamic networks during evacuation scenarios, with a primary focus on ensuring robust connectivity and maximal coverage for mobile users while extending the network's lifespan. Finally, we introduce an Adaptive Whale Optimization Algorithm (WOA) for fog node deployment in a dynamic network. Its agility, rapid convergence, and low computational demands make it an ideal fit for high-pressure environments.
Submitted 25 February, 2024;
originally announced February 2024.
-
An Accurate and Efficient Neural Network for OCTA Vessel Segmentation and a New Dataset
Authors:
Haojian Ning,
Chengliang Wang,
Xinrun Chen,
Shiying Li
Abstract:
Optical coherence tomography angiography (OCTA) is a noninvasive imaging technique that can reveal high-resolution retinal vessels. In this work, we propose an accurate and efficient neural network for retinal vessel segmentation in OCTA images. The proposed network achieves accuracy comparable to other SOTA methods, while having fewer parameters and faster inference speed (e.g. 110x lighter and 1.3x faster than U-Net), which makes it well suited to industrial applications. This is achieved by applying the modified Recurrent ConvNeXt Block to a full-resolution convolutional network. In addition, we create a new dataset containing 918 OCTA images and their corresponding vessel annotations. The dataset is semi-automatically annotated with the help of the Segment Anything Model (SAM), which greatly improves the annotation speed. For the benefit of the community, our code and dataset can be obtained from https://github.com/nhjydywd/OCTA-FRNet.
Submitted 18 September, 2023;
originally announced September 2023.
-
Dynamic Interactional And Cooperative Network For Shield Machine
Authors:
Dazhi Gao,
Rongyang Li,
Hongbo Wang,
Lingfeng Mao,
Huansheng Ning
Abstract:
The shield machine (SM) is a complex mechanical device used for tunneling. However, in traditional construction, monitoring and decision-making have relied mainly on human experience, which brings limitations such as hidden mechanical failures, human operator error, and sensor anomalies. To deal with these challenges, many scholars have studied intelligent methods for SMs. Most of these methods only take the SM into account and do not consider its operating environment. This paper therefore discusses the relationship among the SM, geological information, and control terminals. Then, according to this relationship, models are established for the control terminal, including SM rate prediction and SM anomaly detection. The experimental results show that the proposed models perform better than the baseline models. In the proposed model, the R2 and MSE of rate prediction reach 92.2% and 0.0064, respectively. The detection rate of anomaly detection is up to 98.2%.
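For reference, the two regression metrics quoted above follow their standard definitions: MSE = (1/n) Σ (y_i − ŷ_i)² and R2 = 1 − SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the targets from their mean. The sketch below uses illustrative helper names and made-up toy values; it does not reproduce the paper's data or results.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy values for illustration only, not taken from the paper.
    targets = [1.0, 2.0, 3.0, 4.0]
    predictions = [1.1, 1.9, 3.2, 3.8]
    print(mse(targets, predictions), r2_score(targets, predictions))
```

An R2 of 1 means the predictions match the targets exactly, while a model that always predicts the mean of the targets scores 0.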
Submitted 17 November, 2022;
originally announced November 2022.
-
Artificial Intelligence for Suicide Assessment using Audiovisual Cues: A Review
Authors:
Sahraoui Dhelim,
Liming Chen,
Huansheng Ning,
Chris Nugent
Abstract:
Death by suicide is the seventh leading cause of death worldwide. The recent advancement in Artificial Intelligence (AI), specifically AI applications in image and voice processing, has created a promising opportunity to revolutionize suicide risk assessment. Subsequently, we have witnessed a fast-growing body of research that applies AI to extract audiovisual non-verbal cues for mental illness assessment. However, the majority of the recent works focus on depression, despite the evident differences between depression symptoms and suicidal behavior and non-verbal cues. This paper reviews recent works that study suicidal ideation and suicidal behavior detection through audiovisual feature analysis, mainly the analysis of suicidal voice/speech acoustic features and suicidal visual cues. Automatic suicide assessment is a promising research direction that is still in the early stages. Accordingly, there is a lack of large datasets that can be used to train machine learning and deep learning models proven to be effective in other, similar tasks.
Submitted 3 November, 2022; v1 submitted 22 January, 2022;
originally announced January 2022.
-
Delving into Rectifiers in Style-Based Image Translation
Authors:
Yipeng Zhang,
Bingliang Hu,
Hailong Ning,
Quang Wang
Abstract:
While modern image translation techniques can create photorealistic synthetic images, they have limited style controllability and can thus suffer from translation errors. In this work, we show that the activation function is one of the crucial components in controlling the direction of image synthesis. Specifically, we explicitly demonstrate that the slope parameters of the rectifier can change the data distribution and be used independently to control the direction of translation. To improve the style controllability, two simple but effective techniques are proposed: Adaptive ReLU (AdaReLU) and a structural adaptive function. AdaReLU can dynamically adjust the slope parameters according to the target style and can be combined with Adaptive Instance Normalization (AdaIN) to increase controllability. Meanwhile, the structural adaptive function enables rectifiers to manipulate the structure of feature maps more effectively. It is composed of the proposed structural convolution (StruConv), an efficient convolutional module that can choose the area to be activated based on the mean and variance specified by AdaIN. Extensive experiments show that the proposed techniques can greatly increase the network controllability and output diversity in style-based image translation tasks.
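To make the two ingredients named above more concrete, here is a minimal sketch on a single one-dimensional feature channel. The AdaIN step uses the standard formula (normalize the content features, then rescale with the style statistics). The rectifier with adjustable slopes is only an assumed stand-in for AdaReLU: the abstract does not say how the slopes are derived from the target style, so they are passed in as plain arguments, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def adain(content: Sequence[float], style_mean: float, style_std: float,
          eps: float = 1e-5) -> List[float]:
    """Adaptive Instance Normalization on one channel.

    Normalizes the content features to zero mean and unit variance, then
    applies the mean and standard deviation supplied by the style.
    """
    mu = sum(content) / len(content)
    var = sum((v - mu) ** 2 for v in content) / len(content)
    std = math.sqrt(var + eps)
    return [style_std * (v - mu) / std + style_mean for v in content]


def sloped_rectifier(values: Sequence[float], pos_slope: float,
                     neg_slope: float) -> List[float]:
    """Rectifier with separate slopes for positive and negative inputs.

    Assumption: in an AdaReLU-like layer these slopes would be predicted
    from the target style; here they are fixed inputs for illustration.
    """
    return [pos_slope * v if v >= 0 else neg_slope * v for v in values]


if __name__ == "__main__":
    features = [-2.0, 0.0, 3.0]
    styled = adain(features, style_mean=0.5, style_std=2.0)
    # Changing the slopes changes the output distribution for the same input.
    print(sloped_rectifier(styled, pos_slope=1.0, neg_slope=0.1))
    print(sloped_rectifier(styled, pos_slope=0.5, neg_slope=0.5))
```

Setting `pos_slope=1.0` and `neg_slope=0.0` recovers the ordinary ReLU, so the slopes act as the extra control knobs that the abstract refers to.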
Submitted 23 November, 2021; v1 submitted 20 November, 2021;
originally announced November 2021.
-
Audio Description from Image by Modal Translation Network
Authors:
Hailong Ning,
Xiangtao Zheng,
Yuan Yuan,
Xiaoqiang Lu
Abstract:
Audio is the main channel through which the visually impaired obtain information. In reality, visual data of all kinds is always available, but audio data is absent in many cases. In order to help the visually impaired better perceive the information around them, an image-to-audio-description (I2AD) task is proposed in this paper to generate audio descriptions from images. To complete this new task, a modal translation network (MT-Net) from the visual to the auditory sense is proposed. The proposed MT-Net includes three progressive sub-networks: 1) feature learning, 2) cross-modal mapping, and 3) audio generation. First, the feature learning sub-network aims to learn semantic features from image and audio, including image feature learning and audio feature learning. Second, the cross-modal mapping sub-network transforms the image feature into a cross-modal representation with the same semantic concept as the audio feature. In this way, the correlation of inter-modal data is effectively mined to ease the heterogeneous gap between image and audio. Finally, the audio generation sub-network is designed to generate the audio waveform from the cross-modal representation. The generated audio waveform is interpolated to obtain the corresponding audio file according to the sample frequency. As this is the first attempt to explore the I2AD task, three large-scale datasets with plenty of manual audio descriptions are built. Experiments on the datasets verify the feasibility of generating intelligible audio from an image directly and the effectiveness of the proposed method.
Submitted 18 March, 2021;
originally announced March 2021.