Search | arXiv e-print repository

Advancements in Feature Extraction Recognition of Medical Imaging Systems Through Deep Learning Technique

Authors: Qishi Zhan, Dan Sun, Erdi Gao, Yuhan Ma, Yaxin Liang, Haowei Yang

Abstract: This study introduces a novel unsupervised medical image feature extraction method that employs spatial stratification techniques. An objective function based on weight is proposed to achieve the purpose of fast image recognition. The algorithm divides the pixels of the image into multiple subdomains and uses a quadtree to access the image. A technique for threshold optimization utilizing a simplex algorithm is presented. Aiming at the nonlinear characteristics of hyperspectral images, a generalized discriminant analysis algorithm based on kernel function is proposed. In this project, a hyperspectral remote sensing image is taken as the object, and we investigate its mathematical modeling, solution methods, and feature extraction techniques. It is found that different types of objects are independent of each other and compact in image processing. Compared with the traditional linear discrimination method, the result of image segmentation is better. This method can not only overcome the disadvantage of the traditional method which is easy to be affected by light, but also extract the features of the object quickly and accurately. It has important reference significance for clinical diagnosis.

Submitted 23 May, 2024; originally announced June 2024.

Comments: conference

arXiv:2212.07656 [pdf, other]

Hybrid stability augmentation control of multi-rotor UAV in confined space based on adaptive backstepping control

Authors: QuanXi Zhan, JunRui Zhang, ChenYang Sun, RunJie Shen, Bin He

Abstract: This paper applies the UAV to the inspection of water diversion pipelines in hydropower stations. The diversion pipeline is an enclosed space, so the airflow disturbance caused by the rotation of the UAV blades and the strong air convection from the chimney effect have a great impact on the flight control of the UAV. Although the traditional linear control PID flight control algorithm has been widely used and can meet the requirements of general flight tasks, it cannot guarantee the stability of the system over a wide range. The inspection of a diversion line in an enclosed space requires high system stability and robustness of the UAV controller. In this paper, a hybrid stabilised adaptive backstepping control method is proposed. Firstly, a multi-rotor UAV model is analysed and transformed into a strict feedback form with external disturbances; then adaptive techniques are used to estimate the airflow disturbances caused by the blades, and the attitude and position tracking controllers are designed by combining backstepping control and PID control respectively; finally, the asymptotic stability of the system is ensured by constructing a Lyapunov function. The experimental data show that the flight controller designed in this paper has good robustness and tracking performance, and can better resist the disturbance caused by airflow disturbance in confined space.

Submitted 15 December, 2022; originally announced December 2022.

Comments: 7 pages

arXiv:2203.16822 [pdf, other]

How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications

Authors: Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Saeed Sarfjoo, Petr Motlicek, Matthias Kleinert, Hartmut Helmke, Oliver Ohneiser, Qingran Zhan

Abstract: Recent work on self-supervised pre-training focuses on leveraging large-scale unlabeled speech data to build robust end-to-end (E2E) acoustic models (AM) that can be later fine-tuned on downstream tasks, e.g., automatic speech recognition (ASR). Yet, few works investigated the impact on performance when the data properties substantially differ between the pre-training and fine-tuning phases, termed domain shift. We target this scenario by analyzing the robustness of Wav2Vec 2.0 and XLS-R models on downstream ASR for a completely unseen domain, air traffic control (ATC) communications. We benchmark these two models on several open-source and challenging ATC databases with signal-to-noise ratio between 5 and 20 dB. Relative word error rate (WER) reductions between 20% and 40% are obtained in comparison to hybrid-based ASR baselines by only fine-tuning E2E acoustic models with a smaller fraction of labeled data. We analyze WERs on the low-resource scenario and gender bias carried by one ATC dataset.

Submitted 17 October, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

Comments: To be published in the 2022 IEEE Spoken Language Technology Workshop (SLT) (SLT 2022)

arXiv:2203.16025 [pdf]

Multiple Narrow-band signals Direction Finding with TMLA by Nonuniform Period Modulation

Authors: Kebin Liu, Lening Zhang, Qingkui Zhan, Chong He

Abstract: A new array signal reconstruction and signal-channel DOA estimation method based on TMLA by nonuniform period modulation are proposed. By using non-uniform period modulation, the harmonic component produced by different elements could be separated. Therefore, the conventional snapshot could be reconstructed by analyzing the spectrum of the combined signal. Then spatial spectrum estimation method is used to implement DOA estimation. Numerical simulations are provided to verify the feasibility and accuracy of the proposed method. Since the duration of the signal in the frequency domain analysis processed in a single time is very short, this method is also applicable to narrowband signals. Another highlight is that this method can simultaneously measure the number of the elements-1 angle of incident signals.

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2010.07726 [pdf, other]

LiteDepthwiseNet: An Extreme Lightweight Network for Hyperspectral Image Classification

Authors: Benlei Cui, XueMei Dong, Qiaoqiao Zhan, Jiangtao Peng, Weiwei Sun

Abstract: Deep learning methods have shown considerable potential for hyperspectral image (HSI) classification, which can achieve high accuracy compared with traditional methods. However, they often need a large number of training samples and have a lot of parameters and high computational overhead. To solve these problems, this paper proposes a new network architecture, LiteDepthwiseNet, for HSI classification. Based on 3D depthwise convolution, LiteDepthwiseNet can decompose standard convolution into depthwise convolution and pointwise convolution, which can achieve high classification performance with minimal parameters. Moreover, we remove the ReLU layer and Batch Normalization layer in the original 3D depthwise convolution, which significantly improves the overfitting phenomenon of the model on small sized datasets. In addition, focal loss is used as the loss function to improve the model's attention on difficult samples and unbalanced data, and its training performance is significantly better than that of cross-entropy loss or balanced cross-entropy loss. Experiment results on three benchmark hyperspectral datasets show that LiteDepthwiseNet achieves state-of-the-art performance with a very small number of parameters and low computational cost.

Submitted 15 October, 2020; originally announced October 2020.

arXiv:2006.10304 [pdf, ps, other]

Automatic Speech Recognition Benchmark for Air-Traffic Communications

Authors: Juan Zuluaga-Gomez, Petr Motlicek, Qingran Zhan, Karel Vesely, Rudolf Braun

Abstract: Advances in Automatic Speech Recognition (ASR) over the last decade opened new areas of speech-based automation such as in Air-Traffic Control (ATC) environment. Currently, voice communication and data links communications are the only way of contact between pilots and Air-Traffic Controllers (ATCo), where the former is the most widely used and the latter is a non-spoken method mandatory for oceanic messages and limited for some domestic issues. ASR systems on ATCo environments inherit increasing complexity due to accents from non-English speakers, cockpit noise, speaker-dependent biases, and small in-domain ATC databases for training. Hereby, we introduce CleanSky EC-H2020 ATCO2, a project that aims to develop an ASR-based platform to collect, organize and automatically pre-process ATCo speech-data from air space. This paper conveys an exploratory benchmark of several state-of-the-art ASR models trained on more than 170 hours of ATCo speech-data. We demonstrate that the cross-accent flaws due to speakers' accents are minimized due to the amount of data, making the system feasible for ATC environments. The developed ASR system achieves an averaged word error rate (WER) of 7.75% across four databases. An additional 35% relative improvement in WER is achieved on one test set when training a TDNNF system with byte-pair encoding.

Submitted 13 August, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: Accepted to: 21st INTERSPEECH conference (Shanghai, October 25-29)

Showing 1–6 of 6 results for author: Zhan, Q