Capturing Local and Global Features in Medical Images by Using Ensemble CNN-Transformer
Authors:
Javad Mirzapour Kaleybar,
Hooman Saadat,
Hooman Khaloo
Abstract:
This paper introduces a groundbreaking classification model called the Controllable Ensemble Transformer and CNN (CETC) for the analysis of medical images. The CETC model combines the powerful capabilities of convolutional neural networks (CNNs) and transformers to effectively capture both local and global features present in medical images. The model architecture comprises three main components:…
▽ More
This paper introduces a groundbreaking classification model called the Controllable Ensemble Transformer and CNN (CETC) for the analysis of medical images. The CETC model combines the powerful capabilities of convolutional neural networks (CNNs) and transformers to effectively capture both local and global features present in medical images. The model architecture comprises three main components: a convolutional encoder block (CEB), a transposed-convolutional decoder block (TDB), and a transformer classification block (TCB). The CEB is responsible for capturing multi-local features at different scales and draws upon components from VGGNet, ResNet, and MobileNet as backbones. By leveraging this combination, the CEB is able to effectively detect and encode local features. The TDB, on the other hand, consists of sub-decoders that decode and sum the captured features using ensemble coefficients. This enables the model to efficiently integrate the information from multiple scales. Finally, the TCB utilizes the SwT backbone and a specially designed prediction head to capture global features, ensuring a comprehensive understanding of the entire image. The paper provides detailed information on the experimental setup and implementation, including the use of transfer learning, data preprocessing techniques, and training settings. The CETC model is trained and evaluated using two publicly available COVID-19 datasets. Remarkably, the model outperforms existing state-of-the-art models across various evaluation metrics. The experimental results clearly demonstrate the superiority of the CETC model, emphasizing its potential for accurately and efficiently analyzing medical images.
△ Less
Submitted 3 November, 2023;
originally announced November 2023.
Object detection-based inspection of power line insulators: Incipient fault detection in the low data-regime
Authors:
Laya Das,
Mohammad Hossein Saadat,
Blazhe Gjorgiev,
Etienne Auger,
Giovanni Sansavini
Abstract:
Deep learning-based object detection is a powerful approach for detecting faulty insulators in power lines. This involves training an object detection model from scratch, or fine tuning a model that is pre-trained on benchmark computer vision datasets. This approach works well with a large number of insulator images, but can result in unreliable models in the low data regime. The current literatur…
▽ More
Deep learning-based object detection is a powerful approach for detecting faulty insulators in power lines. This involves training an object detection model from scratch, or fine tuning a model that is pre-trained on benchmark computer vision datasets. This approach works well with a large number of insulator images, but can result in unreliable models in the low data regime. The current literature mainly focuses on detecting the presence or absence of insulator caps, which is a relatively easy detection task, and does not consider detection of finer faults such as flashed and broken disks. In this article, we formulate three object detection tasks for insulator and asset inspection from aerial images, focusing on incipient faults in disks. We curate a large reference dataset of insulator images that can be used to learn robust features for detecting healthy and faulty insulators. We study the advantage of using this dataset in the low target data regime by pre-training on the reference dataset followed by fine-tuning on the target dataset. The results suggest that object detection models can be used to detect faults in insulators at a much incipient stage, and that transfer learning adds value depending on the type of object detection model. We identify key factors that dictate performance in the low data-regime and outline potential approaches to improve the state-of-the-art.
△ Less
Submitted 21 December, 2022;
originally announced December 2022.