
Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis

Authors: Moein Heidari, Sina Ghorbani Kolahi, Sanaz Karimijafarbigloo, Bobby Azad, Afshin Bozorgpour, Soheila Hatami, Reza Azad, Ali Diba, Ulas Bagci, Dorit Merhof, Ilker Hacihaliloglu

Abstract: Sequence modeling plays a vital role across various domains, with recurrent neural networks being historically the predominant method of performing these tasks. However, the emergence of transformers has altered this paradigm due to their superior performance. Built upon these advances, transformers have joined CNNs as the two leading foundational models for learning visual representations. However, transformers are hindered by the $\mathcal{O}(N^2)$ complexity of their attention mechanisms, while CNNs lack global receptive fields and dynamic weight allocation. State Space Models (SSMs), specifically the Mamba model with selection mechanisms and hardware-aware architecture, have garnered immense interest lately in sequential modeling and visual representation learning, challenging the dominance of transformers by providing infinite context lengths and offering substantial efficiency while maintaining linear complexity in the input sequence. Capitalizing on the advances in computer vision, medical imaging has heralded a new epoch with Mamba models. Intending to help researchers navigate the surge, this survey seeks to offer an encyclopedic review of Mamba models in medical imaging. Specifically, we start with a comprehensive theoretical review forming the basis of SSMs, including Mamba architecture and its alternatives for sequence modeling paradigms in this context. Next, we offer a structured classification of Mamba models in the medical field and introduce a diverse categorization scheme based on their application, imaging modalities, and targeted organs. Finally, we summarize key challenges, discuss different future research directions of the SSMs in the medical domain, and propose several directions to fulfill the demands of this field. In addition, we have compiled the studies discussed in this paper along with their open-source implementations on our GitHub repository.
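
The linear-complexity claim in the abstract above can be illustrated with a minimal sketch of a discrete state space recurrence, h_t = a * h_{t-1} + b * x_t and y_t = c * h_t. The scalar parameters `a`, `b`, `c` and the function name `ssm_scan` are assumptions chosen for illustration and do not come from the survey. Mamba additionally makes its parameters input-dependent (the selection mechanism), which this sketch does not model.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar discrete state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t
    The loop visits each input once, so the cost grows linearly
    with len(xs), unlike the pairwise comparisons of attention.
    """
    h = 0.0  # hidden state, initialised to zero (an assumption)
    ys = []
    for x in xs:
        h = a * h + b * x  # state update
        ys.append(c * h)   # readout
    return ys
```

For example, `ssm_scan([1.0, 0.0, 0.0], a=0.5)` returns the decaying impulse response of the recurrence, one output per input element.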

Submitted 5 June, 2024; originally announced June 2024.

Comments: This is the first version of our survey, and the paper is currently under review

arXiv:2403.19882 [pdf, other]

Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights

Authors: Moein Heidari, Reza Azad, Sina Ghorbani Kolahi, René Arimond, Leon Niggemeier, Alaa Sulaiman, Afshin Bozorgpour, Ehsan Khodapanah Aghdam, Amirhossein Kazerouni, Ilker Hacihaliloglu, Dorit Merhof

Abstract: Intrigued by the inherent ability of the human visual system to identify salient regions in complex scenes, attention mechanisms have been seamlessly integrated into various Computer Vision (CV) tasks. Building upon this paradigm, Vision Transformer (ViT) networks exploit attention mechanisms for improved efficiency. This review navigates the landscape of redesigned attention mechanisms within ViTs, aiming to enhance their performance. This paper provides a comprehensive exploration of techniques and insights for designing attention mechanisms, systematically reviewing recent literature in the field of CV. This survey begins with an introduction to the theoretical foundations and fundamental concepts underlying attention mechanisms. We then present a systematic taxonomy of various attention mechanisms within ViTs, employing redesigned approaches. A multi-perspective categorization is proposed based on their application, objectives, and the type of attention applied. The analysis includes an exploration of the novelty, strengths, weaknesses, and an in-depth evaluation of the different proposed strategies. This culminates in the development of taxonomies that highlight key properties and contributions. Finally, we gather the reviewed studies along with their available open-source implementations at our \href{https://github.com/mindflow-institue/Awesome-Attention-Mechanism-in-Medical-Imaging}{GitHub}\footnote{\url{https://github.com/xmindflow/Awesome-Attention-Mechanism-in-Medical-Imaging}}. We aim to regularly update it with the most recent relevant papers.

Submitted 28 March, 2024; originally announced March 2024.

Comments: Submitted to Computational Visual Media Journal

arXiv:2403.19880 [pdf, other]

Vision-Language Synthetic Data Enhances Echocardiography Downstream Tasks

Authors: Pooria Ashrafian, Milad Yazdani, Moein Heidari, Dena Shahriari, Ilker Hacihaliloglu

Abstract: High-quality, large-scale data is essential for robust deep learning models in medical applications, particularly ultrasound image analysis. Diffusion models facilitate high-fidelity medical image generation, reducing the costs associated with acquiring and annotating new images. This paper utilizes recent vision-language models to produce diverse and realistic synthetic echocardiography image data, preserving key features of the original images guided by textual and semantic label maps. Specifically, we investigate three potential avenues: unconditional generation, generation guided by text, and a hybrid approach incorporating both textual and semantic supervision. We show that the rich contextual information present in the synthesized data potentially enhances the accuracy and interpretability of downstream tasks, such as echocardiography segmentation and classification with improved metrics and faster convergence. Our implementation with checkpoints, prompts, and the created synthetic dataset will be publicly available at \href{https://github.com/Pooria90/DiffEcho}{GitHub}.

Submitted 28 March, 2024; originally announced March 2024.

Comments: Submitted as a conference paper to MICCAI 2024

arXiv:2212.09239 [pdf, other]

On Non-Interactive Source Simulation via Fourier Transform

Authors: Farhad Shirani, Mohsen Heidari

Abstract: The non-interactive source simulation (NISS) scenario is considered. In this scenario, a pair of distributed agents, Alice and Bob, observe a distributed binary memoryless source $(X^d,Y^d)$ generated based on joint distribution $P_{X,Y}$. The agents wish to produce a pair of discrete random variables $(U_d,V_d)$ with joint distribution $P_{U_d,V_d}$, such that $P_{U_d,V_d}$ converges in total variation distance to a target distribution $Q_{U,V}$ as the input blocklength $d$ is taken to be asymptotically large. Inner and outer bounds are obtained on the set of distributions $Q_{U,V}$ which can be produced given an input distribution $P_{X,Y}$. To this end, a bijective mapping from the set of distributions $Q_{U,V}$ to a union of star-convex sets is provided. By leveraging proof techniques from discrete Fourier analysis along with a novel randomized rounding technique, inner and outer bounds are derived for each of these star-convex sets, and by inverting the aforementioned bijective mapping, necessary and sufficient conditions on $Q_{U,V}$ and $P_{X,Y}$ are provided under which $Q_{U,V}$ can be produced from $P_{X,Y}$. The bounds are applicable in NISS scenarios where the output alphabets $\mathcal{U}$ and $\mathcal{V}$ have arbitrary finite size. In case of binary output alphabets, the outer-bound recovers the previously best-known outer-bound.

Submitted 18 December, 2022; originally announced December 2022.

arXiv:2211.07804 [pdf, other]

Diffusion Models for Medical Image Analysis: A Comprehensive Survey

Authors: Amirhossein Kazerouni, Ehsan Khodapanah Aghdam, Moein Heidari, Reza Azad, Mohsen Fayyaz, Ilker Hacihaliloglu, Dorit Merhof

Abstract: Denoising diffusion models, a class of generative models, have garnered immense interest lately in various deep-learning problems. A diffusion probabilistic model defines a forward diffusion stage where the input data is gradually perturbed over several steps by adding Gaussian noise and then learns to reverse the diffusion process to retrieve the desired noise-free data from noisy data samples. Diffusion models are widely appreciated for their strong mode coverage and quality of the generated samples despite their known computational burdens. Capitalizing on the advances in computer vision, the field of medical imaging has also observed a growing interest in diffusion models. To help the researcher navigate this profusion, this survey intends to provide a comprehensive overview of diffusion models in the discipline of medical image analysis. Specifically, we introduce the solid theoretical foundation and fundamental concepts behind diffusion models and the three generic diffusion modelling frameworks: diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations. Then, we provide a systematic taxonomy of diffusion models in the medical domain and propose a multi-perspective categorization based on their application, imaging modality, organ of interest, and algorithms. To this end, we cover extensive applications of diffusion models in the medical domain. Furthermore, we emphasize the practical use case of some selected approaches, and then we discuss the limitations of the diffusion models in the medical domain and propose several directions to fulfill the demands of this field. Finally, we gather the overviewed studies with their available open-source implementations at https://github.com/amirhossein-kz/Awesome-Diffusion-Models-in-Medical-Imaging.
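
The forward diffusion stage described in the abstract above (gradually perturbing the input with Gaussian noise) can be sketched with the commonly used closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, where alpha_bar_t is the running product of (1 - beta_s). This is a sketch under stated assumptions: the linear noise schedule, the step count `T`, and the function names are illustrative choices, not details taken from the survey.

```python
import math
import random

def alpha_bar(t, betas):
    """Cumulative product of (1 - beta_s) over the first t steps."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t from x_0 in one step via the closed-form Gaussian perturbation."""
    ab = alpha_bar(t, betas)
    return [
        math.sqrt(ab) * v + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0)
        for v in x0
    ]

# Hypothetical linear noise schedule: 1000 steps rising from 1e-4 to 0.02.
T = 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
```

At `t = 0` the sample equals the clean input, and as `t` approaches `T` the signal coefficient shrinks toward zero, so the sample is dominated by noise. The learned reverse process, which the abstract also mentions, is not shown here.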

Submitted 3 June, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

Comments: Third revision: including more papers and further discussions

arXiv:2208.00713 [pdf, other]

TransDeepLab: Convolution-Free Transformer-based DeepLab v3+ for Medical Image Segmentation

Authors: Reza Azad, Moein Heidari, Moein Shariatnia, Ehsan Khodapanah Aghdam, Sanaz Karimijafarbigloo, Ehsan Adeli, Dorit Merhof

Abstract: Convolutional neural networks (CNNs) have been the de facto standard in a diverse set of computer vision tasks for many years. In particular, deep neural networks based on seminal architectures such as U-shaped models with skip-connections or atrous convolution with pyramid pooling have been tailored to a wide range of medical image analysis tasks. The main advantage of such architectures is that they are well suited to capturing versatile local features. However, as a general consensus, CNNs fail to capture long-range dependencies and spatial correlations due to the intrinsic property of confined receptive field size of convolution operations. Alternatively, Transformer, profiting from global information modelling that stems from the self-attention mechanism, has recently attained remarkable performance in natural language processing and computer vision. Nevertheless, previous studies show that both local and global features are critical for a deep model in dense prediction, such as segmenting complicated structures with disparate shapes and configurations. To this end, this paper proposes TransDeepLab, a novel DeepLab-like pure Transformer for medical image segmentation. Specifically, we exploit hierarchical Swin-Transformer with shifted windows to extend the DeepLabv3 and model the Atrous Spatial Pyramid Pooling (ASPP) module. Based on a thorough search of the relevant literature, we are the first to model the seminal DeepLab model with a pure Transformer-based model. Extensive experiments on various medical image segmentation tasks verify that our approach performs better than or on par with most contemporary works on an amalgamation of Vision Transformer and CNN-based methods, along with a significant reduction of model complexity. The codes and trained models are publicly available at https://github.com/rezazad68/transdeeplab

Submitted 1 August, 2022; originally announced August 2022.

arXiv:2207.13415 [pdf, other]

TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism for a Deep Segmentation Model

Authors: Reza Azad, Mohammad T. AL-Antary, Moein Heidari, Dorit Merhof

Abstract: In the past few years, convolutional neural networks (CNNs), particularly U-Net, have been the prevailing technique in the medical image processing era. Specifically, the seminal U-Net, as well as its alternatives, have successfully managed to address a wide variety of medical image segmentation tasks. However, these architectures are intrinsically imperfect as they fail to exhibit long-range interactions and spatial dependencies, leading to a severe performance drop in the segmentation of medical images with variable shapes and structures. Transformers, originally proposed for sequence-to-sequence prediction, have arisen as surrogate architectures to precisely model global information assisted by the self-attention mechanism. Despite being feasibly designed, utilizing a pure Transformer for image segmentation purposes can result in limited localization capacity stemming from inadequate low-level features. Thus, a line of research strives to design robust variants of Transformer-based U-Net. In this paper, we propose TransNorm, a novel deep segmentation framework which concomitantly consolidates a Transformer module into both encoder and skip-connections of the standard U-Net. We argue that the expedient design of skip-connections can be crucial for accurate segmentation as it can assist in feature fusion between the expanding and contracting paths. In this respect, we derive a Spatial Normalization mechanism from the Transformer module to adaptively recalibrate the skip connection path. Extensive experiments across three typical tasks for medical image segmentation demonstrate the effectiveness of TransNorm. The codes and trained models are publicly available at https://github.com/rezazad68/transnorm.

Submitted 27 July, 2022; originally announced July 2022.

arXiv:2203.01932 [pdf, other]

Contextual Attention Network: Transformer Meets U-Net

Authors: Reza Azad, Moein Heidari, Yuli Wu, Dorit Merhof

Abstract: Currently, convolutional neural networks (CNN) (e.g., U-Net) have become the de facto standard and attained immense success in medical image segmentation. However, as a downside, CNN based methods are a double-edged sword as they fail to build long-range dependencies and global context connections due to the limited receptive field that stems from the intrinsic characteristics of the convolution operation. Hence, recent articles have exploited Transformer variants for medical image segmentation tasks which open up great opportunities due to their innate capability of capturing long-range correlations through the attention mechanism. Although feasibly designed, most of these studies perform poorly at capturing local information, resulting in less distinct boundary areas. In this paper, we propose a contextual attention network to tackle the aforementioned limitations. The proposed method uses the strength of the Transformer module to model the long-range contextual dependency. Simultaneously, it utilizes the CNN encoder to capture local semantic information. In addition, an object-level representation is included to model the regional interaction map. The extracted hierarchical features are then fed to the contextual attention module to adaptively recalibrate the representation space using the local information. Then, they emphasize the informative regions while taking into account the long-range contextual dependency derived by the Transformer module. We validate our method on several large-scale public medical image segmentation datasets and achieve state-of-the-art performance. We have provided the implementation code at https://github.com/rezazad68/TMUnet.

Submitted 31 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

arXiv:2009.12639 [pdf]

doi 10.1109/GHTC46280.2020.9342904

A practical method for pupil segmentation in challenging conditions

Authors: Donya Khaledyan, Mohammad Eshghi, Morteza Heidari, Abolfazl Zargari Khuzani, Najmeh Mashhadi

Abstract: Various methods have been proposed for authentication, including password or pattern drawing, which is clearly visible on personal electronic devices. However, these methods of authentication are more vulnerable, as passwords and cards can be forgotten, lost, or stolen. Therefore, great interest has developed in individual authentication using biometric methods that are based on physical and behavioral features that cannot be forgotten or stolen. Authentication methods are used widely in portable devices, where battery lifetime and response time are essential concerns. Because these systems need to be fast and low power, designing efficient methods is still critical. In this paper, we propose a new low-power and fast method for pupil segmentation based on approximate computing; by trading a minor level of accuracy, it obtains significant improvements in power consumption and time savings, which makes the algorithm suitable for hardware implementation. Furthermore, the experimental results of PSNR and SSIM show that the error rate in this method is negligible.

Submitted 26 September, 2020; originally announced September 2020.

Comments: 5 pages, 6 figures

Journal ref: 2020 IEEE Global Humanitarian Technology Conference (GHTC)

arXiv:2009.12631 [pdf]

doi 10.1109/GHTC46280.2020.9342851

Image quality enhancement in wireless capsule endoscopy with adaptive fraction gamma transformation and unsharp masking filter

Authors: Rezvan Ezatian, Donya Khaledyan, Kian Jafari, Morteza Heidari, Abolfazl Zargari Khuzani, Najmeh Mashhadi

Abstract: Wireless Capsule Endoscopy (WCE) was introduced in 2001 as one of the key approaches to observe the entire gastrointestinal (GI) tract, generally the small bowel. It has been used to detect diseases in the gastrointestinal tract. Endoscopic image analysis is still a field with many open problems. The quality of many of the images it produces is rather unacceptable due to the nature of this imaging system, which causes difficulties for assessment by physicians and computer-aided diagnosis. In this paper, a novel technique is proposed to improve the quality of images captured by WCE. More specifically, it enhances the brightness and contrast and preserves the color information while reducing computational complexity. Furthermore, the experimental results of PSNR and SSIM confirm that the error rate in this method is negligible. Moreover, the proposed method improves intensity restricted average local entropy (IRMLE) by 22% and color enhancement factor (CEF) by 10%, and can keep the lightness of the image effectively. Our method achieves better visual quality and objective assessments compared to state-of-the-art methods.
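
To make two of the quantities named above concrete, the following is a minimal sketch of a plain fixed-gamma transformation and of PSNR using its standard definition. It is not the paper's adaptive fraction gamma method, whose details the abstract does not give. The function names, the flat list-of-intensities image representation, and the 8-bit `max_val` default are assumptions for illustration.

```python
import math

def gamma_correct(pixels, gamma, max_val=255):
    """Apply out = max_val * (in / max_val) ** gamma to each intensity.

    gamma < 1 brightens dark regions; gamma > 1 darkens them.
    """
    return [max_val * (p / max_val) ** gamma for p in pixels]

def psnr(reference, test, max_val=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)
```

A higher PSNR between an enhanced image and its reference indicates a smaller mean squared error, which is the sense in which the abstract reports the error as negligible.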

Submitted 26 September, 2020; originally announced September 2020.

Comments: 7 pages, 7 figures

Journal ref: 2020 IEEE Global Humanitarian Technology Conference (GHTC)

arXiv:2009.09937 [pdf]

doi 10.1109/TBME.2021.3054248

Applying a random projection algorithm to optimize machine learning model for breast lesion classification

Authors: Morteza Heidari, Sivaramakrishnan Lakshmivarahan, Seyedehnafiseh Mirniaharikandehei, Gopichandh Danala, Sai Kiran R. Maryada, Hong Liu, Bin Zheng

Abstract: Machine learning is widely used in developing computer-aided diagnosis (CAD) schemes of medical images. However, CAD usually computes a large number of image features from the targeted regions, which creates a challenge of how to identify a small and optimal feature vector to build robust machine learning models. In this study, we investigate the feasibility of applying a random projection algorithm to build an optimal feature vector from the initially CAD-generated large feature pool and improve the performance of the machine learning model. We assemble a retrospective dataset involving 1,487 cases of mammograms in which 644 cases have confirmed malignant mass lesions and 843 have benign lesions. A CAD scheme is first applied to segment mass regions and initially compute 181 features. Then, support vector machine (SVM) models embedded with several feature dimensionality reduction methods are built to predict the likelihood of lesions being malignant. All SVM models are trained and tested using a leave-one-case-out cross-validation method. SVM generates a likelihood score of each segmented mass region depicted on a one-view mammogram. By fusion of two scores of the same mass depicted on two-view mammograms, a case-based likelihood score is also evaluated. Compared with principal component analysis, nonnegative matrix factorization, and Chi-squared methods, SVM embedded with the random projection algorithm yielded a significantly higher case-based lesion classification performance with the area under ROC curve of 0.84±0.01 (p<0.02). The study demonstrates that the random projection algorithm is a promising method to generate optimal feature vectors to help improve the performance of machine learning models of medical images.
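
As an illustration of the dimensionality reduction idea described above, the following is a minimal sketch of a Gaussian random projection that maps a 181-dimensional feature vector to a smaller one. The Gaussian construction, the 1/sqrt(k) scaling, the target size of 20, and the function names are assumptions for illustration; the abstract does not specify which random projection variant or output dimension the study used.

```python
import math
import random

def random_projection_matrix(n_features, n_components, seed=0):
    """Build an n_components x n_features matrix of scaled Gaussian entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)  # keeps expected squared norms comparable
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
        for _ in range(n_components)
    ]

def project(vector, matrix):
    """Multiply the projection matrix by one feature vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

# 181 matches the feature count in the abstract; 20 is a hypothetical target size.
R = random_projection_matrix(181, 20)
reduced = project([1.0] * 181, R)
```

The reduced vector would then be passed to a classifier such as the SVM mentioned in the abstract. Unlike PCA, the projection matrix here is drawn without looking at the data.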

Submitted 9 September, 2020; originally announced September 2020.

Comments: 11 pages, 6 figures

Journal ref: IEEE Transactions on Biomedical Engineering, 2021

arXiv:2009.08809 [pdf]

Deep learning denoising for EOG artifacts removal from EEG signals

Authors: Najmeh Mashhadi, Abolfazl Zargari Khuzani, Morteza Heidari, Donya Khaledyan

Abstract: There are many sources of interference encountered in electroencephalogram (EEG) recordings, specifically ocular, muscular, and cardiac artifacts. Rejection of EEG artifacts is an essential process in EEG analysis since such artifacts cause many problems in EEG signal analysis. One of the most challenging issues in EEG denoising processes is removing the ocular artifacts, where electrooculographic (EOG) and EEG signals overlap in both frequency and time domains. In this paper, we build and train a deep learning model to deal with this challenge and remove the ocular artifacts effectively. In the proposed scheme, we convert each EEG signal to an image to be fed to a U-NET model, which is a deep learning model usually used in image segmentation tasks. We propose three different schemes and make our U-NET based models learn to purify contaminated EEG signals in a manner similar to the image segmentation process. The results confirm that one of our schemes can achieve a reliable and promising accuracy in reducing the mean square error between the target signal (Pure EEGs) and the predicted signal (Purified EEGs).

Submitted 12 September, 2020; originally announced September 2020.

arXiv:2009.05880 [pdf]

An approach to human iris recognition using quantitative analysis of image features and machine learning

Authors: Abolfazl Zargari Khuzani, Najmeh Mashhadi, Morteza Heidari, Donya Khaledyan

Abstract: The Iris pattern is a unique biological feature for each individual, making it a valuable and powerful tool for human identification. In this paper, an efficient framework for iris recognition is proposed in four steps. (1) Iris segmentation (using a relative total variation combined with Coarse Iris Localization), (2) feature extraction (using Shape&density, FFT, GLCM, GLDM, and Wavelet), (3) feature reduction (employing Kernel-PCA) and (4) classification (applying multi-layer neural network) to classify 2000 iris images of CASIA-Iris-Interval dataset obtained from 200 volunteers. The results confirm that the proposed scheme can provide a reliable prediction with an accuracy of up to 99.64%.

Submitted 12 September, 2020; originally announced September 2020.

arXiv:2009.00675 [pdf]

Applying a random projection algorithm to optimize machine learning model for predicting peritoneal metastasis in gastric cancer patients using CT images

Authors: Seyedehnafiseh Mirniaharikandehei, Morteza Heidari, Gopichandh Danala, Sivaramakrishnan Lakshmivarahan, Bin Zheng

Abstract: Background and Objective: Non-invasively predicting the risk of cancer metastasis before surgery plays an essential role in determining optimal treatment methods for cancer patients (including who can benefit from neoadjuvant chemotherapy). Although developing radiomics based machine learning (ML) models has attracted broad research interest for this purpose, it often faces a challenge of how to build a high-performing and robust ML model using small and imbalanced image datasets. Methods: In this study, we explore a new approach to build an optimal ML model. A retrospective dataset involving abdominal computed tomography (CT) images acquired from 159 patients diagnosed with gastric cancer is assembled. Among them, 121 cases have peritoneal metastasis (PM), while 38 cases do not have PM. A computer-aided detection (CAD) scheme is first applied to segment primary gastric tumor volumes and initially computes 315 image features. Then, two Gradient Boosting Machine (GBM) models embedded with two different feature dimensionality reduction methods, namely, the principal component analysis (PCA) and a random projection algorithm (RPA) and a synthetic minority oversampling technique, are built to predict the risk of the patients having PM. All GBM models are trained and tested using a leave-one-case-out cross-validation method. Results: Results show that the GBM embedded with RPA yielded a significantly higher prediction accuracy (71.2%) than using PCA (65.2%) (p<0.05). Conclusions: The study demonstrated that CT images of the primary gastric tumors contain discriminatory information to predict the risk of PM, and RPA is a promising method to generate optimal feature vector, improving the performance of ML models of medical images.

Submitted 1 September, 2020; originally announced September 2020.

Comments: 24 pages, 7 figures

arXiv:2006.12229 [pdf]

doi 10.1016/j.ijmedinf.2020.104284

Improving performance of CNN to predict likelihood of COVID-19 using chest X-ray images with preprocessing algorithms

Authors: Morteza Heidari, Seyedehnafiseh Mirniaharikandehei, Abolfazl Zargari Khuzani, Gopichandh Danala, Yuchen Qiu, Bin Zheng

Abstract: With the rapid spread of coronavirus disease (COVID-19) worldwide, chest X-ray radiography has also been used to detect COVID-19 infected pneumonia and assess its severity or monitor its prognosis in hospitals due to its low cost, low radiation dose, and wide accessibility. However, how to more accurately and efficiently detect COVID-19 infected pneumonia and distinguish it from other community-acquired pneumonia remains a challenge. To address this challenge, in this study we develop and test a new computer-aided diagnosis (CAD) scheme. It includes several image pre-processing algorithms to remove diaphragms, normalize image contrast-to-noise ratio, and generate three input images, then links to a transfer learning based convolutional neural network (a VGG16 based CNN model) to classify chest X-ray images into three classes of COVID-19 infected pneumonia, other community-acquired pneumonia and normal (non-pneumonia) cases. For this purpose, a publicly available dataset of 8,474 chest X-ray images is used, which includes 415 confirmed COVID-19 infected pneumonia, 5,179 community-acquired pneumonia, and 2,880 non-pneumonia cases. The dataset is divided into two subsets with 90% and 10% of images in each subset to train and test the CNN-based CAD scheme. The testing results achieve 94.0% of overall accuracy in classifying three classes and 98.6% accuracy in detecting COVID-19 infected cases. Thus, the study demonstrates the feasibility of developing a CAD scheme of chest X-ray images and providing radiologists useful decision-making supporting tools in detecting and diagnosing COVID-19 infected pneumonia.

Submitted 11 June, 2020; originally announced June 2020.

Comments: 11 pages, 5 figures, 2 tables

Journal ref: International Journal of Medical Informatics, 104284. 23 Sep. 2020

arXiv:1802.09316 [pdf, other]

doi 10.1117/12.2291549

Effects of Important Parameters Variations on Computing Eigenspace-Based Minimum Variance Weights for Ultrasound Tissue Harmonic Imaging

Authors: Mehdi Haji Heidari, Moein Mozaffarzadeh, Rayyan Manwar, Mohammadreza Nasiriavanaki

Abstract: In recent years, the minimum variance (MV) beamforming has been widely studied due to its high resolution and contrast in B-mode Ultrasound imaging (USI). However, the performance of the MV beamformer is degraded at the presence of noise, as a result of the inaccurate covariance matrix estimation which leads to a low quality image. Second harmonic imaging (SHI) provides many advantages over the conventional pulse-echo USI, such as enhanced axial and lateral resolutions. However, the low signal-to-noise ratio (SNR) is a major problem in SHI. In this paper, Eigenspace-based minimum variance (EIBMV) beamformer has been employed for second harmonic USI. The Tissue Harmonic Imaging (THI) is achieved by Pulse Inversion (PI) technique. Using the EIBMV weights, instead of the MV ones, would lead to reduced sidelobes and improved contrast, without compromising the high resolution of the MV beamformer (even at the presence of a strong noise). In addition, we have investigated the effects of variations of the important parameters in computing EIBMV weights, i.e., K, L, and δ, on the resolution and contrast obtained in SHI. The results are evaluated using numerical data (using point target and cyst phantoms), and the proper parameters of EIBMV are indicated for THI.

Submitted 26 February, 2018; originally announced February 2018.

Comments: This paper is presented in Photons Plus Ultrasound: Imaging and Sensing 2018 conference and published by International Society for Optics and Photonics

Journal ref: Proceedings Volume 10494, Photons Plus Ultrasound: Imaging and Sensing 2018; 104946R (2018)

Showing 1–16 of 16 results for author: Heidari, M