Search | arXiv e-print repository

arXiv:2309.00831

Multi-scale, Data-driven and Anatomically Constrained Deep Learning Image Registration for Adult and Fetal Echocardiography

Authors: Md. Kamrul Hasan, Haobo Zhu, Guang Yang, Choon Hwai Yap

Abstract: Temporal echocardiography image registration is a basis for clinical quantifications such as cardiac motion estimation, myocardial strain assessments, and stroke volume quantifications. In past studies, deep learning image registration (DLIR) has shown promising results and is consistently accurate and precise, requiring less computational time. We propose that a greater focus on the warped moving image's anatomic plausibility and image quality can support robust DLIR performance. Further, past implementations have focused on adult echocardiography, and there is an absence of DLIR implementations for fetal echocardiography. We propose a framework that combines three strategies for DLIR in both fetal and adult echo: (1) an anatomic shape-encoded loss to preserve physiological myocardial and left ventricular anatomical topologies in warped images; (2) a data-driven loss that is trained adversarially to preserve good image texture features in warped images; and (3) a multi-scale training scheme of a data-driven and anatomically constrained algorithm to improve accuracy. Our tests show that good anatomical topology and image textures are strongly linked to shape-encoded and data-driven adversarial losses. They improve different aspects of registration performance in a non-overlapping way, justifying their combination. Despite fundamental distinctions between adult and fetal echo images, we show that these strategies can provide excellent registration results in both adult and fetal echocardiography using the publicly available CAMUS adult echo dataset and our private multi-demographic fetal echo dataset. Our approach outperforms traditional non-DL gold standard registration approaches, including Optical Flow and Elastix. Registration improvements could be translated to more accurate and precise clinical quantification of cardiac ejection fraction, demonstrating a potential for translation.

Submitted 11 September, 2023; v1 submitted 2 September, 2023; originally announced September 2023.

Comments: Our data-driven and anatomically constrained DLIR method's source code will be publicly available at https://github.com/kamruleee51/DdC-AC-DLIR

arXiv:2212.11486

Over-the-Air Federated Learning with Enhanced Privacy

Authors: Xiaochan Xue, Moh Khalid Hasan, Shucheng Yu, Laxima Niure Kandel, Min Song

Abstract: Federated learning (FL) has emerged as a promising learning paradigm in which only local model parameters (gradients) are shared. Private user data never leaves the local devices, thus preserving data privacy. However, recent research has shown that even when local data is never shared by a user, exchanging model parameters without protection can also leak private information. Moreover, in wireless systems, the frequent transmission of model parameters can cause tremendous bandwidth consumption and network congestion when the model is large. To address this problem, we propose a new FL framework with efficient over-the-air parameter aggregation and strong privacy protection of both user data and models. We achieve this by introducing pairwise cancellable random artificial noises (PCR-ANs) on end devices. As compared to existing over-the-air computation (AirComp) based FL schemes, our design provides stronger privacy protection. We analytically show the secrecy capacity and the convergence rate of the proposed wireless FL aggregation algorithm.

Submitted 22 December, 2022; originally announced December 2022.

Comments: 6 pages
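
Illustrative note on the entry above: the abstract does not spell out how the pairwise cancellable random artificial noises (PCR-ANs) are generated or transmitted. The sketch below only shows the general idea of pairwise masks that cancel in a sum. It is not the authors' method, and it ignores the wireless channel entirely. The function names, the Gaussian noise, and the toy update vectors are all assumptions made for illustration.

```python
import random

def pairwise_cancellable_noise(num_devices, dim, seed=0):
    """Return one noise vector per device such that all vectors sum to zero.

    Assumption: each pair (i, j) shares a random vector r; device i adds +r
    and device j adds -r, so every pair's contribution cancels in the sum.
    """
    rng = random.Random(seed)
    noise = [[0.0] * dim for _ in range(num_devices)]
    for i in range(num_devices):
        for j in range(i + 1, num_devices):
            for k in range(dim):
                r = rng.gauss(0.0, 1.0)
                noise[i][k] += r
                noise[j][k] -= r
    return noise

def aggregate(vectors):
    """Element-wise sum, standing in for the aggregation step."""
    return [sum(column) for column in zip(*vectors)]

# Hypothetical local model updates from three devices.
updates = [[0.1, -0.2], [0.4, 0.0], [-0.3, 0.5]]
noise = pairwise_cancellable_noise(len(updates), dim=2)

# Each device sends only its masked update.
masked = [[u + n for u, n in zip(upd, nz)] for upd, nz in zip(updates, noise)]

plain_sum = aggregate(updates)
masked_sum = aggregate(masked)
```

Each masked vector differs from the true update, while the aggregate of the masked vectors matches the aggregate of the true updates up to floating-point rounding.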

arXiv:2209.08807

A Deep Learning Approach for Parallel Imaging and Compressed Sensing MRI Reconstruction

Authors: Farhan Sadik, Md. Kamrul Hasan

Abstract: Parallel imaging accelerates MRI data acquisition by acquiring additional sensitivity information with an array of receiver coils, resulting in fewer phase encoding steps. Because of fewer data requirements than parallel imaging, compressed sensing magnetic resonance imaging (CS-MRI) has gained popularity in the field of medical imaging. Parallel imaging and compressed sensing (CS) both reduce the amount of data captured in the k-space, which speeds up traditional MRI acquisition. As acquisition time is inversely proportional to sample count, forming an image from reduced k-space samples results in faster acquisition but with aliasing artifacts. For de-aliasing the reconstructed image, this paper proposes a novel Generative Adversarial Network (GAN) called RECGAN-GR that is supervised with multi-modal losses. In comparison to existing GAN networks, our proposed method introduces a novel generator network, RemU-Net, which is integrated with dual-domain loss functions such as weighted magnitude and phase loss functions, as well as parallel imaging-based loss, GRAPPA consistency loss. As refinement learning, a k-space correction block is proposed to make the GAN network self-resistant to generating unnecessary data, which speeds up the reconstruction process. Comprehensive results show that the proposed RECGAN-GR not only improves the PSNR by 4 dB over GAN-based methods but also by 2 dB over conventional state-of-the-art CNN methods available in the literature for single-coil data. The proposed work significantly improves image quality for low-retained data, resulting in five to ten times faster acquisition.

Submitted 17 December, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

Comments: 13 pages, 11 figures

arXiv:2208.12232

doi 10.1016/j.compbiomed.2023.106624

A survey, review, and future trends of skin lesion segmentation and classification

Authors: Md. Kamrul Hasan, Md. Asif Ahamad, Choon Hwai Yap, Guang Yang

Abstract: The Computer-aided Diagnosis or Detection (CAD) approach for skin lesion analysis is an emerging field of research that has the potential to alleviate the burden and cost of skin cancer screening. Researchers have recently indicated increasing interest in developing such CAD systems, with the intention of providing a user-friendly tool to dermatologists to reduce the challenges encountered or associated with manual inspection. This article aims to provide a comprehensive literature survey and review of a total of 594 publications (356 for skin lesion segmentation and 238 for skin lesion classification) published between 2011 and 2022. These articles are analyzed and summarized in a number of different ways to contribute vital information regarding the methods for the development of CAD systems. These ways include relevant and essential definitions and theories, input data (dataset utilization, preprocessing, augmentations, and fixing imbalance problems), method configuration (techniques, architectures, module frameworks, and losses), training tactics (hyperparameter settings), and evaluation criteria. We intend to investigate a variety of performance-enhancing approaches, including ensemble and post-processing. We also discuss these dimensions to reveal their current trends based on utilization frequencies. In addition, we highlight the primary difficulties associated with evaluating skin lesion segmentation and classification systems using minimal datasets, as well as the potential solutions to these difficulties. Findings, recommendations, and trends are disclosed to inform future research on developing an automated and robust CAD system for skin lesion analysis.

Submitted 2 February, 2023; v1 submitted 25 August, 2022; originally announced August 2022.

Comments: This manuscript has been accepted to be published in Computers in Biology and Medicine and has a total of 106 pages (single column and double spacing), 13 figures, and 11 tables

Journal ref: Computers in biology and medicine (2023): 106624

arXiv:2203.08490

Learning Audio Representations with MLPs

Authors: Mashrur M. Morshed, Ahmad Omar Ahsan, Hasan Mahmud, Md. Kamrul Hasan

Abstract: In this paper, we propose an efficient MLP-based approach for learning audio representations, namely timestamp and scene-level audio embeddings. We use an encoder consisting of sequentially stacked gated MLP blocks, which accept 2D MFCCs as inputs. In addition, we also provide a simple temporal interpolation-based algorithm for computing scene-level embeddings from timestamp embeddings. The audio representations generated by our method are evaluated across a diverse set of benchmarks at the Holistic Evaluation of Audio Representations (HEAR) challenge, hosted at the NeurIPS 2021 competition track. We achieved first place on the Speech Commands (full), Speech Commands (5 hours), and the Mridingham Tonic benchmarks. Furthermore, our approach is also the most resource-efficient among all the submitted methods, in terms of both the number of model parameters and the time required to compute embeddings.

Submitted 16 March, 2022; originally announced March 2022.

Comments: In submission to Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track

arXiv:2202.06128

Grasp-and-Lift Detection from EEG Signal Using Convolutional Neural Network

Authors: Md. Kamrul Hasan, Sifat Redwan Wahid, Faria Rahman, Shanjida Khan Maliha, Sauda Binte Rahman

Abstract: People undergoing neuromuscular dysfunctions and amputated limbs require automatic prosthetic appliances. In developing such prostheses, the precise detection of brain motor actions is imperative for the Grasp-and-Lift (GAL) tasks. Because of the low-cost and non-invasive essence of Electroencephalography (EEG), it is widely preferred for detecting motor actions during the controls of prosthetic tools. This article has automated the hand movement activity viz GAL detection method from the 32-channel EEG signals. The proposed pipeline essentially combines preprocessing and end-to-end detection steps, eliminating the requirement of hand-crafted feature engineering. Preprocessing action consists of raw signal denoising, using either Discrete Wavelet Transform (DWT) or highpass or bandpass filtering and data standardization. The detection step consists of Convolutional Neural Network (CNN)- or Long Short Term Memory (LSTM)-based model. All the investigations utilize the publicly available WAY-EEG-GAL dataset, having six different GAL events. The best experiment reveals that the proposed framework achieves an average area under the ROC curve of 0.944, employing the DWT-based denoising filter, data standardization, and CNN-based detection model. The obtained outcome designates an excellent achievement of the introduced method in detecting GAL events from the EEG signals, turning it applicable to prosthetic appliances, brain-computer interfaces, robotic arms, etc.

Submitted 12 February, 2022; originally announced February 2022.

Comments: Accepted in https://icaeee2022.com/

arXiv:2201.00458

Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Authors: Parnian Afshar, Arash Mohammadi, Konstantinos N. Plataniotis, Keyvan Farahani, Justin Kirby, Anastasia Oikonomou, Amir Asif, Leonard Wee, Andre Dekker, Xin Wu, Mohammad Ariful Haque, Shahruk Hossain, Md. Kamrul Hasan, Uday Kamal, Winston Hsu, Jhih-Yuan Lin, M. Sohel Rahman, Nabil Ibtehaz, Sh. M. Amir Foisol, Kin-Man Lam, Zhong Guang, Runze Zhang, Sumohana S. Channappayya, Shashank Gupta, Chander Dev

Abstract: Lung cancer is one of the deadliest cancers, and in part its effective diagnosis and treatment depend on the accurate delineation of the tumor. Human-centered segmentation, which is currently the most common approach, is subject to inter-observer variability, and is also time-consuming, considering the fact that only experts are capable of providing annotations. Automatic and semi-automatic tumor segmentation methods have recently shown promising results. However, as different researchers have validated their algorithms using various datasets and performance metrics, reliably evaluating these methods is still an open challenge. The goal of the Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark, created through the 2018 IEEE Video and Image Processing (VIP) Cup competition, is to provide a unique dataset and pre-defined metrics, so that different researchers can develop and evaluate their methods in a unified fashion. The 2018 VIP Cup started with a global engagement from 42 countries to access the competition data. At the registration stage, there were 129 members clustered into 28 teams from 10 countries, out of which 9 teams made it to the final stage and 6 teams successfully completed all the required tasks. In a nutshell, all the algorithms proposed during the competition are based on deep learning models combined with a false positive reduction technique. Methods developed by the three finalists show promising results in tumor segmentation; however, more effort should be put into reducing the false positive rate. This competition manuscript presents an overview of the VIP-Cup challenge, along with the proposed algorithms and results.

Submitted 2 January, 2022; originally announced January 2022.

arXiv:2109.07702

A Multi-Task Cross-Task Learning Architecture for Ad-hoc Uncertainty Estimation in 3D Cardiac MRI Image Segmentation

Authors: S. M. Kamrul Hasan, Cristian A. Linte

Abstract: Medical image segmentation has significantly benefitted thanks to deep learning architectures. Furthermore, semi-supervised learning (SSL) has recently been a growing trend for improving a model's overall performance by leveraging abundant unlabeled data. Moreover, learning multiple tasks within the same model further improves model generalizability. To generate smoother and accurate segmentation masks from 3D cardiac MR images, we present a Multi-task Cross-task learning consistency approach to enforce the correlation between the pixel-level (segmentation) and the geometric-level (distance map) tasks. Our extensive experimentation with varied quantities of labeled data in the training sets justifies the effectiveness of our model for the segmentation of the left atrial cavity from Gadolinium-enhanced magnetic resonance (GE-MR) images. With the incorporation of uncertainty estimates to detect failures in the segmentation masks generated by CNNs, our study further showcases the potential of our model to flag low-quality segmentation from a given model.

Submitted 2 October, 2021; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: Accepted to 2021 Computing in Cardiology (CinC); Code is available at https://github.com/SMKamrulHasan/MTCTL

arXiv:2105.03995

Acute Lymphoblastic Leukemia Detection from Microscopic Images Using Weighted Ensemble of Convolutional Neural Networks

Authors: Chayan Mondal, Md. Kamrul Hasan, Md. Tasnim Jawad, Aishwariya Dutta, Md. Rabiul Islam, Md. Abdul Awal, Mohiuddin Ahmad

Abstract: Acute Lymphoblastic Leukemia (ALL) is a blood cell cancer characterized by numerous immature lymphocytes. Even though automation in ALL prognosis is an essential aspect of cancer diagnosis, it is challenging due to the morphological correlation between malignant and normal cells. The traditional ALL classification strategy demands experienced pathologists to carefully read the cell images, which is arduous, time-consuming, and often suffers from inter-observer variations. This article has automated the ALL detection task from microscopic cell images, employing deep Convolutional Neural Networks (CNNs). We explore the weighted ensemble of different deep CNNs to recommend a better ALL cell classifier. The weights for the ensemble candidate models are estimated from their corresponding metrics, such as accuracy, F1-score, AUC, and kappa values. Various data augmentations and pre-processing are incorporated for achieving a better generalization of the network. We utilize the publicly available C-NMC-2019 ALL dataset to conduct all the comprehensive experiments. Our proposed weighted ensemble model, using the kappa values of the ensemble candidates as their weights, has outputted a weighted F1-score of 88.6 %, a balanced accuracy of 86.2 %, and an AUC of 0.941 in the preliminary test set. The qualitative results displaying the gradient class activation maps confirm that the introduced model has a concentrated learned region. In contrast, the ensemble candidate models, such as Xception, VGG-16, DenseNet-121, MobileNet, and InceptionResNet-V2, separately produce coarse and scattered learned areas for most example cases. Since the proposed kappa value-based weighted ensemble yields a better result for the aimed task in this article, it can be tried in other domains of medical diagnostic applications.

Submitted 9 May, 2021; originally announced May 2021.

Comments: 31 pages, 9 figures
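
Illustrative note on the entry above: the abstract says the ensemble weights come from each candidate's metric, with kappa values giving the reported result, but it does not give the exact fusion rule. A minimal sketch, assuming the weights are normalized to sum to one and applied to each model's class probabilities, might look like the following. The kappa values and probabilities are made up for the example and are not taken from the paper.

```python
def weighted_ensemble(prob_lists, weights):
    """Fuse per-model class probabilities with normalized weights.

    Assumption: weights (e.g. each model's kappa score) are scaled to sum
    to one and then used in a weighted average of the probabilities.
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(norm, prob_lists))
        for c in range(n_classes)
    ]

# Hypothetical kappa scores for three candidate CNNs.
kappas = [0.70, 0.60, 0.50]

# Hypothetical softmax outputs (two classes) for a single cell image.
probs = [[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]]

fused = weighted_ensemble(probs, kappas)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this toy case two of the three models favor the second class, yet the fused prediction is the first class, because the model with the highest kappa is also the most confident. This is the kind of behavior that separates a metric-weighted average from a plain majority vote.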

arXiv:2102.06169

COVID-19 identification from volumetric chest CT scans using a progressively resized 3D-CNN incorporating segmentation, augmentation, and class-rebalancing

Authors: Md. Kamrul Hasan, Md. Tasnim Jawad, Kazi Nasim Imtiaz Hasan, Sajal Basak Partha, Md. Masum Al Masba, Shumit Saha

Abstract: The novel COVID-19 is a global pandemic disease overgrowing worldwide. Computer-aided screening tools with greater sensitivity are imperative for disease diagnosis and prognosis as early as possible. They can also be a helpful tool in triage for testing and clinical supervision of COVID-19 patients. However, designing such an automated tool from non-invasive radiographic images is challenging as many manually annotated datasets are not publicly available yet, which is the essential core requirement of supervised learning schemes. This article proposes a 3D Convolutional Neural Network (CNN)-based classification approach considering both the inter- and intra-slice spatial voxel information. The proposed system is trained in an end-to-end manner on the 3D patches from the whole volumetric CT images to enlarge the number of training samples, performing the ablation studies on patch size determination. We integrate progressive resizing, segmentation, augmentations, and class-rebalancing into our 3D network. The segmentation is a critical prerequisite step for COVID-19 diagnosis, enabling the classifier to learn prominent lung features while excluding the outer lung regions of the CT scans. We evaluate all the extensive experiments on a publicly available dataset, named MosMed, having binary- and multi-class chest CT image partitions. Our experimental results are very encouraging, yielding areas under the ROC curve of 0.914 and 0.893 for the binary- and multi-class tasks, respectively, applying 5-fold cross-validations. Our method's promising results delegate it as a favorable aiding tool for clinical practitioners and radiologists to assess COVID-19.

Submitted 14 April, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

Comments: 33 pages

arXiv:2102.01824

Dermo-DOCTOR: A framework for concurrent skin lesion detection and recognition using a deep convolutional neural network with end-to-end dual encoders

Authors: Md. Kamrul Hasan, Shidhartho Roy, Chayan Mondal, Md. Ashraful Alam, Md. Toufick E Elahi, Aishwariya Dutta, S. M. Taslim Uddin Raju, Md. Tasnim Jawad, Mohiuddin Ahmad

Abstract: Automated skin lesion analysis for simultaneous detection and recognition is still challenging for inter-class homogeneity and intra-class heterogeneity, leading to low generic capability of a Single Convolutional Neural Network (CNN) with limited datasets. This article proposes an end-to-end deep CNN-based framework for simultaneous detection and recognition of the skin lesions, named Dermo-DOCTOR, consisting of two encoders. The feature maps from two encoders are fused channel-wise, called Fused Feature Map (FFM). The FFM is utilized for decoding in the detection sub-network, concatenating each stage of two encoders' outputs with corresponding decoder layers to retrieve the lost spatial information due to pooling in the encoders. For the recognition sub-network, the outputs of three fully connected layers, utilizing feature maps of two encoders and FFM, are aggregated to obtain a final lesion class. We train and evaluate the proposed Dermo-DOCTOR utilizing two publicly available benchmark datasets, namely ISIC-2016 and ISIC-2017. The achieved segmentation results exhibit mean intersection over unions of 85.0 % and 80.0 % respectively for ISIC-2016 and ISIC-2017 test datasets. The proposed Dermo-DOCTOR also demonstrates praiseworthy success in lesion recognition, providing the areas under the receiver operating characteristic curves of 0.98 and 0.91 respectively for those two datasets. The experimental results show that the proposed Dermo-DOCTOR outperforms the alternative methods mentioned in the literature, designed for skin lesion detection and recognition. As the Dermo-DOCTOR provides better results on two different test datasets, even with limited training data, it can be an auspicious computer-aided assistive tool for dermatologists.

Submitted 23 February, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

Comments: 39 Pages

arXiv:2102.01822

Multi-class probabilistic atlas-based whole heart segmentation method in cardiac CT and MRI

Authors: Tarun Kanti Ghosh, Md. Kamrul Hasan, Shidhartho Roy, Md. Ashraful Alam, Eklas Hossain, Mohiuddin Ahmad

Abstract: Accurate and robust whole heart substructure segmentation is crucial in developing clinical applications, such as computer-aided diagnosis and computer-aided surgery. However, segmentation of different heart substructures is challenging because of inadequate edge or boundary information, the complexity of the background and texture, and the diversity in different substructures' sizes and shapes. This article proposes a framework for multi-class whole heart segmentation employing non-rigid registration-based probabilistic atlas incorporating the Bayesian framework. We also propose a non-rigid registration pipeline utilizing a multi-resolution strategy for obtaining the highest attainable mutual information between the moving and fixed images. We further incorporate non-rigid registration into the expectation-maximization algorithm and implement different deep convolutional neural network-based encoder-decoder networks for ablation studies. All the extensive experiments are conducted utilizing the publicly available dataset for the whole heart segmentation containing 20 MRI and 20 CT cardiac images. The proposed approach exhibits an encouraging achievement, yielding a mean volume overlapping error of 14.5 % for CT scans exceeding the state-of-the-art results by a margin of 1.3 % in terms of the same metric. As the proposed approach provides better results to delineate the different substructures of the heart, it can be a medical diagnostic aiding tool for helping experts with quicker and more accurate results.

Submitted 2 February, 2021; originally announced February 2021.

Comments: 17 pages

arXiv:2009.05379

L2-Constrained RemNet for Camera Model Identification and Image Manipulation Detection

Authors: Abdul Muntakim Rafi, Jonathan Wu, Md. Kamrul Hasan

Abstract: Source camera model identification (CMI) and image manipulation detection are of paramount importance in image forensics. In this paper, we propose an L2-constrained Remnant Convolutional Neural Network (L2-constrained RemNet) for performing these two crucial tasks. The proposed network architecture consists of a dynamic preprocessor block and a classification block. An L2 loss is applied to the output of the preprocessor block, and categorical crossentropy loss is calculated based on the output of the classification block. The whole network is trained in an end-to-end manner by minimizing the total loss, which is a combination of the L2 loss and the categorical crossentropy loss. Aided by the L2 loss, the data-adaptive preprocessor learns to suppress the unnecessary image contents and assists the classification block in extracting robust image forensics features. We train and test the network on the Dresden database and achieve an overall accuracy of 98.15%, where all the test images are from devices and scenes not used during training to replicate practical applications. The network also outperforms other state-of-the-art CNNs even when the images are manipulated. Furthermore, we attain an overall accuracy of 99.68% in image manipulation detection, which implies that it can be used as a general-purpose network for image forensic tasks.

Submitted 14 September, 2020; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: arXiv admin note: text overlap with arXiv:1902.00694
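
Illustrative note on the entry above: the abstract describes a total loss that combines an L2 loss on the preprocessor output with a categorical crossentropy loss on the classifier output, but it does not state how the two terms are weighted or normalized. The sketch below assumes a mean-of-squares L2 term and a single scalar weight, `l2_weight`. Both are hypothetical choices made for this example and are not taken from the paper.

```python
import math

def l2_loss(values):
    """Mean of squared activations (assumed form of the L2 term)."""
    return sum(v * v for v in values) / len(values)

def categorical_crossentropy(class_probs, true_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(class_probs[true_index])

def total_loss(preprocessor_out, class_probs, true_index, l2_weight=1.0):
    """Combined objective: crossentropy plus weighted L2 penalty.

    l2_weight is a hypothetical hyperparameter; the abstract only says the
    total loss is "a combination" of the two terms.
    """
    ce = categorical_crossentropy(class_probs, true_index)
    return ce + l2_weight * l2_loss(preprocessor_out)

# Toy values: a flattened preprocessor output and a 2-class softmax output.
example = total_loss([1.0, 1.0], [0.5, 0.5], true_index=1)
```

Minimizing the L2 term pushes the preprocessor output toward small magnitudes, which matches the abstract's description of suppressing unnecessary image content, while the crossentropy term keeps the information the classifier needs.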

arXiv:2007.11993

CVR-Net: A deep convolutional neural network for coronavirus recognition from chest radiography images

Authors: Md. Kamrul Hasan, Md. Ashraful Alam, Md. Toufick E Elahi, Shidhartho Roy, Sifat Redwan Wahid

Abstract: The novel Coronavirus Disease 2019 (COVID-19) is a global pandemic disease spreading rapidly around the world. A robust and automatic early recognition of COVID-19, via auxiliary computer-aided diagnostic tools, is essential for disease cure and control. The chest radiography images, such as Computed Tomography (CT) and X-ray, and deep Convolutional Neural Networks (CNNs), can be a significant and useful material for designing such tools. However, designing such an automated tool is challenging as a massive number of manually annotated datasets are not publicly available yet, which is the core requirement of supervised learning systems. In this article, we propose a robust CNN-based network, called CVR-Net (Coronavirus Recognition Network), for the automatic recognition of the coronavirus from CT or X-ray images. The proposed end-to-end CVR-Net is a multi-scale-multi-encoder ensemble model, where we have aggregated the outputs from two different encoders and their different scales to obtain the final prediction probability. We train and test the proposed CVR-Net on three different datasets, where the images have been collected from different open-source repositories. We compare our proposed CVR-Net with state-of-the-art methods, which are trained and tested on the same datasets. We split three datasets into five different tasks, where each task has a different number of classes, to evaluate the multi-tasking CVR-Net. Our model achieves an overall F1-score & accuracy of 0.997 & 0.998; 0.963 & 0.964; 0.816 & 0.820; 0.961 & 0.961; and 0.780 & 0.780, respectively, for task-1 to task-5. As the CVR-Net provides promising results on the small datasets, it can be an auspicious computer-aided diagnostic tool for the diagnosis of coronavirus to assist the clinical practitioners and radiologists. Our source codes and model are publicly available at https://github.com/kamruleee51/CVR-Net.

Submitted 21 July, 2020; originally announced July 2020.

Comments: 31 Pages

arXiv:2006.02578 [pdf, other]

doi 10.13140/RG.2.2.18341.86249

DFR-TSD: A Deep Learning Based Framework for Robust Traffic Sign Detection Under Challenging Weather Conditions

Authors: Sabbir Ahmed, Uday Kamal, Md. Kamrul Hasan

Abstract: Robust traffic sign detection and recognition (TSDR) is of paramount importance for the successful realization of autonomous vehicle technology. The importance of this task has led to a vast amount of research efforts and many promising methods have been proposed in the existing literature. However, the state-of-the-art (SOTA) methods have been evaluated on clean and challenge-free datasets and overlooked the performance deterioration associated with different challenging conditions (CCs) that obscure the traffic images captured in the wild. In this paper, we look at the TSDR problem under CCs and focus on the performance degradation associated with them. To overcome this, we propose a Convolutional Neural Network (CNN) based TSDR framework with prior enhancement. Our modular approach consists of a CNN-based challenge classifier, Enhance-Net, an encoder-decoder CNN architecture for image enhancement, and two separate CNN architectures for sign-detection and classification. We propose a novel training pipeline for Enhance-Net that focuses on the enhancement of the traffic sign regions (instead of the whole image) in the challenging images subject to their accurate detection. We used the CURE-TSD dataset, consisting of traffic videos captured under different CCs, to evaluate the efficacy of our approach. We experimentally show that our method obtains an overall precision and recall of 91.1% and 70.71%, which is a 7.58% and 35.90% improvement in precision and recall, respectively, compared to the current benchmark.
Furthermore, we compare our approach with SOTA object detection networks, Faster-RCNN and R-FCN, and show that our approach outperforms them by a large margin.

Submitted 3 June, 2020; originally announced June 2020.

arXiv:2006.00205 [pdf]

doi 10.1109/ICAIIC.2019.8668981

Opportunities of Optical Spectrum for Future Wireless Communications

Authors: Mostafa Zaman Chowdhury, Moh Khalid Hasan, Md Shahjalal, Eun Bi Shin, Yeong Min Jang

Abstract: The service-quality requirements of future fifth-generation (5G) communication, such as data rate, latency, power consumption, and number of connections, are very high. Moreover, the Internet of Things (IoT) requires massive connectivity. Optical wireless communication (OWC) technologies such as visible light communication, light fidelity, optical camera communication, and free space optical communication can effectively support the successful deployment of 5G and IoT. This paper presents the contributions of OWC networks for 5G and IoT solutions.

Submitted 30 May, 2020; originally announced June 2020.

Comments: 2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)

arXiv:2006.00204 [pdf]

doi 10.1109/ICTC.2018.8539460

Optical wireless hybrid networks for 5G and beyond communications

Authors: Mostafa Zaman Chowdhury, Moh Khalid Hasan, Md Shahjalal, Md Tanvir Hossan, Yeong Min Jang

Abstract: Fifth-generation (5G) and beyond communication systems, with ultra-high speed, ultra-low latency, and extremely high reliability, will consist of heterogeneous networks. These heterogeneous networks will consist not only of radio frequency (RF) based systems but also of optical wireless based systems. Hybrid architectures among different networks are an excellent approach for achieving the required level of service quality. In this paper, we present the opportunities brought by hybrid systems, considering RF as well as optical wireless based communication technologies. We also discuss the key research directions of hybrid network systems.

Submitted 30 May, 2020; originally announced June 2020.

Comments: 2018 International Conference on Information and Communication Technology Convergence (ICTC)

arXiv:2004.11253 [pdf, other]

L-CO-Net: Learned Condensation-Optimization Network for Clinical Parameter Estimation from Cardiac Cine MRI

Authors: S. M. Kamrul Hasan, Cristian A. Linte

Abstract: In this work, we implement a fully convolutional segmenter featuring both a learned group structure and a regularized weight-pruner to reduce the high computational cost in volumetric image segmentation. We validated our framework on the ACDC dataset featuring one healthy and four pathology groups imaged throughout the cardiac cycle. Our technique achieved Dice scores of 96.8% (LV blood-pool), 93.3% (RV blood-pool) and 90.0% (LV Myocardium) with five-fold cross-validation and yielded similar clinical parameters as those estimated from the ground truth segmentation data. Based on these results, this technique has the potential to become an efficient and competitive cardiac image segmentation tool that may be used for cardiac computer-aided diagnosis, planning, and guidance applications.
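The Dice scores quoted in this abstract measure overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard Dice coefficient on flattened binary masks; the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 labels.
    Returning 1.0 when both masks are empty is an assumed convention.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * intersection / total


# Hypothetical 4-pixel masks: one overlapping pixel, two positives in each mask.
example = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```

A multi-class evaluation such as the one described would typically compute this once per structure (LV blood-pool, RV blood-pool, LV myocardium) on the corresponding binary mask.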

Submitted 21 April, 2020; originally announced April 2020.

Comments: 6 pages, 5 figures, IEEE Conference. arXiv admin note: text overlap with arXiv:2004.02249

arXiv:2004.02249 [pdf, other]

CondenseUNet: A Memory-Efficient Condensely-Connected Architecture for Bi-ventricular Blood Pool and Myocardium Segmentation

Authors: S. M. Kamrul Hasan, Cristian A. Linte

Abstract: With the advent of Cardiac Cine Magnetic Resonance (CMR) Imaging, there has been a paradigm shift in medical technology, thanks to its capability of imaging different structures within the heart without ionizing radiation. However, it is very challenging to conduct pre-operative planning of minimally invasive cardiac procedures without accurate segmentation and identification of the left ventricle (LV), right ventricle (RV) blood-pool, and LV-myocardium. Manual segmentation of those structures, nevertheless, is time-consuming and often prone to error and biased outcomes. Hence, automatic and computationally efficient segmentation techniques are paramount. In this work, we propose a novel memory-efficient Convolutional Neural Network (CNN) architecture as a modification of both CondenseNet, as well as DenseNet for ventricular blood-pool segmentation by introducing a bottleneck block and an upsampling path. Our experiments show that the proposed architecture runs on the Automated Cardiac Diagnosis Challenge (ACDC) dataset using half (50%) the memory requirement of DenseNet and one-twelfth (~ 8%) of the memory requirements of U-Net, while still maintaining excellent accuracy of cardiac segmentation. We validated the framework on the ACDC dataset featuring one healthy and four pathology groups whose heart images were acquired throughout the cardiac cycle and achieved the mean dice scores of 96.78% (LV blood-pool), 93.46% (RV blood-pool) and 90.1% (LV-Myocardium).
These results are promising and promote the proposed methods as a competitive tool for cardiac image segmentation and clinical parameter estimation that has the potential to provide fast and accurate results, as needed for pre-procedural planning and/or pre-operative applications.

Submitted 5 April, 2020; originally announced April 2020.

Comments: 7 pages, 3 figures

arXiv:1912.00815 [pdf, other]

Multiframe-based Adaptive Despeckling Algorithm for Ultrasound B-mode Imaging with Superior Edge and Texture

Authors: Jayanta Dey, Md. Kamrul Hasan

Abstract: Removing speckle noise from medical ultrasound images while preserving image features without introducing artifact and distortion is a major challenge in ultrasound image restoration. In this paper, we propose a multiframe-based adaptive despeckling (MADS) algorithm to reconstruct a high-resolution B-mode image from raw radio-frequency (RF) data that is based on a multiple input single output (MISO) model. As a prior step to despeckling, the speckle pattern in each frame is estimated using a novel multiframe-based adaptive approach for ultrasonic speckle noise estimation (MSNE) based on a single input multiple output (SIMO) modeling of consecutive deconvolved ultrasound image frames. The elegance of the proposed despeckling algorithm is that it addresses the despeckling problem by completely following the signal generation model, unlike conventional ad-hoc smoothing or filtering based approaches, and therefore, it is likely to maximally preserve the image features. As deconvolution is a necessary pre-processing step to despeckling, we describe here a 2-D extension of the SIMO model-based 1-D deconvolution method. Finally, a complete framework for the generation of high-resolution ultrasound B-mode images has also been established in this paper. The results show 8.55-15.91 dB, 8.24-14.94 dB improvement in terms of SNR and PSNR, respectively, for simulation data and 2.22-3.17, 13.24-32.85 improvement in terms of NIQE and BRISQUE, respectively, for in-vivo data compared to the traditional despeckling algorithms.
Visual comparison shows superior texture, resolution, details of B-mode images offered by our method compared to those by a commercial scanner, and hence, it may significantly improve the diagnostic quality of ultrasound images.

Submitted 29 September, 2021; v1 submitted 2 December, 2019; originally announced December 2019.

arXiv:1910.02579 [pdf]

A Novel Technique of Noninvasive Hemoglobin Level Measurement Using HSV Value of Fingertip Image

Authors: Md Kamrul Hasan, Nazmus Sakib, Joshua Field, Richard R. Love, Sheikh I. Ahamed

Abstract: Over the last decade, smartphones have changed radically to support us with mHealth technology, cloud computing, and machine learning algorithms. Building on these multifaceted facilities, we present a novel smartphone-based noninvasive hemoglobin (Hb) level prediction model by analyzing hue, saturation and value (HSV) of a fingertip video. Here, we collect 60 videos of 60 subjects from two different locations: Blood Center of Wisconsin, USA and AmaderGram, Bangladesh. We extract red, green, and blue (RGB) pixel intensities of selected images of those videos captured by the smartphone camera with flash on. Then we convert RGB values of selected video frames of a fingertip video into HSV color space and we generate histogram values of these HSV pixel intensities. We average these histogram values of a fingertip video and consider the result as an observation against the gold standard Hb concentration. We generate two input feature matrices based on observation of two different data sets. Partial Least Squares (PLS) algorithm is applied on the input feature matrix. We observe R^2 = 0.95 in both data sets through our research. We analyze our data using Python OpenCV, Matlab, and R statistics tool.
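The feature-extraction steps described in this abstract (RGB to HSV conversion, per-frame histograms, averaging over frames) can be sketched roughly as follows. This is only an illustration using Python's standard `colorsys` module rather than the OpenCV pipeline the authors mention; the function names, the bin count, and the histogram normalisation are assumptions, and the PLS regression stage is not shown.

```python
import colorsys


def hsv_histograms(rgb_pixels, bins=8):
    """Return normalised [H, S, V] histograms for (r, g, b) pixels in 0..255.

    The bin count and normalisation are illustrative assumptions.
    """
    if not rgb_pixels:
        raise ValueError("frame has no pixels")
    hists = [[0] * bins for _ in range(3)]
    for r, g, b in rgb_pixels:
        # colorsys works on floats in [0, 1] and returns (h, s, v) in [0, 1].
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            idx = min(int(value * bins), bins - 1)  # clamp value == 1.0
            hists[channel][idx] += 1
    n = len(rgb_pixels)
    return [[count / n for count in hist] for hist in hists]


def average_histograms(per_frame):
    """Average per-frame [H, S, V] histograms into one feature per video."""
    if not per_frame:
        raise ValueError("no frames given")
    n = len(per_frame)
    bins = len(per_frame[0][0])
    return [
        [sum(frame[c][b] for frame in per_frame) / n for b in range(bins)]
        for c in range(3)
    ]


# Hypothetical two-frame "video", each frame holding a single pure-red pixel.
frames = [hsv_histograms([(255, 0, 0)], bins=4) for _ in range(2)]
video_feature = average_histograms(frames)
```

Each video then contributes one averaged histogram vector, which would form one row of the input feature matrix passed to the regression step.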

Submitted 6 October, 2019; originally announced October 2019.

arXiv:1907.04305 [pdf, other]

DSNet: Automatic Dermoscopic Skin Lesion Segmentation

Authors: Md. Kamrul Hasan, Lavsen Dahal, Prasad N. Samarakoon, Fakrul Islam Tushar, Robert Marti Marly

Abstract: Automatic segmentation of skin lesion is considered a crucial step in Computer Aided Diagnosis (CAD) for melanoma diagnosis. Despite its significance, skin lesion segmentation remains a challenging task due to their diverse color, texture, and indistinguishable boundaries and forms an open problem. Through this study, we present a new and automatic semantic segmentation network for robust skin lesion segmentation named Dermoscopic Skin Network (DSNet). In order to reduce the number of parameters to make the network lightweight, we used depth-wise separable convolution in lieu of standard convolution to project the learned discriminating features onto the pixel space at different stages of the encoder. Additionally, we implemented U-Net and Fully Convolutional Network (FCN8s) to compare against the proposed DSNet. We evaluate our proposed model on two publicly available datasets, namely ISIC-2017 and PH2. The obtained mean Intersection over Union (mIoU) is 77.5 % and 87.0 % respectively for ISIC-2017 and PH2 datasets which outperformed the ISIC-2017 challenge winner by 1.0 % with respect to mIoU. Our proposed network also outperformed U-Net and FCN8s respectively by 3.6 % and 6.8 % with respect to mIoU on the ISIC-2017 dataset. Our network for skin lesion segmentation outperforms other methods and can provide better segmented masks on two different test datasets which can lead to better performance in melanoma detection. Our trained model along with the source code and predicted masks are made publicly available.
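The parameter saving from depth-wise separable convolution mentioned in this abstract follows from simple weight counting. The sketch below compares the weight counts of a standard convolution and a depth-wise separable one; the layer sizes are hypothetical and are not taken from DSNet, and bias terms are ignored for simplicity.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depth-wise separable convolution (bias ignored).

    One k x k filter per input channel (depth-wise step) plus a 1 x 1
    point-wise convolution that mixes channels.
    """
    return kernel * kernel * c_in + c_in * c_out


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
ratio = separable / standard
```

For this assumed layer the separable form needs 8,768 weights against 73,728 for the standard convolution, roughly 12% of the original count.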

Submitted 23 January, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

Comments: 25 pages

arXiv:1903.02189 [pdf, other]

Grid-Connected Emergency Back-Up Power Supply

Authors: Dhiman Chowdhury, Mohammad Sharif Miah, Md. Feroz Hossain, Md. Mostafijur Rahman, Md. Marzan Hossain, Md. Nazim Uddin Sheikh, Md. Mehedi Hasan, Uzzal Sarker, Abu Shahir Md. Khalid Hasan

Abstract: This paper documents a design and modelling of a grid-connected emergency back-up power supply for medium power applications. There are a rectifier-link boost derived battery charging circuit and a 4-switch push-pull power inverter circuit which are controlled by pulse width modulation (PWM) signals. This paper presents a state averaging model and Laplace domain transfer function of the charging circuit and a switching converter model of the power inverter circuit. A changeover relay based transfer switch controls the power flow towards the utility loads. During off-grid situations, loads are fed power by the proposed inverter circuit and during on-grid situations, battery is charged by an ac-link rectifier-fed boost converter. There is a relay switching circuit to control the charging phenomenon of the battery. The proposed design has been simulated in PLECS and the simulation results corroborate the reliability of the presented framework.

Submitted 6 March, 2019; originally announced March 2019.

arXiv:1902.04845 [pdf, other]

SHEAR-net: An End-to-End Deep Learning Approach for Single Push Ultrasound Shear Wave Elasticity Imaging

Authors: Tamim Ahmed, Md. Kamrul Hasan

Abstract: Ultrasound Shear Wave Elastography (USWE) with conventional B-mode imaging demonstrates better performance in lesion segmentation and classification problems. In this article, we propose SHEAR-net, an end-to-end deep neural network, to reconstruct USWE images from tracked tissue displacement data at different time instants induced by a single acoustic radiation force (ARF) with 100% or 50% of the energy in conventional use. The SHEAR-net consists of a localizer called the S-net to first localize the lesion location and then uses recurrent layers to extract temporal correlations from wave patterns using different time frames, and finally, with an estimator, it reconstructs the shear modulus image from the concatenated outputs of S-net and recurrent layers. The network is trained with 800 simulation and a limited number of CIRS tissue mimicking phantom data and is optimized using a multi-task learning loss function where the tasks are: inclusion localization and modulus estimation. The efficacy of the proposed SHEAR-net is extensively evaluated both qualitatively and quantitatively on 125 test set of motion data obtained from simulation and CIRS phantoms. We show that the proposed approach consistently outperforms the current state-of-the-art method and achieves overall 4-5 dB improvement in PSNR and SNR. In addition, an average gain of 0.15 in DSC and SSIM values indicate that the SHEAR-net has a better inclusion coverage area and structural similarity of the two approaches.
The proposed real-time deep learning based technique can accurately estimate shear modulus for a minimum tissue displacement of 0.5 μm and image multiple inclusions with a single push ARF.

Submitted 13 February, 2019; originally announced February 2019.

arXiv:1902.01573 [pdf, other]

Classification of Breast Lesions Using Quantitative Ultrasound Biomarkers

Authors: Navid Ibtehaj Nizam, Sharmin R. Ara, Md. Kamrul Hasan

Abstract: Quantitative ultrasound (QUS) based parameters like the effective scatterer diameter (ESD) and mean scatterer spacing (MSS) are gaining attention recently as non-invasive biomarkers for soft tissue characterization. In this work, we propose a multiple QUS parameter based technique that employs ESD and MSS, for binary classification of breast lesions. In order to produce improved ESD estimates, we propose a modified frequency domain technique for ESD estimation of breast tissues from the diffuse component of backscattered radio-frequency (RF) data. Ensemble empirical mode decomposition (EEMD) is performed to separate the diffuse component from the coherent component by decomposing the RF data into their intrinsic mode functions (IMFs). A non-parametric Kolmogorov-Smirnov (K-S) test is employed for automatic IMF selection along with a multi-step system effect minimization process. The ESD is estimated using a nearest neighborhood average regression line fitting algorithm. Furthermore, we use an ameliorated EEMD domain autoregressive (AR) spectral estimation technique for MSS estimation. On using the ESD for binary classification of 159 lesions, we obtain high sensitivity, specificity, accuracy values of 91.07%, 96.12%, and 94.34%, respectively, with an area under the receiver operating characteristics (ROC) curve of 0.94. On combining ESD with MSS we obtain even more improved sensitivity, specificity, and accuracy values of 96.43%, 95.15%, and 95.60%, respectively, with an area under the ROC of 0.96.
Such a high classification performance highlights the potential of these QUS parameters to be used as non-invasive biomarkers for breast cancer detection.

Submitted 9 July, 2019; v1 submitted 5 February, 2019; originally announced February 2019.

arXiv:1902.00694 [pdf, other]

RemNet: Remnant Convolutional Neural Network for Camera Model Identification

Authors: Abdul Muntakim Rafi, Thamidul Islam Tonmoy, Uday Kamal, Q. M. Jonathan Wu, Md. Kamrul Hasan

Abstract: Camera model identification (CMI) has gained significant importance in image forensics as digitally altered images are becoming increasingly commonplace. In this paper, a novel convolutional neural network (CNN) architecture is proposed for CMI with emphasis given on the preprocessing task considered to be inevitable for removing the scene content that heavily obscures the camera model fingerprints. Unlike the conventional approaches where fixed filters are used for preprocessing, the proposed remnant blocks, when coupled with a classification block and trained end-to-end minimizing the classification loss, learn to suppress the unnecessary image contents dynamically. This helps the classification block extract more robust camera model-specific features for CMI from the remnant of the image. The whole network, called RemNet, consisting of a preprocessing block and a shallow classification block, when trained on 18 models from the Dresden database, shows 100% accuracy for 16 camera models with an overall accuracy of 97.59% on test images from unseen devices, outperforming the state of the art deep CNNs used in CMI. Furthermore, the proposed remnant blocks, when cascaded with the existing deep CNNs, e.g., ResNet, DenseNet, boost their performances by a large margin. The proposed approach proves to be very robust in identifying the source camera models, even if the original images are post-processed. It also achieves an overall accuracy of 95.11% on the IEEE Signal Processing Cup 2018 dataset, which indicates its generalizability.

Submitted 27 June, 2020; v1 submitted 2 February, 2019; originally announced February 2019.

arXiv:1812.01951 [pdf, other]

Lung Cancer Tumor Region Segmentation Using Recurrent 3D-DenseUNet

Authors: Uday Kamal, Abdul Muntakim Rafi, Rakibul Hoque, Jonathan Wu, Md. Kamrul Hasan

Abstract: The performance of a computer-aided automated diagnosis system of lung cancer from Computed Tomography (CT) volumetric images greatly depends on the accurate detection and segmentation of tumor regions. In this paper, we present Recurrent 3D-DenseUNet, a novel deep learning based architecture for volumetric lung tumor segmentation from CT scans. The proposed architecture consists of a 3D encoder block that learns to extract fine-grained spatial and coarse-grained temporal features, a recurrent block of multiple Convolutional Long Short-Term Memory (ConvLSTM) layers to extract fine-grained spatio-temporal information, and finally a 3D decoder block to reconstruct the desired volume segmentation masks from the latent feature space. The encoder and decoder blocks consist of several 3D-convolutional layers that are densely connected among themselves so that necessary feature aggregation can occur throughout the network. During prediction, we apply selective thresholding followed by morphological operation, on top of the network prediction, to better differentiate between tumorous and non-tumorous image-slices, which shows more promise than only thresholding-based approaches. We train and test our network on the NSCLC-Radiomics dataset of 300 patients, provided by The Cancer Imaging Archive (TCIA) for the 2018 IEEE VIP Cup. Moreover, we perform an extensive ablation study of different loss functions in practice for this task. The proposed network outperforms other state-of-the-art 3D segmentation architectures with an average dice score of 0.7228.

Submitted 8 September, 2020; v1 submitted 5 December, 2018; originally announced December 2018.

arXiv:1811.02214 [pdf, other]

Cuffless Blood Pressure Estimation from Electrocardiogram and Photoplethysmogram Using Waveform Based ANN-LSTM Network

Authors: Md. Sayed Tanveer, Md. Kamrul Hasan

Abstract: Goal: Although photoplethysmogram (PPG) and electrocardiogram (ECG) signals can be used to estimate blood pressure (BP) by extracting various features, the changes in morphological contours of both PPG and ECG signals due to various diseases of circulatory system and interaction of other physiological systems make the extraction of such features very difficult. Methods: In this work, we propose a waveform-based hierarchical Artificial Neural Network - Long Short Term Memory (ANN-LSTM) model for BP estimation. The model consists of two hierarchy levels, where the lower hierarchy level uses ANNs to extract necessary morphological features from ECG and PPG waveforms and the upper hierarchy level uses LSTM layers to account for the time domain variation of the features extracted by lower hierarchy level. Results: The proposed model is evaluated on 39 subjects using the Association for the Advancement of Medical Instrumentation (AAMI) standard and the British Hypertension Society (BHS) standard. The method satisfies both the standards in the estimation of systolic blood pressure (SBP) and diastolic blood pressure (DBP). For the proposed network, the mean absolute error (MAE) and the root mean square error (RMSE) for SBP estimation are 1.10 and 1.56 mmHg, respectively, and for DBP estimation are 0.58 and 0.85 mmHg, respectively. Conclusion: The performance of the proposed hierarchical ANN-LSTM model is found to be better than the other feature engineering-based networks.
It is shown that the proposed model is able to automatically extract the necessary features and their time domain variations to estimate BP reliably in a noninvasive continuous manner. Significance: The method is expected to greatly facilitate the presently available mobile health-care gadgets in continuous BP estimation.
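The two error metrics reported in this abstract, MAE and RMSE, can be computed as in the short sketch below. The function names and the sample pressure values are hypothetical and serve only to illustrate the formulas; they are not data from the paper.

```python
import math


def mean_absolute_error(predicted, reference):
    """MAE: mean of |predicted - reference| over paired readings."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def root_mean_square_error(predicted, reference):
    """RMSE: square root of the mean squared difference."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Hypothetical systolic readings in mmHg (illustrative values only).
estimated_sbp = [120.0, 130.0]
reference_sbp = [118.0, 133.0]
mae = mean_absolute_error(estimated_sbp, reference_sbp)
rmse = root_mean_square_error(estimated_sbp, reference_sbp)
```

RMSE weights large deviations more heavily than MAE, so it is never smaller than MAE on the same data, which is consistent with the pairs of figures quoted in the abstract.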

Submitted 6 November, 2018; originally announced November 2018.

arXiv:1809.00576 [pdf, other]

Application of DenseNet in Camera Model Identification and Post-processing Detection

Authors: Abdul Muntakim Rafi, Uday Kamal, Rakibul Hoque, Abid Abrar, Sowmitra Das, Robert Laganière, Md. Kamrul Hasan

Abstract: Camera model identification has earned paramount importance in the field of image forensics with an upsurge of digitally altered images which are constantly being shared through websites, media, and social applications. However, the task of identification becomes quite challenging if metadata are absent from the image and/or if the image has been post-processed. In this paper, we present a DenseNet pipeline to solve the problem of identifying the source camera model of an image. Our approach is to extract patches of 256×256 from a labeled image dataset and apply augmentations, i.e., Empirical Mode Decomposition (EMD). We use this extended dataset to train a neural network with the DenseNet-201 architecture. We concatenate the output features for 3 different sizes (64×64, 128×128, 256×256) and pass them to a secondary network to make the final prediction. This strategy proves to be very robust for identifying the source camera model, even when the original image is post-processed. Our model has been trained and tested on the Forensic Camera-Model Identification Dataset provided for the IEEE Signal Processing (SP) Cup 2018. During testing we achieved an overall accuracy of 98.37%, which is the current state-of-the-art on this dataset using a single model. We used transfer learning and tested our model on the Dresden Database for Camera Model Identification, with an overall test accuracy of over 99% for 19 models. In addition, we demonstrate that the proposed pipeline is suitable for other image-forensic classification tasks, such as detecting the type of post-processing applied to an image with an accuracy of 96.66%, which indicates the generality of our approach.
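The patch sizes named in the abstract can be illustrated with a minimal sketch. It only computes patch coordinates; the function names, the non-overlapping stride, and the choice to take the 128×128 and 64×64 regions as centered crops of each 256×256 patch are assumptions made for illustration, since the abstract does not state how the smaller patches are obtained.

```python
def patch_boxes(height, width, size=256, stride=None):
    """Return (top, left, bottom, right) boxes for size x size patches.

    Assumption: patches do not overlap unless a smaller stride is given.
    """
    stride = stride or size
    return [
        (top, left, top + size, left + size)
        for top in range(0, height - size + 1, stride)
        for left in range(0, width - size + 1, stride)
    ]


def centered_sub_box(box, size):
    """Return the size x size box centered inside a larger box (hypothetical helper)."""
    top, left, bottom, right = box
    off_y = (bottom - top - size) // 2
    off_x = (right - left - size) // 2
    return (top + off_y, left + off_x, top + off_y + size, left + off_x + size)


# Boxes for a hypothetical 512 x 768 image, plus the two smaller scales
# derived from the first 256 x 256 patch.
boxes = patch_boxes(512, 768)
scales = [centered_sub_box(boxes[0], s) for s in (64, 128, 256)]
```

Each box would be used to slice the image before passing the crop to the network; the feature concatenation and secondary network described in the abstract are not shown.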

Submitted 27 May, 2019; v1 submitted 3 September, 2018; originally announced September 2018.

arXiv:1801.00907 [pdf, other]

Performance Analysis of a Scalable DC Microgrid Offering Solar Power Based Energy Access and Efficient Control for Domestic Loads

Authors: Abu Shahir Md. Khalid Hasan, Dhiman Chowdhury, Mohammad Ziaur Rahman Khan

Abstract: DC microgrids conform to distributed control of renewable energy sources, which enables effective instantaneous power sharing and sustained energy access among different domestic Power Management Units (PMUs) while maintaining stability of the grid voltage. In this paper, design metrics and performance evaluation of a scalable DC microgrid are documented, where a look-up table of the generated power of a source converter governs efficient power sharing among a set of two home PMUs. The source converter is connected to a photovoltaic panel of 300 W and uses the Perturb and Observe (P&O) method for Maximum Power Point Tracking (MPPT). An averaged boost DC-DC converter topology is used to raise the voltage level of the source converter before transmission. The load converter consists of two parallel-connected PMUs, each of which is constructed with a high-switching-frequency Full Bridge (FB) converter to charge an integrated Energy Storage System (ESS). The overall system is modeled and simulated on the MATLAB/Simulink platform with ESSs in the form of lead-acid batteries connected to the load side of the FB converter circuits, and these batteries support marginalized power utilities. The behaviour of the system is tested at different solar insolation levels along with battery charging levels of 12 V and 36 V to assess the power efficiency. In each testbed the efficiency is found to be more than 93%, which affirms the reliability of the framework, and a look-up table is generated comprising the grid and load quantities for effective control of power transmission.
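The P&O method mentioned in the abstract is a standard hill-climbing scheme, and a minimal sketch of its control loop is given here. The function names, the fixed step size, and the toy power-voltage curve (peaking at 300 W at 30 V) are assumptions for illustration only; they are not taken from the paper or its Simulink model.

```python
def perturb_and_observe(measure_power, v_start, step, iterations):
    """Hill-climb the operating voltage toward the maximum power point."""
    voltage = v_start
    power = measure_power(voltage)
    direction = 1  # +1 raises the voltage, -1 lowers it
    for _ in range(iterations):
        voltage += direction * step
        new_power = measure_power(voltage)
        if new_power < power:
            direction = -direction  # power fell, so reverse the perturbation
        power = new_power
    return voltage, power


def toy_panel_power(voltage):
    """Hypothetical concave P-V curve with its peak of 300 W at 30 V."""
    return max(0.0, 300.0 - 0.5 * (voltage - 30.0) ** 2)


v_final, p_final = perturb_and_observe(toy_panel_power, v_start=20.0, step=0.5, iterations=100)
```

With a fixed step the operating point does not settle exactly; it oscillates within one step of the peak, which is the usual trade-off between tracking speed and steady-state ripple in P&O.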

Submitted 3 January, 2018; originally announced January 2018.

arXiv:1712.01015 [pdf, other]

doi 10.1088/2057-1976/aaca00

Ultrasonic Tissue Reflectivity Function Estimation Using Correlation Constrained Multichannel FLMS Algorithm with Missing RF Data

Authors: Jayanta Dey, Md. Kamrul Hasan

Abstract: Poor resolution of ultrasound images due to convolution of the tissue reflectivity function (TRF) with the system point spread function (PSF) is a major issue in medical ultrasound imaging. In this paper, we propose a correlation constrained missing-data estimation based blind multichannel frequency-domain least-mean-squares (md-bMCFLMS) algorithm to undo the effect of PSF on the ultrasound radio-frequency (RF) data. In the first step, a block-based MCFLMS (bMCFLMS) algorithm is proposed to estimate the TRFs and the PSF, which are used in the second step to estimate the missing data. This missing data is used in the md-bMCFLMS algorithm to construct a modified cost function for further improvement of the image resolution. To account for the nonstationarity of the PSF, unlike the blocking approach described in the literature, we introduce a time-efficient blocking method in this paper. The blocking approach described here uses a block-position-independent fixed-size matrix and can be implemented in parallel. The bMCFLMS algorithm, however, shows misconvergence due to both channel noise and propagation of TRF estimation error from the previous blocks. This phenomenon is more intense in the case of the md-bMCFLMS algorithm because of increased estimation error. To address this problem, a novel constraint based on the correlation between the measured RF data and estimated TRF is proposed in this paper. The efficacy of our proposed blind deconvolution algorithm is measured using simulation phantom, experimental phantom and in-vivo data.
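The forward model the abstract starts from, RF data as the TRF convolved with the PSF, can be sketched in a few lines. This shows only the blurring that the deconvolution is meant to undo, not the md-bMCFLMS algorithm itself; the sample values and the `convolve` helper are hypothetical.

```python
def convolve(signal, kernel):
    """Full discrete convolution of two 1-D sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


# Hypothetical sparse reflectivity with two scatterers, and a short pulse as PSF.
trf = [0.0, 1.0, 0.0, 0.0, -0.5, 0.0]
psf = [0.5, 1.0, 0.5]

# Each scatterer is smeared over the width of the pulse in the RF line.
rf = convolve(trf, psf)
```

Blind deconvolution has to recover both `trf` and `psf` from measurements like `rf` alone, which is why the abstract relies on multiple channels and an added correlation constraint.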

Submitted 4 December, 2017; originally announced December 2017.

Showing 1–31 of 31 results for author: Hasan, M K