Search | arXiv e-print repository

Enhanced In-Flight Connectivity for Urban Air Mobility via LEO Satellite Networks

Authors: Karnika Biswas, Hakim Ghazzai, Abdullah Khanfor, Lokman Sboui

Abstract: Urban Air Mobility (UAM) is the envisioned future of inter-city aerial transportation. This paper presents a novel in-flight connectivity link allocation method for UAM, which dynamically switches between terrestrial cellular and Low Earth Orbit (LEO) satellite networks based on real-time conditions. Our approach prefers cellular networks for cost efficiency, switching to LEO satellites under poor cellular conditions to ensure continuous UAM connectivity. By integrating real-time metrics like signal strength, network congestion, and flight trajectory into the selection process, our algorithm effectively balances cost, minimum data rate requirements, and continuity of communication. Numerical results validate minimization of data loss while ensuring an optimal selection from the set of available above-threshold data rates at every time sample. Furthermore, insights derived from our study emphasize the importance of hybrid connectivity solutions in ensuring seamless, uninterrupted communication for future urban aerial vehicles.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 6 pages, 6 figures, conference
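
The selection rule described in the abstract above (prefer the cheaper cellular link, fall back to LEO when the cellular rate drops below the required minimum) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the `Link` class, its fields, the cost values, and the threshold are all hypothetical, and the real method also weighs congestion and flight trajectory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    name: str         # hypothetical label, e.g. "cellular" or "leo"
    rate_mbps: float  # predicted data rate at the current time sample
    cost: float       # assumed relative cost of using the link


def select_link(links: List[Link], min_rate_mbps: float) -> Optional[Link]:
    """Return the cheapest link whose rate meets the threshold, else None."""
    usable = [link for link in links if link.rate_mbps >= min_rate_mbps]
    if not usable:
        return None  # no link satisfies the minimum rate: data loss
    # Cheapest first; among equally cheap links, prefer the faster one.
    return min(usable, key=lambda link: (link.cost, -link.rate_mbps))
```

Calling `select_link` once per time sample along the trajectory gives a per-sample allocation; a `None` result marks a sample where the minimum rate cannot be met.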

arXiv:2405.13901 [pdf, other]

DCT-Based Decorrelated Attention for Vision Transformers

Authors: Hongyi Pan, Emadeldeen Hamdan, Xin Zhu, Koushik Biswas, Ahmet Enis Cetin, Ulas Bagci

Abstract: Central to the Transformer architectures' effectiveness is the self-attention mechanism, a function that maps queries, keys, and values into a high-dimensional vector space. However, training the attention weights of queries, keys, and values is non-trivial from a state of random initialization. In this paper, we propose two methods. (i) We first address the initialization problem of Vision Transformers by introducing a simple, yet highly innovative, initialization approach utilizing Discrete Cosine Transform (DCT) coefficients. Our proposed DCT-based attention initialization marks a significant gain compared to traditional initialization strategies, offering a robust foundation for the attention mechanism. Our experiments reveal that the DCT-based initialization enhances the accuracy of Vision Transformers in classification tasks. (ii) We also recognize that since DCT effectively decorrelates image information in the frequency domain, this decorrelation is useful for compression because it allows the quantization step to discard many of the higher-frequency components. Based on this observation, we propose a novel DCT-based compression technique for the attention function of Vision Transformers. Since high-frequency DCT coefficients usually correspond to noise, we truncate the high-frequency DCT components of the input patches. Our DCT-based compression reduces the size of weight matrices for queries, keys, and values. While maintaining the same level of accuracy, our DCT compressed Swin Transformers obtain a considerable decrease in the computational overhead.

Submitted 28 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.
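
The truncation idea in part (ii) of the abstract above can be illustrated in one dimension: transform a signal with an orthonormal DCT-II, zero the high-frequency coefficients, and invert. This is only a sketch under simplifying assumptions; the paper works on image patches inside the attention function, and the function names below are hypothetical.

```python
import math
from typing import List


def _scale(k: int, n: int) -> float:
    # Orthonormal DCT-II scaling factor for coefficient index k.
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct_ii(x: List[float]) -> List[float]:
    """Orthonormal DCT-II of a 1D signal (direct O(n^2) evaluation)."""
    n = len(x)
    return [
        _scale(k, n) * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def idct_ii(coeffs: List[float]) -> List[float]:
    """Inverse of dct_ii (an orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
        for i in range(n)
    ]


def truncate_high_freq(coeffs: List[float], keep: int) -> List[float]:
    """Keep the first `keep` (lowest-frequency) coefficients, zero the rest."""
    return [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
```

Keeping only the first coefficient reconstructs the mean of the signal; keeping more coefficients restores progressively finer detail.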

arXiv:2405.06166 [pdf, other]

MDNet: Multi-Decoder Network for Abdominal CT Organs Segmentation

Authors: Debesh Jha, Nikhil Kumar Tomar, Koushik Biswas, Gorkem Durak, Matthew Antalek, Zheyuan Zhang, Bin Wang, Md Mostafijur Rahman, Hongyi Pan, Alpay Medetalibeyoglu, Yury Velichko, Daniela Ladner, Amir Borhani, Ulas Bagci

Abstract: Accurate segmentation of organs from abdominal CT scans is essential for clinical applications such as diagnosis, treatment planning, and patient monitoring. To handle challenges of heterogeneity in organ shapes, sizes, and complex anatomical relationships, we propose a Multi-Decoder Network (MDNet), an encoder-decoder network that uses the pre-trained MiT-B2 as the encoder and multiple different decoder networks. Each decoder network is connected to a different part of the encoder via a multi-scale feature enhancement dilated block. With each decoder, we increase the depth of the network iteratively and refine segmentation masks, enriching feature maps by integrating previous decoders' feature maps. To refine the feature map further, we also pass the predicted masks from the previous decoder to the current decoder to provide spatial attention across foreground and background regions. MDNet effectively refines the segmentation mask with a high dice similarity coefficient (DSC) of 0.9013 and 0.9169 on the Liver Tumor segmentation (LiTS) and MSD Spleen datasets, respectively. Additionally, it reduces Hausdorff distance (HD) to 3.79 for the LiTS dataset and 2.26 for the spleen segmentation dataset, underscoring the precision of MDNet in capturing complex contours. Moreover, MDNet is more interpretable and robust compared to the other baseline models.

Submitted 9 May, 2024; originally announced May 2024.
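
Several entries in this listing report the dice similarity coefficient (DSC) and intersection over union (IoU). For a single pair of binary masks the two can be computed as in the sketch below. The function name and the flat 0/1 list representation are assumptions for illustration, and the papers' reported figures are dataset-level averages rather than single-mask values.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Dice and IoU for two flattened binary masks (entries are 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    union = total - intersection
    # Two empty masks agree perfectly, so both scores are defined as 1.0.
    dice = 2.0 * intersection / total if total else 1.0
    iou = intersection / union if union else 1.0
    return dice, iou
```

For one mask pair the scores are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.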

arXiv:2405.01503 [pdf, other]

PAM-UNet: Shifting Attention on Region of Interest in Medical Images

Authors: Abhijit Das, Debesh Jha, Vandan Gorade, Koushik Biswas, Hongyi Pan, Zheyuan Zhang, Daniela P. Ladner, Yury Velichko, Amir Borhani, Ulas Bagci

Abstract: Computer-aided segmentation methods can assist medical personnel in improving diagnostic outcomes. While recent advancements like UNet and its variants have shown promise, they face a critical challenge: balancing accuracy with computational efficiency. Shallow encoder architectures in UNets often struggle to capture crucial spatial features, leading to inaccurate and sparse segmentation. To address this limitation, we propose a novel Progressive Attention based Mobile UNet (PAM-UNet) architecture. The inverted residual (IR) blocks in PAM-UNet help maintain a lightweight framework, while layerwise Progressive Luong Attention ($\mathcal{PLA}$) promotes precise segmentation by directing attention toward regions of interest during synthesis. Our approach prioritizes both accuracy and speed, achieving a commendable balance with a mean IoU of 74.65 and a dice score of 82.87, while requiring only 1.32 floating-point operations per second (FLOPS) on the Liver Tumor Segmentation Benchmark (LiTS) 2017 dataset. These results highlight the importance of developing efficient segmentation models to accelerate the adoption of AI in clinical practice.

Submitted 2 May, 2024; originally announced May 2024.

Comments: Accepted at 2024 IEEE EMBC

arXiv:2404.17064 [pdf, other]

Detection of Peri-Pancreatic Edema using Deep Learning and Radiomics Techniques

Authors: Ziliang Hong, Debesh Jha, Koushik Biswas, Zheyuan Zhang, Yury Velichko, Cemal Yazici, Temel Tirkes, Amir Borhani, Baris Turkbey, Alpay Medetalibeyoglu, Gorkem Durak, Ulas Bagci

Abstract: Peri-pancreatic edema is a pivotal indicator of disease progression and prognosis, emphasizing the critical need for accurate detection and assessment in pancreatitis diagnosis and management. This study introduces a novel CT dataset sourced from 255 patients with pancreatic diseases, featuring annotated pancreas segmentation masks and corresponding diagnostic labels for peri-pancreatic edema condition. With the novel dataset, we first evaluate the efficacy of the LinTransUNet model, a linear Transformer based segmentation algorithm, to segment the pancreas accurately from CT imaging data. Then, we use segmented pancreas regions with two distinctive machine learning classifiers to identify the existence of peri-pancreatic edema: deep learning-based models and a radiomics-based eXtreme Gradient Boosting (XGBoost). The LinTransUNet achieved promising results, with a dice coefficient of 80.85% and mIoU of 68.73%. Among the nine benchmarked classification models for peri-pancreatic edema detection, the Swin-Tiny transformer model demonstrated the highest recall of $98.85 \pm 0.42$ and precision of $98.38 \pm 0.17$. Comparatively, the radiomics-based XGBoost model achieved an accuracy of $79.61 \pm 4.04$ and recall of $91.05 \pm 3.28$, showcasing its potential as a supplementary diagnostic tool given its rapid processing speed and reduced training time. Our code is available at https://github.com/NUBagciLab/Peri-Pancreatic-Edema-Detection.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2403.06112 [pdf, other]

Decentralized P2P Trading based on Blockchain for Retail Electricity Markets

Authors: Masoud H. Nazari, Antar Kumar Biswas

Abstract: This paper introduces peer-to-peer (P2P) trading mechanisms based on decentralized Blockchain to facilitate a retail electricity market for ever-increasing distributed energy resources (DERs). The Blockchain network supports fast and secure retail trading among DERs and facilitates a sustainable local P2P trading platform. In this decentralized Blockchain architecture, no single entity or organization has control over the entire system; rather, all users collectively maintain control. The effectiveness of the proposed automated market design and optimization is simulated using different use case scenarios in an open source Blockchain Simulator and MATLAB. The results show the efficacy of the trading mechanism in achieving demand response through strategies such as peak load shaving, load shifting, and integration of DERs.

Submitted 10 March, 2024; originally announced March 2024.

arXiv:2401.09630 [pdf, other]

CT Liver Segmentation via PVT-based Encoding and Refined Decoding

Authors: Debesh Jha, Nikhil Kumar Tomar, Koushik Biswas, Gorkem Durak, Alpay Medetalibeyoglu, Matthew Antalek, Yury Velichko, Daniela Ladner, Amir Borhani, Ulas Bagci

Abstract: Accurate liver segmentation from CT scans is essential for effective diagnosis and treatment planning. Computer-aided diagnosis systems promise to improve the precision of liver disease diagnosis, disease progression, and treatment planning. In response to the need, we propose a novel deep learning approach, PVTFormer, that is built upon a pretrained pyramid vision transformer (PVT v2) combined with advanced residual upsampling and decoder block. By integrating a refined feature channel approach with a hierarchical decoding strategy, PVTFormer generates high quality segmentation masks by enhancing semantic features. Rigorous evaluation of the proposed method on Liver Tumor Segmentation Benchmark (LiTS) 2017 demonstrates that our proposed architecture not only achieves a high dice coefficient of 86.78% and mIoU of 78.46%, but also obtains a low HD of 3.50. The results underscore PVTFormer's efficacy in setting a new benchmark for state-of-the-art liver segmentation methods. The source code of the proposed PVTFormer is available at https://github.com/DebeshJha/PVTFormer.

Submitted 20 April, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

arXiv:2312.11480 [pdf, other]

Adaptive Smooth Activation for Improved Disease Diagnosis and Organ Segmentation from Radiology Scans

Authors: Koushik Biswas, Debesh Jha, Nikhil Kumar Tomar, Gorkem Durak, Alpay Medetalibeyoglu, Matthew Antalek, Yury Velichko, Daniela Ladner, Amir Bohrani, Ulas Bagci

Abstract: In this study, we propose a new activation function, called Adaptive Smooth Activation Unit (ASAU), tailored for optimized gradient propagation, thereby enhancing the proficiency of convolutional networks in medical image analysis. We apply this new activation function to two important and commonly used general tasks in medical image analysis: automatic disease diagnosis and organ segmentation in CT and MRI. Our rigorous evaluation on the RadImageNet abdominal/pelvis (CT and MRI) dataset and Liver Tumor Segmentation Benchmark (LiTS) 2017 demonstrates that our ASAU-integrated frameworks not only achieve a substantial (4.80%) improvement over ReLU in classification accuracy (disease detection) on abdominal CT and MRI but also achieve a 1%-3% improvement in dice coefficient compared to widely used activations for 'healthy liver tissue' segmentation. These improvements offer new baselines for developing a diagnostic tool, particularly for complex, challenging pathologies. The superior performance and adaptability of ASAU highlight its potential for integration into a wide range of image classification and segmentation tasks.

Submitted 29 November, 2023; originally announced December 2023.

arXiv:2310.11651 [pdf, other]

US Microelectronics Packaging Ecosystem: Challenges and Opportunities

Authors: Rouhan Noor, Himanandhan Reddy Kottur, Patrick J Craig, Liton Kumar Biswas, M Shafkat M Khan, Nitin Varshney, Hamed Dalir, Elif Akçalı, Bahareh Ghane Motlagh, Charles Woychik, Yong-Kyu Yoon, Navid Asadizanjani

Abstract: The semiconductor industry is experiencing a significant shift from traditional methods of shrinking devices and reducing costs. Chip designers actively seek new technological solutions to enhance cost-effectiveness while incorporating more features into the silicon footprint. One promising approach is Heterogeneous Integration (HI), which involves advanced packaging techniques to integrate independently designed and manufactured components using the most suitable process technology. However, adopting HI introduces design and security challenges. To enable HI, research and development of advanced packaging is crucial. Existing research raises possible security threats in the advanced packaging supply chain, as most of the Outsourced Semiconductor Assembly and Test (OSAT) facilities/vendors are offshore. To deal with the increasing demand for semiconductors and to ensure a secure semiconductor supply chain, there are sizable efforts from the United States (US) government to bring semiconductor fabrication facilities onshore. However, the US-based advanced packaging capabilities must also be ramped up to fully realize the vision of establishing a secure, efficient, resilient semiconductor supply chain. Our effort is motivated by the need to identify the possible bottlenecks and weak links in the advanced packaging supply chain based in the US.

Submitted 30 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: 22 pages, 8 figures

arXiv:2307.16024 [pdf]

A Local Measurement Based Comprehensive Protection Scheme for AC Microgrid

Authors: Sindhura Gupta, Susovan Mukhopadhyay, Ambarnath Banerji, Sujit K. Biswas, Prasun Sanki

Abstract: The popularity of low-voltage ac distribution networks is increasing day by day. However, an efficient protection scheme for low-voltage ac distribution systems is still challenging. This paper introduces a protection scheme suitable for low-voltage grid connected and islanded ac microgrids based on local measurements in order to locate, identify and isolate faults. Here, a current decomposition method is specifically incorporated to accomplish fault type identification. The MATLAB/SIMULINK platform is chosen to examine the performance of the proposed scheme in both grid connected and islanded low-voltage ac microgrids. The feasibility of the protection scheme is extensively investigated by simulating all types of faults with substantial variations such as different fault locations and different fault resistances. The test results show that the proposed protection scheme is sufficiently reliable and fast for providing complete protection in low-voltage ac microgrids.

Submitted 29 July, 2023; originally announced July 2023.

Comments: International Conference Energy Systems, Drives, Power Electronics, Measurement and Sensors (ESDPEMS 2023)(Hybrid Mode) Organized By Department of Electrical Engineering Narula Institute of Technology, Kolkata

arXiv:2211.01618 [pdf, other]

Self Supervised Low Dose Computed Tomography Image Denoising Using Invertible Network Exploiting Inter Slice Congruence

Authors: Sutanu Bera, Prabir Kumar Biswas

Abstract: The resurgence of deep neural networks has created an alternative pathway for low-dose computed tomography denoising by learning a nonlinear transformation function between low-dose CT (LDCT) and normal-dose CT (NDCT) image pairs. However, those paired LDCT and NDCT images are rarely available in the clinical environment, making deep neural network deployment infeasible. This study proposes a novel method for self-supervised low-dose CT denoising to alleviate the requirement of paired LDCT and NDCT images. Specifically, we have trained an invertible neural network to minimize the pixel-based mean square distance between a noisy slice and the average of its two immediate adjacent noisy slices. We have shown the aforementioned is similar to training a neural network to minimize the distance between clean NDCT and noisy LDCT image pairs. Again, during the reverse mapping of the invertible network, the output image is mapped to the original input image, similar to cycle consistency loss. Finally, the trained invertible network's forward mapping is used for denoising LDCT images. Extensive experiments on two publicly available datasets showed that our method performs favourably against other existing unsupervised methods.

Submitted 3 November, 2022; originally announced November 2022.

Comments: 10 pages, Accepted in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023
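
The self-supervised objective described in the abstract above compares a slice with the average of its two neighbouring noisy slices. A minimal sketch of that per-slice loss follows. It assumes that the network output for the centre slice is what gets compared with the neighbour average, uses a plain callable `denoise` as a stand-in for the invertible network, and represents each slice as a flat list of pixel values. None of these details come from the paper.

```python
from typing import Callable, List

Slice = List[float]


def neighbor_average_mse(volume: List[Slice], i: int,
                         denoise: Callable[[Slice], Slice]) -> float:
    """MSE between denoise(volume[i]) and the mean of slices i-1 and i+1."""
    if not 0 < i < len(volume) - 1:
        raise IndexError("slice i needs a neighbour on both sides")
    # Pixel-wise average of the two immediately adjacent noisy slices.
    target = [(a + b) / 2.0 for a, b in zip(volume[i - 1], volume[i + 1])]
    output = denoise(volume[i])
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)
```

Because the target is built only from noisy slices, no normal-dose image is needed to evaluate the loss.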

arXiv:2207.11894 [pdf, other]

Sub-Aperture Feature Adaptation in Single Image Super-resolution Model for Light Field Imaging

Authors: Aupendu Kar, Suresh Nehra, Jayanta Mukhopadhyay, Prabir Kumar Biswas

Abstract: With the availability of commercial Light Field (LF) cameras, LF imaging has emerged as an up and coming technology in computational photography. However, the spatial resolution is significantly constrained in commercial microlens based LF cameras because of the inherent multiplexing of spatial and angular information. Therefore, it becomes the main bottleneck for other applications of light field cameras. This paper proposes an adaptation module in a pretrained Single Image Super Resolution (SISR) network to leverage the powerful SISR model instead of using highly engineered light field imaging domain specific Super Resolution models. The adaptation module consists of a Sub aperture Shift block and a fusion block. It is an adaptation in the SISR network to further exploit the spatial and angular information in LF images to improve the super resolution performance. Experimental validation shows that the proposed method outperforms existing light field super resolution algorithms. It also achieves PSNR gains of more than 1 dB across all the datasets as compared to the same pretrained SISR models for scale factor 2, and PSNR gains of 0.6 to 1 dB for scale factor 4.

Submitted 26 July, 2022; v1 submitted 24 July, 2022; originally announced July 2022.

Comments: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:2103.15903 [pdf, other]

Iterative Gradient Encoding Network with Feature Co-Occurrence Loss for Single Image Reflection Removal

Authors: Sutanu Bera, Prabir Kumar Biswas

Abstract: Removing undesired reflections from a photo taken in front of glass is of great importance for enhancing visual computing systems' efficiency. Previous learning-based approaches have produced visually plausible results for some reflection types but failed to generalize to other reflection types. There is a dearth of literature on efficient methods for single image reflection removal that generalize well across a large range of reflection types. In this study, we proposed an iterative gradient encoding network for single image reflection removal. Next, to further supervise the network in learning the correlation between the transmission layer features, we proposed a feature co-occurrence loss. Extensive experiments on the public benchmark dataset of SIR$^2$ demonstrated that our method can remove reflection favorably against the existing state-of-the-art method on all imaging settings, including diverse backgrounds. Moreover, as the reflection strength increases, our method can still remove reflection even where other state-of-the-art methods failed.

Submitted 29 March, 2021; originally announced March 2021.

Comments: Submitted to IEEE International Conference of Image Processing (ICIP)

arXiv:2101.06517 [pdf, other]

A Novel Approach for Earthquake Early Warning System Design using Deep Learning Techniques

Authors: Tonumoy Mukherjee, Chandrani Singh, Prabir Kumar Biswas

Abstract: Earthquake signals are non-stationary in nature, and thus it is difficult to identify and classify events in real time based on classical approaches like peak ground displacement and peak ground velocity. Even the popular STA/LTA algorithm requires extensive research to determine basic thresholding parameters so as to trigger an alarm. Also, many times due to human error or other unavoidable natural factors such as thunder strikes or landslides, the algorithm may end up raising a false alarm. This work focuses on detecting earthquakes by converting seismograph recorded data into corresponding audio signals for better perception and then uses popular Speech Recognition techniques of Filter bank coefficients and Mel Frequency Cepstral Coefficients (MFCC) to extract the features. These features were then used to train a Convolutional Neural Network (CNN) and a Long Short Term Memory (LSTM) network. The proposed method can overcome the above-mentioned problems and help in detecting earthquakes automatically from the waveforms without much human intervention. For the 1000 Hz audio data set, the CNN model showed a testing accuracy of 91.1% for a 0.2-second sample window length while the LSTM model showed 93.99% for the same. A total of 610 sounds consisting of 310 earthquake sounds and 300 non-earthquake sounds were used to train the models. While testing, the total time required for generating the alarm was approximately 2 seconds, which included individual times for data collection, processing, and prediction, taking into consideration the processing and prediction delays. This shows the effectiveness of the proposed method for Earthquake Early Warning (EEW) applications. Since the input of the method is only the waveform, it is suitable for real-time processing; thus the models can also be used as an onsite EEW system requiring a minimum amount of preparation time and workload.

Submitted 16 January, 2021; originally announced January 2021.
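
For context on the classical baseline named in the abstract above, the STA/LTA trigger compares a short-term average of signal energy with a long-term average and raises an alarm when the ratio exceeds a threshold. The sketch below is a generic textbook-style version, not code from the paper. The window lengths and the threshold are exactly the hand-tuned parameters the abstract says are hard to choose, so the values used with it are assumptions.

```python
from typing import List


def sta_lta(signal: List[float], sta_len: int, lta_len: int) -> List[float]:
    """STA/LTA energy ratio for every position with a full long-term window.

    Element j of the result covers the long window ending at sample
    index lta_len - 1 + j of the input.
    """
    if not 0 < sta_len < lta_len <= len(signal):
        raise ValueError("need 0 < sta_len < lta_len <= len(signal)")
    energy = [x * x for x in signal]
    ratios = []
    for end in range(lta_len, len(energy) + 1):
        sta = sum(energy[end - sta_len:end]) / sta_len
        lta = sum(energy[end - lta_len:end]) / lta_len
        ratios.append(sta / lta if lta > 0 else 0.0)
    return ratios


def first_trigger(ratios: List[float], threshold: float) -> int:
    """Index of the first ratio above the threshold, or -1 if none."""
    for j, r in enumerate(ratios):
        if r > threshold:
            return j
    return -1
```

A burst of energy raises the short-term average faster than the long-term one, which is what pushes the ratio over the threshold; a transient such as a thunder strike can do the same, which is the false-alarm problem the abstract mentions.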

arXiv:2011.05684 [pdf, other]

doi 10.1109/TMI.2021.3094525

Noise Conscious Training of Non Local Neural Network powered by Self Attentive Spectral Normalized Markovian Patch GAN for Low Dose CT Denoising

Authors: Sutanu Bera, Prabir Kumar Biswas

Abstract: The explosive rise in the use of computed tomography (CT) imaging in medical practice has heightened public concern over the patient's associated radiation dose. However, reducing the radiation dose leads to increased noise and artifacts, which degrade the scan's interpretability. Consequently, an advanced image reconstruction algorithm to improve the diagnostic performance of low dose CT arose as the primary concern among researchers, which is challenging due to the ill-posedness of the problem. In recent times, the deep learning-based technique has emerged as a dominant method for low dose CT (LDCT) denoising. However, some common bottlenecks still exist, which hinder deep learning-based techniques from furnishing the best performance. In this study, we attempted to mitigate these problems with three novel accretions. First, we propose a novel convolutional module as the first attempt to utilize neighborhood similarity of CT images for denoising tasks. Our proposed module assisted in boosting the denoising by a significant margin. Next, we moved towards the problem of non-stationarity of CT noise and introduced a new noise aware mean square error loss for LDCT denoising. Moreover, the loss mentioned above also helped to alleviate the laborious effort required while training a CT denoising network using image patches. Lastly, we propose a novel discriminator function for CT denoising tasks. The conventional vanilla discriminator tends to overlook the fine structural details and focus on the global agreement. Our proposed discriminator leverages self-attention and pixel-wise GANs for restoring the diagnostic quality of LDCT images. Our method, validated on a publicly available dataset of the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge, performed remarkably better than the existing state-of-the-art method.

Submitted 11 November, 2020; originally announced November 2020.

Journal ref: IEEE Transactions on Medical Imaging 2021

arXiv:2007.05835 [pdf, other]

doi 10.1109/TCSVT.2020.3007723

Lightweight Modules for Efficient Deep Learning based Image Restoration

Authors: Avisek Lahiri, Sourav Bairagya, Sutanu Bera, Siddhant Haldar, Prabir Kumar Biswas

Abstract: Low level image restoration is an integral component of modern artificial intelligence (AI) driven camera pipelines. Most of these frameworks are based on deep neural networks which present a massive computational overhead on resource constrained platforms like mobile phones. In this paper, we propose several lightweight low-level modules which can be used to create a computationally low cost variant of a given baseline model. Recent works on efficient neural network design have mainly focused on classification. However, low-level image processing falls under the 'image-to-image' translation genre, which requires some additional computational modules not present in classification. This paper seeks to bridge this gap by designing generic efficient modules which can replace essential components used in contemporary deep learning based image restoration networks. We also present and analyse our results highlighting the drawbacks of applying depthwise separable convolutional kernels (a popular method for efficient classification networks) for sub-pixel convolution based upsampling (a popular upsampling strategy for low-level vision applications). This shows that concepts from the domain of classification cannot always be seamlessly integrated into image-to-image translation tasks. We extensively validate our findings on three popular tasks of image inpainting, denoising and super-resolution. Our results show that the proposed networks consistently output visually similar reconstructions compared to full capacity baselines with significant reduction of parameters, memory footprint and execution time on contemporary mobile devices.

Submitted 11 July, 2020; originally announced July 2020.

Comments: Accepted at: IEEE Transactions on Circuits and Systems for Video Technology (Early Access Print) | Code available at: https://github.com/avisekiit/TCSVT-LightWeight-CNNs | Supplementary document at: https://drive.google.com/file/d/1BQhkh33Sen-d0qOrjq5h8ahw2VCUIVLg/view?usp=sharing

arXiv:2005.13064 [pdf]

Embedded System to Detect, Track and Classify Plankton Using a Lensless Video Microscope

Authors: Thomas G. Zimmerman, Vito P. Pastore, Sujoy K. Biswas, Simone Bianco

Abstract: Plankton provide the foundation for life on earth. To advance our understanding of the marine ecosystem, for scientific, commercial and survival purposes, more in situ continuous monitoring and analysis of plankton is required. Cost, complexity, power and data communication demands are barriers to widespread deployment of in situ plankton microscopes. We address these barriers by building and characterizing a lensless microscope with a data pipeline optimized for the Raspberry Pi 3. The pipeline records 1080p video of multiple plankton swimming in a sample well while simultaneously detecting, tracking and selecting salient cropped images for classification at 5.1 frames per second. Thirteen machine learning classifiers and combinations of nine sets of features are evaluated on nine plankton classes, optimized for speed (F1 = 0.74 at 1 ms per image prediction) and accuracy (F1 = 0.81 at 0.80 s). System performance results confirm that performing the entire data pipeline, from image capture to classification, is possible on a low-cost open-source embedded computer.

Submitted 26 May, 2020; originally announced May 2020.

Comments: 10 pages, 10 figures, 6 tables

arXiv:1903.09520 [pdf, other]

A lightweight convolutional neural network for image denoising with fine details preservation capability

Authors: Sutanu Bera, Avisek Lahiri, Prabir Kumar Biswas

Abstract: Image denoising is a fundamental problem in image processing whose primary objective is to remove noise while preserving the original image structure. In this work, we propose a new architecture for image denoising. We use several dense blocks to design our network. Additionally, we forward the features extracted in the first layer to the input of every transition layer. Our experimental results suggest that the use of low-level features helps in reconstructing better texture. Furthermore, we train our network with a combination of MSE and a differentiable multi-scale structural similarity index (MS-SSIM). With proper training, our proposed model, with far fewer parameters, can outperform other models trained with many more parameters. We evaluate our algorithm on two grayscale benchmark datasets, BSD68 and SET12. Our model achieves PSNR similar to current state-of-the-art methods and, most of the time, better SSIM than other algorithms.

Submitted 22 March, 2019; originally announced March 2019.

arXiv:1611.09265 [pdf]

doi 10.1109/TED.2017.2679604

Image Processing with Dipole-Coupled Nanomagnets: Noise Suppression and Edge Enhancement Detection

Authors: Md Ahsanul Abeed, Ayan K. Biswas, Md Mamun Al-Rashid, Jayasimha Atulasimha, Supriyo Bandyopadhyay

Abstract: Hardware-based image processing offers speed and convenience not found in software-centric approaches. Here, we show theoretically that a two-dimensional periodic array of dipole-coupled elliptical nanomagnets, delineated on a piezoelectric substrate, can act as a dynamical system for specific image processing functions. Each nanomagnet has two stable magnetization states that encode pixel color (black or white). An image containing black and white pixels is first converted to voltage states and then mapped into the magnetization states of a nanomagnet array with magneto-tunneling junctions (MTJs). The same MTJs are employed to read out the processed pixel colors later. Dipole interaction between the nanomagnets implements specific image processing tasks such as noise reduction and edge enhancement detection. These functions are triggered by applying a global strain to the nanomagnets with a voltage dropped across the piezoelectric substrate. An image containing an arbitrary number of black and white pixels can be processed in a few nanoseconds with very low energy cost.

Submitted 28 November, 2016; originally announced November 2016.

Journal ref: IEEE Transactions on Electron Devices, Vol. 64 (5), 2417-2424 (2017)

arXiv:1504.00952 [pdf, ps, other]

doi 10.1142/S2010324717500047

Energy-efficient hybrid spintronic-straintronic reconfigurable bit comparator

Authors: Ayan K. Biswas, Jayasimha Atulasimha, Supriyo Bandyopadhyay

Abstract: We propose a reconfigurable bit comparator implemented with a nanowire spin valve whose two contacts are magnetostrictive with bistable magnetization. Reference and input bits are "written" into the magnetization states of the two contacts with electrically generated strain and the spin-valve's resistance is lowered if they match. Multiple comparators can be interfaced in parallel with a magneto-tunneling junction to determine if an N-bit input stream matches an N-bit reference stream bit by bit. The system is robust against thermal noise at room temperature and a 16-bit comparator can operate at roughly 416 MHz while dissipating at most 420 aJ per cycle.

Submitted 12 April, 2015; v1 submitted 3 April, 2015; originally announced April 2015.

Comments: Submitted to Applied Physics Letters. Version 1 ignored the energy dissipation in the passive resistors since they were very high. However, high resistances increase the RC time constant associated with charging. In version 2, the RC time constant has been reduced at the expense of increased energy dissipation, but the latter is still very small in absolute terms

Journal ref: SPIN, Vol. 7 (2), 1750004 (2017)

Showing 1–20 of 20 results for author: Biswas, K