Search | arXiv e-print repository

Demystifying the Effect of Receptive Field Size in U-Net Models for Medical Image Segmentation

Authors: Vincent Loos, Rohit Pardasani, Navchetan Awasthi

Abstract: Medical image segmentation is a critical task in healthcare applications, and U-Nets have demonstrated promising results. This work examines the understudied aspect of receptive field (RF) size and its impact on the U-Net and Attention U-Net architectures. It explores the relationship between RF size, characteristics of the region of interest, and model performance, as well as the balance between RF size and computational cost for U-Net and Attention U-Net methods on different datasets. This work also proposes a mathematical notation for representing the theoretical receptive field (TRF) of a given layer in a network, and proposes two new metrics: the effective receptive field (ERF) rate, which quantifies the fraction of significantly contributing pixels within the ERF against the TRF area, and the Object rate, which assesses the relative size of the segmentation object compared to the TRF size. The results demonstrate that there exists an optimal TRF size that strikes a balance between capturing a wider global context and maintaining computational efficiency, thereby optimizing model performance. Interestingly, a distinct correlation is observed between the data complexity and the required TRF size: segmentation based solely on contrast achieved peak performance even with smaller TRF sizes, whereas more complex segmentation tasks necessitated larger TRFs. Attention U-Net models consistently outperformed their U-Net counterparts, highlighting the value of attention mechanisms regardless of TRF size. These insights are a resource for developing more efficient U-Net-based architectures for medical imaging and pave the way for future exploration. A tool is also developed that calculates the TRF for a U-Net (and Attention U-Net) model and suggests an appropriate TRF size for a given model and dataset.

Submitted 24 June, 2024; originally announced June 2024.
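
The abstract above refers to the theoretical receptive field (TRF) of a layer. As a rough illustration only, the sketch below uses the standard textbook recurrence for a chain of convolution and pooling layers; it is not the notation or the tool proposed in the paper. The function name and the example encoder are hypothetical, and dilation, padding effects, and the decoder path are ignored.

```python
from typing import Iterable, Tuple


def theoretical_receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Return the TRF (side length in input pixels) after a chain of layers.

    Each layer is a (kernel_size, stride) pair, listed in forward order.
    Assumption: square kernels, no dilation, a purely sequential path.
    """
    rf = 1    # one output unit initially "sees" a single input pixel
    jump = 1  # spacing, in input pixels, between neighbouring units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (k - 1) jumps
        jump *= stride             # striding spreads subsequent units apart
    return rf


# Hypothetical U-Net-style encoder: two 3x3 convolutions followed by
# 2x2 max pooling (stride 2), repeated twice, then two more 3x3 convolutions.
encoder = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2), (3, 1), (3, 1)]
trf = theoretical_receptive_field(encoder)
print(trf)  # 32
```

Under these assumptions, each extra pooling stage doubles the jump, so deeper encoders grow the TRF quickly, which is the trade-off against computational cost that the abstract describes.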

arXiv:2101.12404 [pdf, other]

Multi-Threshold Attention U-Net (MTAU) based Model for Multimodal Brain Tumor Segmentation in MRI scans

Authors: Navchetan Awasthi, Rohit Pardasani, Swati Gupta

Abstract: Gliomas are among the most frequent brain tumors and are classified into high-grade and low-grade gliomas. The segmentation of various regions, such as the tumor core and enhancing tumor, plays an important role in determining severity and prognosis. Here, we have developed a multi-threshold model based on attention U-Net for identification of various regions of the tumor in magnetic resonance imaging (MRI). We propose a multi-path segmentation and build three separate models for the different regions of interest. The proposed model achieved mean Dice coefficients of 0.59, 0.72, and 0.61 for enhancing tumor, whole tumor, and tumor core, respectively, on the training dataset. The same model gave mean Dice coefficients of 0.57, 0.73, and 0.61 on the validation dataset and 0.59, 0.72, and 0.57 on the test dataset.

Submitted 28 January, 2021; originally announced January 2021.
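
The results in the abstract above are reported as mean Dice coefficients. For readers unfamiliar with the metric, here is a minimal sketch of the usual definition, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. The function name and the empty-mask convention are assumptions for illustration; the paper's own evaluation code is not shown in the listing.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Toy example: one of two predicted foreground pixels overlaps the target.
pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 1, 0]
print(dice_coefficient(pred_mask, true_mask))  # 0.5
```

A mean Dice coefficient, as quoted in the abstract, would then be the average of this value over cases, computed separately for each tumor region.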

arXiv:2001.06434 [pdf, other]

Sinogram super-resolution and denoising convolutional neural network (SRCN) for limited data photoacoustic tomography

Authors: Navchetan Awasthi, Rohit Pardasani, Sandeep Kumar Kalva, Manojit Pramanik, Phaneendra K. Yalavarthy

Abstract: The quality of the reconstructed photoacoustic image largely depends on the amount of photoacoustic (PA) boundary data available, which in turn is proportional to the number of detectors employed. In the case of limited data (owing to fewer detectors because of cost or instrumentation constraints), the reconstructed PA images suffer from artifacts and are often noisy. In this work, for the first time, a deep learning based model was developed to super-resolve and denoise the photoacoustic sinogram data. The proposed method was compared with existing nearest neighbor interpolation and wavelet based denoising techniques and was shown to outperform them in both numerical and in-vivo cases. The improvement obtained in Root Mean Square Error (RMSE) and Peak Signal to Noise Ratio (PSNR) for the PA image reconstructed from sinogram data super-resolved and denoised by the proposed neural network based method was as high as 41.70% and 6.93 dB, respectively, compared to utilizing limited sinogram data.

Submitted 17 January, 2020; originally announced January 2020.
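
The abstract above reports gains in RMSE and PSNR. The sketch below shows the common definitions of both metrics on flattened images. It assumes pixel values normalised to a peak of 1.0 and uses hypothetical function names; the listing does not state how the paper normalises its images or computes the improvement figures.

```python
import math
from typing import Sequence


def rmse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Root mean square error between two flattened images."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("images must be non-empty and the same size")
    squared = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return math.sqrt(squared / len(reference))


def psnr(reference: Sequence[float], estimate: Sequence[float],
         peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum value."""
    error = rmse(reference, estimate)
    if error == 0.0:
        return math.inf  # identical images
    return 20.0 * math.log10(peak / error)


# Toy example: every pixel is off by 0.1 on a [0, 1] intensity scale.
ground_truth = [0.0, 0.5, 1.0, 0.5]
reconstruction = [0.1, 0.4, 0.9, 0.6]
print(round(rmse(ground_truth, reconstruction), 3))  # 0.1
print(round(psnr(ground_truth, reconstruction), 1))  # 20.0
```

With these definitions, a PSNR improvement in dB is the difference between the PSNR values of two reconstructions against the same reference, while an RMSE improvement in percent is a relative reduction.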

Showing 1–3 of 3 results for author: Awasthi, N