¹ Department of Computer Science, University of Liverpool, UK
² Imperial College London, UK
³ University of Information Technology, HCMC, Vietnam
⁴ Vietnam National University, HCMC, Vietnam
⁵ Alder Hey Children’s Hospital, Liverpool, UK

Shape-Sensitive Loss for Catheter and Guidewire Segmentation

Chayun Kongtongvattana¹, Baoru Huang², Jingxuan Kang¹, Hoan Nguyen³,⁴, Olajide Olufemi⁵, Anh Nguyen¹

Abstract

We introduce a shape-sensitive loss function for catheter and guidewire segmentation and use it in a vision transformer network to establish a new state-of-the-art result on a large-scale X-ray image dataset. We transform network predictions and their corresponding ground truths into signed distance maps (SDMs), thereby enabling any network to concentrate on the essential boundaries rather than merely the overall contours. These SDMs are passed through the vision transformer, which produces high-dimensional feature vectors encapsulating critical image attributes. By computing the cosine similarity between these feature vectors, we obtain a measure of image similarity that goes beyond the limitations of traditional overlap-based measures. The advantages of our approach range from scale and translation invariance to better detection of subtle differences, which supports precise localization and delineation of the medical instruments within the images. Comprehensive quantitative and qualitative analyses show a significant improvement in performance over existing baselines, demonstrating the promise of our shape-sensitive loss function for improving catheter and guidewire segmentation.

Keywords:

Shape-sensitive loss function, Vision Transformer (ViT), Catheter and guidewire segmentation, Signed distance maps (SDMs)

1 Introduction

Endovascular interventions have drastically changed cardiovascular surgery, bringing forth benefits like minimized trauma and faster recovery. However, they also present potential hazards such as damage to the vessel wall [1, 2, 23]. Accurate segmentation of catheters and guidewires within X-ray images is pivotal for reducing these risks.

Figure 1: Catheter and guidewire segmentation in X-ray images. First row: the input X-ray images. Second row: the segmentation results. Red denotes the catheter and green denotes the guidewire.

The merging of computer vision and machine learning in the medical domain has spurred advancements in addressing challenges associated with endovascular interventions [3, 4, 5, 6, 7]. In particular, deep learning has proven significant in refining the precision of surgical procedures and enhancing patient safety [8, 25]. Despite these strides, segmenting the intricate structures of catheters and guidewires in X-ray images remains a hurdle. Traditional loss functions often fall short in capturing the global spatial relationships crucial for this task [9, 28, 10]. Although convolutional neural networks (CNNs) trained with these loss functions have delivered promising results [11, 34, 16, 17], the challenge persists.

Our study presents a novel shape-sensitive loss function that uses the Vision Transformer (ViT) to gauge the similarity between signed distance maps (SDMs) [13, 14]. This approach aims to offer superior context awareness and structure sensitivity, leading to enhanced segmentation.

The structure of this paper comprises a literature review, a detailed explanation of our proposed loss function, presentation of experimental results, and a concluding section discussing potential future directions.

Outlined below are the foundational contributions of this work:

• Introduction of a shape-sensitive loss function that merges spatial distance insights with the feature extraction prowess of Vision Transformer.
• A unique technique for converting network outputs into SDMs, preserving structural intricacies.
• A balanced approach, merging our shape-sensitive loss with the traditional Dice loss to ensure holistic segmentation performance.

2 Related Works

2.1 Catheter Segmentation

Catheter segmentation has gained momentum with the introduction of deep learning frameworks [39, 15]. The FW-Net utilizes an end-to-end approach with an encoder-decoder structure, optical flow extraction, and a unique flow-guided warping function to ensure temporal continuity in imaging sequences [15]. Another method employs deep convolutional neural networks for segmenting catheters and guidewires in 2D X-ray fluoroscopic sequences, using previous image contexts for enhanced accuracy and achieving a notable median centerline distance error of 0.2 mm [18]. A transformative approach incorporates Convolutional Neural Networks (CNNs) with transfer learning, exploiting synthetic fluoroscopic images to develop a streamlined segmentation model requiring minimal manually annotated data, significantly reducing testing time while remaining adaptable to higher input resolutions [16]. These advancements underscore the continuous evolution and adaptability of deep learning methodologies in catheter and guidewire segmentation tasks.

Table 1: Types of Semantic Segmentation Loss functions

Distribution-Based Loss
Binary Cross-Entropy [19]	$-\sum\left[y\log(p)+(1-y)\log(1-p)\right]$
Weighted Cross-Entropy [22]	$-\sum w_{y}y\log(p)$
Balanced Cross-Entropy [20]	$-\beta\sum y\log(p)-(1-\beta)\sum(1-y)\log(1-p)$
Focal [21]	$-\sum(1-p)^{\gamma}y\log(p)$
Region-Based Loss
Dice [24]	$1-\frac{2\sum(y\cap p)}{\sum y+\sum p}$
Tversky [26]	$\frac{\sum(y\cap p)}{\sum(y\cap p)+\alpha\sum(y-p)+\beta\sum(p-y)}$
Focal Tversky [27]	$(1-Tversky)^{\gamma}$
Compound Loss
Combo [31]	$CL(y,\hat{y})=\alpha L_{m-bce}-(1-\alpha)DL(y,\hat{y})$
ELL [32]	$L_{ELL}=\alpha L_{Dice}+\beta L_{CE}$
Boundary-Based Loss
HD [30]	$L_{HD_{DT}}=\frac{1}{N}\sum_{i=1}^{N}[(s_{i}-g_{i})\cdot(d_{Gi}^{2}-d_{Si}^{2})]$
InverseForm [12]	$L_{\text{if}}(b_{\text{pred}},b_{\text{gt}})=\sum_{j=1}^{N}d_{\text{if}}(b_{\text{pred},j},b_{\text{gt},j})$
Shape-sensitive (ours)	Equation 5

2.2 Loss Function for Medical Segmentation

Loss functions in deep learning for image segmentation tasks are pivotal for determining the quality of medical image segmentations (Table 1). These functions can be broadly categorized into four primary types [10]: distribution-based, region-based, boundary-based, and compound losses. Distribution-based losses, such as Binary Cross-Entropy [19], Focal loss [21], and Weighted Cross-Entropy [22], measure the dissimilarity between predicted and true probability distributions. While effective in modeling these distributions, they might lack spatial coherence and can struggle with class imbalances. On the other hand, region-based losses like Dice loss [24], Tversky loss [26], and Focal Tversky loss [27] excel in scenarios with class imbalances by focusing on the overlap between predicted and actual segments, but they may miss finer details. Boundary-based losses, such as Shape-aware loss [29] and Hausdorff Distance loss [30], prioritize boundary accuracy but can be sensitive to minor perturbations. Meanwhile, compound losses like Combo loss [31] and Exponential Logarithmic Loss [32] merge terms from different loss types, though they can potentially increase training complexity.
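To make a few entries of Table 1 concrete, the following is a minimal pure-Python sketch that evaluates the binary cross-entropy, Dice, Tversky, and Focal Tversky formulas on flattened probability maps. The function names, the `eps` smoothing term, and the default values of `alpha`, `beta`, and `gamma` are illustrative assumptions and are not taken from the implementation used in this work.

```python
import math

def binary_cross_entropy(y, p, eps=1e-7):
    """-sum[y*log(p) + (1-y)*log(1-p)] over all pixels."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return -total

def dice_loss(y, p, eps=1e-7):
    """1 - 2*sum(y*p) / (sum(y) + sum(p)); eps is an assumed smoothing term."""
    inter = sum(yi * pi for yi, pi in zip(y, p))
    return 1.0 - (2.0 * inter + eps) / (sum(y) + sum(p) + eps)

def tversky_index(y, p, alpha=0.5, beta=0.5, eps=1e-7):
    """Overlap / (overlap + alpha*false negatives + beta*false positives)."""
    tp = sum(yi * pi for yi, pi in zip(y, p))
    fn = sum(yi * (1 - pi) for yi, pi in zip(y, p))
    fp = sum((1 - yi) * pi for yi, pi in zip(y, p))
    return (tp + eps) / (tp + alpha * fn + beta * fp + eps)

def focal_tversky_loss(y, p, alpha=0.5, beta=0.5, gamma=1.0):
    """(1 - Tversky)^gamma."""
    return (1.0 - tversky_index(y, p, alpha, beta)) ** gamma

if __name__ == "__main__":
    label = [1, 0, 1, 0]
    pred = [0.9, 0.2, 0.6, 0.1]
    print(binary_cross_entropy(label, pred))
    print(dice_loss(label, pred))
    print(focal_tversky_loss(label, pred))
```

With `alpha = beta = 0.5`, the Tversky index reduces to the Dice coefficient, so in this sketch `1 - tversky_index(...)` and `dice_loss(...)` agree up to the smoothing term.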

The pursuit of accuracy, especially for intricate structures, led to the inception of shape-aware loss functions, which factor in the spatial relationships and distances of pixels from target boundaries. The InverseForm loss function [12] is a noteworthy example. It integrates an Inverse Transformation Network into the loss calculation to produce a Transformation Matrix, ensuring alignment between predicted segmentation and target boundaries. Nonetheless, capturing high-level features that provide global context can be challenging for some methods. Drawing inspiration from such integrative approaches, our proposal incorporates Vision Transformers (ViTs) within the loss function. With their attention mechanism, ViTs capture global contextual features from images, paving the way for a more accurate representation of complex structures, such as catheters and guidewires in X-ray images.

3 Shape-Sensitive Loss for Catheter and Guidewire Segmentation

In this section, we describe our approach to catheter and guidewire segmentation in X-ray images. The method builds on SDMs and the ViT and integrates key modifications to improve segmentation accuracy.

3.1 Preliminaries: Signed Distance Map

The Signed Distance Map (SDM) represents a segmentation by mapping each pixel to its distance from the object contour, so that every value encodes the pixel’s relation to the boundary. This representation sharpens segmentation, especially for complex structures. The emphasis on boundary localization is crucial in medical imaging tasks such as catheter and guidewire segmentation: by using SDMs, models become more sensitive to boundaries, which improves segmentation accuracy and reduces errors in X-ray image segmentation. Fig. 2 shows an example of an SDM.

Initial Transformation to SDM: Both the predicted output and its corresponding label undergo an initial transformation to fit the SDM format, setting the stage for further operations tailored for this representation. Let $N(x,y)$ be the network output, where $(x,y)$ are pixel coordinates.

Transformation to Contour: The contour representation, $C(x,y)$, is derived by thresholding the network output at a value $T$:

C(x,y)=\begin{cases}1&\text{if }N(x,y)>T\\ 0&\text{otherwise}\end{cases}

(1)

Transformation to SDM: Define $d((x,y),(i,j))$ as the Euclidean distance between any point $(x,y)$ and its nearest boundary point $(i,j)$.

\text{SDM}(x,y)=\begin{cases}0,&\text{if }C(x,y)=1\\ \min_{(i,j)}d((x,y),(i,j)),&\text{if }C(x,y)\neq 1\end{cases}

(2)

The computation of $d((x,y),(i,j))$ often employs methods like the Fast Marching Method or the Distance Transform algorithm. The SDM produces a continuous spectrum of values, with positive distances outside the object and negative distances inside. Leveraging SDM in the loss function allows the network to achieve precise segmentation of intricate structures in X-ray images.
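The two transformations in Eqs. (1) and (2) can be sketched on a small grid as follows. This is a brute-force illustration in pure Python, not the Fast Marching or Distance Transform implementation mentioned above; the function names and the default threshold of 0.5 are assumptions. Eq. (2) as written assigns 0 to foreground pixels, whereas the text describes negative values inside the object, so the sketch also includes a hypothetical `signed_distance_map` helper showing one way the sign convention could be realized.

```python
import math

def threshold_mask(net_out, t=0.5):
    """Eq. (1): 1 where the network output exceeds t, 0 otherwise."""
    return [[1 if v > t else 0 for v in row] for row in net_out]

def distance_map(mask):
    """Eq. (2): 0 on foreground pixels, otherwise the Euclidean distance to
    the nearest foreground pixel (which is always a boundary pixel).
    Brute force, intended for small grids only."""
    fg = [(i, j) for i, row in enumerate(mask) for j, v in enumerate(row) if v == 1]
    if not fg:
        raise ValueError("mask contains no foreground pixel")
    return [[0.0 if v == 1 else min(math.hypot(x - i, y - j) for i, j in fg)
             for y, v in enumerate(row)]
            for x, row in enumerate(mask)]

def signed_distance_map(mask):
    """Assumed signed variant: positive outside the object, negative inside.
    Requires both foreground and background pixels to be present."""
    outside = distance_map(mask)
    inside = distance_map([[1 - v for v in row] for row in mask])
    return [[o - i for o, i in zip(row_o, row_i)]
            for row_o, row_i in zip(outside, inside)]

if __name__ == "__main__":
    net_out = [[0.1, 0.1, 0.1],
               [0.1, 0.9, 0.1],
               [0.1, 0.1, 0.1]]
    mask = threshold_mask(net_out)
    for row in signed_distance_map(mask):
        print(row)
```

For full-resolution images, a linear-time distance transform would replace the brute-force search; the quadratic version is used here only to keep the correspondence with Eq. (2) explicit.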

3.2 Feature Extraction using Vision Transformer

The Vision Transformer (ViT) excels in extracting high-level features, pivotal for understanding the intricate patterns in Signed Distance Maps (SDMs). Our approach, while building upon ViT’s established use with SDM, introduces key modifications to enhance feature extraction. These modifications, including optimized patch sizes for high-resolution imaging and an adaptive attention mechanism targeting critical anatomical features, are specifically tailored to address the variability in medical data and the need for precise segmentation. This integration not only demonstrates ViT’s potential in specialized tasks, but also distinctly sets apart our method, as shown in Fig. 3.

For feature extraction, the last layer of the ViT is bypassed, so that the network outputs high-dimensional vectors that capture detailed SDM patterns, a critical component for precise segmentation. We use the ViT-B/384 configuration, which is designed for high-resolution images; with its 384-patch size, it captures fine-grained details, making it well suited to high-resolution SDM analysis. From this, features $\alpha$ for the predicted SDM and $\beta$ for the true SDM are obtained, forming the basis for loss computation.

\alpha,\beta=\text{ViT}(\text{SDM}_{\text{Output}},\text{SDM}_{\text{Label}})

(3)

In the equation above, $\text{ViT}(\text{SDM}_{\text{Output}},\text{SDM}_{\text{Label}})$ denotes the feature extraction process of passing the SDMs through the ViT. Using the ViT in this way yields a thorough representation of SDM structures.
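As a rough illustration of how an SDM is presented to a ViT encoder, the sketch below splits a square map into non-overlapping patches and counts the resulting tokens. It assumes that the configuration named above corresponds to 384 × 384 inputs divided into 16 × 16 patches, as in the commonly used ViT-B/16 variant; this reading, the helper names `patchify` and `num_tokens`, and the presence of a class token are assumptions rather than details confirmed by the text.

```python
def patchify(image, patch):
    """Split a square 2D grid (list of rows) into flattened, non-overlapping
    patches, ordered left to right and top to bottom."""
    size = len(image)
    if any(len(row) != size for row in image) or size % patch != 0:
        raise ValueError("image must be square and divisible by the patch size")
    patches = []
    for top in range(0, size, patch):
        for left in range(0, size, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

def num_tokens(resolution, patch, class_token=True):
    """Sequence length seen by the encoder for a resolution x resolution input."""
    n = (resolution // patch) ** 2
    return n + 1 if class_token else n

if __name__ == "__main__":
    toy_sdm = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(patchify(toy_sdm, 2))   # four patches of four values each
    print(num_tokens(384, 16))    # assumed ViT-B/16 at 384 x 384
```

Under these assumptions, a 384 × 384 SDM yields 24 × 24 = 576 patch tokens plus one class token, so the 512 × 512 SDMs would need to be resized before being passed to the encoder.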

3.3 Shape-Sensitive Loss Computation

Accurate segmentation heavily hinges on an effective loss computation mechanism to steer the model’s learning, ensuring the precise identification of segmentation boundaries within SDMs. The employed loss computation in this work is fundamentally shape-sensitive, an approach meticulously tailored to heighten the model’s sensitivity to the geometric details within the SDMs. This heightened sensitivity is achieved through the use of cosine similarity between the high-level features extracted from both the predicted and true SDMs.

The premise behind employing cosine similarity lies in its capacity to consider the angle between the feature vectors, offering a nuanced and detailed insight into their alignment. This perspective enables a more delicate evaluation of the segmentation process, capturing the geometric intricacies within the high-dimensional feature space derived from the SDMs.

\text{CosSim}(\alpha,\beta)=\frac{\alpha\cdot\beta}{\|\alpha\|\,\|\beta\|}

(4)

The shape-sensitive loss, denoted as $\mathcal{L}_{\text{SS}}$, is thus defined as the deviation from perfect alignment, represented mathematically as:

\mathcal{L}_{\text{SS}}(\text{SDM}_{\text{Output}},\text{SDM}_{\text{Label}})=1-\operatorname{CosSim}\left(\text{ViT}(\text{SDM}_{\text{Output}}),\text{ViT}(\text{SDM}_{\text{Label}})\right)

(5)
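Eqs. (4) and (5) can be illustrated with a short pure-Python sketch. The feature vectors below are toy stand-ins for the ViT outputs $\alpha$ and $\beta$; the function names and the `eps` guard against zero-norm vectors are assumptions added for the example.

```python
import math

def cosine_similarity(a, b, eps=1e-12):
    """Eq. (4): dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)  # eps avoids division by zero

def shape_sensitive_loss(feat_pred, feat_label):
    """Eq. (5): one minus the cosine similarity of the two feature vectors."""
    return 1.0 - cosine_similarity(feat_pred, feat_label)

if __name__ == "__main__":
    alpha = [1.0, 2.0, 3.0]   # toy feature vector for the predicted SDM
    beta = [2.0, 4.0, 6.0]    # toy feature vector for the label SDM
    print(shape_sensitive_loss(alpha, beta))              # close to 0: same direction
    print(shape_sensitive_loss([1.0, 0.0], [0.0, 1.0]))   # 1: orthogonal features
```

Because only the angle between the vectors matters, rescaling either feature vector by a positive constant leaves the loss unchanged.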

Further enriching the loss computation, the total loss, $\mathcal{L}_{\text{total}}$ , amalgamates the shape-sensitive loss with the Dice loss, a well-regarded loss function for segmentation tasks. This composite measure is meticulously balanced for optimal segmentation results, ensuring the model maintains a comprehensive focus, attending to shape sensitivity and other pivotal aspects of segmentation.

\mathcal{L}_{\text{total}}=\gamma\mathcal{L}_{\text{Dice}}+\delta\mathcal{L}_{\text{SS}}(\text{SDM}_{\text{Output}},\text{SDM}_{\text{Label}})

(6)

This structured loss computation strategy, by intertwining geometric sensitivity with established loss functions, furnishes the model with a robust and multifaceted learning signal, underpinning the attainment of superior segmentation performance in processing SDMs.
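The weighted combination in Eq. (6) can be sketched as follows. The shape-sensitive term is passed in as a precomputed value, and the default weights $\gamma=\delta=0.5$ follow the best setting reported in the ablation study; the function names and the `eps` smoothing term in the Dice loss are assumptions made for this example.

```python
def dice_loss(y, p, eps=1e-7):
    """Soft Dice loss on flattened maps: 1 - 2*sum(y*p) / (sum(y) + sum(p))."""
    inter = sum(yi * pi for yi, pi in zip(y, p))
    return 1.0 - (2.0 * inter + eps) / (sum(y) + sum(p) + eps)

def total_loss(y, p, loss_ss, gamma=0.5, delta=0.5):
    """Eq. (6): gamma * Dice loss + delta * shape-sensitive loss.

    loss_ss is the value of the shape-sensitive term, computed separately
    from the ViT features of the predicted and label SDMs.
    """
    return gamma * dice_loss(y, p) + delta * loss_ss

if __name__ == "__main__":
    label = [1, 0, 1, 0]
    pred = [0.8, 0.1, 0.7, 0.3]
    print(total_loss(label, pred, loss_ss=0.2))
```

Setting `delta=0` recovers plain Dice training, which makes the contribution of the shape-sensitive term easy to isolate in an ablation.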

4 Experimental Results

4.1 Experiment Setup

Dataset: We assessed our proposed loss function using our newly collected dataset. This dataset includes 5,086 real animal X-ray images and 18,791 phantom X-ray images, both paired with ground truth annotations. While real animal images are inherently 512 × 512 pixels, phantom images were resized from 1024 × 1024 to 512 × 512, ensuring labels remained accurate. The dataset was divided with a 70-30 split for training and testing, respectively.

Evaluation metrics: Our semantic segmentation performance was gauged using several standard metrics:

• Dice Coefficient: Measures overlap between prediction and ground truth.
• Jaccard Similarity (IoU): Calculates the ratio of intersected region to the combined predicted and ground truth areas.
• Mean Intersection over Union (mIoU): This is an average IoU across all classes, predominantly for multi-class segmentation.
• Accuracy: Determines the proportion of correctly identified pixels.
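The four metrics listed above can be computed from flattened label maps as in the following minimal sketch. The class ids (0 for background, 1 for catheter, 2 for guidewire), the function names, and the convention of scoring a class that is absent from both maps as 1.0 are assumptions made for illustration.

```python
def _counts(pred, gt, cls):
    """True positives, false positives and false negatives for one class."""
    tp = sum(1 for p, g in zip(pred, gt) if p == cls and g == cls)
    fp = sum(1 for p, g in zip(pred, gt) if p == cls and g != cls)
    fn = sum(1 for p, g in zip(pred, gt) if p != cls and g == cls)
    return tp, fp, fn

def dice_coefficient(pred, gt, cls):
    tp, fp, fn = _counts(pred, gt, cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0  # assumed score for an empty class

def iou(pred, gt, cls):
    tp, fp, fn = _counts(pred, gt, cls)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0  # assumed score for an empty class

def mean_iou(pred, gt, classes):
    """Unweighted average of the per-class IoU values."""
    return sum(iou(pred, gt, c) for c in classes) / len(classes)

def pixel_accuracy(pred, gt):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return sum(1 for p, g in zip(pred, gt) if p == g) / len(gt)

if __name__ == "__main__":
    gt = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    print(dice_coefficient(pred, gt, 1), iou(pred, gt, 1))
    print(mean_iou(pred, gt, [0, 1, 2]), pixel_accuracy(pred, gt))
```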

We integrate our method into different segmentation backbones, including U-Net [33], U-Net++ [36], U-Net3+ [37], TransU-Net [35], and SwinU-Net [38].

4.2 Results on Real Animal X-Ray Images

As shown in Table 2, all segmentation networks profited from our method. TransU-Net exhibited a boost in the Dice coefficient from 54.52% to 57.16%, a gain of 2.64 percentage points. Similar growth was observed in Jaccard Similarity, mIoU, and Accuracy. Expanding on this, the U-Net, a foundational segmentation architecture, after embedding our loss function, showed improvements in all four metrics, with the Dice coefficient rising by 1.75 percentage points. The other models, including U-Net++, U-Net3+, and SwinU-Net, mirrored this enhancement trend. Worth highlighting is that the TransU-Net, when synergized with our loss function, outperformed other architectures in all metrics.

Table 2: Comparing results for real animal X-ray images

Network	Dice	Jaccard	mIoU	Accuracy
TransU-Net	54.52	43.87	61.15	78.25
TransU-Net w/ours	57.16	46.04	62.66	79.57
U-Net	46.20	36.62	54.14	71.57
U-Net w/ours	47.95	38.79	55.05	72.89
U-Net++	48.77	39.22	56.17	72.30
U-Net++ w/ours	50.35	41.39	57.08	73.62
U-Net3+	49.24	40.12	56.37	73.14
U-Net3+ w/ours	51.37	42.29	57.28	74.46
SwinU-Net	52.74	42.14	57.48	77.67
SwinU-Net w/ours	54.71	44.58	59.04	79.13
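The gains quoted in the text can be re-derived from the Dice column of Table 2 with a few lines of Python. The dictionary below simply transcribes the table values; the variable names are illustrative.

```python
# Dice (%) from Table 2 as (baseline, with our loss) pairs.
table2_dice = {
    "TransU-Net": (54.52, 57.16),
    "U-Net": (46.20, 47.95),
    "U-Net++": (48.77, 50.35),
    "U-Net3+": (49.24, 51.37),
    "SwinU-Net": (52.74, 54.71),
}

# Absolute gain in percentage points, rounded to the table's precision.
gains = {name: round(ours - base, 2) for name, (base, ours) in table2_dice.items()}

# Backbone with the highest Dice once our loss is added.
best = max(table2_dice, key=lambda name: table2_dice[name][1])

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: +{gain:.2f}")
    print("best with our loss:", best)
```

The same computation applies unchanged to the other metric columns by swapping in the corresponding values.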

4.3 Results on Phantom X-ray Images

On phantom X-ray images, our method again improves every backbone on all four metrics (Table 3). Integrated with TransU-Net, it raises the Dice coefficient from 40.83% to 43.71%, a gain of 2.88 percentage points. Figure 4 further shows that the parameter count of each architecture is unchanged when our loss function is introduced. This matters because the loss is applied only during training, so the gain in segmentation performance comes without additional parameters or computational cost at inference.

Table 3: Comparing results for phantom X-ray images

Network	Dice	Jaccard	mIoU	Accuracy
TransU-Net	40.83	33.94	46.01	66.18
TransU-Net w/ours	43.71	36.65	47.21	67.58
U-Net	34.51	27.62	43.14	60.57
U-Net w/ours	37.13	28.56	45.83	62.29
U-Net++	35.24	27.43	44.72	61.76
U-Net++ w/ours	37.25	28.94	45.97	63.12
U-Net3+	35.53	27.87	44.89	61.88
U-Net3+ w/ours	38.07	29.54	46.21	63.48
SwinU-Net	39.44	31.27	45.29	65.37
SwinU-Net w/ours	41.17	33.79	46.23	66.88

4.4 Ablation Study

Our ablation study on the real animal X-ray images yields several findings. Table 6 shows that cosine similarity outperforms the other distance measurements for comparing feature embeddings. Table 5 shows that combining the Dice loss with our shape-sensitive loss gives the best segmentation results among the tested loss combinations. Table 4 examines the impact of the blending parameters across network architectures: TransU-Net with blending coefficients of 0.5 for both ${\gamma}$ and ${\delta}$ performs best, reaching a Dice coefficient of 57.16%.

Table 5: Performance comparison of different loss functions.

Loss Function	Dice (%)
CE and Dice	54.52
FT and ours	55.78
Combo and ours	56.63
Dice and ours	57.16

CE = Cross-Entropy; FT = Focal Tversky.

Table 6: Comparison between various distance measurements for loss value calculation.

Dist. Measurement	Dice (%)
Cosine similarity	57.16
Euclidean distance	56.78
Manhattan distance	56.10
Jaccard similarity	56.72
Hamming distance	54.03
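To illustrate how some of the measurements in Table 6 differ when applied to feature vectors, the sketch below compares cosine, Euclidean, and Manhattan distances on a toy pair of vectors that point in the same direction but differ in magnitude. The vectors and function names are illustrative assumptions; the Jaccard and Hamming variants are omitted because their definition on continuous features is not specified in the text.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; depends only on the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def euclidean_distance(a, b):
    """Straight-line (L2) distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

if __name__ == "__main__":
    a = [1.0, 2.0, 2.0]
    b = [2.0, 4.0, 4.0]  # same direction as a, twice the magnitude
    print(cosine_distance(a, b))
    print(euclidean_distance(a, b))
    print(manhattan_distance(a, b))
```

In this toy case the cosine distance vanishes while the Euclidean and Manhattan distances do not, which shows that the cosine measure ignores differences in feature magnitude.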

5 Conclusion

In this study, we proposed a customized loss function for shape-sensitive segmentation using a pre-trained Vision Transformer (ViT) network. By converting the network prediction and ground truth segmentation maps into signed distance maps and passing them through the ViT, we extracted high-level feature representations. The distance between boundary maps was then evaluated using cosine similarity, which measures the dissimilarity between the extracted high-level features. By combining our customized loss function with the Dice loss, we aimed to leverage shape-sensitive segmentation and capture finer details, ultimately improving the overall segmentation accuracy. Our approach demonstrated the effectiveness of using a ViT and signed distance maps in the segmentation task, providing insights into optimizing boundary map distances for accurate segmentation results. At the same time, the absolute accuracy of our predictions remains relatively low. This is a key area for improvement, and we plan to pursue further refinements and optimizations to raise segmentation accuracy in future work.

References

[1] H. Rafii-Tari, C. J. Payne, G.-Z. Yang.: Current and emerging robot-assisted endovascular catheterization technologies: A review. In: Annals of Biomedical Engineering (2014)
[2] N. Simaan, R. M. Yasin, and L. Wang.: Medical technologies and challenges of robot-assisted minimally invasive intervention and diagnostics. In: Annual Review of Control, Robotics, and Autonomous Systems (2018)
[3] Y. Thakur, J. S. Bax, D. W. Holdsworth, and M. Drangova.: Design and performance evaluation of a remote catheter navigation system. In: IEEE Transactions on Biomedical Engineering (2009)
[4] M. E. M. K. Abdelaziz, D. Kundrat, M. Pupillo, G. Dagnino, T. MY, W. C. Kwok, V. Groenhuis, F. J. Siepel, C. Riga, S. Stramigioli, et al.: Toward a versatile robotic platform for fluoroscopy and MRI-guided endovascular interventions: A pre-clinical study. In: IROS (2019)
[5] G.-B. Bian, X.-L. Xie, Z.-Q. Feng, Z.-G. Hou, P. Wei, L. Cheng, and M. Tan.: An enhanced dual-finger robotic hand for catheter manipulating in vascular intervention: A preliminary study. In: ICIA (2013)
[6] W. Chi, G. Dagnino, T. Kwok, A. Nguyen, D. Kundrat, E. M. K. Abdelaziz, C. Riga, C. Bicknell, and G.-Z. Yang.: Collaborative robot-assisted endovascular catheterization with generative adversarial imitation learning. In: ICRA (2020)
[7] Y. Zhao, S. Guo, Y. Wang, J. Cui, Y. Ma, Y. Zeng, X. Liu, Y. Jiang, Y. Li, L. Shi, et al.: A CNN-based prototype method of unstructured surgical state perception and navigation for an endovascular surgery robot. In: Medical & Biological Engineering & Computing (2019)
[8] M. Benavente Molinero, G. Dagnino, J. Liu, W. Chi, M. Abdelaziz, T. Kwok, C. Riga, and G. Yang.: Haptic Guidance for Robot-Assisted Endovascular Procedures: Implementation and Evaluation on Surgical Simulator. In: IROS (2019)
[9] K. O’Shea and R. Nash.: An Introduction to Convolutional Neural Networks, (2015), arXiv:1511.08458. cs.NE.
[10] S. Jadon.: A survey of loss functions for semantic segmentation. In: 2020 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) Oct. (2020)
[11] Shuai Guo, Songyuan Tang, Jianjun Zhu, Jingfan Fan, Danni Ai, Hong Song, Ping Liang, Jian Yang.: Improved U-Net for Guidewire Tip Segmentation in X-ray Fluoroscopy Images. In: ICAIP ’19: Proceedings of the 2019 3rd International Conference on Advances in Image Processing (2019)
[12] S. Borse, Y. Wang, Y. Zhang, and F. Porikli.: InverseForm: A Loss Function for Structured Boundary-Aware Segmentation. arXiv:2104.02745 [cs.CV] (2021)
[13] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby.: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv:2010.11929 [cs.CV] (2021)
[14] Q. Huang, Y. Zhou, and L. Tao.: Dual-Term Loss Function For Shape-Aware Medical Image Segmentation. In: 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), pp. 1798-1802 (2021), doi: 10.1109/ISBI48211.2021.9433775.
[15] Anh Nguyen, Dennis Kundrat, Giulio Dagnino, et al.: End-to-end real-time catheter segmentation with optical flow-guided warping during endovascular intervention. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 9967–9973 (2020), doi: 10.1109/ICRA40945.2020.9197307.
[16] Gherardini M, Mazomenos E, Menciassi A, Stoyanov D.: Catheter segmentation in X-ray fluoroscopy using synthetic data and transfer learning with light U-nets. In: Computer Methods and Programs in Biomedicine, 192:105420 (2020), doi: 10.1016/j.cmpb.2020.105420.
[17] W. Wang, Q. Li, C. Xiao, D. Zhang, L. Miao, and L. Wang.: An Improved Boundary-Aware U-Net for Ore Image Semantic Segmentation. In: Sensors, vol. 21, no. 8, article no. 2615, (2021) DOI: 10.3390/s21082615.
[18] Pierre Ambrosini, Daniel Ruijters, Wiro J. Niessen, Adriaan Moelker, Theo van Walsum.: Fully Automatic and Real-Time Catheter Segmentation in X-Ray Fluoroscopy. arXiv preprint arXiv:1707.05137 (2017)
[19] Yi-de Ma, Qing Liu, and Zhi-Bai Qian.: Automated image segmentation using improved PCNN model based on cross-entropy. In: Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, pages 743–746. IEEE (2004)
[20] Saining Xie and Zhuowen Tu.: Holistically-nested edge detection. In: Proceedings of the IEEE International Conference on Computer Vision, pages 1395–1403 (2015)
[21] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollar. : Focal loss for dense object detection. arXiv preprint arXiv:1708.02002 (2017)
[22] Vasyl Pihur, Susmita Datta, and Somnath Datta. : Weighted rank aggregation of cluster validation measures: a monte carlo cross-entropy approach. In: Bioinformatics, 23(13):1607–1615 (2007)
[23] Jianu, Tudor, Baoru Huang, Mohamed EMK Abdelaziz, Minh Nhat Vu, Sebastiano Fichera, Chun-Yi Lee, Pierre Berthet-Rayne, and Anh Nguyen. Cathsim: An open-source simulator for autonomous cannulation. arXiv preprint arXiv:2208.01455 (2022).
[24] Carole H Sudre, Wenqi Li, Tom Vercauteren, Sebastien Ourselin, and M Jorge Cardoso. : Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. In: Deep learning in medical image analysis and multimodal learning for clinical decision support, pages 240–248. Springer (2017)
[25] Huang, Baoru, Yicheng Hu, Anh Nguyen, Stamatia Giannarou, and Daniel S. Elson. Detecting the Sensing Area of a Laparoscopic Probe in Minimally Invasive Cancer Surgery. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 260-270. Cham: Springer Nature Switzerland, 2023.
[26] Seyed Sadegh Mohseni Salehi, Deniz Erdogmus, and Ali Gholipour. : Tversky loss function for image segmentation using 3D fully convolutional deep networks. In: International Workshop on Machine Learning in Medical Imaging, pages 379–387. Springer (2017)
[27] Nabila Abraham and Naimul Mefraz Khan. : A novel focal Tversky loss function with improved attention U-Net for lesion segmentation. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pages 683–687. IEEE (2019)
[28] Huang, Baoru, Jian-Qing Zheng, Anh Nguyen, Chi Xu, Ioannis Gkouzionis, Kunal Vyas, David Tuch, Stamatia Giannarou, and Daniel S. Elson. Self-supervised depth estimation in laparoscopic image using 3D geometric consistency. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 13-22. Cham: Springer Nature Switzerland, 2022.
[29] Zeeshan Hayder, Xuming He, and Mathieu Salzmann. : Shape-aware instance segmentation. arXiv preprint arXiv:1612.03129, 2(5):7 (2016)
[30] Davood Karimi and Septimiu E. Salcudean. : Reducing the Hausdorff distance in medical image segmentation with convolutional neural networks. In: IEEE Transactions on Medical Imaging, 39(2):499–513 (2019)
[31] Saeid Asgari Taghanaki, Yefeng Zheng, S Kevin Zhou, Bogdan Georgescu, Puneet Sharma, Daguang Xu, Dorin Comaniciu, and Ghassan Hamarneh. : Combo loss: Handling input and output imbalance in multi-organ segmentation. In: Computerized Medical Imaging and Graphics, 75:24–33 (2019)
[32] Ken CL Wong, Mehdi Moradi, Hui Tang, and Tanveer Syeda-Mahmood. : 3D segmentation with exponential logarithmic loss for highly unbalanced object sizes. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 612–619. Springer (2018)
[33] O. Ronneberger, P. Fischer, and T. Brox.: U-Net: Convolutional Networks for Biomedical Image Segmentation. In: CoRR (2015)
[34] Tran, Minh Q., Tuong Do, Huy Tran, Erman Tjiputra, Quang D. Tran, and Anh Nguyen. Light-weight deformable registration using adversarial learning with distilling knowledge. IEEE transactions on medical imaging 41, no. 6 (2022): 1443-1453.
[35] J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, and Y. Zhou.: TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. In: CoRR (2021)
[36] Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, and J. Liang.: UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In: arXiv:cs.CV (2018)
[37] H. Huang, L. Lin, R. Tong, H. Hu, Q. Zhang, Y. Iwamoto, X. Han, Y.-W. Chen, and J. Wu.: UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation. In: arXiv:eess.IV (2020)
[38] E. K. Aghdam, R. Azad, M. Zarvani, and D. Merhof.: Attention Swin U-Net: Cross-Contextual Attention Mechanism for Skin Lesion Segmentation. In: arXiv:eess.IV (2022)
[39] Ghosh, Rahul, et al.: Automated catheter segmentation and tip detection in cerebral angiography with topology-aware geometric deep learning. In: Journal of NeuroInterventional Surgery (2023)