1 University of Moratuwa, Moratuwa, Sri Lanka
2 University of Peradeniya, Peradeniya, Sri Lanka
3 University of Colombo, Colombo, Sri Lanka
4 Zone24x7 Inc., USA

LiverUSRecon: Automatic 3D Reconstruction and Volumetry of the Liver with a Few Partial Ultrasound Scans

Kaushalya Sivayogaraj 1, Sahan T. Guruge 2, Udari Liyanage 2, Jeevani Udupihille 3, Saroj Jayasinghe 2, Gerard Fernando 4, Ranga Rodrigo 1, M. Rukshani Liyanaarachchi 1
Abstract

3D reconstruction of the liver for volumetry is important for qualitative analysis and disease diagnosis. Liver volumetry using ultrasound (US) scans, although advantageous due to shorter acquisition time and safety, is challenging due to the inherent noisiness of US scans, blurry boundaries, and partial liver visibility. We address these challenges by using the segmentation masks of a few incomplete sagittal-plane US scans of the liver in conjunction with a statistical shape model (SSM) built using a set of CT scans of the liver. We compute the shape parameters needed to warp this canonical SSM to fit the US scans through a parametric regression network. The resulting 3D liver reconstruction is accurate and leads to automatic liver volume calculation. We evaluate the accuracy of the estimated liver volumes with respect to CT segmentation volumes using RMSE. Our volume computation is statistically much closer to the volume estimated using CT scans than the volume computed using Childs’ method by radiologists: a p-value of $0.094\;(>0.05)$ indicates that there is no significant difference between CT segmentation volumes and ours, in contrast to Childs’ method. We validate our method using investigations (ablation studies) on the US image resolution, the number of CT scans used for the SSM, the number of principal components, and the number of input US scans. To the best of our knowledge, this is the first automatic liver volumetry system using a few incomplete US scans given a set of CT scans of livers for the SSM.

Keywords:
Liver volumetry · Ultrasound (US) · TransUNet · 3D reconstruction · Statistical Shape Modeling (SSM)

1 Introduction

3D reconstruction of the liver for volume measurement and 3D visual shape analysis using an accessible medical imaging modality like ultrasound (US) imaging is important. It helps clinicians to analyse subject-specific liver morphology and accurately estimate liver volume in real time. 3D reconstruction from segmentation of 3D scans (slice-based 2D image stacks) such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) scans, although still demanding, is generally straightforward [13, 20, 11, 9, 7, 2]. However, the well-known disadvantages of MRI and CT modalities (long acquisition time, cost, and the use of ionizing radiation in CT) make 3D reconstruction using US images attractive.

3D reconstruction of organs using a few 2D US scans acquired at various angles in different planes is possible [14, 16]. However, this technique requires full view of the organ in the scan and uses several input image slices. More crucially, it requires image acquisition location information (pose of the probe) which is difficult to annotate when performing a clinical scan. If liver volume calculation is the only requirement, extracting measurements and training a regression model can lead to an estimate of the volumes from left lobe and right lobe of the liver (called Childs’ method by radiologists [3]). However, measuring lengths from low contrast and noisy US images is subjective, time-consuming, and prone to inter-observer variability; and visual shape analysis is not possible. Moreover, US scans usually do not have the full view of the liver in one image. Thus, 3D reconstruction of the liver using several partial US scans is useful, and current methods are still unable to do so.

3D reconstruction using a few slices where the organ of interest is full in view is not novel. We can examine CT and MRI liver scans as 3D volumes for a qualitative understanding and volumetry using tools such as 3D Slicer [6] and ITK-SNAP [18]. Reconstructing with CT slices of the left ventricle of the heart has been partially explored by Yuan et al. [17] given an atlas (left ventricle of the heart) and full visibility in CT slices. However, 3D reconstruction of the liver—a large organ in the human body with a complex 3D structure—is challenging due to the partial visibility of the liver resulting from the limited field of view of the US probe, noisiness, and artefacts.

If the reconstruction must go beyond visualizing CT or MRI volumes, 3D reconstruction from slices is important. There have been approaches that use one or a few 2D slices for 3D reconstruction, such as Instantiation-Net [15] for the right ventricle in MRI, liver reconstruction from an X-ray image by Tong et al. [12], and cardiac 3D reconstruction [1] for MRI. Yuan et al. [17] use a few 2D CT slices and combine segmentation and 3D reconstruction to reconstruct the left ventricle using a Statistical Shape Model (SSM). Tong et al. [12] also use an SSM. However, all these methods use X-ray, CT, or MRI images, and the reconstruction is less challenging than with US due to the high contrast and well-defined boundaries. To the best of our knowledge, there is no method for reconstructing a large organ like the liver using a few US slices without 3D probe coordinates.

In this paper, we create the 3D reconstruction of the liver using just three sagittal plane US slices where the liver is only partially visible with the aid of an SSM. We create the SSM using a population of liver meshes obtained from CT segmentations. The SSM extracts meaningful information and captures the underlying shape variation within the liver population and provides the mean liver model and principal components. Using just three slices is advantageous due to the ability to quickly acquire them. A deep network segments the three slices and a Multi-Layer Perceptron (MLP) regressor generates the shape parameters which, in turn, warp the SSM to create a patient-specific 3D reconstruction of the liver. This enables us to accurately estimate the patient-specific liver volume. Our volume estimates are more accurate, i.e., statistically closer to the ground truth (radiologist-segmented CT liver volumes) than the volumes estimated by radiologists using the Childs’ method. To the best of our knowledge, this is the first automated deep learning method that calculates the liver volume from three incomplete 2D US scans. Further, we introduce a new US liver database with parallel, annotated CT scans comprising 134 scans. Our contributions are

  • 3D liver reconstruction and volume estimation using three US scans acquired from the mid-line, mid-clavicular line, and anterior axillary line of the sagittal plane, where the liver is partially visible,

  • a database of paired US scans and radiologist-annotated CT scans that comprises 134 such scans, and

  • surpassing the volume computation accuracy obtained by radiologists using the Childs’ method on US images.

Our contributions open up an avenue to use less-expensive, noisy, partial US scans of organs for 3D reconstruction and volumetry. This, in our opinion, will make accurate scan-based volume estimation commonplace for better diagnosis.

2 Methodology

The aim of our framework is to accurately reconstruct a 3D model of the organ that matches noisy, possibly partial US scans, as few as two or three (we say “three” in the subsequent discussion for brevity). The resulting 3D model is useful for visualization and volumetry in the clinic. There are three main modules in our 3D reconstruction framework: Statistical Shape Model (SSM) creation, US segmentation, and the 3D reconstruction itself. The SSM module takes a set of manually segmented 3D CT scans of the same organ from multiple subjects and, after a registration step, produces the mean mesh and principal components. The segmentation module uses TransUNet [2] to segment the three US images and generates binary masks, which guide the final 3D reconstruction module. The 3D reconstruction module is a parametric regression model that warps the average 3D model to match the segmented US images. The average model is the mean of the aligned meshes and has the same number of vertices and faces as the other organ models generated from 3D CT segmentation. The final result is a 3D model of the organ that matches the three US scans.

Fig. 1 describes this framework, which calculates the liver volume from the reconstructed liver model. The SSM, built from a 3D liver model atlas generated from manually segmented 3D CT scans, makes the 3D reconstruction of the liver possible. Principal Component Analysis (PCA) constructs the parameter space from the generated liver atlas. Raw US slices and their masks train the TransUNet [2] segmentation network to generate liver masks of the three input US slices. The masks and their shape parameters are the inputs that train the parametric regression MLP. At test time, this MLP generates the shape parameters used to reconstruct the 3D liver model by warping the SSM. Finally, we calculate the liver volume from the 3D liver mesh.

[Figure 1 diagram: US slices → TransUNet segmenter → masks → stacking (SS) → parametric regression MLP → shape parameters $\{\alpha_k\}$ → SSM ($\bar{s}$, $\{v_k\}$) → 3D reconstruction → volume ($\mathrm{cm}^3$)]
Figure 1: The proposed framework: binary masks of the three US slices generate the shape parameters through the parametric regression MLP. These warp the SSM to generate the 3D liver reconstruction.

Statistical Shape Model (SSM): The purpose of this model is to produce the 3D liver model that matches the three US scans of a liver. An SSM describes a set of semantically similar objects (3D liver models in our case) using a small set of parameters. It is a fundamental technique in vision and medical image processing, invaluable in semantic segmentation and 3D reconstruction [8]. An SSM contains the mean shape of the dataset. This mean shape, combined with the principal components that represent the key variations, forms the backbone of the SSM.

We carry out the SSM process introduced in [19]. We use a set of $N$ 3D liver meshes generated from CT liver segmentations done by radiologists as the input population. Each liver model has a different number of vertices and faces. As the first step, we carry out a non-rigid registration to fit each 3D liver mesh to the first 3D liver mesh (the reference model) to obtain 3D models with the same topology (so that every mesh has the same number of vertices and faces). Then, we align the fitted meshes rigidly to remove translational and rotational variations. As the final step, we perform PCA on the generated liver atlas $S = \{S_1, S_2, \cdots, S_N\}$, $S_i \in \mathbb{R}^{3 \times M}$, $i \in \{1, \cdots, N\}$, where $M$ is the number of vertices in the reference model, to create the principal components. Each liver shape $S_i$ is mapped to a vector ${s_i}^T \in \mathbb{R}^{3M}$. Let $S_{\mathrm{map}} = [s_1, s_2, \cdots, s_N]^T \in \mathbb{R}^{N \times 3M}$ and let the mean mesh be

$$\bar{s} = \frac{1}{N} \sum_{i=1}^{N} s_i \qquad (1)$$

Using Singular Value Decomposition (SVD), $S_{\mathrm{map}} = U \Sigma V^T$, we can represent each liver model using singular vectors $v_k$ (each column of $V$, $V \in \mathbb{R}^{3M \times K}$) as

$$L = \mathrm{reshape}\left(\bar{s} + \sum_{k=1}^{K} v_k \alpha_k\right) \qquad (2)$$

where $L \in \mathbb{R}^{3 \times M}$ is a reconstructed liver model. The shape parameters $\alpha_k$ are the variables that represent the liver parametric model. We choose $K = 50$ components. In our system, the combination of the segmentation network and the parametric regression MLP predicts the shape parameters. We train this network using the three US segmentation masks and the ground truth shape parameters. As a result, given a number of US images and CT liver models, and the trained segmentation network and parametric regression MLP, we can generate the 3D liver model that matches the masks obtained from the three US scans.
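Equations (1) and (2) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names and the synthetic random "meshes" are hypothetical, registration and rigid alignment are assumed to be done already, and subtracting $\bar{s}$ before the SVD (so that the singular vectors describe variation about the mean) is an assumption made here for the illustration.

```python
import numpy as np


def build_ssm(shapes, num_components):
    """Return the mean shape and K principal directions.

    shapes: array of shape (N, 3, M) holding N registered meshes
    that share one topology (M vertices each).
    """
    n = shapes.shape[0]
    s_map = shapes.reshape(n, -1)            # rows are s_i^T in R^{3M}
    s_bar = s_map.mean(axis=0)               # Eq. (1)
    # Assumption: centre the data before the SVD.
    _, _, vt = np.linalg.svd(s_map - s_bar, full_matrices=False)
    v = vt[:num_components].T                # V in R^{3M x K}
    return s_bar, v


def shape_parameters(shape, s_bar, v):
    """Project one (3, M) shape onto the components to obtain alpha."""
    return v.T @ (shape.reshape(-1) - s_bar)


def reconstruct(alpha, s_bar, v, num_vertices):
    """Eq. (2): warp the mean shape with the shape parameters."""
    return (s_bar + v @ alpha).reshape(3, num_vertices)


# Toy atlas: N = 6 random "meshes" with M = 10 vertices each.
rng = np.random.default_rng(0)
atlas = rng.normal(size=(6, 3, 10))
s_bar, v = build_ssm(atlas, num_components=5)
alpha = shape_parameters(atlas[0], s_bar, v)
recon = reconstruct(alpha, s_bar, v, num_vertices=10)
print(alpha.shape, recon.shape)
```

With six toy shapes, five components span all variation about the mean, so a training shape is recovered up to floating-point error; with real data and $K = 50$ the reconstruction is an approximation.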

Segmentation Model for US Liver Segmentation: In our 3D liver reconstruction, the binary masks that result from the segmentation of the three US images guide the final 3D reconstruction. We use TransUNet [2], built on ResNet50 and ViT (trained on ImageNet [5]) and fine-tuned on the Synapse multi-organ segmentation dataset and the automated cardiac diagnosis challenge dataset [2]. We fine-tune it further using our US liver segmentation dataset, which comprises three US images from each of 134 patients, segmented by radiologists. We augment the dataset using random operations (rotation, translation, flipping, and cropping) when fine-tuning TransUNet. This step reduces overfitting and improves generalization. We do not alter any other hyper-parameters of TransUNet. The ViT-based TransUNet is important for the segmentation as the partial views of the livers in our US images benefit from the long-range attention available in ViT. In particular, TransUNet adopts a hybrid architecture that fuses the strengths of both CNNs and transformers: it combines the fine-grained, high-resolution spatial information inherent in CNN features with the broader global context captured by the transformers. To establish this point, we also evaluated the standard U-Net [13] and variants of U-Net on this segmentation task for comparison with TransUNet (Table 1). In summary, our TransUNet-based segmenter accurately segments the noisy, partial US scans of the liver. We feed the segmentation masks to the 3D reconstruction model.
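The key constraint in augmenting segmentation data is that the image and its mask receive the same random transform. The sketch below shows this for two of the four operations named above (flipping and translation); rotation and cropping are omitted for brevity. The function names, the flip probability, and the shift range are hypothetical and are not taken from the paper.

```python
import numpy as np


def shift(arr, dy, dx):
    """Translate a 2D array by (dy, dx) pixels, padding with zeros."""
    out = np.zeros_like(arr)
    h, w = arr.shape
    src = arr[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    y0, x0 = max(0, dy), max(0, dx)
    out[y0:y0 + src.shape[0], x0:x0 + src.shape[1]] = src
    return out


def augment_pair(image, mask, rng, max_shift=10):
    """Apply one random flip and translation identically to image and mask."""
    if rng.random() < 0.5:                       # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return shift(image, dy, dx), shift(mask, dy, dx)


# Toy example: the "image" is bright exactly where the mask is 1.
rng = np.random.default_rng(0)
mask = np.zeros((64, 64), dtype=np.uint8)
mask[20:40, 15:45] = 1
image = mask * 100.0
aug_image, aug_mask = augment_pair(image, mask, rng)
print(aug_image.shape, int(aug_mask.sum()))
```

Because both arrays go through the same flip and shift, the augmented mask still outlines the same pixels of the augmented image.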

3D Reconstruction Model for Liver Model Reconstruction: Generating the shape parameters ($\alpha_k$) from the segmented masks described above is the next step following the segmentation. We approach 3D reconstruction of the liver from multiple sagittal-plane US views by predicting the model parameters directly from the slice masks. The 3D reconstruction model uses these masks to generate the shape parameters required for 3D liver model reconstruction. The system uses the shape parameters ($\alpha_k$), the average liver model ($\bar{s}$), the normalized principal components ($v_k$), and the normalization parameters ($E(v_k)$ and $\mathrm{std}(v_k)$) to generate the 3D liver model. To achieve this, we employ a parametric regression MLP that receives the stack of three US slice masks as its input. So, $\alpha = \mathrm{regression~network}(\mathrm{US~binary~masks})$, where $\alpha = \{\alpha_1, \dots, \alpha_k, \dots, \alpha_K\}$. Our parametric regression MLP has two layers.
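The mapping from stacked masks to shape parameters can be sketched as a two-layer MLP forward pass in NumPy. Only the stacked three-mask input, the two layers, and the $K = 50$ outputs come from the text; the hidden width, the ReLU activation, the random (untrained) weights, and all names are assumptions for illustration, and training against ground truth shape parameters is not shown.

```python
import numpy as np


def init_mlp(in_dim, hidden_dim, out_dim, rng):
    """Randomly initialised two-layer MLP (hidden_dim is an assumed value)."""
    return {
        "w1": rng.normal(0.0, np.sqrt(2.0 / in_dim), (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, np.sqrt(1.0 / hidden_dim), (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def predict_alpha(masks, params):
    """Map a (3, H, W) stack of binary masks to K shape parameters."""
    x = masks.reshape(-1).astype(np.float64)      # flatten the stack
    hidden = np.maximum(x @ params["w1"] + params["b1"], 0.0)  # ReLU (assumed)
    return hidden @ params["w2"] + params["b2"]


rng = np.random.default_rng(0)
masks = (rng.random((3, 192, 192)) > 0.5).astype(np.uint8)  # stand-in masks
params = init_mlp(in_dim=3 * 192 * 192, hidden_dim=16, out_dim=50, rng=rng)
alpha = predict_alpha(masks, params)
print(alpha.shape)
```

The predicted vector plays the role of $\alpha$ in Eq. (2): combined with $\bar{s}$ and the $v_k$, it yields the patient-specific mesh.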

Liver Volume Calculation: Following the 3D reconstruction, we estimate the liver volume. We save the 3D reconstruction as an obj file and estimate its volume in $\mathrm{cm}^3$ using trimesh [4]. We have verified the accuracy of trimesh by comparing its volume against the volume computed by 3D Slicer [6]. Our dataset includes the liver volumes estimated by radiologists (1) using the CT segmentations and (2) using Childs’ method on US slices. We compare the volume computed by the proposed method with these two estimates and analyze the differences statistically.
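For readers unfamiliar with how a volume is obtained from a closed triangle mesh, the sketch below computes it by summing signed tetrahedron volumes in plain NumPy. This is a standard technique shown as a stand-in, not a reproduction of the trimesh code path used in the paper. The cube test mesh and the assumption that vertex coordinates are in millimetres (hence the division by 1000 to reach $\mathrm{cm}^3$) are illustrative.

```python
import numpy as np


def mesh_volume(vertices, faces):
    """Volume of a closed, consistently oriented triangle mesh.

    Each triangle (a, b, c) forms a tetrahedron with the origin whose
    signed volume is a . (b x c) / 6; the signed volumes sum to the
    enclosed volume.
    """
    a = vertices[faces[:, 0]]
    b = vertices[faces[:, 1]]
    c = vertices[faces[:, 2]]
    signed = np.einsum("ij,ij->i", a, np.cross(b, c)) / 6.0
    return abs(signed.sum())


# Test mesh: a cube with 10 mm sides, triangles wound to face outwards.
cube_vertices = 10.0 * np.array([
    [0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],
    [0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1],
], dtype=float)
cube_faces = np.array([
    [0, 2, 1], [0, 3, 2],   # bottom
    [4, 5, 6], [4, 6, 7],   # top
    [0, 1, 5], [0, 5, 4],   # front
    [3, 7, 6], [3, 6, 2],   # back
    [0, 4, 7], [0, 7, 3],   # left
    [1, 2, 6], [1, 6, 5],   # right
])

volume_mm3 = mesh_volume(cube_vertices, cube_faces)
volume_cm3 = volume_mm3 / 1000.0    # assumes coordinates are in mm
print(volume_mm3, volume_cm3)
```

The result is only meaningful for a watertight mesh with consistent face winding, which is why a reconstruction that keeps the reference topology is convenient for volumetry.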

3 Experiments and Results

Data Acquisition: We obtained US scans (three per patient) and corresponding CT scans (for the SSM and for the 3D reconstruction comparison) of 134 healthy patients. (We plan to make this liver dataset of three annotated US slices and liver-annotated CT volumes available for the benefit of the community.) We captured the three US slices ($1470 \times 2316$) at the mid-line, mid-clavicular line, and anterior axillary line of the sagittal plane. An experienced radiologist segmented the liver in the US images using ITK-SNAP [18], to be used for training, and in the relevant slices of abdominal CT using 3D Slicer [6], to be used for the SSM and volume comparison (considered as ground truth). We stacked the segmented 2D CT slices to reconstruct the 3D liver mesh used in the SSM and for comparison. We resized the US images to $192 \times 192$ or $384 \times 384$ (for ablation). Out of the 134 subjects, we allocated 99 for training and 35 for testing.

US Liver Segmentation Results: We compared FCN [10], UNet [13], UNet++ [20] with an EfficientNet-B7 encoder, and TransUNet [2] for US liver segmentation (Table 1). TransUNet achieved the best Accuracy (Acc.), Dice Score Coefficient (DSC), Intersection over Union (IoU), and Hausdorff distance (HD) on unseen data. This is because TransUNet uses transformers to encode tokenized image patches from a CNN feature map, so the input sequence captures global context [2]. We used UNet as the decoder to decode the hidden features and generate the final segmentation masks. The 2D liver predictions overlap well with the ground truth liver labels, which, in turn, leads to an accurate liver volume calculation. To the best of our knowledge, ours is the first method that uses a transformer network in US liver segmentation. Following this result, we used TransUNet for all other experiments.
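As a reminder of how the overlap metrics are defined, the sketch below computes pixel accuracy, DSC, and IoU for a pair of binary masks using their standard definitions. The function name, the handling of two empty masks, and the toy masks are assumptions; the paper's own evaluation code is not shown in the text.

```python
import numpy as np


def segmentation_scores(pred, target):
    """Pixel accuracy, Dice score, and IoU for two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Convention assumed here: two empty masks count as a perfect match.
    dice = 2.0 * inter / total if total > 0 else 1.0
    iou = inter / union if union > 0 else 1.0
    acc = float((pred == target).mean())
    return {"acc": acc, "dsc": float(dice), "iou": float(iou)}


# Toy masks: two 4 x 4 squares offset by one row inside an 8 x 8 image.
target = np.zeros((8, 8), dtype=np.uint8)
target[2:6, 2:6] = 1
pred = np.zeros((8, 8), dtype=np.uint8)
pred[3:7, 2:6] = 1
scores = segmentation_scores(pred, target)
print(scores)
```

In this toy case the accuracy is higher than the Dice score and IoU because the correctly labelled background dominates the pixel count. The same effect may explain why accuracy in Table 1 stays above 93% even where DSC and IoU are low.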

Figure 2: US segmentation and 3D reconstruction results: Three input US sagittal plane images, corresponding segmentations, and 3D liver reconstructions using the shape parameters for three subjects.
Table 1: Segmentation accuracy: TransUNet performs best and, hence, was selected for subsequent experiments. * denotes the use of EfficientNet-B7 as the encoder. 3D reconstruction accuracy: CD and MSD are lower when we combine the Param. Regress. MLP with TransUNet than with UNet.

Segmentation      FCN     UNet    UNet++*   TransUNet
Acc. (%) ↑        93.2    95.4    94.4      97.5
DSC (%) ↑         38.5    65.6    68.1      91.3
HD (mm) ↓         5.5     4.8     4.5       3.6
IoU (%) ↑         24.1    50.2    52.7      84.4

Recon. accuracy   TransUNet + Recon.   UNet + Recon.
MSD (mm) ↓        6.6                  6.8
CD (mm) ↓         12.8                 13.1
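Chamfer Distance (CD) and Mean Surface Distance (MSD) both build on nearest-neighbour distances between two surfaces. The sketch below uses one common convention on vertex sets: CD as the sum of the two directed mean distances, and MSD as the mean over all directed distances. The exact definitions used for Table 1 are not stated in the text, so these formulas, the function names, and the vertex-based approximation of the surface are assumptions.

```python
import numpy as np


def nearest_distances(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def chamfer_distance(a, b):
    """Sum of the two directed mean nearest-neighbour distances."""
    return float(nearest_distances(a, b).mean() + nearest_distances(b, a).mean())


def mean_surface_distance(a, b):
    """Mean of all directed nearest-neighbour distances, pooled over both sets."""
    d_ab = nearest_distances(a, b)
    d_ba = nearest_distances(b, a)
    return float((d_ab.sum() + d_ba.sum()) / (len(a) + len(b)))


# Toy example: the second point set is the first shifted by 1 mm along x.
points_a = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
points_b = points_a + np.array([1.0, 0.0, 0.0])
cd = chamfer_distance(points_a, points_b)
msd = mean_surface_distance(points_a, points_b)
print(cd, msd)
```

The pairwise distance matrix costs memory proportional to the product of the two vertex counts, so dense meshes would need a spatial index or batching.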
Figure 3: Two visualizations (front and back) of the reconstruction accuracy for three livers: the ground truth (yellow; liver models generated from CT segmentation) overlaps well with our results (green).
[Figure 4 box plot: liver volume ($\mathrm{cm}^3$), axis ticks from 600 to 1,800, for three methods: Ours, CT, and Childs’.]
Figure 4: Box plot of liver volumes calculated from Childs’ method, CT segmentation, and the proposed method: Childs’ method has outliers, whereas the proposed method has none, and its liver volume distribution falls within the liver volume distribution of the CT segmentation.
Table 2: Main result, statistical analysis: the RMSE of the volumes estimated by our method is lower. The paired $t$-test shows that there is no significant difference in volumes between CT and our method ($p > 0.05$). Our method is statistically more accurate. $\mu$: mean difference, SEM: Standard Error of the Mean.

Vol. compar.    RMSE    Pair   $\mu$     std.    SEM    95% CI of $\mu$ diff. (lower, upper)   $t$    df   Sig. (2-tailed)
CT & Childs’    306.9   1      -201.5    234.8   39.7   (-282.1, -120.8)                       -5.1   34   .000
CT & Ours       275.8   2      78.1      268.4   45.4   (-14.1, 170.3)                         1.7    34   .094

3D Reconstruction Results: We send the liver masks obtained from the above segmentation process to the 3D reconstruction model, which generates the shape parameters used to reconstruct the 3D liver shape model. As the problem at hand is a slice-mask-based shape reconstruction, we use the reconstruction method of Yuan et al. [17]. We use Chamfer Distance (CD) and Mean Surface Distance (MSD) to compare the generated 3D reconstructions with the ground truth 3D liver models on test data (Table 1). The combination of TransUNet and the parametric regression MLP obtained lower CD and MSD than UNet with the same setup. Front and back views of the 3D reconstructed liver models are in Fig. 3: our system generates liver models that retain their complex shape, and the predicted liver models overlap closely with the ground truth. We calculated the Root Mean Square Error (RMSE) of the volumes obtained by radiologists using Childs’ method and of the volumes from the proposed method with respect to the ground truth CT volumes; our liver volumes are closer to the ground truth, as shown in Table 2. The box plot in Fig. 4 shows the descriptive statistics of each liver volume calculation method. We performed a patient-wise paired-sample $t$-test, as shown in Table 2. We can conclude that there is a significant difference between the liver volumes calculated from CT segmentation ($\mathrm{mean} = 1162.4$, $\mathrm{std.} = 275.7$) and by Childs’ method ($\mathrm{mean} = 960.9$, $\mathrm{std.} = 257.9$); $t(34) = -5.1$, $p = .000$. In contrast, there is no significant difference between the liver volumes calculated from CT segmentation ($\mathrm{mean} = 1162.4$, $\mathrm{std.} = 275.7$) and by our method ($\mathrm{mean} = 1240.6$, $\mathrm{std.} = 133.1$); $t(34) = 1.7$, $p = 0.094$.
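The entries of Table 2 are linked by simple formulas, which the sketch below uses to recompute SEM, $t$, the 95% confidence interval, and RMSE from the reported mean and standard deviation of the paired differences. It assumes that the reported std. is the sample standard deviation of the 35 per-patient differences and uses a tabulated critical value of about 2.0322 for 34 degrees of freedom; the function and variable names are hypothetical.

```python
import math

N_PATIENTS = 35
# Tabulated two-sided 95% critical value of Student's t with 34 degrees of freedom.
T_CRIT = 2.0322


def paired_summary(mean_diff, std_diff, n=N_PATIENTS, t_crit=T_CRIT):
    """Recover SEM, t, the 95% CI, and RMSE from paired-difference statistics."""
    sem = std_diff / math.sqrt(n)
    t_stat = mean_diff / sem
    ci = (mean_diff - t_crit * sem, mean_diff + t_crit * sem)
    # mean(d^2) = mean(d)^2 + (n - 1) / n * sample variance of d
    rmse = math.sqrt(mean_diff ** 2 + (n - 1) / n * std_diff ** 2)
    return {"sem": sem, "t": t_stat, "ci": ci, "rmse": rmse}


childs = paired_summary(mean_diff=-201.5, std_diff=234.8)
ours = paired_summary(mean_diff=78.1, std_diff=268.4)
for name, row in (("CT & Childs'", childs), ("CT & Ours", ours)):
    print(name,
          f"SEM={row['sem']:.1f}",
          f"t={row['t']:.1f}",
          f"CI=({row['ci'][0]:.1f}, {row['ci'][1]:.1f})",
          f"RMSE={row['rmse']:.1f}")
```

Under these assumptions the recomputed values agree with Table 2 to within rounding. The interval for our method contains zero while the interval for Childs’ method does not, which is the same information the two p-values convey.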

Ablation Study: Table 3 shows our ablation studies on the effect of the resolution of the three US slices, the no. of slices used to compute the shape parameters, the no. of principal components used for the SSM, and the no. of CT scans used for the SSM. We chose a resolution of $192 \times 192$, 3 slices, 50 principal components, and 100 CT scans for the SSM. These choices do not drastically affect the final results.

Table 3: Effect of the US resolution, the no. of US scans used, the no. of principal components used for the SSM, and the no. of CT scans used for the SSM. * indicates what we used for the final results. These choices do not affect the final results drastically.

US res.           RMSE
$192^2$ *         275.83
$384^2$           271.63

No. slices        RMSE
2                 281.65
3 *               275.83

No. comp.         RMSE
10                252.35
20                249.90
40                254.53
50 *              275.83
70                306.82

CT scans in SSM   RMSE
50 (1st)          291.70
50 (2nd)          285.52
60                279.96
80                275.18
100 *             275.83

Acknowledgements

K. Sivayogaraj acknowledges the support received from the Chancellor’s scholarship donated by Zone24x7 (Pvt.) Ltd. R. Rodrigo acknowledges the support received from the University of Moratuwa Senate Research Committee grant SRC/LT/2021/20.

References

  • [1] Chang, Q., Yan, Z., Zhou, M., Liu, D., Sawalha, K., Ye, M., Zhangli, Q., Kanski, M., Al’Aref, S., Axel, L., Metaxas, D.: Deeprecon: Joint 2D cardiac segmentation and 3D volume reconstruction via a structure-specific generative method. In: Medical Image Computing and Computer Assisted Intervention. pp. 567–577 (2022)
  • [2] Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L., Zhou, Y.: TransUNet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306 (2021)
  • [3] Childs, J., Esterman, A., Thoirs, K., Turner, R.: Ultrasound in the assessment of hepatomegaly: A simple technique to determine an enlarged liver using reliable and valid measurements. Sonography 3, 47–52 (2016)
  • [4] Dawson-Haggerty et al.: trimesh, https://trimesh.org/
  • [5] Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Li, F.F.: ImageNet: A large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 248–255 (2009)
  • [6] Fedorov, A., Beichel, R., Kalpathy-Cramer, J., Finet, J., Fillion-Robin, J.C., Pujol, S., Bauer, C., Jennings, D., Fennessy, F., Sonka, M., Buatti, J., Aylward, S., Miller, J., Pieper, S., Kikinis, R.: 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magnetic Resonance Imaging 30, 1323–1341 (2012)
  • [7] Fu, S., Lu, Y., Wang, Y., Zhou, Y., Shen, W., Fishman, E., Yuille, A.: Domain adaptive relational reasoning for 3D multi-organ segmentation. In: Medical Image Computing and Computer Assisted Intervention. pp. 656–666 (2020)
  • [8] Dryden, I.L., Mardia, K.V.: Statistical Shape Analysis: With Applications in R. Wiley, 2 edn. (2016)
  • [9] Li, X., Chen, H., Qi, X., Dou, Q., Fu, C.W., Heng, P.A.: H-DenseUNet: Hybrid densely connected UNet for liver and tumor segmentation from CT volumes. IEEE Transactions on Medical Imaging 37(12), 2663–2674 (2018)
  • [10] Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 3431–3440. Boston, MA (2015)
  • [11] Milletari, F., Navab, N., Ahmadi, S.A.: V-net: Fully convolutional neural networks for volumetric medical image segmentation. In: International Conference on 3D Vision (3DV). pp. 565–571 (2016)
  • [12] Nakao, M., Tong, F., Nakamura, M., Matsuda, T.: Image-to-graph convolutional network for deformable shape reconstruction from a single projection image. In: Medical Image Computing and Computer Assisted Intervention. pp. 259–268 (2021)
  • [13] Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer Assisted Intervention. pp. 234–241 (2015)
  • [14] Sawdayee, H., Vaxman, A., Bermano, A.H.: OReX: Object reconstruction from planar cross-sections using neural fields. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 20854–20862 (2023)
  • [15] Wang, Z.Y., Zhou, X.Y., Li, P., Theodoreli-Riga, C., Yang, G.Z.: Instantiation-Net: 3D mesh reconstruction from single 2D image for right ventricle. In: Medical Image Computing and Computer Assisted Intervention. pp. 680–691 (2020)
  • [16] Yeung, P.H., Hesse, L., Aliasi, M., Haak, M., Xie, W., Namburete, A.I., et al.: Implicitvol: Sensorless 3D ultrasound reconstruction with deep implicit representation. arXiv preprint arXiv:2109.12108 (2021)
  • [17] Yuan, X., Liu, C., Feng, F., Zhu, Y., Wang, Y.: Slice-mask based 3D cardiac shape reconstruction from CT volume. In: Asian Conference on Computer Vision. pp. 1909–1925 (2022)
  • [18] Yushkevich, P.A., Piven, J., Cody Hazlett, H., Gimpel Smith, R., Ho, S., Gee, J.C., Gerig, G.: User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. Neuroimage 31(3), 1116–1128 (2006)
  • [19] Zhang, J., Hislop-Jambrich, J., Besier, T.F.: Predictive statistical models of baseline variations in 3-D femoral cortex morphology. Medical Engineering & Physics 38(5), 450–457 (2016)
  • [20] Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: A nested U-Net architecture for medical image segmentation. In: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. pp. 3–11 (2018)