Joint semi-supervised and contrastive learning enables zero-shot domain-adaptation and multi-domain segmentation

Authors: Alvaro Gomariz, Yusuke Kikuchi, Yun Yvonna Li, Thomas Albrecht, Andreas Maunz, Daniela Ferrara, Huanxiang Lu, Orcun Goksel

Abstract: Despite their effectiveness, current deep learning models face challenges with images coming from different domains with varying appearance and content. We introduce SegCLR, a versatile framework designed to segment volumetric images across different domains, employing supervised and contrastive learning simultaneously to effectively learn from both labeled and unlabeled data. We demonstrate the superior performance of SegCLR through a comprehensive evaluation involving three diverse clinical datasets of retinal fluid segmentation in 3D Optical Coherence Tomography (OCT), various network configurations, and verification across 10 different network initializations. In an unsupervised domain adaptation context, SegCLR achieves results on par with a supervised upper-bound model trained on the intended target domain. Notably, we discover that the segmentation performance of the SegCLR framework is only marginally impacted by the abundance of unlabeled data from the target domain; we therefore also propose an effective zero-shot domain adaptation extension of SegCLR, eliminating the need for any target domain information. This shows that our proposed addition of contrastive loss in standard supervised training for segmentation leads to superior models, inherently more generalizable to both in- and out-of-domain test data. We additionally propose a pragmatic solution for SegCLR deployment in realistic scenarios with multiple domains containing labeled data. Accordingly, our framework pushes the boundaries of deep-learning based segmentation in multi-domain applications, regardless of data availability - labeled, unlabeled, or nonexistent.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2309.00453

Learning the Imaging Model of Speed-of-Sound Reconstruction via a Convolutional Formulation

Authors: Can Deniz Bezek, Maxim Haas, Richard Rau, Orcun Goksel

Abstract: Speed-of-sound (SoS) is an emerging ultrasound contrast modality, where pulse-echo techniques using conventional transducers offer multiple benefits. For estimating tissue SoS distributions, spatial domain reconstruction from relative speckle shifts between different beamforming sequences is a promising approach. This operates based on a forward model that relates the sought local values of SoS to observed speckle shifts, for which the associated image reconstruction inverse problem is solved. The reconstruction accuracy thus highly depends on the hand-crafted forward imaging model. In this work, we propose to learn the SoS imaging model based on data. We introduce a convolutional formulation of the pulse-echo SoS imaging problem such that the entire field-of-view requires a single unified kernel, the learning of which is then tractable and robust. We present least-squares estimation of such a convolutional kernel, which can further be constrained and regularized for numerical stability. In experiments, we show that a forward model learned from k-Wave simulations improves the median contrast of SoS reconstructions by 63%, compared to a conventional hand-crafted line-based wave-path model. This simulation-learned model generalizes successfully to acquired phantom data, nearly doubling the SoS contrast compared to the conventional hand-crafted alternative. We demonstrate equipment-specific and small-data regime feasibility by learning a forward model from a single phantom image, where our learned model quadruples the SoS contrast compared to the conventional hand-crafted model. On in-vivo data, the simulation- and phantom-learned models respectively exhibit impressive 7- and 10-fold contrast improvements over the conventional model.

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2303.11262

Robust Imaging of Speed-of-Sound Using Virtual Source Transmission

Authors: Dieter Schweizer, Richard Rau, Can Deniz Bezek, Rahel A. Kubik-Huch, Orcun Goksel

Abstract: Speed-of-sound (SoS) is a novel imaging biomarker for assessing biomechanical characteristics of soft tissues. SoS imaging in pulse-echo mode using conventional ultrasound systems with hand-held transducers has the potential to enable new clinical uses. Recent work demonstrated diverging waves from single-element (SE) transmits to outperform plane-wave sequences. However, single-element transmits have severely limited power and hence produce low signal-to-noise ratio (SNR) in echo data. We herein propose Walsh-Hadamard (WH) coded and virtual-source (VS) transmit sequences for improved SNR in SoS imaging. We additionally present an iterative method of estimating the beamforming SoS in the medium, errors in which would otherwise confound SoS reconstructions due to beamforming inaccuracies in the images used for reconstruction. Through numerical simulations, phantom experiments, and in-vivo imaging data, we show that WH is not robust against motion, which is often unavoidable in clinical imaging scenarios. Our proposed virtual-source sequence is shown to provide the highest SoS reconstruction performance, especially robust to motion artifacts. In phantom experiments, despite having a comparable SoS root-mean-square-error (RMSE) of 17.5 to 18.0 m/s at rest, with a minor axial probe motion of ~0.67 mm/s the RMSE for SE, WH, and VS already deteriorates to 20.2, 105.4, and 19.0 m/s, respectively, showing that WH produces unacceptable results and is not robust to motion. In the clinical data, the high SNR and motion-resilience of the VS sequence are seen to yield superior contrast compared to SE and WH sequences.

Submitted 20 March, 2023; originally announced March 2023.
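
Illustrative note: the abstract above does not spell out how the Walsh-Hadamard coding works, so the following is only a minimal sketch of the general idea of Hadamard-coded transmission, not the authors' implementation. It assumes a linear, stationary medium, so that each coded transmit (all elements firing with +1/-1 weights) yields the weighted sum of the single-element echoes. The function names and the toy echo values are hypothetical.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # H_2m = [[H_m, H_m], [H_m, -H_m]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def encode(h, single_element_echoes):
    """Simulate coded transmits: row k of h weights the elements of transmit k.

    Assumes linearity, so the coded echo is the weighted sum of the
    single-element echoes.
    """
    n = len(h)
    length = len(single_element_echoes[0])
    return [
        [sum(h[k][i] * single_element_echoes[i][t] for i in range(n))
         for t in range(length)]
        for k in range(n)
    ]


def decode(h, coded_echoes):
    """Recover single-element echoes using the identity H^T H = n I."""
    n = len(h)
    length = len(coded_echoes[0])
    return [
        [sum(h[k][i] * coded_echoes[k][t] for k in range(n)) / n
         for t in range(length)]
        for i in range(n)
    ]


# Hypothetical echo traces: 4 elements, 3 time samples each.
echoes = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0], [0.0, 1.0, 1.0], [2.0, 0.0, 0.5]]
H = hadamard(4)
recovered = decode(H, encode(H, echoes))
```

Decoding combines all coded acquisitions, so it relies on the medium being unchanged between transmits; this is consistent with the abstract's finding that WH is sensitive to motion.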

arXiv:2301.02933

Weakly Supervised Joint Whole-Slide Segmentation and Classification in Prostate Cancer

Authors: Pushpak Pati, Guillaume Jaume, Zeineb Ayadi, Kevin Thandiackal, Behzad Bozorgtabar, Maria Gabrani, Orcun Goksel

Abstract: The segmentation and automatic identification of histological regions of diagnostic interest offer a valuable aid to pathologists. However, segmentation methods are hampered by the difficulty of obtaining pixel-level annotations, which are tedious and expensive to obtain for Whole-Slide images (WSI). To remedy this, weakly supervised methods have been developed to exploit the annotations directly available at the image level. However, to our knowledge, none of these techniques is adapted to deal with WSIs. In this paper, we propose WholeSIGHT, a weakly-supervised method, to simultaneously segment and classify WSIs of arbitrary shapes and sizes. Formally, WholeSIGHT first constructs a tissue-graph representation of the WSI, where the nodes and edges depict tissue regions and their interactions, respectively. During training, a graph classification head classifies the WSI and produces node-level pseudo labels via post-hoc feature attribution. These pseudo labels are then used to train a node classification head for WSI segmentation. During testing, both heads simultaneously render class prediction and segmentation for an input WSI. We evaluated WholeSIGHT on three public prostate cancer WSI datasets. Our method achieved state-of-the-art weakly-supervised segmentation performance on all datasets while resulting in better or comparable classification with respect to state-of-the-art weakly-supervised WSI classification methods. Additionally, we quantify the generalization capability of our method in terms of segmentation and classification performance, uncertainty estimation, and model calibration.

Submitted 7 January, 2023; originally announced January 2023.

arXiv:2211.11553

Analytical Estimation of Beamforming Speed-of-Sound Using Transmission Geometry

Authors: Can Deniz Bezek, Orcun Goksel

Abstract: Most ultrasound imaging techniques necessitate the fundamental step of converting temporal signals received from transducer elements into a spatial echogenicity map. This beamforming (BF) step requires knowledge of the speed-of-sound (SoS) value in the imaged medium. An incorrect assumption of BF SoS leads to aberration artifacts, not only deteriorating the quality and resolution of conventional brightness mode (B-mode) images, hence limiting their clinical usability, but also impairing other ultrasound modalities such as elastography and spatial SoS reconstructions, which rely on faithfully beamformed images as their input. In this work, we propose an analytical method for estimating BF SoS. We show that pixel-wise relative shifts between frames beamformed with an assumed SoS are a function of geometric disparities of the transmission paths and the error in such SoS assumption. Using this relation, we devise an analytical model, the closed form solution of which yields the difference between the assumed and the true SoS in the medium. Based on this, we correct the BF SoS, which can also be applied iteratively. Both in simulations and experiments, lateral B-mode resolution is shown to be improved by $\approx$25% compared to that with an initial SoS assumption error of 3.3% (50 m/s), while localization artifacts from beamforming are also corrected. After 5 iterations, our method achieves BF SoS errors of under 0.6 m/s in simulations and under 1 m/s in experimental phantom data. Residual time-delay errors in beamforming 32 numerical phantoms are shown to reduce down to 0.07 $\mu$s, with average improvements of up to 21-fold compared to initial inaccurate assumptions. We additionally show the utility of the proposed method in imaging local SoS maps, where using our correction method reduces reconstruction root-mean-square errors substantially, down to their lower-bound with actual BF SoS.

Submitted 21 November, 2022; originally announced November 2022.

arXiv:2211.04238

HDRfeat: A Feature-Rich Network for High Dynamic Range Image Reconstruction

Authors: Lingkai Zhu, Fei Zhou, Bozhi Liu, Orcun Göksel

Abstract: A major challenge for high dynamic range (HDR) image reconstruction from multi-exposed low dynamic range (LDR) images, especially with dynamic scenes, is the extraction and merging of relevant contextual features in order to suppress any ghosting and blurring artifacts from moving objects. To tackle this, in this work we propose a novel network for HDR reconstruction with deep and rich feature extraction layers, including residual attention blocks with sequential channel and spatial attention. For the compression of the rich-features to the HDR domain, a residual feature distillation block (RFDB) based architecture is adopted. In contrast to earlier deep-learning methods for HDR, the above contributions shift focus from merging/compression to feature extraction, the added value of which we demonstrate with ablation experiments. We present qualitative and quantitative comparisons on a public benchmark dataset, showing that our proposed method outperforms the state-of-the-art.

Submitted 8 November, 2022; originally announced November 2022.

Comments: 4 pages, 5 figures

arXiv:2208.08377

Global Speed-of-Sound Prediction Using Transmission Geometry

Authors: Can Deniz Bezek, Mert Bilgin, Lin Zhang, Orcun Goksel

Abstract: Most ultrasound (US) imaging techniques use spatially-constant speed-of-sound (SoS) values for beamforming. Having a discrepancy between the actual and used SoS value leads to aberration artifacts, e.g., reducing the image resolution, which may affect diagnostic usability. Accuracy and quality of different US imaging modalities, such as tomographic reconstruction of local SoS maps, also depend on a good initial beamforming SoS. In this work, we develop an analytical method for estimating mean SoS in an imaged medium. We show that the relative shifts between beamformed frames depend on the SoS offset and the geometric disparities in transmission paths. Using this relation, we estimate a correction factor and hence a corrected mean SoS in the medium. We evaluated our proposed method on a set of numerical simulations, demonstrating its utility both for global SoS prediction and for local SoS tomographic reconstruction. For our evaluation dataset, for an initial SoS under- and over-assumption of 5% of the medium SoS, our method is able to predict the actual mean SoS within 0.3% accuracy. For the tomographic reconstruction of local SoS maps, the reconstruction accuracy is improved on average by 78.5% and 87%, respectively, compared to an initial SoS under- and over-assumption of 5%.

Submitted 17 August, 2022; originally announced August 2022.

arXiv:2203.03664

Unsupervised Domain Adaptation with Contrastive Learning for OCT Segmentation

Authors: Alvaro Gomariz, Huanxiang Lu, Yun Yvonna Li, Thomas Albrecht, Andreas Maunz, Fethallah Benmansour, Alessandra M. Valcarcel, Jennifer Luu, Daniela Ferrara, Orcun Goksel

Abstract: Accurate segmentation of retinal fluids in 3D Optical Coherence Tomography images is key for diagnosis and personalized treatment of eye diseases. While deep learning has been successful at this task, trained supervised models often fail for images that do not resemble labeled examples, e.g. for images acquired using different devices. We hereby propose a novel semi-supervised learning framework for segmentation of volumetric images from new unlabeled domains. We jointly use supervised and contrastive learning, also introducing a contrastive pairing scheme that leverages similarity between nearby slices in 3D. In addition, we propose channel-wise aggregation as an alternative to conventional spatial-pooling aggregation for contrastive feature map projection. We evaluate our methods for domain adaptation from a (labeled) source domain to an (unlabeled) target domain, each containing images acquired with different acquisition devices. In the target domain, our method achieves a Dice coefficient 13.8% higher than SimCLR (a state-of-the-art contrastive framework), and leads to results comparable to an upper bound with supervised training in that domain. In the source domain, our model also improves the results by 5.4% Dice, by successfully leveraging information from many unlabeled images.

Submitted 3 August, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

Comments: Accepted for publication at MICCAI 2022
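
Illustrative note: the abstract above reports its results as Dice coefficients. As a reminder of what that metric measures, here is a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name, the flat-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions made for this example, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Count positions marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground size of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# One shared foreground voxel out of two in each mask.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```

A volumetric mask would be flattened before the call; library implementations usually operate on arrays directly, and some add a smoothing term instead of special-casing empty masks.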

arXiv:2109.11819

Estimating Mean Speed-of-Sound from Sequence-Dependent Geometric Disparities

Authors: Xenia Augustin, Lin Zhang, Orcun Goksel

Abstract: In ultrasound beamforming, focusing time delays are typically computed with a spatially constant speed-of-sound (SoS) assumption. A mismatch between beamforming and true medium SoS then leads to aberration artifacts. Other imaging techniques such as spatially-resolved SoS reconstruction using tomographic techniques also rely on a good SoS estimate for initial beamforming. In this work, we exploit spatially-varying geometric disparities in the transmit and receive paths of multiple sequences for estimating a mean medium SoS. We use images from diverging waves beamformed with an assumed SoS, and propose a model fitting method for estimating the SoS offset. We demonstrate the effectiveness of our proposed method for tomographic SoS reconstruction. With corrected beamforming SoS, the reconstruction accuracy on simulated data was improved by 63% and 29%, respectively, for an initial SoS over- and under-estimation of 1.5%. We further demonstrate our proposed method on a breast phantom, indicating substantial improvement in contrast-to-noise ratio for local SoS mapping.

Submitted 24 September, 2021; originally announced September 2021.

arXiv:2103.10784

Motion Estimation for Optical Coherence Elastography Using Signal Phase and Intensity

Authors: Hossein Khodadadi, Orcun Goksel, Sabine Kling

Abstract: Displacement estimation in optical coherence tomography (OCT) imaging is relevant for several potential applications, e.g. for optical coherence elastography (OCE) for corneal biomechanical characterization. Larger displacements may be resolved using correlation-based block matching techniques, which however are prone to signal de-correlation and imprecise at commonly desired sub-pixel resolutions. Phase-based tracking methods can estimate tiny sub-wavelength motion, but are not suitable for motion magnitudes larger than half the wavelength due to phase wrapping and the difficulty of any unwrapping due to noise. In this paper a robust OCT displacement estimation method is introduced by formulating tracking as an optimization problem that jointly penalizes intensity disparity, phase difference, and motion discontinuity. This is then solved using dynamic programming, utilizing both sub-wavelength-scale phase and pixel-scale intensity information from OCT imaging, while inherently seeking the number of phase wraps. This allows for effectively tracking axial and lateral displacements, respectively, with sub-wavelength and pixel scale resolution. Results with tissue mimicking phantoms show that our proposed approach substantially outperforms conventional methods in terms of axial tracking precision, in particular for displacements exceeding half the imaging wavelength.

Submitted 19 March, 2021; originally announced March 2021.

Comments: 10 pages, 8 figures

arXiv:2103.05745

Content-Preserving Unpaired Translation from Simulated to Realistic Ultrasound Images

Authors: Devavrat Tomar, Lin Zhang, Tiziano Portenier, Orcun Goksel

Abstract: Interactive simulation of ultrasound imaging greatly facilitates sonography training. Although ray-tracing based methods have shown promising results, obtaining realistic images requires substantial modeling effort and manual parameter tuning. In addition, current techniques still result in a significant appearance gap between simulated images and real clinical scans. Herein we introduce a novel content-preserving image translation framework (ConPres) to bridge this appearance gap, while maintaining the simulated anatomical layout. We achieve this goal by leveraging both simulated images with semantic segmentations and unpaired in-vivo ultrasound scans. Our framework is based on recent contrastive unpaired translation techniques and we propose a regularization approach by learning an auxiliary segmentation-to-real image translation task, which encourages the disentanglement of content and style. In addition, we extend the generator to be class-conditional, which enables the incorporation of additional losses, in particular a cyclic consistency loss, to further improve the translation quality. Qualitative and quantitative comparisons against state-of-the-art unpaired translation methods demonstrate the superiority of our proposed framework.

Submitted 30 September, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

arXiv:2102.11865

Probabilistic Spatial Analysis in Quantitative Microscopy with Uncertainty-Aware Cell Detection using Deep Bayesian Regression of Density Maps

Authors: Alvaro Gomariz, Tiziano Portenier, César Nombela-Arrieta, Orcun Goksel

Abstract: 3D microscopy is key in the investigation of diverse biological systems, and the ever increasing availability of large datasets demands automatic cell identification methods that not only are accurate, but also can imply the uncertainty in their predictions to inform about potential errors and hence confidence in conclusions using them. While conventional deep learning methods often yield deterministic results, advances in deep Bayesian learning allow for accurate predictions with a probabilistic interpretation in numerous image classification and segmentation tasks. It is however nontrivial to extend such Bayesian methods to cell detection, which requires specialized learning frameworks. In particular, regression of density maps is a popular successful approach for extracting cell coordinates from local peaks in a postprocessing step, which hinders any meaningful probabilistic output. We herein propose a deep learning-based cell detection framework that can operate on large microscopy images and outputs desired probabilistic predictions by (i) integrating Bayesian techniques for the regression of uncertainty-aware density maps, where peak detection can be applied to generate cell proposals, and (ii) learning a mapping from the numerous proposals to a probabilistic space that is calibrated, i.e. accurately represents the chances of a successful prediction. Utilizing such calibrated predictions, we propose a probabilistic spatial analysis with Monte-Carlo sampling. We demonstrate this in revising an existing description of the distribution of a mesenchymal stromal cell type within the bone marrow, where our proposed methods allow us to reveal spatial patterns that are otherwise undetectable. Introducing such probabilistic analysis in quantitative microscopy pipelines will allow for reporting confidence intervals for testing biological hypotheses of spatial distributions.

Submitted 23 February, 2021; originally announced February 2021.

arXiv:2101.11476

Utilizing Uncertainty Estimation in Deep Learning Segmentation of Fluorescence Microscopy Images with Missing Markers

Authors: Alvaro Gomariz, Raphael Egli, Tiziano Portenier, César Nombela-Arrieta, Orcun Goksel

Abstract: Fluorescence microscopy images contain several channels, each indicating a marker staining the sample. Since many different marker combinations are utilized in practice, it has been challenging to apply deep learning based segmentation models, which expect a predefined channel combination for all training samples as well as at inference for future application. Recent work circumvents this problem using a modality attention approach to be effective across any possible marker combination. However, for combinations that do not exist in a labeled training dataset, one cannot have any estimation of potential segmentation quality if that combination is encountered during inference. Without this, one not only lacks quality assurance but also does not know where to put any additional imaging and labeling effort. We herein propose a method to estimate segmentation quality on unlabeled images by (i) estimating both aleatoric and epistemic uncertainties of convolutional neural networks for image segmentation, and (ii) training a Random Forest model for the interpretation of uncertainty features via regression to their corresponding segmentation metrics. Additionally, we demonstrate that including these uncertainty measures during training can provide an improvement on segmentation performance.

Submitted 27 January, 2021; originally announced January 2021.

Comments: Accepted at the IEEE International Symposium on Biomedical Imaging (ISBI) 2021. 4 pages and 4 figures

arXiv:2101.08339

Learning Ultrasound Rendering from Cross-Sectional Model Slices for Simulated Training

Authors: Lin Zhang, Tiziano Portenier, Orcun Goksel

Abstract: Purpose. Given the high level of expertise required for navigation and interpretation of ultrasound images, computational simulations can facilitate the training of such skills in virtual reality. With ray-tracing based simulations, realistic ultrasound images can be generated. However, due to computational constraints for interactivity, image quality typically needs to be compromised. Methods. We propose herein to bypass any rendering and simulation process at interactive time, by conducting such simulations during a non-time-critical offline stage and then learning image translation from cross-sectional model slices to such simulated frames. We use a generative adversarial framework with a dedicated generator architecture and input feeding scheme, which both substantially improve image quality without increase in network parameters. Integral attenuation maps derived from cross-sectional model slices, texture-friendly strided convolutions, providing stochastic noise and input maps to intermediate layers in order to preserve locality are all shown herein to greatly facilitate such translation task. Results. Given several quality metrics, the proposed method with only tissue maps as input is shown to provide comparable or superior results to a state-of-the-art that uses additional images of low-quality ultrasound renderings. An extensive ablation study shows the need and benefits from the individual contributions utilized in this work, based on qualitative examples and quantitative ultrasound similarity metrics. To that end, a local histogram statistics based error metric is proposed and demonstrated for visualization of local dissimilarities between ultrasound images.

Submitted 20 January, 2021; originally announced January 2021.

arXiv:2008.12380

doi 10.1038/s42256-021-00379-y

Modality Attention and Sampling Enables Deep Learning with Heterogeneous Marker Combinations in Fluorescence Microscopy

Authors: Alvaro Gomariz, Tiziano Portenier, Patrick M. Helbling, Stephan Isringhausen, Ute Suessbier, César Nombela-Arrieta, Orcun Goksel

Abstract: Fluorescence microscopy allows for a detailed inspection of cells, cellular networks, and anatomical landmarks by staining with a variety of carefully-selected markers visualized as color channels. Quantitative characterization of structures in acquired images often relies on automatic image analysis methods. Despite the success of deep learning methods in other vision applications, their potential for fluorescence image analysis remains underexploited. One reason lies in the considerable workload required to train accurate models, which are normally specific for a given combination of markers, and therefore applicable to a very restricted number of experimental settings. We herein propose Marker Sampling and Excite, a neural network approach with a modality sampling strategy and a novel attention module that together enable (i) flexible training with heterogeneous datasets with combinations of markers and (ii) successful utility of learned models on arbitrary subsets of markers prospectively. We show that our single neural network solution performs comparably to an upper bound scenario where an ensemble of many networks is naïvely trained for each possible marker combination separately. In addition, we demonstrate the feasibility of this framework in high-throughput biological analysis by revising a recent quantitative characterization of bone marrow vasculature in 3D confocal microscopy datasets and further confirm the validity of our approach on an additional, significantly different dataset of microvessels in fetal liver tissues. Not only can our work substantially ameliorate the use of deep learning in fluorescence microscopy analysis, but it can also be utilized in other fields with incomplete data acquisitions and missing modalities.

Submitted 22 June, 2021; v1 submitted 27 August, 2020; originally announced August 2020.

Comments: Main: 21 pages, 6 figures, 1 table. Supplementary: 5 pages, 7 figures, 3 tables

Journal ref: Nature Machine Intelligence (2021)

arXiv:2007.06669 [pdf, other]

Reinforcement Learning of Musculoskeletal Control from Functional Simulations

Authors: Emanuel Joos, Fabien Péan, Orcun Goksel

Abstract: To diagnose, plan, and treat musculoskeletal pathologies, understanding and reproducing muscle recruitment for complex movements is essential. With muscle activations for movements often being highly redundant, nonlinear, and time dependent, machine learning can provide a solution for their modeling and control for anatomy-specific musculoskeletal simulations. Sophisticated biomechanical simulations often require specialized computational environments, being numerically complex and slow, hindering their integration with typical deep learning frameworks. In this work, a deep reinforcement learning (DRL) based inverse dynamics controller is trained to control muscle activations of a biomechanical model of the human shoulder. In a generalizable end-to-end fashion, muscle activations are learned given current and desired position-velocity pairs. A customized reward function for trajectory control is introduced, enabling straightforward extension to additional muscles and higher degrees of freedom. Using the biomechanical model, multiple episodes are simulated on a cluster simultaneously using the evolving neural models of the DRL being trained. Results are presented for a single-axis motion control of shoulder abduction for the task of following randomly generated angular trajectories.

Submitted 13 July, 2020; originally announced July 2020.

arXiv:2006.14395 [pdf, other]

doi 10.1109/TUFFC.2020.3010186

Training Variational Networks with Multi-Domain Simulations: Speed-of-Sound Image Reconstruction

Authors: Melanie Bernhardt, Valery Vishnevskiy, Richard Rau, Orcun Goksel

Abstract: Speed-of-sound has been shown as a potential biomarker for breast cancer imaging, successfully differentiating malignant tumors from benign ones. Speed-of-sound images can be reconstructed from time-of-flight measurements from ultrasound images acquired using conventional handheld ultrasound transducers. Variational Networks (VN) have recently been shown to be a potential learning-based approach for optimizing inverse problems in image reconstruction. Despite earlier promising results, these methods however do not generalize well from simulated to acquired data, due to the domain shift. In this work, we present for the first time a VN solution for a pulse-echo SoS image reconstruction problem using diverging waves with conventional transducers and single-sided tissue access. This is made possible by incorporating simulations with varying complexity into training. We use loop unrolling of gradient descent with momentum, with an exponentially weighted loss of outputs at each unrolled iteration in order to regularize training. We learn norms as activation functions regularized to have smooth forms for robustness to input distribution variations. We evaluate reconstruction quality on ray-based and full-wave simulations as well as on tissue-mimicking phantom data, in comparison to a classical iterative (L-BFGS) optimization of this image reconstruction problem.
We show that the proposed regularization techniques combined with multi-source domain training yield substantial improvements in the domain adaptation capabilities of VN, reducing median RMSE by 54% on a wave-based simulation dataset compared to the baseline VN. We also show that on data acquired from a tissue-mimicking breast phantom the proposed VN provides improved reconstruction in 12 milliseconds.

Submitted 25 June, 2020; originally announced June 2020.

arXiv:2006.10850 [pdf, other]

Deep Image Translation for Enhancing Simulated Ultrasound Images

Authors: Lin Zhang, Tiziano Portenier, Christoph Paulus, Orcun Goksel

Abstract: Ultrasound simulation based on ray tracing enables the synthesis of highly realistic images. It can provide an interactive environment for training sonographers as an educational tool. However, due to high computational demand, there is a trade-off between image quality and interactivity, potentially leading to sub-optimal results at interactive rates. In this work we introduce a deep learning approach based on adversarial training that mitigates this trade-off by improving the quality of simulated images with constant computation time. An image-to-image translation framework is utilized to translate low quality images into high quality versions. To incorporate anatomical information potentially lost in low quality images, we additionally provide segmentation maps to image translation. Furthermore, we propose to leverage information from acoustic attenuation maps to better preserve acoustic shadows and directional artifacts, an invaluable feature for ultrasound image interpretation. The proposed method yields an improvement of 7.2% in Fréchet Inception Distance and 8.9% in patch-based Kullback-Leibler divergence.

Submitted 18 June, 2020; originally announced June 2020.

arXiv:2006.10166 [pdf, other]

Deep Network for Scatterer Distribution Estimation for Ultrasound Image Simulation

Authors: Lin Zhang, Valery Vishnevskiy, Orcun Goksel

Abstract: Simulation-based ultrasound training can be an essential educational tool. Realistic ultrasound image appearance with typical speckle texture can be modeled as convolution of a point spread function with point scatterers representing tissue microstructure. Such scatterer distribution, however, is in general not known and its estimation for a given tissue type is fundamentally an ill-posed inverse problem. In this paper, we demonstrate a convolutional neural network approach for probabilistic scatterer estimation from observed ultrasound data. We herein propose to impose a known statistical distribution on scatterers and learn the mapping between ultrasound image and distribution parameter map by training a convolutional neural network on synthetic images. In comparison with several existing approaches, we demonstrate in numerical simulations and with in-vivo images that the synthesized images from scatterer representations estimated with our approach closely match the observations with varying acquisition parameters such as compression and rotation of the imaged domain.

Submitted 17 June, 2020; originally announced June 2020.

arXiv:2006.09772 [pdf, other]

doi 10.1109/ISBI45749.2020.9098431

Mitosis Detection Under Limited Annotation: A Joint Learning Approach

Authors: Pushpak Pati, Antonio Foncubierta-Rodriguez, Orcun Goksel, Maria Gabrani

Abstract: Mitotic counting is a vital prognostic marker of tumor proliferation in breast cancer. Deep learning-based mitotic detection is on par with pathologists, but it requires large labeled data for training. We propose a deep classification framework for enhancing mitosis detection by leveraging class label information, via softmax loss, and spatial distribution information among samples, via distance metric learning. We also investigate strategies towards steadily providing informative samples to boost the learning. The efficacy of the proposed framework is established through evaluation on ICPR 2012 and AMIDA 2013 mitotic data. Our framework significantly improves the detection with small training data and achieves on par or superior performance compared to state-of-the-art methods when using the entire training data.

Submitted 2 July, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

Comments: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI)

arXiv:2003.05658 [pdf, other]

Frequency-Dependent Attenuation Reconstruction with an Acoustic Reflector

Authors: Richard Rau, Ozan Unal, Dieter Schweizer, Valery Vishnevskiy, Orcun Goksel

Abstract: Attenuation of ultrasound waves varies with tissue composition, hence its estimation offers great potential for tissue characterization and diagnosis and staging of pathology. We recently proposed a method that spatially reconstructs the distribution of the overall ultrasound attenuation in tissue based on computed tomography, using reflections from a passive acoustic reflector. This requires a standard ultrasound transducer operating in pulse-echo mode and a calibration protocol using water measurements, thus it can be implemented on conventional ultrasound systems with minor adaptations. Herein, we extend this method by additionally estimating and imaging the frequency-dependent nature of local ultrasound attenuation for the first time. Spatial distributions of attenuation coefficient and exponent are reconstructed, enabling an elaborate and expressive tissue-specific characterization. With simulations, we demonstrate that our proposed method yields a low reconstruction error of 0.04 dB/cm at 1 MHz for attenuation coefficient and 0.08 for the frequency exponent. With tissue-mimicking phantoms and ex-vivo bovine muscle samples, a high reconstruction contrast as well as reproducibility are demonstrated. Attenuation exponents of a gelatin-cellulose mixture and an ex-vivo bovine muscle sample were found to be, respectively, 1.4 and 0.5 on average, consistently from different images of their heterogeneous compositions.
Such frequency-dependent parametrization could enable novel imaging and diagnostic techniques, as well as facilitate attenuation compensation of other ultrasound-based imaging techniques.

Submitted 10 October, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

arXiv:1910.05935 [pdf, other]

doi 10.1007/s11548-021-02426-w

Speed-of-Sound Imaging using Diverging Waves

Authors: Richard Rau, Dieter Schweizer, Valery Vishnevskiy, Orcun Goksel

Abstract: Recent ultrasound imaging modalities based on ultrasound computed tomography indicate a huge potential to detect pathologies in tissue due to altered biomechanical properties. Especially the imaging of speed-of-sound (SoS) distribution in tissue has shown clinical promise and thus gained increasing attention in the field, with several methods proposed based on transmission mode tomography. SoS imaging using conventional ultrasound (US) systems would be convenient and easy for clinical translation, but this requires using conventional US probes with single-sided tissue access and thus pulse-echo imaging sequences. Recent pulse-echo SoS imaging methods rely on plane wave (PW) insonifications, which are prone to strong aberration effects for non-homogeneous tissue composition. In this paper we propose to use diverging waves (DW) for SoS imaging and thus substantially improve the reconstruction of SoS distributions. We study this proposition by first examining plane wavefront aberrations compared to DW. We then present the sensitivity of both approaches to major parameterization choices on a set of simulated phantoms. Using the optimum parameter combination for each method for a given transducer model and imaging sequence, we analyze the SoS imaging performance comparatively between the two approaches. Results indicate that using DW instead of PW, the reconstruction accuracy improves substantially, by over 22% in reconstruction error (RMSE) and by 55% in contrast (CNR).
We also demonstrate improvements in SoS reconstructions from an actual US acquisition of a breast phantom with tumor- and cyst-representative inclusions, with high and low SoS contrast, respectively.

Submitted 14 October, 2019; originally announced October 2019.

Journal ref: International Journal of Computer Assisted Radiology and Surgery (2021)

arXiv:1909.10254 [pdf, other]

doi 10.1109/ULTSYM.2019.8926297

Ultrasound Aberration Correction based on Local Speed-of-Sound Map Estimation

Authors: Richard Rau, Dieter Schweizer, Valery Vishnevskiy, Orcun Goksel

Abstract: For beamforming ultrasound (US) signals, typically a spatially constant speed-of-sound (SoS) is assumed to calculate delays. As SoS in tissue may vary considerably, this approximation may cause wavefront aberrations, thus degrading effective imaging resolution. In the literature, corrections have been proposed based on unidirectional SoS estimation or computationally-expensive a posteriori phase rectification. In this paper we demonstrate a direct delay correction approach for US beamforming, by leveraging 2D spatial SoS distribution estimates from plane-wave imaging. We show both in simulations and with ex vivo measurements that resolutions close to the wavelength limit can be achieved using our proposed local SoS-adaptive beamforming, yielding a lateral resolution improvement of 22% to 29% on tissue samples with up to 3% SoS-contrast (45 m/s). We verify that our method accurately images absolute positions of tissue structures down to sub-pixel resolution of a tenth of a wavelength, whereas a global SoS assumption leads to artifactual localizations.

Submitted 23 September, 2019; originally announced September 2019.

Comments: will be published in the proceedings of the IEEE International Ultrasonics Symposium (IUS) 2019

Journal ref: 2019 IEEE International Ultrasonics Symposium (IUS)

arXiv:1906.11615 [pdf, other]

doi 10.1007/978-3-030-32254-0_67

Attenuation Imaging with Pulse-Echo Ultrasound based on an Acoustic Reflector

Authors: Richard Rau, Ozan Unal, Dieter Schweizer, Valery Vishnevskiy, Orcun Goksel

Abstract: Ultrasound attenuation is caused by absorption and scattering in tissue and is thus a function of tissue composition, hence its imaging offers great potential for screening and differential diagnosis. In this paper we propose a novel method that reconstructs the spatial attenuation distribution in tissue based on computed tomography, using reflections from a passive acoustic reflector. This requires a standard ultrasound transducer operating in pulse-echo mode, thus it can be implemented on conventional ultrasound systems with minor modifications. We use calibration with water measurements in order to normalize measurements for quantitative imaging of attenuation. In contrast to earlier techniques, we herein show that attenuation reconstructions are possible without any geometric prior on the inclusion location or shape. We present a quantitative evaluation of reconstructions based on simulations, gelatin phantoms, and ex-vivo bovine skeletal muscle tissue, achieving a contrast-to-noise ratio of up to 2.3 for an inclusion in ex-vivo tissue.

Submitted 24 July, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

Comments: Accepted at MICCAI 2019 (International Conference on Medical Image Computing and Computer Assisted Intervention)

Journal ref: Medical Image Computing and Computer Assisted Intervention - MICCAI 2019 pp 601-609

arXiv:1906.05528 [pdf, other]

Deep Variational Networks with Exponential Weighting for Learning Computed Tomography

Authors: Valery Vishnevskiy, Richard Rau, Orcun Goksel

Abstract: Tomographic image reconstruction is relevant for many medical imaging modalities including X-ray, ultrasound (US) computed tomography (CT) and photoacoustics, for which the access to full angular range tomographic projections might not be available in clinical practice due to physical or time constraints. Reconstruction from incomplete data in a low signal-to-noise ratio regime is a challenging and ill-posed inverse problem that usually leads to unsatisfactory image quality. While informative image priors may be learned using generic deep neural network architectures, the artefacts caused by an ill-conditioned design matrix often have global spatial support and cannot be efficiently filtered out by means of convolutions. In this paper we propose to learn an inverse mapping in an end-to-end fashion via unrolling optimization iterations of a prototypical reconstruction algorithm. We herein introduce a network architecture that performs filtering jointly in both sinogram and spatial domains. To efficiently train such a deep network we propose a novel regularization approach based on deep exponential weighting. Experiments on US and X-ray CT data show that our proposed method is qualitatively and quantitatively superior to conventional non-linear reconstruction methods as well as state-of-the-art deep networks for image reconstruction. Fast inference time of the proposed algorithm allows for sophisticated reconstructions in real-time critical settings, demonstrated with US SoS imaging of an ex vivo bovine phantom.

Submitted 13 June, 2019; originally announced June 2019.

Comments: Accepted to MICCAI 2019

arXiv:1902.00469 [pdf, other]

SCATGAN for Reconstruction of Ultrasound Scatterers Using Generative Adversarial Networks

Authors: Andrawes Al Bahou, Christine Tanner, Orcun Goksel

Abstract: Computational simulation of ultrasound (US) echography is essential for training sonographers. Realistic simulation of US interaction with microscopic tissue structures is often modeled by a tissue representation in the form of point scatterers, convolved with a spatially varying point spread function. This yields a realistic US B-mode speckle texture, given that a scatterer representation for a particular tissue type is readily available. This is often not the case and scatterers are nontrivial to determine. In this work we propose to estimate scatterer maps from sample US B-mode images of a tissue, by formulating this inverse mapping problem as image translation, where we learn the mapping with Generative Adversarial Networks, using a US simulation software for training. We demonstrate robust reconstruction results, invariant to US viewing and imaging settings such as imaging direction and center frequency. Our method is shown to generalize beyond the trained imaging settings, demonstrated on in-vivo US data. Our inference runs orders of magnitude faster than optimization-based techniques, enabling future extensions for reconstructing 3D B-mode volumes with only linear computational complexity.

Submitted 1 February, 2019; originally announced February 2019.

arXiv:1901.08109 [pdf, other]

Siamese Networks with Location Prior for Landmark Tracking in Liver Ultrasound Sequences

Authors: Alvaro Gomariz, Weiye Li, Ece Ozkan, Christine Tanner, Orcun Goksel

Abstract: Image-guided radiation therapy can benefit from accurate motion tracking by ultrasound imaging, in order to minimize treatment margins and radiate moving anatomical targets, e.g., due to breathing. One way to formulate this tracking problem is the automatic localization of given tracked anatomical landmarks throughout a temporal ultrasound sequence. For this, we herein propose a fully-convolutional Siamese network that learns the similarity between pairs of image regions containing the same landmark. Accordingly, it learns to localize and thus track arbitrary image features, not only predefined anatomical structures. We employ a temporal consistency model as a location prior, which we combine with the network-predicted location probability map to track a target iteratively in ultrasound sequences. We applied this method on the dataset of the Challenge on Liver Ultrasound Tracking (CLUST) with competitive results, where our work is the first to effectively apply CNNs on this tracking problem, thanks to our temporal regularization.

Submitted 23 January, 2019; originally announced January 2019.

Comments: Accepted at the IEEE International Symposium on Biomedical Imaging (ISBI) 2019

arXiv:1811.04634 [pdf, other]

doi 10.1007/s11548-019-01984-4

Extending Pretrained Segmentation Networks with Additional Anatomical Structures

Authors: Firat Ozdemir, Orcun Goksel

Abstract: Comprehensive surgical planning requires complex patient-specific anatomical models. For instance, functional musculoskeletal simulations necessitate all relevant structures to be segmented, which could be performed in real-time using deep neural networks given sufficient annotated samples. Such large datasets of multiple structure annotations are costly to procure and are often unavailable in practice. Nevertheless, annotations from different studies and centers can be readily available, or become available in the future in an incremental fashion. We propose a class-incremental segmentation framework for extending a deep network trained for some anatomical structure to yet another structure using a small incremental annotation set. Through distilling knowledge from the current state of the framework, we bypass the need for a full retraining. This is a meta-method to extend any choice of desired deep segmentation network with only a minor addition per structure, which makes it suitable for lifelong class-incremental learning and applicable also for future deep neural network architectures. We evaluated our methods on a public knee dataset of 100 MR volumes. By varying the incremental annotation ratio, we show how our proposed method can retain the previous anatomical structure segmentation performance superior to the conventional finetuning approach.
In addition, our framework inherently exploits transferable knowledge from previously trained structures to incremental tasks, demonstrated by superior results compared to non-incremental training. With the presented method, new anatomical structures can be learned without catastrophic forgetting of older structures and without extensive increase of memory and complexity.

Submitted 17 May, 2019; v1 submitted 12 November, 2018; originally announced November 2018.

Comments: Published in IJCARS. 8 pages, 4 figures, contains supplementary material

Journal ref: International Journal of Computer Assisted Radiology and Surgery, 2 May 2019, issn 1861-6429

Showing 1–28 of 28 results for author: Goksel, O