Search | arXiv e-print repository

Application-driven Validation of Posteriors in Inverse Problems

Authors: Tim J. Adler, Jan-Hinrich Nölke, Annika Reinke, Minu Dietlinde Tizabi, Sebastian Gruber, Dasha Trofimova, Lynton Ardizzone, Paul F. Jaeger, Florian Buettner, Ullrich Köthe, Lena Maier-Hein

Abstract: Current deep learning-based solutions for image analysis tasks are commonly incapable of handling problems to which multiple different plausible solutions exist. In response, posterior-based methods such as conditional Diffusion Models and Invertible Neural Networks have emerged; however, their translation is hampered by a lack of research on adequate validation. In other words, the way progress is measured often does not reflect the needs of the driving practical application. Closing this gap in the literature, we present the first systematic framework for the application-driven validation of posterior-based methods in inverse problems. As a methodological novelty, it adopts key principles from the field of object detection validation, which has a long history of addressing the question of how to locate and match multiple object instances in an image. Treating modes as instances enables us to perform mode-centric validation, using well-interpretable metrics from the application perspective. We demonstrate the value of our framework through instantiations for a synthetic toy example and two medical vision use cases: pose estimation in surgery and imaging-based quantification of functional tissue parameters for diagnostics. Our framework offers key advantages over common approaches to posterior validation in all three examples and could thus revolutionize performance assessment in inverse problems.

Submitted 18 September, 2023; originally announced September 2023.

Comments: Shared first authors: Tim J. Adler and Jan-Hinrich Nölke. 16 pages, 8 figures, 1 table

arXiv:2307.05591

Linear Alignment of Vision-language Models for Image Captioning

Authors: Fabian Paischer, Markus Hofmarcher, Sepp Hochreiter, Thomas Adler

Abstract: Recently, vision-language models like CLIP have advanced the state of the art in a variety of multi-modal tasks including image captioning and caption evaluation. Many approaches adapt CLIP-style models to a downstream task by training a mapping network between CLIP and a language model. This is costly as it usually involves calculating gradients for large models. We propose a more efficient training protocol that fits a linear mapping between image and text embeddings of CLIP via a closed-form solution. This bypasses the need for gradient computation and results in a lightweight captioning method called ReCap, which can be trained up to 1000 times faster than existing lightweight methods. Moreover, we propose two new learning-based image-captioning metrics that build on CLIP score along with our linear mapping. Furthermore, we combine ReCap with our new metrics to design an iterative datastore-augmentation loop (DAL) based on synthetic captions. We evaluate ReCap on MS-COCO, Flickr30k, VizWiz, and MSRVTT. ReCap achieves performance comparable to state-of-the-art lightweight methods on established metrics while outperforming them on our new metrics, which are better aligned with human ratings on Flickr8k-Expert and Flickr8k-Crowdflower. Finally, we demonstrate that ReCap transfers well to other domains and that our DAL leads to a performance boost.

Submitted 6 February, 2024; v1 submitted 10 July, 2023; originally announced July 2023.

Comments: 8 pages (+ references and appendix)

arXiv:2306.09312

Semantic HELM: A Human-Readable Memory for Reinforcement Learning

Authors: Fabian Paischer, Thomas Adler, Markus Hofmarcher, Sepp Hochreiter

Abstract: Reinforcement learning agents deployed in the real world often have to cope with partially observable environments. Therefore, most agents employ memory mechanisms to approximate the state of the environment. Recently, there have been impressive success stories in mastering partially observable environments, mostly in the realm of computer games like Dota 2, StarCraft II, or MineCraft. However, existing methods lack interpretability in the sense that it is not comprehensible for humans what the agent stores in its memory. In this regard, we propose a novel memory mechanism that represents past events in human language. Our method uses CLIP to associate visual inputs with language tokens. Then we feed these tokens to a pretrained language model that serves the agent as memory and provides it with a coherent and human-readable representation of the past. We train our memory mechanism on a set of partially observable environments and find that it excels on tasks that require a memory component, while mostly attaining performance on-par with strong baselines on tasks that do not. On a challenging continuous recognition task, where memorizing the past is crucial, our memory mechanism converges two orders of magnitude faster than prior methods. Since our memory mechanism is human-readable, we can peek at an agent's memory and check whether crucial pieces of information have been stored. This significantly enhances troubleshooting and paves the way toward more interpretable agents.

Submitted 27 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: To appear at NeurIPS 2023, 10 pages (+ references and appendix), Code: https://github.com/ml-jku/helm

arXiv:2303.17719

Why is the winner the best?

Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Sharib Ali, Vincent Andrearczyk, Marc Aubreville, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, Jorge Bernal, Sebastian Bodenstedt, Alessandro Casella, Veronika Cheplygina, Marie Daum, Marleen de Bruijne, Adrien Depeursinge, Reuben Dorent, Jan Egger, David G. Ellis, Sandy Engelhardt, Melanie Ganz, et al. (100 additional authors not shown)

Abstract: International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.

Submitted 30 March, 2023; originally announced March 2023.

Comments: accepted to CVPR 2023

arXiv:2303.12915

Self-distillation for surgical action recognition

Authors: Amine Yamlahi, Thuy Nuong Tran, Patrick Godau, Melanie Schellenberg, Dominik Michael, Finn-Henri Smidt, Jan-Hinrich Noelke, Tim Adler, Minu Dietlinde Tizabi, Chinedu Nwoye, Nicolas Padoy, Lena Maier-Hein

Abstract: Surgical scene understanding is a key prerequisite for context-aware decision support in the operating room. While deep learning-based approaches have already reached or even surpassed human performance in various fields, the task of surgical action recognition remains a major challenge. With this contribution, we are the first to investigate the concept of self-distillation as a means of addressing class imbalance and potential label ambiguity in surgical video analysis. Our proposed method is a heterogeneous ensemble of three models that use Swin Transformers as backbone and the concepts of self-distillation and multi-task learning as core design choices. According to ablation studies performed with the CholecT45 challenge data via cross-validation, the biggest performance boost is achieved by the usage of soft labels obtained by self-distillation. External validation of our method on an independent test set was achieved by providing a Docker container of our inference model to the challenge organizers. According to their analysis, our method outperforms all other solutions submitted to the latest challenge in the field. Our approach thus shows the potential of self-distillation for becoming an important tool in medical image analysis applications.

Submitted 22 March, 2023; originally announced March 2023.

arXiv:2303.10191

doi 10.1007/978-3-031-43907-0_73

Unsupervised Domain Transfer with Conditional Invertible Neural Networks

Authors: Kris K. Dreher, Leonardo Ayala, Melanie Schellenberg, Marco Hübner, Jan-Hinrich Nölke, Tim J. Adler, Silvia Seidlitz, Jan Sellner, Alexander Studier-Fischer, Janek Gröhl, Felix Nickel, Ullrich Köthe, Alexander Seitel, Lena Maier-Hein

Abstract: Synthetic medical image generation has evolved as a key technique for neural network training and validation. A core challenge, however, remains in the domain gap between simulations and real data. While deep learning-based domain transfer using Cycle Generative Adversarial Networks and similar architectures has led to substantial progress in the field, there are use cases in which state-of-the-art approaches still fail to generate training images that produce convincing results on relevant downstream tasks. Here, we address this issue with a domain transfer approach based on conditional invertible neural networks (cINNs). As a particular advantage, our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood training. To showcase our method's generic applicability, we apply it to two spectral imaging modalities at different scales, namely hyperspectral imaging (pixel-level) and photoacoustic tomography (image-level). According to comprehensive experiments, our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks (binary and multi-class). cINN-based domain transfer could thus evolve as an important method for realistic synthetic data generation in the field of spectral imaging and beyond.

Submitted 17 March, 2023; originally announced March 2023.

arXiv:2212.08568

Biomedical image analysis competitions: The state of current participation practice

Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, et al. (331 additional authors not shown)

Abstract: The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Submitted 12 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

arXiv:2211.09708

Sources of performance variability in deep learning-based polyp detection

Authors: Thuy Nuong Tran, Tim Adler, Amine Yamlahi, Evangelia Christodoulou, Patrick Godau, Annika Reinke, Minu Dietlinde Tizabi, Peter Sauer, Tillmann Persicke, Jörg Gerhard Albert, Lena Maier-Hein

Abstract: Validation metrics are a key prerequisite for the reliable tracking of scientific progress and for deciding on the potential clinical translation of methods. While recent initiatives aim to develop comprehensive theoretical frameworks for understanding metric-related pitfalls in image analysis problems, there is a lack of experimental evidence on the concrete effects of common and rare pitfalls on specific applications. We address this gap in the literature in the context of colon cancer screening. Our contribution is twofold. Firstly, we present the winning solution of the Endoscopy computer vision challenge (EndoCV) on colon cancer detection, conducted in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2022. Secondly, we demonstrate the sensitivity of commonly used metrics to a range of hyperparameters as well as the consequences of poor metric choices. Based on comprehensive validation studies performed with patient data from six clinical centers, we found all commonly applied object detection metrics to be subject to high inter-center variability. Furthermore, our results clearly demonstrate that the adaptation of standard hyperparameters used in the computer vision community does not generally lead to the clinically most plausible results. Finally, we present localization criteria that correspond well to clinical relevance. Our work could be a first step towards reconsidering common validation strategies in automatic colon cancer screening applications.

Submitted 17 November, 2022; originally announced November 2022.

Comments: 12 pages, 9 figures, 3 tables. Submitted to IPCAI 2023

arXiv:2206.03483

Few-Shot Learning by Dimensionality Reduction in Gradient Space

Authors: Martin Gauch, Maximilian Beck, Thomas Adler, Dmytro Kotsur, Stefan Fiel, Hamid Eghbal-zadeh, Johannes Brandstetter, Johannes Kofler, Markus Holzleitner, Werner Zellinger, Daniel Klotz, Sepp Hochreiter, Sebastian Lehner

Abstract: We introduce SubGD, a novel few-shot learning method which is based on the recent finding that stochastic gradient descent updates tend to live in a low-dimensional parameter subspace. In experimental and theoretical analyses, we show that models confined to a suitable predefined subspace generalize well for few-shot learning. A suitable subspace fulfills three criteria across the given tasks: it (a) allows to reduce the training error by gradient flow, (b) leads to models that generalize well, and (c) can be identified by stochastic gradient descent. SubGD identifies these subspaces from an eigendecomposition of the auto-correlation matrix of update directions across different tasks. Demonstrably, we can identify low-dimensional suitable subspaces for few-shot learning of dynamical systems, which have varying properties described by one or few parameters of the analytical system description. Such systems are ubiquitous among real-world applications in science and engineering. We experimentally corroborate the advantages of SubGD on three distinct dynamical systems problem settings, significantly outperforming popular few-shot learning methods both in terms of sample efficiency and performance.

Submitted 7 June, 2022; originally announced June 2022.

Comments: Accepted at Conference on Lifelong Learning Agents (CoLLAs) 2022. Code: https://github.com/ml-jku/subgd Blog post: https://ml-jku.github.io/subgd

Journal ref: Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199:1043-1064 (2022)

arXiv:2205.12258

History Compression via Language Models in Reinforcement Learning

Authors: Fabian Paischer, Thomas Adler, Vihang Patil, Angela Bitto-Nemling, Markus Holzleitner, Sebastian Lehner, Hamid Eghbal-zadeh, Sepp Hochreiter

Abstract: In a partially observable Markov decision process (POMDP), an agent typically uses a representation of the past to approximate the underlying MDP. We propose to utilize a frozen Pretrained Language Transformer (PLT) for history representation and compression to improve sample efficiency. To avoid training of the Transformer, we introduce FrozenHopfield, which automatically associates observations with pretrained token embeddings. To form these associations, a modern Hopfield network stores these token embeddings, which are retrieved by queries that are obtained by a random but fixed projection of observations. Our new method, HELM, enables actor-critic network architectures that contain a pretrained language Transformer for history representation as a memory module. Since a representation of the past need not be learned, HELM is much more sample efficient than competitors. On Minigrid and Procgen environments HELM achieves new state-of-the-art results. Our code is available at https://github.com/ml-jku/helm.

Submitted 21 February, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

Comments: ICML 2022

arXiv:2111.05408

doi 10.1016/j.media.2022.102488

Robust deep learning-based semantic organ segmentation in hyperspectral images

Authors: Silvia Seidlitz, Jan Sellner, Jan Odenthal, Berkin Özdemir, Alexander Studier-Fischer, Samuel Knödler, Leonardo Ayala, Tim J. Adler, Hannes G. Kenngott, Minu Tizabi, Martin Wagner, Felix Nickel, Beat P. Müller-Stich, Lena Maier-Hein

Abstract: Semantic image segmentation is an important prerequisite for context-awareness and autonomous robotics in surgery. The state of the art has focused on conventional RGB video data acquired during minimally invasive surgery, but full-scene semantic segmentation based on spectral imaging data and obtained during open surgery has received almost no attention to date. To address this gap in the literature, we are investigating the following research questions based on hyperspectral imaging (HSI) data of pigs acquired in an open surgery setting: (1) What is an adequate representation of HSI data for neural network-based fully automated organ segmentation, especially with respect to the spatial granularity of the data (pixels vs. superpixels vs. patches vs. full images)? (2) Is there a benefit of using HSI data compared to other modalities, namely RGB data and processed HSI data (e.g. tissue parameters like oxygenation), when performing semantic organ segmentation? According to a comprehensive validation study based on 506 HSI images from 20 pigs, annotated with a total of 19 classes, deep learning-based segmentation performance increases, consistently across modalities, with the spatial context of the input data. Unprocessed HSI data offers an advantage over RGB data or processed data from the camera provider, with the advantage increasing with decreasing size of the input to the neural network. Maximum performance (HSI applied to whole images) yielded a mean DSC of 0.90 (standard deviation (SD) 0.04), which is in the range of the inter-rater variability (DSC of 0.89 (SD 0.07)). We conclude that HSI could become a powerful image modality for fully-automatic surgical scene understanding with many advantages over traditional imaging, including the ability to recover additional functional tissue information. Code and pre-trained models: https://github.com/IMSY-DKFZ/htc.

Submitted 10 July, 2022; v1 submitted 9 November, 2021; originally announced November 2021.

Comments: The first two authors (Silvia Seidlitz and Jan Sellner) contributed equally to this paper

ACM Class: I.2.10; I.4.6; J.3

Journal ref: Medical Image Analysis, Volume 80, 2022, 102488, ISSN 1361-8415

arXiv:2110.15556

doi 10.1051/0004-6361/202141785

MATISSE, the VLTI mid-infrared imaging spectro-interferometer

Authors: B. Lopez, S. Lagarde, R. G. Petrov, W. Jaffe, P. Antonelli, F. Allouche, P. Berio, A. Matter, A. Meilland, F. Millour, S. Robbe-Dubois, Th. Henning, G. Weigelt, A. Glindemann, T. Agocs, Ch. Bailet, U. Beckmann, F. Bettonvil, R. van Boekel, P. Bourget, Y. Bresson, P. Bristow, P. Cruzalèbes, E. Eldswijk, Y. Fanteï Caujolle, et al. (128 additional authors not shown)

Abstract: Context: Optical interferometry is at a key development stage. ESO's VLTI has established a stable, robust infrastructure for long-baseline interferometry for general astronomical observers. The present second-generation instruments offer a wide wavelength coverage and improved performance. Their sensitivity and measurement accuracy lead to data and images of high reliability. Aims: We have developed MATISSE, the Multi AperTure mid-Infrared SpectroScopic Experiment, to access high resolution imaging in a wide spectral domain and explore topics such as: stellar activity and mass loss; planet formation and evolution in the gas and dust disks around young stars; accretion processes around super massive black holes in AGN. Methods: The instrument is a spectro-interferometric imager covering three atmospheric bands (L, M, N) from 2.8 to 13.0 μm, combining four optical beams from the VLTI's telescopes. Its concept, related observing procedure, data reduction and calibration approach are the product of 30 years of instrumental research. The instrument utilizes a multi-axial beam combination that delivers spectrally dispersed fringes. The signal provides the following quantities at several spectral resolutions: photometric flux, coherent fluxes, visibilities, closure phases, wavelength differential visibilities and phases, and aperture-synthesis imaging. Results: We provide an overview of the physical principle of the instrument and its functionalities, the characteristics of the delivered signal, a description of the observing modes and of their performance limits. An ensemble of data and reconstructed images illustrates the first acquired key observations. Conclusion: The instrument has been in operation at Cerro Paranal, ESO, Chile since 2018, and has been open for science use by the international community since April 2019. The first scientific results are being published now.

Submitted 2 March, 2022; v1 submitted 29 October, 2021; originally announced October 2021.

Comments: 24 pages, 26 figures, submitted to Astronomy & Astrophysics

Journal ref: A&A 659, A192 (2022)

arXiv:2105.13901

doi 10.1126/sciadv.add6778

Video-rate multispectral imaging in laparoscopic surgery: First-in-human application

Authors: Leonardo Ayala, Sebastian Wirkert, Anant Vemuri, Tim Adler, Silvia Seidlitz, Sebastian Pirmann, Christina Engels, Dogu Teber, Lena Maier-Hein

Abstract: Multispectral and hyperspectral imaging (MSI/HSI) can provide clinically relevant information on morphological and functional tissue properties. Application in the operating room (OR), however, has so far been limited by complex hardware setups and slow acquisition times. To overcome these limitations, we propose a novel imaging system for video-rate spectral imaging in the clinical workflow. The system integrates a small snapshot multispectral camera with a standard laparoscope and a clinically commonly used light source, enabling the recording of multispectral images with a spectral dimension of 16 at a frame rate of 25 Hz. An ongoing in-patient study shows that multispectral recordings from this system can help detect perfusion changes in partial nephrectomy surgery, thus opening the doors to a wide range of clinical applications.

Submitted 28 May, 2021; originally announced May 2021.

arXiv:2103.17014

doi 10.1051/0004-6361/202140626

Mid-infrared circumstellar emission of the long-period Cepheid l Carinae resolved with VLTI/MATISSE

Authors: V. Hocdé, N. Nardetto, A. Matter, E. Lagadec, A. Mérand, P. Cruzalèbes, A. Meilland, F. Millour, B. Lopez, P. Berio, G. Weigelt, R. Petrov, J. W. Isbell, W. Jaffe, P. Kervella, A. Glindemann, M. Schöller, F. Allouche, A. Gallenne, A. Domiciano de Souza, G. Niccolini, E. Kokoulina, J. Varga, S. Lagarde, J.-C. Augereau, et al. (129 additional authors not shown)

Abstract: The nature of circumstellar envelopes (CSE) around Cepheids is still a matter of debate. The physical origin of their infrared (IR) excess could be either a shell of ionized gas, or a dust envelope, or both. This study aims at constraining the geometry and the IR excess of the environment of the long-period Cepheid $\ell$ Car (P=35.5 days) at mid-IR wavelengths to understand its physical nature. We first use photometric observations in various bands and Spitzer Space Telescope spectroscopy to constrain the IR excess of $\ell$ Car. Then, we analyze the VLTI/MATISSE measurements at a specific phase of observation, in order to determine the flux contribution, the size and shape of the environment of the star in the L band. We finally test the hypothesis of a shell of ionized gas in order to model the IR excess. We report the first detection in the L band of a centro-symmetric extended emission around $\ell$ Car, of about 1.7$R_\star$ in FWHM, producing an excess of about 7.0\% in this band. In the N band, there is no clear evidence for dust emission from VLTI/MATISSE correlated flux and Spitzer data. On the other hand, the modeled shell of ionized gas implies a more compact ($1.13\pm0.02\,R_\star$) and fainter (IR excess of 1\% in the L band) CSE. We provide new evidence for a compact CSE of $\ell$ Car and we demonstrate the capabilities of VLTI/MATISSE for determining common properties of CSEs. While the compact CSE of $\ell$ Car is probably of gaseous nature, the tested model of a shell of ionized gas is not able to simultaneously reproduce the IR excess and the interferometric observations. Further Galactic Cepheids observations with VLTI/MATISSE are necessary for determining the properties of CSEs, which may also depend on both the pulsation period and the evolutionary state of the stars.

Submitted 31 March, 2021; originally announced March 2021.

Comments: 13 pages, 8 figures, accepted in Astronomy and Astrophysics

arXiv:2012.08195 [pdf, other]

Representing Ambiguity in Registration Problems with Conditional Invertible Neural Networks

Authors: Darya Trofimova, Tim Adler, Lisa Kausch, Lynton Ardizzone, Klaus Maier-Hein, Ulrich Köthe, Carsten Rother, Lena Maier-Hein

Abstract: Image registration is the basis for many applications in the fields of medical image computing and computer assisted interventions. One example is the registration of 2D X-ray images with preoperative three-dimensional computed tomography (CT) images in intraoperative surgical guidance systems. Due to the high safety requirements in medical applications, estimating registration uncertainty is of crucial importance in such a scenario. However, previously proposed methods, including classical iterative registration methods and deep learning-based methods, have one characteristic in common: They lack the capacity to represent the fact that a registration problem may be inherently ambiguous, meaning that multiple (substantially different) plausible solutions exist. To tackle this limitation, we explore the application of invertible neural networks (INN) as the core component of a registration methodology. In the proposed framework, INNs enable going beyond point estimates as network output by representing the possible solutions to a registration problem by a probability distribution that encodes different plausible solutions via multiple modes. In a first feasibility study, we test the approach for a 2D/3D registration setting by registering spinal CT volumes to X-ray images. To this end, we simulate the X-ray images taken by a C-Arm with multiple orientations using the principle of digitally reconstructed radiographs (DRRs). Due to the symmetry of the human spine, there are potentially multiple substantially different poses of the C-Arm that can lead to similar projections. The hypothesis of this work is that the proposed approach is able to identify multiple solutions in such ambiguous registration problems.

Submitted 15 December, 2020; originally announced December 2020.

Comments: The paper got accepted at Medical Imaging Meets NeurIPS Workshop at Neural Information Processing Systems 2020

arXiv:2012.05697 [pdf, other]

doi 10.1051/0004-6361/202039400

The asymmetric inner disk of the Herbig Ae star HD 163296 in the eyes of VLTI/MATISSE: evidence for a vortex?

Authors: J. Varga, M. Hogerheijde, R. van Boekel, L. Klarmann, R. Petrov, L. B. F. M. Waters, S. Lagarde, E. Pantin, Ph. Berio, G. Weigelt, S. Robbe-Dubois, B. Lopez, F. Millour, J.-C. Augereau, H. Meheut, A. Meilland, Th. Henning, W. Jaffe, F. Bettonvil, P. Bristow, K.-H. Hofmann, A. Matter, G. Zins, S. Wolf, F. Allouche, et al. (111 additional authors not shown)

Abstract: Context. The inner few au region of planet-forming disks is a complex environment. High angular resolution observations have a key role in understanding the disk structure and the dynamical processes at work. Aims. In this study we aim to characterize the mid-infrared brightness distribution of the inner disk of the young intermediate-mass star HD 163296, from VLTI/MATISSE observations. Methods. We use geometric models to fit the data. Our models include a smoothed ring, a flat disk with inner cavity, and a 2D Gaussian. The models can account for disk inclination and for azimuthal asymmetries as well. We also perform numerical hydro-dynamical simulations of the inner edge of the disk. Results. Our modeling reveals a significant brightness asymmetry in the L-band disk emission. The brightness maximum of the asymmetry is located at the NW part of the disk image, nearly at the position angle of the semimajor axis. The surface brightness ratio in the azimuthal variation is $3.5 \pm 0.2$. Comparing our result on the location of the asymmetry with other interferometric measurements, we confirm that the morphology of the $r<0.3$ au disk region is time-variable. We propose that this asymmetric structure, located in or near the inner rim of the dusty disk, orbits the star. For the physical origin of the asymmetry, we tested a hypothesis where a vortex is created by Rossby wave instability, and we find that a unique large scale vortex may be compatible with our data. The half-light radius of the L-band emitting region is $0.33\pm 0.01$ au, the inclination is ${52^\circ}^{+5^\circ}_{-7^\circ}$, and the position angle is $143^\circ \pm 3^\circ$. Our models predict that a non-negligible fraction of the L-band disk emission originates inside the dust sublimation radius for $μ$m-sized grains. Refractory grains or large ($\gtrsim 10\ μ$m-sized) grains could be the origin for this emission.

Submitted 10 December, 2020; originally announced December 2020.

Comments: accepted for publication in A&A

Journal ref: A&A 647, A56 (2021)

arXiv:2011.05110 [pdf, other]

Invertible Neural Networks for Uncertainty Quantification in Photoacoustic Imaging

Authors: Jan-Hinrich Nölke, Tim Adler, Janek Gröhl, Thomas Kirchner, Lynton Ardizzone, Carsten Rother, Ullrich Köthe, Lena Maier-Hein

Abstract: Multispectral photoacoustic imaging (PAI) is an emerging imaging modality which enables the recovery of functional tissue parameters such as blood oxygenation. However, the underlying inverse problems are potentially ill-posed, meaning that radically different tissue properties may - in theory - yield comparable measurements. In this work, we present a new approach for handling this specific type of uncertainty by leveraging the concept of conditional invertible neural networks (cINNs). Specifically, we propose going beyond commonly used point estimates for tissue oxygenation and converting single-pixel initial pressure spectra to the full posterior probability density. This way, the inherent ambiguity of a problem can be encoded with multiple modes in the output. Based on the presented architecture, we demonstrate two use cases which leverage this information to not only detect and quantify but also to compensate for uncertainties: (1) photoacoustic device design and (2) optimization of photoacoustic image acquisition. Our in silico studies demonstrate the potential of the proposed methodology to become an important building block for uncertainty-aware reconstruction of physiological parameters with PAI.

Submitted 23 November, 2020; v1 submitted 10 November, 2020; originally announced November 2020.

Comments: 7 pages, 4 figures, submitted to "Bildverarbeitung für die Medizin (BVM) 2021"

arXiv:2010.06498 [pdf, other]

Cross-Domain Few-Shot Learning by Representation Fusion

Authors: Thomas Adler, Johannes Brandstetter, Michael Widrich, Andreas Mayr, David Kreil, Michael Kopp, Günter Klambauer, Sepp Hochreiter

Abstract: In order to quickly adapt to new data, few-shot learning aims at learning from few examples, often by using already acquired knowledge. The new data often differs from the previously seen data due to a domain shift, that is, a change of the input-target distribution. While several methods perform well on small domain shifts like new target classes with similar inputs, larger domain shifts are still challenging. Large domain shifts may result in high-level concepts that are not shared between the original and the new domain, whereas low-level concepts like edges in images might still be shared and useful. For cross-domain few-shot learning, we suggest representation fusion to unify different abstraction levels of a deep neural network into one representation. We propose Cross-domain Hebbian Ensemble Few-shot learning (CHEF), which achieves representation fusion by an ensemble of Hebbian learners acting on different layers of a deep neural network. Ablation studies show that representation fusion is a decisive factor to boost cross-domain few-shot learning. On the few-shot datasets miniImagenet and tieredImagenet with small domain shifts, CHEF is competitive with state-of-the-art methods. On cross-domain few-shot benchmark challenges with larger domain shifts, CHEF establishes novel state-of-the-art results in all categories. We further apply CHEF on a real-world cross-domain application in drug discovery. We consider a domain shift from bioactive molecules to environmental chemicals and drugs with twelve associated toxicity prediction tasks. On these tasks, which are highly relevant for computational drug discovery, CHEF significantly outperforms all its competitors. GitHub: https://github.com/ml-jku/chef

Submitted 16 February, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

arXiv:2008.02217 [pdf, other]

Hopfield Networks is All You Need

Authors: Hubert Ramsauer, Bernhard Schäfl, Johannes Lehner, Philipp Seidl, Michael Widrich, Thomas Adler, Lukas Gruber, Markus Holzleitner, Milena Pavlović, Geir Kjetil Sandve, Victor Greiff, David Kreil, Michael Kopp, Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter

Abstract: We introduce a modern Hopfield network with continuous states and a corresponding update rule. The new Hopfield network can store exponentially (with the dimension of the associative space) many patterns, retrieves the pattern with one update, and has exponentially small retrieval errors. It has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. The new update rule is equivalent to the attention mechanism used in transformers. This equivalence enables a characterization of the heads of transformer models. These heads perform in the first layers preferably global averaging and in higher layers partial averaging via metastable states. The new modern Hopfield network can be integrated into deep learning architectures as layers to allow the storage of and access to raw input data, intermediate results, or learned prototypes. These Hopfield layers enable new ways of deep learning, beyond fully-connected, convolutional, or recurrent networks, and provide pooling, memory, association, and attention mechanisms. We demonstrate the broad applicability of the Hopfield layers across various domains. Hopfield layers improved state-of-the-art on three out of four considered multiple instance learning problems as well as on immune repertoire classification with several hundreds of thousands of instances. On the UCI benchmark collections of small classification tasks, where deep learning methods typically struggle, Hopfield layers yielded a new state-of-the-art when compared to different machine learning methods. Finally, Hopfield layers achieved state-of-the-art on two drug design datasets. The implementation is available at: https://github.com/ml-jku/hopfield-layers

Submitted 28 April, 2021; v1 submitted 16 July, 2020; originally announced August 2020.

Comments: 10 pages (+ appendix); 12 figures; Blog: https://ml-jku.github.io/hopfield-layers/; GitHub: https://github.com/ml-jku/hopfield-layers

arXiv:2005.03501 [pdf]

Heidelberg Colorectal Data Set for Surgical Data Science in the Sensor Operating Room

Authors: Lena Maier-Hein, Martin Wagner, Tobias Ross, Annika Reinke, Sebastian Bodenstedt, Peter M. Full, Hellena Hempe, Diana Mindroc-Filimon, Patrick Scholz, Thuy Nuong Tran, Pierangela Bruno, Anna Kisilenko, Benjamin Müller, Tornike Davitashvili, Manuela Capek, Minu Tizabi, Matthias Eisenmann, Tim J. Adler, Janek Gröhl, Melanie Schellenberg, Silvia Seidlitz, T. Y. Emmy Lai, Bünyamin Pekdemir, Veith Roethlingshoefer, Fabian Both, et al. (8 additional authors not shown)

Abstract: Image-based tracking of medical instruments is an integral part of surgical data science applications. Previous research has addressed the tasks of detecting, segmenting and tracking medical instruments based on laparoscopic video data. However, the proposed methods still tend to fail when applied to challenging images and do not generalize well to data they have not been trained on. This paper introduces the Heidelberg Colorectal (HeiCo) data set - the first publicly available data set enabling comprehensive benchmarking of medical instrument detection and segmentation algorithms with a specific emphasis on method robustness and generalization capabilities. Our data set comprises 30 laparoscopic videos and corresponding sensor data from medical devices in the operating room for three different types of laparoscopic surgery. Annotations include surgical phase labels for all video frames as well as information on instrument presence and corresponding instance-wise segmentation masks for surgical instruments (if any) in more than 10,000 individual frames. The data has successfully been used to organize international competitions within the Endoscopic Vision Challenges 2017 and 2019.

Submitted 23 February, 2021; v1 submitted 7 May, 2020; originally announced May 2020.

Comments: Submitted to Nature Scientific Data

arXiv:1911.01877 [pdf, other]

doi 10.1007/978-3-030-32689-0_8

Out of distribution detection for intra-operative functional imaging

Authors: Tim J. Adler, Leonardo Ayala, Lynton Ardizzone, Hannes G. Kenngott, Anant Vemuri, Beat P. Müller-Stich, Carsten Rother, Ullrich Köthe, Lena Maier-Hein

Abstract: Multispectral optical imaging is becoming a key tool in the operating room. Recent research has shown that machine learning algorithms can be used to convert pixel-wise reflectance measurements to tissue parameters, such as oxygenation. However, the accuracy of these algorithms can only be guaranteed if the spectra acquired during surgery match the ones seen during training. It is therefore of great interest to detect so-called out of distribution (OoD) spectra to prevent the algorithm from presenting spurious results. In this paper we present an information-theoretic approach to OoD detection based on the widely applicable information criterion (WAIC). Our work builds upon recent methodology related to invertible neural networks (INN). Specifically, we make use of an ensemble of INNs as we need their tractable Jacobians in order to compute the WAIC. Comprehensive experiments with in silico and in vivo multispectral imaging data indicate that our approach is well-suited for OoD detection. Our method could thus be an important step towards reliable functional imaging in the operating room.

Submitted 5 November, 2019; originally announced November 2019.

Comments: The final authenticated version is available online at https://doi.org/10.1007/978-3-030-32689-0_8

Journal ref: Proceedings of the First International Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, UNSURE 2019, and the 8th International Workshop on Clinical Image-Based Procedures, CLIP 2019

arXiv:1910.13804 [pdf, other]

doi 10.3390/photonics8120535

Quantum Optical Experiments Modeled by Long Short-Term Memory

Authors: Thomas Adler, Manuel Erhard, Mario Krenn, Johannes Brandstetter, Johannes Kofler, Sepp Hochreiter

Abstract: We demonstrate how machine learning is able to model experiments in quantum physics. Quantum entanglement is a cornerstone for upcoming quantum technologies such as quantum computation and quantum cryptography. Of particular interest are complex quantum states with more than two particles and a large number of entangled quantum levels. Given such a multiparticle high-dimensional quantum state, it is usually impossible to reconstruct an experimental setup that produces it. To search for interesting experiments, one thus has to randomly create millions of setups on a computer and calculate the respective output states. In this work, we show that machine learning models can provide significant improvement over random search. We demonstrate that a long short-term memory (LSTM) neural network can successfully learn to model quantum experiments by correctly predicting output state characteristics for given setups without the necessity of computing the states themselves. This approach not only allows for faster search but is also an essential step towards automated design of multiparticle high-dimensional quantum experiments using generative machine learning models.

Submitted 30 October, 2019; originally announced October 2019.

Comments: 9 pages

Journal ref: Photonics 8(12), 535 (2021)

arXiv:1910.04093 [pdf, other]

Patch Refinement -- Localized 3D Object Detection

Authors: Johannes Lehner, Andreas Mitterecker, Thomas Adler, Markus Hofmarcher, Bernhard Nessler, Sepp Hochreiter

Abstract: We introduce Patch Refinement, a two-stage model for accurate 3D object detection and localization from point cloud data. Patch Refinement is composed of two independently trained Voxelnet-based networks, a Region Proposal Network (RPN) and a Local Refinement Network (LRN). We decompose the detection task into a preliminary Bird's Eye View (BEV) detection step and a local 3D detection step. Based on the BEV locations proposed by the RPN, we extract small point cloud subsets ("patches"), which are then processed by the LRN, which is less limited by memory constraints due to the small area of each patch. Therefore, we can apply encoding with a higher voxel resolution locally. The independence of the LRN enables the use of additional augmentation techniques and allows for an efficient, regression-focused training as it uses only a small fraction of each scene. Evaluated on the KITTI 3D object detection benchmark, our submission from January 28, 2019, outperformed all previous entries on all three difficulties of the class car, using only 50 % of the available training data and only LiDAR information.

Submitted 9 October, 2019; originally announced October 2019.

Comments: Machine Learning for Autonomous Driving Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

arXiv:1904.11809 [pdf, other]

doi 10.1117/12.2509608

Photoacoustic monitoring of blood oxygenation during neurosurgical interventions

Authors: Thomas Kirchner, Janek Gröhl, Niklas Holzwarth, Mildred A. Herrera, Tim Adler, Adrián Hernández-Aguilera, Edgar Santos, Lena Maier-Hein

Abstract: Multispectral photoacoustic (PA) imaging is a prime modality to monitor hemodynamics and changes in blood oxygenation (sO2). Although sO2 changes can be an indicator of brain activity both in normal and in pathological conditions, PA imaging of the brain has mainly focused on small animal models with lissencephalic brains. Therefore, the purpose of this work was to investigate the usefulness of multispectral PA imaging in assessing sO2 in a gyrencephalic brain. To this end, we continuously imaged a porcine brain as part of an open neurosurgical intervention with a handheld PA and ultrasonic (US) imaging system in vivo. Throughout the experiment, we varied respiratory oxygen and continuously measured arterial blood gases. The arterial blood oxygenation (SaO2) values derived by the blood gas analyzer were used as a reference to compare the performance of linear spectral unmixing algorithms in this scenario. According to our experiment, PA imaging can be used to monitor sO2 in the porcine cerebral cortex. While linear spectral unmixing algorithms are well-suited for detecting changes in oxygenation, there are limits with respect to the accurate quantification of sO2, especially in depth. Overall, we conclude that multispectral PA imaging can potentially be a valuable tool for change detection of sO2 in the cerebral cortex of a gyrencephalic brain. The spectral unmixing algorithms investigated in this work will be made publicly available as part of the open-source software platform Medical Imaging Interaction Toolkit (MITK).

Submitted 26 April, 2019; originally announced April 2019.

Comments: AAM Conference Proceedings, Photons Plus Ultrasound: Imaging and Sensing 2019; 108780C (2019)

arXiv:1903.03441 [pdf, other]

doi 10.1007/s11548-019-01939-9

Uncertainty-aware performance assessment of optical imaging modalities with invertible neural networks

Authors: Tim J. Adler, Lynton Ardizzone, Anant Vemuri, Leonardo Ayala, Janek Gröhl, Thomas Kirchner, Sebastian Wirkert, Jakob Kruse, Carsten Rother, Ullrich Köthe, Lena Maier-Hein

Abstract: Purpose: Optical imaging is evolving as a key technique for advanced sensing in the operating room. Recent research has shown that machine learning algorithms can be used to address the inverse problem of converting pixel-wise multispectral reflectance measurements to underlying tissue parameters, such as oxygenation. Assessment of the specific hardware used in conjunction with such algorithms, however, has not properly addressed the possibility that the problem may be ill-posed. Methods: We present a novel approach to the assessment of optical imaging modalities, which is sensitive to the different types of uncertainties that may occur when inferring tissue parameters. Based on the concept of invertible neural networks, our framework goes beyond point estimates and maps each multispectral measurement to a full posterior probability distribution which is capable of representing ambiguity in the solution via multiple modes. Performance metrics for a hardware setup can then be computed from the characteristics of the posteriors. Results: Application of the assessment framework to the specific use case of camera selection for physiological parameter estimation yields the following insights: (1) Estimation of tissue oxygenation from multispectral images is a well-posed problem, while (2) blood volume fraction may not be recovered without ambiguity. (3) In general, ambiguity may be reduced by increasing the number of spectral bands in the camera. Conclusion: Our method could help to optimize optical camera design in an application-specific manner.

Submitted 8 March, 2019; originally announced March 2019.

Comments: Accepted at IPCAI 2019

arXiv:1902.05839 [pdf, other]

doi 10.1038/s41598-021-83405-8

Estimation of blood oxygenation with learned spectral decoloring for quantitative photoacoustic imaging (LSD-qPAI)

Authors: Janek Gröhl, Thomas Kirchner, Tim Adler, Lena Maier-Hein

Abstract: One of the main applications of photoacoustic (PA) imaging is the recovery of functional tissue properties, such as blood oxygenation (sO2). This is typically achieved by linear spectral unmixing of relevant chromophores from multispectral photoacoustic images. Despite the progress that has been made towards quantitative PA imaging (qPAI), most sO2 estimation methods yield poor results in realistic settings. In this work, we tackle the challenge by employing learned spectral decoloring for quantitative photoacoustic imaging (LSD-qPAI) to obtain quantitative estimates for blood oxygenation. LSD-qPAI computes sO2 directly from pixel-wise initial pressure spectra Sp0, which are vectors comprised of the initial pressure at the same spatial location over all recorded wavelengths. Initial results suggest that LSD-qPAI is able to obtain accurate sO2 estimates directly from multispectral photoacoustic measurements in silico and plausible estimates in vivo.

Submitted 15 February, 2019; originally announced February 2019.

Comments: 5 pages

arXiv:1901.02786 [pdf, other]

Photoacoustics can image spreading depolarization deep in gyrencephalic brain

Authors: Thomas Kirchner, Janek Gröhl, Mildred Herrera, Tim Adler, Adrián Hernández-Aguilera, Edgar Santos, Lena Maier-Hein

Abstract: Spreading depolarization (SD) is a self-propagating wave of near-complete neuronal depolarization that is abundant in a wide range of neurological conditions, including stroke. SD was only recently documented in humans and is now considered a therapeutic target for brain injury, but the mechanisms related to SD in complex brains are not well understood. While there are numerous approaches to interventional imaging of SD on the exposed brain surface, measuring SD deep in the brain is so far only possible with low spatiotemporal resolution and poor contrast. Here, we show that photoacoustic imaging enables the study of SD and its hemodynamics deep in the gyrencephalic brain with high spatiotemporal resolution. As rapid neuronal depolarization causes tissue hypoxia, we achieve this by continuously estimating blood oxygenation with an intraoperative hybrid photoacoustic and ultrasonic (PAUS) imaging system. Due to its high resolution, promising imaging depth and high contrast, this novel approach to SD imaging can yield new insights into SD and thereby lead to advances in stroke and brain injury research.

Submitted 14 January, 2019; v1 submitted 9 January, 2019; originally announced January 2019.

Showing 1–27 of 27 results for author: Adler, T