Search | arXiv e-print repository

Is Epistemic Uncertainty Faithfully Represented by Evidential Deep Learning Methods?

Authors: Mira Jürgens, Nis Meinert, Viktor Bengs, Eyke Hüllermeier, Willem Waegeman

Abstract: Trustworthy ML systems should not only return accurate predictions, but also a reliable representation of their uncertainty. Bayesian methods are commonly used to quantify both aleatoric and epistemic uncertainty, but alternative approaches, such as evidential deep learning methods, have become popular in recent years. The latter group of methods in essence extends empirical risk minimization (ERM… ▽ More Trustworthy ML systems should not only return accurate predictions, but also a reliable representation of their uncertainty. Bayesian methods are commonly used to quantify both aleatoric and epistemic uncertainty, but alternative approaches, such as evidential deep learning methods, have become popular in recent years. The latter group of methods in essence extends empirical risk minimization (ERM) for predicting second-order probability distributions over outcomes, from which measures of epistemic (and aleatoric) uncertainty can be extracted. This paper presents novel theoretical insights of evidential deep learning, highlighting the difficulties in optimizing second-order loss functions and interpreting the resulting epistemic uncertainty measures. With a systematic setup that covers a wide range of approaches for classification, regression and counts, it provides novel insights into issues of identifiability and convergence in second-order loss minimization, and the relative (rather than absolute) nature of epistemic uncertainty measures. △ Less

Submitted 20 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

arXiv:2212.02462 [pdf, other]

Whale Casting: Remote mobile streaming humpback whale vocalizations to the world

Authors: James P. Crutchfield, Alexandra M. Jurgens

Abstract: Over several days in early August 2021, while at sea in Chatham Strait, Southeast Alaska, aboard M/Y Blue Pearl, an online twitch.tv stream broadcast in real-time humpback whale vocalizations monitored via hydrophone. Dozens on mainland North American and around the planet listened in and chatted via the stream. The webcasts demonstrated a proof-of-concept: only relatively inexpensive commercial-o… ▽ More Over several days in early August 2021, while at sea in Chatham Strait, Southeast Alaska, aboard M/Y Blue Pearl, an online twitch.tv stream broadcast in real-time humpback whale vocalizations monitored via hydrophone. Dozens on mainland North American and around the planet listened in and chatted via the stream. The webcasts demonstrated a proof-of-concept: only relatively inexpensive commercial-off-the-shelf equipment is required for remote mobile streaming at sea. These notes document what was required and make recommendations for higher-quality and larger-scale deployments. One conclusion is that real-time, automated audio documenting whale acoustic behavior is readily accessible and, using the cloud, it can be directly integrated into behavioral databases -- information sources that now often focus exclusively on nonreal-time visual-sighting narrative reports and photography. △ Less

Submitted 5 December, 2022; originally announced December 2022.

Comments: 6 pages, 3 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/whalecasting.html

arXiv:2205.02848 [pdf, other]

Building Brains: Subvolume Recombination for Data Augmentation in Large Vessel Occlusion Detection

Authors: Florian Thamm, Oliver Taubmann, Markus Jürgens, Aleksandra Thamm, Felix Denzinger, Leonhard Rist, Hendrik Ditt, Andreas Maier

Abstract: Ischemic strokes are often caused by large vessel occlusions (LVOs), which can be visualized and diagnosed with Computed Tomography Angiography scans. As time is brain, a fast, accurate and automated diagnosis of these scans is desirable. Human readers compare the left and right hemispheres in their assessment of strokes. A large training data set is required for a standard deep learning-based mod… ▽ More Ischemic strokes are often caused by large vessel occlusions (LVOs), which can be visualized and diagnosed with Computed Tomography Angiography scans. As time is brain, a fast, accurate and automated diagnosis of these scans is desirable. Human readers compare the left and right hemispheres in their assessment of strokes. A large training data set is required for a standard deep learning-based model to learn this strategy from data. As labeled medical data in this field is rare, other approaches need to be developed. To both include the prior knowledge of side comparison and increase the amount of training data, we propose an augmentation method that generates artificial training samples by recombining vessel tree segmentations of the hemispheres or hemisphere subregions from different patients. The subregions cover vessels commonly affected by LVOs, namely the internal carotid artery (ICA) and middle cerebral artery (MCA). In line with the augmentation scheme, we use a 3D-DenseNet fed with task-specific input, fostering a side-by-side comparison between the hemispheres. Furthermore, we propose an extension of that architecture to process the individual hemisphere subregions. All configurations predict the presence of an LVO, its side, and the affected subregion. We show the effect of recombination as an augmentation strategy in a 5-fold cross validated ablation study. We enhanced the AUC for patient-wise classification regarding the presence of an LVO of all investigated architectures. For one variant, the proposed method improved the AUC from 0.73 without augmentation to 0.89. The best configuration detects LVOs with an AUC of 0.91, LVOs in the ICA with an AUC of 0.96, and in the MCA with 0.91 while accurately predicting the affected side. △ Less

Submitted 16 May, 2022; v1 submitted 5 May, 2022; originally announced May 2022.

Comments: PrePrint - Accepted at MICCAI 2022

arXiv:2204.12333 [pdf, other]

An Algorithm for the Labeling and Interactive Visualization of the Cerebrovascular System of Ischemic Strokes

Authors: Florian Thamm, Markus Jürgens, Oliver Taubmann, Aleksandra Thamm, Leonhard Rist, Hendrik Ditt, Andreas Maier

Abstract: During the diagnosis of ischemic strokes, the Circle of Willis and its surrounding vessels are the arteries of interest. Their visualization in case of an acute stroke is often enabled by Computed Tomography Angiography (CTA). Still, the identification and analysis of the cerebral arteries remain time consuming in such scans due to a large number of peripheral vessels which may disturb the visual… ▽ More During the diagnosis of ischemic strokes, the Circle of Willis and its surrounding vessels are the arteries of interest. Their visualization in case of an acute stroke is often enabled by Computed Tomography Angiography (CTA). Still, the identification and analysis of the cerebral arteries remain time consuming in such scans due to a large number of peripheral vessels which may disturb the visual impression. In previous work we proposed VirtualDSA++, an algorithm designed to segment and label the cerebrovascular tree on CTA scans. Especially with stroke patients, labeling is a delicate procedure, as in the worst case whole hemispheres may not be present due to impeded perfusion. Hence, we extended the labeling mechanism for the cerebral arteries to identify occluded vessels. In the work at hand, we place the algorithm in a clinical context by evaluating the labeling and occlusion detection on stroke patients, where we have achieved labeling sensitivities comparable to other works between 92\,\% and 95\,\%. To the best of our knowledge, ours is the first work to address labeling and occlusion detection at once, whereby a sensitivity of 67\,\% and a specificity of 81\,\% were obtained for the latter. VirtualDSA++ also automatically segments and models the intracranial system, which we further used in a deep learning driven follow up work. We present the generic concept of iterative systematic search for pathways on all nodes of said model, which enables new interactive features. Exemplary, we derive in detail, firstly, the interactive planning of vascular interventions like the mechanical thrombectomy and secondly, the interactive suppression of vessel structures that are not of interest in diagnosing strokes (like veins). We discuss both features as well as further possibilities emerging from the proposed concept. △ Less

Submitted 26 April, 2022; originally announced April 2022.

arXiv:2112.01797 [pdf, other]

Detection of Large Vessel Occlusions using Deep Learning by Deforming Vessel Tree Segmentations

Authors: Florian Thamm, Oliver Taubmann, Markus Jürgens, Hendrik Ditt, Andreas Maier

Abstract: Computed Tomography Angiography is a key modality providing insights into the cerebrovascular vessel tree that are crucial for the diagnosis and treatment of ischemic strokes, in particular in cases of large vessel occlusions (LVO). Thus, the clinical workflow greatly benefits from an automated detection of patients suffering from LVOs. This work uses convolutional neural networks for case-level c… ▽ More Computed Tomography Angiography is a key modality providing insights into the cerebrovascular vessel tree that are crucial for the diagnosis and treatment of ischemic strokes, in particular in cases of large vessel occlusions (LVO). Thus, the clinical workflow greatly benefits from an automated detection of patients suffering from LVOs. This work uses convolutional neural networks for case-level classification trained with elastic deformation of the vessel tree segmentation masks to artificially augment training data. Using only masks as the input to our model uniquely allows us to apply such deformations much more aggressively than one could with conventional image volumes while retaining sample realism. The neural network classifies the presence of an LVO and the affected hemisphere. In a 5-fold cross validated ablation study, we demonstrate that the use of the suggested augmentation enables us to train robust models even from few data sets. Training the EfficientNetB1 architecture on 100 data sets, the proposed augmentation scheme was able to raise the ROC AUC to 0.85 from a baseline value of 0.56 using no augmentation. The best performance was achieved using a 3D-DenseNet yielding an AUC of 0.87. The augmentation had positive impact in classification of the affected hemisphere as well, where the 3D-DenseNet reached an AUC of 0.93 on both sides. △ Less

Submitted 5 May, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

Comments: 7 pages. Accepted at BVM-Workshop 2022, Springer

arXiv:2102.10487 [pdf, other]

doi 10.1063/5.0050460

Divergent Predictive States: The Statistical Complexity Dimension of Stationary, Ergodic Hidden Markov Processes

Authors: Alexandra M. Jurgens, James P. Crutchfield

Abstract: Even simply-defined, finite-state generators produce stochastic processes that require tracking an uncountable infinity of probabilistic features for optimal prediction. For processes generated by hidden Markov chains the consequences are dramatic. Their predictive models are generically infinite-state. And, until recently, one could determine neither their intrinsic randomness nor structural comp… ▽ More Even simply-defined, finite-state generators produce stochastic processes that require tracking an uncountable infinity of probabilistic features for optimal prediction. For processes generated by hidden Markov chains the consequences are dramatic. Their predictive models are generically infinite-state. And, until recently, one could determine neither their intrinsic randomness nor structural complexity. The prequel, though, introduced methods to accurately calculate the Shannon entropy rate (randomness) and to constructively determine their minimal (though, infinite) set of predictive features. Leveraging this, we address the complementary challenge of determining how structured hidden Markov processes are by calculating their statistical complexity dimension -- the information dimension of the minimal set of predictive features. This tracks the divergence rate of the minimal memory resources required to optimally predict a broad class of truly complex processes. △ Less

Submitted 15 March, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

Comments: 16 pages, 6 figures; Supplementary Material, 6 pages, 2 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/icfshmp.htm

arXiv:2008.12886 [pdf, other]

doi 10.1007/s10955-021-02769-3

Shannon Entropy Rate of Hidden Markov Processes

Authors: Alexandra M. Jurgens, James P. Crutchfield

Abstract: Hidden Markov chains are widely applied statistical models of stochastic processes, from fundamental physics and chemistry to finance, health, and artificial intelligence. The hidden Markov processes they generate are notoriously complicated, however, even if the chain is finite state: no finite expression for their Shannon entropy rate exists, as the set of their predictive features is genericall… ▽ More Hidden Markov chains are widely applied statistical models of stochastic processes, from fundamental physics and chemistry to finance, health, and artificial intelligence. The hidden Markov processes they generate are notoriously complicated, however, even if the chain is finite state: no finite expression for their Shannon entropy rate exists, as the set of their predictive features is generically infinite. As such, to date one cannot make general statements about how random they are nor how structured. Here, we address the first part of this challenge by showing how to efficiently and accurately calculate their entropy rates. We also show how this method gives the minimal set of infinite predictive features. A sequel addresses the challenge's second part on structure. △ Less

Submitted 28 August, 2020; originally announced August 2020.

Comments: 11 pages, 4 figures; supplementary material 10 pages, 7 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/serhmp.htm

arXiv:2003.00139 [pdf, other]

doi 10.1103/PhysRevResearch.2.033334

Functional Thermodynamics of Maxwellian Ratchets: Constructing and Deconstructing Patterns, Randomizing and Derandomizing Behaviors

Authors: Alexandra M. Jurgens, James P. Crutchfield

Abstract: Maxwellian ratchets are autonomous, finite-state thermodynamic engines that implement input-output informational transformations. Previous studies of these "demons" focused on how they exploit environmental resources to generate work: They randomize ordered inputs, leveraging increased Shannon entropy to transfer energy from a thermal reservoir to a work reservoir while respecting both Liouvillian… ▽ More Maxwellian ratchets are autonomous, finite-state thermodynamic engines that implement input-output informational transformations. Previous studies of these "demons" focused on how they exploit environmental resources to generate work: They randomize ordered inputs, leveraging increased Shannon entropy to transfer energy from a thermal reservoir to a work reservoir while respecting both Liouvillian state-space dynamics and the Second Law. However, to date, correctly determining such functional thermodynamic operating regimes was restricted to a very few engines for which correlations among their information-bearing degrees of freedom could be calculated exactly and in closed form---a highly restricted set. Additionally, a key second dimension of ratchet behavior was largely ignored---ratchets do not merely change the randomness of environmental inputs, their operation constructs and deconstructs patterns. To address both dimensions, we adapt recent results from dynamical-systems and ergodic theories that efficiently and accurately calculate the entropy rates and the rate of statistical complexity divergence of general hidden Markov processes. In concert with the Information Processing Second Law, these methods accurately determine thermodynamic operating regimes for finite-state Maxwellian demons with arbitrary numbers of states and transitions. In addition, they facilitate analyzing structure versus randomness trade-offs that a given engine makes. The result is a greatly enhanced perspective on the information processing capabilities of information engines. As an application, we give a thorough-going analysis of the Mandal-Jarzynski ratchet, demonstrating that it has an uncountably-infinite effective state space. △ Less

Submitted 29 May, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

Comments: 16 pages, 7 figures; supplemental materials, 8 pages, 3 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/ftamr.htm

Journal ref: Phys. Rev. Research 2, 033334 (2020)

Showing 1–8 of 8 results for author: Jürgens, M