Search | arXiv e-print repository

arXiv:2402.19460

Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks

Authors: Bálint Mucsányi, Michael Kirchhof, Seong Joon Oh

Abstract: Uncertainty quantification, once a singular task, has evolved into a spectrum of tasks, including abstained prediction, out-of-distribution detection, and aleatoric uncertainty quantification. The latest goal is disentanglement: the construction of multiple estimators that are each tailored to one and only one task. Hence, there is a plethora of recent advances with different intentions - that often entirely deviate from practical behavior. This paper conducts a comprehensive evaluation of numerous uncertainty estimators across diverse tasks on ImageNet. We find that, despite promising theoretical endeavors, disentanglement is not yet achieved in practice. Additionally, we reveal which uncertainty estimators excel at which specific tasks, providing insights for practitioners and guiding future research toward task-centric and disentangled uncertainty estimation methods. Our code is available at https://github.com/bmucsanyi/bud.

Submitted 29 February, 2024; originally announced February 2024.

Comments: 43 pages

arXiv:2402.16569

Pretrained Visual Uncertainties

Authors: Michael Kirchhof, Mark Collier, Seong Joon Oh, Enkelejda Kasneci

Abstract: Accurate uncertainty estimation is vital to trustworthy machine learning, yet uncertainties typically have to be learned for each task anew. This work introduces the first pretrained uncertainty modules for vision models. Similar to standard pretraining this enables the zero-shot transfer of uncertainties learned on a large pretraining dataset to specialized downstream datasets. We enable our large-scale pretraining on ImageNet-21k by solving a gradient conflict in previous uncertainty modules and accelerating the training by up to 180x. We find that the pretrained uncertainties generalize to unseen datasets. In scrutinizing the learned uncertainties, we find that they capture aleatoric uncertainty, disentangled from epistemic components. We demonstrate that this enables safe retrieval and uncertainty-aware dataset visualization. To encourage applications to further problems and domains, we release all pretrained checkpoints and code under https://github.com/mkirchhof/url .

Submitted 27 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

arXiv:2310.08215

Trustworthy Machine Learning

Authors: Bálint Mucsányi, Michael Kirchhof, Elisa Nguyen, Alexander Rubinstein, Seong Joon Oh

Abstract: As machine learning technology gets applied to actual products and solutions, new challenges have emerged. Models unexpectedly fail to generalize to small changes in the distribution, tend to be confident on novel data they have never seen, or cannot communicate the rationale behind their decisions effectively with the end users. Collectively, we face a trustworthiness issue with the current machine learning technology. This textbook on Trustworthy Machine Learning (TML) covers a theoretical and technical background of four key topics in TML: Out-of-Distribution Generalization, Explainability, Uncertainty Quantification, and Evaluation of Trustworthiness. We discuss important classical and contemporary research papers of the aforementioned fields and uncover and connect their underlying intuitions. The book evolved from the homonymous course at the University of Tübingen, first offered in the Winter Semester of 2022/23. It is meant to be a stand-alone product accompanied by code snippets and various pointers to further sources on topics of TML. The dedicated website of the book is https://trustworthyml.io/.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 373 pages, textbook at the University of Tübingen

ACM Class: I.2.0

arXiv:2307.03810

URL: A Representation Learning Benchmark for Transferable Uncertainty Estimates

Authors: Michael Kirchhof, Bálint Mucsányi, Seong Joon Oh, Enkelejda Kasneci

Abstract: Representation learning has significantly driven the field to develop pretrained models that can act as a valuable starting point when transferring to new datasets. With the rising demand for reliable machine learning and uncertainty quantification, there is a need for pretrained models that not only provide embeddings but also transferable uncertainty estimates. To guide the development of such models, we propose the Uncertainty-aware Representation Learning (URL) benchmark. Besides the transferability of the representations, it also measures the zero-shot transferability of the uncertainty estimate using a novel metric. We apply URL to evaluate eleven uncertainty quantifiers that are pretrained on ImageNet and transferred to eight downstream datasets. We find that approaches that focus on the uncertainty of the representation itself or estimate the prediction risk directly outperform those that are based on the probabilities of upstream classes. Yet, achieving transferable uncertainty quantification remains an open challenge. Our findings indicate that it is not necessarily in conflict with traditional representation learning goals. Code is provided under https://github.com/mkirchhof/url .

Submitted 19 October, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

Comments: Accepted at the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS D&B 2023)

arXiv:2302.02865

Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs

Authors: Michael Kirchhof, Enkelejda Kasneci, Seong Joon Oh

Abstract: Contrastively trained encoders have recently been proven to invert the data-generating process: they encode each input, e.g., an image, into the true latent vector that generated the image (Zimmermann et al., 2021). However, real-world observations often have inherent ambiguities. For instance, images may be blurred or only show a 2D view of a 3D object, so multiple latents could have generated them. This makes the true posterior for the latent vector probabilistic with heteroscedastic uncertainty. In this setup, we extend the common InfoNCE objective and encoders to predict latent distributions instead of points. We prove that these distributions recover the correct posteriors of the data-generating process, including its level of aleatoric uncertainty, up to a rotation of the latent space. In addition to providing calibrated uncertainty estimates, these posteriors allow the computation of credible intervals in image retrieval. They comprise images with the same latent as a given query, subject to its uncertainty. Code is available at https://github.com/mkirchhof/Probabilistic_Contrastive_Learning

Submitted 17 May, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

Comments: Accepted at ICML 2023

arXiv:2207.03784

A Non-isotropic Probabilistic Take on Proxy-based Deep Metric Learning

Authors: Michael Kirchhof, Karsten Roth, Zeynep Akata, Enkelejda Kasneci

Abstract: Proxy-based Deep Metric Learning (DML) learns deep representations by embedding images close to their class representatives (proxies), commonly with respect to the angle between them. However, this disregards the embedding norm, which can carry additional beneficial context such as class- or image-intrinsic uncertainty. In addition, proxy-based DML struggles to learn class-internal structures. To address both issues at once, we introduce non-isotropic probabilistic proxy-based DML. We model images as directional von Mises-Fisher (vMF) distributions on the hypersphere that can reflect image-intrinsic uncertainties. Further, we derive non-isotropic von Mises-Fisher (nivMF) distributions for class proxies to better represent complex class-specific variances. To measure the proxy-to-image distance between these models, we develop and investigate multiple distribution-to-point and distribution-to-distribution metrics. Each framework choice is motivated by a set of ablational studies, which showcase beneficial properties of our probabilistic approach to proxy-based DML, such as uncertainty-awareness, better-behaved gradients during training, and overall improved generalization performance. The latter is especially reflected in the competitive performance on the standard DML benchmarks, where our approach compares favorably, suggesting that existing proxy-based DML can significantly benefit from a more probabilistic treatment. Code is available at github.com/ExplainableML/Probabilistic_Deep_Metric_Learning.

Submitted 8 July, 2022; originally announced July 2022.

Comments: Accepted as conference paper at ECCV 2022

arXiv:2206.13872

When are Post-hoc Conceptual Explanations Identifiable?

Authors: Tobias Leemann, Michael Kirchhof, Yao Rong, Enkelejda Kasneci, Gjergji Kasneci

Abstract: Interest in understanding and factorizing learned embedding spaces through conceptual explanations is steadily growing. When no human concept labels are available, concept discovery methods search trained embedding spaces for interpretable concepts like object shape or color that can provide post-hoc explanations for decisions. Unlike previous work, we argue that concept discovery should be identifiable, meaning that a number of known concepts can be provably recovered to guarantee reliability of the explanations. As a starting point, we explicitly make the connection between concept discovery and classical methods like Principal Component Analysis and Independent Component Analysis by showing that they can recover independent concepts under non-Gaussian distributions. For dependent concepts, we propose two novel approaches that exploit functional compositionality properties of image-generating processes. Our provably identifiable concept discovery methods substantially outperform competitors on a battery of experiments including hundreds of trained models and dependent concepts, where they exhibit up to 29 % better alignment with the ground truth. Our results highlight the strict conditions under which reliable concept discovery without human labels can be guaranteed and provide a formal foundation for the domain. Our code is available online.

Submitted 6 June, 2023; v1 submitted 28 June, 2022; originally announced June 2022.

Comments: v5: UAI2023 camera-ready including supplementary material. The first two authors contributed equally

arXiv:2105.13850

pRSL: Interpretable Multi-label Stacking by Learning Probabilistic Rules

Authors: Michael Kirchhof, Lena Schmid, Christopher Reining, Michael ten Hompel, Markus Pauly

Abstract: A key task in multi-label classification is modeling the structure between the involved classes. Modeling this structure by probabilistic and interpretable means enables application in a broad variety of tasks such as zero-shot learning or learning from incomplete data. In this paper, we present the probabilistic rule stacking learner (pRSL) which uses probabilistic propositional logic rules and belief propagation to combine the predictions of several underlying classifiers. We derive algorithms for exact and approximate inference and learning, and show that pRSL reaches state-of-the-art performance on various benchmark datasets. In the process, we introduce a novel multicategorical generalization of the noisy-or gate. Additionally, we report simulation results on the quality of loopy belief propagation algorithms for approximate inference in bipartite noisy-or networks.

Submitted 28 May, 2021; originally announced May 2021.

arXiv:2006.03610

Root Cause Analysis in Lithium-Ion Battery Production with FMEA-Based Large-Scale Bayesian Network

Authors: Michael Kirchhof, Klaus Haas, Thomas Kornas, Sebastian Thiede, Mario Hirz, Christoph Herrmann

Abstract: The production of lithium-ion battery cells is characterized by a high degree of complexity due to numerous cause-effect relationships between process characteristics. Knowledge about the multi-stage production is spread among several experts, rendering tasks as failure analysis challenging. In this paper, a new method is presented that includes expert knowledge acquisition in production ramp-up by combining Failure Mode and Effects Analysis (FMEA) with a Bayesian Network. Special algorithms are presented that help detect and resolve inconsistencies between the expert-provided parameters which are bound to occur when collecting knowledge from several process experts. We show the effectiveness of this holistic method by building up a large scale, cross-process Bayesian Failure Network in lithium-ion battery production and its application for root cause analysis.

Submitted 15 June, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: Submitted to CIRP Journal of Manufacturing Science and Technology (01.2020)

MSC Class: 62P30

ACM Class: I.2.1

Showing 1–9 of 9 results for author: Kirchhof, M