Search | arXiv e-print repository

Uncertainty Estimation in Instance Segmentation with Star-convex Shapes

Authors: Qasim M. K. Siddiqui, Sebastian Starke, Peter Steinbach

Abstract: Instance segmentation has witnessed promising advancements through deep neural network-based algorithms. However, these models often exhibit incorrect predictions with unwarranted confidence levels. Consequently, evaluating prediction uncertainty becomes critical for informed decision-making. Existing methods primarily focus on quantifying uncertainty in classification or regression tasks, lacking… ▽ More Instance segmentation has witnessed promising advancements through deep neural network-based algorithms. However, these models often exhibit incorrect predictions with unwarranted confidence levels. Consequently, evaluating prediction uncertainty becomes critical for informed decision-making. Existing methods primarily focus on quantifying uncertainty in classification or regression tasks, lacking emphasis on instance segmentation. Our research addresses the challenge of estimating spatial certainty associated with the location of instances with star-convex shapes. Two distinct clustering approaches are evaluated which compute spatial and fractional certainty per instance employing samples by the Monte-Carlo Dropout or Deep Ensemble technique. Our study demonstrates that combining spatial and fractional certainty scores yields improved calibrated estimation over individual certainty scores. Notably, our experimental results show that the Deep Ensemble technique alongside our novel radial clustering approach proves to be an effective strategy. Our findings emphasize the significance of evaluating the calibration of estimated certainties for model reliability and decision-making. △ Less

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2206.08738 [pdf, other]

Detecting Adversarial Examples in Batches -- a geometrical approach

Authors: Danush Kumar Venkatesh, Peter Steinbach

Abstract: Many deep learning methods have successfully solved complex tasks in computer vision and speech recognition applications. Nonetheless, the robustness of these models has been found to be vulnerable to perturbed inputs or adversarial examples, which are imperceptible to the human eye, but lead the model to erroneous output decisions. In this study, we adapt and introduce two geometric metrics, dens… ▽ More Many deep learning methods have successfully solved complex tasks in computer vision and speech recognition applications. Nonetheless, the robustness of these models has been found to be vulnerable to perturbed inputs or adversarial examples, which are imperceptible to the human eye, but lead the model to erroneous output decisions. In this study, we adapt and introduce two geometric metrics, density and coverage, and evaluate their use in detecting adversarial samples in batches of unseen data. We empirically study these metrics using MNIST and two real-world biomedical datasets from MedMNIST, subjected to two different adversarial attacks. Our experiments show promising results for both metrics to detect adversarial examples. We believe that his work can lay the ground for further study on these metrics' use in deployed machine learning systems to monitor for possible attacks by adversarial examples or related pathologies such as dataset shift. △ Less

Submitted 17 June, 2022; originally announced June 2022.

Comments: Submitted to AdvML workshop at ICML2022

arXiv:2204.14226 [pdf, other]

doi 10.1038/s41379-022-01147-y

Recommendations on test datasets for evaluating AI solutions in pathology

Authors: André Homeyer, Christian Geißler, Lars Ole Schwen, Falk Zakrzewski, Theodore Evans, Klaus Strohmenger, Max Westphal, Roman David Bülow, Michaela Kargl, Aray Karjauv, Isidre Munné-Bertran, Carl Orge Retzlaff, Adrià Romero-López, Tomasz Sołtysiński, Markus Plass, Rita Carvalho, Peter Steinbach, Yu-Chia Lan, Nassim Bouteldja, David Haber, Mateo Rojas-Carulla, Alireza Vafaei Sadr, Matthias Kraft, Daniel Krüger, Rutger Fick , et al. (5 additional authors not shown)

Abstract: Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging and specific recom… ▽ More Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging and specific recommendations are missing. A committee of various stakeholders, including commercial AI developers, pathologists, and researchers, discussed key aspects and conducted extensive literature reviews on test datasets in pathology. Here, we summarize the results and derive general recommendations for the collection of test datasets. We address several questions: Which and how many images are needed? How to deal with low-prevalence subsets? How can potential bias be detected? How should datasets be reported? What are the regulatory requirements in different countries? The recommendations are intended to help AI developers demonstrate the utility of their products and to help regulatory agencies and end users verify reported performance measures. Further research is needed to formulate criteria for sufficiently representative test datasets so that AI solutions can operate with less user intervention and better support diagnostic workflows in the future. △ Less

Submitted 21 April, 2022; originally announced April 2022.

Journal ref: Mod Pathol (2022)

arXiv:2204.05173 [pdf, other]

Machine Learning State-of-the-Art with Uncertainties

Authors: Peter Steinbach, Felicita Gernhardt, Mahnoor Tanveer, Steve Schmerler, Sebastian Starke

Abstract: With the availability of data, hardware, software ecosystem and relevant skill sets, the machine learning community is undergoing a rapid development with new architectures and approaches appearing at high frequency every year. In this article, we conduct an exemplary image classification study in order to demonstrate how confidence intervals around accuracy measurements can greatly enhance the co… ▽ More With the availability of data, hardware, software ecosystem and relevant skill sets, the machine learning community is undergoing a rapid development with new architectures and approaches appearing at high frequency every year. In this article, we conduct an exemplary image classification study in order to demonstrate how confidence intervals around accuracy measurements can greatly enhance the communication of research results as well as impact the reviewing process. In addition, we explore the hallmarks and limitations of this approximation. We discuss the relevance of this approach reflecting on a spotlight publication of ICLR22. A reproducible workflow is made available as an open-source adjoint to this publication. Based on our discussion, we make suggestions for improving the authoring and reviewing process of machine learning articles. △ Less

Submitted 14 April, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

Comments: 9 pages, 6 figures. Accepted at the ICLR2022 workshop on ML Evaluation Standards. Code to reproduce results can be obtained from https://github.com/psteinb/sota_on_uncertainties.git

arXiv:1702.00629 [pdf, other]

doi 10.1007/978-3-319-58667-0

gearshifft - The FFT Benchmark Suite for Heterogeneous Platforms

Authors: Peter Steinbach, Matthias Werner

Abstract: Fast Fourier Transforms (FFTs) are exploited in a wide variety of fields ranging from computer science to natural sciences and engineering. With the rising data production bandwidths of modern FFT applications, judging best which algorithmic tool to apply, can be vital to any scientific endeavor. As tailored FFT implementations exist for an ever increasing variety of high performance computer hard… ▽ More Fast Fourier Transforms (FFTs) are exploited in a wide variety of fields ranging from computer science to natural sciences and engineering. With the rising data production bandwidths of modern FFT applications, judging best which algorithmic tool to apply, can be vital to any scientific endeavor. As tailored FFT implementations exist for an ever increasing variety of high performance computer hardware, choosing the best performing FFT implementation has strong implications for future hardware purchase decisions, for resources FFTs consume and for possibly decisive financial and time savings ahead of the competition. This paper therefor presents gearshifft, which is an open-source and vendor agnostic benchmark suite to process a wide variety of problem sizes and types with state-of-the-art FFT implementations (fftw, clfft and cufft). gearshifft provides a reproducible, unbiased and fair comparison on a wide variety of hardware to explore which FFT variant is best for a given problem size. △ Less

Submitted 11 July, 2017; v1 submitted 2 February, 2017; originally announced February 2017.

Journal ref: High Performance Computing, Theoretical Computer Science and General Issues, Vol. 10266, Springer International Publishing. (ISC High Performance 2017)

arXiv:1505.04604 [pdf, other]

doi 10.1088/1742-6596/664/6/062048

How do particle physicists learn the programming concepts they need?

Authors: Stefan Kluth, Maria Grazia Pia, Thomas Schoerner-Sadenius, Peter Steinbach

Abstract: The ability to read, use and develop code efficiently and successfully is a key ingredient in modern particle physics. We report the experience of a training program, identified as "Advanced Programming Concepts", that introduces software concepts, methods and techniques to work effectively on a daily basis in a HEP experiment or other programming intensive fields. This paper illustrates the princ… ▽ More The ability to read, use and develop code efficiently and successfully is a key ingredient in modern particle physics. We report the experience of a training program, identified as "Advanced Programming Concepts", that introduces software concepts, methods and techniques to work effectively on a daily basis in a HEP experiment or other programming intensive fields. This paper illustrates the principles, motivations and methods that shape the "Advanced Computing Concepts" training program, the knowledge base that it conveys, an analysis of the feedback received so far, and the integration of these concepts in the software development process of the experiments as well as its applicability to a wider audience. △ Less

Submitted 18 May, 2015; originally announced May 2015.

Comments: 8 pages, 2 figures, CHEP2015 proceedings

Showing 1–6 of 6 results for author: Steinbach, P