-
Evaluating the Explainability of Attributes and Prototypes for a Medical Classification Model
Authors:
Luisa Gallée,
Catharina Silvia Lisson,
Christoph Gerhard Lisson,
Daniela Drees,
Felix Weig,
Daniel Vogele,
Meinrad Beer,
Michael Götz
Abstract:
Due to the sensitive nature of medicine, it is particularly important and highly demanded that AI methods are explainable. This need has been recognised and there is great research interest in xAI solutions with medical applications. However, there is a lack of user-centred evaluation regarding the actual impact of the explanations. We evaluate attribute- and prototype-based explanations with the Proto-Caps model. This xAI model reasons the target classification with human-defined visual features of the target object in the form of scores and attribute-specific prototypes. The model thus provides a multimodal explanation that is intuitively understandable to humans thanks to predefined attributes. A user study involving six radiologists shows that the explanations are subjectively perceived as helpful, as they reflect their decision-making process. The results of the model are considered a second opinion that radiologists can discuss using the model's explanations. However, it was shown that the inclusion and increased magnitude of model explanations can objectively increase confidence in the model's predictions when the model is incorrect. We can conclude that attribute scores and visual prototypes enhance confidence in the model. However, additional development and repeated user studies are needed to tailor the explanation to the respective use case.
Submitted 15 April, 2024;
originally announced April 2024.
-
Interpretable Medical Image Classification using Prototype Learning and Privileged Information
Authors:
Luisa Gallee,
Meinrad Beer,
Michael Goetz
Abstract:
Interpretability is often an essential requirement in medical imaging. Advanced deep learning methods are required to address this need for explainability and high performance. In this work, we investigate whether additional information available during the training process can be used to create an understandable and powerful model. We propose an innovative solution called Proto-Caps that leverages the benefits of capsule networks, prototype learning and the use of privileged information. Evaluating the proposed solution on the LIDC-IDRI dataset shows that it combines increased interpretability with above state-of-the-art prediction performance. Compared to the explainable baseline model, our method achieves more than 6 % higher accuracy in predicting both malignancy (93.0 %) and mean characteristic features of lung nodules. Simultaneously, the model provides case-based reasoning with prototype representations that allow visual validation of radiologist-defined attributes.
Submitted 24 October, 2023;
originally announced October 2023.
-
Self-Supervised Pre-Training with Contrastive and Masked Autoencoder Methods for Dealing with Small Datasets in Deep Learning for Medical Imaging
Authors:
Daniel Wolf,
Tristan Payer,
Catharina Silvia Lisson,
Christoph Gerhard Lisson,
Meinrad Beer,
Michael Götz,
Timo Ropinski
Abstract:
Deep learning in medical imaging has the potential to minimize the risk of diagnostic errors, reduce radiologist workload, and accelerate diagnosis. Training such deep learning models requires large and accurate datasets, with annotations for all training samples. However, in the medical imaging domain, annotated datasets for specific tasks are often small due to the high complexity of annotations, limited access, or the rarity of diseases. To address this challenge, deep learning models can be pre-trained on large image datasets without annotations using methods from the field of self-supervised learning. After pre-training, small annotated datasets are sufficient to fine-tune the models for a specific task. The most popular self-supervised pre-training approaches in medical imaging are based on contrastive learning. However, recent studies in natural image processing indicate a strong potential for masked autoencoder approaches. Our work compares state-of-the-art contrastive learning methods with the recently introduced masked autoencoder approach "SparK" for convolutional neural networks (CNNs) on medical images. To this end, we pre-train on a large unannotated CT image dataset and fine-tune on several CT classification tasks. Due to the challenge of obtaining sufficient annotated training data in medical imaging, it is of particular interest to evaluate how the self-supervised pre-training methods perform when fine-tuning on small datasets. By experimenting with gradually reducing the training dataset size for fine-tuning, we find that the reduction has different effects depending on the type of pre-training chosen. The SparK pre-training method is more robust to the training dataset size than the contrastive methods. Based on our results, we propose the SparK pre-training for medical imaging tasks with only small annotated datasets.
Submitted 2 November, 2023; v1 submitted 12 August, 2023;
originally announced August 2023.
-
Improving COVID-19 CXR Detection with Synthetic Data Augmentation
Authors:
Daniel Schaudt,
Christopher Kloth,
Christian Spaete,
Andreas Hinteregger,
Meinrad Beer,
Reinhold von Schwerin
Abstract:
Since the beginning of the COVID-19 pandemic, researchers have developed deep learning models to classify COVID-19-induced pneumonia. As with many medical imaging tasks, the quality and quantity of the available data are often limited. In this work we train a deep learning model on publicly available COVID-19 image data and evaluate the model on local hospital chest X-ray data. The data has been reviewed and labeled by two radiologists to ensure a high-quality estimate of the generalization capabilities of the model. Furthermore, we use a Generative Adversarial Network to generate synthetic X-ray images based on this data. Our results show that using those synthetic images for data augmentation can improve the model's performance significantly. This can be a promising approach for many sparse data domains.
Submitted 14 December, 2021;
originally announced December 2021.
-
Semantic Segmentation of Histopathological Slides for the Classification of Cutaneous Lymphoma and Eczema
Authors:
Jérémy Scheurer,
Claudio Ferrari,
Luis Berenguer Todo Bom,
Michaela Beer,
Werner Kempf,
Luis Haug
Abstract:
Mycosis fungoides (MF) is a rare, potentially life-threatening skin disease, which in early stages clinically and histologically strongly resembles Eczema, a very common and benign skin condition. In order to increase the survival rate, one needs to provide the appropriate treatment early on. To this end, one crucial step for specialists is the evaluation of histopathological slides (glass slides), or Whole Slide Images (WSI), of the patients' skin tissue. We introduce a deep-learning-aided diagnostics tool that brings a two-fold value to the decision process of pathologists. First, our algorithm accurately segments WSI into regions that are relevant for an accurate diagnosis, achieving a Mean-IoU of 69% and a Matthews Correlation score of 83% on a novel dataset. Additionally, we show that our model is competitive with the state of the art on a reference dataset. Second, using the segmentation map and the original image, we are able to predict if a patient has MF or Eczema. We created two models that can be applied in different stages of the diagnostic pipeline, potentially eliminating life-threatening mistakes. The classification outcome is considerably more interpretable than using only the WSI as the input, since it is also based on the segmentation map. Our segmentation model, which we call EU-Net, extends a classical U-Net with an EfficientNet-B7 encoder which was pre-trained on the ImageNet dataset.
Submitted 10 September, 2020;
originally announced September 2020.
-
Computational Estimate Visualisation and Evaluation of Agent Classified Rules Learning System
Authors:
Kennedy E. Ehimwenma,
Martin Beer,
Paul Crowther
Abstract:
Student modelling and agent classified rules learning as applied in the development of the intelligent Preassessment System have been presented in [10],[11]. In this paper, we now demystify the theory behind the development of the pre-assessment system followed by some computational experimentation and graph visualisation of the agent classified rules learning algorithm in the estimation and prediction of classified rules. In addition, we present some preliminary results of the pre-assessment system evaluation. From the results, it is gathered that the system has performed according to its design specification.
Submitted 28 May, 2016;
originally announced May 2016.
-
A system of serial computation for classified rules prediction in non-regular ontology trees
Authors:
Kennedy E. Ehimwenma,
Paul Crowther,
Martin Beer
Abstract:
Objects or structures that are regular take uniform dimensions. Based on the concepts of regular models, our previous research work has developed a system of a regular ontology that models learning structures in a multiagent system for uniform pre-assessments in a learning environment. This regular ontology has led to the modelling of a classified rules learning algorithm that predicts the actual number of rules needed for inductive learning processes and decision making in a multiagent system. But not all processes or models are regular. Thus, this paper presents a system of polynomial equations that can estimate and predict the required number of rules of a non-regular ontology model given some defined parameters.
Submitted 8 April, 2016;
originally announced April 2016.
-
Evaluating Hive and Spark SQL with BigBench
Authors:
Todor Ivanov,
Max-Georg Beer
Abstract:
The objective of this work was to utilize BigBench [1] as a Big Data benchmark and evaluate and compare two processing engines: MapReduce [2] and Spark [3]. MapReduce is the established engine for processing data on Hadoop. Spark is a popular alternative engine that promises faster processing times than the established MapReduce engine. BigBench was chosen for this comparison because it is the first end-to-end analytics Big Data benchmark and it is currently under public review as TPCx-BB [4]. One of our goals was to evaluate the benchmark by performing various scalability tests and validate that it is able to stress test the processing engines. First, we analyzed the steps necessary to execute the available MapReduce implementation of BigBench [1] on Spark. Then, all 30 BigBench queries were executed on MapReduce/Hive with different scale factors in order to see how the performance changes with the increase of the data size. Next, the group of HiveQL queries was executed on Spark SQL and compared with their respective Hive runtimes. This report gives a detailed overview of how to set up an experimental Hadoop cluster and execute BigBench on both Hive and Spark SQL. It provides the absolute times for all experiments performed for different scale factors as well as query results which can be used to validate correct benchmark execution. Additionally, multiple issues were encountered during our work and solved with workarounds. An evaluation of the resource utilization (CPU, memory, disk and network usage) of a subset of representative BigBench queries is presented to illustrate the behavior of the different query groups on both processing engines. Last but not least, it is important to mention that larger parts of this report are taken from the master's thesis of Max-Georg Beer, entitled "Evaluation of BigBench on Apache Spark Compared to MapReduce" [5].
Submitted 13 January, 2016; v1 submitted 28 December, 2015;
originally announced December 2015.