Search | arXiv e-print repository

A deep cut into Split Federated Self-supervised Learning

Authors: Marcin Przewięźlikowski, Marcin Osial, Bartosz Zieliński, Marek Śmieja

Abstract: Collaborative self-supervised learning has recently become feasible in highly distributed environments by dividing the network layers between client devices and a central server. However, state-of-the-art methods, such as MocoSFL, are optimized for network division at the initial layers, which decreases the protection of the client data and increases communication overhead. In this paper, we demonstrate that splitting depth is crucial for maintaining privacy and communication efficiency in distributed training. We also show that MocoSFL suffers from a catastrophic quality deterioration for the minimal communication overhead. As a remedy, we introduce Momentum-Aligned contrastive Split Federated Learning (MonAcoSFL), which aligns online and momentum client models during the training procedure. Consequently, we achieve state-of-the-art accuracy while significantly reducing the communication overhead, making MonAcoSFL more practical in real-world scenarios.

Submitted 12 June, 2024; originally announced June 2024.

Comments: Accepted to European Conference on Machine Learning (ECML) 2024

arXiv:2405.14331 [pdf, other]

LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision

Authors: Mateusz Pach, Dawid Rymarczyk, Koryna Lewandowska, Jacek Tabor, Bartosz Zieliński

Abstract: Prototypical parts networks combine the power of deep learning with the explainability of case-based reasoning to make accurate, interpretable decisions. They follow the "this looks like that" reasoning, representing each prototypical part with patches from training images. However, a single image patch comprises multiple visual features, such as color, shape, and texture, making it difficult for users to identify which feature is important to the model. To reduce this ambiguity, we introduce the Lucid Prototypical Parts Network (LucidPPN), a novel prototypical parts network that separates color prototypes from other visual features. Our method employs two reasoning branches: one for non-color visual features, processing grayscale images, and another focusing solely on color information. This separation allows us to clarify whether the model's decisions are based on color, shape, or texture. Additionally, LucidPPN identifies prototypical parts corresponding to semantic parts of classified objects, making comparisons between data classes more intuitive, e.g., when two bird species might differ primarily in belly color. Our experiments demonstrate that the two branches are complementary and together achieve results comparable to baseline methods. More importantly, LucidPPN generates less ambiguous prototypical parts, enhancing user understanding.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Work in the review process. The code will be available upon acceptance

arXiv:2404.03482 [pdf, other]

AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale

Authors: Adam Pardyl, Michał Wronka, Maciej Wołczyk, Kamil Adamczewski, Tomasz Trzciński, Bartosz Zieliński

Abstract: Active Visual Exploration (AVE) is a task that involves dynamically selecting observations (glimpses), which is critical to facilitate comprehension and navigation within an environment. While modern AVE methods have demonstrated impressive performance, they are constrained to fixed-scale glimpses from rigid grids. In contrast, existing mobile platforms equipped with optical zoom capabilities can capture glimpses of arbitrary positions and scales. To address this gap between software and hardware capabilities, we introduce AdaGlimpse. It uses Soft Actor-Critic, a reinforcement learning algorithm tailored for exploration tasks, to select glimpses of arbitrary position and scale. This approach enables our model to rapidly establish a general awareness of the environment before zooming in for detailed analysis. Experimental results demonstrate that AdaGlimpse surpasses previous methods across various visual tasks while maintaining greater applicability in realistic AVE scenarios.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2403.07603 [pdf, other]

ProPML: Probability Partial Multi-label Learning

Authors: Łukasz Struski, Adam Pardyl, Jacek Tabor, Bartosz Zieliński

Abstract: Partial Multi-label Learning (PML) is a type of weakly supervised learning where each training instance corresponds to a set of candidate labels, among which only some are true. In this paper, we introduce ProPML, a novel probabilistic approach to this problem that extends the binary cross entropy to the PML setup. In contrast to existing methods, it does not require suboptimal disambiguation and, as such, can be applied to any deep architecture. Furthermore, experiments conducted on artificial and real-world datasets indicate that ProPML outperforms existing approaches, especially for high noise in a candidate set.

Submitted 12 March, 2024; originally announced March 2024.

Comments: Accepted to the International Conference on Data Science and Advanced Analytics (DSAA 2023)

arXiv:2401.10191 [pdf, other]

Divide and not forget: Ensemble of selectively trained experts in Continual Learning

Authors: Grzegorz Rypeść, Sebastian Cygert, Valeriya Khan, Tomasz Trzciński, Bartosz Zieliński, Bartłomiej Twardowski

Abstract: Class-incremental learning is becoming more popular as it helps models widen their applicability while not forgetting what they already know. A trend in this area is to use a mixture-of-experts technique, where different models work together to solve the task. However, the experts are usually trained all at once using whole task data, which makes them all prone to forgetting and increases the computational burden. To address this limitation, we introduce a novel approach named SEED. SEED selects only one expert, the optimal one for a considered task, and uses data from this task to fine-tune only this expert. For this purpose, each expert represents each class with a Gaussian distribution, and the optimal expert is selected based on the similarity of those distributions. Consequently, SEED increases diversity and heterogeneity within the experts while maintaining the high stability of this ensemble method. The extensive experiments demonstrate that SEED achieves state-of-the-art performance in exemplar-free settings across various scenarios, showing the potential of expert diversification through data in continual learning.

Submitted 19 March, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

Comments: Accepted for ICLR 2024 (main track), code is available at: https://github.com/grypesc/SEED

arXiv:2311.15335 [pdf, other]

Token Recycling for Efficient Sequential Inference with Vision Transformers

Authors: Jan Olszewski, Dawid Rymarczyk, Piotr Wójcik, Mateusz Pach, Bartosz Zieliński

Abstract: Vision Transformers (ViTs) outperform Convolutional Neural Networks in processing incomplete inputs because they do not require the imputation of missing values. Therefore, ViTs are well suited for sequential decision-making, e.g. in the Active Visual Exploration problem. However, they are computationally inefficient because they perform a full forward pass each time a piece of new sequential information arrives. To reduce this computational inefficiency, we introduce the TOken REcycling (TORE) modification for the ViT inference, which can be used with any architecture. TORE divides ViT into two parts, iterator and aggregator. The iterator processes sequential information separately into midway tokens, which are cached. The aggregator processes midway tokens jointly to obtain the prediction. This way, we can reuse the results of computations made by the iterator. Besides efficient sequential inference, we propose a complementary training policy, which significantly reduces the computational burden associated with sequential decision-making while achieving state-of-the-art accuracy.

Submitted 26 November, 2023; originally announced November 2023.

Comments: The code will be released upon acceptance

arXiv:2309.13353 [pdf, other]

Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers

Authors: Adam Pardyl, Grzegorz Kurzejamski, Jan Olszewski, Tomasz Trzciński, Bartosz Zieliński

Abstract: Vision transformers have excelled in various computer vision tasks but mostly rely on rigid input sampling using a fixed-size grid of patches. This limits their applicability in real-world problems, such as in the field of robotics and UAVs, where one can utilize higher input elasticity to boost model performance and efficiency. Our paper addresses this limitation by formalizing the concept of input elasticity for vision transformers and introducing an evaluation protocol, including dedicated metrics for measuring input elasticity. Moreover, we propose modifications to the transformer architecture and training regime, which increase its elasticity. Through extensive experimentation, we spotlight opportunities and challenges associated with input sampling strategies.

Submitted 23 September, 2023; originally announced September 2023.

arXiv:2309.04607 [pdf]

Linking Symptom Inventories using Semantic Textual Similarity

Authors: Eamonn Kennedy, Shashank Vadlamani, Hannah M Lindsey, Kelly S Peterson, Kristen Dams OConnor, Kenton Murray, Ronak Agarwal, Houshang H Amiri, Raeda K Andersen, Talin Babikian, David A Baron, Erin D Bigler, Karen Caeyenberghs, Lisa Delano-Wood, Seth G Disner, Ekaterina Dobryakova, Blessen C Eapen, Rachel M Edelstein, Carrie Esopenko, Helen M Genova, Elbert Geuze, Naomi J Goodrich-Hunsaker, Jordan Grafman, Asta K Haberg, Cooper B Hodges, et al. (57 additional authors not shown)

Abstract: An extensive library of symptom inventories has been developed over time to measure clinical symptoms, but this variety has led to several long-standing issues. Most notably, results drawn from different settings and studies are not comparable, which limits reproducibility. Here, we present an artificial intelligence (AI) approach using semantic textual similarity (STS) to link symptoms and scores across previously incongruous symptom inventories. We tested the ability of four pre-trained STS models to screen thousands of symptom description pairs for related content - a challenging task typically requiring expert panels. Models were tasked to predict symptom severity across four different inventories for 6,607 participants drawn from 16 international data sources. The STS approach achieved 74.8% accuracy across five tasks, outperforming other models tested. This work suggests that incorporating contextual, semantic information can assist expert decision-making processes, yielding gains for both general and disease-specific clinical assessment.

Submitted 8 September, 2023; originally announced September 2023.

arXiv:2308.08162 [pdf, other]

Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations

Authors: Mikołaj Sacha, Bartosz Jura, Dawid Rymarczyk, Łukasz Struski, Jacek Tabor, Bartosz Zieliński

Abstract: Prototypical parts-based networks are becoming increasingly popular due to their faithful self-explanations. However, their similarity maps are calculated in the penultimate network layer. Therefore, the receptive field of the prototype activation region often depends on parts of the image outside this region, which can lead to misleading interpretations. We name this undesired behavior a spatial explanation misalignment and introduce an interpretability benchmark with a set of dedicated metrics for quantifying this phenomenon. In addition, we propose a method for misalignment compensation and apply it to existing state-of-the-art models. We show the expressiveness of our benchmark and the effectiveness of the proposed compensation methodology through extensive empirical studies.

Submitted 16 August, 2023; originally announced August 2023.

Comments: Under review. Code will be released upon acceptance

arXiv:2306.10535 [pdf, other]

ProMIL: Probabilistic Multiple Instance Learning for Medical Imaging

Authors: Łukasz Struski, Dawid Rymarczyk, Arkadiusz Lewicki, Robert Sabiniewicz, Jacek Tabor, Bartosz Zieliński

Abstract: Multiple Instance Learning (MIL) is a weakly-supervised problem in which one label is assigned to the whole bag of instances. An important class of MIL models is instance-based, where we first classify instances and then aggregate those predictions to obtain a bag label. The most common MIL model is when we consider a bag as positive if at least one of its instances has a positive label. However, this reasoning does not hold in many real-life scenarios, where the positive bag label is often a consequence of a certain percentage of positive instances. To address this issue, we introduce a dedicated instance-based method called ProMIL, based on deep neural networks and Bernstein polynomial estimation. An important advantage of ProMIL is that it can automatically detect the optimal percentage level for decision-making. We show that ProMIL outperforms standard instance-based MIL in real-world medical applications. We make the code available.

Submitted 12 March, 2024; v1 submitted 18 June, 2023; originally announced June 2023.

Comments: Accepted Paper to European Conference on Artificial Intelligence (ECAI 2023)

arXiv:2306.06082 [pdf, other]

Augmentation-aware Self-supervised Learning with Conditioned Projector

Authors: Marcin Przewięźlikowski, Mateusz Pyla, Bartosz Zieliński, Bartłomiej Twardowski, Jacek Tabor, Marek Śmieja

Abstract: Self-supervised learning (SSL) is a powerful technique for learning robust representations from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo are able to reach quality on par with supervised approaches. However, this invariance may be harmful to solving some downstream tasks which depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. In order for the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.

Submitted 2 December, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

Comments: Preprint under review. Code: https://github.com/gmum/CASSLE

arXiv:2303.07811 [pdf, other]

ICICLE: Interpretable Class Incremental Continual Learning

Authors: Dawid Rymarczyk, Joost van de Weijer, Bartosz Zieliński, Bartłomiej Twardowski

Abstract: Continual learning enables incremental learning of new tasks without forgetting those previously learned, resulting in positive knowledge transfer that can enhance performance on both new and old tasks. However, continual learning poses new challenges for interpretability, as the rationale behind model predictions may change over time, leading to interpretability concept drift. We address this problem by proposing Interpretable Class-InCremental LEarning (ICICLE), an exemplar-free approach that adopts a prototypical part-based approach. It consists of three crucial novelties: interpretability regularization that distills previously learned concepts while preserving user-friendly positive reasoning; proximity-based prototype initialization strategy dedicated to the fine-grained setting; and task-recency bias compensation devoted to prototypical parts. Our experimental results demonstrate that ICICLE reduces the interpretability concept drift and outperforms the existing exemplar-free methods of common class-incremental learning when applied to concept-based models.

Submitted 31 July, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: Accepted to ICCV 2023

arXiv:2303.06457 [pdf]

Active Visual Exploration Based on Attention-Map Entropy

Authors: Adam Pardyl, Grzegorz Rypeść, Grzegorz Kurzejamski, Bartosz Zieliński, Tomasz Trzciński

Abstract: Active visual exploration addresses the issue of limited sensor capabilities in real-world scenarios, where successive observations are actively chosen based on the environment. To tackle this problem, we introduce a new technique called Attention-Map Entropy (AME). It leverages the internal uncertainty of the transformer-based model to determine the most informative observations. In contrast to existing solutions, it does not require additional loss components, which simplifies the training. Through experiments, which also mimic retina-like sensors, we show that such simplified training significantly improves the performance of reconstruction, segmentation and classification on publicly available datasets.

Submitted 8 August, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

Comments: IJCAI 2023

arXiv:2301.12276 [pdf, other]

ProtoSeg: Interpretable Semantic Segmentation with Prototypical Parts

Authors: Mikołaj Sacha, Dawid Rymarczyk, Łukasz Struski, Jacek Tabor, Bartosz Zieliński

Abstract: We introduce ProtoSeg, a novel model for interpretable semantic image segmentation, which constructs its predictions using similar patches from the training set. To achieve accuracy comparable to baseline methods, we adapt the mechanism of prototypical parts and introduce a diversity loss function that increases the variety of prototypes within each class. We show that ProtoSeg discovers semantic concepts, in contrast to standard segmentation models. Experiments conducted on Pascal VOC and Cityscapes datasets confirm the precision and transparency of the presented method.

Submitted 28 January, 2023; originally announced January 2023.

Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 1481-1492

arXiv:2208.09931 [pdf, other]

ProPaLL: Probabilistic Partial Label Learning

Authors: Łukasz Struski, Jacek Tabor, Bartosz Zieliński

Abstract: Partial label learning is a type of weakly supervised learning, where each training instance corresponds to a set of candidate labels, among which only one is true. In this paper, we introduce ProPaLL, a novel probabilistic approach to this problem, which has at least three advantages compared to the existing approaches: it simplifies the training process, improves performance, and can be applied to any deep architecture. Experiments conducted on artificial and real-world datasets indicate that ProPaLL outperforms the existing approaches.

Submitted 21 August, 2022; originally announced August 2022.

arXiv:2112.02902 [pdf, other]

Interpretable Image Classification with Differentiable Prototypes Assignment

Authors: Dawid Rymarczyk, Łukasz Struski, Michał Górszczak, Koryna Lewandowska, Jacek Tabor, Bartosz Zieliński

Abstract: We introduce ProtoPool, an interpretable image classification model with a pool of prototypes shared by the classes. The training is more straightforward than in the existing methods because it does not require the pruning stage. It is obtained by introducing a fully differentiable assignment of prototypes to particular classes. Moreover, we introduce a novel focal similarity function to focus the model on the rare foreground features. We show that ProtoPool obtains state-of-the-art accuracy on the CUB-200-2011 and the Stanford Cars datasets, substantially reducing the number of prototypes. We provide a theoretical analysis of the method and a user study to show that our prototypes are more distinctive than those obtained with competitive methods.

Submitted 5 September, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

Comments: Accepted to ECCV 2022

arXiv:2111.00218 [pdf, ps, other]

A Non-Deterministic Multiset Query Language

Authors: Bartosz Zielinski

Abstract: We develop a multiset query and update language executable in a term rewriting system. Its most remarkable feature, besides a non-standard approach to quantification and introduction of fresh values, is non-determinism - a query result is not uniquely determined by the database. We argue that this feature is very useful, e.g., in modelling user choices during simulation or reachability analysis of a data-centric business process - the intended application of our work. Query evaluation is implemented by converting the query into a terminating term rewriting system and normalizing the initial term which encapsulates the current database. A normal form encapsulates a query result. We prove that our language can express any relational algebra query. Finally, we present a simple business process specification framework (and an example specification). Both the syntax and the semantics of our query language are implemented in Maude.

Submitted 10 January, 2022; v1 submitted 30 October, 2021; originally announced November 2021.

Comments: 40 pages, version edited by Fundamenta Informaticae

ACM Class: D.2.1; D.2.4; D.2.5; F.3.1; F.4.2

Journal ref: Fundamenta Informaticae, Volume 184, Issue 2 (January 13, 2022) fi:8647

arXiv:2108.10612 [pdf, other]

ProtoMIL: Multiple Instance Learning with Prototypical Parts for Whole-Slide Image Classification

Authors: Dawid Rymarczyk, Adam Pardyl, Jarosław Kraus, Aneta Kaczyńska, Marek Skomorowski, Bartosz Zieliński

Abstract: Multiple Instance Learning (MIL) is gaining popularity in many real-life machine learning applications due to its weakly supervised nature. However, the corresponding effort on explaining MIL lags behind, and it is usually limited to presenting instances of a bag that are crucial for a particular prediction. In this paper, we fill this gap by introducing ProtoMIL, a novel self-explainable MIL method inspired by the case-based reasoning process that operates on visual prototypes. Thanks to incorporating prototypical features into the description of objects, ProtoMIL unprecedentedly joins model accuracy and fine-grained interpretability, which we present with the experiments on five recognized MIL datasets.

Submitted 6 September, 2022; v1 submitted 24 August, 2021; originally announced August 2021.

Comments: Accepted to ECML PKDD 2022

arXiv:2107.13214 [pdf, other]

SONG: Self-Organizing Neural Graphs

Authors: Łukasz Struski, Tomasz Danel, Marek Śmieja, Jacek Tabor, Bartosz Zieliński

Abstract: Recent years have seen a surge in research on deep interpretable neural networks with decision trees as one of the most commonly incorporated tools. There are at least three advantages of using decision trees over logistic regression classification models: they are easy to interpret since they are based on binary decisions, they can make decisions faster, and they provide a hierarchy of classes. However, one of the well-known drawbacks of decision trees, as compared to decision graphs, is that decision trees cannot reuse the decision nodes. Nevertheless, decision graphs were not commonly used in deep learning due to the lack of efficient gradient-based training techniques. In this paper, we fill this gap and provide a general paradigm based on Markov processes, which allows for efficient training of the special type of decision graphs, which we call Self-Organizing Neural Graphs (SONG). We provide an extensive theoretical study of SONG, complemented by experiments conducted on Letter, Connect4, MNIST, CIFAR, and TinyImageNet datasets, showing that our method performs on par or better than existing decision models.

Submitted 28 July, 2021; originally announced July 2021.

arXiv:2106.11054 [pdf, other]

doi 10.1109/ACCESS.2023.3242982

Visual Probing: Cognitive Framework for Explaining Self-Supervised Image Representations

Authors: Witold Oleszkiewicz, Dominika Basaj, Igor Sieradzki, Michał Górszczak, Barbara Rychalska, Koryna Lewandowska, Tomasz Trzciński, Bartosz Zieliński

Abstract: Recently introduced self-supervised methods for image representation learning provide on par or superior results to their fully supervised competitors, yet the corresponding efforts to explain the self-supervised approaches lag behind. Motivated by this observation, we introduce a novel visual probing framework for explaining the self-supervised models by leveraging probing tasks employed previously in natural language processing. The probing tasks require knowledge about semantic relationships between image parts. Hence, we propose a systematic approach to obtain analogs of natural language in vision, such as visual words, context, and taxonomy. Our proposal is grounded in Marr's computational theory of vision and concerns features like textures, shapes, and lines. We show the effectiveness and applicability of those analogs in the context of explaining self-supervised representations. Our key findings emphasize that relations between language and vision can serve as an effective yet intuitive tool for discovering how machine learning models work, independently of data modality. Our work opens a plethora of research pathways towards more explainable and transparent AI.

Submitted 21 August, 2022; v1 submitted 21 June, 2021; originally announced June 2021.

Comments: Submitted to IEEE Access

arXiv:2012.01189 [pdf, other]

Classifying bacteria clones using attention-based deep multiple instance learning interpreted by persistence homology

Authors: Adriana Borowa, Dawid Rymarczyk, Dorota Ochońska, Monika Brzychczy-Włoch, Bartosz Zieliński

Abstract: In this work, we analyze if it is possible to distinguish between different clones of the same bacteria species (Klebsiella pneumoniae) based only on microscopic images. It is a challenging task, previously considered impossible due to the high clones similarity. For this purpose, we apply a multi-step algorithm with attention-based multiple instance learning. Except for obtaining accuracy at the level of 0.9, we introduce extensive interpretability based on CellProfiler and persistence homology, increasing the understandability and trust in the model.

Submitted 23 July, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

Comments: Published at the International Joint Conferences on Neural Networks

Journal ref: 978-0-7381-3366-9/21, 2021

arXiv:2011.14340 [pdf, other]

doi 10.1145/3447548.3467245

ProtoPShare: Prototype Sharing for Interpretable Image Classification and Similarity Discovery

Authors: Dawid Rymarczyk, Łukasz Struski, Jacek Tabor, Bartosz Zieliński

Abstract: In this paper, we introduce ProtoPShare, a self-explained method that incorporates the paradigm of prototypical parts to explain its predictions. The main novelty of the ProtoPShare is its ability to efficiently share prototypical parts between the classes thanks to our data-dependent merge-pruning. Moreover, the prototypes are more consistent and the model is more robust to image perturbations than the state of the art method ProtoPNet. We verify our findings on two datasets, the CUB-200-2011 and the Stanford Cars.

Submitted 29 November, 2020; originally announced November 2020.

arXiv:2005.12991 [pdf, other]

Kernel Self-Attention in Deep Multiple Instance Learning

Authors: Dawid Rymarczyk, Adriana Borowa, Jacek Tabor, Bartosz Zieliński

Abstract: Not all supervised learning problems are described by a pair of a fixed-size input tensor and a label. In some cases, especially in medical image analysis, a label corresponds to a bag of instances (e.g. image patches), and to classify such bag, aggregation of information from all of the instances is needed. There have been several attempts to create a model working with a bag of instances, however, they are assuming that there are no dependencies within the bag and the label is connected to at least one instance. In this work, we introduce Self-Attention Attention-based MIL Pooling (SA-AbMILP) aggregation operation to account for the dependencies between instances. We conduct several experiments on MNIST, histological, microbiological, and retinal databases to show that SA-AbMILP performs better than other models. Additionally, we investigate kernel variations of Self-Attention and their influence on the results.

Submitted 5 March, 2021; v1 submitted 25 May, 2020; originally announced May 2020.

Comments: https://openaccess.thecvf.com/content/WACV2021/papers/Rymarczyk_Kernel_Self-Attention_for_Weakly-Supervised_Image_Classification_Using_Deep_Multiple_Instance_WACV_2021_paper.pdf

arXiv:2005.11772 [pdf, other]

doi 10.1371/journal.pone.0234806

Deep learning approach to describe and classify fungi microscopic images

Authors: Bartosz Zieliński, Agnieszka Sroka-Oleksiak, Dawid Rymarczyk, Adam Piekarczyk, Monika Brzychczy-Włoch

Abstract: Preliminary diagnosis of fungal infections can rely on microscopic examination. However, in many cases, it does not allow unambiguous identification of the species by microbiologist due to their visual similarity. Therefore, it is usually necessary to use additional biochemical tests. That involves additional costs and extends the identification process up to 10 days. Such a delay in the implementation of targeted therapy may be grave in consequence as the mortality rate for immunosuppressed patients is high. In this paper, we apply a machine learning approach based on deep neural networks and Fisher Vector (advanced bag-of-words method) to classify microscopic images of various fungi species. Our approach has the potential to make the last stage of biochemical identification redundant, shortening the identification process by 2-3 days, and reducing the cost of the diagnosis.

Submitted 24 May, 2020; originally announced May 2020.

Report number: MIDL/2020/ExtendedAbstract/AEhp_Cqq-h

arXiv:1906.09449 [pdf, other]

doi 10.1371/journal.pone.0234806

Deep learning approach to description and classification of fungi microscopic images

Authors: Bartosz Zieliński, Agnieszka Sroka-Oleksiak, Dawid Rymarczyk, Adam Piekarczyk, Monika Brzychczy-Włoch

Abstract: Diagnosis of fungal infections can rely on microscopic examination, however, in many cases, it does not allow unambiguous identification of the species due to their visual similarity. Therefore, it is usually necessary to use additional biochemical tests. That involves additional costs and extends the identification process up to 10 days. Such a delay in the implementation of targeted treatment is grave in consequences as the mortality rate for immunosuppressed patients is high. In this paper, we apply machine learning approach based on deep learning and bag-of-words to classify microscopic images of various fungi species. Our approach makes the last stage of biochemical identification redundant, shortening the identification process by 2-3 days and reducing the cost of the diagnostic examination.

Submitted 28 January, 2020; v1 submitted 22 June, 2019; originally announced June 2019.

arXiv:1812.09245 [pdf, other]

Persistence Bag-of-Words for Topological Data Analysis

Authors: Bartosz Zieliński, Michał Lipiński, Mateusz Juda, Matthias Zeppelzauer, Paweł Dłotko

Abstract: Persistent homology (PH) is a rigorous mathematical theory that provides a robust descriptor of data in the form of persistence diagrams (PDs). PDs exhibit, however, complex structure and are difficult to integrate in today's machine learning workflows. This paper introduces persistence bag-of-words: a novel and stable vectorized representation of PDs that enables the seamless integration with machine learning. Comprehensive experiments show that the new representation achieves state-of-the-art performance and beyond in much less time than alternative approaches.

Submitted 4 June, 2019; v1 submitted 21 December, 2018; originally announced December 2018.

Comments: Accepted for the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). arXiv admin note: substantial text overlap with arXiv:1802.04852

arXiv:1805.07405 [pdf, other]

Processing of missing data by neural networks

Authors: Marek Smieja, Łukasz Struski, Jacek Tabor, Bartosz Zieliński, Przemysław Spurek

Abstract: We propose a general, theoretically justified mechanism for processing missing data by neural networks. Our idea is to replace typical neuron's response in the first hidden layer by its expected value. This approach can be applied for various types of networks at minimal cost in their modification. Moreover, in contrast to recent approaches, it does not require complete data for training. Experimental results performed on different types of architectures show that our method gives better results than typical imputation strategies and other methods dedicated for incomplete data.

Submitted 3 April, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

arXiv:1803.04033 [pdf, other]

Cascade context encoder for improved inpainting

Authors: Bartosz Zieliński, Łukasz Struski, Marek Śmieja, Jacek Tabor

Abstract: In this paper, we analyze if cascade usage of the context encoder with increasing input can improve the results of the inpainting. For this purpose, we train context encoder for 64x64 pixels images in a standard way and use its resized output to fill in the missing input region of the 128x128 context encoder, both in training and evaluation phase. As the result, the inpainting is visibly more plausible. In order to thoroughly verify the results, we introduce normalized squared-distortion, a measure for quantitative inpainting evaluation, and we provide its mathematical explanation. This is the first attempt to formalize the inpainting measure, which is based on the properties of latent feature representation, instead of L2 reconstruction loss.

Submitted 11 March, 2018; originally announced March 2018.

Comments: Supplemental materials are available at http://www.ii.uj.edu.pl/~zielinsb

arXiv:1802.04852 [pdf, other]

Persistence Codebooks for Topological Data Analysis

Authors: Bartosz Zielinski, Michal Lipinski, Mateusz Juda, Matthias Zeppelzauer, Pawel Dlotko

Abstract: Persistent homology (PH) is a rigorous mathematical theory that provides a robust descriptor of data in the form of persistence diagrams (PDs) which are 2D multisets of points. Their variable size makes them, however, difficult to combine with typical machine learning workflows. In this paper we introduce persistence codebooks, a novel expressive and discriminative fixed-size vectorized representation of PDs. To this end, we adapt bag-of-words (BoW), vectors of locally aggregated descriptors (VLAD) and Fisher vectors (FV) for the quantization of PDs. Persistence codebooks represent PDs in a convenient way for machine learning and statistical analysis and have a number of favorable practical and theoretical properties including 1-Wasserstein stability. We evaluate the presented representations on several heterogeneous datasets and show their (high) discriminative power. Our approach achieves state-of-the-art performance and beyond in much less time than alternative approaches.

Submitted 13 June, 2019; v1 submitted 13 February, 2018; originally announced February 2018.

Comments: minor update, remove heading

arXiv:1710.10662 [pdf, other]

doi 10.1016/j.cviu.2017.10.012

A Study on Topological Descriptors for the Analysis of 3D Surface Texture

Authors: Matthias Zeppelzauer, Bartosz Zielinski, Mateusz Juda, Markus Seidl

Abstract: Methods from computational topology are becoming more and more popular in computer vision and have shown to improve the state-of-the-art in several tasks. In this paper, we investigate the applicability of topological descriptors in the context of 3D surface analysis for the classification of different surface textures. We present a comprehensive study on topological descriptors, investigate their robustness and expressiveness and compare them with state-of-the-art methods including Convolutional Neural Networks (CNNs). Results show that class-specific information is reflected well in topological descriptors. The investigated descriptors can directly compete with non-topological descriptors and capture complementary information. As a consequence they improve the state-of-the-art when combined with non-topological descriptors.

Submitted 29 October, 2017; originally announced October 2017.

Comments: Preprint of Article "A Study on Topological Descriptors for the Analysis of 3D Surface Texture" in Elsevier Journal on Computer Vision and Image Understanding (CVIU): https://doi.org/10.1016/j.cviu.2017.10.012, 17 Pages, 19 Figures, 4 Tables

arXiv:1601.06057 [pdf, other]

Topological descriptors for 3D surface analysis

Authors: Matthias Zeppelzauer, Bartosz Zieliński, Mateusz Juda, Markus Seidl

Abstract: We investigate topological descriptors for 3D surface analysis, i.e. the classification of surfaces according to their geometric fine structure. On a dataset of high-resolution 3D surface reconstructions we compute persistence diagrams for a 2D cubical filtration. In the next step we investigate different topological descriptors and measure their ability to discriminate structurally different 3D surface patches. We evaluate their sensitivity to different parameters and compare the performance of the resulting topological descriptors to alternative (non-topological) descriptors. We present a comprehensive evaluation that shows that topological descriptors are (i) robust, (ii) yield state-of-the-art performance for the task of 3D surface analysis and (iii) improve classification performance when combined with non-topological descriptors.

Submitted 22 January, 2016; originally announced January 2016.

Comments: 12 pages, 3 figures, CTIC 2016

Showing 1–31 of 31 results for author: Zieliński, B