
Showing 1–50 of 73 results for author: Biecek, P

Searching in archive cs.
  1. arXiv:2406.18334  [pdf, other]

    cs.LG stat.ML

    Efficient and Accurate Explanation Estimation with Distribution Compression

    Authors: Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, Przemyslaw Biecek

    Abstract: Exact computation of various machine learning explanations requires numerous model evaluations and in extreme cases becomes impractical. The computational cost of approximation increases with an ever-increasing size of data and model parameters. Many heuristics have been proposed to approximate post-hoc explanations efficiently. This paper shows that the standard i.i.d. sampling used in a broad sp…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: To be presented at the ICML 2024 Workshop on DMLR

  2. arXiv:2406.09069  [pdf, other]

    cs.LG stat.ML

    On the Robustness of Global Feature Effect Explanations

    Authors: Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, Przemyslaw Biecek

    Abstract: We study the robustness of global post-hoc explanations for predictive models trained on tabular data. Effects of predictor features in black-box supervised learning are an essential diagnostic tool for model debugging and scientific discovery in applied sciences. However, how vulnerable they are to data and model perturbations remains an open research question. We introduce several theoretical bo…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted at ECML PKDD 2024

  3. arXiv:2405.14301  [pdf, other]

    cs.CV

    Does context matter in digital pathology?

    Authors: Paulina Tomaszewska, Mateusz Sperkowski, Przemysław Biecek

    Abstract: The development of Artificial Intelligence for healthcare is of great importance. Models can sometimes achieve even superior performance to human experts, however, they can reason based on spurious features. This is not acceptable to the experts as it is expected that the models catch the valid patterns in the data following domain expertise. In the work, we analyse whether Deep Learning (DL) mode…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: International Workshop Modelling and Representing Context at ECAI 2023

  4. arXiv:2405.01557  [pdf, other]

    cs.LG

    An Experimental Study on the Rashomon Effect of Balancing Methods in Imbalanced Classification

    Authors: Mustafa Cavus, Przemysław Biecek

    Abstract: Predictive models may generate biased predictions when classifying imbalanced datasets. This happens when the model favors the majority class, leading to low performance in accurately predicting the minority class. To address this issue, balancing or resampling methods are critical pre-processing steps in the modeling process. However, there have been debates and questioning of the functionality o…

    Submitted 24 June, 2024; v1 submitted 22 March, 2024; originally announced May 2024.

    Comments: 16 pages, 6 figures

  5. arXiv:2404.18316  [pdf, other]

    cs.CV cs.AI cs.LG

    Position: Do Not Explain Vision Models Without Context

    Authors: Paulina Tomaszewska, Przemysław Biecek

    Abstract: Does the stethoscope in the picture make the adjacent person a doctor or a patient? This, of course, depends on the contextual relationship of the two objects. If it's obvious, why don't explanation methods for vision models use contextual information? In this paper, we (1) review the most popular methods of explaining computer vision models by pointing out that they do not take into account conte…

    Submitted 2 June, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

    Comments: Accepted at International Conference on Machine Learning (ICML) 2024

  6. arXiv:2404.14230  [pdf, other]

    cs.HC

    Resistance Against Manipulative AI: key factors and possible actions

    Authors: Piotr Wilczyński, Wiktoria Mieleszczenko-Kowszewicz, Przemysław Biecek

    Abstract: If AI is the new electricity, what should we do to keep ourselves from getting electrocuted? In this work, we explore factors related to the potential of large language models (LLMs) to manipulate human decisions. We describe the results of two experiments designed to determine what characteristics of humans are associated with their susceptibility to LLM manipulation, and what characteristics of…

    Submitted 22 April, 2024; originally announced April 2024.

  7. arXiv:2404.12488  [pdf, other]

    cs.LG cs.AI cs.CV

    Global Counterfactual Directions

    Authors: Bartlomiej Sobieski, Przemysław Biecek

    Abstract: Despite increasing progress in development of methods for generating visual counterfactual explanations, especially with the recent rise of Denoising Diffusion Probabilistic Models, previous works consider them as an entirely local technique. In this work, we take the first step at globalizing them. Specifically, we discover that the latent space of Diffusion Autoencoders encodes the inference pro…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: Preprint

  8. arXiv:2404.10387  [pdf, other]

    cs.AI cs.CV

    CNN-based explanation ensembling for dataset, representation and explanations evaluation

    Authors: Weronika Hryniewska-Guzik, Luca Longo, Przemysław Biecek

    Abstract: Explainable Artificial Intelligence has gained significant attention due to the widespread use of complex deep learning models in high-stake domains such as medicine, finance, and autonomous cars. However, different explanations often present different aspects of the model's behavior. In this research manuscript, we explore the potential of ensembling explanations generated by deep classification…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: accepted at 2nd World Conference on eXplainable Artificial Intelligence

  9. arXiv:2404.06455  [pdf, other]

    eess.IV cs.CV cs.LG

    A comparative analysis of deep learning models for lung segmentation on X-ray images

    Authors: Weronika Hryniewska-Guzik, Jakub Bilski, Bartosz Chrostowski, Jakub Drak Sbahi, Przemysław Biecek

    Abstract: Robust and highly accurate lung segmentation in X-rays is crucial in medical imaging. This study evaluates deep learning solutions for this task, ranking existing methods and analyzing their performance under diverse image modifications. Out of 61 analyzed papers, only nine offered implementation or pre-trained models, enabling assessment of three prominent methods: Lung VAE, TransResUNet, and CE-…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: published at the Polish Conference on Artificial Intelligence (PP-RAI), 2024

  10. arXiv:2404.02067  [pdf, other]

    cs.CV cs.AI cs.LG

    Red-Teaming Segment Anything Model

    Authors: Krzysztof Jankowski, Bartlomiej Sobieski, Mateusz Kwiatkowski, Jakub Szulc, Michal Janik, Hubert Baniecki, Przemyslaw Biecek

    Abstract: Foundation models have emerged as pivotal tools, tackling many complex tasks through pre-training on vast datasets and subsequent fine-tuning for specific applications. The Segment Anything Model is one of the first and most well-known foundation models for computer vision segmentation tasks. This work presents a multi-faceted red-teaming analysis that tests the Segment Anything Model against chal…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - The 4th Workshop of Adversarial Machine Learning on Computer Vision: Robustness of Foundation Models

  11. arXiv:2403.10250  [pdf, other]

    stat.ML cs.LG stat.ME

    Interpretable Machine Learning for Survival Analysis

    Authors: Sophie Hanna Langbein, Mateusz Krzyziński, Mikołaj Spytek, Hubert Baniecki, Przemysław Biecek, Marvin N. Wright

    Abstract: With the spread and rapid advancement of black box machine learning models, the field of interpretable machine learning (IML) or explainable artificial intelligence (XAI) has become increasingly important over the last decade. This is particularly relevant for survival analysis, where the adoption of IML techniques promotes transparency, accountability and fairness in sensitive areas, such as clin…

    Submitted 15 March, 2024; originally announced March 2024.

  12. arXiv:2403.08017  [pdf, other]

    cs.CV cs.AI

    Red Teaming Models for Hyperspectral Image Analysis Using Explainable AI

    Authors: Vladimir Zaigrajew, Hubert Baniecki, Lukasz Tulczyjew, Agata M. Wijata, Jakub Nalepa, Nicolas Longépé, Przemyslaw Biecek

    Abstract: Remote sensing (RS) applications in the space domain demand machine learning (ML) models that are reliable, robust, and quality-assured, making red teaming a vital approach for identifying and exposing potential flaws and biases. Since both fields advance independently, there is a notable gap in integrating red teaming strategies into RS. This paper introduces a methodology for examining ML models…

    Submitted 14 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: 14 pages, 9 figures, ICLR 2024 Machine Learning for Remote Sensing (ML4RS) Workshop

  13. arXiv:2402.13914  [pdf, other]

    cs.AI cs.CR cs.LG

    Position: Explain to Question not to Justify

    Authors: Przemyslaw Biecek, Wojciech Samek

    Abstract: Explainable Artificial Intelligence (XAI) is a young but very promising field of research. Unfortunately, the progress in this field is currently slowed down by divergent and incompatible goals. We separate various threads tangled within the area of XAI into two complementary cultures of human/value-oriented explanations (BLUE XAI) and model/validation-oriented explanations (RED XAI). This positio…

    Submitted 28 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  14. arXiv:2402.11510  [pdf, other]

    eess.IV cs.CV

    Underestimation of lung regions on chest X-ray segmentation masks assessed by comparison with total lung volume evaluated on computed tomography

    Authors: Przemysław Bombiński, Patryk Szatkowski, Bartłomiej Sobieski, Tymoteusz Kwieciński, Szymon Płotka, Mariusz Adamek, Marcin Banasiuk, Mariusz I. Furmanek, Przemysław Biecek

    Abstract: Lung mask creation lacks well-defined criteria and standardized guidelines, leading to a high degree of subjectivity between annotators. In this study, we assess the underestimation of lung regions on chest X-ray segmentation masks created according to the current state-of-the-art method, by comparison with total lung volume evaluated on computed tomography (CT). We show, that lung X-ray masks cre…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: Preprint to Elsevier

  15. arXiv:2401.17200  [pdf, other]

    cs.LG cs.AI cs.CV

    NormEnsembleXAI: Unveiling the Strengths and Weaknesses of XAI Ensemble Techniques

    Authors: Weronika Hryniewska-Guzik, Bartosz Sawicki, Przemysław Biecek

    Abstract: This paper presents a comprehensive comparative analysis of explainable artificial intelligence (XAI) ensembling methods. Our research brings three significant contributions. Firstly, we introduce a novel ensembling method, NormEnsembleXAI, that leverages minimum, maximum, and average functions in conjunction with normalization techniques to enhance interpretability. Secondly, we offer insights in…

    Submitted 30 January, 2024; originally announced January 2024.

  16. arXiv:2401.10044  [pdf, other]

    cs.CV

    Deep spatial context: when attention-based models meet spatial regression

    Authors: Paulina Tomaszewska, Elżbieta Sienkiewicz, Mai P. Hoang, Przemysław Biecek

    Abstract: We propose 'Deep spatial context' (DSCon) method, which serves for investigation of the attention-based vision models using the concept of spatial context. It was inspired by histopathologists, however, the method can be applied to various domains. The DSCon allows for a quantitative measure of the spatial context's role using three Spatial Context Measures: $SCM_{features}$, $SCM_{targets}$,…

    Submitted 10 March, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  17. arXiv:2312.12881  [pdf, other]

    physics.soc-ph cs.CL cs.SI

    Big Tech influence over AI research revisited: memetic analysis of attribution of ideas to affiliation

    Authors: Stanisław Giziński, Paulina Kaczyńska, Hubert Ruczyński, Emilia Wiśnios, Bartosz Pieliński, Przemysław Biecek, Julian Sienkiewicz

    Abstract: There exists a growing discourse around the domination of Big Tech on the landscape of artificial intelligence (AI) research, yet our comprehension of this phenomenon remains cursory. This paper aims to broaden and deepen our understanding of Big Tech's reach and power within AI research. It highlights the dominance not merely in terms of sheer publication volume but rather in the propagation of n…

    Submitted 20 December, 2023; originally announced December 2023.

  18. arXiv:2311.04813  [pdf, other]

    cs.CV cs.LG

    Be Careful When Evaluating Explanations Regarding Ground Truth

    Authors: Hubert Baniecki, Maciej Chrabaszcz, Andreas Holzinger, Bastian Pfeifer, Anna Saranti, Przemyslaw Biecek

    Abstract: Evaluating explanations of image classifiers regarding ground truth, e.g. segmentation masks defined by human perception, primarily evaluates the quality of the models under consideration rather than the explanation methods themselves. Driven by this observation, we propose a framework for $\textit{jointly}$ evaluating the robustness of safety-critical systems that $\textit{combine}$ a deep neural…

    Submitted 8 November, 2023; originally announced November 2023.

  19. arXiv:2308.16113  [pdf, other]

    cs.LG cs.AI stat.ML

    survex: an R package for explaining machine learning survival models

    Authors: Mikołaj Spytek, Mateusz Krzyziński, Sophie Hanna Langbein, Hubert Baniecki, Marvin N. Wright, Przemysław Biecek

    Abstract: Due to their flexibility and superior performance, machine learning models frequently complement and outperform traditional statistical survival models. However, their widespread adoption is hindered by a lack of user-friendly tools to explain their internal operations and prediction rationales. To tackle this issue, we introduce the survex R package, which provides a cohesive framework for explai…

    Submitted 21 November, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

  20. arXiv:2308.15559  [pdf, other]

    cs.LG stat.ML

    Glocal Explanations of Expected Goal Models in Soccer

    Authors: Mustafa Cavus, Adrian Stando, Przemyslaw Biecek

    Abstract: The expected goal models have gained popularity, but their interpretability is often limited, especially when trained using black-box methods. Explainable artificial intelligence tools have emerged to enhance model transparency and extract descriptive knowledge for a single observation or for all observations. However, explaining black-box models for a specific group of observations may be more us…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: 26 pages, 8 figures

  21. arXiv:2308.11446  [pdf, other]

    cs.LG cs.AI stat.ML

    Exploration of the Rashomon Set Assists Trustworthy Explanations for Medical Data

    Authors: Katarzyna Kobylińska, Mateusz Krzyziński, Rafał Machowicz, Mariusz Adamek, Przemysław Biecek

    Abstract: The machine learning modeling process conventionally culminates in selecting a single model that maximizes a selected performance metric. However, this approach leads to abandoning a more profound analysis of slightly inferior models. Particularly in medical and healthcare studies, where the objective extends beyond predictions to valuable insight generation, relying solely on a single model can r…

    Submitted 18 September, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  22. arXiv:2308.01137  [pdf, other]

    eess.IV cs.CV cs.LG

    Multi-task learning for classification, segmentation, reconstruction, and detection on chest CT scans

    Authors: Weronika Hryniewska-Guzik, Maria Kędzierska, Przemysław Biecek

    Abstract: Lung cancer and covid-19 have one of the highest morbidity and mortality rates in the world. For physicians, the identification of lesions is difficult in the early stages of the disease and time-consuming. Therefore, multi-task learning is an approach to extracting important features, such as lesions, from small amounts of medical data because it learns to generalize better. We propose a novel mu…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: presented at the Polish Conference on Artificial Intelligence (PP-RAI), 2023

    Journal ref: Progress in Polish Artificial Intelligence Research 4 (2023) 251-257

  23. arXiv:2307.07764  [pdf, other]

    cs.AI

    Explaining and visualizing black-box models through counterfactual paths

    Authors: Bastian Pfeifer, Mateusz Krzyzinski, Hubert Baniecki, Anna Saranti, Andreas Holzinger, Przemyslaw Biecek

    Abstract: Explainable AI (XAI) is an increasingly important area of machine learning research, which aims to make black-box models transparent and interpretable. In this paper, we propose a novel approach to XAI that uses the so-called counterfactual paths generated by conditional permutations of features. The algorithm measures feature importance by identifying sequential permutations of features that most…

    Submitted 1 August, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

  24. arXiv:2307.00157  [pdf, other]

    cs.LG stat.ML

    The Effect of Balancing Methods on Model Behavior in Imbalanced Classification Problems

    Authors: Adrian Stando, Mustafa Cavus, Przemysław Biecek

    Abstract: Imbalanced data poses a significant challenge in classification as model performance is affected by insufficient learning from minority classes. Balancing methods are often used to address this problem. However, such techniques can lead to problems such as overfitting or loss of information. This study addresses a more challenging aspect of balancing methods - their impact on model behavior. To ca…

    Submitted 30 June, 2023; originally announced July 2023.

    Journal ref: Proceedings of Machine Learning Research 241 (2023) 16-30

  25. arXiv:2306.11636  [pdf, other]

    cs.LG

    SeFNet: Bridging Tabular Datasets with Semantic Feature Nets

    Authors: Katarzyna Woźnica, Piotr Wilczyński, Przemysław Biecek

    Abstract: Machine learning applications cover a wide range of predictive tasks in which tabular datasets play a significant role. However, although they often address similar problems, tabular datasets are typically treated as standalone tasks. The possibilities of using previously solved problems are limited due to the lack of structured contextual information about their features and the lack of understan…

    Submitted 20 June, 2023; originally announced June 2023.

  26. arXiv:2306.06123  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Adversarial attacks and defenses in explainable artificial intelligence: A survey

    Authors: Hubert Baniecki, Przemyslaw Biecek

    Abstract: Explainable artificial intelligence (XAI) methods are portrayed as a remedy for debugging and trusting statistical and deep learning models, as well as interpreting their predictions. However, recent advances in adversarial machine learning (AdvML) highlight the limitations and vulnerabilities of state-of-the-art explanation methods, putting their security and trustworthiness into question. The po…

    Submitted 13 February, 2024; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by Information Fusion

    Journal ref: Information Fusion, vol. 107, 102303, 2024

  27. arXiv:2305.10961  [pdf, other]

    cs.AI

    Prevention is better than cure: a case study of the abnormalities detection in the chest

    Authors: Weronika Hryniewska, Piotr Czarnecki, Jakub Wiśniewski, Przemysław Bombiński, Przemysław Biecek

    Abstract: Prevention is better than cure. This old truth applies not only to the prevention of diseases but also to the prevention of issues with AI models used in medicine. The source of malfunctioning of predictive models often lies not in the training process but reaches the data acquisition phase or design of the experiment phase. In this paper, we analyze in detail a single use case - a Kaggle compet…

    Submitted 18 May, 2023; originally announced May 2023.

    Journal ref: CVPR 2021 Workshop Beyond Fairness: Towards a Just, Equitable, and Accountable Computer Vision

  28. Towards Evaluating Explanations of Vision Transformers for Medical Imaging

    Authors: Piotr Komorowski, Hubert Baniecki, Przemysław Biecek

    Abstract: As deep learning models increasingly find applications in critical domains such as medical imaging, the need for transparent and trustworthy decision-making becomes paramount. Many explainability methods provide insights into how these models make predictions by attributing importance to input features. As Vision Transformer (ViT) becomes a promising alternative to convolutional neural networks fo…

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted by XAI4CV Workshop at CVPR 2023

    Journal ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 3726-3732, 2023

  29. Interpretable machine learning for time-to-event prediction in medicine and healthcare

    Authors: Hubert Baniecki, Bartlomiej Sobieski, Patryk Szatkowski, Przemyslaw Bombinski, Przemyslaw Biecek

    Abstract: Time-to-event prediction, e.g. cancer survival analysis or hospital length of stay, is a highly prominent machine learning task in medical and healthcare applications. However, only a few interpretable machine learning methods comply with its challenges. To facilitate a comprehensive explanatory analysis of survival models, we formally introduce time-dependent feature effects and global feature im…

    Submitted 27 March, 2024; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: An extended version of an AIME 2023 paper submitted to Artificial Intelligence in Medicine

    Journal ref: Artificial Intelligence in Medicine, vol. 1, pp. 65-74, 2023

  30. arXiv:2303.06640  [pdf, other]

    cs.LG eess.SP

    Challenges facing the explainability of age prediction models: case study for two modalities

    Authors: Mikolaj Spytek, Weronika Hryniewska-Guzik, Jaroslaw Zygierewicz, Jacek Rogala, Przemyslaw Biecek

    Abstract: The prediction of age is a challenging task with various practical applications in high-impact fields like the healthcare domain or criminology. Despite the growing number of models and their increasing performance, we still know little about how these models work. Numerous examples of failures of AI systems show that performance alone is insufficient, thus, new methods are needed to explore and…

    Submitted 12 March, 2023; originally announced March 2023.

    Comments: Presented at the Aging Hackathon at the 7th International workshop on Health Intelligence (W3PHIAI-23) at AAAI-23 Conference, http://w3phiai2023.w3phi.com/

  31. arXiv:2302.13356  [pdf, other]

    stat.ML cs.LG stat.AP

    Performance is not enough: the story told by a Rashomon quartet

    Authors: Przemyslaw Biecek, Hubert Baniecki, Mateusz Krzyzinski, Dianne Cook

    Abstract: The usual goal of supervised learning is to find the best model, the one that optimizes a particular performance measure. However, what if the explanation provided by this model is completely different from another model and different again from another model despite all having similarly good fit statistics? Is it possible that the equally effective models put the spotlight on different relationsh…

    Submitted 11 April, 2024; v1 submitted 26 February, 2023; originally announced February 2023.

  32. arXiv:2302.13099  [pdf, other]

    cs.CL

    HADES: Homologous Automated Document Exploration and Summarization

    Authors: Piotr Wilczyński, Artur Żółkowski, Mateusz Krzyziński, Emilia Wiśnios, Bartosz Pieliński, Stanisław Giziński, Julian Sienkiewicz, Przemysław Biecek

    Abstract: This paper introduces HADES, a novel tool for automatic comparative documents with similar structures. HADES is designed to streamline the work of professionals dealing with large volumes of documents, such as policy documents, legal acts, and scientific papers. The tool employs a multi-step pipeline that begins with processing PDF documents using topic modeling, summarization, and analysis of the…

    Submitted 25 February, 2023; originally announced February 2023.

  33. arXiv:2211.05852  [pdf, other]

    cs.CL

    Climate Policy Tracker: Pipeline for automated analysis of public climate policies

    Authors: Artur Żółkowski, Mateusz Krzyziński, Piotr Wilczyński, Stanisław Giziński, Emilia Wiśnios, Bartosz Pieliński, Julian Sienkiewicz, Przemysław Biecek

    Abstract: The number of standardized policy documents regarding climate policy and their publication frequency is significantly increasing. The documents are long and tedious for manual analysis, especially for policy experts, lawmakers, and citizens who lack access or domain expertise to utilize data analytics tools. Potential consequences of such a situation include reduced citizen governance and involvem…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: Accepted for Tackling Climate Change with Machine Learning: workshop at NeurIPS 2022

  34. SurvSHAP(t): Time-dependent explanations of machine learning survival models

    Authors: Mateusz Krzyziński, Mikołaj Spytek, Hubert Baniecki, Przemysław Biecek

    Abstract: Machine and deep learning survival models demonstrate similar or even improved time-to-event prediction capabilities compared to classical statistical learning methods yet are too complex to be interpreted by humans. Several model-agnostic explanations are available to overcome this issue; however, none directly explain the survival function prediction. In this paper, we introduce SurvSHAP(t), the…

    Submitted 7 September, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

    Journal ref: Knowledge-Based Systems, vol. 262, 110234, 2023

  35. arXiv:2208.09966  [pdf, other]

    cs.LG cs.AI cs.CY

    Performance, Opaqueness, Consequences, and Assumptions: Simple questions for responsible planning of machine learning solutions

    Authors: Przemyslaw Biecek

    Abstract: The data revolution has generated a huge demand for data-driven solutions. This demand propels a growing number of easy-to-use tools and training for aspiring data scientists that enable the rapid building of predictive models. Today, weapons of math destruction can be easily built and deployed without detailed planning and validation. This rapidly extends the list of AI failures, i.e. deployments…

    Submitted 21 August, 2022; originally announced August 2022.

  36. Explainable expected goal models for performance analysis in football analytics

    Authors: Mustafa Cavus, Przemysław Biecek

    Abstract: The expected goal provides a more representative measure of the team and player performance which also suit the low-scoring nature of football instead of score in modern football. The score of a match involves randomness and often may not represent the performance of the teams and players, therefore it has been popular to use the alternative statistics in recent years such as shots on target, ball…

    Submitted 6 September, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

  37. arXiv:2205.05359  [pdf]

    stat.ML cs.AI cs.LG

    Exploring Local Explanations of Nonlinear Models Using Animated Linear Projections

    Authors: Nicholas Spyrison, Dianne Cook, Przemyslaw Biecek

    Abstract: The increased predictive power of machine learning models comes at the cost of increased complexity and loss of interpretability, particularly in comparison to parametric statistical models. This trade-off has led to the emergence of eXplainable AI (XAI) which provides methods, such as local explanations (LEs) and local variable attributions (LVAs), to shed light on how a model use predictors to a…

    Submitted 18 January, 2024; v1 submitted 11 May, 2022; originally announced May 2022.

  38. arXiv:2201.11815  [pdf, other]

    cs.LG

    Consolidated learning -- a domain-specific model-free optimization strategy with examples for XGBoost and MIMIC-IV

    Authors: Katarzyna Woźnica, Mateusz Grzyb, Zuzanna Trafas, Przemysław Biecek

    Abstract: For many machine learning models, a choice of hyperparameters is a crucial step towards achieving high performance. Prevalent meta-learning approaches focus on obtaining good hyperparameters configurations with a limited computational budget for a completely new task based on the results obtained from the prior tasks. This paper proposes a new formulation of the tuning problem, called consolidated…

    Submitted 27 January, 2022; originally announced January 2022.

  39. LIMEcraft: Handcrafted superpixel selection and inspection for Visual eXplanations

    Authors: Weronika Hryniewska, Adrianna Grudzień, Przemysław Biecek

    Abstract: The increased interest in deep learning applications, and their hard-to-detect biases result in the need to validate and explain complex models. However, current explanation methods are limited as far as both the explanation of the reasoning process and prediction results are concerned. They usually only show the location in the image that was important for model prediction. The lack of possibilit…

    Submitted 30 April, 2022; v1 submitted 15 November, 2021; originally announced November 2021.

    Journal ref: Machine Learning (2022) 1-18

  40. arXiv:2108.11674  [pdf, other]

    cs.AI cs.LG q-bio.MN

    Graph-guided random forest for gene set selection

    Authors: Bastian Pfeifer, Hubert Baniecki, Anna Saranti, Przemyslaw Biecek, Andreas Holzinger

    Abstract: Machine learning methods can detect complex relationships between variables, but usually do not exploit domain knowledge. This is a limitation because in many scientific disciplines, such as systems biology, domain knowledge is available in the form of graphs or networks, and its use can improve model performance. We need network-based algorithms that are versatile and applicable in many research…

    Submitted 6 September, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

    Journal ref: Sci Rep 12, 16857 (2022)

  41. arXiv:2108.06216  [pdf, other]

    cs.IR cs.AI cs.CL cs.SI

    MAIR: Framework for mining relationships between research articles, strategies, and regulations in the field of explainable artificial intelligence

    Authors: Stanisław Gizinski, Michał Kuzba, Bartosz Pielinski, Julian Sienkiewicz, Stanisław Łaniewski, Przemysław Biecek

    Abstract: The growing number of AI applications, also for high-stake decisions, increases the interest in Explainable and Interpretable Machine Learning (XI-ML). This trend can be seen both in the increasing number of regulations and strategies for developing trustworthy AI and the growing number of scientific papers dedicated to this topic. To ensure the sustainable development of AI, it is essential to un…

    Submitted 29 July, 2021; originally announced August 2021.

  42. arXiv:2105.13787  [pdf, other]

    cs.LG

    Do not explain without context: addressing the blind spot of model explanations

    Authors: Katarzyna Woźnica, Katarzyna Pękala, Hubert Baniecki, Wojciech Kretowicz, Elżbieta Sienkiewicz, Przemysław Biecek

    Abstract: The increasing number of regulations and expectations of predictive machine learning models, such as so called right to explanation, has led to a large number of methods promising greater interpretability. High demand has led to a widespread adoption of XAI techniques like Shapley values, Partial Dependence profiles or permutational variable importance. However, we still do not know enough about t…

    Submitted 28 May, 2021; originally announced May 2021.

  43. Fooling Partial Dependence via Data Poisoning

    Authors: Hubert Baniecki, Wojciech Kretowicz, Przemyslaw Biecek

    Abstract: Many methods have been developed to understand complex predictive models and high expectations are placed on post-hoc model explainability. It turns out that such explanations are not robust nor trustworthy, and they can be fooled. This paper presents techniques for attacking Partial Dependence (plots, profiles, PDP), which are among the most popular methods of explaining any predictive model trai…

    Submitted 10 July, 2022; v1 submitted 26 May, 2021; originally announced May 2021.

    Comments: Accepted at ECML PKDD 2022

    Journal ref: Machine Learning and Knowledge Discovery in Databases, vol. 3, pp. 121-136, 2022

  44. Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts

    Authors: Tomasz Stanisławek, Filip Graliński, Anna Wróblewska, Dawid Lipiński, Agnieszka Kaliska, Paulina Rosalska, Bartosz Topolski, Przemysław Biecek

    Abstract: The relevance of the Key Information Extraction (KIE) task is increasingly important in natural language processing problems. But there are still only a few well-defined problems that serve as benchmarks for solutions in this area. To bridge this gap, we introduce two new datasets (Kleister NDA and Kleister Charity). They involve a mix of scanned and born-digital long formal English-language docum…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: accepted to ICDAR 2021

    Journal ref: International Conference on Document Analysis and Recognition ICDAR 2021

  45. arXiv:2104.06735  [pdf]

    q-fin.RM cs.AI cs.LG

    Enabling Machine Learning Algorithms for Credit Scoring -- Explainable Artificial Intelligence (XAI) methods for clear understanding complex predictive models

    Authors: Przemysław Biecek, Marcin Chlebus, Janusz Gajda, Alicja Gosiewska, Anna Kozak, Dominik Ogonowski, Jakub Sztachelski, Piotr Wojewnik

    Abstract: Rapid development of advanced modelling techniques gives an opportunity to develop tools that are more and more accurate. However as usually, everything comes with a price and in this case, the price to pay is to lose interpretability of a model while gaining on its accuracy and precision. For managers to control and effectively manage credit risk and for regulators to be convinced with model qua…

    Submitted 14 April, 2021; originally announced April 2021.

  46. arXiv:2104.03403  [pdf, other]

    cs.LG

    Triplot: model agnostic measures and visualisations for variable importance in predictive models that take into account the hierarchical correlation structure

    Authors: Katarzyna Pekala, Katarzyna Woznica, Przemyslaw Biecek

    Abstract: One of the key elements of explanatory analysis of a predictive model is to assess the importance of individual variables. Rapid development of the area of predictive model exploration (also called explainable artificial intelligence or interpretable machine learning) has led to the popularization of methods for local (instance level) and global (dataset level) methods, such as Permutational Varia…

    Submitted 7 April, 2021; originally announced April 2021.

  47. arXiv:2104.00507  [pdf, other]

    stat.ML cs.LG cs.MS stat.AP

    fairmodels: A Flexible Tool For Bias Detection, Visualization, And Mitigation

    Authors: Jakub Wiśniewski, Przemysław Biecek

    Abstract: Machine learning decision systems are getting omnipresent in our lives. From dating apps to rating loan seekers, algorithms affect both our well-being and future. Typically, however, these systems are not infallible. Moreover, complex predictive models are really eager to learn social biases present in historical data that can lead to increasing discrimination. If we want to create models responsi…

    Submitted 11 February, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: 15 pages, 9 figures

  48. arXiv:2012.14406  [pdf, other]

    cs.LG cs.HC cs.SE stat.ML

    dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python

    Authors: Hubert Baniecki, Wojciech Kretowicz, Piotr Piatyszek, Jakub Wisniewski, Przemyslaw Biecek

    Abstract: The increasing amount of available data, computing power, and the constant pursuit for higher performance results in the growing complexity of predictive models. Their black-box nature leads to opaqueness debt phenomenon inflicting increased risks of discrimination, lack of reproducibility, and deflated performance due to data drift. To manage these risks, good MLOps practices ask for better valid…

    Submitted 11 October, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

    Comments: https://jmlr.org/papers/v22/20-1473.html

    Journal ref: Journal of Machine Learning Research (2021) v. 22(214); pp. 1-7

  49. Checklist for responsible deep learning modeling of medical images based on COVID-19 detection studies

    Authors: Weronika Hryniewska, Przemysław Bombiński, Patryk Szatkowski, Paulina Tomaszewska, Artur Przelaskowski, Przemysław Biecek

    Abstract: The sudden outbreak and uncontrolled spread of COVID-19 disease is one of the most important global problems today. In a short period of time, it has led to the development of many deep neural network models for COVID-19 detection with modules for explainability. In this work, we carry out a systematic analysis of various aspects of proposed models. Our analysis revealed numerous mistakes made at…

    Submitted 23 April, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

    Report number: 2021

    Journal ref: Pattern Recognition 118 (2021) 108035

  50. arXiv:2009.13384  [pdf, other]

    stat.ML cs.LG econ.GN stat.AP stat.ME

    Transparency, Auditability and eXplainability of Machine Learning Models in Credit Scoring

    Authors: Michael Bücker, Gero Szepannek, Alicja Gosiewska, Przemyslaw Biecek

    Abstract: A major requirement for credit scoring models is to provide a maximally accurate risk prediction. Additionally, regulators demand these models to be transparent and auditable. Thus, in credit scoring, very simple predictive models such as logistic regression or decision trees are still widely used and the superior predictive power of modern machine learning algorithms cannot be fully leveraged. Si…

    Submitted 28 September, 2020; originally announced September 2020.