Search | arXiv e-print repository

A Unified View of Abstract Visual Reasoning Problems

Authors: Mikołaj Małkiński, Jacek Mańdziuk

Abstract: The field of Abstract Visual Reasoning (AVR) encompasses a wide range of problems, many of which are inspired by human IQ tests. The variety of AVR tasks has resulted in state-of-the-art AVR methods being task-specific approaches. Furthermore, contemporary methods consider each AVR problem instance not as a whole, but in the form of a set of individual panels with particular locations and roles (context vs. answer panels) pre-assigned according to the task-specific arrangements. While these highly specialized approaches have recently led to significant progress in solving particular AVR tasks, considering each task in isolation hinders the development of universal learning systems in this domain. In this paper, we introduce a unified view of AVR tasks, where each problem instance is rendered as a single image, with no a priori assumptions about the number of panels, their location, or role. The main advantage of the proposed unified view is the ability to develop universal learning models applicable to various AVR tasks. What is more, the proposed approach inherently facilitates transfer learning in the AVR domain, as various types of problems share a common representation. The experiments conducted on four AVR datasets with Raven's Progressive Matrices and Visual Analogy Problems, and one real-world visual analogy dataset show that the proposed unified representation of AVR tasks poses a challenge to state-of-the-art Deep Learning (DL) AVR models and, more broadly, contemporary DL image recognition methods.
In order to address this challenge, we introduce the Unified Model for Abstract Visual Reasoning (UMAVR) capable of dealing with various types of AVR problems in a unified manner. UMAVR outperforms existing AVR methods in selected single-task learning experiments, and demonstrates effective knowledge reuse in transfer learning and curriculum learning setups.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.11061 [pdf, other]

Generalization and Knowledge Transfer in Abstract Visual Reasoning Models

Authors: Mikołaj Małkiński, Jacek Mańdziuk

Abstract: We study generalization and knowledge reuse capabilities of deep neural networks in the domain of abstract visual reasoning (AVR), employing Raven's Progressive Matrices (RPMs), a recognized benchmark task for assessing AVR abilities. Two knowledge transfer scenarios referring to the I-RAVEN dataset are investigated. Firstly, inspired by generalization assessment capabilities of the PGM dataset and popularity of I-RAVEN, we introduce Attributeless-I-RAVEN, a benchmark with four generalization regimes that allow to test generalization of abstract rules applied to held-out attributes. Secondly, we construct I-RAVEN-Mesh, a dataset that enriches RPMs with a novel component structure comprising line-based patterns, facilitating assessment of progressive knowledge acquisition in transfer learning setting. The developed benchmarks reveal shortcomings of the contemporary deep learning models, which we partly address with Pathways of Normalized Group Convolution (PoNG) model, a novel neural architecture for solving AVR tasks. PoNG excels in both presented challenges, as well as the standard I-RAVEN and PGM setups.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2405.11372 [pdf, other]

ReModels: Quantile Regression Averaging models

Authors: Grzegorz Zakrzewski, Kacper Skonieczka, Mikołaj Małkiński, Jacek Mańdziuk

Abstract: Electricity price forecasts play a crucial role in making key business decisions within the electricity markets. A focal point in this domain are probabilistic predictions, which delineate future price values in a more comprehensive manner than simple point forecasts. The golden standard in probabilistic approaches to predict energy prices is the Quantile Regression Averaging (QRA) method. In this paper, we present a Python package that encompasses the implementation of QRA, along with modifications of this approach that have appeared in the literature over the past few years. The proposed package also facilitates the acquisition and preparation of data related to electricity markets, as well as the evaluation of model predictions.

Submitted 18 May, 2024; originally announced May 2024.

arXiv:2405.06330 [pdf, other]

Interpretable Multi-task Learning with Shared Variable Embeddings

Authors: Maciej Żelaszczyk, Jacek Mańdziuk

Abstract: This paper proposes a general interpretable predictive system with shared information. The system is able to perform predictions in a multi-task setting where distinct tasks are not bound to have the same input/output structure. Embeddings of input and output variables in a common space are obtained, where the input embeddings are produced through attending to a set of shared embeddings, reused across tasks. All the embeddings are treated as model parameters and learned. Specific restrictions on the space of shared embeddings and the sparsity of the attention mechanism are considered. Experiments show that the introduction of shared embeddings does not deteriorate the results obtained from a vanilla variable embeddings method. We run a number of further ablations. Inducing sparsity in the attention mechanism leads to both an increase in accuracy and a significant decrease in the number of training steps required. Shared embeddings provide a measure of interpretability in terms of both a qualitative assessment and the ability to map specific shared embeddings to pre-defined concepts that are not tailored to the considered model. There seems to be a trade-off between accuracy and interpretability. The basic shared embeddings method favors interpretability, whereas the sparse attention method promotes accuracy. The results lead to the conclusion that variable embedding methods may be extended with shared information to provide increased interpretability and accuracy.

Submitted 30 June, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

arXiv:2401.11631 [pdf, other]

Text-to-Image Cross-Modal Generation: A Systematic Review

Authors: Maciej Żelaszczyk, Jacek Mańdziuk

Abstract: We review research on generating visual data from text from the angle of "cross-modal generation." This point of view allows us to draw parallels between various methods geared towards working on input text and producing visual output, without limiting the analysis to narrow sub-areas. It also results in the identification of common templates in the field, which are then compared and contrasted both within pools of similar methods and across lines of research. We provide a breakdown of text-to-image generation into various flavors of image-from-text methods, video-from-text methods, image editing, self-supervised and graph-based approaches. In this discussion, we focus on research papers published at 8 leading machine learning conferences in the years 2016-2022, also incorporating a number of relevant papers not matching the outlined search criteria. The conducted review suggests a significant increase in the number of papers published in the area and highlights research gaps and potential lines of investigation. To our knowledge, this is the first review to systematically look at text-to-image generation from the perspective of "cross-modal generation."

Submitted 21 January, 2024; originally announced January 2024.

arXiv:2312.09997 [pdf, other]

One Self-Configurable Model to Solve Many Abstract Visual Reasoning Problems

Authors: Mikołaj Małkiński, Jacek Mańdziuk

Abstract: Abstract Visual Reasoning (AVR) comprises a wide selection of various problems similar to those used in human IQ tests. Recent years have brought dynamic progress in solving particular AVR tasks, however, in the contemporary literature AVR problems are largely dealt with in isolation, leading to highly specialized task-specific methods. With the aim of developing universal learning systems in the AVR domain, we propose the unified model for solving Single-Choice Abstract visual Reasoning tasks (SCAR), capable of solving various single-choice AVR tasks, without making any a priori assumptions about the task structure, in particular the number and location of panels. The proposed model relies on a novel Structure-Aware dynamic Layer (SAL), which adapts its weights to the structure of the considered AVR problem. Experiments conducted on Raven's Progressive Matrices, Visual Analogy Problems, and Odd One Out problems show that SCAR (SAL-based models, in general) effectively solves diverse AVR tasks, and its performance is on par with the state-of-the-art task-specific baselines. What is more, SCAR demonstrates effective knowledge reuse in multi-task and transfer learning settings. To our knowledge, this work is the first successful attempt to construct a general single-choice AVR solver relying on self-configurable architecture and unified solving method. With this work we aim to stimulate and foster progress on task-independent research paths in the AVR domain, with the long-term goal of development of a general AVR solver.

Submitted 15 December, 2023; originally announced December 2023.

Comments: Accepted to The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

arXiv:2312.09078 [pdf, other]

Coevolutionary Algorithm for Building Robust Decision Trees under Minimax Regret

Authors: Adam Żychowski, Andrew Perrault, Jacek Mańdziuk

Abstract: In recent years, there has been growing interest in developing robust machine learning (ML) models that can withstand adversarial attacks, including one of the most widely adopted, efficient, and interpretable ML algorithms - decision trees (DTs). This paper proposes a novel coevolutionary algorithm (CoEvoRDT) designed to create robust DTs capable of handling noisy high-dimensional data in adversarial contexts. Motivated by the limitations of traditional DT algorithms, we leverage adaptive coevolution to allow DTs to evolve and learn from interactions with perturbed input data. CoEvoRDT alternately evolves competing populations of DTs and perturbed features, enabling construction of DTs with desired properties. CoEvoRDT is easily adaptable to various target metrics, allowing the use of tailored robustness criteria such as minimax regret. Furthermore, CoEvoRDT has potential to improve the results of other state-of-the-art methods by incorporating their outcomes (DTs they produce) into the initial population and optimize them in the process of coevolution. Inspired by the game theory, CoEvoRDT utilizes mixed Nash equilibrium to enhance convergence. The method is tested on 20 popular datasets and shows superior performance compared to 4 state-of-the-art algorithms. It outperformed all competing methods on 13 datasets with adversarial accuracy metrics, and on all 20 considered datasets with minimax regret.
Strong experimental results and flexibility in choosing the error measure make CoEvoRDT a promising approach for constructing robust DTs in real-world applications.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2309.11104 [pdf, other]

AttentionMix: Data augmentation method that relies on BERT attention mechanism

Authors: Dominik Lewy, Jacek Mańdziuk

Abstract: The Mixup method has proven to be a powerful data augmentation technique in Computer Vision, with many successors that perform image mixing in a guided manner. One of the interesting research directions is transferring the underlying Mixup idea to other domains, e.g. Natural Language Processing (NLP). Even though there already exist several methods that apply Mixup to textual data, there is still room for new, improved approaches. In this work, we introduce AttentionMix, a novel mixing method that relies on attention-based information. While the paper focuses on the BERT attention mechanism, the proposed approach can be applied to generally any attention-based model. AttentionMix is evaluated on 3 standard sentiment classification datasets and in all three cases outperforms two benchmark approaches that utilize Mixup mechanism, as well as the vanilla BERT method. The results confirm that the attention-based information can be effectively used for data augmentation in the NLP domain.

Submitted 20 September, 2023; originally announced September 2023.

arXiv:2207.04103 [pdf, other]

StatMix: Data augmentation method that relies on image statistics in federated learning

Authors: Dominik Lewy, Jacek Mańdziuk, Maria Ganzha, Marcin Paprzycki

Abstract: Availability of large amount of annotated data is one of the pillars of deep learning success. Although numerous big datasets have been made available for research, this is often not the case in real life applications (e.g. companies are not able to share data due to GDPR or concerns related to intellectual property rights protection). Federated learning (FL) is a potential solution to this problem, as it enables training a global model on data scattered across multiple nodes, without sharing local data itself. However, even FL methods pose a threat to data privacy, if not handled properly. Therefore, we propose StatMix, an augmentation approach that uses image statistics, to improve results of FL scenario(s). StatMix is empirically tested on CIFAR-10 and CIFAR-100, using two neural network architectures. In all FL experiments, application of StatMix improves the average accuracy, compared to the baseline training (with no use of StatMix). Some improvement can also be observed in non-FL setups.

Submitted 8 July, 2022; originally announced July 2022.

arXiv:2206.08124 [pdf, other]

Using adversarial images to improve outcomes of federated learning for non-IID data

Authors: Anastasiya Danilenka, Maria Ganzha, Marcin Paprzycki, Jacek Mańdziuk

Abstract: One of the important problems in federated learning is how to deal with unbalanced data. This contribution introduces a novel technique designed to deal with label skewed non-IID data, using adversarial inputs, created by the I-FGSM method. Adversarial inputs guide the training process and allow the Weighted Federated Averaging to give more importance to clients with 'selected' local label distributions. Experimental results, gathered from image classification tasks, for MNIST and CIFAR-10 datasets, are reported and analyzed.

Submitted 16 June, 2022; originally announced June 2022.

arXiv:2204.14173 [pdf, other]

doi 10.24963/ijcai.2022/88

Evolutionary Approach to Security Games with Signaling

Authors: Adam Żychowski, Jacek Mańdziuk, Elizabeth Bondi, Aravind Venugopal, Milind Tambe, Balaraman Ravindran

Abstract: Green Security Games have become a popular way to model scenarios involving the protection of natural resources, such as wildlife. Sensors (e.g. drones equipped with cameras) have also begun to play a role in these scenarios by providing real-time information. Incorporating both human and sensor defender resources strategically is the subject of recent work on Security Games with Signaling (SGS). However, current methods to solve SGS do not scale well in terms of time or memory. We therefore propose a novel approach to SGS, which, for the first time in this domain, employs an Evolutionary Computation paradigm: EASGS. EASGS effectively searches the huge SGS solution space via suitable solution encoding in a chromosome and a specially-designed set of operators. The operators include three types of mutations, each focusing on a particular aspect of the SGS solution, optimized crossover and a local coverage improvement scheme (a memetic aspect of EASGS). We also introduce a new set of benchmark games, based on dense or locally-dense graphs that reflect real-world SGS settings. In the majority of 342 test game instances, EASGS outperforms state-of-the-art methods, including a reinforcement learning method, in terms of time scalability, nearly constant memory utilization, and quality of the returned defender's strategies (expected payoffs).

Submitted 29 April, 2022; originally announced April 2022.

Journal ref: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, 620-627

arXiv:2204.04105 [pdf, other]

doi 10.1145/3512290.3528805

Improving LSHADE by means of a pre-screening mechanism

Authors: Mateusz Zaborski, Jacek Mańdziuk

Abstract: Evolutionary algorithms have proven to be highly effective in continuous optimization, especially when numerous fitness function evaluations (FFEs) are possible. In certain cases, however, an expensive optimization approach (i.e. with relatively low number of FFEs) must be taken, and such a setting is considered in this work. The paper introduces an extension to the well-known LSHADE algorithm in the form of a pre-screening mechanism (psLSHADE). The proposed pre-screening relies on the three following components: a specific initial sampling procedure, an archive of samples, and a global linear meta-model of a fitness function that consists of 6 independent transformations of variables. The pre-screening mechanism preliminary assesses the trial vectors and designates the best one of them for further evaluation with the fitness function. The performance of psLSHADE is evaluated using the CEC2021 benchmark in an expensive scenario with an optimization budget of 10^2-10^4 FFEs per dimension. We compare psLSHADE with the baseline LSHADE method and the MadDE algorithm. The results indicate that with restricted optimization budgets psLSHADE visibly outperforms both competitive algorithms. In addition, the use of the pre-screening mechanism results in faster population convergence of psLSHADE compared to LSHADE.

Submitted 11 April, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

Comments: Accepted at Genetic and Evolutionary Computation Conference (GECCO'22)

Journal ref: Genetic and Evolutionary Computation Conference, GECCO 2022, 884-892

arXiv:2202.10284 [pdf, other]

doi 10.1016/j.inffus.2022.11.011

A Review of Emerging Research Directions in Abstract Visual Reasoning

Authors: Mikołaj Małkiński, Jacek Mańdziuk

Abstract: Abstract Visual Reasoning (AVR) problems are commonly used to approximate human intelligence. They test the ability of applying previously gained knowledge, experience and skills in a completely new setting, which makes them particularly well-suited for this task. Recently, the AVR problems have become popular as a proxy to study machine intelligence, which has led to emergence of new distinct types of problems and multiple benchmark sets. In this work we review this emerging AVR research and propose a taxonomy to categorise the AVR tasks along 5 dimensions: input shapes, hidden rules, target task, cognitive function, and main challenge. The perspective taken in this survey allows to characterise AVR problems with respect to their shared and distinct properties, provides a unified view on the existing approaches for solving AVR tasks, shows how the AVR problems relate to practical applications, and outlines promising directions for future work. One of them refers to the observation that in the machine learning literature different tasks are considered in isolation, which is in the stark contrast with the way the AVR tasks are used to measure human intelligence, where multiple types of problems are combined within a single IQ test.

Submitted 7 March, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

Journal ref: Information Fusion, Volume 91, March 2023, Pages 713-736

arXiv:2201.12382 [pdf, other]

Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices

Authors: Mikołaj Małkiński, Jacek Mańdziuk

Abstract: Abstract visual reasoning (AVR) domain encompasses problems solving which requires the ability to reason about relations among entities present in a given scene. While humans, generally, solve AVR tasks in a "natural" way, even without prior experience, this type of problems has proven difficult for current machine learning systems. The paper summarises recent progress in applying deep learning methods to solving AVR problems, as a proxy for studying machine intelligence. We focus on the most common type of AVR tasks -- the Raven's Progressive Matrices (RPMs) -- and provide a comprehensive review of the learning methods and deep neural models applied to solve RPMs, as well as, the RPM benchmark sets. Performance analysis of the state-of-the-art approaches to solving RPMs leads to formulation of certain insights and remarks on the current and future trends in this area. We conclude the paper by demonstrating how real-world problems can benefit from the discoveries of RPM studies.

Submitted 19 April, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

ACM Class: I.2; I.5.4; I.5.1

arXiv:2110.02364 [pdf, other]

doi 10.1007/978-3-030-92307-5_66

Adversarial defenses via a mixture of generators

Authors: Maciej Żelaszczyk, Jacek Mańdziuk

Abstract: In spite of the enormous success of neural networks, adversarial examples remain a relatively weakly understood feature of deep learning systems. There is a considerable effort in both building more powerful adversarial attacks and designing methods to counter the effects of adversarial examples. We propose a method to transform the adversarial input data through a mixture of generators in order to recover the correct class obfuscated by the adversarial attack. A canonical set of images is used to generate adversarial examples through potentially multiple attacks. Such transformed images are processed by a set of generators, which are trained adversarially as a whole to compete in inverting the initial transformations. To our knowledge, this is the first use of a mixture-based adversarially trained system as a defense mechanism. We show that it is possible to train such a system without supervision, simultaneously on multiple adversarial attacks. Our system is able to recover class information for previously-unseen examples with neither attack nor data labels on the MNIST dataset. The results demonstrate that this multi-attack approach is competitive with adversarial defenses tested in single-attack settings.

Submitted 5 October, 2021; originally announced October 2021.

Journal ref: International Conference on Neural Information Processing, ICONIP 2021, CCIS 1516, 566-574

arXiv:2110.02316 [pdf, other]

doi 10.1007/978-3-030-92310-5_77

Prediction of the Facial Growth Direction is Challenging

Authors: Stanisław Kaźmierczak, Zofia Juszka, Vaska Vandevska-Radunovic, Thomas JJ Maal, Piotr Fudalej, Jacek Mańdziuk

Abstract: Facial dysmorphology or malocclusion is frequently associated with abnormal growth of the face. The ability to predict facial growth (FG) direction would allow clinicians to prepare individualized therapy to increase the chance for successful treatment. Prediction of FG direction is a novel problem in the machine learning (ML) domain. In this paper, we perform feature selection and point the attribute that plays a central role in the abovementioned problem. Then we successfully apply data augmentation (DA) methods and improve the previously reported classification accuracy by 2.81%. Finally, we present the results of two experienced clinicians that were asked to solve a similar task to ours and show how tough is solving this problem for human experts.

Submitted 28 September, 2021; originally announced October 2021.

Journal ref: International Conference on Neural Information Processing, ICONIP 2021, CCIS 1517, 665-673

arXiv:2109.13354 [pdf, other]

Audio-to-Image Cross-Modal Generation

Authors: Maciej Żelaszczyk, Jacek Mańdziuk

Abstract: Cross-modal representation learning allows to integrate information from different modalities into one representation. At the same time, research on generative models tends to focus on the visual domain with less emphasis on other domains, such as audio or text, potentially missing the benefits of shared representations. Studies successfully linking more than one modality in the generative setting are rare. In this context, we verify the possibility to train variational autoencoders (VAEs) to reconstruct image archetypes from audio data. Specifically, we consider VAEs in an adversarial training framework in order to ensure more variability in the generated data and find that there is a trade-off between the consistency and diversity of the generated images - this trade-off can be governed by scaling the reconstruction loss up or down, respectively. Our results further suggest that even in the case when the generated images are relatively inconsistent (diverse), features that are critical for proper image classification are preserved.

Submitted 27 September, 2021; originally announced September 2021.

Journal ref: International Joint Conference on Neural Networks, IJCNN 2022, Padua, Italy, 1-8

arXiv:2109.13036 [pdf, other]

doi 10.1007/978-3-030-92307-5_62

Learning Attacker's Bounded Rationality Model in Security Games

Authors: Adam Żychowski, Jacek Mańdziuk

Abstract: The paper proposes a novel neuroevolutionary method (NESG) for calculating the leader's payoff in Stackelberg Security Games. The heart of NESG is a strategy evaluation neural network (SENN). SENN is able to effectively evaluate the leader's strategies against an opponent who may not behave in a perfectly rational way due to certain cognitive biases or limitations. SENN is trained on historical data and does not require any direct prior knowledge regarding the follower's target preferences, payoff distribution or bounded rationality model. NESG was tested on a set of 90 benchmark games inspired by a real-world cybersecurity scenario known as deep packet inspections. Experimental results show an advantage of applying NESG over the existing state-of-the-art methods when playing against not perfectly rational opponents. The method provides high quality solutions with superior computation time scalability. Due to the generic and knowledge-free construction of NESG, the method may be applied to various real-life security scenarios.

Submitted 27 September, 2021; originally announced September 2021.

Journal ref: International Conference on Neural Information Processing, ICONIP 2021, CCIS 1516, 530-539

arXiv:2107.09887 [pdf, other]

doi 10.1007/s10462-022-10227-z

An overview of mixing augmentation methods and augmentation strategies

Authors: Dominik Lewy, Jacek Mańdziuk

Abstract: Deep Convolutional Neural Networks have made incredible progress in many Computer Vision tasks. This progress, however, often relies on the availability of large amounts of training data, required to prevent over-fitting, which in many domains entails a significant cost of manual data labeling. An alternative approach is the application of data augmentation (DA) techniques that aim at model regularization by creating additional observations from the available ones. This survey focuses on two DA research streams: image mixing and automated selection of augmentation strategies. First, the presented methods are briefly described, and then qualitatively compared with respect to their key characteristics. Various quantitative comparisons are also included based on the results reported in recent DA literature. This review mainly covers the methods published in the materials of top-tier conferences and in leading journals in the years 2017-2021.

Submitted 18 April, 2022; v1 submitted 21 July, 2021; originally announced July 2021.

Journal ref: Artificial Intelligence Review volume 56, pages 2111-2169 (2023)

arXiv:2106.10464 [pdf, other]

Prediction of the facial growth direction with Machine Learning methods

Authors: Stanisław Kaźmierczak, Zofia Juszka, Piotr Fudalej, Jacek Mańdziuk

Abstract: The first attempts to predict the facial growth (FG) direction were made over half a century ago. Despite numerous attempts and the time elapsed, a satisfactory method has not been established yet and the problem still poses a challenge for medical experts. To our knowledge, this paper is the first Machine Learning approach to the prediction of FG direction. The conducted data analysis reveals the inherent complexity of the problem and explains the reasons for the difficulty in FG direction prediction based on 2D X-ray images. To perform growth forecasting, we employ a wide range of algorithms, from logistic regression, through tree ensembles, to neural networks, and consider three slightly different problem formulations. The resulting classification accuracy varies between 71% and 75%.

Submitted 19 June, 2021; originally announced June 2021.

arXiv:2103.04931 [pdf, other]

doi 10.1007/s10462-022-10228-y

Monte Carlo Tree Search: A Review of Recent Modifications and Applications

Authors: Maciej Świechowski, Konrad Godlewski, Bartosz Sawicki, Jacek Mańdziuk

Abstract: Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems. The method relies on intelligent tree search that balances exploration and exploitation. MCTS performs random sampling in the form of simulations and stores statistics of actions to make more educated choices in each subsequent iteration. The method has become a state-of-the-art technique for combinatorial games; however, in more complex games (e.g. those with a high branching factor or real-time ones), as well as in various practical domains (e.g. transportation, scheduling or security), an efficient MCTS application often requires its problem-dependent modification or integration with other techniques. Such domain-specific modifications and hybrid approaches are the main focus of this survey. The last major MCTS survey was published in 2012. Contributions that appeared since its release are of particular interest for this review.

Submitted 11 June, 2022; v1 submitted 8 March, 2021; originally announced March 2021.

Comments: 99 pages, Accepted to Artificial Intelligence Review journal

Journal ref: Artificial Intelligence Review (2023), vol. 56, 2497-2562

arXiv:2012.01944 [pdf, other]

doi 10.1109/TNNLS.2022.3185949

Multi-Label Contrastive Learning for Abstract Visual Reasoning

Authors: Mikołaj Małkiński, Jacek Mańdziuk

Abstract: For a long time the ability to solve abstract reasoning tasks was considered one of the hallmarks of human intelligence. Recent advances in the application of deep learning (DL) methods led, as in many other domains, to surpassing human abstract reasoning performance, specifically in the most popular type of such problems, the Raven's Progressive Matrices (RPMs). While the efficacy of DL systems is indeed impressive, the way they approach the RPMs is very different from that of humans. State-of-the-art systems solving RPMs rely on massive pattern-based training and sometimes on exploiting biases in the dataset, whereas humans concentrate on identification of the rules / concepts underlying the RPM (or generally a visual reasoning task) to be solved. Motivated by this cognitive difference, this work aims at combining DL with the human way of solving RPMs and getting the best of both worlds. Specifically, we cast the problem of solving RPMs into a multi-label classification framework where each RPM is viewed as a multi-label data point, with labels determined by the set of abstract rules underlying the RPM. For efficient training of the system we introduce a generalisation of the Noise Contrastive Estimation algorithm to the case of multi-label samples. Furthermore, we propose a new sparse rule encoding scheme for RPMs which, besides the new training algorithm, is the key factor contributing to the state-of-the-art performance. The proposed approach is evaluated on the two most popular benchmark datasets (Balanced-RAVEN and PGM) and on both of them demonstrates an advantage over the current state-of-the-art results. Contrary to applications of contrastive learning methods reported in other domains, the state-of-the-art performance reported in the paper is achieved with no need for large batch sizes or strong data augmentation.

Submitted 3 December, 2020; originally announced December 2020.

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2022, Open Access, published 6 July 2022

arXiv:2006.14894 [pdf, other]

doi 10.1007/978-3-030-58112-1_30

Biologically Plausible Learning of Text Representation with Spiking Neural Networks

Authors: Marcin Białas, Marcin Michał Mirończuk, Jacek Mańdziuk

Abstract: This study proposes a novel biologically plausible mechanism for generating low-dimensional spike-based text representation. First, we demonstrate how to transform documents into series of spikes (spike trains) which are subsequently used as input in the training process of a spiking neural network (SNN). The network is composed of biologically plausible elements, and trained according to the unsupervised Hebbian learning rule, Spike-Timing-Dependent Plasticity (STDP). After training, the SNN can be used to generate low-dimensional spike-based text representation suitable for text/document classification. Empirical results demonstrate that the generated text representation may be effectively used in text classification, leading to an accuracy of 80.19% on the bydate version of the 20 newsgroups data set, which is a leading result amongst approaches that rely on low-dimensional text representations.

Submitted 26 June, 2020; originally announced June 2020.

Comments: This article was originally submitted for Parallel Problem Solving from Nature conference and will be available in Springer Lecture Notes in Computer Science (LNCS)

Journal ref: 16th International Conference on Parallel Problem Solving from Nature, PPSN 2020, 433-447, LNCS vol. 12269

arXiv:2006.09996 [pdf, other]

Dynamic Vehicle Routing Problem: A Monte Carlo approach

Authors: Michał Okulewicz, Jacek Mańdziuk

Abstract: In this work we solve the Dynamic Vehicle Routing Problem (DVRP). DVRP is a modification of the Vehicle Routing Problem, in which the number and location of the clients' requests (cities) might not be known at the beginning of the working day. Additionally, all requests must be served during one working day by a fleet of vehicles with limited capacity. In this work we propose a Monte Carlo method (MCTree), which directly approaches the dynamic nature of arriving requests in the DVRP. The method is also hybridized (MCTree+PSO) with our previous Two-Phase Multi-swarm Particle Swarm Optimization (2MPSO) algorithm. Our method is based on two assumptions. First, that we know a bounding rectangle of the area in which the requests might appear. Second, that the initial requests' sizes and frequency of appearance are representative of the yet unknown clients' requests. In order to solve the DVRP we divide the working day into several time slices, in each of which we solve a static problem. In our Monte Carlo approach we randomly generate the unknown clients' requests with uniform spatial distribution over the bounding rectangle and requests' sizes uniformly sampled from the already known requests' sizes. The solution proposal is constructed with the application of a clustering algorithm and a route construction algorithm. The MCTree method is tested on a well established set of benchmarks proposed by Kilby et al. and is compared with the results achieved by applying our previous 2MPSO algorithm and other literature results. The proposed MCTree approach achieves a better time to quality trade-off than plain heuristic algorithms. Moreover, a hybrid MCTree+PSO approach achieves a better time to quality trade-off than 2MPSO for small optimization time limits, making the hybrid a good candidate for handling real world scale goods delivery problems.

Submitted 15 June, 2020; originally announced June 2020.

Journal ref: Information Technologies: Research and Their Interdisciplinary Applications 2015, 119-138, Institute of Computer Science Polish Academy of Sciences, ISBN 978-83-63159-23-8

arXiv:2006.08809 [pdf, ps, other]

A Particle Swarm Optimization hyper-heuristic for the Dynamic Vehicle Routing Problem

Authors: Michał Okulewicz, Jacek Mańdziuk

Abstract: This paper presents a method for choosing a Particle Swarm Optimization based optimizer for the Dynamic Vehicle Routing Problem on the basis of the initially available data of a given problem instance. The optimization algorithm is chosen on the basis of a prediction made by a linear model trained on that data and the relative results obtained by the optimization algorithms. The achieved results suggest that such a model can be used in a hyper-heuristic approach as it improved the average results, obtained on the set of benchmark instances, by choosing the appropriate algorithm in 82% of significant cases. Two leading multi-swarm Particle Swarm Optimization based algorithms for solving the Dynamic Vehicle Routing Problem are used as the basic optimization algorithms: the Multi-Environmental Multi-Swarm Optimizer of Khouadjia et al. and the authors' 2-Phase Multiswarm Particle Swarm Optimization.

Submitted 15 June, 2020; originally announced June 2020.

Comments: 14 pages, presented at BIOMA 2016 conference, Bled, Slovenia

Journal ref: Proceedings of Bioinspired Optimization Methods and their Applications, 215-227, Jozef Stefan Institute, 2016

arXiv:2004.10705 [pdf, other]

doi 10.1007/978-3-030-58112-1_34

A Committee of Convolutional Neural Networks for Image Classication in the Concurrent Presence of Feature and Label Noise

Authors: Stanisław Kaźmierczak, Jacek Mańdziuk

Abstract: Image classification has become a ubiquitous task. Models trained on good quality data achieve accuracy which in some application domains is already above human-level performance. Unfortunately, real-world data are quite often degraded by noise existing in features and/or labels. There are quite a few papers that handle the problem of either feature or label noise separately. However, to the best of our knowledge, this piece of research is the first attempt to address the problem of concurrent occurrence of both types of noise. Based on the MNIST, CIFAR-10 and CIFAR-100 datasets, we experimentally show that the margin by which committees beat single models increases along with the noise level, regardless of whether it is an attribute or label disruption. Thus, it is legitimate to apply ensembles to noisy images with noisy labels. The aforementioned committees' advantage over single models is positively correlated with dataset difficulty level as well. We propose three committee selection algorithms that outperform a strong baseline algorithm which relies on an ensemble of individual (nonassociated) best models.

Submitted 22 June, 2020; v1 submitted 18 April, 2020; originally announced April 2020.

Journal ref: 16th International Conference on Parallel Problem Solving from Nature, PPSN 2020, 498-511, LNCS vol. 12269

arXiv:2002.12485 [pdf, other]

Generalized Self-Adapting Particle Swarm Optimization algorithm with archive of samples

Authors: Michał Okulewicz, Mateusz Zaborski, Jacek Mańdziuk

Abstract: In this paper we enhance the Generalized Self-Adapting Particle Swarm Optimization algorithm (GAPSO), initially introduced at the Parallel Problem Solving from Nature 2018 conference, and investigate its properties. The research on GAPSO is underpinned by the two following assumptions: (1) it is possible to achieve good performance of an optimization algorithm through utilization of all of the gathered samples, (2) the best performance can be accomplished by means of a combination of specialized sampling behaviors (Particle Swarm Optimization, Differential Evolution, and locally fitted square functions). From a software engineering point of view, GAPSO considers a standard Particle Swarm Optimization algorithm as an ideal starting point for creating a general-purpose global optimization framework. Within this framework hybrid optimization algorithms are developed, and various additional techniques (like algorithm restart management or adaptation schemes) are tested. The paper introduces a new version of the algorithm, abbreviated as M-GAPSO. In comparison with the original GAPSO formulation it includes the following four features: a global restart management scheme, samples gathering within an R-Tree based index (archive/memory of samples), adaptation of a sampling behavior based on a global particle performance, and a specific approach to local search. The above-mentioned enhancements resulted in improved performance of M-GAPSO over GAPSO, observed both on the COCO BBOB testbed and in the black-box optimization competition BBComp. Also, for lower dimensionality functions (up to 5D) the results of M-GAPSO are better than or comparable to the state-of-the-art version of CMA-ES (namely the KL-BIPOP-CMA-ES algorithm presented at the GECCO 2017 conference).

Submitted 27 February, 2020; originally announced February 2020.

Comments: preprint

arXiv:1912.03564 [pdf, other]

Anchoring Theory in Sequential Stackelberg Games

Authors: Jan Karwowski, Jacek Mańdziuk, Adam Żychowski

Abstract: An underlying assumption of Stackelberg Games (SGs) is perfect rationality of the players. However, in real-life situations (which are often modeled by SGs) the followers (terrorists, thieves, poachers or smugglers), like humans in general, may not act in a perfectly rational way, as their decisions may be affected by biases of various kinds which bound the rationality of their decisions. One of the popular models of bounded rationality (BR) is Anchoring Theory (AT), which claims that humans have a tendency to flatten probabilities of available options, i.e. they perceive a distribution of these probabilities as being closer to the uniform distribution than it really is. This paper proposes an efficient formulation of AT in sequential extensive-form SGs (named ATSG), suitable for Mixed-Integer Linear Program (MILP) solution methods. ATSG is implemented in three MILP/LP-based state-of-the-art methods for solving sequential SGs and two recently introduced non-MILP approaches: one relying on Monte Carlo sampling (O2UCT) and the other one (EASG) employing Evolutionary Algorithms. Experimental evaluation indicates that both non-MILP heuristic approaches scale better in time than MILP solutions while providing optimal or close-to-optimal solutions. Apart from competitive time scalability, an additional asset of non-MILP methods is the flexibility of potential BR formulations they are able to incorporate. While MILP approaches accept BR formulations with linear constraints only, no restrictions on the BR form are imposed in either of the two non-MILP methods.

Submitted 24 February, 2020; v1 submitted 7 December, 2019; originally announced December 2019.

MSC Class: I.2.8 ACM Class: I.2.8

arXiv:1911.05706 [pdf, other]

A Generic Metaheuristic Approach to Sequential Security Games

Authors: Adam Żychowski, Jacek Mańdziuk

Abstract: The paper introduces a generic approach to solving Sequential Security Games (SGs) which utilizes Evolutionary Algorithms. Formulation of the method (named EASG) is general and largely game-independent, which allows for its application to a wide range of SGs with only minor adjustments addressing game specificity. Comprehensive experiments performed on 3 different types of games (with 300 instances in total) demonstrate robustness and stability of EASG, manifested by repeatably achieving optimal or near-optimal solutions in the vast majority of the cases. The main advantage of EASG is time efficiency. The method scales visibly better than state-of-the-art approaches and consequently can be applied to SG instances which are beyond the capabilities of the existing methods. Furthermore, due to its anytime characteristics, EASG is very well suited for time-critical applications, as the method can be terminated at any moment and still provide a reasonably good solution: the best one found so far.

Submitted 13 November, 2019; originally announced November 2019.

Journal ref: International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2020, 2089-2091, ISBN: 978-1-4503-7518-4

arXiv:1909.03934 [pdf, other]

doi 10.1609/aaai.v34i02.5578

Double-oracle sampling method for Stackelberg Equilibrium approximation in general-sum extensive-form games

Authors: Jan Karwowski, Jacek Mańdziuk

Abstract: The paper presents a new method for approximating Strong Stackelberg Equilibrium in general-sum sequential games with imperfect information and perfect recall. The proposed approach is generic as it does not rely on any specific properties of a particular game model. The method is based on iterative interleaving of the two following phases: (1) guided Monte Carlo Tree Search sampling of the Follower's strategy space and (2) building the Leader's behavior strategy tree for which the sampled Follower's strategy is an optimal response. The above solution scheme is evaluated with respect to the expected Leader's utility and time requirements on three sets of interception games with variable characteristics, played on graphs. A comparison with three state-of-the-art MILP/LP-based methods shows that in the vast majority of test cases the proposed simulation-based approach leads to optimal Leader's strategies, while outperforming the competing methods in terms of time scalability and memory requirements.

Submitted 9 September, 2019; originally announced September 2019.

ACM Class: I.2.8

Journal ref: Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, 2054-2061

arXiv:1711.06763 [pdf, other]

doi 10.1016/j.knosys.2018.05.012

Addressing Expensive Multi-objective Games with Postponed Preference Articulation via Memetic Co-evolution

Authors: Adam Żychowski, Abhishek Gupta, Jacek Mańdziuk, Yew Soon Ong

Abstract: This paper presents algorithmic and empirical contributions demonstrating that the convergence characteristics of a co-evolutionary approach to tackle Multi-Objective Games (MOGs) with postponed preference articulation can often be hampered due to the possible emergence of the so-called Red Queen effect. Accordingly, it is hypothesized that the convergence characteristics can be significantly improved through the incorporation of memetics (local solution refinements as a form of lifelong learning), as a promising means of mitigating (or at least suppressing) the Red Queen phenomenon by providing a guiding hand to the purely genetic mechanisms of co-evolution. Our practical motivation is to address MOGs of a time-sensitive nature that are characterized by computationally expensive evaluations, wherein there is a natural need to reduce the total number of true function evaluations consumed in achieving good quality solutions. To this end, we propose novel enhancements to co-evolutionary approaches for tackling MOGs, such that memetic local refinements can be efficiently applied on evolved candidate strategies by searching on computationally cheap surrogate payoff landscapes (that preserve postponed preference conditions). The efficacy of the proposal is demonstrated on a suite of test MOGs that have been designed.

Submitted 17 November, 2017; originally announced November 2017.

Journal ref: Knowledge-Based Systems, 2018, vol. 154, 17-31

Showing 1–31 of 31 results for author: Mańdziuk, J