Search | arXiv e-print repository

arXiv:2406.19066 [pdf, other]

Dancing in the Shadows: Harnessing Ambiguity for Fairer Classifiers

Authors: Ainhize Barrainkua, Paula Gordaliza, Jose A. Lozano, Novi Quadrianto

Abstract: This paper introduces a novel approach to bolster algorithmic fairness in scenarios where sensitive information is only partially known. In particular, we propose to leverage instances with uncertain identity with regards to the sensitive attribute to train a conventional machine learning classifier. The enhanced fairness observed in the final predictions of this classifier highlights the promising potential of prioritizing ambiguity (i.e., non-normativity) as a means to improve fairness guarantees in real-world classification tasks.

Submitted 27 June, 2024; originally announced June 2024.

MSC Class: 68T01; 68T37 ACM Class: A.0; I.2

Journal ref: Presented at the XI Symposium of Theory and Applications of Data Mining from the XX Conference of the Spanish Association for Artificial Intelligence CAEPIA 2024

arXiv:2403.13740 [pdf, other]

Uncertainty-Aware Explanations Through Probabilistic Self-Explainable Neural Networks

Authors: Jon Vadillo, Roberto Santana, Jose A. Lozano, Marta Kwiatkowska

Abstract: The lack of transparency of Deep Neural Networks continues to be a limitation that severely undermines their reliability and usage in high-stakes applications. Promising approaches to overcome such limitations are Prototype-Based Self-Explainable Neural Networks (PSENNs), whose predictions rely on the similarity between the input at hand and a set of prototypical representations of the output classes, offering therefore a deep, yet transparent-by-design, architecture. So far, such models have been designed by considering pointwise estimates for the prototypes, which remain fixed after the learning phase of the model. In this paper, we introduce a probabilistic reformulation of PSENNs, called Prob-PSENN, which replaces point estimates for the prototypes with probability distributions over their values. This provides not only a more flexible framework for an end-to-end learning of prototypes, but can also capture the explanatory uncertainty of the model, which is a missing feature in previous approaches. In addition, since the prototypes determine both the explanation and the prediction, Prob-PSENNs allow us to detect when the model is making uninformed or uncertain predictions, and to obtain valid explanations for them. Our experiments demonstrate that Prob-PSENNs provide more meaningful and robust explanations than their non-probabilistic counterparts, thus enhancing the explainability and reliability of the models.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2311.09369 [pdf, other]

Time-dependent Probabilistic Generative Models for Disease Progression

Authors: Onintze Zaballa, Aritz Pérez, Elisa Gómez-Inhiesto, Teresa Acaiturri-Ayesta, Jose A. Lozano

Abstract: Electronic health records contain valuable information for monitoring patients' health trajectories over time. Disease progression models have been developed to understand the underlying patterns and dynamics of diseases using these data as sequences. However, analyzing temporal data from EHRs is challenging due to the variability and irregularities present in medical records. We propose a Markovian generative model of treatments developed to (i) model the irregular time intervals between medical events; (ii) classify treatments into subtypes based on the patient sequence of medical events and the time intervals between them; and (iii) segment treatments into subsequences of disease progression patterns. We assume that sequences have an associated structure of latent variables: a latent class representing the different subtypes of treatments; and a set of latent stages indicating the phase of progression of the treatments. We use the Expectation-Maximization algorithm to learn the model, which is efficiently solved with a dynamic programming-based method. Various parametric models have been employed to model the time intervals between medical events during the learning process, including the geometric, exponential, and Weibull distributions. The results demonstrate the effectiveness of our model in recovering the underlying model from data and accurately modeling the irregular time intervals between medical actions.

Submitted 15 November, 2023; originally announced November 2023.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 17 pages

arXiv:2310.15974 [pdf, ps, other]

Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees

Authors: Verónica Álvarez, Santiago Mazuelas, Jose A. Lozano

Abstract: For a sequence of classification tasks that arrive over time, it is common that tasks are evolving in the sense that consecutive tasks often have a higher similarity. The incremental learning of a growing sequence of tasks holds promise to enable accurate classification even with few samples per task by leveraging information from all the tasks in the sequence (forward and backward learning). However, existing techniques developed for continual learning and concept drift adaptation are either designed for tasks with time-independent similarities or only aim to learn the last task in the sequence. This paper presents incremental minimax risk classifiers (IMRCs) that effectively exploit forward and backward learning and account for evolving tasks. In addition, we analytically characterize the performance improvement provided by forward and backward learning in terms of the tasks' expected quadratic change and the number of tasks. The experimental evaluation shows that IMRCs can result in a significant performance improvement, especially for reduced sample sizes.

Submitted 24 October, 2023; originally announced October 2023.

arXiv:2303.02801 [pdf, ps, other]

Neuroevolutionary algorithms driven by neuron coverage metrics for semi-supervised classification

Authors: Roberto Santana, Ivan Hidalgo-Cenalmor, Unai Garciarena, Alexander Mendiburu, Jose Antonio Lozano

Abstract: In some machine learning applications the availability of labeled instances for supervised classification is limited while unlabeled instances are abundant. Semi-supervised learning algorithms deal with these scenarios and attempt to exploit the information contained in the unlabeled examples. In this paper, we address the question of how to evolve neural networks for semi-supervised problems. We introduce neuroevolutionary approaches that exploit unlabeled instances by using neuron coverage metrics computed on the neural network architecture encoded by each candidate solution. Neuron coverage metrics resemble code coverage metrics used to test software, but are oriented to quantify how the different neural network components are covered by test instances. In our neuroevolutionary approach, we define fitness functions that combine classification accuracy computed on labeled examples and neuron coverage metrics evaluated using unlabeled examples. We assess the impact of these functions on semi-supervised problems with a varying amount of labeled instances. Our results show that the use of neuron coverage metrics helps neuroevolution to become less sensitive to the scarcity of labeled data, and can lead in some cases to a more robust generalization of the learned classifiers.

Submitted 5 March, 2023; originally announced March 2023.

arXiv:2302.01079 [pdf, other]

Uncertainty in Fairness Assessment: Maintaining Stable Conclusions Despite Fluctuations

Authors: Ainhize Barrainkua, Paula Gordaliza, Jose A. Lozano, Novi Quadrianto

Abstract: Several recent works encourage the use of a Bayesian framework when assessing performance and fairness metrics of a classification algorithm in a supervised setting. We propose the Uncertainty Matters (UM) framework that generalizes a Beta-Binomial approach to derive the posterior distribution of any criteria combination, allowing stable performance assessment in a bias-aware setting. We suggest modeling the confusion matrix of each demographic group using a Multinomial distribution updated through a Bayesian procedure. We extend UM to be applicable under the popular K-fold cross-validation procedure. Experiments highlight the benefits of UM over classical evaluation frameworks regarding informativeness and stability.

Submitted 2 February, 2023; originally announced February 2023.

Comments: 25 pages (including references and appendix), 10 figures. Submitted to ICML 2023

MSC Class: 62P99 ACM Class: G.3

arXiv:2211.07530 [pdf, other]

A Survey on Preserving Fairness Guarantees in Changing Environments

Authors: Ainhize Barrainkua, Paula Gordaliza, Jose A. Lozano, Novi Quadrianto

Abstract: Human lives are increasingly being affected by the outcomes of automated decision-making systems and it is essential for the latter to be, not only accurate, but also fair. The literature of algorithmic fairness has grown considerably over the last decade, where most of the approaches are evaluated under the strong assumption that the train and test samples are independently and identically drawn from the same underlying distribution. However, in practice, dissimilarity between the training and deployment environments exists, which compromises the performance of the decision-making algorithm as well as its fairness guarantees in the deployment data. There is an emergent research line that studies how to preserve fairness guarantees when the data generating processes differ between the source (train) and target (test) domains, which is growing remarkably. With this survey, we aim to provide a wide and unifying overview on the topic. For such purpose, we propose a taxonomy of the existing approaches for fair classification under distribution shift, highlight benchmarking alternatives, point out the relation with other similar research fields and eventually, identify future venues of research.

Submitted 14 November, 2022; originally announced November 2022.

Comments: 29 pages, 6 figures. Submitted to ACM Computing Surveys: Special Issue on Trustworthy AI

MSC Class: 68-02; 68T05; 68T37; ACM Class: A.1; I.0

arXiv:2205.15942 [pdf, ps, other]

Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees

Authors: Verónica Álvarez, Santiago Mazuelas, Jose A. Lozano

Abstract: The statistical characteristics of instance-label pairs often change with time in practical scenarios of supervised classification. Conventional learning techniques adapt to such concept drift accounting for a scalar rate of change by means of a carefully chosen learning rate, forgetting factor, or window size. However, the time changes in common scenarios are multidimensional, i.e., different statistical characteristics often change in a different manner. This paper presents adaptive minimax risk classifiers (AMRCs) that account for multidimensional time changes by means of a multivariate and high-order tracking of the time-varying underlying distribution. In addition, differently from conventional techniques, AMRCs can provide computable tight performance guarantees. Experiments on multiple benchmark datasets show the classification improvement of AMRCs compared to the state-of-the-art and the reliability of the presented performance guarantees.

Submitted 31 May, 2022; originally announced May 2022.

arXiv:2205.12943 [pdf, other]

Transitions from P to NP-hardness: the case of the Linear Ordering Problem

Authors: Anne Elorza, Leticia Hernando, Jose A. Lozano

Abstract: In this paper we evaluate how constructive heuristics degrade when a problem transits from P to NP-hard. This is done by means of the linear ordering problem. More specifically, for this problem we prove that the objective function can be expressed as the sum of two objective functions, one of which is associated with a P problem (an exact polynomial time algorithm is proposed to solve it), while the other is associated with an NP-hard problem. We study how different constructive algorithms whose behaviour only depends on univariate information perform depending on the contribution of the P or NP-hard components of the problem. A number of experiments are conducted with reduced dimensions, where the global optimum of the problems is known, giving different weights to the NP-hard component, while the weight of the P component is fixed. It is observed how the performance of the constructive algorithms gets worse as the weight given to the NP-hard component increases.

Submitted 25 May, 2022; originally announced May 2022.

arXiv:2107.01943 [pdf, other]

When and How to Fool Explainable Models (and Humans) with Adversarial Examples

Authors: Jon Vadillo, Roberto Santana, Jose A. Lozano

Abstract: Reliable deployment of machine learning models such as neural networks continues to be challenging due to several limitations. Some of the main shortcomings are the lack of interpretability and the lack of robustness against adversarial examples or out-of-distribution inputs. In this exploratory review, we explore the possibilities and limits of adversarial attacks for explainable machine learning models. First, we extend the notion of adversarial examples to fit in explainable machine learning scenarios, in which the inputs, the output classifications and the explanations of the model's decisions are assessed by humans. Next, we propose a comprehensive framework to study whether (and how) adversarial examples can be generated for explainable models under human assessment, introducing and illustrating novel attack paradigms. In particular, our framework considers a wide range of relevant yet often ignored factors such as the type of problem, the user expertise or the objective of the explanations, in order to identify the attack strategies that should be adopted in each scenario to successfully deceive the model (and the human). The intention of these contributions is to serve as a basis for a more rigorous and realistic study of adversarial examples in the field of explainable machine learning.

Submitted 7 July, 2023; v1 submitted 5 July, 2021; originally announced July 2021.

Comments: Updated version. 43 pages, 9 figures, 4 tables

arXiv:2105.01878 [pdf, other]

doi 10.1145/3316782.3322764

The EMPATHIC Project: Mid-term Achievements

Authors: M. I. Torres, J. M. Olaso, C. Montenegro, R. Santana, A. Vázquez, R. Justo, J. A. Lozano, S. Schlögl, G. Chollet, N. Dugan, M. Irvine, N. Glackin, C. Pickard, A. Esposito, G. Cordasco, A. Troncone, D. Petrovska-Delacretaz, A. Mtibaa, M. A. Hmani, M. S. Korsnes, L. J. Martinussen, S. Escalera, C. Palmero Cantariño, O. Deroo, O. Gordeeva, et al. (4 additional authors not shown)

Abstract: The goal of active aging is to promote changes in the elderly community so as to maintain an active, independent and socially-engaged lifestyle. Technological advancements currently provide the necessary tools to foster and monitor such processes. This paper reports on mid-term achievements of the European H2020 EMPATHIC project, which aims to research, innovate, explore and validate new interaction paradigms and platforms for future generations of personalized virtual coaches to assist the elderly and their carers to reach the active aging goal, in the vicinity of their home. The project focuses on evidence-based, user-validated research and integration of intelligent technology, and context sensing methods through automatic voice, eye and facial analysis, integrated with visual and spoken dialogue system capabilities. In this paper, we describe the current status of the system, with a special emphasis on its components and their integration, the creation of a Wizard of Oz platform, and findings gained from user interaction studies conducted throughout the first 18 months of the project.

Submitted 5 May, 2021; originally announced May 2021.

Comments: 12 pages

arXiv:2012.14352 [pdf, other]

Analysis of Dominant Classes in Universal Adversarial Perturbations

Authors: Jon Vadillo, Roberto Santana, Jose A. Lozano

Abstract: The reasons why Deep Neural Networks are susceptible to being fooled by adversarial examples remain an open discussion. Indeed, many different strategies can be employed to efficiently generate adversarial attacks, some of them relying on different theoretical justifications. Among these strategies, universal (input-agnostic) perturbations are of particular interest, due to their capability to fool a network independently of the input in which the perturbation is applied. In this work, we investigate an intriguing phenomenon of universal perturbations, which has been reported previously in the literature, yet without a proven justification: universal perturbations change the predicted classes for most inputs into one particular (dominant) class, even if this behavior is not specified during the creation of the perturbation. In order to justify the cause of this phenomenon, we propose a number of hypotheses and experimentally test them using a speech command classification problem in the audio domain as a testbed. Our analyses reveal interesting properties of universal perturbations, suggest new methods to generate such attacks and provide an explanation of dominant classes, under both a geometric and a data-feature perspective.

Submitted 11 January, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

Comments: 20 pages, 10 figures, 4 tables

arXiv:2011.14721 [pdf, ps, other]

doi 10.1109/TPWRS.2021.3050837

Probabilistic Load Forecasting Based on Adaptive Online Learning

Authors: Verónica Álvarez, Santiago Mazuelas, José A. Lozano

Abstract: Load forecasting is crucial for multiple energy management tasks such as scheduling generation capacity, planning supply and demand, and minimizing energy trade costs. Such relevance has increased even more in recent years due to the integration of renewable energies, electric cars, and microgrids. Conventional load forecasting techniques obtain single-value load forecasts by exploiting consumption patterns of past load demand. However, such techniques cannot assess intrinsic uncertainties in load demand, and cannot capture dynamic changes in consumption patterns. To address these problems, this paper presents a method for probabilistic load forecasting based on the adaptive online learning of hidden Markov models. We propose learning and forecasting techniques with theoretical guarantees, and experimentally assess their performance in multiple scenarios. In particular, we develop adaptive online learning techniques that update model parameters recursively, and sequential prediction techniques that obtain probabilistic forecasts using the most recent parameters. The performance of the method is evaluated using multiple datasets corresponding with regions that have different sizes and display assorted time-varying consumption patterns. The results show that the proposed method can significantly improve the performance of existing techniques for a wide range of scenarios.

Submitted 15 January, 2021; v1 submitted 30 November, 2020; originally announced November 2020.

Comments: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:2011.02743 [pdf, other]

A revisited branch-and-cut algorithm for large-scale orienteering problems

Authors: Gorka Kobeaga, María Merino, Jose A. Lozano

Abstract: The orienteering problem is a route optimization problem which consists in finding a simple cycle that maximizes the total collected profit subject to a maximum distance limitation. In the last few decades, the occurrence of this problem in real-life applications has boosted the development of many heuristic algorithms to solve it. However, during the same period, not much research has been devoted to the field of exact algorithms for the orienteering problem. The aim of this work is to develop an exact method which is able to obtain optimality certification in a wider set of instances than with previous methods, or to improve the lower and upper bounds in its disability. We propose a revisited version of the branch-and-cut algorithm for the orienteering problem which includes new contributions in the separation algorithms of inequalities stemming from the cycle problem, in the separation loop, in the variables pricing, and in the calculation of the lower and upper bounds of the problem. Our proposal is compared to three state-of-the-art algorithms on 258 benchmark instances with up to 7397 nodes. The computational experiments show the relevance of the designed components where 18 new optima, 76 new best-known solutions and 85 new upper-bound values were obtained.

Submitted 13 January, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

arXiv:2004.14574 [pdf, other]

doi 10.1007/s10479-021-04210-0

On Solving Cycle Problems with Branch-and-Cut: Extending Shrinking and Exact Subcycle Elimination Separation Algorithms

Authors: Gorka Kobeaga, María Merino, Jose A. Lozano

Abstract: In this paper, we extend techniques developed in the context of the Travelling Salesperson Problem for cycle problems. Particularly, we study the shrinking of support graphs and the exact algorithms for subcycle elimination separation problems. The efficient application of the considered techniques has proved to be essential in the Travelling Salesperson Problem when solving large size problems by Branch-and-Cut, and this has been the motivation behind this work. Regarding the shrinking of support graphs, we prove the validity of the Padberg-Rinaldi general shrinking rules and the Crowder-Padberg subcycle-safe shrinking rules. Concerning the subcycle separation problems, we extend two exact separation algorithms, the Dynamic Hong and the Extended Padberg-Grötschel algorithms, which are shown to be superior to the ones used so far in the literature of cycle problems. The proposed techniques are empirically tested in 24 subcycle elimination problem instances generated by solving the Orienteering Problem (involving up to 15112 vertices) with Branch-and-Cut. The experiments suggest the relevance of the proposed techniques for cycle problems. The obtained average speedup for the subcycle separation problems in the Orienteering Problem when the proposed techniques are used together is around 50 times in medium-sized instances and around 250 times in large-sized instances.

Submitted 6 September, 2021; v1 submitted 29 April, 2020; originally announced April 2020.

Comments: 23 pages + 3 appendices

MSC Class: 05C38; 90C10; 90C57

arXiv:2004.06383 [pdf, other]

Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions

Authors: Jon Vadillo, Roberto Santana, Jose A. Lozano

Abstract: Despite the remarkable performance and generalization levels of deep learning models in a wide range of artificial intelligence tasks, it has been demonstrated that these models can be easily fooled by the addition of imperceptible yet malicious perturbations to natural inputs. These altered inputs are known in the literature as adversarial examples. In this paper, we propose a novel probabilistic framework to generalize and extend adversarial attacks in order to produce a desired probability distribution for the classes when we apply the attack method to a large number of inputs. This novel attack paradigm provides the adversary with greater control over the target model, thereby exposing, in a wide range of scenarios, threats against deep learning models that cannot be conducted by the conventional paradigms. We introduce four different strategies to efficiently generate such attacks, and illustrate our approach by extending multiple adversarial attack algorithms. We also experimentally validate our approach for the spoken command classification task and the Tweet emotion classification task, two exemplary machine learning problems in the audio and text domain, respectively. Our results demonstrate that we can closely approximate any probability distribution for the classes while maintaining a high fooling rate and even prevent the attacks from being detected by label-shift detection methods.

Submitted 25 January, 2023; v1 submitted 14 April, 2020; originally announced April 2020.

Comments: Final version as accepted in JMLR. Attribution requirements are provided at http://jmlr.org/papers/v24/21-0326.html

Journal ref: Journal of Machine Learning Research, 24(15):1-42, 2023

arXiv:2002.04236 [pdf, other]

A review on outlier/anomaly detection in time series data

Authors: Ane Blázquez-García, Angel Conde, Usue Mori, Jose A. Lozano

Abstract: Recent advances in technology have brought major breakthroughs in data collection, enabling a large amount of data to be gathered over time and thus generating time series. Mining this data has become an important task for researchers and practitioners in the past few years, including the detection of outliers or anomalies that may represent errors or events of interest. This review aims to provide a structured and comprehensive state-of-the-art on outlier detection techniques in the context of time series. To this end, a taxonomy is presented based on the main aspects that characterize an outlier detection technique.

Submitted 11 February, 2020; originally announced February 2020.

Comments: 32 pages, 21 figures, submitted to ACM Computing Surveys (CSUR)

arXiv:1910.05173 [pdf, other]

Evolving Gaussian Process kernels from elementary mathematical expressions

Authors: Ibai Roman, Roberto Santana, Alexander Mendiburu, Jose A. Lozano

Abstract: Choosing the most adequate kernel is crucial in many Machine Learning applications. Gaussian Process is a state-of-the-art technique for regression and classification that heavily relies on a kernel function. However, in the Gaussian Process literature, kernels have usually been either ad hoc designed, selected from a predefined set, or searched for in a space of compositions of kernels which have been defined a priori. In this paper, we propose a Genetic-Programming algorithm that represents a kernel function as a tree of elementary mathematical expressions. By means of this representation, a wider set of kernels can be modeled, where potentially better solutions can be found, although new challenges also arise. The proposed algorithm is able to overcome these difficulties and find kernels that accurately model the characteristics of the data. This method has been tested in several real-world time-series extrapolation problems, improving the state-of-the-art results while reducing the complexity of the kernels.

Submitted 14 October, 2019; v1 submitted 11 October, 2019; originally announced October 2019.

arXiv:1905.10852 [pdf, ps, other]

Taxonomization of Combinatorial Optimization Problems in Fourier Space

Authors: Anne Elorza, Leticia Hernando, Jose A. Lozano

Abstract: We propose and develop a novel framework for analyzing permutation-based combinatorial optimization problems, which could eventually be extended to other types of problems. Our approach is based on the decomposition of the objective functions via the generalized Fourier transform. We characterize the Fourier coefficients of three different problems: the Traveling Salesman Problem, the Linear Ordering Problem and the Quadratic Assignment Problem. This implies that these three problems can be viewed in a homogeneous space, such as the Fourier domain. Our final target would be to create a taxonomy of problem instances, so that functions which are treated similarly under the same search algorithms are grouped together. For this purpose, we simplify the representations of the objective functions by considering them as permutations of the elements of the search space, and study the permutations that are associated with different problems.

Submitted 26 May, 2019; originally announced May 2019.

Comments: 42 pages, 3 appendices

arXiv:1904.00977 [pdf, ps, other]

Sentiment analysis with genetically evolved Gaussian kernels

Authors: Ibai Roman, Alexander Mendiburu, Roberto Santana, Jose A. Lozano

Abstract: Sentiment analysis consists of evaluating opinions or statements from the analysis of text. Among the methods used to estimate the degree to which a text expresses a given sentiment are those based on Gaussian Processes. However, traditional Gaussian Process methods use a predefined kernel with hyperparameters that can be tuned but whose structure cannot be adapted. In this paper, we propose the application of Genetic Programming for evolving Gaussian Process kernels that are more precise for sentiment analysis. We use a very flexible representation of kernels combined with a multi-objective approach that simultaneously considers two quality metrics and the computational time spent by the kernels. Our results show that the algorithm can outperform Gaussian Processes with traditional kernels for some of the sentiment analysis tasks considered.

Submitted 14 October, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

arXiv:1809.06106 [pdf, ps, other]

doi 10.1109/TCYB.2020.2968301

Merge Non-Dominated Sorting Algorithm for Many-Objective Optimization

Authors: Javier Moreno, Daniel Rodriguez, Antonio Nebro, Jose A. Lozano

Abstract: Many Pareto-based multi-objective evolutionary algorithms need to rank the solutions of the population in each iteration according to the dominance principle, which can become a costly operation, particularly when dealing with many-objective optimization problems. In this paper, we present a new efficient algorithm for computing the non-dominated sorting procedure, called Merge Non-Dominated Sorting (MNDS), which has a best-case computational complexity of $Θ(N \log N)$ and a worst-case computational complexity of $Θ(MN^2)$. Our approach is based on the computation of the dominance set of each solution by taking advantage of the characteristics of the merge sort algorithm. We compare the MNDS against four well-known techniques that can be considered as the state-of-the-art. The results indicate that the MNDS algorithm outperforms the other techniques in terms of number of comparisons as well as the total running time.

Submitted 17 September, 2018; originally announced September 2018.
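
For readers unfamiliar with the ranking step that the abstract above refers to, the following is a minimal sketch of a naive non-dominated sort. It is not the MNDS algorithm and does not use merge sort; it only illustrates what "ranking by dominance" produces. Minimisation of every objective is assumed, and the function names `dominates` and `naive_non_dominated_sort` are hypothetical.

```python
def dominates(a, b):
    """Return True if objective vector a dominates b (minimisation assumed)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def naive_non_dominated_sort(points):
    """Peel off successive fronts of mutually non-dominated points.

    Returns a list of fronts, each a sorted list of indices into `points`.
    This is a simple baseline, not the MNDS procedure from the paper.
    """
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical two-objective population.
population = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
fronts = naive_non_dominated_sort(population)
```

Each pass compares every remaining point against every other, which is the kind of repeated pairwise comparison that more efficient sorting procedures aim to reduce.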

arXiv:1806.04509 [pdf, ps, other]

A review on distance based time series classification

Authors: Amaia Abanda, Usue Mori, Jose A. Lozano

Abstract: Time series classification is a growing research topic due to the vast amount of time series data that are being created over a wide variety of fields. The particularity of the data makes it a challenging task and different approaches have been taken, including the distance based approach. 1-NN has been a widely used method within distance based time series classification due to its simplicity and still good performance. However, its supremacy may be attributed to being able to use specific distances for time series within the classification process and not to the classifier itself. With the aim of exploiting these distances within more complex classifiers, new approaches have arisen in the past few years that are competitive with, or outperform, the 1-NN based approaches. In some cases, these new methods use the distance measure to transform the series into feature vectors, bridging the gap between time series and traditional classifiers. In other cases, the distances are employed to obtain a time series kernel and enable the use of kernel methods for time series classification. One of the main challenges is that a kernel function must be positive semi-definite, a matter that is also addressed within this review. The presented review includes a taxonomy of all those methods that aim to classify time series using a distance based approach, as well as a discussion of the strengths and weaknesses of each method.

Submitted 12 June, 2018; originally announced June 2018.

arXiv:1801.02949 [pdf, other]

An efficient K-means clustering algorithm for massive data

Authors: Marco Capó, Aritz Pérez, Jose A. Lozano

Abstract: The analysis of continuously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their ease of implementation and relatively low computational cost. Among these algorithms, the K-means algorithm stands out as the most popular approach, despite its high dependency on the initial conditions and the fact that it might not scale well on massive datasets. In this article, we propose a recursive and parallel approximation to the K-means algorithm that scales well on both the number of instances and dimensionality of the problem, without affecting the quality of the approximation. In order to achieve this, instead of analyzing the entire dataset, we work on small weighted sets of points that mostly intend to extract information from those regions where it is harder to determine the correct cluster assignment of the original instances. In addition to different theoretical properties, which explain the reasoning behind the algorithm, experimental results indicate that our method outperforms the state-of-the-art in terms of the trade-off between number of distance computations and the quality of the solution obtained.

Submitted 9 January, 2018; originally announced January 2018.

arXiv:1608.08984 [pdf, other]

Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores

Authors: Jonathan Ortigosa-Hernández, Iñaki Inza, Jose A. Lozano

Abstract: Although a great methodological effort has been invested in proposing competitive solutions to the class-imbalance problem, little effort has been made in pursuing a theoretical understanding of this matter. In order to shed some light on this topic, we perform, through a novel framework, an exhaustive analysis of the adequateness of the most commonly used performance scores to assess this complex scenario. We conclude that using unweighted Hölder means with exponent $p \leq 1$ to average the recalls of all the classes produces adequate scores which are capable of determining whether a classifier is competitive. Then, we review the major solutions presented in the class-imbalance literature. Since any learning task can be defined as an optimisation problem where a loss function, usually connected to a particular score, is minimised, our goal, here, is to find whether the learning tasks found in the literature are also oriented to maximise the previously detected adequate scores. We conclude that they usually maximise the unweighted Hölder mean with $p = 1$ (a-mean). Finally, we provide bounds on the values of the studied performance scores which guarantee a classifier with a higher recall than the random classifier in each and every class.

Submitted 31 August, 2016; originally announced August 2016.
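
As a small illustration of the scores discussed in the abstract above, the unweighted Hölder (power) mean of per-class recalls can be sketched as follows. The function name `holder_mean` and the recall values are hypothetical; the handling of $p = 0$ (geometric mean) and of zero recalls for $p < 0$ uses the standard limiting conventions and is an assumption, not something taken from the paper.

```python
import math


def holder_mean(values, p):
    """Unweighted Hölder (power) mean of non-negative values with exponent p."""
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    if p == 0:
        # Limit p -> 0 is the geometric mean.
        if any(v == 0 for v in values):
            return 0.0
        return math.exp(sum(math.log(v) for v in values) / n)
    if p < 0 and any(v == 0 for v in values):
        # Limiting convention: a zero value drives the mean to zero.
        return 0.0
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


# Hypothetical per-class recalls of a three-class classifier.
recalls = [0.9, 0.5, 0.1]
a_mean = holder_mean(recalls, 1)    # arithmetic mean (p = 1)
g_mean = holder_mean(recalls, 0)    # geometric mean (p -> 0)
h_mean = holder_mean(recalls, -1)   # harmonic mean (p = -1)
```

Smaller exponents penalise a poorly recalled class more heavily, so for the same recalls the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean.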

arXiv:1605.02989 [pdf, ps, other]

An efficient K-means algorithm for Massive Data

Authors: Marco Capó, Aritz Pérez, José Antonio Lozano

Abstract: Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. Even though datasets have grown in size, the K-means algorithm remains one of the most popular clustering methods, in spite of its dependency on the initial settings and high computational cost, especially in terms of distance computations. In this work, we propose an efficient approximation to the K-means problem intended for massive data. Our approach recursively partitions the entire dataset into a small number of subsets, each of which is characterized by its representative (center of mass) and weight (cardinality); afterwards, a weighted version of the K-means algorithm is applied over such local representation, which can drastically reduce the number of distances computed. In addition to some theoretical properties, experimental results indicate that our method outperforms well-known approaches, such as the K-means++ and the minibatch K-means, in terms of the relation between number of distance computations and the quality of the approximation.

Submitted 10 May, 2016; originally announced May 2016.

Comments: 38 pages, 10 figures
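
The representation described in the abstract above (each subset summarised by its center of mass and its cardinality, followed by a weighted K-means) can be sketched roughly as below. This is only an illustration under assumptions: the partition is given as fixed labels rather than built recursively, and the function names `summarize` and `weighted_lloyd` are hypothetical, not the authors' implementation.

```python
def summarize(points, labels):
    """Collapse each block of a partition into (center of mass, cardinality)."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], point)]
        counts[label] += 1
    ordered = sorted(sums)
    reps = [tuple(s / counts[lab] for s in sums[lab]) for lab in ordered]
    weights = [counts[lab] for lab in ordered]
    return reps, weights


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_lloyd(reps, weights, centroids, iterations=10):
    """Lloyd iterations in which each representative counts `weight` times."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: distances are computed to representatives only.
        assign = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(rep, centroids[k]))
            for rep in reps
        ]
        # Update step: weighted center of mass of the assigned representatives.
        updated = []
        for k, old in enumerate(centroids):
            total = sum(w for w, a in zip(weights, assign) if a == k)
            if total == 0:
                updated.append(old)  # keep an empty cluster's centroid
                continue
            updated.append(tuple(
                sum(w * rep[d] for rep, w, a in zip(reps, weights, assign)
                    if a == k) / total
                for d in range(len(old))
            ))
        centroids = updated
    return centroids


# Hypothetical data: five points already split into two blocks.
points = [(0, 0), (0, 2), (10, 0), (10, 2), (10, 4)]
labels = [0, 0, 1, 1, 1]
reps, weights = summarize(points, labels)
centers = weighted_lloyd(reps, weights, [(1, 1), (9, 1)])
```

In this sketch the number of distance computations per iteration depends on the number of representatives rather than on the number of original points.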

arXiv:1512.03466 [pdf, ps, other]

Computing factorized approximations of Pareto-fronts using mNM-landscapes and Boltzmann distributions

Authors: Roberto Santana, Alexander Mendiburu, Jose A. Lozano

Abstract: NM-landscapes have been recently introduced as a class of tunable rugged models. They are a subset of the general interaction models where all the interactions are of order less than or equal to $M$. The Boltzmann distribution has been extensively applied in single-objective evolutionary algorithms to implement selection and study the theoretical properties of model-building algorithms. In this paper we propose the combination of the multi-objective NM-landscape model and the Boltzmann distribution to obtain Pareto-front approximations. We investigate the joint effect of the parameters of the NM-landscapes and the probabilistic factorizations on the shape of the Pareto front approximations.

Submitted 10 December, 2015; originally announced December 2015.

Comments: Accepted for CAEPIA-2015 conference, Albacete, Spain. 11 pages, 3 figures

arXiv:1405.5646 [pdf, other]

Mathematical Programming Strategies for Solving the Minimum Common String Partition Problem

Authors: Christian Blum, José A. Lozano, Pedro Pinacho Davidson

Abstract: The minimum common string partition problem is an NP-hard combinatorial optimization problem with applications in computational biology. In this work we propose the first integer linear programming model for solving this problem. Moreover, on the basis of the integer linear programming model we develop a deterministic 2-phase heuristic which is applicable to larger problem instances. The results show that provably optimal solutions can be obtained for problem instances of small and medium size from the literature by solving the proposed integer linear programming model with CPLEX. Furthermore, new best-known solutions are obtained for all considered problem instances from the literature. Concerning the heuristic, we were able to show that it outperforms heuristic competitors from the related literature.

Submitted 22 May, 2014; originally announced May 2014.

MSC Class: 90-08

arXiv:1301.3871 [pdf]

Combinatorial Optimization by Learning and Simulation of Bayesian Networks

Authors: Pedro Larrañaga, Ramon Etxeberria, Jose A. Lozano, Jose M. Pena

Abstract: This paper shows how the Bayesian network paradigm can be used in order to solve combinatorial optimization problems. To do so, some methods of structure learning from data and simulation of Bayesian networks are inserted inside Estimation of Distribution Algorithms (EDA). EDA are a new tool for evolutionary computation in which populations of individuals are created by estimation and simulation of the joint probability distribution of the selected individuals. We propose new approaches to EDA for combinatorial optimization based on the theory of probabilistic graphical models. Experimental results are also presented.

Submitted 16 January, 2013; originally announced January 2013.

Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

Report number: UAI-P-2000-PG-343-352

Showing 1–28 of 28 results for author: Lozano, J A