doi 10.1007/s12559-023-10213-9

Using Curiosity for an Even Representation of Tasks in Continual Offline Reinforcement Learning

Authors: Pankayaraj Pathmanathan, Natalia Díaz-Rodríguez, Javier Del Ser

Abstract: In this work, we investigate the means of using curiosity on replay buffers to improve offline multi-task continual reinforcement learning when tasks, which are defined by the non-stationarity in the environment, are non labeled and not evenly exposed to the learner in time. In particular, we investigate the use of curiosity both as a tool for task boundary detection and as a priority metric when it comes to retaining old transition tuples, which we respectively use to propose two different buffers. Firstly, we propose a Hybrid Reservoir Buffer with Task Separation (HRBTS), where curiosity is used to detect task boundaries that are not known due to the task agnostic nature of the problem. Secondly, by using curiosity as a priority metric when it comes to retaining old transition tuples, a Hybrid Curious Buffer (HCB) is proposed. We ultimately show that these buffers, in conjunction with regular reinforcement learning algorithms, can be used to alleviate the catastrophic forgetting issue suffered by the state of the art on replay buffers when the agent's exposure to tasks is not equal along time. We evaluate catastrophic forgetting and the efficiency of our proposed buffers against the latest works such as the Hybrid Reservoir Buffer (HRB) and the Multi-Time Scale Replay Buffer (MTR) in three different continual reinforcement learning settings. Experiments were done on classical control tasks and Metaworld environment. Experiments show that our proposed replay buffers display better immunity to catastrophic forgetting compared to existing works in most of the settings.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2305.02231 [pdf, other]

Connecting the Dots in Trustworthy Artificial Intelligence: From AI Principles, Ethics, and Key Requirements to Responsible AI Systems and Regulation

Authors: Natalia Díaz-Rodríguez, Javier Del Ser, Mark Coeckelbergh, Marcos López de Prado, Enrique Herrera-Viedma, Francisco Herrera

Abstract: Trustworthy Artificial Intelligence (AI) is based on seven technical requirements sustained over three main pillars that should be met throughout the system's entire life cycle: it should be (1) lawful, (2) ethical, and (3) robust, both from a technical and a social perspective. However, attaining truly trustworthy AI concerns a wider vision that comprises the trustworthiness of all processes and actors that are part of the system's life cycle, and considers previous aspects from different lenses. A more holistic vision contemplates four essential axes: the global principles for ethical use and development of AI-based systems, a philosophical take on AI ethics, a risk-based approach to AI regulation, and the mentioned pillars and requirements. The seven requirements (human agency and oversight; robustness and safety; privacy and data governance; transparency; diversity, non-discrimination and fairness; societal and environmental wellbeing; and accountability) are analyzed from a triple perspective: What each requirement for trustworthy AI is, Why it is needed, and How each requirement can be implemented in practice. On the other hand, a practical approach to implement trustworthy AI systems allows defining the concept of responsibility of AI-based systems facing the law, through a given auditing process. Therefore, a responsible AI system is the resulting notion we introduce in this work, and a concept of utmost necessity that can be realized through auditing processes, subject to the challenges posed by the use of regulatory sandboxes. Our multidisciplinary vision of trustworthy AI culminates in a debate on the diverging views published lately about the future of AI. Our reflections in this matter conclude that regulation is a key for reaching a consensus among these views, and that trustworthy and responsible AI systems will be crucial for the present and future of our society.

Submitted 12 June, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

Comments: 30 pages, 5 figures, under second review

MSC Class: 68T01 ACM Class: I.2; K.4; K.5

arXiv:2212.03041 [pdf, other]

doi 10.1016/j.knosys.2022.110189

Towards a more efficient computation of individual attribute and policy contribution for post-hoc explanation of cooperative multi-agent systems using Myerson values

Authors: Giorgio Angelotti, Natalia Díaz-Rodríguez

Abstract: A quantitative assessment of the global importance of an agent in a team is as valuable as gold for strategists, decision-makers, and sports coaches. Yet, retrieving this information is not trivial since in a cooperative task it is hard to isolate the performance of an individual from the one of the whole team. Moreover, it is not always clear the relationship between the role of an agent and his personal attributes. In this work we conceive an application of the Shapley analysis for studying the contribution of both agent policies and attributes, putting them on equal footing. Since the computational complexity is NP-hard and scales exponentially with the number of participants in a transferable utility coalitional game, we resort to exploiting a-priori knowledge about the rules of the game to constrain the relations between the participants over a graph. We hence propose a method to determine a Hierarchical Knowledge Graph of agents' policies and features in a Multi-Agent System. Assuming a simulator of the system is available, the graph structure allows to exploit dynamic programming to assess the importances in a much faster way. We test the proposed approach in a proof-of-case environment deploying both hardcoded policies and policies obtained via Deep Reinforcement Learning. The proposed paradigm is less computationally demanding than trivially computing the Shapley values and provides great insight not only into the importance of an agent in a team but also into the attributes needed to deploy the policy at its best.

Submitted 6 December, 2022; originally announced December 2022.

Comments: Accepted for publication in Elsevier's Knowledge-Based Systems

Journal ref: Knowledge-Based Systems 260 (2023) 110189

arXiv:2209.14974 [pdf, other]

Greybox XAI: a Neural-Symbolic learning framework to produce interpretable predictions for image classification

Authors: Adrien Bennetot, Gianni Franchi, Javier Del Ser, Raja Chatila, Natalia Diaz-Rodriguez

Abstract: Although Deep Neural Networks (DNNs) have great generalization and prediction capabilities, their functioning does not allow a detailed explanation of their behavior. Opaque deep learning models are increasingly used to make important predictions in critical environments, and the danger is that they make and use predictions that cannot be justified or legitimized. Several eXplainable Artificial Intelligence (XAI) methods that separate explanations from machine learning models have emerged, but have shortcomings in faithfulness to the model actual functioning and robustness. As a result, there is a widespread agreement on the importance of endowing Deep Learning models with explanatory capabilities so that they can themselves provide an answer to why a particular prediction was made. First, we address the problem of the lack of universal criteria for XAI by formalizing what an explanation is. We also introduced a set of axioms and definitions to clarify XAI from a mathematical perspective. Finally, we present the Greybox XAI, a framework that composes a DNN and a transparent model thanks to the use of a symbolic Knowledge Base (KB). We extract a KB from the dataset and use it to train a transparent model (i.e., a logistic regression). An encoder-decoder architecture is trained on RGB images to produce an output similar to the KB used by the transparent model. Once the two models are trained independently, they are used compositionally to form an explainable predictive model. We show how this new architecture is accurate and explainable in several datasets.

Submitted 26 September, 2022; originally announced September 2022.

Comments: Accepted in Knowledge-Based Systems Journal

arXiv:2207.08989 [pdf, other]

Capabilities, Limitations and Challenges of Style Transfer with CycleGANs: A Study on Automatic Ring Design Generation

Authors: Tomas Cabezon Pedroso, Javier Del Ser, Natalia Diaz-Rodriguez

Abstract: Rendering programs have changed the design process completely as they permit to see how the products will look before they are fabricated. However, the rendering process is complicated and takes a significant amount of time, not only in the rendering itself but in the setting of the scene as well. Materials, lights and cameras need to be set in order to get the best quality results. Nevertheless, the optimal output may not be obtained in the first render. This all makes the rendering process a tedious process. Since Goodfellow et al. introduced Generative Adversarial Networks (GANs) in 2014 [1], they have been used to generate computer-assigned synthetic data, from non-existing human faces to medical data analysis or image style transfer. GANs have been used to transfer image textures from one domain to another. However, paired data from both domains was needed. When Zhu et al. introduced the CycleGAN model, the elimination of this expensive constraint permitted transforming one image from one domain into another, without the need for paired data. This work validates the applicability of CycleGANs on style transfer from an initial sketch to a final render in 2D that represents a 3D design, a step that is paramount in every product design process. We inquire into the possibilities of including CycleGANs as part of the design pipeline, more precisely, applied to the rendering of ring designs. Our contribution entails a crucial part of the process as it allows the customer to see the final product before buying. This work sets a basis for future research, showing the possibilities of GANs in design and establishing a starting point for novel applications to approach crafts design.

Submitted 18 July, 2022; originally announced July 2022.

Comments: 20 pages

arXiv:2205.10232 [pdf, other]

Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization

Authors: Javier Del Ser, Alejandro Barredo-Arrieta, Natalia Díaz-Rodríguez, Francisco Herrera, Andreas Holzinger

Abstract: There is a broad consensus on the importance of deep learning models in tasks involving complex data. Often, an adequate understanding of these models is required when focusing on the transparency of decisions in human-critical applications. Besides other explainability techniques, trustworthiness can be achieved by using counterfactuals, like the way a human becomes familiar with an unknown process: by understanding the hypothetical circumstances under which the output changes. In this work we argue that automated counterfactual generation should regard several aspects of the produced adversarial instances, not only their adversarial capability. To this end, we present a novel framework for the generation of counterfactual examples which formulates its goal as a multi-objective optimization problem balancing three different objectives: 1) plausibility, i.e., the likeliness of the counterfactual of being possible as per the distribution of the input data; 2) intensity of the changes to the original input; and 3) adversarial power, namely, the variability of the model's output induced by the counterfactual. The framework departs from a target model to be audited and uses a Generative Adversarial Network to model the distribution of input data, together with a multi-objective solver for the discovery of counterfactuals balancing among these objectives. The utility of the framework is showcased over six classification tasks comprising image and three-dimensional data. The experiments verify that the framework unveils counterfactuals that comply with intuition, increasing the trustworthiness of the user, and leading to further insights, such as the detection of bias and data misrepresentation.

Submitted 20 May, 2022; originally announced May 2022.

Comments: 52 pages, 14 figures, under review

arXiv:2202.10201 [pdf, other]

doi 10.1109/ACCESS.2022.3230590

OG-SGG: Ontology-Guided Scene Graph Generation. A Case Study in Transfer Learning for Telepresence Robotics

Authors: Fernando Amodeo, Fernando Caballero, Natalia Díaz-Rodríguez, Luis Merino

Abstract: Scene graph generation from images is a task of great interest to applications such as robotics, because graphs are the main way to represent knowledge about the world and regulate human-robot interactions in tasks such as Visual Question Answering (VQA). Unfortunately, its corresponding area of machine learning is still relatively in its infancy, and the solutions currently offered do not specialize well in concrete usage scenarios. Specifically, they do not take existing "expert" knowledge about the domain world into account; and that might indeed be necessary in order to provide the level of reliability demanded by the use case scenarios. In this paper, we propose an initial approximation to a framework called Ontology-Guided Scene Graph Generation (OG-SGG), that can improve the performance of an existing machine learning based scene graph generator using prior knowledge supplied in the form of an ontology (specifically, using the axioms defined within); and we present results evaluated on a specific scenario founded in telepresence robotics. These results show quantitative and qualitative improvements in the generated scene graphs.

Submitted 20 December, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

Comments: 20 pages; version accepted and published in IEEE Access

arXiv:2111.14260 [pdf, other]

A Practical guide on Explainable AI Techniques applied on Biomedical use case applications

Authors: Adrien Bennetot, Ivan Donadello, Ayoub El Qadi, Mauro Dragoni, Thomas Frossard, Benedikt Wagner, Anna Saranti, Silvia Tulli, Maria Trocan, Raja Chatila, Andreas Holzinger, Artur d'Avila Garcez, Natalia Díaz-Rodríguez

Abstract: Recent years have been characterized by an upsurge of opaque automatic decision support systems, such as Deep Neural Networks (DNNs). Although they have great generalization and prediction skills, their functioning does not allow obtaining detailed explanations of their behaviour. As opaque machine learning models are increasingly being employed to make important predictions in critical environments, the danger is to create and use decisions that are not justifiable or legitimate. Therefore, there is a general agreement on the importance of endowing machine learning models with explainability. EXplainable Artificial Intelligence (XAI) techniques can serve to verify and certify model outputs and enhance them with desirable notions such as trustworthiness, accountability, transparency and fairness. This guide is meant to be the go-to handbook for any audience with a computer science background aiming at getting intuitive insights on machine learning models, accompanied with straight, fast, and intuitive explanations out of the box. This article aims to fill the lack of a compelling XAI guide by applying XAI techniques in their particular day-to-day models, datasets and use-cases. Figure 1 acts as a flowchart/map for the reader and should help him to find the ideal method to use according to his type of data. In each chapter, the reader will find a description of the proposed method as well as an example of use on a Biomedical application and a Python notebook. It can be easily modified in order to be applied to specific applications.

Submitted 5 September, 2022; v1 submitted 13 November, 2021; originally announced November 2021.

arXiv:2110.01307 [pdf, other]

Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values

Authors: Alexandre Heuillet, Fabien Couthouis, Natalia Díaz-Rodríguez

Abstract: While Explainable Artificial Intelligence (XAI) is increasingly expanding more areas of application, little has been applied to make deep Reinforcement Learning (RL) more comprehensible. As RL becomes ubiquitous and used in critical and general public applications, it is essential to develop methods that make it better understood and more interpretable. This study proposes a novel approach to explain cooperative strategies in multiagent RL using Shapley values, a game theory concept used in XAI that successfully explains the rationale behind decisions taken by Machine Learning algorithms. Through testing common assumptions of this technique in two cooperation-centered socially challenging multi-agent environments, this article argues that Shapley values are a pertinent way to evaluate the contribution of players in a cooperative multi-agent RL context. To palliate the high overhead of this method, Shapley values are approximated using Monte Carlo sampling. Experimental results on Multiagent Particle and Sequential Social Dilemmas show that Shapley values succeed at estimating the contribution of each agent. These results could have implications that go beyond games in economics (e.g., for non-discriminatory decision making, ethical and responsible AI-derived decisions or policy making under fairness constraints). They also expose how Shapley values only give general explanations about a model and cannot explain a single run, episode nor justify precise actions taken by agents. Future work should focus on addressing these critical aspects.

Submitted 4 October, 2021; originally announced October 2021.

Comments: Submitted to IEEE Computational Intelligence Magazine

arXiv:2109.08642 [pdf, other]

POAR: Efficient Policy Optimization via Online Abstract State Representation Learning

Authors: Zhaorun Chen, Siqi Fan, Yuan Tan, Liang Gong, Binhao Chen, Te Sun, David Filliat, Natalia Díaz-Rodríguez, Chengliang Liu

Abstract: While the rapid progress of deep learning fuels end-to-end reinforcement learning (RL), direct application, especially in high-dimensional space like robotic scenarios still suffers from low sample efficiency. Therefore State Representation Learning (SRL) is proposed to specifically learn to encode task-relevant features from complex sensory data into low-dimensional states. However, the pervasive implementation of SRL is usually conducted by a decoupling strategy in which the observation-state mapping is learned separately, which is prone to over-fit. To handle such problem, we summarize the state-of-the-art (SOTA) SRL sub-tasks in previous works and present a new algorithm called Policy Optimization via Abstract Representation which integrates SRL into the policy optimization phase. Firstly, we engage RL loss to assist in updating SRL model so that the states can evolve to meet the demand of RL and maintain a good physical interpretation. Secondly, we introduce a dynamic loss weighting mechanism so that both models can efficiently adapt to each other. Thirdly, we introduce a new SRL prior called domain resemblance to leverage expert demonstration to improve SRL interpretations. Finally, we provide a real-time access of state graph to monitor the course of learning. Experiments indicate that POAR significantly outperforms SOTA RL algorithms and decoupling SRL strategies in terms of sample efficiency and final rewards. We empirically verify POAR to efficiently handle tasks in high dimensions and facilitate training real-life robots directly from scratch.

Submitted 9 December, 2023; v1 submitted 17 September, 2021; originally announced September 2021.

Comments: 19 pages

arXiv:2104.14492 [pdf, other]

Questioning causality on sex, gender and COVID-19, and identifying bias in large-scale data-driven analyses: the Bias Priority Recommendations and Bias Catalog for Pandemics

Authors: Natalia Díaz-Rodríguez, Rūta Binkytė-Sadauskienė, Wafae Bakkali, Sannidhi Bookseller, Paola Tubaro, Andrius Bacevicius, Raja Chatila

Abstract: The COVID-19 pandemic has spurred a large amount of observational studies reporting linkages between the risk of developing severe COVID-19 or dying from it, and sex and gender. By reviewing a large body of related literature and conducting a fine grained analysis based on sex-disaggregated data of 61 countries spanning 5 continents, we discover several confounding factors that could possibly explain the supposed male vulnerability to COVID-19. We thus highlight the challenge of making causal claims based on available data, given the lack of statistical significance and potential existence of biases. Informed by our findings on potential variables acting as confounders, we contribute a broad overview on the issues bias, explainability and fairness entail in data-driven analyses. Thus, we outline a set of discriminatory policy consequences that could, based on such results, lead to unintended discrimination. To raise awareness on the dimensionality of such foreseen impacts, we have compiled an encyclopedia-like reference guide, the Bias Catalog for Pandemics (BCP), to provide definitions and emphasize realistic examples of bias in general, and within the COVID-19 pandemic context. These are categorized within a division of bias families and a 2-level priority scale, together with preventive steps. In addition, we facilitate the Bias Priority Recommendations on how to best use and apply this catalog, and provide guidelines in order to address real world research questions. The objective is to anticipate and avoid disparate impact and discrimination, by considering causality, explainability, bias and techniques to mitigate the latter. With these, we hope to 1) contribute to designing and conducting fair and equitable data-driven studies and research; and 2) interpret and draw meaningful and actionable conclusions from these.

Submitted 29 April, 2021; originally announced April 2021.

ACM Class: K.4.1; K.4.2

arXiv:2104.11914 [pdf, other]

doi 10.1016/j.inffus.2021.09.022

EXplainable Neural-Symbolic Learning (X-NeSyL) methodology to fuse deep learning representations with expert knowledge graphs: the MonuMAI cultural heritage use case

Authors: Natalia Díaz-Rodríguez, Alberto Lamas, Jules Sanchez, Gianni Franchi, Ivan Donadello, Siham Tabik, David Filliat, Policarpo Cruz, Rosana Montes, Francisco Herrera

Abstract: The latest Deep Learning (DL) models for detection and classification have achieved an unprecedented performance over classical machine learning algorithms. However, DL models are black-box methods hard to debug, interpret, and certify. DL alone cannot provide explanations that can be validated by a non technical audience. In contrast, symbolic AI systems that convert concepts into rules or symbols -- such as knowledge graphs -- are easier to explain. However, they present lower generalisation and scaling capabilities. A very important challenge is to fuse DL representations with expert knowledge. One way to address this challenge, as well as the performance-explainability trade-off is by leveraging the best of both streams without obviating domain expert knowledge. We tackle such problem by considering the symbolic knowledge is expressed in form of a domain expert knowledge graph. We present the eXplainable Neural-symbolic learning (X-NeSyL) methodology, designed to learn both symbolic and deep representations, together with an explainability metric to assess the level of alignment of machine and human expert explanations. The ultimate objective is to fuse DL representations with expert domain knowledge during the learning process to serve as a sound basis for explainability. X-NeSyL methodology involves the concrete use of two notions of explanation at inference and training time respectively: 1) EXPLANet: Expert-aligned eXplainable Part-based cLAssifier NETwork Architecture, a compositional CNN that makes use of symbolic representations, and 2) SHAP-Backprop, an explainable AI-informed training procedure that guides the DL process to align with such symbolic representations in form of knowledge graphs. We showcase X-NeSyL methodology using MonuMAI dataset for monument facade image classification, and demonstrate that our approach improves explainability and performance.

Submitted 13 October, 2021; v1 submitted 24 April, 2021; originally announced April 2021.

arXiv:2104.04785 [pdf, other]

Physically-Consistent Generative Adversarial Networks for Coastal Flood Visualization

Authors: Björn Lütjens, Brandon Leshchinskiy, Christian Requena-Mesa, Farrukh Chishtie, Natalia Díaz-Rodríguez, Océane Boulais, Aruna Sankaranarayanan, Margaux Masson-Forsythe, Aaron Piña, Yarin Gal, Chedy Raïssi, Alexander Lavin, Dava Newman

Abstract: As climate change increases the intensity of natural disasters, society needs better tools for adaptation. Floods, for example, are the most frequent natural disaster, and better tools for flood risk communication could increase the support for flood-resilient infrastructure development. Our work aims to enable more visual communication of large-scale climate impacts via visualizing the output of coastal flood models as satellite imagery. We propose the first deep learning pipeline to ensure physical-consistency in synthetic visual satellite imagery. We advanced a state-of-the-art GAN called pix2pixHD, such that it produces imagery that is physically-consistent with the output of an expert-validated storm surge model (NOAA SLOSH). By evaluating the imagery relative to physics-based flood maps, we find that our proposed framework outperforms baseline models in both physical-consistency and photorealism. We envision our work to be the first step towards a global visualization of how the climate challenge will shape our landscape. Continuing on this path, we show that the proposed pipeline generalizes to visualize reforestation. We also publish a dataset of over 25k labelled image-triplets to study image-to-image translation in Earth observation.

Submitted 21 February, 2023; v1 submitted 10 April, 2021; originally announced April 2021.

Comments: arXiv admin note: text overlap with arXiv:2010.08103

arXiv:2104.00950 [pdf, other]

Explainable Artificial Intelligence (XAI) on TimeSeries Data: A Survey

Authors: Thomas Rojat, Raphaël Puget, David Filliat, Javier Del Ser, Rodolphe Gelin, Natalia Díaz-Rodríguez

Abstract: Most state-of-the-art methods applied to time series are deep learning methods that are too complex to be interpreted. This lack of interpretability is a major drawback, as several real-world applications are critical tasks, such as those in the medical field or in autonomous driving. The explainability of models applied to time series has not gathered much attention compared to the computer vision or natural language processing fields. In this paper, we present an overview of existing explainable AI (XAI) methods applied to time series and illustrate the type of explanations they produce. We also provide a reflection on the impact of these explanation methods on providing confidence and trust in AI systems.

Submitted 2 April, 2021; originally announced April 2021.

arXiv:2103.08359 [pdf, other]

Explaining Credit Risk Scoring through Feature Contribution Alignment with Expert Risk Analysts

Authors: Ayoub El Qadi, Natalia Diaz-Rodriguez, Maria Trocan, Thomas Frossard

Abstract: Credit assessment activities are essential for financial institutions and allow the global economy to grow. Building robust, solid and accurate models that estimate the probability of a default of a company is mandatory for credit insurance companies, moreover when it comes to bridging the trade finance gap. Automating the risk assessment process will allow credit risk experts to reduce their workload and focus on the critical and complex cases, as well as to improve the loan approval process by reducing the time to process the application. The recent developments in Artificial Intelligence are offering new powerful opportunities. However, most AI techniques are labelled as blackbox models due to their lack of explainability. For both users and regulators, in order to deploy such technologies at scale, being able to understand the model logic is a must to grant accurate and ethical decision making. In this study, we focus on companies' credit scoring and we benchmark different machine learning models. The aim is to build a model to predict whether a company will experience financial problems in a given time horizon. We address the black box problem using eXplainable Artificial Techniques, in particular post-hoc explanations using SHapley Additive exPlanations. We bring light by providing an expert-aligned feature relevance score highlighting the disagreement between a credit risk expert and a model feature attribution explanation in order to better quantify the convergence towards a better human-aligned decision making.

Submitted 15 March, 2021; originally announced March 2021.

Comments: 11 pages, 4 figures

arXiv:2010.08103 [pdf, other]

Physics-informed GANs for Coastal Flood Visualization

Authors: Björn Lütjens, Brandon Leshchinskiy, Christian Requena-Mesa, Farrukh Chishtie, Natalia Díaz-Rodriguez, Océane Boulais, Aaron Piña, Dava Newman, Alexander Lavin, Yarin Gal, Chedy Raïssi

Abstract: As climate change increases the intensity of natural disasters, society needs better tools for adaptation. Floods, for example, are the most frequent natural disaster, but during hurricanes the area is largely covered by clouds and emergency managers must rely on nonintuitive flood visualizations for mission planning. To assist these emergency managers, we have created a deep learning pipeline that generates visual satellite images of current and future coastal flooding. We advanced a state-of-the-art GAN called pix2pixHD, such that it produces imagery that is physically-consistent with the output of an expert-validated storm surge model (NOAA SLOSH). By evaluating the imagery relative to physics-based flood maps, we find that our proposed framework outperforms baseline models in both physical-consistency and photorealism. While this work focused on the visualization of coastal floods, we envision the creation of a global visualization of how climate change will shape our earth.

Submitted 12 February, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

Comments: Under Review

arXiv:2008.06693 [pdf, other]

Explainability in Deep Reinforcement Learning

Authors: Alexandre Heuillet, Fabien Couthouis, Natalia Díaz-Rodríguez

Abstract: A large set of the explainable Artificial Intelligence (XAI) literature is emerging on feature relevance techniques to explain a deep neural network (DNN) output or explaining models that ingest image source data. However, assessing how XAI techniques can help understand models beyond classification tasks, e.g. for reinforcement learning (RL), has not been extensively studied. We review recent works in the direction to attain Explainable Reinforcement Learning (XRL), a relatively new subfield of Explainable Artificial Intelligence, intended to be used in general public applications, with diverse audiences, requiring ethical, responsible and trustable algorithms. In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight on the inner workings of what is still considered a black box. We evaluate mainly studies directly linking explainability to RL, and split these into two categories according to the way the explanations are generated: transparent algorithms and post-hoc explainability. We also review the most prominent XAI works from the lenses of how they could potentially enlighten the further deployment of the latest advances in RL, in the demanding present and future of everyday problems.

Submitted 18 December, 2020; v1 submitted 15 August, 2020; originally announced August 2020.

Comments: Article accepted at Knowledge-Based Systems

arXiv:2006.00882 [pdf, other]

Should artificial agents ask for help in human-robot collaborative problem-solving?

Authors: Adrien Bennetot, Vicky Charisi, Natalia Díaz-Rodríguez

Abstract: Transferring as fast as possible the functioning of our brain to artificial intelligence is an ambitious goal that would help advance the state of the art in AI and robotics. It is in this perspective that we propose to start from hypotheses derived from an empirical study in a human-robot interaction and to verify if they are validated in the same way for children as for a basic reinforcement learning algorithm. Thus, we check whether receiving help from an expert when solving a simple close-ended task (the Towers of Hanoï) allows to accelerate or not the learning of this task, depending on whether the intervention is canonical or requested by the player. Our experiences have allowed us to conclude that, whether requested or not, a Q-learning algorithm benefits in the same way from expert help as children do.

Submitted 25 May, 2020; originally announced June 2020.

Comments: Accepted at Brain-PIL Workshop - ICRA2020

arXiv:2005.06223 [pdf, other]

DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics

Authors: Stephane Doncieux, Nicolas Bredeche, Léni Le Goff, Benoît Girard, Alexandre Coninx, Olivier Sigaud, Mehdi Khamassi, Natalia Díaz-Rodríguez, David Filliat, Timothy Hospedales, A. Eiben, Richard Duro

Abstract: Robots are still limited to controlled conditions, that the robot designer knows with enough details to endow the robot with the appropriate models or behaviors. Learning algorithms add some flexibility with the ability to discover the appropriate behavior given either some demonstrations or a reward to guide its exploration with a reinforcement learning algorithm. Reinforcement learning algorithms rely on the definition of state and action spaces that define reachable behaviors. Their adaptation capability critically depends on the representations of these spaces: small and discrete spaces result in fast learning while large and continuous spaces are challenging and either require a long training period or prevent the robot from converging to an appropriate behavior. Beside the operational cycle of policy execution and the learning cycle, which works at a slower time scale to acquire new policies, we introduce the redescription cycle, a third cycle working at an even slower time scale to generate or adapt the required representations to the robot, its environment and the task. We introduce the challenges raised by this cycle and we present DREAM (Deferred Restructuring of Experience in Autonomous Machines), a developmental cognitive architecture to bootstrap this redescription process stage by stage, build new state representations with appropriate motivations, and transfer the acquired knowledge across domains or tasks or even across robots.
We describe results obtained so far with this approach and end up with a discussion of the questions it raises in Neuroscience.

Submitted 13 May, 2020; originally announced May 2020.

arXiv:2003.11743 [pdf, other]

Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models

Authors: Pranav Agarwal, Alejandro Betancourt, Vana Panagiotou, Natalia Díaz-Rodríguez

Abstract: Image captioning models have been able to generate grammatically correct and human understandable sentences. However most of the captions convey limited information as the model used is trained on datasets that do not caption all possible objects existing in everyday life. Due to this lack of prior information most of the captions are biased to only a few objects present in the scene, hence limiting their usage in daily life. In this paper, we attempt to show the biased nature of the currently existing image captioning models and present a new image captioning dataset, Egoshots, consisting of 978 real life images with no captions. We further exploit the state of the art pre-trained image captioning and object recognition networks to annotate our images and show the limitations of existing works. Furthermore, in order to evaluate the quality of the generated captions, we propose a new image captioning metric, object based Semantic Fidelity (SF). Existing image captioning metrics can evaluate a caption only in the presence of their corresponding annotations; however, SF allows evaluating captions generated for images without annotations, making it highly useful for real life generated captions.

Submitted 27 March, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

Comments: 15 pages, 25 figures, Accepted at Machine Learning in Real Life (ML-IRL) ICLR 2020 Workshop

arXiv:1910.10045 [pdf, other]

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

Authors: Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, Raja Chatila, Francisco Herrera

Abstract: In the last years, Artificial Intelligence (AI) has achieved a notable momentum that may deliver the best of expectations over many application sectors across the field. For this to occur, the entire community stands in front of the barrier of explainability, an inherent problem of AI techniques brought by sub-symbolism (e.g. ensembles or Deep Neural Networks) that were not present in the last hype of AI. Paradigms underlying this problem fall within the so-called eXplainable AI (XAI) field, which is acknowledged as a crucial feature for the practical deployment of AI models. This overview examines the existing literature in the field of XAI, including a prospect toward what is yet to be reached. We summarize previous efforts to define explainability in Machine Learning, establishing a novel definition that covers prior conceptual propositions with a major focus on the audience for which explainability is sought. We then propose and discuss a taxonomy of recent contributions related to the explainability of different Machine Learning models, including those aimed at Deep Learning methods for which a second taxonomy is built. This literature analysis serves as the background for a series of challenges faced by XAI, such as the crossroads between data fusion and explainability. Our prospects lead toward the concept of Responsible Artificial Intelligence, namely, a methodology for the large-scale implementation of AI methods in real organizations with fairness, model explainability and accountability at its core.
Our ultimate goal is to provide newcomers to XAI with a reference material in order to stimulate future research advances, but also to encourage experts and professionals from other disciplines to embrace the benefits of AI in their activity sectors, without any prior bias for its lack of interpretability.

Submitted 26 December, 2019; v1 submitted 22 October, 2019; originally announced October 2019.

Comments: 67 pages, 13 figures, accepted for its publication in Information Fusion

arXiv:1909.09065 [pdf, other]

Towards Explainable Neural-Symbolic Visual Reasoning

Authors: Adrien Bennetot, Jean-Luc Laurent, Raja Chatila, Natalia Díaz-Rodríguez

Abstract: Many high-performance models suffer from a lack of interpretability. There has been an increasing influx of work on explainable artificial intelligence (XAI) in order to disentangle what is meant and expected by XAI. Nevertheless, there is no general consensus on how to produce and judge explanations. In this paper, we discuss why techniques integrating connectionist and symbolic paradigms are the most efficient solutions to produce explanations for non-technical users and we propose a reasoning model, based on definitions by Doran et al. [2017] (arXiv:1710.00794) to explain a neural network's decision. We use this explanation in order to correct bias in the network's decision rationale. We accompany this model with an example of its potential use, based on the image captioning method in Burns et al. [2018] (arXiv:1803.09797).

Submitted 22 October, 2019; v1 submitted 19 September, 2019; originally announced September 2019.

Comments: Accepted at IJCAI19 Neural-Symbolic Learning and Reasoning Workshop (https://sites.google.com/view/nesy2019/home)

arXiv:1907.05855 [pdf, other]

DisCoRL: Continual Reinforcement Learning via Policy Distillation

Authors: René Traoré, Hugo Caselles-Dupré, Timothée Lesort, Te Sun, Guanghang Cai, Natalia Díaz-Rodríguez, David Filliat

Abstract: In multi-task reinforcement learning there are two main challenges: at training time, the ability to learn different policies with a single model; at test time, inferring which of those policies to apply without an external signal. In the case of continual reinforcement learning a third challenge arises: learning tasks sequentially without forgetting the previous ones. In this paper, we tackle these challenges by proposing DisCoRL, an approach combining state representation learning and policy distillation. We experiment on a sequence of three simulated 2D navigation tasks with a 3 wheel omni-directional robot. Moreover, we tested our approach's robustness by transferring the final policy into a real life setting. The policy can solve all tasks and automatically infer which one to run.

Submitted 11 July, 2019; originally announced July 2019.

Comments: arXiv admin note: text overlap with arXiv:1906.04452

arXiv:1907.00182 [pdf, other]

Continual Learning for Robotics: Definition, Framework, Learning Strategies, Opportunities and Challenges

Authors: Timothée Lesort, Vincenzo Lomonaco, Andrei Stoian, Davide Maltoni, David Filliat, Natalia Díaz-Rodríguez

Abstract: Continual learning (CL) is a particular machine learning paradigm where the data distribution and learning objective change through time, or where all the training data and objective criteria are never available at once. The evolution of the learning process is modeled by a sequence of learning experiences where the goal is to be able to learn new skills all along the sequence without forgetting what has been previously learned. Continual learning also aims at the same time at optimizing the memory, the computation power and the speed during the learning process. An important challenge for machine learning is not necessarily finding solutions that work in the real world but rather finding stable algorithms that can learn in the real world. Hence, the ideal approach would be tackling the real world in an embodied platform: an autonomous agent. Continual learning would then be effective in an autonomous agent or robot, which would learn autonomously through time about the external world, and incrementally develop a set of complex skills and knowledge. Robotic agents have to learn to adapt and interact with their environment using a continuous stream of observations. Some recent approaches aim at tackling continual learning for robotics, but most recent papers on continual learning only experiment approaches in simulation or with static datasets. Unfortunately, the evaluation of those algorithms does not provide insights on whether their solutions may help continual learning in the context of robotics.
This paper aims at reviewing the existing state of the art of continual learning, summarizing existing benchmarks and metrics, and proposing a framework for presenting and evaluating both robotics and non robotics approaches in a way that makes transfer between both fields easier.

Submitted 22 November, 2019; v1 submitted 29 June, 2019; originally announced July 2019.

arXiv:1906.04452 [pdf, other]

Continual Reinforcement Learning deployed in Real-life using Policy Distillation and Sim2Real Transfer

Authors: René Traoré, Hugo Caselles-Dupré, Timothée Lesort, Te Sun, Natalia Díaz-Rodríguez, David Filliat

Abstract: We focus on the problem of teaching a robot to solve tasks presented sequentially, i.e., in a continual learning scenario. The robot should be able to solve all tasks it has encountered, without forgetting past tasks. We provide preliminary work on applying Reinforcement Learning to such a setting, on 2D navigation tasks for a 3 wheel omni-directional robot. Our approach takes advantage of state representation learning and policy distillation. Policies are trained using learned features as input, rather than raw observations, allowing better sample efficiency. Policy distillation is used to combine multiple policies into a single one that solves all encountered tasks.

Submitted 11 June, 2019; originally announced June 2019.

Comments: accepted to the Workshop on Multi-Task and Lifelong Reinforcement Learning, ICML 2019

arXiv:1901.08651 [pdf, other]

Decoupling feature extraction from policy learning: assessing benefits of state representation learning in goal based robotics

Authors: Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat

Abstract: Scaling end-to-end reinforcement learning to control real robots from vision presents a series of challenges, in particular in terms of sample efficiency. Against end-to-end learning, state representation learning can help learn a compact, efficient and relevant representation of states that speeds up policy learning, reducing the number of samples needed, and that is easier to interpret. We evaluate several state representation learning methods on goal based robotics tasks and propose a new unsupervised model that stacks representations and combines strengths of several of these approaches. This method encodes all the relevant features, performs on par or better than end-to-end learning with better sample efficiency, and is robust to hyper-parameters change.

Submitted 23 June, 2019; v1 submitted 24 January, 2019; originally announced January 2019.

Comments: Github repo: https://github.com/araffin/srl-zoo Documentation: https://srl-zoo.readthedocs.io/en/latest/, As part of SRL-Toolbox: https://s-rl-toolbox.readthedocs.io/en/latest/. Accepted to the Workshop on Structure & Priors in Reinforcement Learning at ICLR 2019

arXiv:1811.05291 [pdf, other]

Intelligent Drone Swarm for Search and Rescue Operations at Sea

Authors: Vincenzo Lomonaco, Angelo Trotta, Marta Ziosi, Juan de Dios Yáñez Ávila, Natalia Díaz-Rodríguez

Abstract: In recent years, a rising number of people have arrived in the European Union, traveling across the Mediterranean Sea or overland through Southeast Europe in what was later named the European migrant crisis. In the last 5 years, more than 16 thousand people have lost their lives in the Mediterranean Sea during the crossing. The United Nations Secretary General Strategy on New Technologies is supporting the use of Artificial Intelligence (AI) and Robotics to accelerate the achievement of the 2030 Sustainable Development Agenda, which includes safe and regular migration processes among others. In the same spirit, the central idea of this project is to use AI technology for Search And Rescue (SAR) operations at sea. In particular, we propose an autonomous fleet of self-organizing intelligent drones that would enable the coverage of a broader area, speeding up the search processes and finally increasing the efficiency and effectiveness of migrant rescue operations.

Submitted 13 November, 2018; originally announced November 2018.

Comments: 4 Pages, 1 Figure, extended abstract accepted to the "AI for Social Good" NIPS 2018 Workshop

arXiv:1810.13166 [pdf, other]

Don't forget, there is more than forgetting: new metrics for Continual Learning

Authors: Natalia Díaz-Rodríguez, Vincenzo Lomonaco, David Filliat, Davide Maltoni

Abstract: Continual learning consists of algorithms that learn from a stream of data/tasks continuously and adaptively through time, enabling the incremental development of ever more complex knowledge and skills. The lack of consensus in evaluating continual learning algorithms and the almost exclusive focus on forgetting motivate us to propose a more comprehensive set of implementation independent metrics accounting for several factors we believe have practical implications worth considering in the deployment of real AI systems that learn continually: accuracy or performance over time, backward and forward knowledge transfer, memory overhead as well as computational efficiency. Drawing inspiration from the standard Multi-Attribute Value Theory (MAVT) we further propose to fuse these metrics into a single score for ranking purposes and we evaluate our proposal with five continual learning strategies on the iCIFAR-100 continual learning benchmark.

Submitted 31 October, 2018; originally announced October 2018.

MSC Class: 68T05; cs.LG; cs.AI; cs.CV; cs.NE; stat.ML

arXiv:1809.09369 [pdf, other]

S-RL Toolbox: Environments, Datasets and Evaluation Metrics for State Representation Learning

Authors: Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat

Abstract: State representation learning aims at learning compact representations from raw observations in robotics and control applications. Approaches used for this objective are auto-encoders, learning forward models, inverse dynamics or learning using generic priors on the state characteristics. However, the diversity in applications and methods makes the field lack standard evaluation datasets, metrics and tasks. This paper provides a set of environments, data generators, robotic control tasks, metrics and tools to facilitate iterative state representation learning and evaluation in reinforcement learning settings.

Submitted 10 October, 2018; v1 submitted 25 September, 2018; originally announced September 2018.

Comments: Github repo: https://github.com/araffin/robotics-rl-srl Documentation: https://s-rl-toolbox.readthedocs.io/en/latest/

arXiv:1802.04181 [pdf, ps, other]

doi 10.1016/j.neunet.2018.07.006

State Representation Learning for Control: An Overview

Authors: Timothée Lesort, Natalia Díaz-Rodríguez, Jean-François Goudou, David Filliat

Abstract: Representation learning algorithms are designed to learn abstract features that characterize data. State representation learning (SRL) focuses on a particular kind of representation learning where learned features are in low dimension, evolve through time, and are influenced by actions of an agent. The representation is learned to capture the variation in the environment generated by the agent's actions; this kind of representation is particularly suitable for robotics and control scenarios. In particular, the low dimension characteristic of the representation helps to overcome the curse of dimensionality, provides easier interpretation and utilization by humans and can help improve performance and speed in policy learning algorithms such as reinforcement learning. This survey aims at covering the state-of-the-art on state representation learning in the most recent years. It reviews different SRL methods that involve interaction with the environment, their implementations and their applications in robotics control tasks (simulated or real). In particular, it highlights how generic learning objectives are differently exploited in the reviewed algorithms. Finally, it discusses evaluation methods to assess the representation learned and summarizes current and future lines of research.

Submitted 5 June, 2018; v1 submitted 12 February, 2018; originally announced February 2018.

arXiv:1709.05185 [pdf, other]

Unsupervised state representation learning with robotic priors: a robustness benchmark

Authors: Timothée Lesort, Mathieu Seurin, Xinrui Li, Natalia Díaz-Rodríguez, David Filliat

Abstract: Our understanding of the world depends highly on our capacity to produce intuitive and simplified representations which can be easily used to solve problems. We reproduce this simplification process using a neural network to build a low dimensional state representation of the world from images acquired by a robot. As in Jonschkowski et al. 2015, we learn in an unsupervised way using prior knowledge about the world as loss functions called robotic priors and extend this approach to high dimension richer images to learn a 3D representation of the hand position of a robot from RGB images. We propose a quantitative evaluation of the learned representation using nearest neighbors in the state space that allows to assess its quality and show both the potential and limitations of robotic priors in realistic environments. We augment image size, add distractors and domain randomization, all crucial components to achieve transfer learning to real robots. Finally, we also contribute a new prior to improve the robustness of the representation. The applications of such low dimensional state representation range from easing reinforcement learning (RL) and knowledge transfer across tasks, to facilitating learning from raw data with more efficient and compact high level representations. The results show that the robotic prior approach is able to extract high level representation as the 3D position of an arm and organize it into a compact and coherent space of states in a challenging dataset.

Submitted 15 September, 2017; originally announced September 2017.

Comments: ICRA 2018 submission

arXiv:1603.09200 [pdf, other]

Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos

Authors: Alejandro Betancourt, Natalia Díaz-Rodríguez, Emilia Barakova, Lucio Marcenaro, Matthias Rauterberg, Carlo Regazzoni

Abstract: Wearable cameras stand out as one of the most promising devices for the upcoming years, and as a consequence, the demand of computer algorithms to automatically understand the videos recorded with them is increasing quickly. An automatic understanding of these videos is not an easy task, and its mobile nature implies important challenges to be faced, such as the changing light conditions and the unrestricted locations recorded. This paper proposes an unsupervised strategy based on global features and manifold learning to endow wearable cameras with contextual information regarding the light conditions and the location captured. Results show that non-linear manifold methods can capture contextual patterns from global features without compromising large computational resources. The proposed strategy is used, as an application case, as a switching mechanism to improve the hand-detection problem in egocentric videos.

Submitted 27 March, 2017; v1 submitted 30 March, 2016; originally announced March 2016.

Comments: Submitted for publication

Showing 1–32 of 32 results for author: Díaz-Rodríguez, N