Search | arXiv e-print repository

I Know How: Combining Prior Policies to Solve New Tasks

Authors: Malio Li, Elia Piccoli, Vincenzo Lomonaco, Davide Bacciu

Abstract: Multi-Task Reinforcement Learning aims at develo** agents that are able to continually evolve and adapt to new scenarios. However, this goal is challenging to achieve due to the phenomenon of catastrophic forgetting and the high demand of computational resources. Learning from scratch for each new task is not a viable or sustainable option, and thus agents should be able to collect and exploit p… ▽ More Multi-Task Reinforcement Learning aims at develo** agents that are able to continually evolve and adapt to new scenarios. However, this goal is challenging to achieve due to the phenomenon of catastrophic forgetting and the high demand of computational resources. Learning from scratch for each new task is not a viable or sustainable option, and thus agents should be able to collect and exploit prior knowledge while facing new problems. While several methodologies have attempted to address the problem from different perspectives, they lack a common structure. In this work, we propose a new framework, I Know How (IKH), which provides a common formalization. Our methodology focuses on modularity and compositionality of knowledge in order to achieve and enhance agent's ability to learn and adapt efficiently to dynamic environments. To support our framework definition, we present a simple application of it in a simulated driving environment and compare its performance with that of state-of-the-art approaches. △ Less

Submitted 14 June, 2024; originally announced June 2024.

Comments: 7 pages, Conference on Games (CoG) 2024

arXiv:2405.04101 [pdf, other]

Continual Learning in the Presence of Repetition

Authors: Hamed Hemati, Lorenzo Pellegrini, Xiaotian Duan, Zixuan Zhao, Fangfang Xia, Marc Masana, Benedikt Tscheschner, Eduardo Veas, Yuxiang Zheng, Shiji Zhao, Shao-Yuan Li, Sheng-Jun Huang, Vincenzo Lomonaco, Gido M. van de Ven

Abstract: Continual learning (CL) provides a framework for training models in ever-evolving environments. Although re-occurrence of previously seen objects or tasks is common in real-world problems, the concept of repetition in the data stream is not often considered in standard benchmarks for CL. Unlike with the rehearsal mechanism in buffer-based strategies, where sample repetition is controlled by the st… ▽ More Continual learning (CL) provides a framework for training models in ever-evolving environments. Although re-occurrence of previously seen objects or tasks is common in real-world problems, the concept of repetition in the data stream is not often considered in standard benchmarks for CL. Unlike with the rehearsal mechanism in buffer-based strategies, where sample repetition is controlled by the strategy, repetition in the data stream naturally stems from the environment. This report provides a summary of the CLVision challenge at CVPR 2023, which focused on the topic of repetition in class-incremental learning. The report initially outlines the challenge objective and then describes three solutions proposed by finalist teams that aim to effectively exploit the repetition in the stream to learn continually. The experimental results from the challenge highlight the effectiveness of ensemble-based solutions that employ multiple versions of similar modules, each trained on different but overlap** subsets of classes. This report underscores the transformative potential of taking a different perspective in CL by employing repetition in the data stream to foster innovative strategy design. △ Less

Submitted 7 May, 2024; originally announced May 2024.

Comments: Preprint; Challenge Report of the 4th Workshop on Continual Learning in Computer Vision at CVPR

arXiv:2404.07817 [pdf, other]

Calibration of Continual Learning Models

Authors: Lanpei Li, Elia Piccoli, Andrea Cossu, Davide Bacciu, Vincenzo Lomonaco

Abstract: Continual Learning (CL) focuses on maximizing the predictive performance of a model across a non-stationary stream of data. Unfortunately, CL models tend to forget previous knowledge, thus often underperforming when compared with an offline model trained jointly on the entire data stream. Given that any CL model will eventually make mistakes, it is of crucial importance to build calibrated CL mode… ▽ More Continual Learning (CL) focuses on maximizing the predictive performance of a model across a non-stationary stream of data. Unfortunately, CL models tend to forget previous knowledge, thus often underperforming when compared with an offline model trained jointly on the entire data stream. Given that any CL model will eventually make mistakes, it is of crucial importance to build calibrated CL models: models that can reliably tell their confidence when making a prediction. Model calibration is an active research topic in machine learning, yet to be properly investigated in CL. We provide the first empirical study of the behavior of calibration approaches in CL, showing that CL strategies do not inherently learn calibrated models. To mitigate this issue, we design a continual calibration approach that improves the performance of post-processing calibration methods over a wide range of different benchmarks and CL strategies. CL does not necessarily need perfect predictive models, but rather it can benefit from reliable predictive models. We believe our study on continual calibration represents a first step towards this direction. △ Less

Submitted 12 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

Comments: Accepted at CLVISION workshop, CVPR 2024

arXiv:2404.04219 [pdf, other]

Continual Policy Distillation of Reinforcement Learning-based Controllers for Soft Robotic In-Hand Manipulation

Authors: Lanpei Li, Enrico Donato, Vincenzo Lomonaco, Egidio Falotico

Abstract: Dexterous manipulation, often facilitated by multi-fingered robotic hands, holds solid impact for real-world applications. Soft robotic hands, due to their compliant nature, offer flexibility and adaptability during object gras** and manipulation. Yet, benefits come with challenges, particularly in the control development for finger coordination. Reinforcement Learning (RL) can be employed to tr… ▽ More Dexterous manipulation, often facilitated by multi-fingered robotic hands, holds solid impact for real-world applications. Soft robotic hands, due to their compliant nature, offer flexibility and adaptability during object gras** and manipulation. Yet, benefits come with challenges, particularly in the control development for finger coordination. Reinforcement Learning (RL) can be employed to train object-specific in-hand manipulation policies, but limiting adaptability and generalizability. We introduce a Continual Policy Distillation (CPD) framework to acquire a versatile controller for in-hand manipulation, to rotate different objects in shape and size within a four-fingered soft gripper. The framework leverages Policy Distillation (PD) to transfer knowledge from expert policies to a continually evolving student policy network. Exemplar-based rehearsal methods are then integrated to mitigate catastrophic forgetting and enhance generalization. The performance of the CPD framework over various replay strategies demonstrates its effectiveness in consolidating knowledge from multiple experts and achieving versatile and adaptive behaviours for in-hand manipulation tasks. △ Less

Submitted 5 April, 2024; originally announced April 2024.

Comments: Accepted for presentation at IEEE RoboSoft 2024

arXiv:2403.07015 [pdf, other]

Adaptive Hyperparameter Optimization for Continual Learning Scenarios

Authors: Rudy Semola, Julio Hurtado, Vincenzo Lomonaco, Davide Bacciu

Abstract: Hyperparameter selection in continual learning scenarios is a challenging and underexplored aspect, especially in practical non-stationary environments. Traditional approaches, such as grid searches with held-out validation data from all tasks, are unrealistic for building accurate lifelong learning systems. This paper aims to explore the role of hyperparameter selection in continual learning and… ▽ More Hyperparameter selection in continual learning scenarios is a challenging and underexplored aspect, especially in practical non-stationary environments. Traditional approaches, such as grid searches with held-out validation data from all tasks, are unrealistic for building accurate lifelong learning systems. This paper aims to explore the role of hyperparameter selection in continual learning and the necessity of continually and automatically tuning them according to the complexity of the task at hand. Hence, we propose leveraging the nature of sequence task learning to improve Hyperparameter Optimization efficiency. By using the functional analysis of variance-based techniques, we identify the most crucial hyperparameters that have an impact on performance. We demonstrate empirically that this approach, agnostic to continual scenarios and strategies, allows us to speed up hyperparameters optimization continually across tasks and exhibit robustness even in the face of varying sequential task orders. We believe that our findings can contribute to the advancement of continual learning methodologies towards more efficient, robust and adaptable models for real-world applications. △ Less

Submitted 19 June, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

arXiv:2311.11908 [pdf, other]

Continual Learning: Applications and the Road Forward

Authors: Eli Verwimp, Rahaf Aljundi, Shai Ben-David, Matthias Bethge, Andrea Cossu, Alexander Gepperth, Tyler L. Hayes, Eyke Hüllermeier, Christopher Kanan, Dhireesha Kudithipudi, Christoph H. Lampert, Martin Mundt, Razvan Pascanu, Adrian Popescu, Andreas S. Tolias, Joost van de Weijer, Bing Liu, Vincenzo Lomonaco, Tinne Tuytelaars, Gido M. van de Ven

Abstract: Continual learning is a subfield of machine learning, which aims to allow machine learning models to continuously learn on new data, by accumulating knowledge without forgetting what was learned in the past. In this work, we take a step back, and ask: "Why should one care about continual learning in the first place?". We set the stage by examining recent continual learning papers published at four… ▽ More Continual learning is a subfield of machine learning, which aims to allow machine learning models to continuously learn on new data, by accumulating knowledge without forgetting what was learned in the past. In this work, we take a step back, and ask: "Why should one care about continual learning in the first place?". We set the stage by examining recent continual learning papers published at four major machine learning conferences, and show that memory-constrained settings dominate the field. Then, we discuss five open problems in machine learning, and even though they might seem unrelated to continual learning at first sight, we show that continual learning will inevitably be part of their solution. These problems are model editing, personalization and specialization, on-device learning, faster (re-)training and reinforcement learning. Finally, by comparing the desiderata from these unsolved problems and the current assumptions in continual learning, we highlight and discuss four future directions for continual learning research. We hope that this work offers an interesting perspective on the future of continual learning, while displaying its potential value and the paths we have to pursue in order to make it successful. This work is the result of the many discussions the authors had at the Dagstuhl seminar on Deep Continual Learning, in March 2023. △ Less

Submitted 28 March, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

Journal ref: Transactions on Machine Learning Research (TMLR), 2024

arXiv:2310.04467 [pdf, other]

Design Principles for Lifelong Learning AI Accelerators

Authors: Dhireesha Kudithipudi, Anurag Daram, Abdullah M. Zyarah, Fatima Tuz Zohora, James B. Aimone, Angel Yanguas-Gil, Nicholas Soures, Emre Neftci, Matthew Mattina, Vincenzo Lomonaco, Clare D. Thiem, Benjamin Epstein

Abstract: Lifelong learning - an agent's ability to learn throughout its lifetime - is a hallmark of biological learning systems and a central challenge for artificial intelligence (AI). The development of lifelong learning algorithms could lead to a range of novel AI applications, but this will also require the development of appropriate hardware accelerators, particularly if the models are to be deployed… ▽ More Lifelong learning - an agent's ability to learn throughout its lifetime - is a hallmark of biological learning systems and a central challenge for artificial intelligence (AI). The development of lifelong learning algorithms could lead to a range of novel AI applications, but this will also require the development of appropriate hardware accelerators, particularly if the models are to be deployed on edge platforms, which have strict size, weight, and power constraints. Here, we explore the design of lifelong learning AI accelerators that are intended for deployment in untethered environments. We identify key desirable capabilities for lifelong learning accelerators and highlight metrics to evaluate such accelerators. We then discuss current edge AI accelerators and explore the future design of lifelong learning accelerators, considering the role that different emerging technologies could play. △ Less

Submitted 5 October, 2023; originally announced October 2023.

arXiv:2309.12727 [pdf, other]

In-context Interference in Chat-based Large Language Models

Authors: Eric Nuertey Coleman, Julio Hurtado, Vincenzo Lomonaco

Abstract: Large language models (LLMs) have had a huge impact on society due to their impressive capabilities and vast knowledge of the world. Various applications and tools have been created that allow users to interact with these models in a black-box scenario. However, one limitation of this scenario is that users cannot modify the internal knowledge of the model, and the only way to add or modify intern… ▽ More Large language models (LLMs) have had a huge impact on society due to their impressive capabilities and vast knowledge of the world. Various applications and tools have been created that allow users to interact with these models in a black-box scenario. However, one limitation of this scenario is that users cannot modify the internal knowledge of the model, and the only way to add or modify internal knowledge is by explicitly mentioning it to the model during the current interaction. This learning process is called in-context training, and it refers to training that is confined to the user's current session or context. In-context learning has significant applications, but also has limitations that are seldom studied. In this paper, we present a study that shows how the model can suffer from interference between information that continually flows in the context, causing it to forget previously learned knowledge, which can reduce the model's performance. Along with showing the problem, we propose an evaluation benchmark based on the bAbI dataset. △ Less

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2308.10328 [pdf, other]

A Comprehensive Empirical Evaluation on Online Continual Learning

Authors: Albin Soutif--Cormerais, Antonio Carta, Andrea Cossu, Julio Hurtado, Hamed Hemati, Vincenzo Lomonaco, Joost Van de Weijer

Abstract: Online continual learning aims to get closer to a live learning experience by learning directly on a stream of data with temporally shifting distribution and by storing a minimum amount of data from that stream. In this empirical evaluation, we evaluate various methods from the literature that tackle online continual learning. More specifically, we focus on the class-incremental setting in the con… ▽ More Online continual learning aims to get closer to a live learning experience by learning directly on a stream of data with temporally shifting distribution and by storing a minimum amount of data from that stream. In this empirical evaluation, we evaluate various methods from the literature that tackle online continual learning. More specifically, we focus on the class-incremental setting in the context of image classification, where the learner must learn new classes incrementally from a stream of data. We compare these methods on the Split-CIFAR100 and Split-TinyImagenet benchmarks, and measure their average accuracy, forgetting, stability, and quality of the representations, to evaluate various aspects of the algorithm at the end but also during the whole training period. We find that most methods suffer from stability and underfitting issues. However, the learned representations are comparable to i.i.d. training under the same computational budget. No clear winner emerges from the results and basic experience replay, when properly tuned and implemented, is a very strong baseline. We release our modular and extensible codebase at https://github.com/AlbinSou/ocl_survey based on the avalanche framework to reproduce our results and encourage future research. △ Less

Submitted 23 September, 2023; v1 submitted 20 August, 2023; originally announced August 2023.

Comments: ICCV Visual Continual Learning Workshop 2023 accepted paper

arXiv:2307.08532 [pdf, other]

LuckyMera: a Modular AI Framework for Building Hybrid NetHack Agents

Authors: Luigi Quarantiello, Simone Marzeddu, Antonio Guzzi, Vincenzo Lomonaco

Abstract: In the last few decades we have witnessed a significant development in Artificial Intelligence (AI) thanks to the availability of a variety of testbeds, mostly based on simulated environments and video games. Among those, roguelike games offer a very good trade-off in terms of complexity of the environment and computational costs, which makes them perfectly suited to test AI agents generalization… ▽ More In the last few decades we have witnessed a significant development in Artificial Intelligence (AI) thanks to the availability of a variety of testbeds, mostly based on simulated environments and video games. Among those, roguelike games offer a very good trade-off in terms of complexity of the environment and computational costs, which makes them perfectly suited to test AI agents generalization capabilities. In this work, we present LuckyMera, a flexible, modular, extensible and configurable AI framework built around NetHack, a popular terminal-based, single-player roguelike video game. This library is aimed at simplifying and speeding up the development of AI agents capable of successfully playing the game and offering a high-level interface for designing game strategies. LuckyMera comes with a set of off-the-shelf symbolic and neural modules (called "skills"): these modules can be either hard-coded behaviors, or neural Reinforcement Learning approaches, with the possibility of creating compositional hybrid solutions. Additionally, LuckyMera comes with a set of utility features to save its experiences in the form of trajectories for further analysis and to use them as datasets to train neural modules, with a direct interface to the NetHack Learning Environment and MiniHack. Through an empirical evaluation we validate our skills implementation and propose a strong baseline agent that can reach state-of-the-art performances in the complete NetHack game. LuckyMera is open-source and available at https://github.com/Pervasive-AI-Lab/LuckyMera. △ Less

Submitted 17 July, 2023; originally announced July 2023.

arXiv:2306.10724 [pdf, other]

Partial Hypernetworks for Continual Learning

Authors: Hamed Hemati, Vincenzo Lomonaco, Davide Bacciu, Damian Borth

Abstract: Hypernetworks mitigate forgetting in continual learning (CL) by generating task-dependent weights and penalizing weight changes at a meta-model level. Unfortunately, generating all weights is not only computationally expensive for larger architectures, but also, it is not well understood whether generating all model weights is necessary. Inspired by latent replay methods in CL, we propose partial… ▽ More Hypernetworks mitigate forgetting in continual learning (CL) by generating task-dependent weights and penalizing weight changes at a meta-model level. Unfortunately, generating all weights is not only computationally expensive for larger architectures, but also, it is not well understood whether generating all model weights is necessary. Inspired by latent replay methods in CL, we propose partial weight generation for the final layers of a model using hypernetworks while freezing the initial layers. With this objective, we first answer the question of how many layers can be frozen without compromising the final performance. Through several experiments, we empirically show that the number of layers that can be frozen is proportional to the distributional similarity in the CL stream. Then, to demonstrate the effectiveness of hypernetworks, we show that noisy streams can significantly impact the performance of latent replay methods, leading to increased forgetting when features from noisy experiences are replayed with old samples. In contrast, partial hypernetworks are more robust to noise by maintaining accuracy on previous experiences. Finally, we conduct experiments on the split CIFAR-100 and TinyImagenet benchmarks and compare different versions of partial hypernetworks to latent replay methods. We conclude that partial weight generation using hypernetworks is a promising solution to the problem of forgetting in neural networks. It can provide an effective balance between computation and final test accuracy in CL streams. △ Less

Submitted 19 June, 2023; originally announced June 2023.

Comments: Accepted to the 2nd Conference on Lifelong Learning Agents (CoLLAs), 2023

arXiv:2306.09890 [pdf, other]

Studying Generalization on Memory-Based Methods in Continual Learning

Authors: Felipe del Rio, Julio Hurtado, Cristian Buc, Alvaro Soto, Vincenzo Lomonaco

Abstract: One of the objectives of Continual Learning is to learn new concepts continually over a stream of experiences and at the same time avoid catastrophic forgetting. To mitigate complete knowledge overwriting, memory-based methods store a percentage of previous data distributions to be used during training. Although these methods produce good results, few studies have tested their out-of-distribution… ▽ More One of the objectives of Continual Learning is to learn new concepts continually over a stream of experiences and at the same time avoid catastrophic forgetting. To mitigate complete knowledge overwriting, memory-based methods store a percentage of previous data distributions to be used during training. Although these methods produce good results, few studies have tested their out-of-distribution generalization properties, as well as whether these methods overfit the replay memory. In this work, we show that although these methods can help in traditional in-distribution generalization, they can strongly impair out-of-distribution generalization by learning spurious features and correlations. Using a controlled environment, the Synbol benchmark generator (Lacoste et al., 2020), we demonstrate that this lack of out-of-distribution generalization mainly occurs in the linear classifier. △ Less

Submitted 20 June, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

arXiv:2303.15888 [pdf, other]

Projected Latent Distillation for Data-Agnostic Consolidation in Distributed Continual Learning

Authors: Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, Davide Bacciu, Joost van de Weijer

Abstract: Distributed learning on the edge often comprises self-centered devices (SCD) which learn local tasks independently and are unwilling to contribute to the performance of other SDCs. How do we achieve forward transfer at zero cost for the single SCDs? We formalize this problem as a Distributed Continual Learning scenario, where SCD adapt to local tasks and a CL model consolidates the knowledge from… ▽ More Distributed learning on the edge often comprises self-centered devices (SCD) which learn local tasks independently and are unwilling to contribute to the performance of other SDCs. How do we achieve forward transfer at zero cost for the single SCDs? We formalize this problem as a Distributed Continual Learning scenario, where SCD adapt to local tasks and a CL model consolidates the knowledge from the resulting stream of models without looking at the SCD's private data. Unfortunately, current CL methods are not directly applicable to this scenario. We propose Data-Agnostic Consolidation (DAC), a novel double knowledge distillation method that consolidates the stream of SC models without using the original data. DAC performs distillation in the latent space via a novel Projected Latent Distillation loss. Experimental results show that DAC enables forward transfer between SCDs and reaches state-of-the-art accuracy on Split CIFAR100, CORe50 and Split TinyImageNet, both in reharsal-free and distributed CL scenarios. Somewhat surprisingly, even a single out-of-distribution image is sufficient as the only source of data during consolidation. △ Less

Submitted 28 March, 2023; originally announced March 2023.

arXiv:2302.01766 [pdf, other]

Avalanche: A PyTorch Library for Deep Continual Learning

Authors: Antonio Carta, Lorenzo Pellegrini, Andrea Cossu, Hamed Hemati, Vincenzo Lomonaco

Abstract: Continual learning is the problem of learning from a nonstationary stream of data, a fundamental issue for sustainable and efficient training of deep neural networks over time. Unfortunately, deep learning libraries only provide primitives for offline training, assuming that model's architecture and data are fixed. Avalanche is an open source library maintained by the ContinualAI non-profit organi… ▽ More Continual learning is the problem of learning from a nonstationary stream of data, a fundamental issue for sustainable and efficient training of deep neural networks over time. Unfortunately, deep learning libraries only provide primitives for offline training, assuming that model's architecture and data are fixed. Avalanche is an open source library maintained by the ContinualAI non-profit organization that extends PyTorch by providing first-class support for dynamic architectures, streams of datasets, and incremental training and evaluation methods. Avalanche provides a large set of predefined benchmarks and training algorithms and it is easy to extend and modular while supporting a wide range of continual learning scenarios. Documentation is available at \url{https://avalanche.continualai.org}. △ Less

Submitted 2 February, 2023; originally announced February 2023.

arXiv:2301.12467 [pdf]

doi 10.1016/j.iswa.2023.200251

Continual Learning for Predictive Maintenance: Overview and Challenges

Authors: Julio Hurtado, Dario Salvati, Rudy Semola, Mattia Bosio, Vincenzo Lomonaco

Abstract: Deep learning techniques have become one of the main propellers for solving engineering problems effectively and efficiently. For instance, Predictive Maintenance methods have been used to improve predictions of when maintenance is needed on different machines and operative contexts. However, deep learning methods are not without limitations, as these models are normally trained on a fixed distrib… ▽ More Deep learning techniques have become one of the main propellers for solving engineering problems effectively and efficiently. For instance, Predictive Maintenance methods have been used to improve predictions of when maintenance is needed on different machines and operative contexts. However, deep learning methods are not without limitations, as these models are normally trained on a fixed distribution that only reflects the current state of the problem. Due to internal or external factors, the state of the problem can change, and the performance decreases due to the lack of generalization and adaptation. Contrary to this stationary training set, real-world applications change their environments constantly, creating the need to constantly adapt the model to evolving scenarios. To aid in this endeavor, Continual Learning methods propose ways to constantly adapt prediction models and incorporate new knowledge after deployment. Despite the advantages of these techniques, there are still challenges to applying them to real-world problems. In this work, we present a brief introduction to predictive maintenance, non-stationary environments, and continual learning, together with an extensive review of the current state of applying continual learning in real-world applications and specifically in predictive maintenance. We then discuss the current challenges of both predictive maintenance and continual learning, proposing future directions at the intersection of both areas. Finally, we propose a novel way to create benchmarks that favor the application of continuous learning methods in more realistic environments, giving specific examples of predictive maintenance. △ Less

Submitted 29 June, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

arXiv:2301.11396 [pdf, other]

Class-Incremental Learning with Repetition

Authors: Hamed Hemati, Andrea Cossu, Antonio Carta, Julio Hurtado, Lorenzo Pellegrini, Davide Bacciu, Vincenzo Lomonaco, Damian Borth

Abstract: Real-world data streams naturally include the repetition of previous concepts. From a Continual Learning (CL) perspective, repetition is a property of the environment and, unlike replay, cannot be controlled by the agent. Nowadays, the Class-Incremental (CI) scenario represents the leading test-bed for assessing and comparing CL strategies. This scenario type is very easy to use, but it never allo… ▽ More Real-world data streams naturally include the repetition of previous concepts. From a Continual Learning (CL) perspective, repetition is a property of the environment and, unlike replay, cannot be controlled by the agent. Nowadays, the Class-Incremental (CI) scenario represents the leading test-bed for assessing and comparing CL strategies. This scenario type is very easy to use, but it never allows revisiting previously seen classes, thus completely neglecting the role of repetition. We focus on the family of Class-Incremental with Repetition (CIR) scenario, where repetition is embedded in the definition of the stream. We propose two stochastic stream generators that produce a wide range of CIR streams starting from a single dataset and a few interpretable control parameters. We conduct the first comprehensive evaluation of repetition in CL by studying the behavior of existing CL strategies under different CIR streams. We then present a novel replay strategy that exploits repetition and counteracts the natural imbalance present in the stream. On both CIFAR100 and TinyImageNet, our strategy outperforms other replay approaches, which are not designed for environments with repetition. △ Less

Submitted 19 June, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: Accepted to the 2nd Conference on Lifelong Learning Agents (CoLLAs), 2023 19 pages

arXiv:2301.02464 [pdf, other]

Architect, Regularize and Replay (ARR): a Flexible Hybrid Approach for Continual Learning

Authors: Vincenzo Lomonaco, Lorenzo Pellegrini, Gabriele Graffieti, Davide Maltoni

Abstract: In recent years we have witnessed a renewed interest in machine learning methodologies, especially for deep representation learning, that could overcome basic i.i.d. assumptions and tackle non-stationary environments subject to various distributional shifts or sample selection biases. Within this context, several computational approaches based on architectural priors, regularizers and replay polic… ▽ More In recent years we have witnessed a renewed interest in machine learning methodologies, especially for deep representation learning, that could overcome basic i.i.d. assumptions and tackle non-stationary environments subject to various distributional shifts or sample selection biases. Within this context, several computational approaches based on architectural priors, regularizers and replay policies have been proposed with different degrees of success depending on the specific scenario in which they were developed and assessed. However, designing comprehensive hybrid solutions that can flexibly and generally be applied with tunable efficiency-effectiveness trade-offs still seems a distant goal. In this paper, we propose "Architect, Regularize and Replay" (ARR), an hybrid generalization of the renowned AR1 algorithm and its variants, that can achieve state-of-the-art results in classic scenarios (e.g. class-incremental learning) but also generalize to arbitrary data streams generated from real-world datasets such as CIFAR-100, CORe50 and ImageNet-1000. △ Less

Submitted 6 January, 2023; originally announced January 2023.

Comments: Book Chapter Preprint: 15 pages, 7 figures, 2 tables. arXiv admin note: text overlap with arXiv:1912.01100

arXiv:2212.06833 [pdf, other]

3rd Continual Learning Workshop Challenge on Egocentric Category and Instance Level Object Understanding

Authors: Lorenzo Pellegrini, Chenchen Zhu, Fanyi Xiao, Zhicheng Yan, Antonio Carta, Matthias De Lange, Vincenzo Lomonaco, Roshan Sumbaly, Pau Rodriguez, David Vazquez

Abstract: Continual Learning, also known as Lifelong or Incremental Learning, has recently gained renewed interest among the Artificial Intelligence research community. Recent research efforts have quickly led to the design of novel algorithms able to reduce the impact of the catastrophic forgetting phenomenon in deep neural networks. Due to this surge of interest in the field, many competitions have been h… ▽ More Continual Learning, also known as Lifelong or Incremental Learning, has recently gained renewed interest among the Artificial Intelligence research community. Recent research efforts have quickly led to the design of novel algorithms able to reduce the impact of the catastrophic forgetting phenomenon in deep neural networks. Due to this surge of interest in the field, many competitions have been held in recent years, as they are an excellent opportunity to stimulate research in promising directions. This paper summarizes the ideas, design choices, rules, and results of the challenge held at the 3rd Continual Learning in Computer Vision (CLVision) Workshop at CVPR 2022. The focus of this competition is the complex continual object detection task, which is still underexplored in literature compared to classification tasks. The challenge is based on the challenge version of the novel EgoObjects dataset, a large-scale egocentric object dataset explicitly designed to benchmark continual learning algorithms for egocentric category-/instance-level object understanding, which covers more than 1k unique main objects and 250+ categories in around 100k video frames. △ Less

Submitted 13 December, 2022; originally announced December 2022.

Comments: 21 pages, 12 figures, 5 tables

arXiv:2207.01145 [pdf, other]

Memory Population in Continual Learning via Outlier Elimination

Authors: Julio Hurtado, Alain Raymond-Saez, Vladimir Araujo, Vincenzo Lomonaco, Alvaro Soto, Davide Bacciu

Abstract: Catastrophic forgetting, the phenomenon of forgetting previously learned tasks when learning a new one, is a major hurdle in develo** continual learning algorithms. A popular method to alleviate forgetting is to use a memory buffer, which stores a subset of previously learned task examples for use during training on new tasks. The de facto method of filling memory is by randomly selecting previo… ▽ More Catastrophic forgetting, the phenomenon of forgetting previously learned tasks when learning a new one, is a major hurdle in develo** continual learning algorithms. A popular method to alleviate forgetting is to use a memory buffer, which stores a subset of previously learned task examples for use during training on new tasks. The de facto method of filling memory is by randomly selecting previous examples. However, this process could introduce outliers or noisy samples that could hurt the generalization of the model. This paper introduces Memory Outlier Elimination (MOE), a method for identifying and eliminating outliers in the memory buffer by choosing samples from label-homogeneous subpopulations. We show that a space with a high homogeneity is related to a feature space that is more representative of the class distribution. In practice, MOE removes a sample if it is surrounded by samples from different labels. We demonstrate the effectiveness of MOE on CIFAR-10, CIFAR-100, and CORe50, outperforming previous well-known memory population methods. △ Less

Submitted 3 October, 2023; v1 submitted 3 July, 2022; originally announced July 2022.

arXiv:2207.00010 [pdf, other]

Continual Learning for Human State Monitoring

Authors: Federico Matteoni, Andrea Cossu, Claudio Gallicchio, Vincenzo Lomonaco, Davide Bacciu

Abstract: Continual Learning (CL) on time series data represents a promising but under-studied avenue for real-world applications. We propose two new CL benchmarks for Human State Monitoring. We carefully designed the benchmarks to mirror real-world environments in which new subjects are continuously added. We conducted an empirical evaluation to assess the ability of popular CL strategies to mitigate forge… ▽ More Continual Learning (CL) on time series data represents a promising but under-studied avenue for real-world applications. We propose two new CL benchmarks for Human State Monitoring. We carefully designed the benchmarks to mirror real-world environments in which new subjects are continuously added. We conducted an empirical evaluation to assess the ability of popular CL strategies to mitigate forgetting in our benchmarks. Our results show that, possibly due to the domain-incremental properties of our benchmarks, forgetting can be easily tackled even with a simple finetuning and that existing strategies struggle in accumulating knowledge over a fixed, held-out, test subject. △ Less

Submitted 11 July, 2022; v1 submitted 29 June, 2022; originally announced July 2022.

Comments: 6 pages, 4 figures, 2 tables, Accepted as oral at ESANN 2022

arXiv:2206.06957 [pdf, other]

Continual-Learning-as-a-Service (CLaaS): On-Demand Efficient Adaptation of Predictive Models

Authors: Rudy Semola, Vincenzo Lomonaco, Davide Bacciu

Abstract: Predictive machine learning models nowadays are often updated in a stateless and expensive way. The two main future trends for companies that want to build machine learning-based applications and systems are real-time inference and continual updating. Unfortunately, both trends require a mature infrastructure that is hard and costly to realize on-premise. This paper defines a novel software servic… ▽ More Predictive machine learning models nowadays are often updated in a stateless and expensive way. The two main future trends for companies that want to build machine learning-based applications and systems are real-time inference and continual updating. Unfortunately, both trends require a mature infrastructure that is hard and costly to realize on-premise. This paper defines a novel software service and model delivery infrastructure termed Continual Learning-as-a-Service (CLaaS) to address these issues. Specifically, it embraces continual machine learning and continuous integration techniques. It provides support for model updating and validation tools for data scientists without an on-premise solution and in an efficient, stateful and easy-to-use manner. Finally, this CL model service is easy to encapsulate in any machine learning infrastructure or cloud system. This paper presents the design and implementation of a CLaaS instantiation, called LiquidBrain, evaluated in two real-world scenarios. The former is a robotic object recognition setting using the CORe50 dataset while the latter is a named category and attribute prediction using the DeepFashion-C dataset in the fashion domain. Our preliminary results suggest the usability and efficiency of the Continual Learning model services and the effectiveness of the solution in addressing real-world use-cases regardless of where the computation happens in the continuum Edge-Cloud. △ Less

Submitted 21 July, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

arXiv:2205.09357 [pdf, other]

Continual Pre-Training Mitigates Forgetting in Language and Vision

Authors: Andrea Cossu, Tinne Tuytelaars, Antonio Carta, Lucia Passaro, Vincenzo Lomonaco, Davide Bacciu

Abstract: Pre-trained models are nowadays a fundamental component of machine learning research. In continual learning, they are commonly used to initialize the model before training on the stream of non-stationary data. However, pre-training is rarely applied during continual learning. We formalize and investigate the characteristics of the continual pre-training scenario in both language and vision environ… ▽ More Pre-trained models are nowadays a fundamental component of machine learning research. In continual learning, they are commonly used to initialize the model before training on the stream of non-stationary data. However, pre-training is rarely applied during continual learning. We formalize and investigate the characteristics of the continual pre-training scenario in both language and vision environments, where a model is continually pre-trained on a stream of incoming data and only later fine-tuned to different downstream tasks. We show that continually pre-trained models are robust against catastrophic forgetting and we provide strong empirical evidence supporting the fact that self-supervised pre-training is more effective in retaining previous knowledge than supervised protocols. Code is provided at https://github.com/AndreaCossu/continual-pretraining-nlp-vision . △ Less

Submitted 19 May, 2022; originally announced May 2022.

Comments: under review

arXiv:2204.05842 [pdf, other]

Generative Negative Replay for Continual Learning

Authors: Gabriele Graffieti, Davide Maltoni, Lorenzo Pellegrini, Vincenzo Lomonaco

Abstract: Learning continually is a key aspect of intelligence and a necessary ability to solve many real-life problems. One of the most effective strategies to control catastrophic forgetting, the Achilles' heel of continual learning, is storing part of the old data and replaying them interleaved with new experiences (also known as the replay approach). Generative replay, which is using generative models t… ▽ More Learning continually is a key aspect of intelligence and a necessary ability to solve many real-life problems. One of the most effective strategies to control catastrophic forgetting, the Achilles' heel of continual learning, is storing part of the old data and replaying them interleaved with new experiences (also known as the replay approach). Generative replay, which is using generative models to provide replay patterns on demand, is particularly intriguing, however, it was shown to be effective mainly under simplified assumptions, such as simple scenarios and low-dimensional data. In this paper, we show that, while the generated data are usually not able to improve the classification accuracy for the old classes, they can be effective as negative examples (or antagonists) to better learn the new classes, especially when the learning experiences are small and contain examples of just one or few classes. The proposed approach is validated on complex class-incremental and data-incremental continual learning scenarios (CORe50 and ImageNet-1000) composed of high-dimensional data and a large number of training experiences: a setup where existing generative replay approaches usually fail. △ Less

Submitted 12 April, 2022; originally announced April 2022.

Comments: 18 pages, 10 figures, 16 tables, 2 algorithms. Under review

arXiv:2203.10317 [pdf, ps, other]

doi 10.1007/978-3-031-13324-4_47

Practical Recommendations for Replay-based Continual Learning Methods

Authors: Gabriele Merlin, Vincenzo Lomonaco, Andrea Cossu, Antonio Carta, Davide Bacciu

Abstract: Continual Learning requires the model to learn from a stream of dynamic, non-stationary data without forgetting previous knowledge. Several approaches have been developed in the literature to tackle the Continual Learning challenge. Among them, Replay approaches have empirically proved to be the most effective ones. Replay operates by saving some samples in memory which are then used to rehearse k… ▽ More Continual Learning requires the model to learn from a stream of dynamic, non-stationary data without forgetting previous knowledge. Several approaches have been developed in the literature to tackle the Continual Learning challenge. Among them, Replay approaches have empirically proved to be the most effective ones. Replay operates by saving some samples in memory which are then used to rehearse knowledge during training in subsequent tasks. However, an extensive comparison and deeper understanding of different replay implementation subtleties is still missing in the literature. The aim of this work is to compare and analyze existing replay-based strategies and provide practical recommendations on develo** efficient, effective and generally applicable replay-based strategies. In particular, we investigate the role of the memory size value, different weighting policies and discuss about the impact of data augmentation, which allows reaching better performance with lower memory sizes. △ Less

Submitted 19 March, 2022; originally announced March 2022.

Journal ref: ICIAP 2022 Workshops

arXiv:2202.13657 [pdf, other]

Avalanche RL: a Continual Reinforcement Learning Library

Authors: Nicolò Lucchesi, Antonio Carta, Vincenzo Lomonaco, Davide Bacciu

Abstract: Continual Reinforcement Learning (CRL) is a challenging setting where an agent learns to interact with an environment that is constantly changing over time (the stream of experiences). In this paper, we describe Avalanche RL, a library for Continual Reinforcement Learning which allows to easily train agents on a continuous stream of tasks. Avalanche RL is based on PyTorch and supports any OpenAI G… ▽ More Continual Reinforcement Learning (CRL) is a challenging setting where an agent learns to interact with an environment that is constantly changing over time (the stream of experiences). In this paper, we describe Avalanche RL, a library for Continual Reinforcement Learning which allows to easily train agents on a continuous stream of tasks. Avalanche RL is based on PyTorch and supports any OpenAI Gym environment. Its design is based on Avalanche, one of the more popular continual learning libraries, which allow us to reuse a large number of continual learning strategies and improve the interaction between reinforcement learning and continual learning researchers. Additionally, we propose Continual Habitat-Lab, a novel benchmark and a high-level library which enables the usage of the photorealistic simulator Habitat-Sim for CRL research. Overall, Avalanche RL attempts to unify under a common framework continual reinforcement learning applications, which we hope will foster the growth of the field. △ Less

Submitted 24 March, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

Comments: Presented at the 21st International Conference on Image Analysis and Processing (ICIAP 2021)

arXiv:2202.01645 [pdf, other]

AI-as-a-Service Toolkit for Human-Centered Intelligence in Autonomous Driving

Authors: Valerio De Caro, Saira Bano, Achilles Machumilane, Alberto Gotta, Pietro Cassará, Antonio Carta, Rudy Semola, Christos Sardianos, Christos Chronis, Iraklis Varlamis, Konstantinos Tserpes, Vincenzo Lomonaco, Claudio Gallicchio, Davide Bacciu

Abstract: This paper presents a proof-of-concept implementation of the AI-as-a-Service toolkit developed within the H2020 TEACHING project and designed to implement an autonomous driving personalization system according to the output of an automatic driver's stress recognition algorithm, both of them realizing a Cyber-Physical System of Systems. In addition, we implemented a data-gathering subsystem to coll… ▽ More This paper presents a proof-of-concept implementation of the AI-as-a-Service toolkit developed within the H2020 TEACHING project and designed to implement an autonomous driving personalization system according to the output of an automatic driver's stress recognition algorithm, both of them realizing a Cyber-Physical System of Systems. In addition, we implemented a data-gathering subsystem to collect data from different sensors, i.e., wearables and cameras, to automatize stress recognition. The system was attached for testing to a driving simulation software, CARLA, which allows testing the approach's feasibility with minimum cost and without putting at risk drivers and passengers. At the core of the relative subsystems, different learning algorithms were implemented using Deep Neural Networks, Recurrent Neural Networks, and Reinforcement Learning. △ Less

Submitted 9 February, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

arXiv:2112.06511 [pdf, other]

Ex-Model: Continual Learning from a Stream of Trained Models

Authors: Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, Davide Bacciu

Abstract: Learning continually from non-stationary data streams is a challenging research topic of growing popularity in the last few years. Being able to learn, adapt, and generalize continually in an efficient, effective, and scalable way is fundamental for a sustainable development of Artificial Intelligent systems. However, an agent-centric view of continual learning requires learning directly from raw… ▽ More Learning continually from non-stationary data streams is a challenging research topic of growing popularity in the last few years. Being able to learn, adapt, and generalize continually in an efficient, effective, and scalable way is fundamental for a sustainable development of Artificial Intelligent systems. However, an agent-centric view of continual learning requires learning directly from raw data, which limits the interaction between independent agents, the efficiency, and the privacy of current approaches. Instead, we argue that continual learning systems should exploit the availability of compressed information in the form of trained models. In this paper, we introduce and formalize a new paradigm named "Ex-Model Continual Learning" (ExML), where an agent learns from a sequence of previously trained models instead of raw data. We further contribute with three ex-model continual learning algorithms and an empirical setting comprising three datasets (MNIST, CIFAR-10 and CORe50), and eight scenarios, where the proposed algorithms are extensively tested. Finally, we highlight the peculiarities of the ex-model paradigm and we point out interesting future research directions. △ Less

Submitted 13 December, 2021; originally announced December 2021.

arXiv:2112.02925 [pdf, other]

Is Class-Incremental Enough for Continual Learning?

Authors: Andrea Cossu, Gabriele Graffieti, Lorenzo Pellegrini, Davide Maltoni, Davide Bacciu, Antonio Carta, Vincenzo Lomonaco

Abstract: The ability of a model to learn continually can be empirically assessed in different continual learning scenarios. Each scenario defines the constraints and the opportunities of the learning environment. Here, we challenge the current trend in the continual learning literature to experiment mainly on class-incremental scenarios, where classes present in one experience are never revisited. We posit… ▽ More The ability of a model to learn continually can be empirically assessed in different continual learning scenarios. Each scenario defines the constraints and the opportunities of the learning environment. Here, we challenge the current trend in the continual learning literature to experiment mainly on class-incremental scenarios, where classes present in one experience are never revisited. We posit that an excessive focus on this setting may be limiting for future research on continual learning, since class-incremental scenarios artificially exacerbate catastrophic forgetting, at the expense of other important objectives like forward transfer and computational efficiency. In many real-world environments, in fact, repetition of previously encountered concepts occurs naturally and contributes to softening the disruption of previous knowledge. We advocate for a more in-depth study of alternative continual learning scenarios, in which repetition is integrated by design in the stream of incoming information. Starting from already existing proposals, we describe the advantages such class-incremental with repetition scenarios could offer for a more comprehensive assessment of continual learning models. △ Less

Submitted 6 December, 2021; originally announced December 2021.

Comments: Under review

arXiv:2111.09437 [pdf, other]

Sustainable Artificial Intelligence through Continual Learning

Authors: Andrea Cossu, Marta Ziosi, Vincenzo Lomonaco

Abstract: The increasing attention on Artificial Intelligence (AI) regulation has led to the definition of a set of ethical principles grouped into the Sustainable AI framework. In this article, we identify Continual Learning, an active area of AI research, as a promising approach towards the design of systems compliant with the Sustainable AI principles. While Sustainable AI outlines general desiderata for… ▽ More The increasing attention on Artificial Intelligence (AI) regulation has led to the definition of a set of ethical principles grouped into the Sustainable AI framework. In this article, we identify Continual Learning, an active area of AI research, as a promising approach towards the design of systems compliant with the Sustainable AI principles. While Sustainable AI outlines general desiderata for ethical applications, Continual Learning provides means to put such desiderata into practice. △ Less

Submitted 17 November, 2021; originally announced November 2021.

Comments: Accepted at the 2021 International Conference on AI for People (CAIP)

arXiv:2110.14613 [pdf, other]

International Workshop on Continual Semi-Supervised Learning: Introduction, Benchmarks and Baselines

Authors: Ajmal Shahbaz, Salman Khan, Mohammad Asiful Hossain, Vincenzo Lomonaco, Kevin Cannons, Zhan Xu, Fabio Cuzzolin

Abstract: The aim of this paper is to formalize a new continual semi-supervised learning (CSSL) paradigm, proposed to the attention of the machine learning community via the IJCAI 2021 International Workshop on Continual Semi-Supervised Learning (CSSL-IJCAI), with the aim of raising field awareness about this problem and mobilizing its effort in this direction. After a formal definition of continual semi-su… ▽ More The aim of this paper is to formalize a new continual semi-supervised learning (CSSL) paradigm, proposed to the attention of the machine learning community via the IJCAI 2021 International Workshop on Continual Semi-Supervised Learning (CSSL-IJCAI), with the aim of raising field awareness about this problem and mobilizing its effort in this direction. After a formal definition of continual semi-supervised learning and the appropriate training and testing protocols, the paper introduces two new benchmarks specifically designed to assess CSSL on two important computer vision tasks: activity recognition and crowd counting. We describe the Continual Activity Recognition (CAR) and Continual Crowd Counting (CCC) challenges built upon those benchmarks, the baseline models proposed for the challenges, and describe a simple CSSL baseline which consists in applying batch self-training in temporal sessions, for a limited number of rounds. The results show that learning from unlabelled data streams is extremely challenging, and stimulate the search for methods that can encode the dynamics of the data stream. △ Less

Submitted 27 October, 2021; originally announced October 2021.

arXiv:2107.06543 [pdf, other]

TEACHING -- Trustworthy autonomous cyber-physical applications through human-centred intelligence

Authors: Davide Bacciu, Siranush Akarmazyan, Eric Armengaud, Manlio Bacco, George Bravos, Calogero Calandra, Emanuele Carlini, Antonio Carta, Pietro Cassara, Massimo Coppola, Charalampos Davalas, Patrizio Dazzi, Maria Carmela Degennaro, Daniele Di Sarli, Jürgen Dobaj, Claudio Gallicchio, Sylvain Girbal, Alberto Gotta, Riccardo Groppo, Vincenzo Lomonaco, Georg Macher, Daniele Mazzei, Gabriele Mencagli, Dimitrios Michail, Alessio Micheli , et al. (10 additional authors not shown)

Abstract: This paper discusses the perspective of the H2020 TEACHING project on the next generation of autonomous applications running in a distributed and highly heterogeneous environment comprising both virtual and physical resources spanning the edge-cloud continuum. TEACHING puts forward a human-centred vision leveraging the physiological, emotional, and cognitive state of the users as a driver for the… ▽ More This paper discusses the perspective of the H2020 TEACHING project on the next generation of autonomous applications running in a distributed and highly heterogeneous environment comprising both virtual and physical resources spanning the edge-cloud continuum. TEACHING puts forward a human-centred vision leveraging the physiological, emotional, and cognitive state of the users as a driver for the adaptation and optimization of the autonomous applications. It does so by building a distributed, embedded and federated learning system complemented by methods and tools to enforce its dependability, security and privacy preservation. The paper discusses the main concepts of the TEACHING approach and singles out the main AI-related research challenges associated with it. Further, we provide a discussion of the design choices for the TEACHING system to tackle the aforementioned challenges △ Less

Submitted 14 July, 2021; originally announced July 2021.

arXiv:2105.13127 [pdf, other]

Continual Learning at the Edge: Real-Time Training on Smartphone Devices

Authors: Lorenzo Pellegrini, Vincenzo Lomonaco, Gabriele Graffieti, Davide Maltoni

Abstract: On-device training for personalized learning is a challenging research problem. Being able to quickly adapt deep prediction models at the edge is necessary to better suit personal user needs. However, adaptation on the edge poses some questions on both the efficiency and sustainability of the learning process and on the ability to work under shifting data distributions. Indeed, naively fine-tuning… ▽ More On-device training for personalized learning is a challenging research problem. Being able to quickly adapt deep prediction models at the edge is necessary to better suit personal user needs. However, adaptation on the edge poses some questions on both the efficiency and sustainability of the learning process and on the ability to work under shifting data distributions. Indeed, naively fine-tuning a prediction model only on the newly available data results in catastrophic forgetting, a sudden erasure of previously acquired knowledge. In this paper, we detail the implementation and deployment of a hybrid continual learning strategy (AR1*) on a native Android application for real-time on-device personalization without forgetting. Our benchmark, based on an extension of the CORe50 dataset, shows the efficiency and effectiveness of our solution. △ Less

Submitted 24 May, 2021; originally announced May 2021.

Comments: 6 pages, 2 figures, 1 table

arXiv:2105.07674 [pdf, ps, other]

Continual Learning with Echo State Networks

Authors: Andrea Cossu, Davide Bacciu, Antonio Carta, Claudio Gallicchio, Vincenzo Lomonaco

Abstract: Continual Learning (CL) refers to a learning setup where data is non stationary and the model has to learn without forgetting existing knowledge. The study of CL for sequential patterns revolves around trained recurrent networks. In this work, instead, we introduce CL in the context of Echo State Networks (ESNs), where the recurrent component is kept fixed. We provide the first evaluation of catas… ▽ More Continual Learning (CL) refers to a learning setup where data is non stationary and the model has to learn without forgetting existing knowledge. The study of CL for sequential patterns revolves around trained recurrent networks. In this work, instead, we introduce CL in the context of Echo State Networks (ESNs), where the recurrent component is kept fixed. We provide the first evaluation of catastrophic forgetting in ESNs and we highlight the benefits in using CL strategies which are not applicable to trained recurrent models. Our results confirm the ESN as a promising model for CL and open to its use in streaming scenarios. △ Less

Submitted 17 August, 2021; v1 submitted 17 May, 2021; originally announced May 2021.

Comments: Accepted as oral at ESANN 2021

arXiv:2104.00405 [pdf, other]

Avalanche: an End-to-End Library for Continual Learning

Authors: Vincenzo Lomonaco, Lorenzo Pellegrini, Andrea Cossu, Antonio Carta, Gabriele Graffieti, Tyler L. Hayes, Matthias De Lange, Marc Masana, Jary Pomponi, Gido van de Ven, Martin Mundt, Qi She, Keiland Cooper, Jeremy Forest, Eden Belouadah, Simone Calderara, German I. Parisi, Fabio Cuzzolin, Andreas Tolias, Simone Scardapane, Luca Antiga, Subutai Amhad, Adrian Popescu, Christopher Kanan, Joost van de Weijer , et al. (3 additional authors not shown)

Abstract: Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standa… ▽ More Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototy**, training, and reproducible evaluation of continual learning algorithms. △ Less

Submitted 1 April, 2021; originally announced April 2021.

Comments: Official Website: https://avalanche.continualai.org

arXiv:2103.15851 [pdf, other]

Distilled Replay: Overcoming Forgetting through Synthetic Samples

Authors: Andrea Rosasco, Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, Davide Bacciu

Abstract: Replay strategies are Continual Learning techniques which mitigate catastrophic forgetting by kee** a buffer of patterns from previous experiences, which are interleaved with new data during training. The amount of patterns stored in the buffer is a critical parameter which largely influences the final performance and the memory footprint of the approach. This work introduces Distilled Replay, a… ▽ More Replay strategies are Continual Learning techniques which mitigate catastrophic forgetting by kee** a buffer of patterns from previous experiences, which are interleaved with new data during training. The amount of patterns stored in the buffer is a critical parameter which largely influences the final performance and the memory footprint of the approach. This work introduces Distilled Replay, a novel replay strategy for Continual Learning which is able to mitigate forgetting by kee** a very small buffer (1 pattern per class) of highly informative samples. Distilled Replay builds the buffer through a distillation process which compresses a large dataset into a tiny set of informative examples. We show the effectiveness of our Distilled Replay against popular replay-based strategies on four Continual Learning benchmarks. △ Less

Submitted 22 June, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

arXiv:2103.07492 [pdf, other]

doi 10.1016/j.neunet.2021.07.021

Continual Learning for Recurrent Neural Networks: an Empirical Evaluation

Authors: Andrea Cossu, Antonio Carta, Vincenzo Lomonaco, Davide Bacciu

Abstract: Learning continuously during all model lifetime is fundamental to deploy machine learning solutions robust to drifts in the data distribution. Advances in Continual Learning (CL) with recurrent neural networks could pave the way to a large number of applications where incoming data is non stationary, like natural language processing and robotics. However, the existing body of work on the topic is… ▽ More Learning continuously during all model lifetime is fundamental to deploy machine learning solutions robust to drifts in the data distribution. Advances in Continual Learning (CL) with recurrent neural networks could pave the way to a large number of applications where incoming data is non stationary, like natural language processing and robotics. However, the existing body of work on the topic is still fragmented, with approaches which are application-specific and whose assessment is based on heterogeneous learning protocols and datasets. In this paper, we organize the literature on CL for sequential data processing by providing a categorization of the contributions and a review of the benchmarks. We propose two new benchmarks for CL with sequential data based on existing datasets, whose characteristics resemble real-world applications. We also provide a broad empirical evaluation of CL and Recurrent Neural Networks in class-incremental scenario, by testing their ability to mitigate forgetting with a number of different strategies which are not specific to sequential data processing. Our results highlight the key role played by the sequence length and the importance of a clear specification of the CL scenario. △ Less

Submitted 2 August, 2021; v1 submitted 12 March, 2021; originally announced March 2021.

Comments: Published in Neural Networks

Journal ref: Neural Networks, Volume 143, 2021, pages 607-627

arXiv:2009.09929 [pdf, other]

CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future Directions

Authors: Vincenzo Lomonaco, Lorenzo Pellegrini, Pau Rodriguez, Massimo Caccia, Qi She, Yu Chen, Quentin Jodelet, Rui** Wang, Zheda Mai, David Vazquez, German I. Parisi, Nikhil Churamani, Marc Pickett, Issam Laradji, Davide Maltoni

Abstract: In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks with the shared objective of making current AI systems more adaptive, efficient and autonomous. However, despite the significant and undoubted progress of the field in addressing the issue of catastrophic forgetting, benchmarking different continual learning approaches is a… ▽ More In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks with the shared objective of making current AI systems more adaptive, efficient and autonomous. However, despite the significant and undoubted progress of the field in addressing the issue of catastrophic forgetting, benchmarking different continual learning approaches is a difficult task by itself. In fact, given the proliferation of different settings, training and evaluation protocols, metrics and nomenclature, it is often tricky to properly characterize a continual learning algorithm, relate it to other solutions and gauge its real-world applicability. The first Continual Learning in Computer Vision challenge held at CVPR in 2020 has been one of the first opportunities to evaluate different continual learning algorithms on a common hardware with a large set of shared evaluation metrics and 3 different settings based on the realistic CORe50 video benchmark. In this paper, we report the main results of the competition, which counted more than 79 teams registered, 11 finalists and 2300$ in prizes. We also summarize the winning approaches, current challenges and future research directions. △ Less

Submitted 14 September, 2020; originally announced September 2020.

Comments: Pre-print v1: 12 pages, 3 figures, 8 tables

arXiv:2007.13631 [pdf, other]

Memory-Latency-Accuracy Trade-offs for Continual Learning on a RISC-V Extreme-Edge Node

Authors: Leonardo Ravaglia, Manuele Rusci, Alessandro Capotondi, Francesco Conti, Lorenzo Pellegrini, Vincenzo Lomonaco, Davide Maltoni, Luca Benini

Abstract: AI-powered edge devices currently lack the ability to adapt their embedded inference models to the ever-changing environment. To tackle this issue, Continual Learning (CL) strategies aim at incrementally improving the decision capabilities based on newly acquired data. In this work, after quantifying memory and computational requirements of CL algorithms, we define a novel HW/SW extreme-edge platf… ▽ More AI-powered edge devices currently lack the ability to adapt their embedded inference models to the ever-changing environment. To tackle this issue, Continual Learning (CL) strategies aim at incrementally improving the decision capabilities based on newly acquired data. In this work, after quantifying memory and computational requirements of CL algorithms, we define a novel HW/SW extreme-edge platform featuring a low power RISC-V octa-core cluster tailored for on-demand incremental learning over locally sensed data. The presented multi-core HW/SW architecture achieves a peak performance of 2.21 and 1.70 MAC/cycle, respectively, when running forward and backward steps of the gradient descent. We report the trade-off between memory footprint, latency, and accuracy for learning a new class with Latent Replay CL when targeting an image classification task on the CORe50 dataset. For a CL setting that retrains all the layers, taking 5h to learn a new class and achieving up to 77.3% of precision, a more efficient solution retrains only part of the network, reaching an accuracy of 72.5% with a memory requirement of 300 MB and a computation latency of 1.5 hours. On the other side, retraining only the last layer results in the fastest (867 ms) and less memory hungry (20 MB) solution but scoring 58% on the CORe50 dataset. Thanks to the parallelism of the low-power cluster engine, our HW/SW platform results 25x faster than typical MCU device, on which CL is still impractical, and demonstrates an 11x gain in terms of energy consumption with respect to mobile-class solutions. △ Less

Submitted 22 July, 2020; originally announced July 2020.

Comments: 6 pages, 5 figures, conference

arXiv:2004.14774 [pdf, other]

IROS 2019 Lifelong Robotic Vision Challenge -- Lifelong Object Recognition Report

Authors: Qi She, Fan Feng, Qi Liu, Rosa H. M. Chan, Xinyue Hao, Chuanlin Lan, Qihan Yang, Vincenzo Lomonaco, German I. Parisi, Heechul Bae, Eoin Brophy, Baoquan Chen, Gabriele Graffieti, Vidit Goel, Hyonyoung Han, Sathursan Kanagarajah, Somesh Kumar, Siew-Kei Lam, Tin Lun Lam, Liang Ma, Davide Maltoni, Lorenzo Pellegrini, Duvindu Piyasena, Shiliang Pu, Debdoot Sheet , et al. (11 additional authors not shown)

Abstract: This report summarizes IROS 2019-Lifelong Robotic Vision Competition (Lifelong Object Recognition Challenge) with methods and results from the top $8$ finalists (out of over~$150$ teams). The competition dataset (L)ifel(O)ng (R)obotic V(IS)ion (OpenLORIS) - Object Recognition (OpenLORIS-object) is designed for driving lifelong/continual learning research and application in robotic vision domain, w… ▽ More This report summarizes IROS 2019-Lifelong Robotic Vision Competition (Lifelong Object Recognition Challenge) with methods and results from the top $8$ finalists (out of over~$150$ teams). The competition dataset (L)ifel(O)ng (R)obotic V(IS)ion (OpenLORIS) - Object Recognition (OpenLORIS-object) is designed for driving lifelong/continual learning research and application in robotic vision domain, with everyday objects in home, office, campus, and mall scenarios. The dataset explicitly quantifies the variants of illumination, object occlusion, object size, camera-object distance/angles, and clutter information. Rules are designed to quantify the learning capability of the robotic vision system when faced with the objects appearing in the dynamic environments in the contest. Individual reports, dataset information, rules, and released source code can be found at the project homepage: "https://lifelong-robotic-vision.github.io/competition/". △ Less

Submitted 26 April, 2020; originally announced April 2020.

Comments: 9 pages, 11 figures, 3 tables, accepted into IEEE Robotics and Automation Magazine. arXiv admin note: text overlap with arXiv:1911.06487

arXiv:2003.09114 [pdf, other]

doi 10.1007/978-3-030-43883-8_8

Online Continual Learning on Sequences

Authors: German I. Parisi, Vincenzo Lomonaco

Abstract: Online continual learning (OCL) refers to the ability of a system to learn over time from a continuous stream of data without having to revisit previously encountered training samples. Learning continually in a single data pass is crucial for agents and robots operating in changing environments and required to acquire, fine-tune, and transfer increasingly complex representations from non-i.i.d. in… ▽ More Online continual learning (OCL) refers to the ability of a system to learn over time from a continuous stream of data without having to revisit previously encountered training samples. Learning continually in a single data pass is crucial for agents and robots operating in changing environments and required to acquire, fine-tune, and transfer increasingly complex representations from non-i.i.d. input distributions. Machine learning models that address OCL must alleviate \textit{catastrophic forgetting} in which hidden representations are disrupted or completely overwritten when learning from streams of novel input. In this chapter, we summarize and discuss recent deep learning models that address OCL on sequential input through the use (and combination) of synaptic regularization, structural plasticity, and experience replay. Different implementations of replay have been proposed that alleviate catastrophic forgetting in connectionists architectures via the re-occurrence of (latent representations of) input sequences and that functionally resemble mechanisms of hippocampal replay in the mammalian brain. Empirical evidence shows that architectures endowed with experience replay typically outperform architectures without in (online) incremental learning tasks. △ Less

Submitted 20 March, 2020; originally announced March 2020.

Comments: L. Oneto et al. (eds.), Recent Trends in Learning From Data, Studies in Computational Intelligence 896

arXiv:1912.01100 [pdf, other]

Latent Replay for Real-Time Continual Learning

Authors: Lorenzo Pellegrini, Gabriele Graffieti, Vincenzo Lomonaco, Davide Maltoni

Abstract: Training deep neural networks at the edge on light computational devices, embedded systems and robotic platforms is nowadays very challenging. Continual learning techniques, where complex models are incrementally trained on small batches of new data, can make the learning problem tractable even for CPU-only embedded devices enabling remarkable levels of adaptiveness and autonomy. However, a number… ▽ More Training deep neural networks at the edge on light computational devices, embedded systems and robotic platforms is nowadays very challenging. Continual learning techniques, where complex models are incrementally trained on small batches of new data, can make the learning problem tractable even for CPU-only embedded devices enabling remarkable levels of adaptiveness and autonomy. However, a number of practical problems need to be solved: catastrophic forgetting before anything else. In this paper we introduce an original technique named "Latent Replay" where, instead of storing a portion of past data in the input space, we store activations volumes at some intermediate layer. This can significantly reduce the computation and storage required by native rehearsal. To keep the representation stable and the stored activations valid we propose to slow-down learning at all the layers below the latent replay one, leaving the layers above free to learn at full pace. In our experiments we show that Latent Replay, combined with existing continual learning techniques, achieves state-of-the-art performance on complex video benchmarks such as CORe50 NICv2 (with nearly 400 small and highly non-i.i.d. batches) and OpenLORIS. Finally, we demonstrate the feasibility of nearly real-time continual learning on the edge through the deployment of the proposed technique on a smartphone device. △ Less

Submitted 4 March, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

Comments: Pre-print v3: 13 pages, 9 figures, 10 tables, 1 algorithm

arXiv:1911.06487 [pdf, other]

OpenLORIS-Object: A Robotic Vision Dataset and Benchmark for Lifelong Deep Learning

Authors: Qi She, Fan Feng, Xinyue Hao, Qihan Yang, Chuanlin Lan, Vincenzo Lomonaco, Xuesong Shi, Zhengwei Wang, Yao Guo, Yimin Zhang, Fei Qiao, Rosa H. M. Chan

Abstract: The recent breakthroughs in computer vision have benefited from the availability of large representative datasets (e.g. ImageNet and COCO) for training. Yet, robotic vision poses unique challenges for applying visual algorithms developed from these standard computer vision datasets due to their implicit assumption over non-varying distributions for a fixed set of tasks. Fully retraining models eac… ▽ More The recent breakthroughs in computer vision have benefited from the availability of large representative datasets (e.g. ImageNet and COCO) for training. Yet, robotic vision poses unique challenges for applying visual algorithms developed from these standard computer vision datasets due to their implicit assumption over non-varying distributions for a fixed set of tasks. Fully retraining models each time a new task becomes available is infeasible due to computational, storage and sometimes privacy issues, while naïve incremental strategies have been shown to suffer from catastrophic forgetting. It is crucial for the robots to operate continuously under open-set and detrimental conditions with adaptive visual perceptual systems, where lifelong learning is a fundamental capability. However, very few datasets and benchmarks are available to evaluate and compare emerging techniques. To fill this gap, we provide a new lifelong robotic vision dataset ("OpenLORIS-Object") collected via RGB-D cameras. The dataset embeds the challenges faced by a robot in the real-life application and provides new benchmarks for validating lifelong object recognition algorithms. Moreover, we have provided a testbed of $9$ state-of-the-art lifelong learning algorithms. Each of them involves $48$ tasks with $4$ evaluation metrics over the OpenLORIS-Object dataset. The results demonstrate that the object recognition task in the ever-changing difficulty environments is far from being solved and the bottlenecks are at the forward/backward transfer designs. Our dataset and benchmark are publicly available at at \href{https://lifelong-robotic-vision.github.io/dataset/object}{\underline{https://lifelong-robotic-vision.github.io/dataset/object}}. △ Less

Submitted 6 March, 2020; v1 submitted 15 November, 2019; originally announced November 2019.

Comments: 7 pages, 7 figures, 4 tables

arXiv:1909.03742 [pdf, other]

doi 10.1016/j.neucom.2020.01.093

Efficient Continual Learning in Neural Networks with Embedding Regularization

Authors: Jary Pomponi, Simone Scardapane, Vincenzo Lomonaco, Aurelio Uncini

Abstract: Continual learning of deep neural networks is a key requirement for scaling them up to more complex applicative scenarios and for achieving real lifelong learning of these architectures. Previous approaches to the problem have considered either the progressive increase in the size of the networks, or have tried to regularize the network behavior to equalize it with respect to previously observed t… ▽ More Continual learning of deep neural networks is a key requirement for scaling them up to more complex applicative scenarios and for achieving real lifelong learning of these architectures. Previous approaches to the problem have considered either the progressive increase in the size of the networks, or have tried to regularize the network behavior to equalize it with respect to previously observed tasks. In the latter case, it is essential to understand what type of information best represents this past behavior. Common techniques include regularizing the past outputs, gradients, or individual weights. In this work, we propose a new, relatively simple and efficient method to perform continual learning by regularizing instead the network internal embeddings. To make the approach scalable, we also propose a dynamic sampling strategy to reduce the memory footprint of the required external storage. We show that our method performs favorably with respect to state-of-the-art approaches in the literature, while requiring significantly less space in memory and computational time. In addition, inspired inspired by to recent works, we evaluate the impact of selecting a more flexible model for the activation functions inside the network, evaluating the impact of catastrophic forgetting on the activation functions themselves. △ Less

Submitted 11 February, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

Journal ref: Neurocomputing, 397, pp. 139-148, 2020

arXiv:1907.03799 [pdf, other]

Rehearsal-Free Continual Learning over Small Non-I.I.D. Batches

Authors: Vincenzo Lomonaco, Davide Maltoni, Lorenzo Pellegrini

Abstract: Robotic vision is a field where continual learning can play a significant role. An embodied agent operating in a complex environment subject to frequent and unpredictable changes is required to learn and adapt continuously. In the context of object recognition, for example, a robot should be able to learn (without forgetting) objects of never before seen classes as well as improving its recognitio… ▽ More Robotic vision is a field where continual learning can play a significant role. An embodied agent operating in a complex environment subject to frequent and unpredictable changes is required to learn and adapt continuously. In the context of object recognition, for example, a robot should be able to learn (without forgetting) objects of never before seen classes as well as improving its recognition capabilities as new instances of already known classes are discovered. Ideally, continual learning should be triggered by the availability of short videos of single objects and performed on-line on on-board hardware with fine-grained updates. In this paper, we introduce a novel continual learning protocol based on the CORe50 benchmark and propose two rehearsal-free continual learning techniques, CWR* and AR1*, that can learn effectively even in the challenging case of nearly 400 small non-i.i.d. incremental batches. In particular, our experiments show that AR1* can outperform other state-of-the-art rehearsal-free techniques by more than 15% accuracy in some cases, with a very light and constant computational and memory overhead across training batches. △ Less

Submitted 21 April, 2020; v1 submitted 8 July, 2019; originally announced July 2019.

Comments: Accepted in the CLVision Workshop at CVPR2020: 12 pages, 7 figures, 5 tables, 3 algorithms

arXiv:1907.00182 [pdf, other]

Continual Learning for Robotics: Definition, Framework, Learning Strategies, Opportunities and Challenges

Authors: Timothée Lesort, Vincenzo Lomonaco, Andrei Stoian, Davide Maltoni, David Filliat, Natalia Díaz-Rodríguez

Abstract: Continual learning (CL) is a particular machine learning paradigm where the data distribution and learning objective changes through time, or where all the training data and objective criteria are never available at once. The evolution of the learning process is modeled by a sequence of learning experiences where the goal is to be able to learn new skills all along the sequence without forgetting… ▽ More Continual learning (CL) is a particular machine learning paradigm where the data distribution and learning objective changes through time, or where all the training data and objective criteria are never available at once. The evolution of the learning process is modeled by a sequence of learning experiences where the goal is to be able to learn new skills all along the sequence without forgetting what has been previously learned. Continual learning also aims at the same time at optimizing the memory, the computation power and the speed during the learning process. An important challenge for machine learning is not necessarily finding solutions that work in the real world but rather finding stable algorithms that can learn in real world. Hence, the ideal approach would be tackling the real world in a embodied platform: an autonomous agent. Continual learning would then be effective in an autonomous agent or robot, which would learn autonomously through time about the external world, and incrementally develop a set of complex skills and knowledge. Robotic agents have to learn to adapt and interact with their environment using a continuous stream of observations. Some recent approaches aim at tackling continual learning for robotics, but most recent papers on continual learning only experiment approaches in simulation or with static datasets. Unfortunately, the evaluation of those algorithms does not provide insights on whether their solutions may help continual learning in the context of robotics. This paper aims at reviewing the existing state of the art of continual learning, summarizing existing benchmarks and metrics, and proposing a framework for presenting and evaluating both robotics and non robotics approaches in a way that makes transfer between both fields easier. △ Less

Submitted 22 November, 2019; v1 submitted 29 June, 2019; originally announced July 2019.

arXiv:1905.10112 [pdf, other]

Continual Reinforcement Learning in 3D Non-stationary Environments

Authors: Vincenzo Lomonaco, Karan Desai, Eugenio Culurciello, Davide Maltoni

Abstract: High-dimensional always-changing environments constitute a hard challenge for current reinforcement learning techniques. Artificial agents, nowadays, are often trained off-line in very static and controlled conditions in simulation such that training observations can be thought as sampled i.i.d. from the entire observations space. However, in real world settings, the environment is often non-stati… ▽ More High-dimensional always-changing environments constitute a hard challenge for current reinforcement learning techniques. Artificial agents, nowadays, are often trained off-line in very static and controlled conditions in simulation such that training observations can be thought as sampled i.i.d. from the entire observations space. However, in real world settings, the environment is often non-stationary and subject to unpredictable, frequent changes. In this paper we propose and openly release CRLMaze, a new benchmark for learning continually through reinforcement in a complex 3D non-stationary task based on ViZDoom and subject to several environmental changes. Then, we introduce an end-to-end model-free continual reinforcement learning strategy showing competitive results with respect to four different baselines and not requiring any access to additional supervised signals, previously encountered environmental conditions or observations. △ Less

Submitted 21 April, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: Accepted in the CLVision Workshop at CVPR2020: 13 pages, 4 figures, 5 tables

arXiv:1811.05291 [pdf, other]

Intelligent Drone Swarm for Search and Rescue Operations at Sea

Authors: Vincenzo Lomonaco, Angelo Trotta, Marta Ziosi, Juan de Dios Yáñez Ávila, Natalia Díaz-Rodríguez

Abstract: In recent years, a rising numbers of people arrived in the European Union, traveling across the Mediterranean Sea or overland through Southeast Europe in what has been later named as the European migrant crisis. In the last 5 years, more than 16 thousands people have lost their lives in the Mediterranean sea during the crossing. The United Nations Secretary General Strategy on New Technologies is… ▽ More In recent years, a rising numbers of people arrived in the European Union, traveling across the Mediterranean Sea or overland through Southeast Europe in what has been later named as the European migrant crisis. In the last 5 years, more than 16 thousands people have lost their lives in the Mediterranean sea during the crossing. The United Nations Secretary General Strategy on New Technologies is supporting the use of Artificial Intelligence (AI) and Robotics to accelerate the achievement of the 2030 Sustainable Development Agenda, which includes safe and regular migration processes among the others. In the same spirit, the central idea of this project aims at using AI technology for Search And Rescue (SAR) operations at sea. In particular, we propose an autonomous fleet of self-organizing intelligent drones that would enable the coverage of a broader area, speeding-up the search processes and finally increasing the efficiency and effectiveness of migrants rescue operations. △ Less

Submitted 13 November, 2018; originally announced November 2018.

Comments: 4 Pages, 1 Figure, extended abstract accepted to the "AI for Social Good" NIPS 2018 Workshop

arXiv:1810.13166 [pdf, other]

Don't forget, there is more than forgetting: new metrics for Continual Learning

Authors: Natalia Díaz-Rodríguez, Vincenzo Lomonaco, David Filliat, Davide Maltoni

Abstract: Continual learning consists of algorithms that learn from a stream of data/tasks continuously and adaptively thought time, enabling the incremental development of ever more complex knowledge and skills. The lack of consensus in evaluating continual learning algorithms and the almost exclusive focus on forgetting motivate us to propose a more comprehensive set of implementation independent metrics… ▽ More Continual learning consists of algorithms that learn from a stream of data/tasks continuously and adaptively thought time, enabling the incremental development of ever more complex knowledge and skills. The lack of consensus in evaluating continual learning algorithms and the almost exclusive focus on forgetting motivate us to propose a more comprehensive set of implementation independent metrics accounting for several factors we believe have practical implications worth considering in the deployment of real AI systems that learn continually: accuracy or performance over time, backward and forward knowledge transfer, memory overhead as well as computational efficiency. Drawing inspiration from the standard Multi-Attribute Value Theory (MAVT) we further propose to fuse these metrics into a single score for ranking purposes and we evaluate our proposal with five continual learning strategies on the iCIFAR-100 continual learning benchmark. △ Less

Submitted 31 October, 2018; originally announced October 2018.

MSC Class: 68T05; cs.LG; cs.AI; cs.CV; cs.NE; stat.ML

arXiv:1810.05596 [pdf, other]

Custom Dual Transportation Mode Detection by Smartphone Devices Exploiting Sensor Diversity

Authors: Claudia Carpineti, Vincenzo Lomonaco, Luca Bedogni, Marco Di Felice, Luciano Bononi

Abstract: Making applications aware of the mobility experienced by the user can open the door to a wide range of novel services in different use-cases, from smart parking to vehicular traffic monitoring. In the literature, there are many different studies demonstrating the theoretical possibility of performing Transportation Mode Detection (TMD) by mining smart-phones embedded sensors data. However, very fe… ▽ More Making applications aware of the mobility experienced by the user can open the door to a wide range of novel services in different use-cases, from smart parking to vehicular traffic monitoring. In the literature, there are many different studies demonstrating the theoretical possibility of performing Transportation Mode Detection (TMD) by mining smart-phones embedded sensors data. However, very few of them provide details on the benchmarking process and on how to implement the detection process in practice. In this study, we provide guidelines and fundamental results that can be useful for both researcher and practitioners aiming at implementing a working TMD system. These guidelines consist of three main contributions. First, we detail the construction of a training dataset, gathered by heterogeneous users and including five different transportation modes; the dataset is made available to the research community as reference benchmark. Second, we provide an in-depth analysis of the sensor-relevance for the case of Dual TDM, which is required by most of mobility-aware applications. Third, we investigate the possibility to perform TMD of unknown users/instances not present in the training set and we compare with state-of-the-art Android APIs for activity recognition. △ Less

Submitted 12 October, 2018; originally announced October 2018.

Comments: Pre-print of the accepted version for the 14th Workshop on Context and Activity Modeling and Recognition (IEEE COMOREA 2018), Athens, Greece, March 19-23, 2018

arXiv:1806.08568 [pdf, other]

Continuous Learning in Single-Incremental-Task Scenarios

Authors: Davide Maltoni, Vincenzo Lomonaco

Abstract: It was recently shown that architectural, regularization and rehearsal strategies can be used to train deep models sequentially on a number of disjoint tasks without forgetting previously acquired knowledge. However, these strategies are still unsatisfactory if the tasks are not disjoint but constitute a single incremental task (e.g., class-incremental learning). In this paper we point out the dif… ▽ More It was recently shown that architectural, regularization and rehearsal strategies can be used to train deep models sequentially on a number of disjoint tasks without forgetting previously acquired knowledge. However, these strategies are still unsatisfactory if the tasks are not disjoint but constitute a single incremental task (e.g., class-incremental learning). In this paper we point out the differences between multi-task and single-incremental-task scenarios and show that well-known approaches such as LWF, EWC and SI are not ideal for incremental task scenarios. A new approach, denoted as AR1, combining architectural and regularization strategies is then specifically proposed. AR1 overhead (in term of memory and computation) is very small thus making it suitable for online learning. When tested on CORe50 and iCIFAR-100, AR1 outperformed existing regularization strategies by a good margin. △ Less

Submitted 22 January, 2019; v1 submitted 22 June, 2018; originally announced June 2018.

Comments: 26 pages, 13 figures; v3: major revision (e.g. added Sec. 4.4), several typos and minor mistakes corrected

Showing 1–50 of 52 results for author: Lomonaco, V