Showing 1–2 of 2 results for author: Gordo, A

Search v0.5.6 released 2020-02-24

arXiv:2002.08165

cs.LG stat.ML

Using Hindsight to Anchor Past Knowledge in Continual Learning

Authors: Arslan Chaudhry, Albert Gordo, Puneet K. Dokania, Philip Torr, David Lopez-Paz

Abstract: In continual learning, the learner faces a stream of data whose distribution changes over time. Modern neural networks are known to suffer under this setting, as they quickly forget previously acquired knowledge. To address such catastrophic forgetting, many continual learning methods implement different types of experience replay, re-learning on past data stored in a small buffer known as episodic memory. In this work, we complement experience replay with a new objective that we call anchoring, where the learner uses bilevel optimization to update its knowledge on the current task, while keeping intact the predictions on some anchor points of past tasks. These anchor points are learned using gradient-based optimization to maximize forgetting, which is approximated by fine-tuning the currently trained model on the episodic memory of past tasks. Experiments on several supervised learning benchmarks for continual learning demonstrate that our approach improves the standard experience replay in terms of both accuracy and forgetting metrics and for various sizes of episodic memories.

Submitted 2 March, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

Comments: Accepted at AAAI 2021

arXiv:1801.08640

stat.ML cs.LG

doi 10.1007/s10994-023-06335-8

Considerations When Learning Additive Explanations for Black-Box Models

Authors: Sarah Tan, Giles Hooker, Paul Koch, Albert Gordo, Rich Caruana

Abstract: Many methods to explain black-box models, whether local or global, are additive. In this paper, we study global additive explanations for non-additive models, focusing on four explanation methods: partial dependence, Shapley explanations adapted to a global setting, distilled additive explanations, and gradient-based explanations. We show that different explanation methods characterize non-additive components in a black-box model's prediction function in different ways. We use the concepts of main and total effects to anchor additive explanations, and quantitatively evaluate additive and non-additive explanations. Even though distilled explanations are generally the most accurate additive explanations, non-additive explanations such as tree explanations that explicitly model non-additive components tend to be even more accurate. Despite this, our user study showed that machine learning practitioners were better able to leverage additive explanations for various tasks. These considerations should be taken into account when considering which explanation to trust and use to explain black-box models.

Submitted 31 July, 2023; v1 submitted 25 January, 2018; originally announced January 2018.

Comments: Published at Machine Learning (2023). Previously titled "Learning Global Additive Explanations for Neural Nets Using Model Distillation". A short version was presented at NeurIPS 2018 Machine Learning for Health Workshop
