Search | arXiv e-print repository

Meta Learning Black-Box Population-Based Optimizers

Authors: Hugo Siqueira Gomes, Benjamin Léger, Christian Gagné

Abstract: The no free lunch theorem states that no model is better suited to every problem. A question that arises from this is how to design methods that propose optimizers tailored to specific problems achieving state-of-the-art performance. This paper addresses this issue by proposing the use of meta-learning to infer population-based black-box optimizers that can automatically adapt to specific classes of problems. We suggest a general modeling of population-based algorithms that results in Learning-to-Optimize POMDP (LTO-POMDP), a meta-learning framework based on a specific partially observable Markov decision process (POMDP). From that framework's formulation, we propose to parameterize the algorithm using deep recurrent neural networks and use a meta-loss function based on stochastic algorithms' performance to train efficient data-driven optimizers over several related optimization tasks. The learned optimizers' performance based on this implementation is assessed on various black-box optimization tasks and hyperparameter tuning of machine learning models. Our results revealed that the meta-loss function encourages a learned algorithm to alter its search behavior so that it can easily fit into a new context. Thus, it allows better generalization and higher sample efficiency than state-of-the-art generic optimization algorithms, such as the Covariance Matrix Adaptation Evolution Strategy (CMA-ES).

Submitted 5 March, 2021; originally announced March 2021.

Comments: 9 pages, 7 figures

arXiv:1911.11195 [pdf, other]

A Novel Unsupervised Post-Processing Calibration Method for DNNS with Robustness to Domain Shift

Authors: Azadeh Sadat Mozafari, Hugo Siqueira Gomes, Christian Gagne

Abstract: Uncertainty estimation is critical in real-world decision-making applications, especially when distributional shift between the training and test data is prevalent. Many calibration methods in the literature have been proposed to improve the predictive uncertainty of DNNs, which are generally not well-calibrated. However, none of them is specifically designed to work properly under domain shift conditions. In this paper, we propose Unsupervised Temperature Scaling (UTS) as a calibration method robust to domain shift. It exploits unlabeled test samples instead of training samples to adjust the uncertainty prediction of deep models towards the test distribution. UTS utilizes a novel loss function, weighted NLL, which allows unsupervised calibration. We evaluate UTS on a wide range of model-datasets to show the possibility of calibration without labels and demonstrate the robustness of UTS compared to other methods (e.g., TS, MC-dropout, SVI, ensembles) in shifted domains.

Submitted 25 November, 2019; originally announced November 2019.

arXiv:1905.00174 [pdf, other]

Unsupervised Temperature Scaling: An Unsupervised Post-Processing Calibration Method of Deep Networks

Authors: Azadeh Sadat Mozafari, Hugo Siqueira Gomes, Wilson Leão, Christian Gagné

Abstract: The strong performance of deep learning is undeniable, with impressive results over a wide range of tasks. However, the output confidence of these models is usually not well-calibrated, which can be an issue for applications where confidence in the decisions is central to providing trust and reliability (e.g., autonomous driving or medical diagnosis). For models using softmax at the last layer, Temperature Scaling (TS) is a state-of-the-art calibration method, with low time and memory complexity as well as demonstrated effectiveness. TS relies on a temperature parameter T, computed from a labelled dataset, to rescale and calibrate the values of the softmax layer. We propose an Unsupervised Temperature Scaling (UTS) approach, which does not depend on labelled samples to calibrate the model. This allows, for example, the use of a part of the test samples to calibrate the pre-trained model before going into inference mode. We provide theoretical justifications for UTS and assess its effectiveness on a wide range of deep models and datasets. We also demonstrate calibration results of UTS on skin lesion detection, a problem where a well-calibrated output can play an important role in accurate decision-making.

Submitted 10 June, 2019; v1 submitted 30 April, 2019; originally announced May 2019.

Comments: arXiv admin note: text overlap with arXiv:1810.11586
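
The abstract above describes Temperature Scaling (TS) only briefly: a single parameter T rescales the softmax inputs and is computed from a labelled dataset. As an illustration, here is a minimal NumPy sketch of that supervised step, assuming T is chosen to minimize the negative log-likelihood (NLL) on labelled validation logits. The function names, the grid search over candidate temperatures, and the synthetic data are assumptions made for this sketch; they are not taken from the paper, and the unsupervised variant (UTS) is not implemented here.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a temperature T > 0."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=1, keepdims=True)


def nll(logits, labels, temperature=1.0):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    probs = softmax(logits, temperature)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))


def fit_temperature(logits, labels, candidates=None):
    """Pick the candidate T with the lowest NLL on a labelled set.

    A plain grid search is used here for simplicity (an assumption of this
    sketch); a gradient-based optimizer over T would serve the same purpose.
    """
    if candidates is None:
        candidates = np.linspace(0.1, 10.0, 200)
    losses = [nll(logits, labels, t) for t in candidates]
    return float(candidates[int(np.argmin(losses))])


if __name__ == "__main__":
    # Synthetic, hypothetical data: labels are sampled from softmax(true_logits)
    # via the Gumbel-max trick, and the "model" reports logits that are three
    # times too large, i.e. it is overconfident.
    rng = np.random.default_rng(0)
    true_logits = rng.normal(size=(2000, 5))
    labels = (true_logits + rng.gumbel(size=true_logits.shape)).argmax(axis=1)
    overconfident_logits = 3.0 * true_logits

    T = fit_temperature(overconfident_logits, labels)
    print("fitted temperature:", T)
    print("NLL before scaling:", nll(overconfident_logits, labels, 1.0))
    print("NLL after scaling: ", nll(overconfident_logits, labels, T))
```

Because every logit is divided by the same positive T, the ranking of the classes (and hence the predicted class) is unchanged; only the confidence attached to the prediction is adjusted.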

arXiv:1810.11586 [pdf, other]

Attended Temperature Scaling: A Practical Approach for Calibrating Deep Neural Networks

Authors: Azadeh Sadat Mozafari, Hugo Siqueira Gomes, Wilson Leão, Steeven Janny, Christian Gagné

Abstract: Recently, Deep Neural Networks (DNNs) have been achieving impressive results on a wide range of tasks. However, they suffer from not being well-calibrated. In decision-making applications, such as autonomous driving or medical diagnosis, the confidence of deep networks plays an important role in bringing trust and reliability to the system. To calibrate the deep networks' confidence, many probabilistic and measure-based approaches have been proposed. Temperature Scaling (TS) is a state-of-the-art measure-based calibration method that is effective and has low time and memory complexity. In this paper, we study TS and show that it does not work properly when the validation set that TS uses for calibration is small or contains noisy-labeled samples. TS also cannot calibrate highly accurate networks as well as it calibrates less accurate ones. Accordingly, we propose Attended Temperature Scaling (ATS), which preserves the advantages of TS while improving calibration in the aforementioned challenging situations. We provide theoretical justifications for ATS and assess its effectiveness on a wide range of deep models and datasets. We also compare the calibration results of TS and ATS on skin lesion detection, a practical problem where a well-calibrated system can play an important role in decision-making.

Submitted 8 May, 2019; v1 submitted 26 October, 2018; originally announced October 2018.

Showing 1–4 of 4 results for author: Gomes, H S