Search | arXiv e-print repository

Learning Abstract Task Representations

Authors: Mikhail M. Meskhi, Adriano Rivolli, Rafael G. Mantovani, Ricardo Vilalta

Abstract: A proper form of data characterization can guide the process of learning-algorithm selection and model-performance estimation. The field of meta-learning has provided a rich body of work describing effective forms of data characterization using different families of meta-features (statistical, model-based, information-theoretic, topological, etc.). In this paper, we start with the abundant set of… ▽ More A proper form of data characterization can guide the process of learning-algorithm selection and model-performance estimation. The field of meta-learning has provided a rich body of work describing effective forms of data characterization using different families of meta-features (statistical, model-based, information-theoretic, topological, etc.). In this paper, we start with the abundant set of existing meta-features and propose a method to induce new abstract meta-features as latent variables in a deep neural network. We discuss the pitfalls of using traditional meta-features directly and argue for the importance of learning high-level task properties. We demonstrate our methodology using a deep neural network as a feature extractor. We demonstrate that 1) induced meta-models map** abstract meta-features to generalization performance outperform other methods by ~18% on average, and 2) abstract meta-features attain high feature-relevance scores. △ Less

Submitted 28 January, 2021; v1 submitted 19 January, 2021; originally announced January 2021.

Journal ref: AAAI Workshop on Meta-Learning 2021

arXiv:2009.07430 [pdf, other]

An Extensive Experimental Evaluation of Automated Machine Learning Methods for Recommending Classification Algorithms (Extended Version)

Authors: Márcio P. Basgalupp, Rodrigo C. Barros, Alex G. C. de Sá, Gisele L. Pappa, Rafael G. Mantovani, André C. P. L. F. de Carvalho, Alex A. Freitas

Abstract: This paper presents an experimental comparison among four Automated Machine Learning (AutoML) methods for recommending the best classification algorithm for a given input dataset. Three of these methods are based on Evolutionary Algorithms (EAs), and the other is Auto-WEKA, a well-known AutoML method based on the Combined Algorithm Selection and Hyper-parameter optimisation (CASH) approach. The EA… ▽ More This paper presents an experimental comparison among four Automated Machine Learning (AutoML) methods for recommending the best classification algorithm for a given input dataset. Three of these methods are based on Evolutionary Algorithms (EAs), and the other is Auto-WEKA, a well-known AutoML method based on the Combined Algorithm Selection and Hyper-parameter optimisation (CASH) approach. The EA-based methods build classification algorithms from a single machine learning paradigm: either decision-tree induction, rule induction, or Bayesian network classification. Auto-WEKA combines algorithm selection and hyper-parameter optimisation to recommend classification algorithms from multiple paradigms. We performed controlled experiments where these four AutoML methods were given the same runtime limit for different values of this limit. In general, the difference in predictive accuracy of the three best AutoML methods was not statistically significant. However, the EA evolving decision-tree induction algorithms has the advantage of producing algorithms that generate interpretable classification models and that are more scalable to large datasets, by comparison with many algorithms from other learning paradigms that can be recommended by Auto-WEKA. We also observed that Auto-WEKA has shown meta-overfitting, a form of overfitting at the meta-learning level, rather than at the base-learning level. △ Less

Submitted 15 September, 2020; originally announced September 2020.

Comments: Accepted at Evolutionary Intelligence

arXiv:2008.00025 [pdf, other]

Rethinking Default Values: a Low Cost and Efficient Strategy to Define Hyperparameters

Authors: Rafael Gomes Mantovani, André Luis Debiaso Rossi, Edesio Alcobaça, Jadson Castro Gertrudes, Sylvio Barbon Junior, André Carlos Ponce de Leon Ferreira de Carvalho

Abstract: Machine Learning (ML) algorithms have been increasingly applied to problems from several different areas. Despite their growing popularity, their predictive performance is usually affected by the values assigned to their hyperparameters (HPs). As consequence, researchers and practitioners face the challenge of how to set these values. Many users have limited knowledge about ML algorithms and the e… ▽ More Machine Learning (ML) algorithms have been increasingly applied to problems from several different areas. Despite their growing popularity, their predictive performance is usually affected by the values assigned to their hyperparameters (HPs). As consequence, researchers and practitioners face the challenge of how to set these values. Many users have limited knowledge about ML algorithms and the effect of their HP values and, therefore, do not take advantage of suitable settings. They usually define the HP values by trial and error, which is very subjective, not guaranteed to find good values and dependent on the user experience. Tuning techniques search for HP values able to maximize the predictive performance of induced models for a given dataset, but have the drawback of a high computational cost. Thus, practitioners use default values suggested by the algorithm developer or by tools implementing the algorithm. Although default values usually result in models with acceptable predictive performance, different implementations of the same algorithm can suggest distinct default values. To maintain a balance between tuning and using default values, we propose a strategy to generate new optimized default values. Our approach is grounded on a small set of optimized values able to obtain predictive performance values better than default settings provided by popular tools. After performing a large experiment and a careful analysis of the results, we concluded that our approach delivers better default values. Besides, it leads to competitive solutions when compared to tuned values, making it easier to use and having a lower cost. We also extracted simple rules to guide practitioners in deciding whether to use our new methodology or a HP tuning approach. △ Less

Submitted 8 July, 2021; v1 submitted 31 July, 2020; originally announced August 2020.

Comments: 44 pages, 13 figures

arXiv:1907.11277 [pdf, other]

Towards meta-learning for multi-target regression problems

Authors: Gabriel Jonas Aguiar, Everton José Santana, Saulo Martiello Mastelini, Rafael Gomes Mantovani, Sylvio Barbon Jr

Abstract: Several multi-target regression methods were devel-oped in the last years aiming at improving predictive performanceby exploring inter-target correlation within the problem. However, none of these methods outperforms the others for all problems. This motivates the development of automatic approachesto recommend the most suitable multi-target regression method. In this paper, we propose a meta-lear… ▽ More Several multi-target regression methods were devel-oped in the last years aiming at improving predictive performanceby exploring inter-target correlation within the problem. However, none of these methods outperforms the others for all problems. This motivates the development of automatic approachesto recommend the most suitable multi-target regression method. In this paper, we propose a meta-learning system to recommend the best predictive method for a given multi-target regression problem. We performed experiments with a meta-dataset generated by a total of 648 synthetic datasets. These datasets were created to explore distinct inter-targets characteristics toward recommending the most promising method. In experiments, we evaluated four different algorithms with different biases as meta-learners. Our meta-dataset is composed of 58 meta-features, based on: statistical information, correlation characteristics, linear landmarking, from the distribution and smoothness of the data, and has four different meta-labels. Results showed that induced meta-models were able to recommend the best methodfor different base level datasets with a balanced accuracy superior to 70% using a Random Forest meta-model, which statistically outperformed the meta-learning baselines. △ Less

Submitted 25 July, 2019; originally announced July 2019.

Comments: To appear on the 8th Brazilian Conference on Intelligent Systems (BRACIS)

arXiv:1906.01684 [pdf, other]

doi 10.1016/j.ins.2019.06.005

A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers

Authors: Rafael Gomes Mantovani, André Luis Debiaso Rossi, Edesio Alcobaça, Joaquin Vanschoren, André Carlos Ponce de Leon Ferreira de Carvalho

Abstract: For many machine learning algorithms, predictive performance is critically affected by the hyperparameter values used to train them. However, tuning these hyperparameters can come at a high computational cost, especially on larger datasets, while the tuned settings do not always significantly outperform the default values. This paper proposes a recommender system based on meta-learning to identify… ▽ More For many machine learning algorithms, predictive performance is critically affected by the hyperparameter values used to train them. However, tuning these hyperparameters can come at a high computational cost, especially on larger datasets, while the tuned settings do not always significantly outperform the default values. This paper proposes a recommender system based on meta-learning to identify exactly when it is better to use default values and when to tune hyperparameters for each new dataset. Besides, an in-depth analysis is performed to understand what they take into account for their decisions, providing useful insights. An extensive analysis of different categories of meta-features, meta-learners, and setups across 156 datasets is performed. Results show that it is possible to accurately predict when tuning will significantly improve the performance of the induced models. The proposed system reduces the time spent on optimization processes, without reducing the predictive performance of the induced models (when compared with the ones obtained using tuned hyperparameters). We also explain the decision-making process of the meta-learners in terms of linear separability-based hypotheses. Although this analysis is focused on the tuning of Support Vector Machines, it can also be applied to other algorithms, as shown in experiments performed with decision trees. △ Less

Submitted 11 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: 49 pages, 11 figures

Journal ref: Information Sciences, Volume 501, 2019. Pages 193-221, ISSN 0020-0255

arXiv:1812.02207 [pdf, other]

doi 10.1007/s10618-024-01002-5

Better Trees: An empirical study on hyperparameter tuning of classification decision tree induction algorithms

Authors: Rafael Gomes Mantovani, Tomáš Horváth, André L. D. Rossi, Ricardo Cerri, Sylvio Barbon Junior, Joaquin Vanschoren, André Carlos Ponce de Leon Ferreira de Carvalho

Abstract: Machine learning algorithms often contain many hyperparameters (HPs) whose values affect the predictive performance of the induced models in intricate ways. Due to the high number of possibilities for these HP configurations and their complex interactions, it is common to use optimization techniques to find settings that lead to high predictive performance. However, insights into efficiently explo… ▽ More Machine learning algorithms often contain many hyperparameters (HPs) whose values affect the predictive performance of the induced models in intricate ways. Due to the high number of possibilities for these HP configurations and their complex interactions, it is common to use optimization techniques to find settings that lead to high predictive performance. However, insights into efficiently exploring this vast space of configurations and dealing with the trade-off between predictive and runtime performance remain challenging. Furthermore, there are cases where the default HPs fit the suitable configuration. Additionally, for many reasons, including model validation and attendance to new legislation, there is an increasing interest in interpretable models, such as those created by the Decision Tree (DT) induction algorithms. This paper provides a comprehensive approach for investigating the effects of hyperparameter tuning for the two DT induction algorithms most often used, CART and C4.5. DT induction algorithms present high predictive performance and interpretable classification models, though many HPs need to be adjusted. Experiments were carried out with different tuning strategies to induce models and to evaluate HPs' relevance using 94 classification datasets from OpenML. The experimental results point out that different HP profiles for the tuning of each algorithm provide statistically significant improvements in most of the datasets for CART, but only in one-third for C4.5. Although different algorithms may present different tuning scenarios, the tuning techniques generally required few evaluations to find accurate solutions. Furthermore, the best technique for all the algorithms was the IRACE. Finally, we found out that tuning a specific small subset of HPs is a good alternative for achieving optimal predictive performance. △ Less

Submitted 21 December, 2023; v1 submitted 5 December, 2018; originally announced December 2018.

Comments: 60 pages, 16 figures

arXiv:1708.03731 [pdf, other]

OpenML Benchmarking Suites

Authors: Bernd Bischl, Giuseppe Casalicchio, Matthias Feurer, Pieter Gijsbers, Frank Hutter, Michel Lang, Rafael G. Mantovani, Jan N. van Rijn, Joaquin Vanschoren

Abstract: Machine learning research depends on objectively interpretable, comparable, and reproducible algorithm benchmarks. We advocate the use of curated, comprehensive suites of machine learning tasks to standardize the setup, execution, and reporting of benchmarks. We enable this through software tools that help to create and leverage these benchmarking suites. These are seamlessly integrated into the O… ▽ More Machine learning research depends on objectively interpretable, comparable, and reproducible algorithm benchmarks. We advocate the use of curated, comprehensive suites of machine learning tasks to standardize the setup, execution, and reporting of benchmarks. We enable this through software tools that help to create and leverage these benchmarking suites. These are seamlessly integrated into the OpenML platform, and accessible through interfaces in Python, Java, and R. OpenML benchmarking suites (a) are easy to use through standardized data formats, APIs, and client libraries; (b) come with extensive meta-information on the included datasets; and (c) allow benchmarks to be shared and reused in future studies. We then present a first, carefully curated and practical benchmarking suite for classification: the OpenML Curated Classification benchmarking suite 2018 (OpenML-CC18). Finally, we discuss use cases and applications which demonstrate the usefulness of OpenML benchmarking suites and the OpenML-CC18 in particular. △ Less

Submitted 22 November, 2021; v1 submitted 11 August, 2017; originally announced August 2017.

Comments: Accepted for publication in the Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS 2021)

Journal ref: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2021)

Showing 1–7 of 7 results for author: Mantovani, R G