-
Automatic Termination for Hyperparameter Optimization
Authors:
Anastasia Makarova,
Huibin Shen,
Valerio Perrone,
Aaron Klein,
Jean Baptiste Faddoul,
Andreas Krause,
Matthias Seeger,
Cedric Archambeau
Abstract:
Bayesian optimization (BO) is a widely popular approach for the hyperparameter optimization (HPO) in machine learning. At its core, BO iteratively evaluates promising configurations until a user-defined budget, such as wall-clock time or number of iterations, is exhausted. While the final performance after tuning heavily depends on the provided budget, it is hard to pre-specify an optimal value in advance. In this work, we propose an effective and intuitive termination criterion for BO that automatically stops the procedure if it is sufficiently close to the global optimum. Our key insight is that the discrepancy between the true objective (predictive performance on test data) and the computable target (validation performance) suggests stopping once the suboptimality in optimizing the target is dominated by the statistical estimation error. Across an extensive range of real-world HPO problems and baselines, we show that our termination criterion achieves a better trade-off between the test performance and optimization time. Additionally, we find that overfitting may occur in the context of HPO, which is arguably an overlooked problem in the literature, and show how our termination criterion helps to mitigate this phenomenon on both small and large datasets.
Submitted 22 July, 2022; v1 submitted 16 April, 2021;
originally announced April 2021.
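The stopping rule described in this abstract (stop once the remaining suboptimality is dominated by the statistical estimation error) can be illustrated with a minimal sketch. The abstract does not say how either quantity is computed, so everything here is an assumption: `regret_upper_bound` is a hypothetical, externally supplied bound on the optimization gap, and the standard error of cross-validation fold scores is used only as a stand-in proxy for the estimation error, not as the authors' estimator.

```python
import statistics


def should_stop(regret_upper_bound, fold_scores):
    """Return True when the bound on the optimization gap is no larger
    than a proxy for the statistical estimation error.

    regret_upper_bound: hypothetical upper bound on how far the best
        validation score found so far is from the optimum (assumed given).
    fold_scores: validation scores of the incumbent configuration across
        cross-validation folds (assumed proxy for estimation noise).
    """
    # Standard error of the mean across folds (illustrative proxy only).
    estimation_error = statistics.stdev(fold_scores) / len(fold_scores) ** 0.5
    return regret_upper_bound <= estimation_error


fold_scores = [0.80, 0.82, 0.78, 0.80]
stop_small_gap = should_stop(0.005, fold_scores)  # gap below the noise level
stop_large_gap = should_stop(0.05, fold_scores)   # gap still above the noise level
```

With these fold scores the standard error is roughly 0.008, so a bound of 0.005 triggers termination while a bound of 0.05 does not.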
-
Amazon SageMaker Automatic Model Tuning: Scalable Gradient-Free Optimization
Authors:
Valerio Perrone,
Huibin Shen,
Aida Zolic,
Iaroslav Shcherbatyi,
Amr Ahmed,
Tanya Bansal,
Michele Donini,
Fela Winkelmolen,
Rodolphe Jenatton,
Jean Baptiste Faddoul,
Barbara Pogorzelska,
Miroslav Miladinovic,
Krishnaram Kenthapadi,
Matthias Seeger,
Cédric Archambeau
Abstract:
Tuning complex machine learning systems is challenging. Machine learning typically requires to set hyperparameters, be it regularization, architecture, or optimization parameters, whose tuning is critical to achieve good predictive performance. To democratize access to machine learning systems, it is essential to automate the tuning. This paper presents Amazon SageMaker Automatic Model Tuning (AMT), a fully managed system for gradient-free optimization at scale. AMT finds the best version of a trained machine learning model by repeatedly evaluating it with different hyperparameter configurations. It leverages either random search or Bayesian optimization to choose the hyperparameter values resulting in the best model, as measured by the metric chosen by the user. AMT can be used with built-in algorithms, custom algorithms, and Amazon SageMaker pre-built containers for machine learning frameworks. We discuss the core functionality, system architecture, our design principles, and lessons learned. We also describe more advanced features of AMT, such as automated early stopping and warm-starting, showing in experiments their benefits to users.
Submitted 18 June, 2021; v1 submitted 15 December, 2020;
originally announced December 2020.
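The abstract mentions random search as one of the two strategies for choosing hyperparameter values. The following is a generic, minimal sketch of random search, not the AMT service or its API: the function name `random_search`, the uniform sampling over continuous ranges, and the convention of minimizing the metric are all assumptions made for illustration.

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the best one.

    objective: callable mapping a config dict to a score (lower is better
        here; that direction is an assumption of this sketch).
    space: dict of hyperparameter name -> (low, high) continuous range.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        # Sample each hyperparameter independently and uniformly.
        config = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config):
    # Hypothetical metric with its minimum at a learning rate of 0.1.
    return (config["lr"] - 0.1) ** 2


best_config, best_score = random_search(toy_objective, {"lr": (0.0, 1.0)}, n_trials=50)
```

Because the seed fixes the sampling sequence, a longer run always contains the configurations of a shorter one and can never return a worse best score.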
-
Amazon SageMaker Autopilot: a white box AutoML solution at scale
Authors:
Piali Das,
Valerio Perrone,
Nikita Ivkin,
Tanya Bansal,
Zohar Karnin,
Huibin Shen,
Iaroslav Shcherbatyi,
Yotam Elor,
Wilton Wu,
Aida Zolic,
Thibaut Lienart,
Alex Tang,
Amr Ahmed,
Jean Baptiste Faddoul,
Rodolphe Jenatton,
Fela Winkelmolen,
Philip Gautier,
Leo Dirac,
Andre Perunicic,
Miroslav Miladinovic,
Giovanni Zappella,
Cédric Archambeau,
Matthias Seeger,
Bhaskar Dutt,
Laurence Rouesnel
Abstract:
AutoML systems provide a black-box solution to machine learning problems by selecting the right way of processing features, choosing an algorithm and tuning the hyperparameters of the entire pipeline. Although these systems perform well on many datasets, there is still a non-negligible number of datasets for which the one-shot solution produced by each particular system would provide sub-par performance. In this paper, we present Amazon SageMaker Autopilot: a fully managed system providing an automated ML solution that can be modified when needed. Given a tabular dataset and the target column name, Autopilot identifies the problem type, analyzes the data and produces a diverse set of complete ML pipelines including feature preprocessing and ML algorithms, which are tuned to generate a leaderboard of candidate models. In the scenario where the performance is not satisfactory, a data scientist is able to view and edit the proposed ML pipelines in order to infuse their expertise and business knowledge without having to revert to a fully manual solution. This paper describes the different components of Autopilot, emphasizing the infrastructure choices that allow scalability, high quality models, editable ML pipelines, consumption of artifacts of offline meta-learning, and a convenient integration with the entire SageMaker suite allowing these trained models to be used in a production setting.
Submitted 16 December, 2020; v1 submitted 15 December, 2020;
originally announced December 2020.
-
Chargrid: Towards Understanding 2D Documents
Authors:
Anoop Raveendra Katti,
Christian Reisswig,
Cordula Guder,
Sebastian Brarda,
Steffen Bickel,
Johannes Höhne,
Jean Baptiste Faddoul
Abstract:
We introduce a novel type of text representation that preserves the 2D layout of a document. This is achieved by encoding each document page as a two-dimensional grid of characters. Based on this representation, we present a generic document understanding pipeline for structured documents. This pipeline makes use of a fully convolutional encoder-decoder network that predicts a segmentation mask and bounding boxes. We demonstrate its capabilities on an information extraction task from invoices and show that it significantly outperforms approaches based on sequential text or document images.
Submitted 24 September, 2018;
originally announced September 2018.
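The idea of encoding a document page as a two-dimensional grid of characters can be sketched as follows. The abstract gives no details of the encoding, so the input format (character boxes already expressed in grid-cell coordinates), the integer index per character, and the use of 0 for background are assumptions of this sketch, and `build_chargrid` is a hypothetical helper name.

```python
def build_chargrid(char_boxes, height, width, vocab=None):
    """Rasterize character boxes into a height x width grid of integer ids.

    char_boxes: iterable of (char, x0, y0, x1, y1) with half-open cell
        ranges [x0, x1) and [y0, y1) (assumed input format).
    Returns the grid (list of rows) and the char -> id vocabulary.
    Cells not covered by any character keep the background id 0.
    """
    vocab = {} if vocab is None else vocab
    grid = [[0] * width for _ in range(height)]
    for char, x0, y0, x1, y1 in char_boxes:
        # Assign the next free positive id to unseen characters.
        index = vocab.setdefault(char, len(vocab) + 1)
        # Fill the area occupied by the character, clipped to the page.
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                grid[y][x] = index
    return grid, vocab


boxes = [("A", 0, 0, 2, 1), ("B", 2, 0, 3, 1)]
grid, vocab = build_chargrid(boxes, height=2, width=4)
```

Here `grid` is `[[1, 1, 2, 0], [0, 0, 0, 0]]`: the spatial layout of the two characters is preserved, which a plain sequential text representation would discard.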