-
SmartChoices: Augmenting Software with Learned Implementations
Authors:
Daniel Golovin,
Gabor Bartok,
Eric Chen,
Emily Donahue,
Tzu-Kuo Huang,
Efi Kokiopoulou,
Ruoyan Qin,
Nikhil Sarda,
Justin Sybrandt,
Vincent Tjeng
Abstract:
We are living in a golden age of machine learning. Powerful models perform many tasks far better than is possible using traditional software engineering approaches alone. However, developing and deploying these models in existing software systems remains challenging. In this paper, we present SmartChoices, a novel approach to incorporating machine learning into mature software stacks easily, safely, and effectively. We highlight key design decisions and present case studies applying SmartChoices within a range of large-scale industrial systems.
Submitted 30 November, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Ranking architectures using meta-learning
Authors:
Alina Dubatovka,
Efi Kokiopoulou,
Luciano Sbaiz,
Andrea Gesmundo,
Gabor Bartok,
Jesse Berent
Abstract:
Neural architecture search has recently attracted lots of research efforts as it promises to automate the manual design of neural networks. However, it requires a large amount of computing resources and in order to alleviate this, a performance prediction network has been recently proposed that enables efficient architecture search by forecasting the performance of candidate architectures, instead of relying on actual model training. The performance predictor is task-aware taking as input not only the candidate architecture but also task meta-features and it has been designed to collectively learn from several tasks. In this work, we introduce a pairwise ranking loss for training a network able to rank candidate architectures for a new unseen task conditioning on its task meta-features. We present experimental results, showing that the ranking network is more effective in architecture search than the previously proposed performance predictor.
Submitted 26 November, 2019;
originally announced November 2019.
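The abstract above mentions a pairwise ranking loss but does not state its form. The following is a minimal sketch assuming a logistic (RankNet-style) pairwise loss; the function name `pairwise_logistic_loss` and the toy inputs are hypothetical and are not taken from the paper.

```python
import math


def pairwise_logistic_loss(scores, targets):
    """Average logistic loss over all ordered pairs (i, j) where
    architecture i truly outperforms architecture j.

    scores:  predicted ranking scores, one per candidate architecture
    targets: measured performance of the same architectures
    (The logistic form is an assumption, not the paper's stated loss.)
    """
    total, count = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if targets[i] > targets[j]:
                margin = scores[i] - scores[j]
                # Small when the better architecture gets the higher score.
                total += math.log1p(math.exp(-margin))
                count += 1
    return total / count if count else 0.0
```

A loss of this kind depends only on the relative order of the scores, which is why it suits ranking candidates rather than predicting their absolute performance.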
-
Flexible Multi-task Networks by Learning Parameter Allocation
Authors:
Krzysztof Maziarz,
Efi Kokiopoulou,
Andrea Gesmundo,
Luciano Sbaiz,
Gabor Bartok,
Jesse Berent
Abstract:
This paper proposes a novel learning method for multi-task applications. Multi-task neural networks can learn to transfer knowledge across different tasks by using parameter sharing. However, sharing parameters between unrelated tasks can hurt performance. To address this issue, we propose a framework to learn fine-grained patterns of parameter sharing. Assuming that the network is composed of several components across layers, our framework uses learned binary variables to allocate components to tasks in order to encourage more parameter sharing between related tasks, and discourage parameter sharing otherwise. The binary allocation variables are learned jointly with the model parameters by standard back-propagation thanks to the Gumbel-Softmax reparametrization method. When applied to the Omniglot benchmark, the proposed method achieves a 17% relative reduction of the error rate compared to state-of-the-art.
Submitted 18 July, 2020; v1 submitted 10 October, 2019;
originally announced October 2019.
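The Gumbel-Softmax reparametrization named in the abstract above can be sketched for a single binary allocation variable as follows. This is an illustration under stated assumptions, not the paper's implementation: the function name, the temperature value, and the two-way (allocate / do not allocate) encoding are all hypothetical.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed (soft) one-hot sample from a categorical
    distribution with the given unnormalized log-probabilities."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)          # avoid log(0)
        gumbel = -math.log(-math.log(u))      # standard Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)                          # subtract max for stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def allocation_sample(prob_allocate, temperature, rng):
    """Relaxed binary allocation: index 0 means 'component is allocated'."""
    logits = [math.log(prob_allocate), math.log(1.0 - prob_allocate)]
    return gumbel_softmax_sample(logits, temperature, rng)
```

Because the sample is a smooth function of the logits, a framework with automatic differentiation could pass gradients through it; at low temperature the sample approaches a hard 0/1 decision.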
-
Fast Task-Aware Architecture Inference
Authors:
Efi Kokiopoulou,
Anja Hauth,
Luciano Sbaiz,
Andrea Gesmundo,
Gabor Bartok,
Jesse Berent
Abstract:
Neural architecture search has been shown to hold great promise towards the automation of deep learning. However in spite of its potential, neural architecture search remains quite costly. To this point, we propose a novel gradient-based framework for efficient architecture search by sharing information across several tasks. We start by training many model architectures on several related (training) tasks. When a new unseen task is presented, the framework performs architecture inference in order to quickly identify a good candidate architecture, before any model is trained on the new task. At the core of our framework lies a deep value network that can predict the performance of input architectures on a task by utilizing task meta-features and the previous model training experiments performed on related tasks. We adopt a continuous parametrization of the model architecture which allows for efficient gradient-based optimization. Given a new task, an effective architecture is quickly identified by maximizing the estimated performance with respect to the model architecture parameters with simple gradient ascent. It is key to point out that our goal is to achieve reasonable performance at the lowest cost. We provide experimental results showing the effectiveness of the framework despite its high computational efficiency.
Submitted 15 February, 2019;
originally announced February 2019.
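The architecture-inference step described in the abstract above (maximizing estimated performance over continuous architecture parameters by gradient ascent) can be sketched as follows. The quadratic `estimated_performance` function is a hypothetical stand-in for the trained deep value network, and the finite-difference gradient replaces back-propagation purely to keep the example self-contained.

```python
def estimated_performance(arch, target=(0.3, 0.7, 0.5)):
    """Toy surrogate (assumption): performance peaks at `target`."""
    return -sum((a - t) ** 2 for a, t in zip(arch, target))


def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def infer_architecture(f, start, learning_rate=0.1, steps=200):
    """Plain gradient ascent on the estimated performance."""
    arch = list(start)
    for _ in range(steps):
        grad = numerical_gradient(f, arch)
        arch = [a + learning_rate * g for a, g in zip(arch, grad)]
    return arch
```

No model is trained on the new task inside this loop; only the surrogate is queried, which is what makes this kind of inference cheap.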
-
Importance weighting without importance weights: An efficient algorithm for combinatorial semi-bandits
Authors:
Gergely Neu,
Gábor Bartók
Abstract:
We propose a sample-efficient alternative for importance weighting for situations where one only has sample access to the probability distribution that generates the observations. Our new method, called Geometric Resampling (GR), is described and analyzed in the context of online combinatorial optimization under semi-bandit feedback, where a learner sequentially selects its actions from a combinatorial decision set so as to minimize its cumulative loss. In particular, we show that the well-known Follow-the-Perturbed-Leader (FPL) prediction method coupled with Geometric Resampling yields the first computationally efficient reduction from offline to online optimization in this setting. We provide a thorough theoretical analysis for the resulting algorithm, showing that its performance is on par with previous, inefficient solutions. Our main contribution is showing that, despite the relatively large variance induced by the GR procedure, our performance guarantees hold with high probability rather than only in expectation. As a side result, we also improve the best known regret bounds for FPL in online combinatorial optimization with full feedback, closing the perceived performance gap between FPL and exponential weights in this setting.
Submitted 31 August, 2016; v1 submitted 17 March, 2015;
originally announced March 2015.
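The idea behind Geometric Resampling in the abstract above can be sketched as follows: with only sample access to the action distribution, the number of fresh draws needed until a component reappears has expectation close to the inverse of its selection probability, so that count can stand in for the importance weight. The uniform toy sampler, the cap `max_draws`, and all function names here are assumptions made for illustration.

```python
import random


def sample_action(rng, d=4):
    """Toy sampler (assumption): play one of d basis vectors uniformly,
    so each component is selected with probability 1/d."""
    chosen = rng.randrange(d)
    return [1 if i == chosen else 0 for i in range(d)]


def geometric_resampling(component, max_draws, rng):
    """Count fresh draws until `component` is selected, capped at
    max_draws. The count estimates 1 / P(component selected)."""
    for k in range(1, max_draws + 1):
        if sample_action(rng)[component] == 1:
            return k
    return max_draws


def estimate_loss(played_action, observed_loss, component, max_draws, rng):
    """Semi-bandit loss estimate: zero for unobserved components,
    otherwise the observed loss scaled by the resampling count."""
    if played_action[component] == 0:
        return 0.0
    return geometric_resampling(component, max_draws, rng) * observed_loss
```

The cap bounds the cost per round at the price of a small bias in the estimate.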
-
Near-Optimally Teaching the Crowd to Classify
Authors:
Adish Singla,
Ilija Bogunovic,
Gábor Bartók,
Amin Karbasi,
Andreas Krause
Abstract:
How should we present training examples to learners to teach them classification rules? This is a natural problem when training workers for crowdsourcing labeling tasks, and is also motivated by challenges in data-driven online education. We propose a natural stochastic model of the learners, modeling them as randomly switching among hypotheses based on observed feedback. We then develop STRICT, an efficient algorithm for selecting examples to teach to workers. Our solution greedily maximizes a submodular surrogate objective function in order to select examples to show to the learners. We prove that our strategy is competitive with the optimal teaching policy. Moreover, for the special case of linear separators, we prove that an exponential reduction in error probability can be achieved. Our experiments on simulated workers as well as three real image annotation tasks on Amazon Mechanical Turk show the effectiveness of our teaching algorithm.
Submitted 7 March, 2014; v1 submitted 10 February, 2014;
originally announced February 2014.
-
An efficient algorithm for learning with semi-bandit feedback
Authors:
Gergely Neu,
Gábor Bartók
Abstract:
We consider the problem of online combinatorial optimization under semi-bandit feedback. The goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. We propose a learning algorithm for this problem based on combining the Follow-the-Perturbed-Leader (FPL) prediction method with a novel loss estimation procedure called Geometric Resampling (GR). Contrary to previous solutions, the resulting algorithm can be efficiently implemented for any decision set where efficient offline combinatorial optimization is possible at all. Assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after T rounds is O(m sqrt(dT log d)). As a side result, we also improve the best known regret bounds for FPL in the full information setting to O(m^(3/2) sqrt(T log d)), gaining a factor of sqrt(d/m) over previous bounds for this algorithm.
Submitted 13 May, 2013;
originally announced May 2013.
-
An Adaptive Algorithm for Finite Stochastic Partial Monitoring
Authors:
Gabor Bartok,
Navid Zolghadr,
Csaba Szepesvari
Abstract:
We present a new anytime algorithm that achieves near-optimal regret for any instance of finite stochastic partial monitoring. In particular, the new algorithm achieves the minimax regret, within logarithmic factors, for both "easy" and "hard" problems. For easy problems, it additionally achieves logarithmic individual regret. Most importantly, the algorithm is adaptive in the sense that if the opponent strategy is in an "easy region" of the strategy space then the regret grows as if the problem was easy. As an implication, we show that under some reasonable additional assumptions, the algorithm enjoys an O(\sqrt{T}) regret in Dynamic Pricing, proven to be hard by Bartok et al. (2011).
Submitted 27 June, 2012;
originally announced June 2012.
-
Non-trivial two-armed partial-monitoring games are bandits
Authors:
András Antos,
Gábor Bartók,
Csaba Szepesvári
Abstract:
We consider online learning in partial-monitoring games against an oblivious adversary. We show that when the number of actions available to the learner is two and the game is nontrivial then it is reducible to a bandit-like game and thus the minimax regret is $Θ(\sqrt{T})$.
Submitted 24 August, 2011;
originally announced August 2011.
-
Toward a Classification of Finite Partial-Monitoring Games
Authors:
András Antos,
Gábor Bartók,
Dávid Pál,
Csaba Szepesvári
Abstract:
Partial-monitoring games constitute a mathematical framework for sequential decision making problems with imperfect feedback: The learner repeatedly chooses an action, opponent responds with an outcome, and then the learner suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the learner is to minimize his total cumulative loss. We make progress towards the classification of these games based on their minimax expected regret. Namely, we classify almost all games with two outcomes and finite number of actions: We show that their minimax expected regret is either zero, $\widetilde{Θ}(\sqrt{T})$, $Θ(T^{2/3})$, or $Θ(T)$ and we give a simple and efficiently computable classification of these four classes of games. Our hope is that the result can serve as a stepping stone toward classifying all finite partial-monitoring games.
Submitted 11 October, 2011; v1 submitted 10 February, 2011;
originally announced February 2011.