-
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization
Authors:
Alonso Marco,
Felix Berkenkamp,
Philipp Hennig,
Angela P. Schoellig,
Andreas Krause,
Stefan Schaal,
Sebastian Trimpe
Abstract:
In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.
Submitted 3 March, 2017;
originally announced March 2017.
-
Bifurcations and dynamics emergent from lattice and continuum models of bioactive porous media
Authors:
Andrew L. Krause,
Dmitry Beliaev,
Robert A. Van Gorder,
Sarah L. Waters
Abstract:
We study dynamics emergent from a two-dimensional reaction-diffusion process modelled via a finite lattice dynamical system, as well as an analogous PDE system, involving spatially nonlocal interactions. These models govern the evolution of cells in a bioactive porous medium, with evolution of the local cell density depending on a coupled quasi-static fluid flow problem. We demonstrate differences emergent from the choice of a discrete lattice or a continuum for the spatial domain of such a process. We find long-time oscillations and steady states in cell density in both lattice and continuum models, but that the continuum model only exhibits solutions with vertical symmetry, independent of initial data, whereas the finite lattice admits asymmetric oscillations and steady states arising from symmetry-breaking bifurcations. We conjecture that it is the structure of the finite lattice which allows for more complicated asymmetric dynamics. Our analysis suggests that the origin of both types of oscillations is a nonlocal reaction-diffusion mechanism mediated by quasi-static fluid flow.
Submitted 4 June, 2018; v1 submitted 24 February, 2017;
originally announced February 2017.
-
Uniform Deviation Bounds for Unbounded Loss Functions like k-Means
Authors:
Olivier Bachem,
Mario Lucic,
S. Hamed Hassani,
Andreas Krause
Abstract:
Uniform deviation bounds limit the difference between a model's expected loss and its loss on an empirical sample uniformly for all models in a learning problem. As such, they are a critical component of empirical risk minimization. In this paper, we provide a novel framework to obtain uniform deviation bounds for loss functions which are *unbounded*. In our main application, this allows us to obtain bounds for $k$-Means clustering under weak assumptions on the underlying distribution. If the fourth moment is bounded, we prove a rate of $\mathcal{O}\left(m^{-\frac12}\right)$ compared to the previously known $\mathcal{O}\left(m^{-\frac14}\right)$ rate. Furthermore, we show that the rate also depends on the kurtosis, the normalized fourth moment, which measures the "tailedness" of a distribution. We further provide improved rates under progressively stronger assumptions, namely, bounded higher moments, subgaussianity and bounded support.
Submitted 27 February, 2017;
originally announced February 2017.
-
Scalable k-Means Clustering via Lightweight Coresets
Authors:
Olivier Bachem,
Mario Lucic,
Andreas Krause
Abstract:
Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive data sets. While existing approaches generally only allow for multiplicative approximation errors, we propose a novel notion of lightweight coresets that allows for both multiplicative and additive errors. We provide a single algorithm to construct lightweight coresets for k-means clustering as well as soft and hard Bregman clustering. The algorithm is substantially faster than existing constructions, embarrassingly parallel, and the resulting coresets are smaller. We further show that the proposed approach naturally generalizes to statistical k-means clustering and that, compared to existing results, it can be used to compute smaller summaries for empirical risk minimization. In extensive experiments, we demonstrate that the proposed algorithm outperforms existing data summarization strategies in practice.
Submitted 6 June, 2018; v1 submitted 27 February, 2017;
originally announced February 2017.
-
Lattice and Continuum Modelling of a Bioactive Porous Tissue Scaffold
Authors:
Andrew L. Krause,
Dmitry Beliaev,
Robert A. Van Gorder,
Sarah L. Waters
Abstract:
A contemporary procedure to grow artificial tissue is to seed cells onto a porous biomaterial scaffold and culture it within a perfusion bioreactor to facilitate the transport of nutrients to growing cells. Typical models of cell growth for tissue engineering applications make use of spatially homogeneous or spatially continuous equations to model cell growth, flow of culture medium, nutrient transport, and their interactions. The network structure of the physical porous scaffold is often incorporated through parameters in these models, either phenomenologically or through techniques like mathematical homogenization. We derive a model on a square grid lattice to demonstrate the importance of explicitly modelling the network structure of the porous scaffold, and compare results from this model with those from a modified continuum model from the literature. We capture two-way coupling between cell growth and fluid flow by allowing cells to block pores, and by allowing the shear stress of the fluid to affect cell growth and death. We explore a range of parameters for both models, and demonstrate quantitative and qualitative differences between predictions from each of these approaches, including spatial pattern formation and local oscillations in cell density present only in the lattice model. These differences suggest that for some parameter regimes, corresponding to specific cell types and scaffold geometries, the lattice model gives qualitatively different model predictions than typical continuum models. Our results inform model selection for bioactive porous tissue scaffolds, aiding in the development of successful tissue engineering experiments and eventually clinically successful technologies.
Submitted 29 April, 2018; v1 submitted 24 February, 2017;
originally announced February 2017.
-
Learning to Use Learners' Advice
Authors:
Adish Singla,
Hamed Hassani,
Andreas Krause
Abstract:
In this paper, we study a variant of the framework of online learning using expert advice with limited/bandit feedback. We consider each expert as a learning entity, seeking to more accurately reflect certain real-world applications. In our setting, the feedback at any time $t$ is limited in the sense that it is only available to the expert $i^t$ that has been selected by the central algorithm (forecaster), i.e., only the expert $i^t$ receives feedback from the environment and gets to learn at time $t$. We consider a generic black-box approach whereby the forecaster does not control or know the learning dynamics of the experts apart from knowing the following no-regret learning property: the average regret of any expert $j$ vanishes at a rate of at least $O(t_j^{\regretRate-1})$ with $t_j$ learning steps where $\regretRate \in [0, 1]$ is a parameter.
In the spirit of competing against the best action in hindsight in the multi-armed bandit problem, our goal here is to be competitive w.r.t. the cumulative losses the algorithm could receive by following the policy of always selecting one expert. We prove the following hardness result: without any coordination between the forecaster and the experts, it is impossible to design a forecaster achieving no-regret guarantees. In order to circumvent this hardness result, we consider a practical assumption allowing the forecaster to "guide" the learning process of the experts by filtering/blocking some of the feedback they observe from the environment, i.e., not allowing the selected expert $i^t$ to learn at time $t$ for some time steps. Then, we design a novel no-regret learning algorithm \algo for this problem setting by carefully guiding the feedback observed by the experts. We prove that \algo achieves the worst-case expected cumulative regret of $O(\Time^\frac{1}{2 - \regretRate})$ after $\Time$ time steps.
Submitted 17 February, 2017; v1 submitted 15 February, 2017;
originally announced February 2017.
-
Vanishing lines for modules over the motivic Steenrod algebra
Authors:
Drew Heard,
Achim Krause
Abstract:
We study criteria for freeness and for the existence of a vanishing line for modules over certain Hopf subalgebras of the motivic Steenrod algebra over $\mathrm{Spec}(\mathbb{C})$ at the prime 2. These turn out to be determined by the vanishing of certain Margolis homology groups in the quotient Hopf algebra $\mathcal{A}/τ$.
Submitted 24 January, 2018; v1 submitted 13 February, 2017;
originally announced February 2017.
-
Coordinated Online Learning With Applications to Learning User Preferences
Authors:
Christoph Hirnschall,
Adish Singla,
Sebastian Tschiatschek,
Andreas Krause
Abstract:
We study an online multi-task learning setting, in which instances of related tasks arrive sequentially, and are handled by task-specific online learners. We consider an algorithmic framework to model the relationship of these tasks via a set of convex constraints. To exploit this relationship, we design a novel algorithm -- COOL -- for coordinating the individual online learners: our key idea is to coordinate their parameters via weighted projections onto a convex set. By adjusting the rate and accuracy of the projection, the COOL algorithm allows for a trade-off between the benefit of coordination and the required computation/communication. We derive regret bounds for our approach and analyze how they are influenced by these trade-off factors. We apply our results to learning users' preferences on the Airbnb marketplace with the goal of incentivizing users to explore under-reviewed apartments.
Submitted 9 February, 2017;
originally announced February 2017.
-
Truncated Variance Reduction: A Unified Approach to Bayesian Optimization and Level-Set Estimation
Authors:
Ilija Bogunovic,
Jonathan Scarlett,
Andreas Krause,
Volkan Cevher
Abstract:
We present a new algorithm, truncated variance reduction (TruVaR), that treats Bayesian optimization (BO) and level-set estimation (LSE) with Gaussian processes in a unified fashion. The algorithm greedily shrinks a sum of truncated variances within a set of potential maximizers (BO) or unclassified points (LSE), which is updated based on confidence bounds. TruVaR is effective in several important settings that are typically non-trivial to incorporate into myopic algorithms, including pointwise costs and heteroscedastic noise. We provide a general theoretical guarantee for TruVaR covering these aspects, and use it to recover and strengthen existing results on BO and LSE. Moreover, we provide a new result for a setting where one can select from a number of noise levels having associated costs. We demonstrate the effectiveness of the algorithm on both synthetic and real-world data sets.
Submitted 24 October, 2016;
originally announced October 2016.
-
Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains
Authors:
Andrew An Bian,
Baharan Mirzasoleiman,
Joachim M. Buhmann,
Andreas Krause
Abstract:
Submodular continuous functions are a category of (generally) non-convex/non-concave functions with a wide spectrum of applications. We characterize these functions and demonstrate that they can be maximized efficiently with approximation guarantees. Specifically, i) We introduce the weak DR property that gives a unified characterization of submodularity for all set, integer-lattice and continuous functions; ii) for maximizing monotone DR-submodular continuous functions under general down-closed convex constraints, we propose a Frank-Wolfe variant with $(1-1/e)$ approximation guarantee, and sub-linear convergence rate; iii) for maximizing general non-monotone submodular continuous functions subject to box constraints, we propose a DoubleGreedy algorithm with $1/3$ approximation guarantee. Submodular continuous functions naturally find applications in various real-world settings, including influence and revenue maximization with continuous assignments, sensor energy management, multi-resolution data summarization, facility location, etc. Experimental results show that the proposed algorithms efficiently generate superior solutions compared to baseline algorithms.
Submitted 6 May, 2019; v1 submitted 17 June, 2016;
originally announced June 2016.
-
Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
Authors:
Matteo Turchetta,
Felix Berkenkamp,
Andreas Krause
Abstract:
In classical reinforcement learning, when exploring an environment, agents accept arbitrary short-term loss for long-term gain. This is infeasible for safety-critical applications, such as robotics, where even a single unsafe action may cause system failure. In this paper, we address the problem of safely exploring finite Markov decision processes (MDP). We define safety in terms of an a priori unknown safety constraint that depends on states and actions. We aim to explore the MDP under this constraint, assuming that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop a novel algorithm for this task and prove that it is able to completely explore the safely reachable part of the MDP without violating the safety constraint. To achieve this, it cautiously explores safe states and actions in order to gain statistical confidence about the safety of unvisited state-action pairs from noisy observations collected while navigating the environment. Moreover, the algorithm explicitly considers reachability when exploring the MDP, ensuring that it does not get stuck in any state with no safe way out. We demonstrate our method on digital terrain models for the task of exploring an unknown map with a rover.
Submitted 15 November, 2016; v1 submitted 15 June, 2016;
originally announced June 2016.
-
Horizontally Scalable Submodular Maximization
Authors:
Mario Lucic,
Olivier Bachem,
Morteza Zadimoghaddam,
Andreas Krause
Abstract:
A variety of large-scale machine learning problems can be cast as instances of constrained submodular maximization. Existing approaches for distributed submodular maximization have a critical drawback: The capacity - number of instances that can fit in memory - must grow with the data set size. In practice, while one can provision many machines, the capacity of each machine is limited by physical constraints. We propose a truly scalable approach for distributed submodular maximization under fixed capacity. The proposed framework applies to a broad class of algorithms and constraints and provides theoretical guarantees on the approximation factor for any available capacity. We empirically evaluate the proposed algorithm on a variety of data sets and demonstrate that it achieves performance competitive with the centralized greedy solution.
Submitted 31 May, 2016;
originally announced May 2016.
-
Near-optimal Bayesian Active Learning with Correlated and Noisy Tests
Authors:
Yuxin Chen,
S. Hamed Hassani,
Andreas Krause
Abstract:
We consider the Bayesian active learning and experimental design problem, where the goal is to learn the value of some unknown target variable through a sequence of informative, noisy tests. In contrast to prior work, we focus on the challenging, yet practically relevant setting where test outcomes can be conditionally dependent given the hidden target variable. Under such assumptions, common heuristics, such as greedily performing tests that maximize the reduction in uncertainty of the target, often perform poorly. In this paper, we propose ECED, a novel, computationally efficient active learning algorithm, and prove strong theoretical guarantees that hold with correlated, noisy tests. Rather than directly optimizing the prediction error, at each step, ECED picks the test that maximizes the gain in a surrogate objective, which takes into account the dependencies between tests. Our analysis relies on an information-theoretic auxiliary function to track the progress of ECED, and utilizes adaptive submodularity to attain the near-optimal bound. We demonstrate strong empirical performance of ECED on two problem instances, including a Bayesian experimental design task intended to distinguish among economic theories of how people make risky decisions, and an active preference learning task via pairwise comparisons.
Submitted 11 July, 2016; v1 submitted 24 May, 2016;
originally announced May 2016.
-
Actively Learning Hemimetrics with Applications to Eliciting User Preferences
Authors:
Adish Singla,
Sebastian Tschiatschek,
Andreas Krause
Abstract:
Motivated by an application of eliciting users' preferences, we investigate the problem of learning hemimetrics, i.e., pairwise distances among a set of $n$ items that satisfy triangle inequalities and non-negativity constraints. In our application, the (asymmetric) distances quantify private costs a user incurs when substituting one item by another. We aim to learn these distances (costs) by asking the users whether they are willing to switch from one item to another for a given incentive offer. Without exploiting structural constraints of the hemimetric polytope, learning the distances between each pair of items requires $Θ(n^2)$ queries. We propose an active learning algorithm that substantially reduces this sample complexity by exploiting the structural constraints on the version space of hemimetrics. Our proposed algorithm achieves provably-optimal sample complexity for various instances of the task. For example, when the items are embedded into $K$ tight clusters, the sample complexity of our algorithm reduces to $O(n K)$. Extensive experiments on a restaurant recommendation data set support the conclusions of our theoretical analysis.
Submitted 27 May, 2016; v1 submitted 23 May, 2016;
originally announced May 2016.
-
Algorithms for Learning Sparse Additive Models with Interactions in High Dimensions
Authors:
Hemant Tyagi,
Anastasios Kyrillidis,
Bernd Gärtner,
Andreas Krause
Abstract:
A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}φ_{l}(x_l)$ where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming the $φ$'s and $\mathcal{S}$ to be unknown, there exists extensive work on estimating $f$ from its samples. In this work, we consider a generalized version of SPAMs that also allows for the presence of a sparse number of second-order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, with $|\mathcal{S}_1| \ll d, |\mathcal{S}_2| \ll d^2$, the function $f$ is now assumed to be of the form: $\sum_{p \in \mathcal{S}_1}φ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}φ_{(l,l^{\prime})} (x_l,x_{l^{\prime}})$. Assuming we have the freedom to query $f$ anywhere in its domain, we derive efficient algorithms that provably recover $\mathcal{S}_1,\mathcal{S}_2$ with finite sample bounds. Our analysis covers the noiseless setting where exact samples of $f$ are obtained, and also extends to the noisy setting where the queries are corrupted with noise. For the noisy setting in particular, we consider two noise models, namely i.i.d. Gaussian noise and arbitrary but bounded noise. Our main methods for identification of $\mathcal{S}_2$ essentially rely on estimation of sparse Hessian matrices, for which we provide two novel compressed-sensing-based schemes. Once $\mathcal{S}_1, \mathcal{S}_2$ are known, we show how the individual components $φ_p$, $φ_{(l,l^{\prime})}$ can be estimated via additional queries of $f$, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.
Submitted 8 May, 2017; v1 submitted 2 May, 2016;
originally announced May 2016.
-
Tradeoffs for Space, Time, Data and Risk in Unsupervised Learning
Authors:
Mario Lucic,
Mesrob I. Ohannessian,
Amin Karbasi,
Andreas Krause
Abstract:
Faced with massive data, is it possible to trade off (statistical) risk, and (computational) space and time? This challenge lies at the heart of large-scale machine learning. Using k-means clustering as a prototypical unsupervised learning problem, we show how we can strategically summarize the data (control space) in order to trade off risk and time when data is generated by a probabilistic model. Our summarization is based on coreset constructions from computational geometry. We also develop an algorithm, TRAM, to navigate the space/time/data/risk tradeoff in practice. In particular, we show that for a fixed risk (or data size), as the data size increases (resp. risk increases) the running time of TRAM decreases. Our extensive experiments on real data sets demonstrate the existence and practical utility of such tradeoffs, not only for k-means but also for Gaussian Mixture Models.
Submitted 2 May, 2016;
originally announced May 2016.
-
Linear-time Outlier Detection via Sensitivity
Authors:
Mario Lucic,
Olivier Bachem,
Andreas Krause
Abstract:
Outliers are ubiquitous in modern data sets. Distance-based techniques are a popular non-parametric approach to outlier detection as they require no prior assumptions on the data generating distribution and are simple to implement. Scaling these techniques to massive data sets without sacrificing accuracy is a challenging task. We propose a novel algorithm based on the intuition that outliers have a significant influence on the quality of divergence-based clustering solutions. We propose sensitivity - the worst-case impact of a data point on the clustering objective - as a measure of outlierness. We then prove that influence, a (non-trivial) upper-bound on the sensitivity, can be computed by a simple linear time algorithm. To scale beyond a single machine, we propose a communication efficient distributed algorithm. In an extensive experimental evaluation, we demonstrate the effectiveness and establish the statistical significance of the proposed approach. In particular, it outperforms the most popular distance-based approaches while being several orders of magnitude faster.
Submitted 2 May, 2016;
originally announced May 2016.
-
Learning Sparse Additive Models with Interactions in High Dimensions
Authors:
Hemant Tyagi,
Anastasios Kyrillidis,
Bernd Gärtner,
Andreas Krause
Abstract:
A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is referred to as a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}φ_{l}(x_l)$, where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $φ_l$'s and $\mathcal{S}$ to be unknown, the problem of estimating $f$ from its samples has been studied extensively. In this work, we consider a generalized SPAM, allowing for second-order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, the function $f$ is assumed to be of the form: $$f(\mathbf{x}) = \sum_{p \in \mathcal{S}_1}φ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}φ_{(l,l^{\prime})} (x_{l},x_{l^{\prime}}).$$ Assuming $φ_{p},φ_{(l,l^{\prime})}$, $\mathcal{S}_1$, and $\mathcal{S}_2$ to be unknown, we provide a randomized algorithm that queries $f$ and exactly recovers $\mathcal{S}_1,\mathcal{S}_2$. Consequently, this also enables us to estimate the underlying $φ_p, φ_{(l,l^{\prime})}$. We derive sample complexity bounds for our scheme and also extend our analysis to include the situation where the queries are corrupted with noise -- either stochastic, or arbitrary but bounded. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.
△ Less
Submitted 18 April, 2016;
originally announced April 2016.
-
Safe Learning of Regions of Attraction for Uncertain, Nonlinear Systems with Gaussian Processes
Authors:
Felix Berkenkamp,
Riccardo Moriconi,
Angela P. Schoellig,
Andreas Krause
Abstract:
Control theory can provide useful insights into the properties of controlled, dynamic systems. One important property of nonlinear systems is the region of attraction (ROA), a safe subset of the state space in which a given controller renders an equilibrium point asymptotically stable. The ROA is typically estimated based on a model of the system. However, since models are only an approximation of the real world, the resulting estimated safe region can contain states outside the ROA of the real system. This is not acceptable in safety-critical applications. In this paper, we consider an approach that learns the ROA from experiments on a real system, without ever leaving the true ROA and, thus, without risking safety-critical failures. Based on regularity assumptions on the model errors in terms of a Gaussian process prior, we use an underlying Lyapunov function in order to determine a region in which an equilibrium point is asymptotically stable with high probability. Moreover, we provide an algorithm to actively and safely explore the state space in order to expand the ROA estimate. We demonstrate the effectiveness of this method in simulation.
Submitted 16 August, 2017; v1 submitted 15 March, 2016;
originally announced March 2016.
-
Bayesian Optimization with Safety Constraints: Safe and Automatic Parameter Tuning in Robotics
Authors:
Felix Berkenkamp,
Andreas Krause,
Angela P. Schoellig
Abstract:
Robotic algorithms typically depend on various parameters, the choice of which significantly affects the robot's performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate unsafe parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in robotics. For example, high-gain controllers might achieve low average tracking error (performance), but can overshoot and violate input constraints. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.
Submitted 7 April, 2020; v1 submitted 14 February, 2016;
originally announced February 2016.
-
Better safe than sorry: Risky function exploitation through safe optimization
Authors:
Eric Schulz,
Quentin J. M. Huys,
Dominik R. Bach,
Maarten Speekenbrink,
Andreas Krause
Abstract:
Exploration-exploitation of functions, that is, learning and optimizing a mapping between inputs and expected outputs, is ubiquitous in many real-world situations. These situations sometimes require us to avoid certain outcomes at all costs, for example because they are poisonous, harmful, or otherwise dangerous. We test participants' behavior in scenarios in which they have to find the optimum of a function while at the same time avoiding outputs below a certain threshold. In two experiments, we find that Safe-Optimization, a Gaussian Process-based exploration-exploitation algorithm, describes participants' behavior well and that participants seem to care firstly whether a point is safe and then try to pick the optimal point from all such safe points. This means that their trade-off between exploration and exploitation can be seen as an intelligent, approximate, and homeostasis-driven strategy.
Submitted 14 May, 2016; v1 submitted 2 February, 2016;
originally announced February 2016.
-
Non-classical correlations between single photons and phonons from a mechanical oscillator
Authors:
Ralf Riedinger,
Sungkun Hong,
Richard A. Norte,
Joshua A. Slater,
Juying Shang,
Alexander G. Krause,
Vikas Anant,
Markus Aspelmeyer,
Simon Gröblacher
Abstract:
Interfacing a single photon with another quantum system is a key capability in modern quantum information science. It allows quantum states of matter, such as spin states of atoms, atomic ensembles or solids, to be prepared and manipulated by photon counting and, in particular, to be distributed over long distances. Such light-matter interfaces have become crucial to fundamental tests of quantum physics and realizations of quantum networks. Here we report non-classical correlations between single photons and phonons -- the quanta of mechanical motion -- from a nanomechanical resonator. We implement a full quantum protocol involving initialization of the resonator in its quantum ground state of motion and subsequent generation and read-out of correlated photon-phonon pairs. The observed violation of a Cauchy-Schwarz inequality is clear evidence for the non-classical nature of the mechanical state generated. Our results demonstrate the availability of on-chip solid-state mechanical resonators as light-matter quantum interfaces. The performance we achieved will enable studies of macroscopic quantum phenomena as well as applications in quantum communication, as quantum memories and as quantum transducers.
Submitted 23 February, 2016; v1 submitted 16 December, 2015;
originally announced December 2015.
-
Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization
Authors:
Adish Singla,
Sebastian Tschiatschek,
Andreas Krause
Abstract:
We address the problem of maximizing an unknown submodular function that can only be accessed via noisy evaluations. Our work is motivated by the task of summarizing content, e.g., image collections, by leveraging users' feedback in the form of clicks or ratings. For summarization tasks with the goal of maximizing coverage and diversity, submodular set functions are a natural choice. When the underlying submodular function is unknown, users' feedback can provide noisy evaluations of the function that we seek to maximize. We provide a generic algorithm -- \submM{} -- for maximizing an unknown submodular function under cardinality constraints. This algorithm makes use of a novel exploration module -- \blbox{} -- that proposes good elements based on adaptively sampling noisy function evaluations. \blbox{} is able to accommodate different kinds of observation models such as value queries and pairwise comparisons. We provide PAC-style guarantees on the quality and sampling cost of the solution obtained by \submM{}. We demonstrate the effectiveness of our approach in an interactive, crowdsourced image collection summarization application.
Submitted 1 December, 2015; v1 submitted 23 November, 2015;
originally announced November 2015.
-
VERCE delivers a productive e-Science environment for seismology research
Authors:
Malcolm Atkinson,
Michele Carpené,
Emanuele Casarotti,
Steffen Claus,
Rosa Filgueira,
Anton Frank,
Michelle Galea,
Tom Garth,
André Gemünd,
Heiner Igel,
Iraklis Klampanos,
Amrey Krause,
Lion Krischer,
Siew Hoon Leong,
Federica Magnoni,
Jonas Matser,
Alberto Michelini,
Andreas Rietbrock,
Horst Schwichtenberg,
Alessandro Spinuso,
Jean-Pierre Vilotte
Abstract:
The VERCE project has pioneered an e-Infrastructure to support researchers using established simulation codes on high-performance computers in conjunction with multiple sources of observational data. This is accessed and organised via the VERCE science gateway that makes it convenient for seismologists to use these resources from any location via the Internet. Their data handling is made flexible and scalable by two Python libraries, ObsPy and dispel4py, and by data services delivered by ORFEUS and EUDAT. Provenance driven tools enable rapid exploration of results and of the relationships between data, which accelerates understanding and method improvement. These powerful facilities are integrated and draw on many other e-Infrastructures. This paper presents the motivation for building such systems, reviews how solid-Earth scientists can make significant research progress using them, and explains the architecture and mechanisms that make their construction and operation achievable. We conclude with a summary of the achievements to date and identify the crucial steps needed to extend the capabilities for seismologists, for solid-Earth scientists and for similar disciplines.
Submitted 7 October, 2015;
originally announced October 2015.
-
Safe Controller Optimization for Quadrotors with Gaussian Processes
Authors:
Felix Berkenkamp,
Angela P. Schoellig,
Andreas Krause
Abstract:
One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.
Submitted 16 August, 2017; v1 submitted 3 September, 2015;
originally announced September 2015.
-
Strong Coresets for Hard and Soft Bregman Clustering with Applications to Exponential Family Mixtures
Authors:
Mario Lucic,
Olivier Bachem,
Andreas Krause
Abstract:
Coresets are efficient representations of data sets such that models trained on the coreset are provably competitive with models trained on the original data set. As such, they have been successfully used to scale up clustering models such as K-Means and Gaussian mixture models to massive data sets. However, until now, the algorithms and the corresponding theory were usually specific to each clustering problem.
We propose a single, practical algorithm to construct strong coresets for a large class of hard and soft clustering problems based on Bregman divergences. This class includes hard clustering with popular distortion measures such as the Squared Euclidean distance, the Mahalanobis distance, KL-divergence and Itakura-Saito distance. The corresponding soft clustering problems are directly related to popular mixture models due to a dual relationship between Bregman divergences and Exponential family distributions. Our theoretical results further imply a randomized polynomial-time approximation scheme for hard clustering. We demonstrate the practicality of the proposed algorithm in an empirical evaluation.
Submitted 2 May, 2016; v1 submitted 21 August, 2015;
originally announced August 2015.
-
Learning to Hire Teams
Authors:
Adish Singla,
Eric Horvitz,
Pushmeet Kohli,
Andreas Krause
Abstract:
Crowdsourcing and human computation have been employed in increasingly sophisticated projects that require the solution of a heterogeneous set of tasks. We explore the challenge of building or hiring an effective team, for performing tasks required for such projects on an ongoing basis, from an available pool of applicants or workers who have bid for the tasks. The recruiter needs to learn workers' skills and expertise by performing online tests and interviews, and would like to minimize the amount of budget or time spent in this process before committing to hiring the team. How can one optimally spend budget to learn the expertise of workers as part of recruiting a team? How can one exploit the similarities among tasks as well as underlying social ties or commonalities among the workers for faster learning? We tackle these decision-theoretic challenges by casting them as an instance of online learning for best action selection. We present algorithms with PAC bounds on the required budget to hire a near-optimal team with high confidence. Furthermore, we consider an embedding of the tasks and workers in an underlying graph that may arise from task similarities or social ties, and that can provide additional side-observations for faster learning. We then quantify the improvement in the bounds that we can achieve depending on the characteristic properties of this graph structure. We evaluate our methodology on simulated problem instances as well as on real-world crowdsourcing data collected from the oDesk platform. Our methodology and results present an interesting direction of research to tackle the challenges faced by a recruiter for contract-based crowdsourcing.
Submitted 12 August, 2015;
originally announced August 2015.
-
Crowd Access Path Optimization: Diversity Matters
Authors:
Besmira Nushi,
Adish Singla,
Anja Gruenheid,
Erfan Zamanian,
Andreas Krause,
Donald Kossmann
Abstract:
Quality assurance is one of the most important challenges in crowdsourcing. Assigning tasks to several workers to increase quality through redundant answers can be expensive if asking homogeneous sources. This limitation has been overlooked by current crowdsourcing platforms, resulting in costly solutions. In order to achieve desirable cost-quality tradeoffs it is essential to apply efficient crowd access optimization techniques. Our work argues that optimization needs to be aware of diversity and correlation of information within groups of individuals so that crowdsourcing redundancy can be adequately planned beforehand. Based on this intuitive idea, we introduce the Access Path Model (APM), a novel crowd model that leverages the notion of access paths as an alternative way of retrieving information. APM aggregates answers ensuring high quality and meaningful confidence. Moreover, we devise a greedy optimization algorithm for this model that finds a provably good approximate plan to access the crowd. We evaluate our approach on three crowdsourced datasets that illustrate various aspects of the problem. Our results show that the Access Path Model combined with greedy optimization is cost-efficient and practical to overcome common difficulties in large-scale crowdsourcing like data sparsity and anonymity.
Submitted 11 August, 2015; v1 submitted 8 August, 2015;
originally announced August 2015.
-
Optical read out and feedback cooling of a nanostring optomechanical cavity
Authors:
Alex G. Krause,
Tim D. Blasius,
Oskar Painter
Abstract:
Optical measurement of the motion of a 940 kHz mechanical resonance of a silicon nitride nanostring resonator is demonstrated with a read out noise imprecision reaching 37 dB below that of the resonator's zero-point fluctuations. Via intensity modulation of the optical probe laser, radiation pressure feedback is used to cool and damp the mechanical mode from an initial room temperature occupancy of $\bar{n}_{b} = 6.5 \times 10^6$ ($T_{b}=295$K) down to a phonon occupation of $\langle n \rangle = 66 \pm 10$, representing a mode temperature of $T_{m} \approx 3$mK. The five decades of cooling is enabled by the system's large single-photon cooperativity $(C_{1} = 4)$ and high quantum efficiency of optical motion detection ($η_{t} = 0.27$).
Submitted 3 June, 2015;
originally announced June 2015.
-
Discovering Valuable Items from Massive Data
Authors:
Hastagiri P. Vanchinathan,
Andreas Marfurt,
Charles-Antoine Robelin,
Donald Kossmann,
Andreas Krause
Abstract:
Suppose there is a large collection of items, each with an associated cost and an inherent utility that is revealed only once we commit to selecting it. Given a budget on the cumulative cost of the selected items, how can we pick a subset of maximal value? This task generalizes several important problems such as multi-arm bandits, active search and the knapsack problem. We present an algorithm, GP-Select, which utilizes prior knowledge about similarity between items, expressed as a kernel function. GP-Select uses Gaussian process prediction to balance exploration (estimating the unknown value of items) and exploitation (selecting items of high value). We extend GP-Select to be able to discover sets that simultaneously have high utility and are diverse. Our preference for diversity can be specified as an arbitrary monotone submodular function that quantifies the diminishing returns obtained when selecting similar items. Furthermore, we exploit the structure of the model updates to achieve an order of magnitude (up to 40X) speedup in our experiments without resorting to approximations. We provide strong guarantees on the performance of GP-Select and apply it to three real-world case studies of industrial relevance: (1) Refreshing a repository of prices in a Global Distribution System for the travel industry, (2) Identifying diverse, binding-affine peptides in a vaccine design task and (3) Maximizing clicks in a web-scale recommender system by recommending items to users.
Submitted 2 June, 2015;
originally announced June 2015.
-
Building Hierarchies of Concepts via Crowdsourcing
Authors:
Yuyin Sun,
Adish Singla,
Dieter Fox,
Andreas Krause
Abstract:
Hierarchies of concepts are useful in many applications from navigation to organization of objects. Usually, a hierarchy is created in a centralized manner by employing a group of domain experts, a time-consuming and expensive process. The experts often design one single hierarchy to best explain the semantic relationships among the concepts, and ignore the natural uncertainty that may exist in the process. In this paper, we propose a crowdsourcing system to build a hierarchy and furthermore capture the underlying uncertainty. Our system maintains a distribution over possible hierarchies and actively selects questions to ask using an information gain criterion. We evaluate our methodology on simulated data and on a set of real world application domains. Experimental results show that our system is robust to noise, efficient in picking questions, cost-effective and builds high quality hierarchies.
Submitted 31 July, 2015; v1 submitted 27 April, 2015;
originally announced April 2015.
-
Information Gathering in Networks via Active Exploration
Authors:
Adish Singla,
Eric Horvitz,
Pushmeet Kohli,
Ryen White,
Andreas Krause
Abstract:
How should we gather information in a network, where each node's visibility is limited to its local neighborhood? This problem arises in numerous real-world applications, such as surveying and task routing in social networks, team formation in collaborative networks and experimental design with dependency constraints. Often the informativeness of a set of nodes can be quantified via a submodular utility function. Existing approaches for submodular optimization, however, require that the set of all nodes that can be selected is known ahead of time, which is often unrealistic. In contrast, we propose a novel model where we start our exploration from an initial node, and new nodes become visible and available for selection only once one of their neighbors has been chosen. We then present a general algorithm NetExp for this problem, and provide theoretical bounds on its performance dependent on structural properties of the underlying network. We evaluate our methodology on various simulated problem instances as well as on data collected from a social question answering system deployed within a large enterprise.
Submitted 6 May, 2015; v1 submitted 24 April, 2015;
originally announced April 2015.
-
Nonlinear radiation pressure dynamics in an optomechanical crystal
Authors:
Alex G. Krause,
Jeff T. Hill,
Max Ludwig,
Amir H. Safavi-Naeini,
Jasper Chan,
Florian Marquardt,
Oskar Painter
Abstract:
Utilizing a silicon nanobeam optomechanical crystal, we investigate the attractor diagram arising from the radiation pressure interaction between a localized optical cavity at $λ= 1552$nm and a mechanical resonance at $ω/2π= 3.72$GHz. At a temperature of $T \approx 10$K, highly nonlinear driving of mechanical motion is observed via continuous wave optical pumping. Introduction of a time-dependent (modulated) optical pump is used to steer the system towards an otherwise inaccessible dynamically stable attractor in which mechanical self-oscillation occurs for an optical pump red-detuned from the cavity resonance. An analytical model incorporating thermo-optic effects due to optical absorption heating is developed, and found to accurately predict the measured device behavior.
Submitted 27 April, 2015; v1 submitted 22 April, 2015;
originally announced April 2015.
-
Scalable Variational Inference in Log-supermodular Models
Authors:
Josip Djolonga,
Andreas Krause
Abstract:
We consider the problem of approximate Bayesian inference in log-supermodular models. These models encompass regular pairwise MRFs with binary variables, but also capture high-order interactions, which are intractable for existing approximate inference techniques such as belief propagation, mean field, and variants. We show that a recently proposed variational approach to inference in log-supermodular models, L-FIELD, reduces to the widely-studied minimum norm problem for submodular minimization. This insight allows us to leverage powerful existing tools, and hence to solve the variational problem orders of magnitude more efficiently than previously possible. We then provide another natural interpretation of L-FIELD, demonstrating that it exactly minimizes a specific type of Rényi divergence measure. This insight sheds light on the nature of the variational approximations produced by L-FIELD. Furthermore, we show how to perform parallel inference as message passing in a suitable factor graph at a linear convergence rate, without having to sum up over all the configurations of the factor. Finally, we apply our approach to a challenging image segmentation task. Our experiments confirm scalability of our approach, high quality of the marginals, and the benefit of incorporating higher-order potentials.
Submitted 24 February, 2015; v1 submitted 23 February, 2015;
originally announced February 2015.
-
Distributed Submodular Maximization
Authors:
Baharan Mirzasoleiman,
Amin Karbasi,
Rik Sarkar,
Andreas Krause
Abstract:
Many large-scale machine learning problems--clustering, non-parametric learning, kernel machines, etc.--require selecting a small yet representative subset from a large dataset. Such problems can often be reduced to maximizing a submodular set function subject to various constraints. Classical approaches to submodular optimization require centralized access to the full dataset, which is impractical for truly large-scale problems. In this paper, we consider the problem of submodular function maximization in a distributed fashion. We develop a simple, two-stage protocol, GreeDi, that is easily implemented using MapReduce-style computations. We theoretically analyze our approach, and show that under certain natural conditions, performance close to the centralized approach can be achieved. We begin with monotone submodular maximization subject to a cardinality constraint, and then extend this approach to obtain approximation guarantees for (not necessarily monotone) submodular maximization subject to more general constraints including matroid or knapsack constraints. In our extensive experiments, we demonstrate the effectiveness of our approach on several applications, including sparse Gaussian process inference and exemplar-based clustering on tens of millions of examples using Hadoop.
Submitted 27 June, 2016; v1 submitted 3 November, 2014;
originally announced November 2014.
-
Lazier Than Lazy Greedy
Authors:
Baharan Mirzasoleiman,
Ashwinkumar Badanidiyuru,
Amin Karbasi,
Jan Vondrak,
Andreas Krause
Abstract:
Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In this paper, we develop the first linear-time algorithm for maximizing a general monotone submodular function subject to a cardinality constraint. We show that our randomized algorithm, STOCHASTIC-GREEDY, can achieve a $(1-1/e-\varepsilon)$ approximation guarantee, in expectation, to the optimum solution in time linear in the size of the data and independent of the cardinality constraint. We empirically demonstrate the effectiveness of our algorithm on submodular functions arising in data summarization, including training large-scale kernel methods, exemplar-based clustering, and sensor placement. We observe that STOCHASTIC-GREEDY practically achieves the same utility value as lazy greedy but runs much faster. More surprisingly, we observe that in many practical scenarios STOCHASTIC-GREEDY does not evaluate the whole fraction of data points even once and still achieves indistinguishable results compared to lazy greedy.
Submitted 28 November, 2014; v1 submitted 28 September, 2014;
originally announced September 2014.
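A minimal sketch of the random-subsampling idea behind STOCHASTIC-GREEDY, under stated assumptions: the per-round sample size `ceil((n / k) * ln(1 / eps))` is taken from the published algorithm as recalled, not from the abstract above, and the toy objective `coverage`, the data `SETS`, and the function name `stochastic_greedy` are hypothetical.

```python
import math
import random

# Toy ground set (hypothetical): each item covers a few "topics".
SETS = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2}, 4: {4, 5, 6}, 5: {6}}


def coverage(selection):
    """Monotone submodular objective: number of topics covered."""
    return len(set().union(*(SETS[e] for e in selection)))


def stochastic_greedy(ground_set, f, k, eps=0.1, seed=0):
    """Pick k items; each round scans only a random sample of the remaining items.

    Assumes k >= 1 and 0 < eps < 1.
    """
    rng = random.Random(seed)
    remaining = list(ground_set)
    n = len(remaining)
    # Assumed sample size: (n / k) * ln(1 / eps), rounded up, at least 1.
    sample_size = max(1, math.ceil((n / k) * math.log(1.0 / eps)))
    chosen = []
    while remaining and len(chosen) < k:
        sample = rng.sample(remaining, min(sample_size, len(remaining)))
        base = f(chosen)
        best = max(sample, key=lambda e: f(chosen + [e]) - base)
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    picked = stochastic_greedy(list(SETS), coverage, k=2, eps=0.1)
    print(picked, coverage(picked))
```

Over `k` rounds the total number of marginal-gain evaluations is about `n * ln(1 / eps)`, which does not grow with `k`; this is the sense in which the running time is independent of the cardinality constraint.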
-
65 nm CMOS Sensors Applied to Mathematically Exact Colorimetric Reconstruction
Authors:
C. Mayr,
S. Henker,
A. Krause,
J. -U. Schlüßler,
R. Schüffny
Abstract:
Extracting colorimetric image information from the spectral characteristics of image sensors is a key issue in accurate image acquisition. Technically feasible filter/sensor combinations usually do not replicate colorimetric responses with sufficient accuracy to be directly applicable to color representation. A variety of transformations have been proposed in the literature to compensate for this. However, most of those rely on heuristics and/or introduce a reconstruction dependent on the composition of the incoming illumination. In this work, we present a spectral reconstruction method that is independent of illumination and is derived in a mathematically strict way. It provides a deterministic method to arrive at a least mean squared error approximation of a target spectral characteristic from arbitrary sensor response curves. Further, we present a new CMOS sensor design in a standard digital 65 nm CMOS technology. Novel circuit techniques are used to achieve performance comparable with much larger-sized specialized photo-CMOS processes. The sensor is utilized as a testbed for the spectral reconstruction method.
Submitted 8 August, 2014;
originally announced August 2014.
-
Asymptotic Dynamics of Stochastic $p$-Laplace Equations on Unbounded Domains
Authors:
Andrew Krause
Abstract:
This thesis is concerned with the asymptotic behavior of solutions of stochastic $p$-Laplace equations driven by non-autonomous forcing on $\mathbb{R}^n$. Two cases are studied, with additive and multiplicative noise respectively. Estimates on the tails of solutions are used to overcome the non-compactness of Sobolev embeddings on unbounded domains, and prove asymptotic compactness of solution operators in $L^2(\mathbb{R}^n)$. Using this result we prove the existence and uniqueness of random attractors in each case. Additionally, we show the upper semicontinuity of the attractor for the multiplicative noise case as the intensity of the noise approaches zero.
Submitted 3 August, 2014;
originally announced August 2014.
-
Online Submodular Maximization under a Matroid Constraint with Application to Learning Assignments
Authors:
Daniel Golovin,
Andreas Krause,
Matthew Streeter
Abstract:
Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 - 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades. Finally, we present a second algorithm that handles the more general case in which the feasible sets are given by a matroid constraint, while still maintaining a 1 - 1/e asymptotic performance ratio.
Submitted 3 July, 2014;
originally announced July 2014.
-
Bianchi's classification of 3-dimensional Lie algebras revisited
Authors:
Manuel Glas,
Panagiotis Konstantis,
Achim Krause,
Frank Loose
Abstract:
We present Bianchi's proof of the classification of real (and complex) $3$-dimensional Lie algebras in a coordinate-free version from a strictly representation-theoretic point of view. Along the way, we also compute the automorphism groups and, from these, the dimensions of the corresponding orbits in the algebraic variety $X\subseteq\Lambda^2V^*\otimes V$ describing all Lie brackets on a fixed vector space $V$ of dimension $3$. Moreover, we clarify which orbits lie in the closure of a given orbit and therefore the topology on the orbit space $X/G$ with $G=\mathrm{Aut}(V)$.
Submitted 10 March, 2014;
originally announced March 2014.
-
Near Optimal Bayesian Active Learning for Decision Making
Authors:
Shervin Javdani,
Yuxin Chen,
Amin Karbasi,
Andreas Krause,
J. Andrew Bagnell,
Siddhartha Srinivasa
Abstract:
How should we gather information to make effective decisions? We address Bayesian active learning and experimental design problems, where we sequentially select tests to reduce uncertainty about a set of hypotheses. Instead of minimizing uncertainty per se, we consider a set of overlapping decision regions of these hypotheses. Our goal is to drive uncertainty into a single decision region as quickly as possible.
We identify necessary and sufficient conditions for correctly identifying a decision region that contains all hypotheses consistent with observations. We develop a novel Hyperedge Cutting (HEC) algorithm for this problem, and prove that it is competitive with the intractable optimal policy. Our efficient implementation of the algorithm relies on computing subsets of the complete homogeneous symmetric polynomials. Finally, we demonstrate its effectiveness on two practical applications: approximate comparison-based learning and active localization using a robot manipulator.
Submitted 24 February, 2014;
originally announced February 2014.
-
Near-Optimally Teaching the Crowd to Classify
Authors:
Adish Singla,
Ilija Bogunovic,
Gábor Bartók,
Amin Karbasi,
Andreas Krause
Abstract:
How should we present training examples to learners to teach them classification rules? This is a natural problem when training workers for crowdsourcing labeling tasks, and is also motivated by challenges in data-driven online education. We propose a natural stochastic model of the learners, modeling them as randomly switching among hypotheses based on observed feedback. We then develop STRICT, an efficient algorithm for selecting examples to teach to workers. Our solution greedily maximizes a submodular surrogate objective function in order to select examples to show to the learners. We prove that our strategy is competitive with the optimal teaching policy. Moreover, for the special case of linear separators, we prove that an exponential reduction in error probability can be achieved. Our experiments on simulated workers as well as three real image annotation tasks on Amazon Mechanical Turk show the effectiveness of our teaching algorithm.
Submitted 7 March, 2014; v1 submitted 10 February, 2014;
originally announced February 2014.
-
A Utility-Theoretic Approach to Privacy in Online Services
Authors:
Andreas Krause,
Eric Horvitz
Abstract:
Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user's demographics, location, and past search and browsing may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy by users, providers, and government agencies acting on behalf of citizens may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can find a provably near-optimal optimization of the utility-privacy tradeoff in an efficient manner. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey, aimed at eliciting preferences about people's willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.
Submitted 16 January, 2014;
originally announced January 2014.
-
Optimal Value of Information in Graphical Models
Authors:
Andreas Krause,
Carlos Guestrin
Abstract:
Many real-world decision making tasks require us to choose among several expensive observations. In a sensor network, for example, it is important to select the subset of sensors that is expected to provide the strongest reduction in uncertainty. In medical decision making tasks, one needs to select which tests to administer before deciding on the most effective treatment. It has been general practice to use heuristic-guided procedures for selecting observations. In this paper, we present the first efficient optimal algorithms for selecting observations for a class of probabilistic graphical models. For example, our algorithms allow us to optimally label hidden variables in Hidden Markov Models (HMMs). We provide results for both selecting the optimal subset of observations, and for obtaining an optimal conditional observation plan.
Furthermore, we prove a surprising result: In most graphical model tasks, if one designs an efficient algorithm for chain graphs, such as HMMs, this procedure can be generalized to polytree graphical models. We prove that optimizing value of information is $NP^{PP}$-hard even for polytrees. It also follows from our results that just computing decision theoretic value of information objective functions, which are commonly used in practice, is a #P-complete problem even on Naive Bayes models (a simple special case of polytrees).
In addition, we consider several extensions, such as using our algorithms for scheduling observation selection for multiple sensors. We demonstrate the effectiveness of our approach on several real-world datasets, including a prototype sensor network deployment for energy conservation in buildings.
Submitted 15 January, 2014;
originally announced January 2014.
-
Efficient Informative Sensing using Multiple Robots
Authors:
Amarjeet Singh,
Andreas Krause,
Carlos Guestrin,
William J. Kaiser
Abstract:
The need for efficient monitoring of spatio-temporal dynamics in large environmental applications, such as water quality monitoring in rivers and lakes, motivates the use of robotic sensors in order to achieve sufficient spatial coverage. Typically, these robots have bounded resources, such as limited battery or limited amounts of time to obtain measurements. Thus, careful coordination of their paths is required in order to maximize the amount of information collected, while respecting the resource constraints. In this paper, we present an efficient approach for near-optimally solving the NP-hard optimization problem of planning such informative paths. In particular, we first develop eSIP (efficient Single-robot Informative Path planning), an approximation algorithm for optimizing the path of a single robot. Here, we use a Gaussian Process to model the underlying phenomenon, and use the mutual information between the visited locations and the remainder of the space to quantify the amount of information collected. We prove that the mutual information collected using paths obtained by eSIP is close to the information obtained by an optimal solution. We then provide a general technique, sequential allocation, which can be used to extend any single-robot planning algorithm, such as eSIP, to the multi-robot problem. This procedure approximately generalizes any guarantees for the single-robot problem to the multi-robot case. We extensively evaluate the effectiveness of our approach on several experiments performed in-field for two important environmental sensing applications, lake and river monitoring, and simulation experiments performed using several real-world sensor network data sets.
Submitted 15 January, 2014;
originally announced January 2014.
-
Pullback Attractors of Non-autonomous Stochastic Degenerate Parabolic Equations on Unbounded Domains
Authors:
Andrew Krause,
Bixiang Wang
Abstract:
This paper is concerned with pullback attractors of the stochastic $p$-Laplace equation defined on the entire space $\mathbb{R}^n$. We first establish the asymptotic compactness of the equation in $L^2(\mathbb{R}^n)$ and then prove the existence and uniqueness of non-autonomous random attractors. This attractor is pathwise periodic if the non-autonomous deterministic forcing is time periodic. The difficulty of non-compactness of Sobolev embeddings on $\mathbb{R}^n$ is overcome by the uniform smallness of solutions outside a bounded domain.
Submitted 4 September, 2013;
originally announced September 2013.
-
Incentives for Privacy Tradeoff in Community Sensing
Authors:
Adish Singla,
Andreas Krause
Abstract:
Community sensing, fusing information from populations of privately-held sensors, presents a great opportunity to create efficient and cost-effective sensing applications. Yet, reasonable privacy concerns often limit the access to such data streams. How should systems valuate and negotiate access to private information, for example in return for monetary incentives? How should they optimally choose the participants from a large population of strategic users with privacy concerns, and compensate them for information shared? In this paper, we address these questions and present a novel mechanism, SeqTGreedy, for budgeted recruitment of participants in community sensing. We first show that privacy tradeoffs in community sensing can be cast as an adaptive submodular optimization problem. We then design a budget feasible, incentive compatible (truthful) mechanism for adaptive submodular maximization, which achieves near-optimal utility for a large class of sensing applications. This mechanism is general, and of independent interest. We demonstrate the effectiveness of our approach in a case study of air quality monitoring, using data collected from the Mechanical Turk platform. Compared to the state of the art, our approach achieves up to 30% reduction in cost in order to achieve a desired level of utility.
Submitted 13 September, 2013; v1 submitted 19 August, 2013;
originally announced August 2013.
-
Towards a living earth simulator
Authors:
M. Paolucci,
D. Kossman,
R. Conte,
P. Lukowicz,
P. Argyrakis,
A. Blandford,
G. Bonelli,
S. Anderson,
S. de Freitas,
B. Edmonds,
N. Gilbert,
M. Gross,
J. Kohlhammer,
P. Koumoutsakos,
A. Krause,
B. -O. Linnér,
P. Slusallek,
O. Sorkine,
R. W. Sumner,
D. Helbing
Abstract:
The Living Earth Simulator (LES) is one of the core components of the FuturICT architecture. It will work as a federation of methods, tools, techniques and facilities supporting all of the FuturICT simulation-related activities to allow and encourage interactive exploration and understanding of societal issues. Society-relevant problems will be targeted by leaning on approaches based on complex systems theories and data science in tight interaction with the other components of FuturICT. The LES will evaluate and provide answers to real-world questions by taking into account multiple scenarios. It will build on present approaches such as agent-based simulation and modeling, multiscale modelling, statistical inference, and data mining, moving beyond disciplinary borders to achieve a new perspective on complex social systems.
Submitted 6 April, 2013;
originally announced April 2013.
-
Near-optimal Nonmyopic Value of Information in Graphical Models
Authors:
Andreas Krause,
Carlos E. Guestrin
Abstract:
A fundamental issue in real-world systems, such as sensor networks, is the selection of observations which most effectively reduce uncertainty. More specifically, we address the long-standing problem of nonmyopically selecting the most informative subset of variables in a graphical model. We present the first efficient randomized algorithm providing a constant factor (1-1/e-epsilon) approximation guarantee for any epsilon > 0 with high confidence. The algorithm leverages the theory of submodular functions, in combination with a polynomial bound on sample complexity. We furthermore prove that no polynomial time algorithm can provide a constant factor approximation better than (1 - 1/e) unless P = NP. Finally, we provide extensive evidence of the effectiveness of our method on two complex real-world datasets.
Submitted 4 July, 2012;
originally announced July 2012.
-
Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization
Authors:
Thomas Desautels,
Andreas Krause,
Joel Burdick
Abstract:
Can one parallelize complex exploration-exploitation tradeoffs? As an example, consider the problem of optimal high-throughput experimental design, where we wish to sequentially design batches of experiments in order to simultaneously learn a surrogate function mapping stimulus to response and identify the maximum of the function. We formalize the task as a multi-armed bandit problem, where the unknown payoff function is sampled from a Gaussian process (GP), and instead of a single arm, in each round we pull a batch of several arms in parallel. We develop GP-BUCB, a principled algorithm for choosing batches, based on the GP-UCB algorithm for sequential GP optimization. We prove a surprising result: compared to the sequential approach, the cumulative regret of the parallel algorithm only increases by a constant factor independent of the batch size B. Our results provide rigorous theoretical support for exploiting parallelism in Bayesian global optimization. We demonstrate the effectiveness of our approach on two real-world applications.
Submitted 27 June, 2012;
originally announced June 2012.
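For orientation, the relationship between the sequential and batch selection rules mentioned in the abstract above can be written as follows. This is a sketch recalled from the GP-UCB and GP-BUCB literature, not taken from the abstract; the symbols $D$ (decision set), $\mu$ and $\sigma$ (GP posterior mean and standard deviation), $\beta_t$ (confidence parameter), and $\mathrm{fb}[t]$ (index of the last round whose observations are already available at round $t$) are notation assumed here.

```latex
\begin{aligned}
\text{GP-UCB (sequential):}\quad
  x_t &= \arg\max_{x \in D}\; \mu_{t-1}(x) + \beta_t^{1/2}\,\sigma_{t-1}(x), \\
\text{GP-BUCB (batch):}\quad
  x_t &= \arg\max_{x \in D}\; \mu_{\mathrm{fb}[t]}(x) + \beta_t^{1/2}\,\sigma_{t-1}(x).
\end{aligned}
```

The batch rule is workable because the GP posterior variance depends only on where observations were taken, not on the observed values, so $\sigma_{t-1}$ can be updated for pending experiments while the mean waits for feedback.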