-
Kernel Belief Propagation
Authors:
Le Song,
Arthur Gretton,
Danny Bickson,
Yucheng Low,
Carlos Guestrin
Abstract:
We propose a nonparametric generalization of belief propagation, Kernel Belief Propagation (KBP), for pairwise Markov random fields. Messages are represented as functions in a reproducing kernel Hilbert space (RKHS), and message updates are simple linear operations in the RKHS. KBP makes none of the assumptions commonly required in classical BP algorithms: the variables need not arise from a finit…
▽ More
We propose a nonparametric generalization of belief propagation, Kernel Belief Propagation (KBP), for pairwise Markov random fields. Messages are represented as functions in a reproducing kernel Hilbert space (RKHS), and message updates are simple linear operations in the RKHS. KBP makes none of the assumptions commonly required in classical BP algorithms: the variables need not arise from a finite domain or a Gaussian distribution, nor must their relations take any particular parametric form. Rather, the relations between variables are represented implicitly, and are learned nonparametrically from training data. KBP has the advantage that it may be used on any domain where kernels are defined (Rd, strings, groups), even where explicit parametric models are not known, or closed form expressions for the BP updates do not exist. The computational cost of message updates in KBP is polynomial in the training data size. We also propose a constant time approximate message update procedure by representing messages using a small number of basis functions. In experiments, we apply KBP to image denoising, depth prediction from still images, and protein configuration prediction: KBP is faster than competing classical and nonparametric approaches (by orders of magnitude, in some cases), while providing significantly more accurate results.
△ Less
Submitted 27 May, 2011;
originally announced May 2011.
-
Parallel Coordinate Descent for L1-Regularized Loss Minimization
Authors:
Joseph K. Bradley,
Aapo Kyrola,
Danny Bickson,
Carlos Guestrin
Abstract:
We propose Shotgun, a parallel coordinate descent algorithm for minimizing L1-regularized losses. Though coordinate descent seems inherently sequential, we prove convergence bounds for Shotgun which predict linear speedups, up to a problem-dependent limit. We present a comprehensive empirical study of Shotgun for Lasso and sparse logistic regression. Our theoretical predictions on the potential fo…
▽ More
We propose Shotgun, a parallel coordinate descent algorithm for minimizing L1-regularized losses. Though coordinate descent seems inherently sequential, we prove convergence bounds for Shotgun which predict linear speedups, up to a problem-dependent limit. We present a comprehensive empirical study of Shotgun for Lasso and sparse logistic regression. Our theoretical predictions on the potential for parallelism closely match behavior on real data. Shotgun outperforms other published solvers on a range of large problems, proving to be one of the most scalable algorithms for L1.
△ Less
Submitted 26 May, 2011;
originally announced May 2011.
-
Inference with Multivariate Heavy-Tails in Linear Models
Authors:
Danny Bickson,
Carlos Guestrin
Abstract:
Heavy-tailed distributions naturally occur in many real life problems. Unfortunately, it is typically not possible to compute inference in closed-form in graphical models which involve such heavy-tailed distributions.
In this work, we propose a novel simple linear graphical model for independent latent random variables, called linear characteristic model (LCM), defined in the characteristic func…
▽ More
Heavy-tailed distributions naturally occur in many real life problems. Unfortunately, it is typically not possible to compute inference in closed-form in graphical models which involve such heavy-tailed distributions.
In this work, we propose a novel simple linear graphical model for independent latent random variables, called linear characteristic model (LCM), defined in the characteristic function domain. Using stable distributions, a heavy-tailed family of distributions which is a generalization of Cauchy, Lévy and Gaussian distributions, we show for the first time, how to compute both exact and approximate inference in such a linear multivariate graphical model. LCMs are not limited to stable distributions, in fact LCMs are always defined for any random variables (discrete, continuous or a mixture of both).
We provide a realistic problem from the field of computer networks to demonstrate the applicability of our construction. Other potential application is iterative decoding of linear channels with non-Gaussian noise.
△ Less
Submitted 21 March, 2011; v1 submitted 31 August, 2010;
originally announced August 2010.
-
Multiresolution Cube Estimators for Sensor Network Aggregate Queries
Authors:
Alexandra Meliou,
Carlos Guestrin,
Joseph M. Hellerstein
Abstract:
In this work we present in-network techniques to improve the efficiency of spatial aggregate queries. Such queries are very common in a sensornet setting, demanding more targeted techniques for their handling. Our approach constructs and maintains multi-resolution cube hierarchies inside the network, which can be constructed in a distributed fashion. In case of failures, recovery can also be perfo…
▽ More
In this work we present in-network techniques to improve the efficiency of spatial aggregate queries. Such queries are very common in a sensornet setting, demanding more targeted techniques for their handling. Our approach constructs and maintains multi-resolution cube hierarchies inside the network, which can be constructed in a distributed fashion. In case of failures, recovery can also be performed with in-network decisions. In this paper we demonstrate how in-network cube hierarchies can be used to summarize sensor data, and how they can be exploited to improve the efficiency of spatial aggregate queries. We show that query plans over our cube summaries can be computed in polynomial time, and we present a PTIME algorithm that selects the minimum number of data requests that can compute the answer to a spatial query. We further extend our algorithm to handle optimization over multiple queries, which can also be done in polynomial time. We discuss enriching cube hierarchies with extra summary information, and present an algorithm for distributed cube construction. Finally we investigate node and area failures, and algorithms to recover query results.
△ Less
Submitted 21 July, 2010;
originally announced July 2010.
-
GraphLab: A New Framework for Parallel Machine Learning
Authors:
Yucheng Low,
Joseph Gonzalez,
Aapo Kyrola,
Danny Bickson,
Carlos Guestrin,
Joseph M. Hellerstein
Abstract:
Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which improves upon abstractions…
▽ More
Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which improves upon abstractions like MapReduce by compactly expressing asynchronous iterative algorithms with sparse computational dependencies while ensuring data consistency and achieving a high degree of parallel performance. We demonstrate the expressiveness of the GraphLab framework by designing and implementing parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso and Compressed Sensing. We show that using GraphLab we can achieve excellent parallel performance on large scale real-world problems.
△ Less
Submitted 25 June, 2010;
originally announced June 2010.
-
Uncovering the Riffled Independence Structure of Rankings
Authors:
Jonathan Huang,
Carlos Guestrin
Abstract:
Representing distributions over permutations can be a daunting task due to the fact that the number of permutations of $n$ objects scales factorially in $n$. One recent way that has been used to reduce storage complexity has been to exploit probabilistic independence, but as we argue, full independence assumptions impose strong sparsity constraints on distributions and are unsuitable for modeling…
▽ More
Representing distributions over permutations can be a daunting task due to the fact that the number of permutations of $n$ objects scales factorially in $n$. One recent way that has been used to reduce storage complexity has been to exploit probabilistic independence, but as we argue, full independence assumptions impose strong sparsity constraints on distributions and are unsuitable for modeling rankings. We identify a novel class of independence structures, called \emph{riffled independence}, encompassing a more expressive family of distributions while retaining many of the properties necessary for performing efficient inference and reducing sample complexity. In riffled independence, one draws two permutations independently, then performs the \emph{riffle shuffle}, common in card games, to combine the two permutations to form a single permutation. Within the context of ranking, riffled independence corresponds to ranking disjoint sets of objects independently, then interleaving those rankings. In this paper, we provide a formal introduction to riffled independence and present algorithms for using riffled independence within Fourier-theoretic frameworks which have been explored by a number of recent papers. Additionally, we propose an automated method for discovering sets of items which are riffle independent from a training set of rankings. We show that our clustering-like algorithms can be used to discover meaningful latent coalitions from real preference ranking datasets and to learn the structure of hierarchically decomposable models based on riffled independence.
△ Less
Submitted 7 June, 2010;
originally announced June 2010.