-
A Multiresolution Analysis Framework for the Statistical Analysis of Incomplete Rankings
Authors:
Eric Sibony,
Stéphan Clémençon,
Jérémie Jakubowicz
Abstract:
Though the statistical analysis of ranking data has been a subject of interest over the past centuries, especially in economics, psychology and social choice theory, it has been revitalized in the past 15 years by applications such as recommender systems and search engines, and is now receiving increasing interest in the machine learning literature. Numerous modern systems indeed generate ranking data, representing for instance ordered results to a query or user preferences. Each such ranking usually involves only a small but varying subset of the whole catalog of items. The study of the variability of these data, i.e. the statistical analysis of incomplete rankings, is however a great statistical and computational challenge, because of their heterogeneity and the related combinatorial complexity of the problem. Whereas many statistical methods for analyzing full rankings (orderings of all the items in the catalog), partial rankings (full rankings with ties) or pairwise comparisons are documented in the dedicated literature, only a few approaches are available today to deal with incomplete rankings, each relying on a strong specific assumption. It is the purpose of this article to introduce a novel general framework for the statistical analysis of incomplete rankings. It is based on a representation tailored to these specific data, whose construction is also explained here, which fits the natural multi-scale structure of incomplete rankings and provides a new decomposition of rank information with a multiresolution analysis (MRA) interpretation. We show that the MRA representation naturally makes it possible to overcome both the statistical and computational challenges without any structural assumption on the data. It therefore provides a general and flexible framework to solve a wide variety of statistical problems in which the data take the form of incomplete rankings.
Submitted 4 January, 2016;
originally announced January 2016.
-
Random Pairwise Gossip on $CAT(κ)$ Metric Spaces
Authors:
Anass Bellachehab,
Jérémie Jakubowicz
Abstract:
In the context of sensor networks, gossip algorithms are a popular, well-established technique for achieving consensus when sensor data is encoded in linear spaces. Gossip algorithms also have several extensions to nonlinear data spaces. Most of these extensions deal with Riemannian manifolds and use Riemannian gradient descent. This paper, instead, exhibits a very simple metric property that does not rely on any differential structure. This property strongly suggests that gossip algorithms could be studied on a broader family than Riemannian manifolds. And it turns out that, indeed, (local) convergence is guaranteed as soon as the data space is a mere $CAT(κ)$ metric space. We also study convergence speed in this setting and establish linear rates for $CAT(0)$ spaces, and local linear rates for $CAT(κ)$ spaces with $κ> 0$. Numerical simulations on several scenarios, with corresponding state spaces that are either Riemannian manifolds -- as in the problem of positive definite matrices consensus -- or bare metric spaces -- as in the problem of arms consensus -- validate the results. This shows that our metric approach not only allows for a simpler and more general mathematical analysis but also paves the way for new kinds of applications that go beyond the Riemannian setting.
Submitted 16 May, 2014;
originally announced May 2014.
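As a point of reference for the abstract above, the classical linear-space case of random pairwise gossip can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `pairwise_gossip`, the ring topology and the initial values are all assumptions. In a $CAT(κ)$ space, the arithmetic midpoint used here would be replaced by the midpoint of the geodesic joining the two points.

```python
import random

def pairwise_gossip(values, edges, steps, seed=0):
    """Random pairwise gossip on real-valued data (illustrative sketch).

    At each step one edge (i, j) is drawn uniformly at random and both
    endpoints move to their midpoint. On the real line the midpoint is the
    arithmetic mean, so the network average is preserved at every step.
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(steps):
        i, j = rng.choice(edges)
        mid = 0.5 * (x[i] + x[j])
        x[i] = x[j] = mid
    return x

# Hypothetical 4-node ring network with arbitrary initial measurements.
values = [0.0, 4.0, 8.0, 12.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
result = pairwise_gossip(values, edges, steps=200)
```

With these inputs every entry of `result` should end up very close to the initial average of the network, here 6.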
-
Multiresolution Analysis of Incomplete Rankings
Authors:
Stéphan Clémençon,
Jérémie Jakubowicz,
Eric Sibony
Abstract:
Incomplete rankings on a set of items $\{1,\; \ldots,\; n\}$ are orderings of the form $a_{1}\prec\dots\prec a_{k}$, with $\{a_{1},\dots,a_{k}\}\subset\{1,\dots,n\}$ and $k < n$. Though they arise in many modern applications, only a few methods have been introduced to manipulate them, most of them consisting in representing any incomplete ranking by the set of all its possible linear extensions on $\{1,\; \ldots,\; n\}$. It is the major purpose of this paper to introduce a completely novel approach, which makes it possible to treat incomplete rankings directly, representing them as injective words over $\{1,\; \ldots,\; n\}$. Unexpectedly, operations on incomplete rankings have very simple equivalents in this setting and the topological structure of the complex of injective words can be interpreted in a simple fashion from the perspective of ranking. We exploit this connection here and use recent results from algebraic topology to construct a multiresolution analysis and develop a wavelet framework for incomplete rankings. Though purely combinatorial, this construction relies on the same ideas underlying multiresolution analysis on a Euclidean space, and makes it possible to localize the information related to rankings on each subset of items. It can be viewed as a crucial step toward nonlinear approximation of distributions of incomplete rankings and paves the way for many statistical applications, including preference data analysis and the design of recommender systems.
Submitted 8 March, 2014;
originally announced March 2014.
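The two representations contrasted in the abstract above can be made concrete with a small sketch. It is an illustration under our own naming (`is_injective_word` and `linear_extensions` are hypothetical helpers, not taken from the paper): an incomplete ranking is stored directly as a word without repeated letters, and its linear extensions are enumerated by brute force to show how quickly that alternative representation grows.

```python
from itertools import permutations
from math import factorial

def is_injective_word(word, n):
    """True if `word` uses letters from {1, ..., n} with no repetition."""
    return len(set(word)) == len(word) and all(1 <= a <= n for a in word)

def linear_extensions(word, n):
    """All full rankings of {1, ..., n} that keep the order given by `word`.

    Brute force over the n! permutations; only meant for tiny n.
    """
    exts = []
    for perm in permutations(range(1, n + 1)):
        pos = {item: rank for rank, item in enumerate(perm)}
        # Consecutive letters of the word must appear in the same order.
        if all(pos[a] < pos[b] for a, b in zip(word, word[1:])):
            exts.append(perm)
    return exts

n = 4
word = (3, 1)  # the incomplete ranking "item 3 before item 1"
extensions = linear_extensions(word, n)
```

A word of length $k$ has $n!/k!$ linear extensions, so the set-of-extensions representation becomes very large for short rankings over a big catalog, whereas the word itself has only $k$ letters.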
-
Robust Consensus in Distributed Networks using Total Variation
Authors:
Walid Ben-Ameur,
Pascal Bianchi,
Jérémie Jakubowicz
Abstract:
Consider a connected network of agents endowed with local cost functions representing private objectives. Agents seek to agree on some minimizer of the aggregate cost, by means of repeated communications between neighbors. Consensus on the average over the network, usually addressed by gossip algorithms, is a special instance of this problem, corresponding to quadratic private objectives. Consensus on the median, or more generally on quantiles, is also a special instance, as are many other consensus problems. In this paper we show that optimizing the aggregate cost function regularized by a total variation term has appealing properties. First, it can be done very naturally in a distributed way, yielding algorithms that are efficient in numerical simulations. Secondly, the optimum for the regularized cost is shown to also be the optimum for the initial aggregate cost function under assumptions that are simple to state and easily verifiable. Finally, these algorithms are robust to unreliable agents that keep injecting some false value into the network. This is remarkable, and is not the case, for instance, for gossip algorithms, which are entirely ruled by unreliable agents, as detailed in the paper.
Submitted 27 September, 2013;
originally announced September 2013.
-
Convergence of a Multi-Agent Projected Stochastic Gradient Algorithm for Non-Convex Optimization
Authors:
Pascal Bianchi,
Jérémie Jakubowicz
Abstract:
We introduce a new framework for the convergence analysis of a class of distributed constrained non-convex optimization algorithms in multi-agent systems. The aim is to search for local minimizers of a non-convex objective function which is assumed to be a sum of local utility functions of the agents. The algorithm under study consists of two steps: a local stochastic gradient descent at each agent and a gossip step that drives the network of agents to a consensus. Under the assumption of decreasing stepsize, it is proved that consensus is asymptotically achieved in the network and that the algorithm converges to the set of Karush-Kuhn-Tucker points. As an important feature, the algorithm does not require the double-stochasticity of the gossip matrices. It is in particular suitable for use in a natural broadcast scenario for which no feedback messages between agents are required. It is proved that our result also holds if the number of communications in the network per unit of time vanishes at moderate speed as time increases, allowing for potential savings of the network's energy. Applications to power allocation in wireless ad-hoc networks are discussed. Finally, we provide numerical results that support our claims.
Submitted 2 December, 2013; v1 submitted 13 July, 2011;
originally announced July 2011.
-
Distributed Stochastic Approximation for Constrained and Unconstrained Optimization
Authors:
Pascal Bianchi,
Jérémie Jakubowicz
Abstract:
In this paper, we analyze the convergence of a distributed Robbins-Monro algorithm for both constrained and unconstrained optimization in multi-agent systems. The algorithm searches for local minima of a (nonconvex) objective function which is assumed to coincide with a sum of local utility functions of the agents. The algorithm under study consists of two steps: a local stochastic gradient descent at each agent and a gossip step that drives the network of agents to a consensus. It is proved that i) an agreement is achieved between agents on the value of the estimate, and ii) the algorithm converges to the set of Kuhn-Tucker points of the optimization problem. The proof relies on recent results about differential inclusions. In the context of unconstrained optimization, intelligible sufficient conditions are provided in order to ensure the stability of the algorithm. In the latter case, we also provide a central limit theorem which governs the asymptotic fluctuations of the estimate. We illustrate our results in the case of distributed power allocation for ad-hoc wireless networks.
Submitted 19 April, 2011; v1 submitted 14 April, 2011;
originally announced April 2011.
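The two-step scheme described in the abstract above (a local stochastic gradient step at each agent, followed by a gossip step) can be sketched on a toy unconstrained problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic local costs $f_i(x) = (x - a_i)^2/2$, the Gaussian gradient noise, the stepsize $1/(n+10)$, the ring topology and the pairwise-averaging gossip.

```python
import random

def distributed_sgd(targets, edges, steps, noise_std=0.1, seed=0):
    """Toy distributed stochastic approximation (illustrative sketch).

    Agent i holds the local cost f_i(x) = 0.5 * (x - targets[i])**2, so the
    aggregate cost is minimized at the mean of `targets`.
    """
    rng = random.Random(seed)
    x = [0.0] * len(targets)
    for n in range(steps):
        gamma = 1.0 / (n + 10)  # decreasing stepsize (assumed schedule)
        # Step 1: local stochastic gradient descent at each agent.
        for i, a in enumerate(targets):
            noisy_grad = (x[i] - a) + rng.gauss(0.0, noise_std)
            x[i] -= gamma * noisy_grad
        # Step 2: gossip; one random pair of neighbors averages its estimates.
        i, j = rng.choice(edges)
        x[i] = x[j] = 0.5 * (x[i] + x[j])
    return x

# Hypothetical 4-agent ring; the minimizer of the aggregate cost is 3.0.
targets = [1.0, 2.0, 3.0, 6.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
estimates = distributed_sgd(targets, edges, steps=5000)
```

In this toy setting the agents' estimates should agree approximately and lie near the mean of `targets`. Pairwise averaging is a doubly stochastic gossip scheme, chosen here only for simplicity.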
-
Neyman-Pearson Detection of a Gaussian Source using Dumb Wireless Sensors
Authors:
Pascal Bianchi,
Jeremie Jakubowicz,
Francois Roueff
Abstract:
We investigate the performance of the Neyman-Pearson detection of a stationary Gaussian process in noise, using a large wireless sensor network (WSN). In our model, each sensor compresses its observation sequence using a linear precoder. The final decision is taken by a fusion center (FC) based on the compressed information. Two families of precoders are studied: random i.i.d. precoders and orthogonal precoders. We analyse their performance in the regime where both the number of sensors k and the number of samples n per sensor tend to infinity at the same rate, that is, k/n tends to c in (0, 1). Contributions are as follows. 1) Using results from random matrix theory and on large Toeplitz matrices, it is proved that the miss probability of the Neyman-Pearson detector converges exponentially to zero when the above families of precoders are used. Closed form expressions of the corresponding error exponents are provided. 2) In particular, we propose a practical orthogonal precoding strategy, the Principal Frequencies Strategy (PFS), which achieves the best error exponent among all orthogonal strategies, and which requires very little signaling overhead between the central processor and the nodes of the network. 3) Moreover, when the PFS is used, a simplified low-complexity testing procedure can be implemented at the FC. We show that the proposed suboptimal test enjoys the same error exponent as the Neyman-Pearson test, which indicates a similar asymptotic behaviour of the performance. We illustrate our findings by numerical experiments on some examples.
Submitted 27 January, 2010; v1 submitted 26 January, 2010;
originally announced January 2010.