-
Diversity in Evolutionary Dynamics
Authors:
Yuval Rabani,
Leonard J. Schulman,
Alistair Sinclair
Abstract:
We consider the dynamics imposed by natural selection on the populations of two competing, sexually reproducing, haploid species. In this setting, the fitness of any genome varies over time due to the changing population mix of the competing species; crucially, this fitness variation arises naturally from the model itself, without the need for imposing it exogenously as is typically the case. Previous work on this model [14] showed that, in the special case where each of the two species exhibits just two phenotypes, genetic diversity is maintained at all times. This finding supported the tenet that sexual reproduction is advantageous because it promotes diversity, which increases the survivability of a species.
In the present paper we consider the more realistic case where there are more than two phenotypes available to each species. The conclusions about diversity in general turn out to be very different from the two-phenotype case.
Our first result is negative: namely, we show that sexual reproduction does not guarantee the maintenance of diversity at all times, i.e., the result of [14] does not generalize. Our counterexample consists of two competing species with just three phenotypes each. We show that, for any time~$t_0$ and any $\varepsilon>0$, there is a time $t\ge t_0$ at which the combined diversity of both species is smaller than~$\varepsilon$. Our main result is a complementary positive statement, which says that in any non-degenerate example, diversity is maintained in a weaker, ``infinitely often'' sense.
Thus, our results refute the supposition that sexual reproduction ensures diversity at all times, but affirm a weaker assertion that extended periods of high diversity are necessarily a recurrent event.
Submitted 6 June, 2024;
originally announced June 2024.
-
Identifiability of Product of Experts Models
Authors:
Spencer L. Gordon,
Manav Kant,
Eric Ma,
Leonard J. Schulman,
Andrei Staicu
Abstract:
Product of experts (PoE) are layered networks in which the value at each node is an AND (or product) of the values (possibly negated) at its inputs. These were introduced as a neural network architecture that can efficiently learn to generate high-dimensional data which satisfy many low-dimensional constraints -- thereby allowing each individual expert to perform a simple task. PoEs have found a variety of applications in learning.
We study the problem of identifiability of a product of experts model having a layer of binary latent variables, and a layer of binary observables that are iid conditional on the latents. The previous best upper bound on the number of observables needed to identify the model was exponential in the number of parameters. We show: (a) When the latents are uniformly distributed, the model is identifiable with a number of observables equal to the number of parameters (and hence best possible). (b) In the more general case of arbitrarily distributed latents, the model is identifiable for a number of observables that is still linear in the number of parameters (and within a factor of two of best-possible). The proofs rely on root interlacing phenomena for some special three-term recurrences.
Submitted 13 October, 2023;
originally announced October 2023.
-
Identification of Mixtures of Discrete Product Distributions in Near-Optimal Sample and Time Complexity
Authors:
Spencer L. Gordon,
Erik Jahn,
Bijan Mazaheri,
Yuval Rabani,
Leonard J. Schulman
Abstract:
We consider the problem of identifying, from statistics, a distribution of discrete random variables $X_1,\ldots,X_n$ that is a mixture of $k$ product distributions. The best previous sample complexity for $n \in O(k)$ was $(1/ζ)^{O(k^2 \log k)}$ (under a mild separation assumption parameterized by $ζ$). The best known lower bound was $\exp(Ω(k))$. It is known that $n\geq 2k-1$ is necessary and sufficient for identification. We show, for any $n\geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/ζ)^{O(k)}$. We also extend the known lower bound of $e^{Ω(k)}$ to match our upper bound across a broad range of $ζ$. Our results are obtained by combining (a) a classic method for robust tensor decomposition, (b) a novel way of bounding the condition number of key matrices called Hadamard extensions, by studying their action only on flattened rank-1 tensors.
Submitted 25 September, 2023;
originally announced September 2023.
-
Causal Inference Despite Limited Global Confounding via Mixture Models
Authors:
Spencer L. Gordon,
Bijan Mazaheri,
Yuval Rabani,
Leonard J. Schulman
Abstract:
A Bayesian Network is a directed acyclic graph (DAG) on a set of $n$ random variables (the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the random variables that is Markovian on the graph. A finite $k$-mixture of such models is graphically represented by a larger graph which has an additional ``hidden'' (or ``latent'') random variable $U$, ranging in $\{1,\ldots,k\}$, and a directed edge from $U$ to every other vertex. Models of this type are fundamental to causal inference, where $U$ models an unobserved confounding effect of multiple populations, obscuring the causal relationships in the observable DAG. By solving the mixture problem and recovering the joint probability distribution with $U$, traditionally unidentifiable causal relationships become identifiable. Using a reduction to the more well-studied ``product'' case on empty graphs, we give the first algorithm to learn mixtures of non-empty DAGs.
Submitted 31 May, 2023; v1 submitted 21 December, 2021;
originally announced December 2021.
-
A Refined Approximation for Euclidean k-Means
Authors:
Fabrizio Grandoni,
Rafail Ostrovsky,
Yuval Rabani,
Leonard J. Schulman,
Rakesh Venkat
Abstract:
In the Euclidean $k$-Means problem we are given a collection of $n$ points $D$ in a Euclidean space and a positive integer $k$. Our goal is to identify a collection of $k$ points in the same space (centers) so as to minimize the sum of the squared Euclidean distances between each point in $D$ and the closest center. This problem is known to be APX-hard and the current best approximation ratio is a primal-dual $6.357$ approximation based on a standard LP for the problem [Ahmadian et al. FOCS'17, SICOMP'20].
In this note we show how a minor modification of Ahmadian et al.'s analysis leads to a slightly improved $6.12903$ approximation. As a related result, we also show that the mentioned LP has integrality gap at least $\frac{16+\sqrt{5}}{15}>1.2157$.
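As a small illustration of the objective described above, the following sketch evaluates the $k$-Means cost of a candidate set of centers. The function name `kmeans_cost` and the toy data are assumptions made for this example only; nothing here reproduces the primal-dual algorithm or the LP from the paper.

```python
def kmeans_cost(points, centers):
    """Sum over points of the squared Euclidean distance to the closest center."""
    total = 0.0
    for p in points:
        # Squared distance from p to each candidate center; keep the smallest.
        total += min(
            sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            for c in centers
        )
    return total


# Toy instance (hypothetical data): three points on a line, k = 2 centers.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
centers = [(1.0, 0.0), (10.0, 0.0)]
cost = kmeans_cost(points, centers)
```

An $\alpha$-approximation algorithm returns centers whose cost, measured this way, is at most $\alpha$ times the minimum possible cost.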
Submitted 20 September, 2021; v1 submitted 15 July, 2021;
originally announced July 2021.
-
Hadamard Extensions and the Identification of Mixtures of Product Distributions
Authors:
Spencer L. Gordon,
Leonard J. Schulman
Abstract:
The Hadamard Extension of a matrix is the matrix consisting of all Hadamard products of subsets of its rows. This construction arises in the context of identifying a mixture of product distributions on binary random variables: full column rank of such extensions is a necessary ingredient of identification algorithms. We provide several results concerning when a Hadamard Extension has full column rank.
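The construction defined above can be sketched directly. The code below builds all entrywise (Hadamard) products of subsets of rows and checks column rank with exact arithmetic. The function names, the convention that the empty subset yields the all-ones row, and the ordering of subsets are assumptions of this sketch, not details taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def hadamard_extension(rows):
    """Return one row per subset of `rows`: the entrywise product of that subset.

    Assumption: the empty subset contributes the all-ones row.
    """
    width = len(rows[0])
    ext = []
    for size in range(len(rows) + 1):
        for subset in combinations(range(len(rows)), size):
            prod = [Fraction(1)] * width
            for r in subset:
                prod = [a * Fraction(b) for a, b in zip(prod, rows[r])]
            ext.append(prod)
    return ext


def rank(rows):
    """Rank by Gaussian elimination over the rationals (exact, no rounding)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][c] != 0:
                f = m[r][c] / m[rk][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def has_full_column_rank(rows):
    return rank(hadamard_extension(rows)) == len(rows[0])
```

For instance, the single row `[0, 1]` has extension `[[1, 1], [0, 1]]`, which has full column rank, while the single row `[1, 1]` does not.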
Submitted 12 February, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
-
Source Identification for Mixtures of Product Distributions
Authors:
Spencer L. Gordon,
Bijan Mazaheri,
Yuval Rabani,
Leonard J. Schulman
Abstract:
We give an algorithm for source identification of a mixture of $k$ product distributions on $n$ bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using $2^{O(k^2)} n^{O(k)}$ arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O'Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and Maddah-Ali (ISIT 2018).
Submitted 28 December, 2020;
originally announced December 2020.
-
The Sparse Hausdorff Moment Problem, with Application to Topic Models
Authors:
Spencer Gordon,
Bijan Mazaheri,
Leonard J. Schulman,
Yuval Rabani
Abstract:
We consider the problem of identifying, from its first $m$ noisy moments, a probability distribution on $[0,1]$ of support $k<\infty$. This is equivalent to the problem of learning a distribution on $m$ observable binary random variables $X_1,X_2,\dots,X_m$ that are iid conditional on a hidden random variable $U$ taking values in $\{1,2,\dots,k\}$. Our focus is on accomplishing this with $m=2k$, which is the minimum $m$ for which verifying that the source is a $k$-mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models.
We give an algorithm for identifying a $k$-mixture from $m=2k$ iid binary random variables, using a sample of size $\left(1/w_{\min}\right)^2 \cdot\left(1/ζ\right)^{O(k)}$ and a post-sampling runtime of only $O(k^{2+o(1)})$ arithmetic operations. Here $w_{\min}$ is the minimum probability of an outcome of $U$, and $ζ$ is the minimum separation between the distinct success probabilities of the $X_i$s. Stated in terms of the moment problem, it suffices to know the moments to additive accuracy $w_{\min}\cdotζ^{O(k)}$. It is known that the sample complexity of any solution to the identification problem must be at least exponential in $k$. Previous results demonstrated either worse sample complexity and worse $O(k^c)$ runtime for some $c$ substantially larger than $2$, or similar sample complexity and much worse $k^{O(k^2)}$ runtime.
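The link between moments and conditionally iid bits can be checked on a toy instance. In the sketch below, a distribution on $[0,1]$ with $k$ atoms is given by hypothetical lists `weights` and `atoms`; the probability of seeing a fixed bit pattern with a given number of ones and zeros is computed once directly and once from the moments alone, by expanding $(1-p)^z$ with the binomial theorem. This only illustrates the equivalence stated above; it is not the identification algorithm of the paper, and all names are assumptions.

```python
from fractions import Fraction
from math import comb


def moment(weights, atoms, j):
    """j-th moment of the distribution placing mass weights[u] on atoms[u]."""
    return sum(w * p ** j for w, p in zip(weights, atoms))


def pattern_probability(weights, atoms, ones, zeros):
    """P(a fixed pattern with `ones` 1s and `zeros` 0s), bits iid given U."""
    return sum(w * p ** ones * (1 - p) ** zeros for w, p in zip(weights, atoms))


def pattern_probability_from_moments(moments, ones, zeros):
    """Same probability, using only the moments: expand (1 - p)^zeros."""
    return sum(
        (-1) ** t * comb(zeros, t) * moments[ones + t] for t in range(zeros + 1)
    )


# Hypothetical 2-mixture (k = 2), observed through m = 2k = 4 bits.
weights = [Fraction(1, 3), Fraction(2, 3)]
atoms = [Fraction(1, 4), Fraction(3, 4)]
m = 4
moments = [moment(weights, atoms, j) for j in range(m + 1)]
```

Because every pattern probability is a fixed linear combination of the first $m$ moments, noise in the moments translates linearly into noise in the observable statistics, and vice versa.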
Submitted 7 September, 2020; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Edge Expansion and Spectral Gap of Nonnegative Matrices
Authors:
Jenish C. Mehta,
Leonard J. Schulman
Abstract:
The classic graphical Cheeger inequalities state that if $M$ is an $n\times n$ symmetric doubly stochastic matrix, then \[ \frac{1-λ_{2}(M)}{2}\leqφ(M)\leq\sqrt{2\cdot(1-λ_{2}(M))} \] where $φ(M)=\min_{S\subseteq[n],|S|\leq n/2}\left(\frac{1}{|S|}\sum_{i\in S,j\not\in S}M_{i,j}\right)$ is the edge expansion of $M$, and $λ_{2}(M)$ is the second largest eigenvalue of $M$. We study the relationship between $φ(A)$ and the spectral gap $1-\text{Re}λ_{2}(A)$ for any doubly stochastic matrix $A$ (not necessarily symmetric), where $λ_{2}(A)$ is a nontrivial eigenvalue of $A$ with maximum real part. Fiedler showed that the upper bound on $φ(A)$ is unaffected, i.e., $φ(A)\leq\sqrt{2\cdot(1-\text{Re}λ_{2}(A))}$. With regards to the lower bound on $φ(A)$, there are known constructions with \[ φ(A)\inΘ\left(\frac{1-\text{Re}λ_{2}(A)}{\log n}\right), \] indicating that at least a mild dependence on $n$ is necessary to lower bound $φ(A)$.
In our first result, we provide an exponentially better construction of $n\times n$ doubly stochastic matrices $A_{n}$, for which \[φ(A_{n})\leq\frac{1-\text{Re}λ_{2}(A_{n})}{\sqrt{n}}.\] In fact, all nontrivial eigenvalues of our matrices are $0$, even though the matrices are highly nonexpanding. We further show that this bound is in the correct range (up to the exponent of $n$), by showing that for any doubly stochastic matrix $A$, \[φ(A)\geq\frac{1-\text{Re}λ_{2}(A)}{35\cdot n}.\]
Our second result extends these bounds to general nonnegative matrices $R$, obtaining a two-sided quantitative refinement of the Perron-Frobenius theorem in which the edge expansion $φ(R)$ (appropriately defined), a quantitative measure of the irreducibility of $R$, controls the gap between the Perron-Frobenius eigenvalue and the next-largest real part of any eigenvalue.
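The edge expansion $φ(M)$ defined above can be evaluated by brute force on small matrices. The sketch below enumerates all subsets $S$ with $1 \le |S| \le n/2$ (restricting to nonempty $S$ is an assumption of this sketch, since the formula divides by $|S|$) and is exponential in $n$, so it is meant only for toy examples. The function name `edge_expansion` is illustrative.

```python
from itertools import combinations


def edge_expansion(M):
    """Brute-force phi(M): min over nonempty S, |S| <= n/2, of the average
    mass leaving S, (1/|S|) * sum_{i in S, j not in S} M[i][j]."""
    n = len(M)
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for S in combinations(range(n), size):
            inside = set(S)
            leaving = sum(M[i][j] for i in S for j in range(n) if j not in inside)
            best = min(best, leaving / size)
    return best


# 2x2 symmetric doubly stochastic example; its eigenvalues are 1 and trace - 1.
M = [[0.75, 0.25], [0.25, 0.75]]
phi = edge_expansion(M)
lambda2 = M[0][0] + M[1][1] - 1.0
lower = (1.0 - lambda2) / 2.0
upper = (2.0 * (1.0 - lambda2)) ** 0.5
```

On this example the lower Cheeger bound is attained with equality, which makes it a convenient sanity check for the symmetric case before experimenting with nonsymmetric matrices.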
Submitted 27 September, 2019;
originally announced September 2019.
-
Learning Dynamics and the Co-Evolution of Competing Sexual Species
Authors:
Georgios Piliouras,
Leonard J. Schulman
Abstract:
We analyze a stylized model of co-evolution between any two purely competing species (e.g., host and parasite), both sexually reproducing. Similarly to a recent model of Livnat et al.~\cite{evolfocs14}, the fitness of an individual depends on whether the truth assignments on $n$ variables that reproduce through recombination satisfy a particular Boolean function. Whereas in the original model a satisfying assignment always confers a small evolutionary advantage, in our model the two species are in an evolutionary race with the parasite enjoying the advantage if the value of its Boolean function matches its host, and the host wishing to mismatch its parasite. Surprisingly, this model makes a simple and robust behavioral prediction. The typical system behavior is \textit{periodic}. These cycles stay bounded away from the boundary and thus, \textit{learning-dynamics competition between sexual species can provide an explanation for genetic diversity.} This explanation is due solely to the natural selection process. No mutations, environmental changes, etc., need be invoked.
The game played at the gene level may have many Nash equilibria with widely diverse fitness levels. Nevertheless, sexual evolution leads to gene coordination that implements an optimal strategy, i.e., an optimal population mixture, at the species level. Namely, the play of the many "selfish genes" implements a time-averaged correlated equilibrium where the average fitness of each species is exactly equal to its value in the two species zero-sum competition.
Our analysis combines tools from game theory, dynamical systems and Boolean functions to establish a novel class of conservative dynamical systems.
Submitted 18 November, 2017;
originally announced November 2017.
-
Online codes for analog signals
Authors:
Leonard J. Schulman,
Piyush Srivastava
Abstract:
This paper revisits a classical scenario in communication theory: a waveform sampled at regular intervals is to be encoded so as to minimize distortion in its reconstruction, despite noise. This transformation must be online (causal), to enable real-time signaling; and should use no more power than the original signal. The noise model we consider is an "atomic norm" convex relaxation of the standard (discrete alphabet) Hamming-weight-bounded model: namely, adversarial $\ell_1$-bounded. In the "block coding" (noncausal) setting, such encoding is possible due to the existence of large almost-Euclidean sections in $\ell_1$ spaces, a notion first studied in the work of Dvoretzky in 1961. Our main result is that an analogous result is achievable even causally. Equivalently, our work may be seen as a "lower triangular" version of $\ell_1$ Dvoretzky theorems. In terms of communication, the guarantees are expressed in terms of certain time-weighted norms: the time-weighted $\ell_2$ norm imposed on the decoder forces increasingly accurate reconstruction of the distant past signal, while the time-weighted $\ell_1$ norm on the noise ensures vanishing interference from distant past noise. Encoding is linear (hence easy to implement in analog hardware). Decoding is performed by an LP analogous to those used in compressed sensing.
Submitted 1 June, 2019; v1 submitted 17 July, 2017;
originally announced July 2017.
-
Quasi-regular sequences and optimal schedules for security games
Authors:
David Kempe,
Leonard J. Schulman,
Omer Tamuz
Abstract:
We study security games in which a defender commits to a mixed strategy for protecting a finite set of targets of different values. An attacker, knowing the defender's strategy, chooses which target to attack and for how long. If the attacker spends time $t$ at a target $i$ of value $α_i$, and if he leaves before the defender visits the target, his utility is $t \cdot α_i $; if the defender visits before he leaves, his utility is 0. The defender's goal is to minimize the attacker's utility. The defender's strategy consists of a schedule for visiting the targets; it takes her unit time to switch between targets. Such games are a simplified model of a number of real-world scenarios such as protecting computer networks from intruders, crops from thieves, etc.
We show that optimal defender play for these continuous-time security games reduces to the solution of a combinatorial question regarding the existence of infinite sequences over a finite alphabet, with the following properties for each symbol $i$: (1) $i$ constitutes a prescribed fraction $p_i$ of the sequence. (2) The occurrences of $i$ are spread apart close to evenly, in that the ratio of the longest to shortest interval between consecutive occurrences is bounded by a parameter $K$. We call such sequences $K$-quasi-regular.
We show that, surprisingly, $2$-quasi-regular sequences suffice for optimal defender play. What is more, even randomized $2$-quasi-regular sequences suffice for optimality. We show that such sequences always exist, and can be calculated efficiently.
The question of the least $K$ for which deterministic $K$-quasi-regular sequences exist is fascinating. Using an ergodic theoretical approach, we show that deterministic $3$-quasi-regular sequences always exist. For $2 \leq K < 3$ we do not know whether deterministic $K$-quasi-regular sequences always exist.
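The two properties defining $K$-quasi-regularity are easy to measure on a periodic sequence. The sketch below assumes the infinite sequence is the endless repetition of a finite word (an assumption made here for computability; the paper's sequences need not be periodic) and reports, for each symbol, its fraction and the ratio of its longest to shortest gap. The function names are illustrative.

```python
from fractions import Fraction


def symbol_stats(word):
    """For the periodic sequence word word word ..., return
    {symbol: (fraction of positions, longest gap / shortest gap)}."""
    n = len(word)
    stats = {}
    for s in set(word):
        pos = [i for i, ch in enumerate(word) if ch == s]
        # Cyclic gaps between consecutive occurrences, wrapping around the period.
        gaps = [(pos[(t + 1) % len(pos)] - pos[t]) % n or n for t in range(len(pos))]
        stats[s] = (Fraction(len(pos), n), Fraction(max(gaps), min(gaps)))
    return stats


def quasi_regularity(word):
    """Smallest K for which the periodic sequence is K-quasi-regular."""
    return max(ratio for _, ratio in symbol_stats(word).values())
```

For example, the word `abac` gives every symbol perfectly even spacing, while `aab` forces gaps of lengths 1 and 2 for the symbol `a`.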
Submitted 28 October, 2017; v1 submitted 22 November, 2016;
originally announced November 2016.
-
Market Dynamics of Best-Response with Lookahead
Authors:
Krishnamurthy Dvijotham,
Yuval Rabani,
Leonard J. Schulman
Abstract:
One attractive approach to market dynamics is the level $k$ model in which a level $0$ player adopts a very simple response to current conditions, a level $1$ player best-responds to a model in which others take level $0$ actions, and so forth. (This is analogous to $k$-ply exploration of game trees in AI, and to receding-horizon control in control theory.) If players have deterministic mental models with this kind of finite-level response, there is obviously no way their mental models can all be consistent. Nevertheless, there is experimental evidence that people act this way in many situations, motivating the question of what the dynamics of such interactions lead to.
We address this question in the setting of Fisher Markets with constant elasticities of substitution (CES) utilities, in the weak gross substitutes (WGS) regime. We show that despite the inconsistency of the mental models, and even if players' models change arbitrarily from round to round, the market converges to its unique equilibrium. (We show this for both synchronous and asynchronous discrete-time updates.) Moreover, the result is computationally feasible in the sense that the convergence rate is linear, i.e., the distance to equilibrium decays exponentially fast. To the best of our knowledge, this is the first result that demonstrates, in Fisher markets, convergence at any rate for dynamics driven by a plausible model of seller incentives. Even for the simple case of (level $0$) best-response dynamics, where we observe that convergence at some rate can be derived from recent results in convex optimization, our result is the first to demonstrate a linear rate of convergence.
Submitted 29 May, 2016;
originally announced May 2016.
-
Analysis of a Classical Matrix Preconditioning Algorithm
Authors:
Leonard J. Schulman,
Alistair Sinclair
Abstract:
We study a classical iterative algorithm for balancing matrices in the $L_\infty$ norm via a scaling transformation. This algorithm, which goes back to Osborne and Parlett \& Reinsch in the 1960s, is implemented as a standard preconditioner in many numerical linear algebra packages. Surprisingly, despite its widespread use over several decades, no bounds were known on its rate of convergence. In this paper we prove that, for any irreducible $n\times n$ (real or complex) input matrix~$A$, a natural variant of the algorithm converges in $O(n^3\log(nρ/\varepsilon))$ elementary balancing operations, where $ρ$ measures the initial imbalance of~$A$ and $\varepsilon$ is the target imbalance of the output matrix. (The imbalance of~$A$ is $\max_i |\log(a_i^{\text{out}}/a_i^{\text{in}})|$, where $a_i^{\text{out}},a_i^{\text{in}}$ are the maximum entries in magnitude in the $i$th row and column respectively.) This bound is tight up to the $\log n$ factor. A balancing operation scales the $i$th row and column so that their maximum entries are equal, and requires $O(m/n)$ arithmetic operations on average, where $m$ is the number of non-zero elements in~$A$. Thus the running time of the iterative algorithm is $\tilde{O}(n^2m)$. This is the first time bound of any kind on any variant of the Osborne-Parlett-Reinsch algorithm. We also prove a conjecture of Chen that characterizes those matrices for which the limit of the balancing process is independent of the order in which balancing operations are performed.
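A single balancing operation and the imbalance measure described above can be sketched as follows. Several details are assumptions of this sketch rather than statements from the paper: the row and column maxima are taken over off-diagonal entries (the diagonal is unchanged by the scaling), every row and column is assumed to have a nonzero off-diagonal entry, and the rule for choosing which index to balance next (greedily, the most imbalanced one) is just one plausible choice and not necessarily the variant analyzed by the authors.

```python
import math


def row_col_max(A, i):
    """Largest off-diagonal magnitudes in row i and in column i."""
    n = len(A)
    row = max(abs(A[i][j]) for j in range(n) if j != i)
    col = max(abs(A[j][i]) for j in range(n) if j != i)
    return row, col


def index_imbalance(A, i):
    row, col = row_col_max(A, i)
    return abs(math.log(row / col))


def imbalance(A):
    return max(index_imbalance(A, i) for i in range(len(A)))


def balance_index(A, i):
    """Scale row i by 1/s and column i by s so their maxima become equal."""
    row, col = row_col_max(A, i)
    s = math.sqrt(row / col)
    for j in range(len(A)):
        if j != i:
            A[i][j] /= s
            A[j][i] *= s


def balance(A, eps=1e-9, max_ops=10_000):
    """Repeat balancing operations (greedy index choice, an assumption)."""
    A = [list(r) for r in A]
    ops = 0
    while imbalance(A) > eps and ops < max_ops:
        i = max(range(len(A)), key=lambda t: index_imbalance(A, t))
        balance_index(A, i)
        ops += 1
    return A, ops


B, ops = balance([[1.0, 4.0], [1.0, 1.0]])
```

Each operation is a diagonal similarity transformation, so the eigenvalues of the matrix are preserved while the row and column magnitudes are brought closer together.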
Submitted 14 June, 2015; v1 submitted 12 April, 2015;
originally announced April 2015.
-
Learning Arbitrary Statistical Mixtures of Discrete Distributions
Authors:
Jian Li,
Yuval Rabani,
Leonard J. Schulman,
Chaitanya Swamy
Abstract:
We study the problem of learning from unlabeled samples very general statistical mixture models on large finite sets. Specifically, the model to be learned, $\vartheta$, is a probability distribution over probability distributions $p$, where each such $p$ is a probability distribution over $[n] = \{1,2,\dots,n\}$. When we sample from $\vartheta$, we do not observe $p$ directly, but only indirectly and in very noisy fashion, by sampling from $[n]$ repeatedly, independently $K$ times from the distribution $p$. The problem is to infer $\vartheta$ to high accuracy in transportation (earthmover) distance.
We give the first efficient algorithms for learning this mixture model without making any restricting assumptions on the structure of the distribution $\vartheta$. We bound the quality of the solution as a function of the size of the samples $K$ and the number of samples used. Our model and results have applications to a variety of unsupervised learning scenarios, including learning topic models and collaborative filtering.
Submitted 9 April, 2015;
originally announced April 2015.
-
The Adversarial Noise Threshold for Distributed Protocols
Authors:
William M. Hoza,
Leonard J. Schulman
Abstract:
We consider the problem of implementing distributed protocols, despite adversarial channel errors, on synchronous-messaging networks with arbitrary topology.
In our first result we show that any $n$-party $T$-round protocol on an undirected communication network $G$ can be compiled into a robust simulation protocol on a sparse ($\mathcal{O}(n)$ edges) subnetwork so that the simulation tolerates an adversarial error rate of $Ω\left(\frac{1}{n}\right)$; the simulation has a round complexity of $\mathcal{O}\left(\frac{m \log n}{n} T\right)$, where $m$ is the number of edges in $G$. (So the simulation is work-preserving up to a $\log$ factor.) The adversary's error rate is within a constant factor of optimal. Given the error rate, the round complexity blowup is within a factor of $\mathcal{O}(k \log n)$ of optimal, where $k$ is the edge connectivity of $G$. We also determine that the maximum tolerable error rate on directed communication networks is $Θ(1/s)$ where $s$ is the number of edges in a minimum equivalent digraph.
Next we investigate adversarial per-edge error rates, where the adversary is given an error budget on each edge of the network. We determine the exact limit for tolerable per-edge error rates on an arbitrary directed graph. However, the construction that approaches this limit has exponential round complexity, so we give another compiler, which transforms $T$-round protocols into $\mathcal{O}(mT)$-round simulations, and prove that for polynomial-query black box compilers, the per-edge error rate tolerated by this last compiler is within a constant factor of optimal.
Submitted 28 April, 2015; v1 submitted 27 December, 2014;
originally announced December 2014.
-
Achieving Target Equilibria in Network Routing Games without Knowing the Latency Functions
Authors:
Umang Bhaskar,
Katrina Ligett,
Leonard J. Schulman,
Chaitanya Swamy
Abstract:
The analysis of network routing games typically assumes, right at the outset, precise and detailed information about the latency functions. Such information may, however, be unavailable or difficult to obtain. Moreover, one is often primarily interested in enforcing a desired target flow as the equilibrium by suitably influencing player behavior in the routing game. We ask whether one can achieve target flows as equilibria without knowing the underlying latency functions.
Our main result gives a crisp positive answer to this question. We show that, under fairly general settings, one can efficiently compute edge tolls that induce a given target multicommodity flow in a nonatomic routing game using a polynomial number of queries to an oracle that takes candidate tolls as input and returns the resulting equilibrium flow. This result is obtained via a novel application of the ellipsoid method. Our algorithm extends easily to many other settings, such as (i) when certain edges cannot be tolled or there is an upper bound on the total toll paid by a user, and (ii) general nonatomic congestion games. We obtain tighter bounds on the query complexity for series-parallel networks, and single-commodity routing games with linear latency functions, and complement these with a query-complexity lower bound. We also obtain strong positive results for Stackelberg routing to achieve target equilibria in series-parallel graphs.
Our results build upon various new techniques that we develop pertaining to the computation of, and connections between, different notions of approximate equilibrium; properties of multicommodity flows and tolls in series-parallel graphs; and sensitivity of equilibrium flow with respect to tolls. Our results demonstrate that one can indeed circumvent the potentially onerous task of modeling latency functions, and yet obtain meaningful results for the underlying routing game.
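The query model described in this abstract can be illustrated on a toy instance. The sketch below is not the paper's ellipsoid-based algorithm; it swaps in plain bisection on a two-link parallel network with unit demand, which is enough to show tolls being found purely from oracle answers. The names `make_oracle`, `find_tolls`, and `toll_cap`, the affine latencies, and the assumption that a sufficient toll lies below `toll_cap` are all hypothetical and chosen for illustration only.

```python
def make_oracle(a1, b1, a2, b2):
    """Build a black-box oracle for two parallel links with unit demand.

    The latencies l_i(x) = a_i * x + b_i (assumed affine, a_i > 0) are
    hidden inside the closure; callers only see tolls -> equilibrium flow.
    """
    def oracle(t1, t2):
        # Interior Wardrop equilibrium: l1(x1) + t1 == l2(1 - x1) + t2.
        x1 = (a2 + b2 + t2 - b1 - t1) / (a1 + a2)
        # Clamp to [0, 1] to cover the case where one link carries all flow.
        return min(1.0, max(0.0, x1))
    return oracle


def find_tolls(oracle, target, toll_cap=100.0, tol=1e-9, max_iter=200):
    """Find tolls (t1, t2) whose equilibrium puts `target` flow on link 1.

    Uses only oracle queries. Bisection works here because the flow on a
    link is non-increasing in that link's toll (a toy-instance property).
    """
    base = oracle(0.0, 0.0)
    if abs(base - target) <= tol:
        return (0.0, 0.0)
    toll_link1 = base > target  # too much flow on link 1: toll link 1
    lo, hi = 0.0, toll_cap      # assumption: the needed toll is below toll_cap
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        tolls = (mid, 0.0) if toll_link1 else (0.0, mid)
        x1 = oracle(*tolls)
        # The toll is still too small if flow has not yet moved far enough.
        too_small = x1 > target if toll_link1 else x1 < target
        if too_small:
            lo = mid
        else:
            hi = mid
    best = (lo + hi) / 2.0
    return (best, 0.0) if toll_link1 else (0.0, best)


# Usage: the latencies are known only to the oracle, not to find_tolls.
oracle = make_oracle(a1=1.0, b1=0.0, a2=1.0, b2=1.0)
tolls = find_tolls(oracle, target=0.5)
induced = oracle(*tolls)
```

In this instance all flow uses link 1 when there are no tolls, and the search settles on a toll on link 1 that splits the flow evenly. Bisection suffices only because the toy problem is one-dimensional; the multicommodity setting in the abstract is what calls for a higher-dimensional method.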
Submitted 6 August, 2014;
originally announced August 2014.
-
Tree Codes and a Conjecture on Exponential Sums
Authors:
Cristopher Moore,
Leonard J. Schulman
Abstract:
We propose a new conjecture on some exponential sums. These particular sums have apparently not been considered in the literature. Subject to the conjecture we obtain the first effective construction of asymptotically good tree codes. The available numerical evidence is consistent with the conjecture and is sufficient to certify codes for significant-length communications.
Submitted 9 December, 2013; v1 submitted 27 August, 2013;
originally announced August 2013.
-
The Network Improvement Problem for Equilibrium Routing
Authors:
Umang Bhaskar,
Katrina Ligett,
Leonard J. Schulman
Abstract:
In routing games, agents pick their routes through a network to minimize their own delay. A primary concern for the network designer in routing games is the average agent delay at equilibrium. A number of methods to control this average delay have received substantial attention, including network tolls, Stackelberg routing, and edge removal.
A related approach with arguably greater practical relevance is that of making investments in improvements to the edges of the network, so that, for a given investment budget, the average delay at equilibrium in the improved network is minimized. This problem has received considerable attention in the literature on transportation research and a number of different algorithms have been studied. To our knowledge, none of this work gives guarantees on the output quality of any polynomial-time algorithm. We study a model for this problem introduced in the transportation research literature, and present both hardness results and algorithms that obtain nearly optimal performance guarantees.
- We first show that a simple algorithm obtains good approximation guarantees for the problem. Despite its simplicity, we show that for affine delays the approximation ratio of 4/3 obtained by the algorithm cannot be improved.
- To obtain better results, we then consider restricted topologies. For graphs consisting of parallel paths with affine delay functions we give an optimal algorithm. However, for graphs that consist of a series of parallel links, we show the problem is weakly NP-hard.
- Finally, we consider the problem in series-parallel graphs, and give an FPTAS for this case.
Our work thus formalizes the intuition held by transportation researchers that the network improvement problem is hard, and presents topology-dependent algorithms that have provably tight approximation guarantees.
Submitted 10 November, 2013; v1 submitted 14 July, 2013;
originally announced July 2013.
-
Allocation of Divisible Goods under Lexicographic Preferences
Authors:
Leonard J. Schulman,
Vijay V. Vazirani
Abstract:
We present a simple and natural non-pricing mechanism for allocating divisible goods among strategic agents having lexicographic preferences. Our mechanism has favorable properties of incentive compatibility (strategy-proofness), Pareto efficiency, envy-freeness, and time efficiency.
Submitted 11 October, 2015; v1 submitted 19 June, 2012;
originally announced June 2012.
-
The Symmetric Group Defies Strong Fourier Sampling: Part I
Authors:
Cristopher Moore,
Alexander Russell,
Leonard J. Schulman
Abstract:
We resolve the question of whether Fourier sampling can efficiently solve the hidden subgroup problem. Specifically, we show that the hidden subgroup problem over the symmetric group cannot be efficiently solved by strong Fourier sampling, even if one may perform an arbitrary POVM on the coset state. Our results apply to the special case relevant to the Graph Isomorphism problem.
Submitted 14 October, 2005; v1 submitted 12 January, 2005;
originally announced January 2005.