
Learning and Testing Causal Models with Interventions

Authors: Jayadev Acharya, Arnab Bhattacharyya, Constantinos Daskalakis, Saravanan Kandasamy

Abstract: We consider testing and learning problems on causal Bayesian networks as defined by Pearl (Pearl, 2009). Given a causal Bayesian network $\mathcal{M}$ on a graph with $n$ discrete variables and bounded in-degree and bounded `confounded components', we show that $O(\log n)$ interventions on an unknown causal Bayesian network $\mathcal{X}$ on the same graph, and $\tilde{O}(n/ε^2)$ samples per intervention, suffice to efficiently distinguish whether $\mathcal{X}=\mathcal{M}$ or whether there exists some intervention under which $\mathcal{X}$ and $\mathcal{M}$ are farther than $ε$ in total variation distance. We also obtain sample/time/intervention efficient algorithms for: (i) testing the identity of two unknown causal Bayesian networks on the same graph; and (ii) learning a causal Bayesian network on a given graph. Although our algorithms are non-adaptive, we show that adaptivity does not help in general: $Ω(\log n)$ interventions are necessary for testing the identity of two unknown causal Bayesian networks on the same graph, even adaptively. Our algorithms are enabled by a new subadditivity inequality for the squared Hellinger distance between two causal Bayesian networks.

Submitted 24 May, 2018; originally announced May 2018.

arXiv:1803.00494 [pdf, ps, other]

Robust Repeated Auctions under Heterogeneous Buyer Behavior

Authors: Shipra Agrawal, Constantinos Daskalakis, Vahab Mirrokni, Balasubramanian Sivan

Abstract: We study revenue optimization in a repeated auction between a single seller and a single buyer. Traditionally, the design of repeated auctions requires strong modeling assumptions about the bidder behavior, such as it being myopic, infinite lookahead, or some specific form of learning behavior. Is it possible to design mechanisms which are simultaneously optimal against a multitude of possible buyer behaviors? We answer this question by designing a simple state-based mechanism that is simultaneously approximately optimal against a $k$-lookahead buyer for all $k$, a buyer who is a no-regret learner, and a buyer who is a policy-regret learner. Against each type of buyer our mechanism attains a constant fraction of the optimal revenue attainable against that type of buyer. We complement our positive results with almost tight impossibility results, showing that the revenue approximation tradeoffs achieved by our mechanism for different lookahead attitudes are near-optimal.

Submitted 10 March, 2019; v1 submitted 1 March, 2018; originally announced March 2018.

arXiv:1712.09196 [pdf, other]

The Robust Manifold Defense: Adversarial Training using Generative Models

Authors: Ajil Jalal, Andrew Ilyas, Constantinos Daskalakis, Alexandros G. Dimakis

Abstract: We propose a new type of attack for finding adversarial examples for image classifiers. Our method exploits spanners, i.e. deep neural networks whose input space is low-dimensional and whose output range approximates the set of images of interest. Spanners may be generators of GANs or decoders of VAEs. The key idea in our attack is to search over latent code pairs to find ones that generate nearby images with different classifier outputs. We argue that our attack is stronger than searching over perturbations of real images. Moreover, we show that our stronger attack can be used to reduce the accuracy of Defense-GAN to 3\%, resolving an open problem from the well-known paper by Athalye et al. We combine our attack with normal adversarial training to obtain the most robust known MNIST classifier, significantly improving the state of the art against PGD attacks. Our formulation involves solving a min-max problem, where the min player sets the parameters of the classifier and the max player is running our attack, and is thus searching for adversarial examples in the {\em low-dimensional} input space of the spanner. All code and models are available at \url{https://github.com/ajiljalal/manifold-defense.git}

Submitted 9 July, 2019; v1 submitted 26 December, 2017; originally announced December 2017.

Comments: Added pseudo code for defense-gan break

arXiv:1711.00141 [pdf, other]

Training GANs with Optimism

Authors: Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, Haoyang Zeng

Abstract: We address the issue of limit cycling behavior in training Generative Adversarial Networks and propose the use of Optimistic Mirror Descent (OMD) for training Wasserstein GANs. Recent theoretical results have shown that optimistic mirror descent (OMD) can enjoy faster regret rates in the context of zero-sum games. WGAN training is exactly a context of solving a zero-sum game with simultaneous no-regret dynamics. Moreover, we show that optimistic mirror descent addresses the limit cycling problem in training WGANs. We formally show that in the case of bi-linear zero-sum games the last iterate of OMD dynamics converges to an equilibrium, in contrast to GD dynamics which are bound to cycle. We also portray the huge qualitative difference between GD and OMD dynamics with toy examples, even when GD is modified with many adaptations proposed in the recent literature, such as gradient penalty or momentum. We apply OMD WGAN training to a bioinformatics problem of generating DNA sequences. We observe that models trained with OMD achieve consistently smaller KL divergence with respect to the true underlying distribution than models trained with GD variants. Finally, we introduce a new algorithm, Optimistic Adam, which is an optimistic variant of Adam. We apply it to WGAN training on CIFAR10 and observe improved performance in terms of inception score as compared to Adam.

Submitted 13 February, 2018; v1 submitted 31 October, 2017; originally announced November 2017.
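
The contrast between GD and optimistic dynamics on a bilinear zero-sum game can be illustrated with a minimal sketch. It assumes the scalar game f(x, y) = x * y (x minimizes, y maximizes), the Euclidean form of the optimistic update (a step along twice the current gradient minus the previous one), and a step size of 0.1; the function names are hypothetical and none of this is taken from the authors' code.

```python
from math import hypot


def run_gd(x, y, eta, steps):
    """Simultaneous gradient descent/ascent on f(x, y) = x * y."""
    for _ in range(steps):
        gx, gy = y, x  # df/dx = y, df/dy = x
        x, y = x - eta * gx, y + eta * gy
    return x, y


def run_optimistic(x, y, eta, steps):
    """Optimistic variant: step along 2 * (current gradient) - (previous gradient)."""
    gx_prev, gy_prev = y, x  # assumption: the first step reduces to a plain GD step
    for _ in range(steps):
        gx, gy = y, x
        x, y = (x - 2 * eta * gx + eta * gx_prev,
                y + 2 * eta * gy - eta * gy_prev)
        gx_prev, gy_prev = gx, gy
    return x, y


def distance_to_equilibrium(point):
    """Euclidean distance to the unique equilibrium (0, 0) of f(x, y) = x * y."""
    return hypot(point[0], point[1])


if __name__ == "__main__":
    start = (1.0, 1.0)
    print("GD distance:        ", distance_to_equilibrium(run_gd(*start, 0.1, 2000)))
    print("Optimistic distance:", distance_to_equilibrium(run_optimistic(*start, 0.1, 2000)))
```

In this toy setting the plain GD iterate moves away from the equilibrium (each step multiplies the squared norm by 1 + eta^2), while the optimistic iterate contracts toward (0, 0).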

arXiv:1710.04170 [pdf, ps, other]

Concentration of Multilinear Functions of the Ising Model with Applications to Network Data

Authors: Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath

Abstract: We prove near-tight concentration of measure for polynomial functions of the Ising model under high temperature. For any degree $d$, we show that a degree-$d$ polynomial of an $n$-spin Ising model exhibits exponential tails that scale as $\exp(-r^{2/d})$ at radius $r=\tilde{Ω}_d(n^{d/2})$. Our concentration radius is optimal up to logarithmic factors for constant $d$, improving known results by polynomial factors in the number of spins. We demonstrate the efficacy of polynomial functions as statistics for testing the strength of interactions in social networks in both synthetic and real world data.

Submitted 11 October, 2017; originally announced October 2017.

Comments: To appear in NIPS 2017

arXiv:1709.00228 [pdf, ps, other]

Learning Multi-item Auctions with (or without) Samples

Authors: Yang Cai, Constantinos Daskalakis

Abstract: We provide algorithms that learn simple auctions whose revenue is approximately optimal in multi-item multi-bidder settings, for a wide range of valuations including unit-demand, additive, constrained additive, XOS, and subadditive. We obtain our learning results in two settings. The first is the commonly studied setting where sample access to the bidders' distributions over valuations is given, for both regular distributions and arbitrary distributions with bounded support. Our algorithms require polynomially many samples in the number of items and bidders. The second is a more general max-min learning setting that we introduce, where we are given "approximate distributions," and we seek to compute an auction whose revenue is approximately optimal simultaneously for all "true distributions" that are close to the given ones. These results are more general in that they imply the sample-based results, and are also applicable in settings where we have no sample access to the underlying distributions but have estimated them indirectly via market research or by observation of previously run, potentially non-truthful auctions. Our results hold for valuation distributions satisfying the standard (and necessary) independence-across-items property. They also generalize and improve upon recent works, which have provided algorithms that learn approximately optimal auctions in more restricted settings with additive, subadditive and unit-demand valuations using sample access to distributions.
We generalize these results to the complete unit-demand, additive, and XOS setting, to i.i.d. subadditive bidders, and to the max-min setting. Our results are enabled by new uniform convergence bounds for hypothesis classes under product measures. Our bounds result in exponential savings in sample complexity compared to bounds derived by bounding the VC dimension, and are of independent interest.

Submitted 1 September, 2017; originally announced September 2017.

Comments: Appears in FOCS 2017

arXiv:1708.00002 [pdf, ps, other]

Which Distribution Distances are Sublinearly Testable?

Authors: Constantinos Daskalakis, Gautam Kamath, John Wright

Abstract: Given samples from an unknown distribution $p$ and a description of a distribution $q$, are $p$ and $q$ close or far? This question of "identity testing" has received significant attention in the case of testing whether $p$ and $q$ are equal or far in total variation distance. However, in recent work, the following questions have been critical to solving problems at the frontiers of distribution testing: (1) Alternative Distances: Can we test whether $p$ and $q$ are far in other distances, say Hellinger? (2) Tolerance: Can we test when $p$ and $q$ are close, rather than equal? And if so, close in which distances? Motivated by these questions, we characterize the complexity of distribution testing under a variety of distances, including total variation, $\ell_2$, Hellinger, Kullback-Leibler, and $χ^2$. For each pair of distances $d_1$ and $d_2$, we study the complexity of testing if $p$ and $q$ are close in $d_1$ versus far in $d_2$, with a focus on identifying which problems allow strongly sublinear testers (i.e., those with complexity $O(n^{1 - γ})$ for some $γ> 0$ where $n$ is the size of the support of the distributions $p$ and $q$). We provide matching upper and lower bounds for each case. We also study these questions in the case where we only have samples from $q$ (equivalence testing), showing qualitative differences from identity testing in terms of when tolerance can be achieved. Our algorithms fall into the classical paradigm of $χ^2$-statistics, but require crucial changes to handle the challenges introduced by each distance we consider.
Finally, we survey other recent results in an attempt to serve as a reference for the complexity of various distribution testing problems.

Submitted 30 October, 2017; v1 submitted 31 July, 2017; originally announced August 2017.

Comments: To appear in SODA 2018
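
For readers unfamiliar with the distances named in the abstract above, the following sketch computes each of them for two small discrete distributions and prints them in an order that reflects standard textbook inequalities. The normalizations are assumptions, not taken from the paper: Hellinger is scaled so that $H^2 = 1 - \sum_i \sqrt{p_i q_i}$, KL uses natural logarithms, and $q$ is assumed to have full support. The example distributions are made up for illustration.

```python
from math import log, sqrt


def total_variation(p, q):
    """d_TV(p, q) = (1/2) * sum_i |p_i - q_i|."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def l2_distance(p, q):
    """Euclidean distance between the probability vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def hellinger(p, q):
    """H(p, q) with H^2 = (1/2) * sum_i (sqrt(p_i) - sqrt(q_i))^2."""
    return sqrt(0.5 * sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(p, q)))


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q_i > 0 wherever p_i > 0."""
    return sum(a * log(a / b) for a, b in zip(p, q) if a > 0)


def chi_squared(p, q):
    """chi^2(p || q) = sum_i (p_i - q_i)^2 / q_i; assumes q_i > 0 for all i."""
    return sum((a - b) ** 2 / b for a, b in zip(p, q))


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]  # illustrative distributions, not from the paper
    q = [0.4, 0.4, 0.2]
    print("H^2  :", hellinger(p, q) ** 2)
    print("TV   :", total_variation(p, q))
    print("l2   :", l2_distance(p, q))
    print("KL   :", kl_divergence(p, q))
    print("chi^2:", chi_squared(p, q))
```

Under these normalizations the standard relations $H^2 \le d_{\rm TV} \le \sqrt{2} H$ and $2 d_{\rm TV}^2 \le {\rm KL} \le χ^2$ hold, which is one reason "close in $d_1$ versus far in $d_2$" can be easier or harder depending on the pair chosen.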

arXiv:1704.06850 [pdf, other]

Testing Symmetric Markov Chains from a Single Trajectory

Authors: Constantinos Daskalakis, Nishanth Dikkala, Nick Gravin

Abstract: Classical distribution testing assumes access to i.i.d. samples from the distribution that is being tested. We initiate the study of Markov chain testing, assuming access to a single trajectory of a Markov Chain. In particular, we observe a single trajectory X0,...,Xt,... of an unknown, symmetric, and finite state Markov Chain M. We do not control the starting state X0, and we cannot restart the chain. Given our single trajectory, the goal is to test whether M is identical to a model Markov Chain M0, or far from it under an appropriate notion of difference. We propose a measure of difference between two Markov chains, motivated by the early work of Kazakos [Kaz78], which captures the scaling behavior of the total variation distance between trajectories sampled from the Markov chains as the length of these trajectories grows. We provide efficient testers and information-theoretic lower bounds for testing identity of symmetric Markov chains under our proposed measure of difference, which are tight up to logarithmic factors if the hitting times of the model chain M0 are O(n) in the size of the state space n.

Submitted 3 December, 2017; v1 submitted 22 April, 2017; originally announced April 2017.

Comments: 23 pages, 5 figures

arXiv:1703.10127 [pdf, other]

Priv'IT: Private and Sample Efficient Identity Testing

Authors: Bryan Cai, Constantinos Daskalakis, Gautam Kamath

Abstract: We develop differentially private hypothesis testing methods for the small sample regime. Given a sample $\cal D$ from a categorical distribution $p$ over some domain $Σ$, an explicitly described distribution $q$ over $Σ$, some privacy parameter $\varepsilon$, accuracy parameter $α$, and requirements $β_{\rm I}$ and $β_{\rm II}$ for the type I and type II errors of our test, the goal is to distinguish between $p=q$ and $d_{\rm{TV}}(p,q) \geq α$. We provide theoretical bounds for the sample size $|{\cal D}|$ so that our method both satisfies $(\varepsilon,0)$-differential privacy, and guarantees $β_{\rm I}$ and $β_{\rm II}$ type I and type II errors. We show that differential privacy may come for free in some regimes of parameters, and we always beat the sample complexity resulting from running the $χ^2$-test with noisy counts, or standard approaches such as repetition for endowing non-private $χ^2$-style statistics with differential privacy guarantees. We experimentally compare the sample complexity of our method to that of recently proposed methods for private hypothesis testing.

Submitted 6 June, 2017; v1 submitted 29 March, 2017; originally announced March 2017.

Comments: To appear in ICML 2017

arXiv:1702.07339 [pdf, ps, other]

A Converse to Banach's Fixed Point Theorem and its CLS Completeness

Authors: Constantinos Daskalakis, Christos Tzamos, Manolis Zampetakis

Abstract: Banach's fixed point theorem for contraction maps has been widely used to analyze the convergence of iterative methods in non-convex problems. It is a common experience, however, that iterative maps fail to be globally contracting under the natural metric in their domain, making the applicability of Banach's theorem limited. We explore how generally we can apply Banach's fixed point theorem to establish the convergence of iterative methods when pairing it with carefully designed metrics. Our first result is a strong converse of Banach's theorem, showing that it is a universal analysis tool for establishing global convergence of iterative methods to unique fixed points, and for bounding their convergence rate. In other words, we show that, whenever an iterative map globally converges to a unique fixed point, there exists a metric under which the iterative map is contracting and which can be used to bound the number of iterations until convergence. We illustrate our approach in the widely used power method, providing a new way of bounding its convergence rate through contraction arguments. We next consider the computational complexity of Banach's fixed point theorem. Making the proof of our converse theorem constructive, we show that computing a fixed point whose existence is guaranteed by Banach's fixed point theorem is CLS-complete.
We thus provide the first natural complete problem for the class CLS, which was defined in [Daskalakis, Papadimitriou 2011] to capture the complexity of problems such as P-matrix LCP, computing KKT-points, and finding mixed Nash equilibria in congestion and network coordination games.

Submitted 13 February, 2018; v1 submitted 23 February, 2017; originally announced February 2017.

arXiv:1612.03164 [pdf, ps, other]

Square Hellinger Subadditivity for Bayesian Networks and its Applications to Identity Testing

Authors: Constantinos Daskalakis, Qinxuan Pan

Abstract: We show that the square Hellinger distance between two Bayesian networks on the same directed graph, $G$, is subadditive with respect to the neighborhoods of $G$. Namely, if $P$ and $Q$ are the probability distributions defined by two Bayesian networks on the same DAG, our inequality states that the square Hellinger distance, $H^2(P,Q)$, between $P$ and $Q$ is upper bounded by the sum, $\sum_v H^2(P_{\{v\} \cup Π_v}, Q_{\{v\} \cup Π_v})$, of the square Hellinger distances between the marginals of $P$ and $Q$ on every node $v$ and its parents $Π_v$ in the DAG. Importantly, our bound does not involve the conditionals but the marginals of $P$ and $Q$. We derive a similar inequality for more general Markov Random Fields. As an application of our inequality, we show that distinguishing whether two Bayesian networks $P$ and $Q$ on the same (but potentially unknown) DAG satisfy $P=Q$ vs $d_{\rm TV}(P,Q)>ε$ can be performed from $\tilde{O}(|Σ|^{3/4(d+1)} \cdot n/ε^2)$ samples, where $d$ is the maximum in-degree of the DAG and $Σ$ the domain of each variable of the Bayesian networks. If $P$ and $Q$ are defined on potentially different and potentially unknown trees, the sample complexity becomes $\tilde{O}(|Σ|^{4.5} n/ε^2)$, whose dependence on $n, ε$ is optimal up to logarithmic factors. Lastly, if $P$ and $Q$ are product distributions over $\{0,1\}^n$ and $Q$ is known, the sample complexity becomes $O(\sqrt{n}/ε^2)$, which is optimal up to constant factors.

Submitted 9 December, 2016; originally announced December 2016.
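
The subadditivity inequality stated in the abstract above can be checked numerically on a tiny example. The sketch below assumes a binary chain $X_1 \to X_2 \to X_3$, so the neighborhoods $\{v\} \cup Π_v$ are $\{1\}$, $\{1,2\}$, and $\{2,3\}$, and uses the normalization $H^2(P,Q) = 1 - \sum_x \sqrt{P(x)Q(x)}$; the conditional probability values and function names are made up for illustration.

```python
from itertools import product
from math import sqrt


def chain_joint(p1, p2_given_1, p3_given_2):
    """Joint distribution of a binary chain X1 -> X2 -> X3.

    p1 = P(X1 = 1); p2_given_1[a] = P(X2 = 1 | X1 = a);
    p3_given_2[b] = P(X3 = 1 | X2 = b).
    """
    def bern(p, x):
        return p if x == 1 else 1.0 - p

    return {
        (a, b, c): bern(p1, a) * bern(p2_given_1[a], b) * bern(p3_given_2[b], c)
        for a, b, c in product((0, 1), repeat=3)
    }


def marginal(joint, indices):
    """Marginalize the joint onto the coordinates listed in `indices`."""
    out = {}
    for x, prob in joint.items():
        key = tuple(x[i] for i in indices)
        out[key] = out.get(key, 0.0) + prob
    return out


def sq_hellinger(p, q):
    """Squared Hellinger distance, H^2 = 1 - sum_x sqrt(p(x) * q(x))."""
    return 1.0 - sum(sqrt(p[k] * q[k]) for k in p)


def subadditivity_sides(P, Q):
    """Return (H^2(P, Q), sum over nodes v of H^2 on the marginal of v and its parents)."""
    neighborhoods = [(0,), (0, 1), (1, 2)]  # {v} plus parents, for the chain
    lhs = sq_hellinger(P, Q)
    rhs = sum(sq_hellinger(marginal(P, nb), marginal(Q, nb)) for nb in neighborhoods)
    return lhs, rhs


if __name__ == "__main__":
    P = chain_joint(0.3, [0.2, 0.7], [0.6, 0.1])
    Q = chain_joint(0.5, [0.4, 0.4], [0.9, 0.3])
    lhs, rhs = subadditivity_sides(P, Q)
    print("H^2(P, Q) =", lhs, " sum of local terms =", rhs)
```

The right-hand side only touches marginals on at most two variables here, which is what makes the inequality useful for testing: local distances can be estimated from far fewer samples than the global one.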

arXiv:1612.03147 [pdf, ps, other]

Testing Ising Models

Authors: Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath

Abstract: Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution? Similarly, is it possible to distinguish whether $p$ equals a given distribution $q$ versus $p$ and $q$ being far from each other? These problems of testing independence and goodness-of-fit have received enormous attention in statistics, information theory, and theoretical computer science, with sample-optimal algorithms known in several interesting regimes of parameters. Unfortunately, it has also been understood that these problems become intractable in large dimensions, necessitating exponential sample complexity. Motivated by the exponential lower bounds for general distributions as well as the ubiquity of Markov Random Fields (MRFs) in the modeling of high-dimensional distributions, we initiate the study of distribution testing on structured multivariate distributions, and in particular the prototypical example of MRFs: the Ising Model. We demonstrate that, in this structured setting, we can avoid the curse of dimensionality, obtaining sample and time efficient testers for independence and goodness-of-fit. One of the key technical challenges we face along the way is bounding the variance of functions of the Ising model.

Submitted 10 July, 2019; v1 submitted 9 December, 2016; originally announced December 2016.

Comments: Appeared in SODA 2018. Final version to appear in IEEE Transactions on Information Theory

arXiv:1609.00368 [pdf, other]

Ten Steps of EM Suffice for Mixtures of Two Gaussians

Authors: Constantinos Daskalakis, Christos Tzamos, Manolis Zampetakis

Abstract: The Expectation-Maximization (EM) algorithm is a widely used method for maximum likelihood estimation in models with latent variables. For estimating mixtures of Gaussians, its iteration can be viewed as a soft version of the k-means clustering algorithm. Despite its wide use and applications, there are essentially no known convergence guarantees for this method. We provide global convergence guarantees for mixtures of two Gaussians with known covariance matrices. We show that the population version of EM, where the algorithm is given access to infinitely many samples from the mixture, converges geometrically to the correct mean vectors, and provide simple, closed-form expressions for the convergence rate. As a simple illustration, we show that, in one dimension, ten steps of the EM algorithm initialized at infinity result in less than 1\% error estimation of the means. In the finite sample regime, we show that, under a random initialization, $\tilde{O}(d/ε^2)$ samples suffice to compute the unknown vectors to within $ε$ in Mahalanobis distance, where $d$ is the dimension. In particular, the error rate of the EM based estimator is $\tilde{O}\left(\sqrt{d \over n}\right)$ where $n$ is the number of samples, which is optimal up to logarithmic factors.

Submitted 5 June, 2017; v1 submitted 1 September, 2016; originally announced September 2016.

Comments: Accepted for presentation at Conference on Learning Theory (COLT) 2017
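
As a rough illustration of the EM iteration discussed in the abstract above, the sketch below runs sample-based EM in one dimension. It assumes a balanced mixture $\frac{1}{2} N(μ, 1) + \frac{1}{2} N(-μ, 1)$ with known unit variance, for which the EM update of the mean estimate is the sample average of $x \tanh(λ x)$. The sample size, seed, starting point, and function names are hypothetical choices, and the code does not attempt to reproduce the paper's "ten steps" claim.

```python
import random
from math import tanh


def sample_mixture(mu, n, seed=0):
    """Draw n samples from 0.5 * N(mu, 1) + 0.5 * N(-mu, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(mu if rng.random() < 0.5 else -mu, 1.0) for _ in range(n)]


def em_step(lam, samples):
    """One EM update for the symmetric two-Gaussian mixture with unit variance.

    The E-step gives posterior weight difference tanh(lam * x) for each sample;
    the M-step then averages x * tanh(lam * x).
    """
    return sum(x * tanh(lam * x) for x in samples) / len(samples)


def run_em(samples, lam0, steps):
    """Iterate the EM update starting from the initial guess lam0."""
    lam = lam0
    for _ in range(steps):
        lam = em_step(lam, samples)
    return lam


if __name__ == "__main__":
    data = sample_mixture(mu=1.0, n=5000)
    print("estimated mean:", run_em(data, lam0=5.0, steps=30))
```

Because $x \tanh(λ x) \ge 0$ whenever $λ > 0$, a positive starting point keeps the iterate positive, so this sketch recovers the positive mean; the negative one follows by symmetry.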

arXiv:1605.02054 [pdf, ps, other]

Revenue Maximization and Ex-Post Budget Constraints

Authors: Constantinos Daskalakis, Nikhil R. Devanur, S. Matthew Weinberg

Abstract: We consider the problem of a revenue-maximizing seller with m items for sale to n additive bidders with hard budget constraints, assuming that the seller has some prior distribution over bidder values and budgets. The prior may be correlated across items and budgets of the same bidder, but is assumed independent across bidders. We target mechanisms that are Bayesian Incentive Compatible, but that are ex-post Individually Rational and ex-post budget respecting. Virtually no such mechanisms are known that satisfy all these conditions and guarantee any revenue approximation, even with just a single item. We provide a computationally efficient mechanism that is a $3$-approximation with respect to all BIC, ex-post IR, and ex-post budget respecting mechanisms. Note that the problem is NP-hard to approximate better than a factor of 16/15, even in the case where the prior is a point mass \cite{ChakrabartyGoel}. We further characterize the optimal mechanism in this setting, showing that it can be interpreted as a distribution over virtual welfare maximizers. We prove our results by making use of a black-box reduction from mechanism to algorithm design developed by \cite{CaiDW13b}. Our main technical contribution is a computationally efficient $3$-approximation algorithm for the algorithmic problem that results by an application of their framework to this problem. The algorithmic problem has a mixed-sign objective and is NP-hard to optimize exactly, so it is surprising that a computationally efficient approximation is possible at all.
In the case of a single item ($m=1$), the algorithmic problem can be solved exactly via exhaustive search, leading to a computationally efficient exact algorithm and a stronger characterization of the optimal mechanism as a distribution over virtual value maximizers.

Submitted 6 May, 2016; originally announced May 2016.

arXiv:1603.07229 [pdf, ps, other]

Sequential Mechanisms with ex-post Participation Guarantees

Authors: Itai Ashlagi, Constantinos Daskalakis, Nima Haghpanah

Abstract: We provide a characterization of revenue-optimal dynamic mechanisms in settings where a monopolist sells k items over k periods to a buyer who realizes his value for item i in the beginning of period i. We require that the mechanism satisfies a strong individual rationality constraint, requiring that the stage utility of each agent be positive during each period. We show that the optimum mechanism can be computed by solving a nested sequence of static (single-period) mechanisms that optimize a tradeoff between the surplus of the allocation and the buyer's utility. We also provide a simple dynamic mechanism that obtains at least half of the optimal revenue. The mechanism either ignores history and posts the optimal monopoly price in each period, or allocates with a probability that is independent of the current report of the agent and is based only on previous reports. Our characterization extends to multi-agent auctions. We also formulate a discounted infinite horizon version of the problem, where we study the performance of "Markov mechanisms."

Submitted 5 July, 2016; v1 submitted 23 March, 2016; originally announced March 2016.

arXiv:1511.03641 [pdf, ps, other]

doi 10.1145/2897518.2897519

A Size-Free CLT for Poisson Multinomials and its Applications

Authors: Constantinos Daskalakis, Anindya De, Gautam Kamath, Christos Tzamos

Abstract: An $(n,k)$-Poisson Multinomial Distribution (PMD) is the distribution of the sum of $n$ independent random vectors supported on the set ${\cal B}_k=\{e_1,\ldots,e_k\}$ of standard basis vectors in $\mathbb{R}^k$. We show that any $(n,k)$-PMD is ${\rm poly}\left({k\over σ}\right)$-close in total variation distance to the (appropriately discretized) multi-dimensional Gaussian with the same first two moments, removing the dependence on $n$ from the Central Limit Theorem of Valiant and Valiant. Interestingly, our CLT is obtained by bootstrapping the Valiant-Valiant CLT itself through the structural characterization of PMDs shown in recent work by Daskalakis, Kamath, and Tzamos. In turn, our stronger CLT can be leveraged to obtain an efficient PTAS for approximate Nash equilibria in anonymous games, significantly improving the state of the art, and matching qualitatively the running time dependence on $n$ and $1/\varepsilon$ of the best known algorithm for two-strategy anonymous games. Our new CLT also enables the construction of covers for the set of $(n,k)$-PMDs, which are proper and whose size is shown to be essentially optimal. Our cover construction combines our CLT with the Shapley-Folkman theorem and recent sparsification results for Laplacian matrices by Batson, Spielman, and Srivastava. Our cover size lower bound is based on an algebraic geometric construction.
Finally, leveraging the structural properties of the Fourier spectrum of PMDs we show that these distributions can be learned from $O_k(1/\varepsilon^2)$ samples in ${\rm poly}_k(1/\varepsilon)$-time, removing the quasi-polynomial dependence of the running time on $1/\varepsilon$ from the algorithm of Daskalakis, Kamath, and Tzamos.

Submitted 16 June, 2016; v1 submitted 11 November, 2015; originally announced November 2015.

Comments: To appear in STOC 2016

arXiv:1511.01411 [pdf, ps, other]

Learning in Auctions: Regret is Hard, Envy is Easy

Authors: Constantinos Daskalakis, Vasilis Syrgkanis

Abstract: A line of recent work provides welfare guarantees of simple combinatorial auction formats, such as selling m items via simultaneous second price auctions (SiSPAs) (Christodoulou et al. 2008, Bhawalkar and Roughgarden 2011, Feldman et al. 2013). These guarantees hold even when the auctions are repeatedly executed and players use no-regret learning algorithms. Unfortunately, off-the-shelf no-regret algorithms for these auctions are computationally inefficient as the number of actions is exponential. We show that this obstacle is insurmountable: there are no polynomial-time no-regret algorithms for SiSPAs, unless RP$\supseteq$ NP, even when the bidders are unit-demand. Our lower bound raises the question of how good outcomes polynomially-bounded bidders may discover in such auctions. To answer this question, we propose a novel concept of learning in auctions, termed "no-envy learning." This notion is founded upon Walrasian equilibrium, and we show that it is both efficiently implementable and results in approximately optimal welfare, even when the bidders have fractionally subadditive (XOS) valuations (assuming demand oracles) or coverage valuations (without demand oracles). No-envy learning outcomes are a relaxation of no-regret outcomes, which maintain their approximate welfare optimality while endowing them with computational tractability. Our results extend to other auction formats that have been studied in the literature via the smoothness paradigm.
Our results for XOS valuations are enabled by a novel Follow-The-Perturbed-Leader algorithm for settings where the number of experts is infinite, and the payoff function of the learner is non-linear. This algorithm has applications outside of auction settings, such as in security games. Our result for coverage valuations is based on a novel use of convex rounding schemes and a reduction to online convex optimization.

Submitted 6 April, 2016; v1 submitted 4 November, 2015; originally announced November 2015.

arXiv:1508.01962 [pdf, other]

Species Trees are Recoverable from Unrooted Gene Tree Topologies Under a Constant Rate of Horizontal Gene Transfer

Authors: Constantinos Daskalakis, Sebastien Roch

Abstract: Reconstructing the tree of life from molecular sequences is a fundamental problem in computational biology. Modern data sets often contain a large number of genes, which can complicate the reconstruction problem due to the fact that different genes may undergo different evolutionary histories. This is the case in particular in the presence of horizontal genetic transfer (HGT), where a gene is inherited from a distant species rather than an immediate ancestor. Such an event produces a gene tree which is distinct from, but related to, the species phylogeny. In previous work, a natural stochastic model of HGT was introduced and studied. It was shown, both in simulation and theoretical studies, that a species phylogeny can be reconstructed from gene trees despite surprisingly high rates of HGT under this model. Rigorous lower and upper bounds on this achievable rate were also obtained, but a large gap remained. Here we close this gap, up to a constant. Specifically we show that a species phylogeny can be reconstructed correctly from gene trees even when, on each gene, each edge of the species tree has a constant probability of being the location of an HGT event. Our new reconstruction algorithm, which relies only on unrooted gene tree topologies, builds the tree recursively from the leaves and runs in polynomial time. We also provide a matching bound in the negative direction (up to a constant) and extend our results to some cases where gene trees are not perfectly known.

Submitted 20 July, 2017; v1 submitted 8 August, 2015; originally announced August 2015.

Comments: Submitted. Conference version published as: Daskalakis, Constantinos, and Sebastien Roch. "Species trees from gene trees despite a high rate of lateral genetic transfer: A tight bound." Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 2016

arXiv:1507.05952 [pdf, ps, other]

Optimal Testing for Properties of Distributions

Authors: Jayadev Acharya, Constantinos Daskalakis, Gautam Kamath

Abstract: Given samples from an unknown distribution $p$, is it possible to distinguish whether $p$ belongs to some class of distributions $\mathcal{C}$ versus $p$ being far from every distribution in $\mathcal{C}$? This fundamental question has received tremendous attention in statistics, focusing primarily on asymptotic analysis, and more recently in information theory and theoretical computer science, where the emphasis has been on small sample size and computational complexity. Nevertheless, even for basic properties of distributions such as monotonicity, log-concavity, unimodality, independence, and monotone-hazard rate, the optimal sample complexity is unknown. We provide a general approach via which we obtain sample-optimal and computationally efficient testers for all these distribution families. At the core of our approach is an algorithm which solves the following problem: Given samples from an unknown distribution $p$, and a known distribution $q$, are $p$ and $q$ close in $χ^2$-distance, or far in total variation distance? The optimality of our testers is established by providing matching lower bounds with respect to both $n$ and $\varepsilon$. Finally, a necessary building block for our testers and an important byproduct of our work are the first known computationally efficient proper learners for discrete log-concave and monotone hazard rate distributions.

Submitted 8 December, 2015; v1 submitted 21 July, 2015; originally announced July 2015.

Comments: 31 pages, extended abstract appeared as a spotlight in NIPS 2015

arXiv:1504.08363 [pdf, ps, other]

doi 10.1109/FOCS.2015.77

On the Structure, Covering, and Learning of Poisson Multinomial Distributions

Authors: Constantinos Daskalakis, Gautam Kamath, Christos Tzamos

Abstract: An $(n,k)$-Poisson Multinomial Distribution (PMD) is the distribution of the sum of $n$ independent random vectors supported on the set ${\cal B}_k=\{e_1,\ldots,e_k\}$ of standard basis vectors in $\mathbb{R}^k$. We prove a structural characterization of these distributions, showing that, for all $\varepsilon >0$, any $(n, k)$-Poisson multinomial random vector is $\varepsilon$-close, in total variation distance, to the sum of a discretized multidimensional Gaussian and an independent $(\text{poly}(k/\varepsilon), k)$-Poisson multinomial random vector. Our structural characterization extends the multi-dimensional CLT of Valiant and Valiant, by simultaneously applying to all approximation requirements $\varepsilon$. In particular, it overcomes factors depending on $\log n$ and, importantly, the minimum eigenvalue of the PMD's covariance matrix from the distance to a multidimensional Gaussian random variable. We use our structural characterization to obtain an $\varepsilon$-cover, in total variation distance, of the set of all $(n, k)$-PMDs, significantly improving the cover size of Daskalakis and Papadimitriou, and obtaining the same qualitative dependence of the cover size on $n$ and $\varepsilon$ as the $k=2$ cover of Daskalakis and Papadimitriou. We further exploit this structure to show that $(n,k)$-PMDs can be learned to within $\varepsilon$ in total variation distance from $\tilde{O}_k(1/\varepsilon^2)$ samples, which is near-optimal in terms of dependence on $\varepsilon$ and independent of $n$.
In particular, our result generalizes the single-dimensional result of Daskalakis, Diakonikolas, and Servedio for Poisson Binomials to arbitrary dimension.

Submitted 23 November, 2015; v1 submitted 30 April, 2015; originally announced April 2015.

Comments: 49 pages, extended abstract appeared in FOCS 2015

arXiv:1503.02516 [pdf, ps, other]

Optimal Pricing is Hard

Authors: Constantinos Daskalakis, Alan Deckelbaum, Christos Tzamos

Abstract: We show that computing the revenue-optimal deterministic auction in unit-demand single-buyer Bayesian settings, i.e. the optimal item-pricing, is computationally hard even in single-item settings where the buyer's value distribution is a sum of independently distributed attributes, or multi-item settings where the buyer's values for the items are independent. We also show that it is intractable to optimally price the grand bundle of multiple items for an additive bidder whose values for the items are independent. These difficulties stem from implicit definitions of a value distribution. We provide three instances of how different properties of implicit distributions can lead to intractability: the first is a #P-hardness proof, while the remaining two are reductions from the SQRT-SUM problem of Garey, Graham, and Johnson. While simple pricing schemes can oftentimes approximate the best scheme in revenue, they can have drastically different underlying structure. We argue therefore that either the specification of the input distribution must be highly restricted in format, or it is necessary for the goal to be mere approximation to the optimal scheme's revenue instead of computing properties of the scheme itself.

Submitted 9 March, 2015; originally announced March 2015.

arXiv:1503.01958 [pdf, ps, other]

doi 10.1145/2482540.2482593

Mechanism Design via Optimal Transport

Authors: Constantinos Daskalakis, Alan Deckelbaum, Christos Tzamos

Abstract: Optimal mechanisms have been provided in quite general multi-item settings, as long as each bidder's type distribution is given explicitly by listing every type in the support along with its associated probability. In the implicit setting, e.g. when the bidders have additive valuations with independent and/or continuous values for the items, these results do not apply, and it was recently shown that exact revenue optimization is intractable, even when there is only one bidder. Even for item distributions with special structure, optimal mechanisms have been surprisingly rare and the problem is challenging even in the two-item case. In this paper, we provide a framework for designing optimal mechanisms using optimal transport theory and duality theory. We instantiate our framework to obtain conditions under which only pricing the grand bundle is optimal in multi-item settings (complementing the work of [Manelli and Vincent 2006]), as well as to characterize optimal two-item mechanisms. We use our results to derive closed-form descriptions of the optimal mechanism in several two-item settings, exhibiting also a setting where a continuum of lotteries is necessary for revenue optimization but a closed-form representation of the mechanism can still be found efficiently using our framework.

Submitted 6 March, 2015; originally announced March 2015.

ACM Class: J.4

arXiv:1412.4840 [pdf, ps, other]

doi 10.1109/FOCS.2014.10

A Counter-Example to Karlin's Strong Conjecture for Fictitious Play

Authors: Constantinos Daskalakis, Qinxuan Pan

Abstract: Fictitious play is a natural dynamic for equilibrium play in zero-sum games, proposed by [Brown 1949], and shown to converge by [Robinson 1951]. Samuel Karlin conjectured in 1959 that fictitious play converges at rate $O(1/\sqrt{t})$ with the number of steps $t$. We disprove this conjecture showing that, when the payoff matrix of the row player is the $n \times n$ identity matrix, fictitious play may converge with rate as slow as $Ω(t^{-1/n})$.

Submitted 17 December, 2014; v1 submitted 15 December, 2014; originally announced December 2014.

Comments: 55th IEEE Symposium on Foundations of Computer Science
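
The dynamic described in the abstract above can be illustrated with a minimal sketch, assuming simultaneous updates and lowest-index tie-breaking. Both choices, and the helper names `identity`, `fictitious_play`, and `duality_gap`, are assumptions made for illustration and are not taken from the paper; in particular, this tie-breaking rule is not claimed to reproduce the slow $Ω(t^{-1/n})$ rate.

```python
def identity(n):
    """Row player's payoff matrix: the n x n identity (hypothetical helper)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def fictitious_play(payoff, steps):
    """Run fictitious play on a zero-sum game for `steps` >= 1 rounds.

    The row player maximizes payoff[i][j]; the column player minimizes it.
    Each round, both players best-respond to the opponent's empirical
    counts so far. Ties go to the lowest index (an assumption).
    Returns the empirical mixed strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    # Arbitrary opening moves.
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(steps - 1):
        # Cumulative payoff of each pure row against the column's history.
        row_vals = [sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
                    for i in range(n_rows)]
        # Cumulative payoff conceded by each pure column against the row's history.
        col_vals = [sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
                    for j in range(n_cols)]
        i_best = max(range(n_rows), key=lambda i: row_vals[i])
        j_best = min(range(n_cols), key=lambda j: col_vals[j])
        row_counts[i_best] += 1
        col_counts[j_best] += 1
    x = [c / steps for c in row_counts]
    y = [c / steps for c in col_counts]
    return x, y


def duality_gap(payoff, x, y):
    """Best-response value against y minus best-response value against x."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols))
                   for i in range(n_rows))
    best_col = min(sum(x[i] * payoff[i][j] for i in range(n_rows))
                   for j in range(n_cols))
    return best_row - best_col
```

The duality gap of the empirical strategies is non-negative for any pair of mixed strategies and is one common way to measure how far the play is from equilibrium; tracking it over increasing `steps` gives an empirical view of the convergence rate under the chosen tie-breaking rule.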

arXiv:1410.3386 [pdf, ps, other]

Testing Poisson Binomial Distributions

Authors: Jayadev Acharya, Constantinos Daskalakis

Abstract: A Poisson Binomial distribution over $n$ variables is the distribution of the sum of $n$ independent Bernoullis. We provide a sample near-optimal algorithm for testing whether a distribution $P$ supported on $\{0,...,n\}$ to which we have sample access is a Poisson Binomial distribution, or far from all Poisson Binomial distributions. The sample complexity of our algorithm is $O(n^{1/4})$ to which we provide a matching lower bound. We note that our sample complexity improves quadratically upon that of the naive "learn followed by tolerant-test" approach, while instance optimal identity testing [VV14] is not applicable since we are looking to simultaneously test against a whole family of distributions.

Submitted 13 October, 2014; v1 submitted 13 October, 2014; originally announced October 2014.

Comments: To appear in ACM-SIAM Symposium on Discrete Algorithms (SODA) 2015
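
To make the definition in the abstract above concrete, the following minimal sketch computes the exact probability mass function of a Poisson Binomial distribution by convolving one Bernoulli at a time. The function names `poisson_binomial_pmf` and `total_variation` are hypothetical, and the code only illustrates the object being tested; it is not the testing algorithm of the paper.

```python
def poisson_binomial_pmf(probs):
    """PMF of the sum of independent Bernoulli(p) variables, p in `probs`.

    Returns a list `pmf` of length len(probs) + 1 where pmf[k] is the
    probability that exactly k of the Bernoullis equal 1.
    """
    pmf = [1.0]  # the empty sum equals 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this Bernoulli is 0
            nxt[k + 1] += mass * p       # this Bernoulli is 1
        pmf = nxt
    return pmf


def total_variation(p, q):
    """Total variation distance between two PMFs on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))
```

For example, `poisson_binomial_pmf([0.5, 0.5])` is the Binomial(2, 1/2) distribution, and `total_variation` gives the distance notion in which "far from all Poisson Binomial distributions" is measured.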

arXiv:1409.4150 [pdf, other]

Strong Duality for a Multiple-Good Monopolist

Authors: Constantinos Daskalakis, Alan Deckelbaum, Christos Tzamos

Abstract: We characterize optimal mechanisms for the multiple-good monopoly problem and provide a framework to find them. We show that a mechanism is optimal if and only if a measure $μ$ derived from the buyer's type distribution satisfies certain stochastic dominance conditions. This measure expresses the marginal change in the seller's revenue under marginal changes in the rent paid to subsets of buyer types. As a corollary, we characterize the optimality of grand-bundling mechanisms, strengthening several results in the literature, where only sufficient optimality conditions have been derived. As an application, we show that the optimal mechanism for $n$ independent uniform items each supported on $[c,c+1]$ is a grand-bundling mechanism, as long as $c$ is sufficiently large, extending Pavlov's result for $2$ items [Pavlov'11]. At the same time, our characterization also implies that, for all $c$ and for all sufficiently large $n$, the optimal mechanism for $n$ independent uniform items supported on $[c,c+1]$ is not a grand bundling mechanism.

Submitted 5 September, 2017; v1 submitted 14 September, 2014; originally announced September 2014.

arXiv:1408.2539 [pdf, ps, other]

Optimum Statistical Estimation with Strategic Data Sources

Authors: Yang Cai, Constantinos Daskalakis, Christos H. Papadimitriou

Abstract: We propose an optimum mechanism for providing monetary incentives to the data sources of a statistical estimator such as linear regression, so that high quality data is provided at low cost, in the sense that the sum of payments and estimation error is minimized. The mechanism applies to a broad range of estimators, including linear and polynomial regression, kernel regression, and, under some additional assumptions, ridge regression. It also generalizes to several objectives, including minimizing estimation error subject to budget constraints. Besides our concrete results for regression problems, we contribute a mechanism design framework through which to design and analyze statistical estimators whose examples are supplied by workers with cost for labeling said examples.

Submitted 24 April, 2015; v1 submitted 11 August, 2014; originally announced August 2014.

arXiv:1405.5940 [pdf, ps, other]

Bayesian Truthful Mechanisms for Job Scheduling from Bi-criterion Approximation Algorithms

Authors: Constantinos Daskalakis, S. Matthew Weinberg

Abstract: We provide polynomial-time approximately optimal Bayesian mechanisms for makespan minimization on unrelated machines as well as for max-min fair allocations of indivisible goods, with approximation factors of $2$ and $\min\{m-k+1, \tilde{O}(\sqrt{k})\}$ respectively, matching the approximation ratios of best known polynomial-time \emph{algorithms} (for max-min fairness, the latter claim is true for certain ratios of the number of goods $m$ to people $k$). Our mechanisms are obtained by establishing a polynomial-time approximation-sensitive reduction from the problem of designing approximately optimal {\em mechanisms} for some arbitrary objective ${\cal O}$ to that of designing bi-criterion approximation {\em algorithms} for the same objective ${\cal O}$ plus a linear allocation cost term. Our reduction is itself enabled by extending the celebrated "equivalence of separation and optimization" [GLSS81,KP80] to also accommodate bi-criterion approximations. Moreover, to apply the reduction to the specific problems of makespan and max-min fairness we develop polynomial-time bi-criterion approximation algorithms for makespan minimization with costs and max-min fairness with costs, adapting the algorithms of [ST93], [BD05] and [AS07] to the type of bi-criterion approximation that is required by the reduction.

Submitted 22 May, 2014; originally announced May 2014.

arXiv:1312.1054 [pdf, ps, other]

Faster and Sample Near-Optimal Algorithms for Proper Learning Mixtures of Gaussians

Authors: Constantinos Daskalakis, Gautam Kamath

Abstract: We provide an algorithm for properly learning mixtures of two single-dimensional Gaussians without any separability assumptions. Given $\tilde{O}(1/\varepsilon^2)$ samples from an unknown mixture, our algorithm outputs a mixture that is $\varepsilon$-close in total variation distance, in time $\tilde{O}(1/\varepsilon^5)$. Our sample complexity is optimal up to logarithmic factors, and significantly improves upon both Kalai et al., whose algorithm has a prohibitive dependence on $1/\varepsilon$, and Feldman et al., whose algorithm requires bounds on the mixture parameters and depends pseudo-polynomially in these parameters. One of our main contributions is an improved and generalized algorithm for selecting a good candidate distribution from among competing hypotheses. Namely, given a collection of $N$ hypotheses containing at least one candidate that is $\varepsilon$-close to an unknown distribution, our algorithm outputs a candidate which is $O(\varepsilon)$-close to the distribution. The algorithm requires ${O}(\log{N}/\varepsilon^2)$ samples from the unknown distribution and ${O}(N \log N/\varepsilon^2)$ time, which improves previous such results (such as the Scheffé estimator) from a quadratic dependence of the running time on $N$ to quasilinear. Given the wide use of such results for the purpose of hypothesis selection, our improved algorithm implies immediate improvements to any such use.

Submitted 19 May, 2014; v1 submitted 4 December, 2013; originally announced December 2013.

Comments: 31 pages, to appear in COLT 2014

arXiv:1309.7084 [pdf, other]

How good is the Chord algorithm?

Authors: Constantinos Daskalakis, Ilias Diakonikolas, Mihalis Yannakakis

Abstract: The Chord algorithm is a popular, simple method for the succinct approximation of curves, which is widely used, under different names, in a variety of areas, such as multiobjective and parametric optimization, computational geometry, and graphics. We analyze the performance of the Chord algorithm, as compared to the optimal approximation that achieves a desired accuracy with the minimum number of points. We prove sharp upper and lower bounds, both in the worst case and average case setting.

Submitted 26 September, 2013; originally announced September 2013.

Comments: 47 pages, full version of SODA 2010 paper

arXiv:1307.3621 [pdf, ps, other]

A Polynomial-time Approximation Scheme for Fault-tolerant Distributed Storage

Authors: Constantinos Daskalakis, Anindya De, Ilias Diakonikolas, Ankur Moitra, Rocco A. Servedio

Abstract: We consider a problem which has received considerable attention in systems literature because of its applications to routing in delay tolerant networks and replica placement in distributed storage systems. In abstract terms the problem can be stated as follows: Given a random variable $X$ generated by a known product distribution over $\{0,1\}^n$ and a target value $0 \leq θ \leq 1$, output a non-negative vector $w$, with $\|w\|_1 \le 1$, which maximizes the probability of the event $w \cdot X \ge θ$. This is a challenging non-convex optimization problem for which even computing the value $\Pr[w \cdot X \ge θ]$ of a proposed solution vector $w$ is #P-hard. We provide an additive EPTAS for this problem which, for constant-bounded product distributions, runs in ${\rm poly}(n) \cdot 2^{{\rm poly}(1/\varepsilon)}$ time and outputs an $\varepsilon$-approximately optimal solution vector $w$ for this problem. Our approach is inspired by, and extends, recent structural results from the complexity-theoretic study of linear threshold functions. Furthermore, in spite of the objective function being non-smooth, we give a \emph{unicriterion} PTAS while previous work for such objective functions has typically led to a \emph{bicriterion} PTAS. We believe our techniques may be applicable to get unicriterion PTAS for other non-smooth objective functions.

Submitted 13 July, 2013; originally announced July 2013.

arXiv:1306.1265 [pdf, ps, other]

Sparse Covers for Sums of Indicators

Authors: Constantinos Daskalakis, Christos Papadimitriou

Abstract: For all $n, ε>0$, we show that the set of Poisson Binomial distributions on $n$ variables admits a proper $ε$-cover in total variation distance of size $n^2+n \cdot (1/ε)^{O(\log^2 (1/ε))}$, which can also be computed in polynomial time. We discuss the implications of our construction for approximation algorithms and the computation of approximate Nash equilibria in anonymous games.

Submitted 1 October, 2014; v1 submitted 5 June, 2013; originally announced June 2013.

Comments: PTRF, to appear

arXiv:1305.4002 [pdf, ps, other]

Understanding Incentives: Mechanism Design becomes Algorithm Design

Authors: Yang Cai, Constantinos Daskalakis, S. Matthew Weinberg

Abstract: We provide a computationally efficient black-box reduction from mechanism design to algorithm design in very general settings. Specifically, we give an approximation-preserving reduction from truthfully maximizing \emph{any} objective under \emph{arbitrary} feasibility constraints with \emph{arbitrary} bidder types to (not necessarily truthfully) maximizing the same objective plus virtual welfare (under the same feasibility constraints). Our reduction is based on a fundamentally new approach: we describe a mechanism's behavior indirectly only in terms of the expected value it awards bidders for certain behavior, and never directly access the allocation rule at all. Applying our new approach to revenue, we exhibit settings where our reduction holds \emph{both ways}. That is, we also provide an approximation-sensitive reduction from (non-truthfully) maximizing virtual welfare to (truthfully) maximizing revenue, and therefore the two problems are computationally equivalent. With this equivalence in hand, we show that both problems are NP-hard to approximate within any polynomial factor, even for a single monotone submodular bidder. We further demonstrate the applicability of our reduction by providing a truthful mechanism maximizing fractional max-min fairness. This is the first instance of a truthful mechanism that optimizes a non-linear objective.

Submitted 17 May, 2013; originally announced May 2013.

arXiv:1305.4000 [pdf, ps, other]

Reducing Revenue to Welfare Maximization: Approximation Algorithms and other Generalizations

Authors: Yang Cai, Constantinos Daskalakis, S. Matthew Weinberg

Abstract: It was recently shown in [http://arxiv.longhoe.net/abs/1207.5518] that revenue optimization can be computationally efficiently reduced to welfare optimization in all multi-dimensional Bayesian auction problems with arbitrary (possibly combinatorial) feasibility constraints and independent additive bidders with arbitrary (possibly combinatorial) demand constraints. This reduction provides a poly-time solution to the optimal mechanism design problem in all auction settings where welfare optimization can be solved efficiently, but it is fragile to approximation and cannot provide solutions to settings where welfare maximization can only be tractably approximated. In this paper, we extend the reduction to accommodate approximation algorithms, providing an approximation preserving reduction from (truthful) revenue maximization to (not necessarily truthful) welfare maximization. The mechanisms output by our reduction choose allocations via black-box calls to welfare approximation on randomly selected inputs, thereby generalizing also our earlier structural results on optimal multi-dimensional mechanisms to approximately optimal mechanisms. Unlike [http://arxiv.longhoe.net/abs/1207.5518], our results here are obtained through novel uses of the Ellipsoid algorithm and other optimization techniques over {\em non-convex regions}.

Submitted 17 May, 2013; originally announced May 2013.

arXiv:1211.1703 [pdf, ps, other]

The Complexity of Optimal Mechanism Design

Authors: Constantinos Daskalakis, Alan Deckelbaum, Christos Tzamos

Abstract: Myerson's seminal work provides a computationally efficient revenue-optimal auction for selling one item to multiple bidders. Generalizing this work to selling multiple items at once has been a central question in economics and algorithmic game theory, but its complexity has remained poorly understood. We answer this question by showing that a revenue-optimal auction in multi-item settings cannot be found and implemented computationally efficiently, unless ZPP contains P^#P. This is true even for a single additive bidder whose values for the items are independently distributed on two rational numbers with rational probabilities. Our result is very general: we show that it is hard to compute any encoding of an optimal auction of any format (direct or indirect, truthful or non-truthful) that can be implemented in expected polynomial time. In particular, under well-believed complexity-theoretic assumptions, revenue-optimization in very simple multi-item settings can only be tractably approximated. We note that our hardness result applies to randomized mechanisms in a very simple setting, and is not an artifact of introducing combinatorial structure to the problem by allowing correlation among item values, introducing combinatorial valuations, or requiring the mechanism to be deterministic (whose structure is readily combinatorial). Our proof is enabled by a flow-interpretation of the solutions of an exponential-size linear program for revenue maximization with an additional supermodularity constraint.

Submitted 30 March, 2013; v1 submitted 7 November, 2012; originally announced November 2012.

arXiv:1207.5518 [pdf, ps, other]

Optimal Multi-Dimensional Mechanism Design: Reducing Revenue to Welfare Maximization

Authors: Yang Cai, Constantinos Daskalakis, S. Matthew Weinberg

Abstract: We provide a reduction from revenue maximization to welfare maximization in multi-dimensional Bayesian auctions with arbitrary (possibly combinatorial) feasibility constraints and independent bidders with arbitrary (possibly combinatorial) demand constraints, appropriately extending Myerson's result to this setting. We also show that every feasible Bayesian auction can be implemented as a distribution over virtual VCG allocation rules. A virtual VCG allocation rule has the following simple form: Every bidder's type t_i is transformed into a virtual type f_i(t_i), via a bidder-specific function. Then, the allocation maximizing virtual welfare is chosen. Using this characterization, we show how to find and run the revenue-optimal auction given only black box access to an implementation of the VCG allocation rule. We generalize this result to arbitrarily correlated bidders, introducing the notion of a second-order VCG allocation rule. We obtain our reduction from revenue to welfare optimization via two algorithmic results on reduced forms in settings with arbitrary feasibility and demand constraints. First, we provide a separation oracle for determining feasibility of a reduced form. Second, we provide a geometric algorithm to decompose any feasible reduced form into a distribution over virtual VCG allocation rules. In addition, we show how to execute both algorithms given only black box access to an implementation of the VCG allocation rule. Our results are computationally efficient for all multi-dimensional settings where the bidders are additive. In this case, our mechanisms run in time polynomial in the total number of bidder types, but not type profiles. For generic correlated distributions, this is the natural description complexity of the problem. The runtime can be further improved to poly(#items, #bidders) in item-symmetric settings by making use of recent techniques.
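As an illustration of the virtual VCG allocation rule described in this abstract, the following is a minimal sketch and not the paper's algorithm. It assumes the simplest setting only: additive bidders, each item given to at most one bidder, and no further feasibility or demand constraints. The function name `virtual_vcg_allocation` and the example virtual-type functions are hypothetical.

```python
def virtual_vcg_allocation(types, virtual_fns):
    """Pick the allocation maximizing virtual welfare (assumed additive setting).

    types[i] is bidder i's list of item values (the type t_i);
    virtual_fns[i] is the bidder-specific map f_i from a type to virtual values.
    Returns, for each item, the index of the winning bidder or None.
    """
    # Step 1: transform every type t_i into a virtual type f_i(t_i).
    virtual = [f(t) for f, t in zip(virtual_fns, types)]
    n_items = len(virtual[0])
    allocation = []
    for j in range(n_items):
        # Step 2: with additive bidders and no other constraints, virtual
        # welfare is maximized item by item.
        best = max(range(len(virtual)), key=lambda i: virtual[i][j])
        # An item is left unallocated if every virtual value is non-positive.
        allocation.append(best if virtual[best][j] > 0 else None)
    return allocation


# Hypothetical example: two bidders, two items, shift-style virtual functions.
example_types = [[3.0, 1.0], [2.0, 2.0]]
example_fns = [
    lambda t: [v - 2.5 for v in t],
    lambda t: [v - 1.75 for v in t],
]
example_allocation = virtual_vcg_allocation(example_types, example_fns)
```

With richer feasibility constraints the per-item maximization would be replaced by a call to a welfare-maximization routine, which is the black box the abstract refers to.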

Submitted 23 July, 2012; originally announced July 2012.

arXiv:1112.5659 [pdf, ps, other]

Testing $k$-Modal Distributions: Optimal Algorithms via Reductions

Authors: Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio, Gregory Valiant, Paul Valiant

Abstract: We give highly efficient algorithms, and almost matching lower bounds, for a range of basic statistical problems that involve testing and estimating the L_1 distance between two k-modal distributions $p$ and $q$ over the discrete domain $\{1,\dots,n\}$. More precisely, we consider the following four problems: given sample access to an unknown k-modal distribution $p$, Testing identity to a known or unknown distribution: 1. Determine whether $p = q$ (for an explicitly given k-modal distribution $q$) versus $p$ is $\eps$-far from $q$; 2. Determine whether $p=q$ (where $q$ is available via sample access) versus $p$ is $\eps$-far from $q$; Estimating $L_1$ distance ("tolerant testing") against a known or unknown distribution: 3. Approximate $d_{TV}(p,q)$ to within additive $\eps$ where $q$ is an explicitly given k-modal distribution $q$; 4. Approximate $d_{TV}(p,q)$ to within additive $\eps$ where $q$ is available via sample access. For each of these four problems we give sub-logarithmic sample algorithms, that we show are tight up to additive $\poly(k)$ and multiplicative $\polylog\log n+\polylog k$ factors. Thus our bounds significantly improve the previous results of \cite{BKR:04}, which were for testing identity of distributions (items (1) and (2) above) in the special cases k=0 (monotone distributions) and k=1 (unimodal distributions) and required $O((\log n)^3)$ samples. As our main conceptual contribution, we introduce a new reduction-based approach for distribution-testing problems that lets us obtain all the above results in a unified way. Roughly speaking, this approach enables us to transform various distribution testing problems for k-modal distributions over $\{1,\dots,n\}$ to the corresponding distribution testing problems for unrestricted distributions over a much smaller domain $\{1,\dots,\ell\}$ where $\ell = O(k \log n).$

Submitted 23 December, 2011; originally announced December 2011.

arXiv:1112.4572 [pdf, ps, other]

A Constructive Approach to Reduced-Form Auctions with Applications to Multi-Item Mechanism Design

Authors: Yang Cai, Constantinos Daskalakis, S. Matthew Weinberg

Abstract: We provide a constructive proof of Border's theorem [Bor91, HR15a] and its generalization to reduced-form auctions with asymmetric bidders [Bor07, MV10, CKM13]. Given a reduced form, we identify a subset of Border constraints that are necessary and sufficient to determine its feasibility. Importantly, the number of these constraints is linear in the total number of bidder types. In addition, we provide a characterization result showing that every feasible reduced form can be induced by an ex-post allocation rule that is a distribution over ironings of the same total ordering of the union of all bidders' types. We show how to leverage our results for single-item reduced forms to design auctions with heterogeneous items and asymmetric bidders with valuations that are additive over items. Appealing to our constructive Border's theorem, we obtain polynomial-time algorithms for computing the revenue-optimal auction. Appealing to our characterization of feasible reduced forms, we characterize feasible multi-item allocation rules.

Submitted 19 November, 2017; v1 submitted 20 December, 2011; originally announced December 2011.

arXiv:1112.4006 [pdf, other]

On Optimal Multi-Dimensional Mechanism Design

Authors: Constantinos Daskalakis, S. Matthew Weinberg

Abstract: We efficiently solve the optimal multi-dimensional mechanism design problem for independent bidders with arbitrary demand constraints when either the number of bidders is a constant or the number of items is a constant. In the first setting, we need that each bidder's values for the items are sampled from a possibly correlated, item-symmetric distribution, allowing different distributions for each bidder. In the second setting, we allow the values of each bidder for the items to be arbitrarily correlated, but assume that the distribution of bidder types is bidder-symmetric. For all eps>0, we obtain an additive eps-approximation, when the value distributions are bounded, or a multiplicative (1-eps)-approximation when the value distributions are unbounded, but satisfy the Monotone Hazard Rate condition, covering a widely studied class of distributions in Economics. Our runtime is polynomial in max{#items,#bidders}, and not the size of the support of the joint distribution of all bidders' values for all items, which is typically exponential in both the number of items and the number of bidders. Our mechanisms are randomized, explicitly price bundles, and can sometimes accommodate budget constraints. Our results are enabled by establishing several new tools and structural properties of Bayesian mechanisms. We provide a symmetrization technique turning any truthful mechanism into one that has the same revenue and respects all symmetries in the underlying value distributions. We also prove that item-symmetric mechanisms satisfy a natural monotonicity property which, unlike cyclic-monotonicity, can be harnessed algorithmically. Finally, we provide a technique that turns any given eps-BIC mechanism (i.e. one where incentive constraints are violated by eps) into a truly-BIC mechanism at the cost of O(sqrt{eps}) revenue. We expect our tools to be used beyond the settings we consider here.

Submitted 16 December, 2011; originally announced December 2011.

arXiv:1109.5002 [pdf, ps, other]

doi 10.1214/12-AAP852

Alignment-free phylogenetic reconstruction: Sample complexity via a branching process analysis

Authors: Constantinos Daskalakis, Sebastien Roch

Abstract: We present an efficient phylogenetic reconstruction algorithm allowing insertions and deletions which provably achieves a sequence-length requirement (or sample complexity) growing polynomially in the number of taxa. Our algorithm is distance-based, that is, it relies on pairwise sequence comparisons. More importantly, our approach largely bypasses the difficult problem of multiple sequence alignment.

Submitted 22 February, 2013; v1 submitted 23 September, 2011; originally announced September 2011.

Comments: Published at http://dx.doi.org/10.1214/12-AAP852 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AAP-AAP852

Journal ref: Annals of Applied Probability 2013, Vol. 23, No. 2, 693-721

arXiv:1107.2702 [pdf, ps, other]

Learning Poisson Binomial Distributions

Authors: Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio

Abstract: We consider a basic problem in unsupervised learning: learning an unknown \emph{Poisson Binomial Distribution}. A Poisson Binomial Distribution (PBD) over $\{0,1,\dots,n\}$ is the distribution of a sum of $n$ independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 \cite{Poisson:37} and are a natural $n$-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal. We essentially settle the complexity of the learning problem for this basic class of distributions. As our first main result we give a highly efficient algorithm which learns to $\eps$-accuracy (with respect to the total variation distance) using $\tilde{O}(1/\eps^3)$ samples \emph{independent of $n$}. The running time of the algorithm is \emph{quasilinear} in the size of its input data, i.e., $\tilde{O}(\log(n)/\eps^3)$ bit-operations. (Observe that each draw from the distribution is a $\log(n)$-bit string.) Our second main result is a \emph{proper} learning algorithm that learns to $\eps$-accuracy using $\tilde{O}(1/\eps^2)$ samples, and runs in time $(1/\eps)^{\poly (\log (1/\eps))} \cdot \log n$. This is nearly optimal, since any algorithm for this problem must use $Ω(1/\eps^2)$ samples. We also give positive and negative results for some extensions of this learning problem to weighted sums of independent Bernoulli random variables.
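To make the definition of a Poisson Binomial Distribution in this abstract concrete, here is a small sketch that computes the exact PMF of a sum of independent Bernoulli variables by repeated convolution, together with the total variation distance used as the accuracy measure. It illustrates the object being learned, not the learning algorithms of the paper; the names `pbd_pmf` and `total_variation` are hypothetical.

```python
def pbd_pmf(probs):
    """Exact PMF over {0, ..., n} of the sum of independent Bernoulli(p_i)."""
    pmf = [1.0]  # the empty sum equals 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this Bernoulli variable is 0
            nxt[k + 1] += mass * p      # this Bernoulli variable is 1
        pmf = nxt
    return pmf


def total_variation(p, q):
    """Total variation distance between two PMFs on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical example with two unequal expectations.
example_pmf = pbd_pmf([0.2, 0.7])
```

When all expectations are equal the result coincides with the Binomial PMF, which is the sense in which the PBD is an $n$-parameter generalization of the Binomial Distribution.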

Submitted 16 February, 2015; v1 submitted 13 July, 2011; originally announced July 2011.

Comments: Revised full version. Improved sample complexity bound of O~(1/eps^2)

arXiv:1107.2700 [pdf, ps, other]

Learning $k$-Modal Distributions via Testing

Authors: Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio

Abstract: A $k$-modal probability distribution over the discrete domain $\{1,...,n\}$ is one whose histogram has at most $k$ "peaks" and "valleys." Such distributions are natural generalizations of monotone ($k=0$) and unimodal ($k=1$) probability distributions, which have been intensively studied in probability theory and statistics. In this paper we consider the problem of \emph{learning} (i.e., performing density estimation of) an unknown $k$-modal distribution with respect to the $L_1$ distance. The learning algorithm is given access to independent samples drawn from an unknown $k$-modal distribution $p$, and it must output a hypothesis distribution $\widehat{p}$ such that with high probability the total variation distance between $p$ and $\widehat{p}$ is at most $ε.$ Our main goal is to obtain \emph{computationally efficient} algorithms for this problem that use (close to) an information-theoretically optimal number of samples. We give an efficient algorithm for this problem that runs in time $\mathrm{poly}(k,\log(n),1/ε)$. For $k \leq \tilde{O}(\log n)$, the number of samples used by our algorithm is very close (within an $\tilde{O}(\log(1/ε))$ factor) to being information-theoretically optimal. Prior to this work computationally efficient algorithms were known only for the cases $k=0,1$ \cite{Birge:87b,Birge:97}. A novel feature of our approach is that our learning algorithm crucially uses a new algorithm for \emph{property testing of probability distributions} as a key subroutine. The learning algorithm uses the property tester to efficiently decompose the $k$-modal distribution into $k$ (near-)monotone distributions, which are easier to learn.
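The notion of a $k$-modal histogram in this abstract can be illustrated with a short sketch that counts the "peaks" and "valleys" of an explicitly given histogram, that is, the number of times it switches between increasing and decreasing. This only checks the definition on known probabilities; it is not the sample-based learner or tester of the paper, and the name `count_modes` is hypothetical.

```python
def count_modes(hist):
    """Number of direction changes (peaks plus valleys) in a histogram."""
    changes = 0
    direction = 0  # +1 while increasing, -1 while decreasing, 0 before any move
    for a, b in zip(hist, hist[1:]):
        if b == a:
            continue  # flat stretches do not change the direction
        step = 1 if b > a else -1
        if direction != 0 and step != direction:
            changes += 1
        direction = step
    return changes


# Hypothetical examples: monotone (k=0), unimodal (k=1), and 2-modal histograms.
monotone_modes = count_modes([0.1, 0.2, 0.3, 0.4])
unimodal_modes = count_modes([0.1, 0.5, 0.4])
bimodal_modes = count_modes([0.1, 0.4, 0.2, 0.3])
```

A histogram is $k$-modal in the abstract's sense when this count is at most $k$.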

Submitted 14 September, 2014; v1 submitted 13 July, 2011; originally announced July 2011.

Comments: 28 pages, full version of SODA'12 paper, to appear in Theory of Computing

arXiv:1106.0519 [pdf, other]

Extreme-Value Theorems for Optimal Multidimensional Pricing

Authors: Yang Cai, Constantinos Daskalakis

Abstract: We provide a near-optimal, computationally efficient algorithm for the unit-demand pricing problem, where a seller wants to price n items to optimize revenue against a unit-demand buyer whose values for the items are independently drawn from known distributions. For any chosen accuracy eps>0 and item values bounded in [0,1], our algorithm achieves revenue that is optimal up to an additive error of at most eps, in polynomial time. For values sampled from Monotone Hazard Rate (MHR) distributions, we achieve a (1-eps)-fraction of the optimal revenue in polynomial time, while for values sampled from regular distributions the same revenue guarantees are achieved in quasi-polynomial time. Our algorithm for bounded distributions applies probabilistic techniques to understand the statistical properties of revenue distributions, obtaining a reduction in the search space of the algorithm via dynamic programming. Adapting this approach to MHR and regular distributions requires the proof of novel extreme value theorems for such distributions. As a byproduct, our techniques establish structural properties of approximately-optimal and near-optimal solutions. We show that, for values independently distributed according to MHR distributions, pricing all items at the same price achieves a constant fraction of the optimal revenue. Moreover, for all eps >0, g(1/eps) distinct prices suffice to obtain a (1-eps)-fraction of the optimal revenue, where g(1/eps) is quadratic in 1/eps and independent of n. Similarly, for all eps>0 and n>0, at most g(1/(eps log n)) distinct prices suffice if the values are independently distributed according to regular distributions, where g() is a polynomial function. Finally, when the values are i.i.d. from some MHR distribution, we show that, if n is a sufficiently large function of 1/eps, a single price suffices to achieve a (1-eps)-fraction of the optimal revenue.

Submitted 26 October, 2014; v1 submitted 2 June, 2011; originally announced June 2011.

Comments: 58 pages, 2 figures, appeared in FOCS 2011 and accepted to Games and Economic Behavior

arXiv:1103.0598 [pdf, ps, other]

Learning transformed product distributions

Authors: Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio

Abstract: We consider the problem of learning an unknown product distribution $X$ over $\{0,1\}^n$ using samples $f(X)$ where $f$ is a \emph{known} transformation function. Each choice of a transformation function $f$ specifies a learning problem in this framework. Information-theoretic arguments show that for every transformation function $f$ the corresponding learning problem can be solved to accuracy $\eps$, using $\tilde{O}(n/\eps^2)$ examples, by a generic algorithm whose running time may be exponential in $n.$ We show that this learning problem can be computationally intractable even for constant $\eps$ and rather simple transformation functions. Moreover, the above sample complexity bound is nearly optimal for the general problem, as we give a simple explicit linear transformation function $f(x)=w \cdot x$ with integer weights $w_i \leq n$ and prove that the corresponding learning problem requires $Ω(n)$ samples. As our main positive result we give a highly efficient algorithm for learning a sum of independent unknown Bernoulli random variables, corresponding to the transformation function $f(x)= \sum_{i=1}^n x_i$. Our algorithm learns to $\eps$-accuracy in poly$(n)$ time, using a surprising poly$(1/\eps)$ number of samples that is independent of $n.$ We also give an efficient algorithm that uses $\log n \cdot \poly(1/\eps)$ samples but has running time that is only $\poly(\log n, 1/\eps).$

Submitted 2 March, 2011; originally announced March 2011.

arXiv:1102.2280 [pdf, ps, other]

On Oblivious PTAS's for Nash Equilibrium

Authors: Constantinos Daskalakis, Christos H. Papadimitriou

Abstract: If a game has a Nash equilibrium with probability values that are either zero or Omega(1) then this equilibrium can be found exhaustively in polynomial time. Somewhat surprisingly, we show that there is a PTAS for the games whose equilibria are guaranteed to have small-O(1/n)-values, and therefore large-Omega(n)-supports. We also point out that there is a PTAS for games with sparse payoff matrices, which are known to be PPAD-complete to solve exactly. Both algorithms are of a special kind that we call oblivious: The algorithm just samples a fixed distribution on pairs of mixed strategies, and the game is only used to determine whether the sampled strategies comprise an eps-Nash equilibrium; the answer is yes with inverse polynomial probability. These results bring about the question: Is there an oblivious PTAS for Nash equilibrium in general games? We answer this question in the negative; our lower bound comes close to the quasi-polynomial upper bound of [Lipton, Markakis, Mehta 2003]. Another recent PTAS for anonymous games is also oblivious in a weaker sense appropriate for this class of games (it samples from a fixed distribution on unordered collections of mixed strategies), but its runtime is exponential in 1/eps. We prove that any oblivious PTAS for anonymous games with two strategies and three player types must have 1/eps^c in the exponent of the running time for some c>1/3, rendering the algorithm in [Daskalakis 2008] essentially optimal within oblivious algorithms. In contrast, we devise a poly(n) (1/eps)^O(log^2(1/eps)) non-oblivious PTAS for anonymous games with 2 strategies and any bounded number of player types. Our algorithm is based on the construction of a sparse (and efficiently computable) eps-cover of the set of all possible sums of n independent indicators, under the total variation distance. The size of the cover is poly(n) (1/eps)^{O(log^2 (1/eps))}.
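In an oblivious algorithm as described in this abstract, the game is consulted only to check whether a sampled pair of mixed strategies is an eps-Nash equilibrium. The sketch below shows one standard way to perform that check for a two-player game given by payoff matrices; it assumes the additive notion of approximation (no player can gain more than eps by deviating), and the name `is_eps_nash` is hypothetical. The sampling distribution itself is not reproduced here.

```python
def is_eps_nash(A, B, x, y, eps):
    """Check whether mixed strategies (x, y) form an eps-Nash equilibrium.

    A[i][j] and B[i][j] are the row and column player's payoffs when the row
    player plays pure strategy i and the column player plays pure strategy j.
    """
    n_rows, n_cols = len(x), len(y)
    # Expected payoff of each pure strategy against the opponent's mixture.
    row_vals = [sum(A[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows)]
    col_vals = [sum(B[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols)]
    # Expected payoff of the mixed strategies actually played.
    row_payoff = sum(x[i] * row_vals[i] for i in range(n_rows))
    col_payoff = sum(y[j] * col_vals[j] for j in range(n_cols))
    # Neither player may gain more than eps from the best pure deviation.
    return (max(row_vals) - row_payoff <= eps
            and max(col_vals) - col_payoff <= eps)


# Hypothetical example: matching pennies, whose only equilibrium is uniform play.
pennies_A = [[1, -1], [-1, 1]]
pennies_B = [[-1, 1], [1, -1]]
uniform_ok = is_eps_nash(pennies_A, pennies_B, [0.5, 0.5], [0.5, 0.5], 0.0)
pure_ok = is_eps_nash(pennies_A, pennies_B, [1.0, 0.0], [1.0, 0.0], 0.5)
```

Checking pure deviations suffices because a best response to a fixed mixed strategy can always be taken to be pure.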

Submitted 10 February, 2011; originally announced February 2011.

Comments: extended version of paper of the same title that appeared in STOC 2009

ACM Class: F.2.0

Journal ref: STOC 2009

arXiv:0912.2577 [pdf, ps, other]

Global Alignment of Molecular Sequences via Ancestral State Reconstruction

Authors: Alexandr Andoni, Constantinos Daskalakis, Avinatan Hassidim, Sebastien Roch

Abstract: Molecular phylogenetic techniques do not generally account for such common evolutionary events as site insertions and deletions (known as indels). Instead tree building algorithms and ancestral state inference procedures typically rely on substitution-only models of sequence evolution. In practice these methods are extended beyond this simplified setting with the use of heuristics that produce global alignments of the input sequences--an important problem which has no rigorous model-based solution. In this paper we consider a new version of the multiple sequence alignment in the context of stochastic indel models. More precisely, we introduce the following {\em trace reconstruction problem on a tree} (TRPT): a binary sequence is broadcast through a tree channel where we allow substitutions, deletions, and insertions; we seek to reconstruct the original sequence from the sequences received at the leaves of the tree. We give a recursive procedure for this problem with strong reconstruction guarantees at low mutation rates, providing also an alignment of the sequences at the leaves of the tree. The TRPT problem without indels has been studied in previous work (Mossel 2004, Daskalakis et al. 2006) as a bootstrapping step towards obtaining optimal phylogenetic reconstruction methods. The present work sets up a framework for extending these works to evolutionary models with indels.

Submitted 14 December, 2009; originally announced December 2009.

arXiv:0812.2277 [pdf, ps, other]

An Efficient PTAS for Two-Strategy Anonymous Games

Authors: Constantinos Daskalakis

Abstract: We present a novel polynomial time approximation scheme for two-strategy anonymous games, in which the players' utility functions, although potentially different, do not differentiate among the identities of the other players. Our algorithm computes an $eps$-approximate Nash equilibrium of an $n$-player 2-strategy anonymous game in time $poly(n) (1/eps)^{O(1/eps^2)}$, which significantly improves upon the running time $n^{O(1/eps^2)}$ required by the algorithm of Daskalakis & Papadimitriou, 2007. The improved running time is based on a new structural understanding of approximate Nash equilibria: We show that, for any $eps$, there exists an $eps$-approximate Nash equilibrium in which either only $O(1/eps^3)$ players randomize, or all players who randomize use the same mixed strategy. To show this result we employ tools from the literature on Stein's Method.

Submitted 11 December, 2008; originally announced December 2008.

arXiv:0808.2801 [pdf, ps, other]

doi 10.1109/FOCS.2008.84

Discretized Multinomial Distributions and Nash Equilibria in Anonymous Games

Authors: Constantinos Daskalakis, Christos H. Papadimitriou

Abstract: We show that there is a polynomial-time approximation scheme for computing Nash equilibria in anonymous games with any fixed number of strategies (a very broad and important class of games), extending the two-strategy result of Daskalakis and Papadimitriou 2007. The approximation guarantee follows from a probabilistic result of more general interest: The distribution of the sum of n independent unit vectors with values ranging over {e1, e2, ...,ek}, where ei is the unit vector along dimension i of the k-dimensional Euclidean space, can be approximated by the distribution of the sum of another set of independent unit vectors whose probabilities of obtaining each value are multiples of 1/z for some integer z, and so that the variational distance of the two distributions is at most eps, where eps is bounded by an inverse polynomial in z and a function of k, but with no dependence on n. Our probabilistic result specifies the construction of a surprisingly sparse eps-cover -- under the total variation distance -- of the set of distributions of sums of independent unit vectors, which is of interest in its own right.

Submitted 20 August, 2008; originally announced August 2008.

Comments: In the 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008

ACM Class: F.2.0

arXiv:0802.1604 [pdf, ps, other]

On the Complexity of Nash Equilibria of Action-Graph Games

Authors: Constantinos Daskalakis, Grant Schoenebeck, Gregory Valiant, Paul Valiant

Abstract: We consider the problem of computing Nash Equilibria of action-graph games (AGGs). AGGs, introduced by Bhat and Leyton-Brown, are a succinct representation of games that encapsulates both "local" dependencies as in graphical games, and partial indifference to other agents' identities as in anonymous games, which occur in many natural settings. This is achieved by specifying a graph on the set of actions, so that the payoff of an agent for selecting a strategy depends only on the number of agents playing each of the neighboring strategies in the action graph. We present a Polynomial Time Approximation Scheme for computing mixed Nash equilibria of AGGs with constant treewidth and a constant number of agent types (and an arbitrary number of strategies), together with hardness results for the cases when either the treewidth or the number of agent types is unconstrained. In particular, we show that even if the action graph is a tree, but the number of agent-types is unconstrained, it is NP-complete to decide the existence of a pure-strategy Nash equilibrium and PPAD-complete to compute a mixed Nash equilibrium (even an approximate one); similarly for symmetric AGGs (all agents belong to a single type), if we allow arbitrary treewidth. These hardness results suggest that, in some sense, our PTAS is as strong of a positive result as one can expect.

Submitted 12 February, 2008; originally announced February 2008.

arXiv:0801.4190 [pdf, other]

Phylogenies without Branch Bounds: Contracting the Short, Pruning the Deep

Authors: Constantinos Daskalakis, Elchanan Mossel, Sebastien Roch

Abstract: We introduce a new phylogenetic reconstruction algorithm which, unlike most previous rigorous inference techniques, does not rely on assumptions regarding the branch lengths or the depth of the tree. The algorithm returns a forest which is guaranteed to contain all edges that are: 1) sufficiently long and 2) sufficiently close to the leaves. How much of the true tree is recovered depends on the sequence length provided. The algorithm is distance-based and runs in polynomial time.

Submitted 27 July, 2009; v1 submitted 28 January, 2008; originally announced January 2008.

arXiv:0710.5582 [pdf, ps, other]

Computing Equilibria in Anonymous Games

Authors: Constantinos Daskalakis, Christos Papadimitriou

Abstract: We present efficient approximation algorithms for finding Nash equilibria in anonymous games, that is, games in which the players' utilities, though different, do not differentiate between other players. Our results pertain to such games with many players but few strategies. We show that any such game has an approximate pure Nash equilibrium, computable in polynomial time, with approximation O(s^2 L), where s is the number of strategies and L is the Lipschitz constant of the utilities. Finally, we show that there is a PTAS for finding an epsilon

Submitted 30 October, 2007; originally announced October 2007.
