-
Bandit-Feedback Online Multiclass Classification: Variants and Tradeoffs
Authors:
Yuval Filmus,
Steve Hanneke,
Idan Mehalel,
Shay Moran
Abstract:
Consider the domain of multiclass classification within the adversarial online setting. What is the price of relying on bandit feedback as opposed to full information? To what extent can an adaptive adversary amplify the loss compared to an oblivious one? To what extent can a randomized learner reduce the loss compared to a deterministic one? We study these questions in the mistake bound model and provide nearly tight answers.
We demonstrate that the optimal mistake bound under bandit feedback is at most $O(k)$ times higher than the optimal mistake bound in the full information case, where $k$ represents the number of labels. This bound is tight and provides an answer to an open question previously posed and studied by Daniely and Helbertal ['13] and by Long ['17, '20], who focused on deterministic learners.
Moreover, we present nearly optimal bounds of $\tilde{Θ}(k)$ on the gap between randomized and deterministic learners, as well as between adaptive and oblivious adversaries in the bandit feedback setting. This stands in contrast to the full information scenario, where adaptive and oblivious adversaries are equivalent, and the gap in mistake bounds between randomized and deterministic learners is a constant multiplicative factor of $2$.
In addition, our results imply that in some cases the optimal randomized mistake bound is approximately the square-root of its deterministic parallel. Previous results show that this is essentially the smallest it can get.
Submitted 12 February, 2024;
originally announced February 2024.
-
Bounded Simultaneous Messages
Authors:
Andrej Bogdanov,
Krishnamoorthy Dinesh,
Yuval Filmus,
Yuval Ishai,
Avi Kaplan,
Sruthi Sekar
Abstract:
We consider the following question of bounded simultaneous messages (BSM) protocols: Can computationally unbounded Alice and Bob evaluate a function $f(x,y)$ of their inputs by sending polynomial-size messages to a computationally bounded Carol? The special case where $f$ is the mod-2 inner-product function and Carol is bounded to AC$^0$ has been studied in previous works. The general question can be broadly motivated by applications in which distributed computation is more costly than local computation, including secure two-party computation.
In this work, we initiate a more systematic study of the BSM model, with different functions $f$ and computational bounds on Carol. In particular, we give evidence against the existence of BSM protocols with polynomial-size Carol for naturally distributed variants of NP-complete languages.
Submitted 21 December, 2023; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Generalized polymorphisms
Authors:
Gilad Chase,
Yuval Filmus
Abstract:
We find all functions $f_0,f_1,\dots,f_m\colon \{0,1\}^n \to \{0,1\}$ and $g_0,g_1,\dots,g_n\colon \{0,1\}^m \to \{0,1\}$ satisfying the following identity for all $n \times m$ matrices $(z_{ij}) \in \{0,1\}^{n \times m}$: \[ f_0(g_1(z_{11},\dots,z_{1m}),\dots,g_n(z_{n1},\dots,z_{nm})) =
g_0(f_1(z_{11},\dots,z_{n1}),\dots,f_m(z_{1m},\dots,z_{nm})). \] Our results generalize work of Dokow and Holzman (2010), which considered the case $g_0 = g_1 = \cdots = g_n$, and of Chase, Filmus, Minzer, Mossel and Saurabh (2022), which considered the case $g_0 \neq g_1 = \cdots = g_n$.
Submitted 19 November, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Sampling and Certifying Symmetric Functions
Authors:
Yuval Filmus,
Itai Leigh,
Artur Riazanov,
Dmitry Sokolov
Abstract:
A circuit $\mathcal{C}$ samples a distribution $\mathbf{X}$ with an error $ε$ if the statistical distance between the output of $\mathcal{C}$ on the uniform input and $\mathbf{X}$ is $ε$. We study the hardness of sampling a uniform distribution over the set of $n$-bit strings of Hamming weight $k$ denoted by $\mathbf{U}^n_k$ for _decision forests_, i.e. every output bit is computed as a decision tree of the inputs. For every $k$ there is an $O(\log n)$-depth decision forest sampling $\mathbf{U}^n_k$ with an inverse-polynomial error [Viola 2012, Czumaj 2015]. We show that for every $ε> 0$ there exists $τ$ such that for decision depth $τ\log (n/k) / \log \log (n/k)$, the error for sampling $\mathbf{U}_k^n$ is at least $1-ε$. Our result is based on the recent robust sunflower lemma [Alweiss, Lovett, Wu, Zhang 2021, Rao 2019].
Our second result is about matching a set of $n$-bit strings with the image of a $d$-_local_ circuit, i.e. such that each output bit depends on at most $d$ input bits. We study the set of all $n$-bit strings whose Hamming weight is at least $n/2$. We improve the previously known locality lower bound from $Ω(\log^* n)$ [Beyersdorff, Datta, Krebs, Mahajan, Scharfenberger-Fabian, Sreenivasaiah, Thomas and Vollmer, 2013] to $Ω(\sqrt{\log n})$, leaving only a quartic gap from the best upper bound of $O(\log^2 n)$.
Submitted 7 May, 2023;
originally announced May 2023.
-
Optimal Prediction Using Expert Advice and Randomized Littlestone Dimension
Authors:
Yuval Filmus,
Steve Hanneke,
Idan Mehalel,
Shay Moran
Abstract:
A classical result in online learning characterizes the optimal mistake bound achievable by deterministic learners using the Littlestone dimension (Littlestone '88). We prove an analogous result for randomized learners: we show that the optimal expected mistake bound in learning a class $\mathcal{H}$ equals its randomized Littlestone dimension, which is the largest $d$ for which there exists a tree shattered by $\mathcal{H}$ whose average depth is $2d$. We further study optimal mistake bounds in the agnostic case, as a function of the number of mistakes made by the best function in $\mathcal{H}$, denoted by $k$. We show that the optimal randomized mistake bound for learning a class with Littlestone dimension $d$ is $k + Θ(\sqrt{k d} + d )$. This also implies an optimal deterministic mistake bound of $2k + Θ(d) + O(\sqrt{k d})$, thus resolving an open question which was studied by Auer and Long ['99].
As an application of our theory, we revisit the classical problem of prediction using expert advice: about 30 years ago Cesa-Bianchi, Freund, Haussler, Helmbold, Schapire and Warmuth studied prediction using expert advice, provided that the best among the $n$ experts makes at most $k$ mistakes, and asked what are the optimal mistake bounds. Cesa-Bianchi, Freund, Helmbold, and Warmuth ['93, '96] provided a nearly optimal bound for deterministic learners, and left the randomized case as an open problem. We resolve this question by providing an optimal learning rule in the randomized case, and showing that its expected mistake bound equals half of the deterministic bound of Cesa-Bianchi et al. ['93,'96], up to negligible additive terms. In contrast with previous works by Abernethy, Langford, and Warmuth ['06], and by Brânzei and Peres ['19], our result applies to all pairs $n,k$.
Submitted 17 August, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Proving Unsatisfiability with Hitting Formulas
Authors:
Yuval Filmus,
Edward A. Hirsch,
Artur Riazanov,
Alexander Smal,
Marc Vinyals
Abstract:
Hitting formulas have been studied in many different contexts at least since [Iwama,89]. A hitting formula is a set of Boolean clauses such that any two of them cannot be simultaneously falsified. [Peitl,Szeider,05] conjectured that hitting formulas should contain the hardest formulas for resolution. They supported their conjecture with experimental findings. Using the fact that hitting formulas are easy to check for satisfiability we use them to build a static proof system Hitting: a refutation of a CNF in Hitting is an unsatisfiable hitting formula such that each of its clauses is a weakening of a clause of the refuted CNF. Comparing this system to resolution and other proof systems is equivalent to studying the hardness of hitting formulas.
We show that tree-like resolution and Hitting are quasi-polynomially separated. We prove that Hitting is quasi-polynomially simulated by tree-like resolution, thus hitting formulas cannot be exponentially hard for resolution, so Peitl-Szeider's conjecture is partially refuted. Nevertheless Hitting is surprisingly difficult to polynomially simulate. Using the ideas of PIT for noncommutative circuits [Raz-Shpilka,05] we show that Hitting is simulated by Extended Frege. As a byproduct, we show that a number of static (semi)algebraic systems are verifiable in deterministic polynomial time.
We consider multiple extensions of Hitting. Hitting(+) formulas are conjunctions of clauses containing affine equations instead of just literals, and every assignment falsifies at most one clause. The resulting system is related to the Res(+) proof system, for which no superpolynomial lower bounds are known: Hitting(+) simulates the tree-like version of Res(+) and is at least quasi-polynomially stronger. We show an exponential lower bound for Hitting(+).
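The abstract notes that hitting formulas are easy to check for satisfiability. One standard way to see this is by counting: a clause with $w$ distinct literals over $n$ variables is falsified by exactly $2^{n-w}$ assignments, and in a hitting formula these sets are pairwise disjoint, so the formula is unsatisfiable exactly when the counts sum to $2^n$. The Python sketch below illustrates this under assumptions that are not from the paper: clauses are represented as dictionaries mapping a variable to its polarity, and the function names are hypothetical.

```python
from itertools import combinations

# Assumed representation: a clause is a dict {variable: polarity},
# where polarity True stands for the positive literal.

def is_hitting(clauses):
    """True if every two clauses clash, i.e. share a variable with opposite polarities."""
    return all(
        any(v in d and d[v] != p for v, p in c.items())
        for c, d in combinations(clauses, 2)
    )

def hitting_is_unsat(clauses, num_vars):
    """Decide unsatisfiability of a hitting formula by counting falsifying assignments."""
    if not is_hitting(clauses):
        raise ValueError("not a hitting formula")
    # Each clause of width w is falsified by 2^(n - w) assignments, and no
    # assignment falsifies two clauses, so the counts can simply be added.
    falsified = sum(2 ** (num_vars - len(c)) for c in clauses)
    return falsified == 2 ** num_vars

# All four width-2 clauses over x, y: an unsatisfiable hitting formula.
full = [{"x": a, "y": b} for a in (True, False) for b in (True, False)]
# (x) and (not x or y): hitting, but satisfied by x = y = True.
partial = [{"x": True}, {"x": False, "y": True}]
```

Here `hitting_is_unsat(full, 2)` holds while `hitting_is_unsat(partial, 2)` does not; both checks run in time polynomial in the formula size.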
Submitted 13 February, 2023;
originally announced February 2023.
-
A Resilient Distributed Boosting Algorithm
Authors:
Yuval Filmus,
Idan Mehalel,
Shay Moran
Abstract:
Given a learning task where the data is distributed among several parties, communication is one of the fundamental resources which the parties would like to minimize. We present a distributed boosting algorithm which is resilient to a limited amount of noise. Our algorithm is similar to classical boosting algorithms, although it is equipped with a new component, inspired by Impagliazzo's hard-core lemma [Impagliazzo95], adding a robustness quality to the algorithm. We also complement this result by showing that resilience to any asymptotically larger noise is not achievable by a communication-efficient algorithm.
Submitted 13 June, 2022; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Junta threshold for low degree Boolean functions on the slice
Authors:
Yuval Filmus
Abstract:
We show that a Boolean degree $d$ function on the slice $\binom{[n]}{k}$ is a junta if $k \geq 2d$, and that this bound is sharp. We prove a similar result for $A$-valued degree $d$ functions for arbitrary finite $A$, and for functions on an infinite analog of the slice.
Submitted 14 March, 2022; v1 submitted 9 March, 2022;
originally announced March 2022.
-
Simple Algebraic Proofs of Uniqueness for Erdős-Ko-Rado Theorems
Authors:
Yuval Filmus,
Nathan Lindzey
Abstract:
We give simpler algebraic proofs of uniqueness for several Erdős-Ko-Rado results, i.e., that the canonically intersecting families are the only largest intersecting families. Using these techniques, we characterize the largest partially 2-intersecting families of perfect hypermatchings, resolving a recent conjecture of Meagher, Shirazi, and Stevens.
Submitted 8 January, 2022;
originally announced January 2022.
-
Boolean functions on $S_n$ which are nearly linear
Authors:
Yuval Filmus
Abstract:
We show that if $f\colon S_n \to \{0,1\}$ is $ε$-close to linear in $L_2$ and $\mathbb{E}[f] \leq 1/2$ then $f$ is $O(ε)$-close to a union of "mostly disjoint" cosets, and moreover this is sharp: any such union is close to linear. This constitutes a sharp Friedgut-Kalai-Naor theorem for the symmetric group.
Using similar techniques, we show that if $f\colon S_n \to \mathbb{R}$ is linear, $\Pr[f \notin \{0,1\}] \leq ε$, and $\Pr[f = 1] \leq 1/2$, then $f$ is $O(ε)$-close to a union of mostly disjoint cosets, and this is also sharp; and that if $f\colon S_n \to \mathbb{R}$ is linear and $ε$-close to $\{0,1\}$ in $L_\infty$ then $f$ is $O(ε)$-close in $L_\infty$ to a union of disjoint cosets.
Submitted 10 December, 2021; v1 submitted 16 July, 2021;
originally announced July 2021.
-
Optimal sets of questions for Twenty Questions
Authors:
Yuval Filmus,
Idan Mehalel
Abstract:
In the distributional Twenty Questions game, Bob chooses a number $x$ from $1$ to $n$ according to a distribution $μ$, and Alice (who knows $μ$) attempts to identify $x$ using Yes/No questions, which Bob answers truthfully. Her goal is to minimize the expected number of questions.
The optimal strategy for the Twenty Questions game corresponds to a Huffman code for $μ$, yet this strategy could potentially use all $2^n$ possible questions. Dagan et al. constructed a set of $1.25^{n+o(n)}$ questions which suffice to construct an optimal strategy for all $μ$, and showed that this number is optimal (up to sub-exponential factors) for infinitely many $n$.
We determine the optimal size of such a set of questions for all $n$ (up to sub-exponential factors), answering an open question of Dagan et al. In addition, we generalize the results of Dagan et al. to the $d$-ary setting, obtaining similar results with $1.25$ replaced by $1 + (d-1)/d^{d/(d-1)}$.
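As a small illustration of the benchmark that Alice is trying to match, the following Python sketch computes the expected number of questions of a Huffman strategy for a given $μ$, using the fact that the expected codeword length equals the sum of the weights created by the merges. It also evaluates the base $1 + (d-1)/d^{d/(d-1)}$ quoted above, which equals $1.25$ for $d = 2$. The function names are hypothetical, and the sketch does not construct the restricted question sets studied in the paper.

```python
import heapq

def huffman_expected_questions(mu):
    """Expected number of Yes/No questions of a Huffman strategy for distribution mu."""
    if len(mu) < 2:
        return 0.0  # a single possible value needs no questions
    heap = list(mu)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        # Merge the two lightest subtrees; every element below the new node
        # pays for one more question, which adds the merged weight.
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total

def question_set_base(d):
    """The base 1 + (d-1)/d^(d/(d-1)) from the d-ary bound stated in the abstract."""
    return 1 + (d - 1) / d ** (d / (d - 1))
```

For example, `huffman_expected_questions([0.5, 0.25, 0.25])` gives 1.5 and `question_set_base(2)` gives 1.25.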
Submitted 19 March, 2024; v1 submitted 3 June, 2021;
originally announced June 2021.
-
Approximate polymorphisms
Authors:
Gilad Chase,
Yuval Filmus,
Dor Minzer,
Elchanan Mossel,
Nitin Saurabh
Abstract:
For a function $g\colon\{0,1\}^m\to\{0,1\}$, a function $f\colon \{0,1\}^n\to\{0,1\}$ is called a $g$-polymorphism if their actions commute: $f(g(\mathsf{row}_1(Z)),\ldots,g(\mathsf{row}_n(Z))) = g(f(\mathsf{col}_1(Z)),\ldots,f(\mathsf{col}_m(Z)))$ for all $Z\in\{0,1\}^{n\times m}$. The function $f$ is called an approximate polymorphism if this equality holds with probability close to $1$, when $Z$ is sampled uniformly.
We study the structure of exact polymorphisms as well as approximate polymorphisms. Our results include:
- We prove that an approximate polymorphism $f$ must be close to an exact polymorphism;
- We give a characterization of exact polymorphisms, showing that besides trivial cases, only the functions $g = \mathsf{AND}, \mathsf{XOR}, \mathsf{OR}, \mathsf{NXOR}$ admit non-trivial exact polymorphisms.
We also study the approximate polymorphism problem in the list-decoding regime (i.e., when the probability that the equality holds is not close to $1$, but is bounded away from some value). We show that if $f(x \land y) = f(x) \land f(y)$ with probability larger than $s_\land \approx 0.815$ then $f$ correlates with some low-degree character, and $s_\land$ is the optimal threshold for this property.
Our results generalize the classical linearity testing result of Blum, Luby and Rubinfeld, which in this language showed that the approximate polymorphisms of $g = \mathsf{XOR}$ are close to XORs, as well as a recent result of Filmus, Lifshitz, Minzer and Mossel, showing that the approximate polymorphisms of AND can only be close to AND functions.
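The definition of a $g$-polymorphism can be checked directly for small $n$ and $m$ by enumerating all matrices $Z$. The Python sketch below does this by brute force; the function name and the convention that $f$ and $g$ take tuples of bits are assumptions made for illustration, and the enumeration is exponential in $nm$, so it is only meant for toy sizes.

```python
from itertools import product

def is_polymorphism(f, n, g, m):
    """Check f(g(row_1),...,g(row_n)) == g(f(col_1),...,f(col_m)) for all n x m 0/1 matrices."""
    for bits in product((0, 1), repeat=n * m):
        rows = [bits[i * m:(i + 1) * m] for i in range(n)]
        cols = list(zip(*rows))
        lhs = f(tuple(g(row) for row in rows))   # apply g to rows, then f
        rhs = g(tuple(f(col) for col in cols))   # apply f to columns, then g
        if lhs != rhs:
            return False
    return True

def xor(bits):
    return sum(bits) % 2

def conj(bits):
    return int(all(bits))

def majority(bits):
    return int(2 * sum(bits) > len(bits))
```

With these helpers, XOR commutes with XOR and AND commutes with AND, whereas majority on three bits is not a polymorphism of AND on two bits.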
Submitted 20 June, 2021; v1 submitted 31 May, 2021;
originally announced June 2021.
-
Revisiting the Complexity Analysis of Conflict-Based Search: New Computational Techniques and Improved Bounds
Authors:
Ofir Gordon,
Yuval Filmus,
Oren Salzman
Abstract:
The problem of Multi-Agent Path Finding (MAPF) calls for finding a set of conflict-free paths for a fleet of agents operating in a given environment. Arguably, the state-of-the-art approach to computing optimal solutions is Conflict-Based Search (CBS). In this work we revisit the complexity analysis of CBS to provide tighter bounds on the algorithm's run-time in the worst-case. Our analysis paves the way to better pinpoint the parameters that govern (in the worst case) the algorithm's computational complexity.
Our analysis is based on two complementary approaches: In the first approach we bound the run-time using the size of a Multi-valued Decision Diagram (MDD) -- a layered graph which compactly contains all possible single-agent paths between two given vertices for a specific path length.
In the second approach we express the running time by a novel recurrence relation which bounds the algorithm's complexity. We use generating functions-based analysis in order to tightly bound the recurrence.
Using these techniques we provide several new upper bounds on CBS's complexity. The results allow us to improve the existing bound on the running time of CBS in many cases. For example, on a set of common benchmarks we improve the upper bound by a factor of at least $2^{10^{7}}$.
Submitted 18 April, 2021;
originally announced April 2021.
-
Shrinkage under Random Projections, and Cubic Formula Lower Bounds for $\mathbf{AC}^0$
Authors:
Yuval Filmus,
Or Meir,
Avishay Tal
Abstract:
Håstad showed that any De Morgan formula (composed of AND, OR and NOT gates) shrinks by a factor of $O(p^{2})$ under a random restriction that leaves each variable alive independently with probability $p$ [SICOMP, 1998]. Using this result, he gave an $\widetilde{Ω}(n^{3})$ formula size lower bound for the Andreev function, which, up to lower order improvements, remains the state-of-the-art lower bound for any explicit function. In this work, we extend the shrinkage result of Håstad to hold under a far wider family of random restrictions and their generalization -- random projections. Based on our shrinkage results, we obtain an $\widetilde{Ω}(n^{3})$ formula size lower bound for an explicit function computed in $\mathbf{AC}^0$. This improves upon the best known formula size lower bounds for $\mathbf{AC}^0$, that were only quadratic prior to our work. In addition, we prove that the KRW conjecture [Karchmer et al., Computational Complexity 5(3/4), 1995] holds for inner functions for which the unweighted quantum adversary bound is tight. In particular, this holds for inner functions with a tight Khrapchenko bound. Our random projections are tailor-made to the function's structure so that the function maintains structure even under projection -- using such projections is necessary, as standard random restrictions simplify $\mathbf{AC}^0$ circuits. In contrast, we show that any De Morgan formula shrinks by a quadratic factor under our random projections, allowing us to prove the cubic lower bound. Our proof techniques build on the proof of Håstad for the simpler case of balanced formulas. This allows for a significantly simpler proof at the cost of slightly worse parameters. As such, when specialized to the case of $p$-random restrictions, our proof can be used as an exposition of Håstad's result.
Submitted 29 December, 2020; v1 submitted 3 December, 2020;
originally announced December 2020.
-
Complexity Measures on the Symmetric Group and Beyond
Authors:
Neta Dafni,
Yuval Filmus,
Noam Lifshitz,
Nathan Lindzey,
Marc Vinyals
Abstract:
We extend the definitions of complexity measures of functions to domains such as the symmetric group. The complexity measures we consider include degree, approximate degree, decision tree complexity, sensitivity, block sensitivity, and a few others. We show that these complexity measures are polynomially related for the symmetric group and for many other domains.
To show that all measures but sensitivity are polynomially related, we generalize classical arguments of Nisan and others. To add sensitivity to the mix, we reduce to Huang's sensitivity theorem using "pseudo-characters", which witness the degree of a function.
Using similar ideas, we extend the characterization of Boolean degree 1 functions on the symmetric group due to Ellis, Friedgut and Pilpel to the perfect matching scheme. As another application of our ideas, we simplify the characterization of maximum-size $t$-intersecting families in the symmetric group and the perfect matching scheme.
Submitted 14 October, 2020;
originally announced October 2020.
-
Hypercontractivity on the symmetric group
Authors:
Yuval Filmus,
Guy Kindler,
Noam Lifshitz,
Dor Minzer
Abstract:
The hypercontractive inequality is a fundamental result in analysis, with many applications throughout discrete mathematics, theoretical computer science, combinatorics and more. So far, variants of this inequality have been proved mainly for product spaces, which raises the question of whether analogous results hold over non-product domains.
We consider the symmetric group, $S_n$, one of the most basic non-product domains, and establish hypercontractive inequalities on it. Our inequalities are most effective for the class of \emph{global functions} on $S_n$, which are functions whose $2$-norm remains small when restricting $O(1)$ coordinates of the input, and assert that low-degree, global functions have small $q$-norms, for $q>2$.
As applications, we show:
1. An analog of the level-$d$ inequality on the hypercube, asserting that the mass of a global function on low-degrees is very small. We also show how to use this inequality to bound the size of global, product-free sets in the alternating group $A_n$.
2. Isoperimetric inequalities on the transposition Cayley graph of $S_n$ for global functions, that are analogous to the KKL theorem and to the small-set expansion property in the Boolean hypercube.
3. Hypercontractive inequalities on the multi-slice, and stability versions of the Kruskal--Katona Theorem in some regimes of parameters.
Submitted 27 October, 2020; v1 submitted 11 September, 2020;
originally announced September 2020.
-
Explicit SoS lower bounds from high-dimensional expanders
Authors:
Irit Dinur,
Yuval Filmus,
Prahladh Harsha,
Madhur Tulsiani
Abstract:
We construct an explicit family of 3XOR instances which is hard for $O(\sqrt{\log n})$ levels of the Sum-of-Squares hierarchy. In contrast to earlier constructions, which involve a random component, our systems can be constructed explicitly in deterministic polynomial time.
Our construction is based on the high-dimensional expanders devised by Lubotzky, Samuels and Vishne, known as LSV complexes or Ramanujan complexes, and our analysis is based on two notions of expansion for these complexes: cosystolic expansion, and a local isoperimetric inequality due to Gromov.
Our construction offers an interesting contrast to the recent work of Alev, Jeronimo and the last author (FOCS 2019). They showed that 3XOR instances in which the variables correspond to vertices in a high-dimensional expander are easy to solve. In contrast, in our instances the variables correspond to the edges of the complex.
Submitted 10 September, 2020;
originally announced September 2020.
-
MaxSAT Resolution and Subcube Sums
Authors:
Yuval Filmus,
Meena Mahajan,
Gaurav Sood,
Marc Vinyals
Abstract:
We study the MaxRes rule in the context of certifying unsatisfiability. We show that it can be exponentially more powerful than tree-like resolution, and when augmented with weakening (the system MaxResW), p-simulates tree-like resolution. In devising a lower bound technique specific to MaxRes (and not merely inheriting lower bounds from Res), we define a new proof system called the SubCubeSums proof system. This system, which p-simulates MaxResW, can be viewed as a special case of the semialgebraic Sherali-Adams proof system. In expressivity, it is the integral restriction of conical juntas studied in the contexts of communication complexity and extension complexity. We show that it is not simulated by Res. Using a proof technique qualitatively different from the lower bounds that MaxResW inherits from Res, we show that Tseitin contradictions on expander graphs are hard to refute in SubCubeSums. We also establish a lower bound technique via lifting: for formulas requiring large degree in SubCubeSums, their XOR-ification requires large size in SubCubeSums.
Submitted 22 October, 2022; v1 submitted 23 May, 2020;
originally announced May 2020.
-
Asymptotic performance of the Grimmett-McDiarmid heuristic
Authors:
Yuval Filmus
Abstract:
Grimmett and McDiarmid suggested a simple heuristic for finding stable sets in random graphs. They showed that the heuristic finds a stable set of size $\sim\log_2 n$ (with high probability) on a $G(n, 1/2)$ random graph. We determine the asymptotic distribution of the size of the stable set found by the algorithm.
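The abstract does not spell out the heuristic. The Python sketch below assumes it is the usual greedy scan (go through the vertices in a fixed order and keep a vertex whenever it has no neighbour among those already kept), run on a sampled $G(n, p)$ graph. Both function names are hypothetical, and the sketch only illustrates the procedure; it says nothing about the asymptotic distribution determined in the paper.

```python
import random

def sample_gnp(n, p, rng):
    """Sample a G(n, p) random graph as a list of neighbour sets."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def greedy_stable_set(adj):
    """Assumed form of the heuristic: scan vertices in order, keep those with no kept neighbour."""
    stable = []
    for v in range(len(adj)):
        if all(u not in adj[v] for u in stable):
            stable.append(v)
    return stable

# Example run on G(200, 1/2) with a fixed seed for reproducibility.
graph = sample_gnp(200, 0.5, random.Random(0))
found = greedy_stable_set(graph)
```

The set `found` is stable by construction, and its size can be compared with $\log_2 n$ over repeated samples.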
Submitted 10 December, 2019;
originally announced December 2019.
-
AND Testing and Robust Judgement Aggregation
Authors:
Yuval Filmus,
Noam Lifshitz,
Dor Minzer,
Elchanan Mossel
Abstract:
A function $f\colon\{0,1\}^n\to \{0,1\}$ is called an approximate AND-homomorphism if choosing ${\bf x},{\bf y}\in\{0,1\}^n$ randomly, we have that $f({\bf x}\land {\bf y}) = f({\bf x})\land f({\bf y})$ with probability at least $1-ε$, where $x\land y = (x_1\land y_1,\ldots,x_n\land y_n)$. We prove that if $f\colon \{0,1\}^n \to \{0,1\}$ is an approximate AND-homomorphism, then $f$ is $δ$-close to either a constant function or an AND function, where $δ(ε) \to 0$ as $ε\to0$. This improves on a result of Nehama, who proved a similar statement in which $δ$ depends on $n$.
Our theorem implies a strong result on judgement aggregation in computational social choice. In the language of social choice, our result shows that if $f$ is $ε$-close to satisfying judgement aggregation, then it is $δ(ε)$-close to an oligarchy (the name for the AND function in social choice theory). This improves on Nehama's result, in which $δ$ decays polynomially with $n$.
Our result follows from a more general one, in which we characterize approximate solutions to the eigenvalue equation $\mathrm T f = λg$, where $\mathrm T$ is the downwards noise operator $\mathrm T f(x) = \mathbb{E}_{\bf y}[f(x \land {\bf y})]$, $f$ is $[0,1]$-valued, and $g$ is $\{0,1\}$-valued. We identify all exact solutions to this equation, and show that any approximate solution in which $\mathrm T f$ and $λg$ are close is close to an exact solution.
Submitted 31 October, 2019;
originally announced November 2019.
-
Query-to-Communication Lifting Using Low-Discrepancy Gadgets
Authors:
Arkadev Chattopadhyay,
Yuval Filmus,
Sajin Koroth,
Or Meir,
Toniann Pitassi
Abstract:
Lifting theorems are theorems that relate the query complexity of a function $f:\{0,1\}^{n}\to\{0,1\}$ to the communication complexity of the composed function $f \circ g^{n}$, for some "gadget" $g:\{0,1\}^{b}\times\{0,1\}^{b}\to\{0,1\}$. Such theorems allow transferring lower bounds from query complexity to the communication complexity, and have seen numerous applications in the recent years. In addition, such theorems can be viewed as a strong generalization of a direct-sum theorem for the gadget $g$.
We prove a new lifting theorem that works for all gadgets $g$ that have logarithmic length and exponentially-small discrepancy, for both deterministic and randomized communication complexity. Thus, we significantly increase the range of gadgets for which such lifting theorems hold.
Our result has two main motivations: First, allowing a larger variety of gadgets may support more applications. In particular, our work is the first to prove a randomized lifting theorem for logarithmic-size gadgets, thus improving some applications of the theorem. Second, our result can be seen as a strong generalization of a direct-sum theorem for functions with low discrepancy.
Submitted 5 October, 2021; v1 submitted 30 April, 2019;
originally announced April 2019.
-
Biasing Boolean Functions and Collective Coin-Flipping Protocols over Arbitrary Product Distributions
Authors:
Yuval Filmus,
Lianna Hambardzumyan,
Hamed Hatami,
Pooya Hatami,
David Zuckerman
Abstract:
The seminal result of Kahn, Kalai and Linial shows that a coalition of $O(\frac{n}{\log n})$ players can bias the outcome of any Boolean function $\{0,1\}^n \to \{0,1\}$ with respect to the uniform measure. We extend their result to arbitrary product measures on $\{0,1\}^n$, by combining their argument with a completely different argument that handles very biased coordinates.
We view this result as a step towards proving a conjecture of Friedgut, which states that Boolean functions on the continuous cube $[0,1]^n$ (or, equivalently, on $\{1,\dots,n\}^n$) can be biased using coalitions of $o(n)$ players. This is the first step taken in this direction since Friedgut proposed the conjecture in 2004.
Russell, Saks and Zuckerman extended the result of Kahn, Kalai and Linial to multi-round protocols, showing that when the number of rounds is $o(\log^* n)$, a coalition of $o(n)$ players can bias the outcome with respect to the uniform measure. We extend this result as well to arbitrary product measures on $\{0,1\}^n$.
The argument of Russell et al. relies on the fact that a coalition of $o(n)$ players can boost the expectation of any Boolean function from $ε$ to $1-ε$ with respect to the uniform measure. This fails for general product distributions, as the example of the AND function with respect to $μ_{1-1/n}$ shows. Instead, we use a novel boosting argument alongside a generalization of our first result to arbitrary finite ranges.
Submitted 20 February, 2019;
originally announced February 2019.
-
Tight Approximation for Unconstrained XOS Maximization
Authors:
Yuval Filmus,
Yasushi Kawase,
Yusuke Kobayashi,
Yutaro Yamaguchi
Abstract:
A set function is called XOS if it can be represented by the maximum of additive functions. When such a representation is fixed, the number of additive functions required to define the XOS function is called the width.
In this paper, we study the problem of maximizing XOS functions in the value oracle model. The problem is trivial for the XOS functions of width $1$ because they are just additive, but it is already nontrivial even when the width is restricted to $2$. We show two types of tight bounds on the polynomial-time approximability for this problem. First, in general, the approximation bound is between $O(n)$ and $Ω(n / \log n)$, and exactly $Θ(n / \log n)$ if randomization is allowed, where $n$ is the ground set size. Second, when the width of the input XOS functions is bounded by a constant $k \geq 2$, the approximation bound is between $k - 1$ and $k - 1 - ε$ for any $ε> 0$. In particular, we give a linear-time algorithm to find an exact maximizer of a given XOS function of width $2$, while we show that any exact algorithm requires an exponential number of value oracle calls even when the width is restricted to $3$.
Submitted 7 July, 2020; v1 submitted 22 November, 2018;
originally announced November 2018.
-
The entropy of lies: playing twenty questions with a liar
Authors:
Yuval Dagan,
Yuval Filmus,
Daniel Kane,
Shay Moran
Abstract:
`Twenty questions' is a guessing game played by two players: Bob thinks of an integer between $1$ and $n$, and Alice's goal is to recover it using a minimal number of Yes/No questions. Shannon's entropy has a natural interpretation in this context. It characterizes the average number of questions used by an optimal strategy in the distributional variant of the game: let $μ$ be a distribution over $[n]$, then the average number of questions used by an optimal strategy that recovers $x\sim μ$ is between $H(μ)$ and $H(μ)+1$. We consider an extension of this game where at most $k$ questions can be answered falsely. We extend the classical result by showing that an optimal strategy uses roughly $H(μ) + k H_2(μ)$ questions, where $H_2(μ) = \sum_x μ(x)\log\log\frac{1}{μ(x)}$. This also generalizes a result by Rivest et al. for the uniform distribution. Moreover, we design near optimal strategies that only use comparison queries of the form `$x \leq c$?' for $c\in[n]$. The usage of comparison queries lends itself naturally to the context of sorting, where we derive sorting algorithms in the presence of adversarial noise.
Submitted 6 November, 2018;
originally announced November 2018.
-
A log-Sobolev inequality for the multislice, with applications
Authors:
Yuval Filmus,
Ryan O'Donnell,
Xinyu Wu
Abstract:
Let $κ\in \mathbb{N}_+^\ell$ satisfy $κ_1 + \dots + κ_\ell = n$ and let $\mathcal{U}_κ$ denote the "multislice" of all strings $u$ in $[\ell]^n$ having exactly $κ_i$ coordinates equal to $i$, for all $i \in [\ell]$. Consider the Markov chain on $\mathcal{U}_κ$, where a step is a random transposition of two coordinates of $u$. We show that the log-Sobolev constant $ρ_κ$ for the chain satisfies $$(ρ_κ)^{-1} \leq n \sum_{i=1}^{\ell} \tfrac{1}{2} \log_2(4n/κ_i),$$ which is sharp up to constants whenever $\ell$ is constant. From this, we derive some consequences for small-set expansion and isoperimetry in the multislice, including a KKL Theorem, a Kruskal--Katona Theorem for the multislice, a Friedgut Junta Theorem, and a Nisan--Szegedy Theorem.
Submitted 10 September, 2018;
originally announced September 2018.
-
FKN theorem for the multislice, with applications
Authors:
Yuval Filmus
Abstract:
The Friedgut-Kalai-Naor (FKN) theorem states that if $f$ is a Boolean function on the Boolean cube which is close to degree 1, then $f$ is close to a dictator, a function depending on a single coordinate. The author has extended the theorem to the slice, the subset of the Boolean cube consisting of all vectors with fixed Hamming weight. We extend the theorem further, to the multislice, a multicoloured version of the slice.
As an application, we prove a stability version of the edge-isoperimetric inequality for settings of parameters in which the optimal set is a dictator.
Submitted 9 September, 2018;
originally announced September 2018.
-
Online Submodular Maximization: Beating 1/2 Made Simple
Authors:
Niv Buchbinder,
Moran Feldman,
Yuval Filmus,
Mohit Garg
Abstract:
The Submodular Welfare Maximization problem (SWM) captures an important subclass of combinatorial auctions and has been studied extensively from both computational and economic perspectives. In particular, it has been studied in a natural online setting in which items arrive one-by-one and should be allocated irrevocably upon arrival. In this setting, it is well known that the greedy algorithm achieves a competitive ratio of 1/2, and recently Kapralov et al. (2013) showed that this ratio is optimal for the problem. Surprisingly, despite this impossibility result, Korula et al. (2015) were able to show that the same algorithm is 0.5052-competitive when the items arrive in a uniformly random order, but unfortunately, their proof is very long and involved. In this work, we present an (arguably) much simpler analysis that provides a slightly better guarantee of 0.5096-competitiveness for the greedy algorithm in the random-arrival model. Moreover, this analysis applies also to a generalization of online SWM in which the sets defining a (simple) partition matroid arrive online in a uniformly random order, and we would like to maximize a monotone submodular function subject to this matroid. Furthermore, for this more general problem, we prove an upper bound of 0.576 on the competitive ratio of the greedy algorithm, ruling out the possibility that the competitiveness of this natural algorithm matches the optimal offline approximation ratio of 1-1/e.
Submitted 19 November, 2018; v1 submitted 15 July, 2018;
originally announced July 2018.
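Illustrative note on the abstract above: a minimal sketch of the greedy rule for online welfare maximization, in which each arriving item goes irrevocably to the bidder whose value increases the most. The function `greedy_welfare`, the representation of valuations as Python callables on frozensets, and the capped valuation in the usage example are all assumptions made for illustration; they are not taken from the paper.

```python
def greedy_welfare(items, valuations):
    """Assign each arriving item to the bidder with the largest marginal gain.

    `valuations` is a list of functions frozenset -> number, assumed (not
    checked) to be monotone and submodular. Ties go to the lowest index.
    """
    bundles = [frozenset() for _ in valuations]
    for item in items:
        gains = [v(b | {item}) - v(b) for v, b in zip(valuations, bundles)]
        best = max(range(len(valuations)), key=gains.__getitem__)
        bundles[best] = bundles[best] | {item}
    welfare = sum(v(b) for v, b in zip(valuations, bundles))
    return bundles, welfare


if __name__ == "__main__":
    # Hypothetical valuation: each bidder only values having at least one item.
    capped = lambda bundle: min(len(bundle), 1)
    print(greedy_welfare(["a", "b"], [capped, capped]))
```

In the random-arrival model discussed in the abstract, `items` would be a uniformly random permutation of the item set.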
-
Boolean functions on high-dimensional expanders
Authors:
Yotam Dikstein,
Irit Dinur,
Yuval Filmus,
Prahladh Harsha
Abstract:
We initiate the study of Boolean function analysis on high-dimensional expanders. We give a random-walk based definition of high-dimensional expansion, which coincides with the earlier definition in terms of two-sided link expanders. Using this definition, we describe an analog of the Fourier expansion and the Fourier levels of the Boolean hypercube for simplicial complexes. Our analog is a decomposition into approximate eigenspaces of random walks associated with the simplicial complexes. Our random-walk definition and the decomposition have the additional advantage that they extend to the more general setting of posets, encompassing both high-dimensional expanders and the Grassmann poset, which appears in recent work on the unique games conjecture.
We then use this decomposition to extend the Friedgut-Kalai-Naor theorem to high-dimensional expanders. Our results demonstrate that a constant-degree high-dimensional expander can sometimes serve as a sparse model for the Boolean slice or hypercube, and quite possibly additional results from Boolean function analysis can be carried over to this sparse model. Therefore, this model can be viewed as a derandomization of the Boolean slice, containing only $|X(k-1)|=O(n)$ points in contrast to $\binom{n}{k}$ points in the $(k)$-slice (which consists of all $n$-bit strings with exactly $k$ ones).
Submitted 17 January, 2024; v1 submitted 22 April, 2018;
originally announced April 2018.
-
Boolean constant degree functions on the slice are juntas
Authors:
Yuval Filmus,
Ferdinand Ihringer
Abstract:
We show that a Boolean degree $d$ function on the slice $\binom{[n]}{k} = \{ (x_1,\ldots,x_n) \in \{0,1\}^n : \sum_{i=1}^n x_i = k \}$ is a junta, assuming that $k,n-k$ are large enough. This generalizes a classical result of Nisan and Szegedy on the hypercube. Moreover, we show that the maximum number of coordinates that a Boolean degree $d$ function can depend on is the same on the slice and the hypercube.
Submitted 22 January, 2018; v1 submitted 19 January, 2018;
originally announced January 2018.
-
Sparse juntas on the biased hypercube
Authors:
Irit Dinur,
Yuval Filmus,
Prahladh Harsha
Abstract:
We give a structure theorem for Boolean functions on the $p$-biased hypercube which are $ε$-close to degree~$d$ in $L_2$, showing that they are close to \emph{sparse juntas}. Our structure theorem implies that such functions are $O(ε^{C_d} + p)$-close to constant functions. We pinpoint the exact value of the constant $C_d$.
We also give an analogous result for monotone Boolean functions on the biased hypercube which are $ε$-close to degree~$d$ in $L_2$, showing that they are close to \emph{sparse DNFs}.
Our structure theorems are optimal in the following sense: for every $d,ε,p$, we identify a class $\mathcal{F}_{d,ε,p}$ of degree~$d$ sparse juntas which are $O(ε)$-close to Boolean (in the monotone case, width~$d$ sparse DNFs) such that a Boolean function on the $p$-biased hypercube is $O(ε)$-close to degree~$d$ in $L_2$ iff it is $O(ε)$-close to a function in $\mathcal{F}_{d,ε,p}$.
Submitted 23 March, 2024; v1 submitted 26 November, 2017;
originally announced November 2017.
-
Agreement tests on graphs and hypergraphs
Authors:
Irit Dinur,
Yuval Filmus,
Prahladh Harsha
Abstract:
Agreement tests are a generalization of low degree tests that capture a local-to-global phenomenon, which forms the combinatorial backbone of most PCP constructions. In an agreement test, a function is given by an ensemble of local restrictions. The agreement test checks that the restrictions agree when they overlap, and the main question is whether average agreement of the local pieces implies that there exists a global function that agrees with most local restrictions.
There are very few structures that support agreement tests, essentially either coming from algebraic low degree tests or from direct product tests (and recently also from high-dimensional expanders). In this work, we prove a new agreement theorem which extends direct product tests to higher dimensions, analogous to how low degree tests extend linearity testing. As a corollary of our main theorem, it follows that an ensemble of small graphs on overlapping sets of vertices can be glued together to one global graph assuming they agree with each other on average.
We prove the agreement theorem by (re)proving the agreement theorem for dimension 1, and then generalizing it to higher dimensions (with the dimension 1 case being the direct product test, and dimension 2 being the graph case). A key technical step in our proof is the reverse union bound, which allows us to treat dependent events as if they are disjoint, and may be of independent interest. An added benefit of the reverse union bound is that it can be used to show that the "majority decoded" function also serves as a global function that explains the local consistency of the agreement theorem, a fact that was not known even in the direct product setting (dimension 1) prior to our work.
Submitted 11 December, 2020; v1 submitted 26 November, 2017;
originally announced November 2017.
-
Information complexity of the AND function in the two-Party, and multiparty settings
Authors:
Yuval Filmus,
Hamed Hatami,
Yaqiao Li,
Suzin You
Abstract:
In a recent breakthrough paper [M. Braverman, A. Garg, D. Pankratov, and O. Weinstein, From information to exact communication, STOC'13] Braverman et al. developed a local characterization for the zero-error information complexity in the two party model, and used it to compute the exact internal and external information complexity of the 2-bit AND function, which was then applied to determine the exact asymptotic of randomized communication complexity of the set disjointness problem.
In this article, we extend their results on AND function to the multi-party number-in-hand model by proving that the generalization of their protocol has optimal internal and external information cost for certain distributions. Our proof has new components, and in particular it fixes some minor gaps in the proof of Braverman et al.
Submitted 22 March, 2017;
originally announced March 2017.
-
Trading information complexity for error
Authors:
Yuval Dagan,
Yuval Filmus,
Hamed Hatami,
Yaqiao Li
Abstract:
We consider the standard two-party communication model. The central problem studied in this article is how much one can save in information complexity by allowing an error of $ε$.
For arbitrary functions, we obtain lower bounds and upper bounds indicating a gain that is of order $Ω(h(ε))$ and $O(h(\sqrtε))$. Here $h$ denotes the binary entropy function. We analyze the case of the two-bit AND function in detail to show that for this function the gain is $Θ(h(ε))$. This answers a question of [M. Braverman, A. Garg, D. Pankratov, and O. Weinstein, From information to exact communication (extended abstract), STOC'13].
We obtain sharp bounds for the set disjointness function of order $n$. For the case of the distributional error, we introduce a new protocol that achieves a gain of $Θ(\sqrt{h(ε)})$ provided that $n$ is sufficiently large. We apply these results to answer another question of Braverman et al. regarding the randomized communication complexity of the set disjointness function.
Answering a question of [Mark Braverman, Interactive information complexity, STOC'12], we apply our analysis of the set disjointness function to establish a gap between the two different notions of the prior-free information cost. This implies that amortized randomized communication complexity is not necessarily equal to the amortized distributional communication complexity with respect to the hardest distribution.
Submitted 21 November, 2016;
originally announced November 2016.
-
Twenty (simple) questions
Authors:
Yuval Dagan,
Yuval Filmus,
Ariel Gabizon,
Shay Moran
Abstract:
A basic combinatorial interpretation of Shannon's entropy function is via the "20 questions" game. This cooperative game is played by two players, Alice and Bob: Alice picks a distribution $π$ over the numbers $\{1,\ldots,n\}$, and announces it to Bob. She then chooses a number $x$ according to $π$, and Bob attempts to identify $x$ using as few Yes/No queries as possible, on average.
An optimal strategy for the "20 questions" game is given by a Huffman code for $π$: Bob's questions reveal the codeword for $x$ bit by bit. This strategy finds $x$ using fewer than $H(π)+1$ questions on average. However, the questions asked by Bob could be arbitrary. In this paper, we investigate the following question: Are there restricted sets of questions that match the performance of Huffman codes, either exactly or approximately?
Our first main result shows that for every distribution $π$, Bob has a strategy that uses only questions of the form "$x < c$?" and "$x = c$?", and uncovers $x$ using at most $H(π)+1$ questions on average, matching the performance of Huffman codes in this sense. We also give a natural set of $O(rn^{1/r})$ questions that achieve a performance of at most $H(π)+r$, and show that $Ω(rn^{1/r})$ questions are required to achieve such a guarantee.
Our second main result gives a set $\mathcal{Q}$ of $1.25^{n+o(n)}$ questions such that for every distribution $π$, Bob can implement an optimal strategy for $π$ using only questions from $\mathcal{Q}$. We also show that $1.25^{n-o(n)}$ questions are needed, for infinitely many $n$. If we allow a small slack of $r$ over the optimal strategy, then roughly $(rn)^{Θ(1/r)}$ questions are necessary and sufficient.
Submitted 25 April, 2017; v1 submitted 5 November, 2016;
originally announced November 2016.
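Illustrative note on the abstract above: a minimal sketch that compares the entropy $H(π)$ with the expected number of questions of a Huffman strategy, which is at least $H(π)$ and less than $H(π)+1$. The function names are hypothetical. The sketch covers only the unrestricted Huffman baseline, not the restricted question sets studied in the paper.

```python
import heapq
from math import log2


def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * log2(p) for p in probs if p > 0)


def huffman_expected_length(probs):
    """Expected codeword length of a Huffman code for `probs`.

    Each merge of the two lightest subtrees adds one bit to every codeword
    below the merged node, so the expected length is the sum of the merged
    weights.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total


if __name__ == "__main__":
    pi = [0.4, 0.3, 0.2, 0.1]
    print(entropy(pi), huffman_expected_length(pi))
```

For a dyadic distribution such as (1/2, 1/4, 1/4) the two quantities coincide.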
-
Shapley Values in Weighted Voting Games with Random Weights
Authors:
Yuval Filmus,
Joel Oren,
Kannan Soundararajan
Abstract:
We investigate the distribution of the well-studied Shapley--Shubik values in weighted voting games where the agents are stochastically determined. The Shapley--Shubik value measures the voting power of an agent, in typical collective decision making systems. While easy to estimate empirically given the parameters of a weighted voting game, the Shapley values are notoriously hard to reason about analytically.
We propose a probabilistic approach in which the agent weights are drawn i.i.d. from some known exponentially decaying distribution. We provide a general closed-form characterization of the highest and lowest expected Shapley values in such a game, as a function of the parameters of the underlying distribution. To do so, we give a novel reinterpretation of the stochastic process that generates the Shapley variables as a renewal process. We demonstrate the use of our results on the uniform and exponential distributions. Furthermore, we show the strength of our theoretical predictions on several synthetic datasets.
Submitted 22 January, 2016;
originally announced January 2016.
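Illustrative note on the abstract above: a minimal sketch that computes Shapley--Shubik values of a small weighted voting game exactly, by enumerating all orderings of the agents and counting how often each agent is pivotal (the first agent whose weight brings the running total up to the quota). The function name and the example game are hypothetical. Enumeration is only feasible for a handful of agents, and the sketch does not reproduce the paper's probabilistic analysis.

```python
from fractions import Fraction
from itertools import permutations


def shapley_shubik(weights, quota):
    """Exact Shapley--Shubik values via enumeration of all agent orderings."""
    n = len(weights)
    pivots = [0] * n
    orderings = 0
    for order in permutations(range(n)):
        orderings += 1
        running = 0
        for player in order:
            running += weights[player]
            if running >= quota:
                # `player` turns the losing prefix into a winning coalition.
                pivots[player] += 1
                break
    return [Fraction(count, orderings) for count in pivots]


if __name__ == "__main__":
    # Hypothetical game: weights 2, 1, 1 with quota 3.
    print(shapley_shubik([2, 1, 1], 3))
```

In this example the heaviest agent is pivotal in four of the six orderings, so its value is 2/3 although it holds only half of the total weight.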
-
Fast Matrix Multiplication: Limitations of the Laser Method
Authors:
Andris Ambainis,
Yuval Filmus,
François Le Gall
Abstract:
Until a few years ago, the fastest known matrix multiplication algorithm, due to Coppersmith and Winograd (1990), ran in time $O(n^{2.3755})$. Recently, a surge of activity by Stothers, Vassilevska-Williams, and Le Gall has led to an improved algorithm running in time $O(n^{2.3729})$. These algorithms are obtained by analyzing higher and higher tensor powers of a certain identity of Coppersmith and Winograd. We show that this exact approach cannot result in an algorithm with running time $O(n^{2.3725})$, and identify a wide class of variants of this approach which cannot result in an algorithm with running time $O(n^{2.3078})$; in particular, this approach cannot prove the conjecture that for every $ε> 0$, two $n\times n$ matrices can be multiplied in time $O(n^{2+ε})$.
We describe a new framework extending the original laser method, which is the method underlying the previously mentioned algorithms. Our framework accommodates the algorithms by Coppersmith and Winograd, Stothers, Vassilevska-Williams and Le Gall. We obtain our main result by analyzing this framework. The framework is also the first to explain why taking tensor powers of the Coppersmith-Winograd identity results in faster algorithms.
Submitted 19 November, 2014;
originally announced November 2014.
-
From Small Space to Small Width in Resolution
Authors:
Yuval Filmus,
Massimo Lauria,
Mladen Mikša,
Jakob Nordström,
Marc Vinyals
Abstract:
In 2003, Atserias and Dalmau resolved a major open question about the resolution proof system by establishing that the space complexity of CNF formulas is always an upper bound on the width needed to refute them. Their proof is beautiful but somewhat mysterious in that it relies heavily on tools from finite model theory. We give an alternative, completely elementary proof that works by simple syntactic manipulations of resolution refutations. As a by-product, we develop a "black-box" technique for proving space lower bounds via a "static" complexity measure that works against any resolution refutation---previous techniques have been inherently adaptive. We conclude by showing that the related question for polynomial calculus (i.e., whether space is an upper bound on degree) seems unlikely to be resolvable by similar methods.
Submitted 10 September, 2014;
originally announced September 2014.
-
Power Distribution in Randomized Weighted Voting: the Effects of the Quota
Authors:
Joel Oren,
Yuval Filmus,
Yair Zick,
Yoram Bachrach
Abstract:
We study the Shapley value in weighted voting games. The Shapley value has been used as an index for measuring the power of individual agents in decision-making bodies and political organizations, where decisions are made by a majority vote process. We characterize the impact of changing the quota (i.e., the minimum number of seats in the parliament that are required to form a coalition) on the Shapley values of the agents. Contrary to previous studies, which assumed that the agent weights (corresponding to the size of a caucus or a political party) are fixed, we analyze new domains in which the weights are stochastically generated, modelling, for example, elections processes.
We examine a natural weight generation process: the Balls and Bins model, with uniform as well as exponentially decaying probabilities. We also analyze weights that admit a super-increasing sequence, answering several open questions pertaining to the Shapley values in such games.
Submitted 2 August, 2014;
originally announced August 2014.
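As a small illustration of the quota effect described in the abstract above, the following sketch computes Shapley values in a weighted voting game by brute-force enumeration of all orderings. It is not taken from the paper: the function name `shapley_values`, the example weights, and the convention that a coalition wins when its total weight is at least the quota are assumptions made for this example.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(weights, quota):
    """Shapley values of a weighted voting game, by enumerating all orderings.

    Assumption: a coalition wins when its total weight is >= quota.
    An agent is pivotal in an ordering if adding it turns the set of its
    predecessors from losing to winning. Exponential time, so only
    suitable for a handful of agents.
    """
    n = len(weights)
    pivot_counts = [0] * n
    orderings = 0
    for order in permutations(range(n)):
        orderings += 1
        running = 0
        for agent in order:
            running += weights[agent]
            if running >= quota:
                pivot_counts[agent] += 1
                break  # at most one pivotal agent per ordering
    return [Fraction(count, orderings) for count in pivot_counts]


if __name__ == "__main__":
    weights = [3, 2, 1]  # hypothetical seat counts
    for quota in (4, 5):
        print(quota, shapley_values(weights, quota))
```

With these weights, raising the quota from 4 to 5 moves the values from (2/3, 1/6, 1/6) to (1/2, 1/2, 0), so a one-seat change in the quota can remove all power from the smallest agent.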
-
On the sum of the L1 influences of bounded functions
Authors:
Yuval Filmus,
Hamed Hatami,
Nathan Keller,
Noam Lifshitz
Abstract:
Let $f\colon \{-1,1\}^n \to [-1,1]$ have degree $d$ as a multilinear polynomial. It is well-known that the total influence of $f$ is at most $d$. Aaronson and Ambainis asked whether the total $L_1$ influence of $f$ can also be bounded as a function of $d$. Bačkurs and Bavarian answered this question in the affirmative, providing a bound of $O(d^3)$ for general functions and $O(d^2)$ for homogeneous functions. We improve on their results by providing a bound of $d^2$ for general functions and $O(d\log d)$ for homogeneous functions. In addition, we prove a bound of $d/(2\pi)+o(d)$ for monotone functions, and provide a matching example.
Submitted 28 March, 2015; v1 submitted 13 April, 2014;
originally announced April 2014.
-
A SageTeX Hypermatrix Algebra Package
Authors:
Edinah K. Gnang,
Ori Parzanchevski,
Yuval Filmus
Abstract:
We describe here a rudimentary Sage implementation of the Bhattacharya-Mesner hypermatrix algebra package.
Submitted 11 March, 2014;
originally announced March 2014.
-
Universal codes of the natural numbers
Authors:
Yuval Filmus
Abstract:
A code of the natural numbers is a uniquely-decodable binary code of the natural numbers with non-decreasing codeword lengths, which satisfies Kraft's inequality tightly. We define a natural partial order on the set of codes, and show how to construct effectively a code better than a given sequence of codes, in a certain precise sense. As an application, we prove that the existence of a scale of codes (a well-ordered set of codes which contains a code better than any given code) is independent of ZFC.
Submitted 27 August, 2013; v1 submitted 7 August, 2013;
originally announced August 2013.
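To make the definition in the abstract above concrete, the sketch below uses the Elias gamma code as one familiar example of a binary code with non-decreasing codeword lengths whose Kraft sum tends to 1. It is an illustration only, not a construction from the paper; the helper names and the choice to number the naturals from 1 are assumptions.

```python
from fractions import Fraction


def elias_gamma(n):
    """Elias gamma codeword for n >= 1: (len - 1) zeros, then n in binary."""
    if n < 1:
        raise ValueError("Elias gamma is defined here for n >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def kraft_partial_sum(limit):
    """Exact sum of 2^(-codeword length) over n = 1..limit."""
    return sum(Fraction(1, 2 ** len(elias_gamma(n))) for n in range(1, limit + 1))


if __name__ == "__main__":
    for m in (1, 2, 5, 10):
        # Up to n = 2^m - 1 the partial sum equals 1 - 2^(-m), so it tends to 1.
        print(m, kraft_partial_sum(2 ** m - 1))
```

The partial sums approach 1 without exceeding it, which is what satisfying Kraft's inequality tightly means for an infinite code.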
-
The Complexity of the Comparator Circuit Value Problem
Authors:
Stephen A. Cook,
Yuval Filmus,
Dai Tri Man Le
Abstract:
In 1990 Subramanian defined the complexity class CC as the set of problems log-space reducible to the comparator circuit value problem (CCV). He and Mayr showed that $\mathrm{NL} \subseteq \mathrm{CC} \subseteq \mathrm{P}$, and proved that in addition to CCV several other problems are complete for CC, including the stable marriage problem, and finding the lexicographically first maximal matching in a bipartite graph. We are interested in CC because we conjecture that it is incomparable with the parallel class NC which also satisfies $\mathrm{NL} \subseteq \mathrm{NC} \subseteq \mathrm{P}$, and note that this conjecture implies that none of the CC-complete problems has an efficient polylog time parallel algorithm. We provide evidence for our conjecture by giving oracle settings in which relativized CC and relativized NC are incomparable.
We give several alternative definitions of CC, including (among others) the class of problems computed by uniform polynomial-size families of comparator circuits supplied with copies of the input and its negation, the class of problems $\mathrm{AC}^0$-reducible to CCV, and the class of problems computed by uniform $\mathrm{AC}^0$ circuits with CCV gates. We also give a machine model for CC, which corresponds to its characterization as log-space uniform polynomial-size families of comparator circuits. These various characterizations show that CC is a robust class. The main technical tool we employ is universal comparator circuits.
Other results include a simpler proof of $\mathrm{NL} \subseteq \mathrm{CC}$, and an explanation of the relation between the Gale-Shapley algorithm and Subramanian's algorithm for stable marriage.
This paper continues the previous work of Cook, Lê and Ye which focused on Cook-Nguyen style uniform proof complexity, answering several open questions raised in that paper.
Submitted 25 July, 2013; v1 submitted 13 August, 2012;
originally announced August 2012.
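For readers unfamiliar with the comparator circuit value problem mentioned in the abstract above, here is a minimal evaluator sketch. It assumes the usual picture of a comparator circuit as a fixed set of wires carrying bits, with each gate acting on two wires and replacing their values by the AND (minimum) and OR (maximum) of the pair. The gate encoding as `(min_wire, max_wire)` tuples and the function name are choices made for this example, not notation from the paper.

```python
def evaluate_comparator_circuit(inputs, gates, output_wire):
    """Evaluate a comparator circuit on 0/1 inputs.

    inputs:      list of bits, one per wire.
    gates:       sequence of (min_wire, max_wire) index pairs, applied in order.
                 After a gate, min_wire holds the AND of the two values and
                 max_wire holds the OR (an assumed encoding for this sketch).
    output_wire: index of the wire whose final value is the answer.
    """
    wires = list(inputs)
    for min_wire, max_wire in gates:
        a, b = wires[min_wire], wires[max_wire]
        wires[min_wire], wires[max_wire] = min(a, b), max(a, b)
    return wires[output_wire]


if __name__ == "__main__":
    # Two gates on three wires; the last wire ends up holding the OR of all inputs.
    gates = [(0, 1), (1, 2)]
    print(evaluate_comparator_circuit([1, 0, 0], gates, output_wire=2))
```

Each gate has exactly two outputs and no fan-out, which is what separates comparator circuits from general Boolean circuits.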
-
A Tight Combinatorial Algorithm for Submodular Maximization Subject to a Matroid Constraint
Authors:
Yuval Filmus,
Justin Ward
Abstract:
We present an optimal, combinatorial $1-1/e$ approximation algorithm for monotone submodular optimization over a matroid constraint. Compared to the continuous greedy algorithm (Calinescu, Chekuri, Pal and Vondrak, 2008), our algorithm is extremely simple and requires no rounding. It consists of the greedy algorithm followed by local search. Both phases are run not on the actual objective function, but on a related non-oblivious potential function, which is also monotone submodular. Our algorithm runs in randomized time $O(n^8 u)$, where $n$ is the rank of the given matroid and $u$ is the size of its ground set. We additionally obtain a $(1-1/e-\varepsilon)$ approximation algorithm running in randomized time $O(\varepsilon^{-3} n^4 u)$. For matroids in which $n = o(u)$, this improves on the runtime of the continuous greedy algorithm. The improvement is due primarily to the time required by the pipage rounding phase, which we avoid altogether. Furthermore, the independence of our algorithm from pipage rounding techniques suggests that our general approach may be helpful in contexts such as monotone submodular maximization subject to multiple matroid constraints.
Our approach generalizes to the case where the monotone submodular function has restricted curvature. For any curvature $c$, we adapt our algorithm to produce a $(1-e^{-c})/c$ approximation. This result complements results of Vondrak (2008), who has shown that the continuous greedy algorithm produces a $(1-e^{-c})/c$ approximation when the objective function has curvature $c$. He has also proved that achieving any better approximation ratio is impossible in the value oracle model.
Submitted 19 November, 2013; v1 submitted 19 April, 2012;
originally announced April 2012.
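The abstract above describes a greedy phase followed by local search, both run on a non-oblivious potential function. The sketch below shows only the plain greedy phase on the actual objective, which is a deliberately simpler stand-in: plain greedy is known to guarantee only a 1/2 approximation over general matroids, and the potential function and local search of the paper are not reproduced here. The function names, the independence-oracle interface, and the coverage example are all assumptions for illustration.

```python
def greedy_matroid(ground_set, value, is_independent):
    """Plain greedy for a monotone set function under an independence oracle.

    value(list_of_elements) -> number (assumed monotone submodular).
    is_independent(list_of_elements) -> bool (assumed to describe a matroid).
    Repeatedly adds the feasible element with the largest positive marginal
    gain; ties go to the earliest element in ground_set.
    """
    chosen = []
    remaining = list(ground_set)
    while True:
        best, best_gain = None, 0
        base_value = value(chosen)
        for element in remaining:
            candidate = chosen + [element]
            if is_independent(candidate):
                gain = value(candidate) - base_value
                if gain > best_gain:
                    best, best_gain = element, gain
        if best is None:
            return chosen
        chosen.append(best)
        remaining.remove(best)


if __name__ == "__main__":
    # Hypothetical coverage instance with a uniform matroid of rank 2.
    sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1}}
    coverage = lambda chosen: len(set().union(*(sets[e] for e in chosen)))
    at_most_two = lambda chosen: len(chosen) <= 2
    print(greedy_matroid(sets, coverage, at_most_two))
```

Swapping `value` for a different monotone submodular function leaves the loop unchanged, which is the sense in which the greedy phase can be run on a potential function rather than the objective itself.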