
Density of growth-rates of subgroups of a free group and the non-backtracking spectrum of the configuration model

Authors: Michail Louvaris, Daniel T. Wise, Gal Yehuda

Abstract: We prove the set of growth-rates of subgroups of a rank~$r$ free group is dense in $[1,2r-1]$. Our main technical contribution is a concentration result for the leading eigenvalue of the non-backtracking matrix in the configuration model.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2311.14082 [pdf, other]

Geometric Covering using Random Fields

Authors: Felipe Goncalves, Daniel Keren, Amit Shahar, Gal Yehuda

Abstract: A set of vectors $S \subseteq \mathbb{R}^d$ is $(k_1,\varepsilon)$-clusterable if there are $k_1$ balls of radius $\varepsilon$ that cover $S$. A set of vectors $S \subseteq \mathbb{R}^d$ is $(k_2,δ)$-far from being clusterable if there are at least $k_2$ vectors in $S$, with all pairwise distances at least $δ$. We propose a probabilistic algorithm to distinguish between these two cases. Our algorithm reaches a decision by only looking at the extreme values of a scalar valued hash function, defined by a random field, on $S$; hence, it is especially suitable in distributed and online settings. An important feature of our method is that the algorithm is oblivious to the number of vectors: in the online setting, for example, the algorithm stores only a constant number of scalars, which is independent of the stream length. We introduce random field hash functions, which are a key ingredient in our paradigm. Random field hash functions generalize locality-sensitive hashing (LSH). In addition to the LSH requirement that ``nearby vectors are hashed to similar values", our hash function also guarantees that the ``hash values are (nearly) independent random variables for distant vectors". We formulate necessary conditions for the kernels which define the random fields applied to our problem, as well as a measure of kernel optimality, for which we provide a bound. Then, we propose a method to construct kernels which approximate the optimal one.

Submitted 23 November, 2023; originally announced November 2023.

Comments: 27 pages, 15 figures

arXiv:2308.04412 [pdf, other]

Probabilistic Invariant Learning with Randomized Linear Classifiers

Authors: Leonardo Cotta, Gal Yehuda, Assaf Schuster, Chris J. Maddison

Abstract: Designing models that are both expressive and preserve known invariances of tasks is an increasingly hard problem. Existing solutions trade off invariance for computational or memory resources. In this work, we show how to leverage randomness and design models that are both expressive and invariant but use fewer resources. Inspired by randomized algorithms, our key insight is that accepting probabilistic notions of universal approximation and invariance can reduce our resource requirements. More specifically, we propose a class of binary classification models called Randomized Linear Classifiers (RLCs). We give parameter and sample size conditions in which RLCs can, with high probability, approximate any (smooth) function while preserving invariance to compact group transformations. Leveraging this result, we design three RLCs that are provably probabilistic invariant for classification tasks over sets, graphs, and spherical data. We show how these models can achieve probabilistic invariance and universality using fewer resources than (deterministic) neural networks and their invariant counterparts. Finally, we empirically demonstrate the benefits of this new class of models on invariant tasks where deterministic invariant neural networks are known to struggle.

Submitted 27 September, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

arXiv:2206.09182 [pdf, other]

Coin Flipping Neural Networks

Authors: Yuval Sieradzki, Nitzan Hodos, Gal Yehuda, Assaf Schuster

Abstract: We show that neural networks with access to randomness can outperform deterministic networks by using amplification. We call such networks Coin-Flipping Neural Networks, or CFNNs. We show that a CFNN can approximate the indicator of a $d$-dimensional ball to arbitrary accuracy with only 2 layers and $\mathcal{O}(1)$ neurons, where a 2-layer deterministic network was shown to require $Ω(e^d)$ neurons, an exponential improvement (arXiv:1610.09887). We prove a highly non-trivial result, that for almost any classification problem, there exists a trivially simple network that solves it given a sufficiently powerful generator for the network's weights. Combining these results we conjecture that for most classification problems, there is a CFNN which solves them with higher accuracy or fewer neurons than any deterministic network. Finally, we verify our proofs experimentally using novel CFNN architectures on CIFAR10 and CIFAR100, reaching an improvement of 9.25\% from the baseline.

Submitted 22 June, 2022; v1 submitted 18 June, 2022; originally announced June 2022.

arXiv:2105.13615 [pdf, other]

A lower bound for essential covers of the cube

Authors: Gal Yehuda, Amir Yehudayoff

Abstract: Essential covers were introduced by Linial and Radhakrishnan as a model that captures two complementary properties: (1) all variables must be included and (2) no element is redundant. In their seminal paper, they proved that every essential cover of the $n$-dimensional hypercube must be of size at least $Ω(n^{0.5})$. Later on, this notion found several applications in complexity theory. We improve the lower bound to $Ω(n^{0.52})$, and describe two applications.

Submitted 28 May, 2021; originally announced May 2021.

Comments: 10 pages, 1 figure

MSC Class: 05D99 ACM Class: G.2.1; F.2.2

arXiv:2102.05536 [pdf, ps, other]

Slicing the hypercube is not easy

Authors: Gal Yehuda, Amir Yehudayoff

Abstract: We prove that at least $Ω(n^{0.51})$ hyperplanes are needed to slice all edges of the $n$-dimensional hypercube. We provide a couple of applications: lower bounds on the computational complexity of parity, and a lower bound on the cover number of the hypercube by skew hyperplanes.

Submitted 17 February, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

Comments: 20 pages

MSC Class: 05D05; 52C35 ACM Class: F.2; G.2.1

arXiv:2002.09398 [pdf, other]

It's Not What Machines Can Learn, It's What We Cannot Teach

Authors: Gal Yehuda, Moshe Gabel, Assaf Schuster

Abstract: Can deep neural networks learn to solve any task, and in particular problems of high complexity? This question attracts a lot of interest, with recent works tackling computationally hard tasks such as the traveling salesman problem and satisfiability. In this work we offer a different perspective on this question. Given the common assumption that $\textit{NP} \neq \textit{coNP}$ we prove that any polynomial-time sample generator for an $\textit{NP}$-hard problem samples, in fact, from an easier sub-problem. We empirically explore a case study, Conjunctive Query Containment, and show how common data generation techniques generate biased datasets that lead practitioners to over-estimate model accuracy. Our results suggest that machine learning approaches that require training on a dense uniform sampling from the target distribution cannot be used to solve computationally hard problems, the reason being the difficulty of generating sufficiently large and unbiased training sets.

Submitted 28 June, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

Comments: Accepted to ICML 2020

arXiv:1604.02557 [pdf, ps, other]

The Complexity of Computing (Almost) Unitary Matrices With $\eps$-Copies of the Fourier Transform

Authors: Nir Ailon, Gal Yehuda

Abstract: The complexity of computing the Fourier transform is a longstanding open problem. Very recently, Ailon (2013, 2014, 2015) showed in a collection of papers that, roughly speaking, a speedup of the Fourier transform computation implies numerical ill-condition. The papers also quantify this tradeoff. The main method for proving these results is via a potential function called quasi-entropy, reminiscent of Shannon entropy. The quasi-entropy method opens new doors to understanding the computational complexity of the important Fourier transformation. However, it suffers from various obvious limitations. This paper, motivated by one such limitation, partly overcomes it, while at the same time shedding light on new and interesting problems at the intersection of computational complexity and group theory. The paper also explains why this research direction, if fruitful, has a chance of solving much bigger questions about the complexity of the Fourier transform.

Submitted 17 April, 2019; v1 submitted 9 April, 2016; originally announced April 2016.

Showing 1–8 of 8 results for author: Yehuda, G