-
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?
Authors:
Rylan Schaeffer,
Hailey Schoelkopf,
Brando Miranda,
Gabriel Mukobi,
Varun Madan,
Adam Ibrahim,
Herbie Bradley,
Stella Biderman,
Sanmi Koyejo
Abstract:
Predictable behavior from scaling advanced AI systems is an extremely desirable property. Although a well-established literature exists on how pretraining performance scales, the literature on how particular downstream capabilities scale is significantly muddier. In this work, we take a step back and ask: why has predicting specific downstream capabilities with scale remained elusive? While many factors are certainly responsible, we identify a new factor that makes modeling scaling behavior on widely used multiple-choice question-answering benchmarks challenging. Using five model families and twelve well-established multiple-choice benchmarks, we show that downstream performance is computed from negative log likelihoods via a sequence of transformations that progressively degrade the statistical relationship between performance and scale. We then reveal the mechanism causing this degradation: downstream metrics require comparing the correct choice against a small number of specific incorrect choices, meaning accurately predicting downstream capabilities requires predicting not just how probability mass concentrates on the correct choice with scale, but also how probability mass fluctuates on specific incorrect choices with scale. We empirically study how probability mass on the correct choice co-varies with probability mass on incorrect choices with increasing compute, suggesting that scaling laws for incorrect choices might be achievable. Our work also explains why pretraining scaling laws are commonly regarded as more predictable than downstream capabilities and contributes towards establishing scaling-predictable evaluations of frontier AI models.
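As a rough illustration of the mechanism described in this abstract, the sketch below (not the authors' code) derives a per-question accuracy from per-choice negative log likelihoods. The function names and all NLL values are hypothetical, chosen only to show how the score depends on specific incorrect choices as well as on the correct one.

```python
import math

def choice_probs(nlls):
    """Convert per-choice NLLs into probabilities renormalized over the listed choices."""
    weights = [math.exp(-nll) for nll in nlls]
    total = sum(weights)
    return [w / total for w in weights]

def is_correct(nlls, correct_index):
    """Return 1 if the correct choice has the lowest NLL among the choices, else 0."""
    best = min(range(len(nlls)), key=lambda i: nlls[i])
    return int(best == correct_index)

# Hypothetical NLLs for one four-choice question; index 0 is the correct choice.
CORRECT = 0
small = [2.0, 1.5, 3.0, 2.5]
medium = [1.0, 0.8, 3.2, 2.9]  # correct choice improves, but so does one distractor
large = [0.6, 1.2, 3.5, 3.0]

for name, nlls in [("small", small), ("medium", medium), ("large", large)]:
    p_correct = choice_probs(nlls)[CORRECT]
    print(name, nlls[CORRECT], round(p_correct, 3), is_correct(nlls, CORRECT))
```

In this toy data the NLL of the correct choice falls from the small to the medium setting while accuracy stays at 0, because one specific incorrect choice improves too.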
Submitted 6 June, 2024;
originally announced June 2024.
-
Representation Projection Invariance Mitigates Representation Collapse
Authors:
Anastasia Razdaibiedina,
Ashish Khetan,
Zohar Karnin,
Daniel Khashabi,
Vishaal Kapoor,
Vivek Madan
Abstract:
Fine-tuning contextualized representations learned by pre-trained language models remains a prevalent practice in NLP. However, fine-tuning can lead to representation degradation (also known as representation collapse), which may result in instability, sub-optimal performance, and weak generalization.
In this paper, we propose Representation Projection Invariance (REPINA), a novel regularization method to maintain the information content of representation and reduce representation collapse during fine-tuning by discouraging undesirable changes in the representations. We study the empirical behavior of the proposed regularization in comparison to 5 comparable baselines across 13 language understanding tasks (GLUE benchmark and six additional datasets). When evaluating in-domain performance, REPINA consistently outperforms other baselines on most tasks (10 out of 13). We also demonstrate its effectiveness in few-shot settings and robustness to label perturbation. As a by-product, we extend previous studies of representation collapse and propose several metrics to quantify it. Our empirical findings show that our approach is significantly more effective at mitigating representation collapse.
Submitted 21 November, 2023; v1 submitted 23 May, 2022;
originally announced May 2022.
-
Improving Early Sepsis Prediction with Multi Modal Learning
Authors:
Fred Qin,
Vivek Madan,
Ujjwal Ratan,
Zohar Karnin,
Vishaal Kapoor,
Parminder Bhatia,
Taha Kass-Hout
Abstract:
Sepsis is a life-threatening disease with high morbidity, mortality and healthcare costs. Early prediction and the administration of antibiotics and intravenous fluids are considered crucial for the treatment of sepsis and can potentially save millions of lives and billions in health care costs. Professional clinical care practitioners have proposed clinical criteria which aid in early detection of sepsis; however, the performance of these criteria is often limited. Clinical text provides essential information to estimate the severity of sepsis in addition to structured clinical data. In this study, we explore how clinical text can complement structured data in the early sepsis prediction task. We propose a multimodal model which incorporates both structured data in the form of patient measurements and textual notes on the patient. We employ state-of-the-art NLP models such as BERT and a highly specialized NLP model in Amazon Comprehend Medical to represent the text. On the MIMIC-III dataset containing records of ICU admissions, we show that by using these notes, one achieves an improvement of 6.07 points in a standard utility score for sepsis prediction and 2.89% in AUROC score. Our method significantly outperforms qSOFA, a clinical criterion suggested by experts, as well as the winning model of the PhysioNet Computing in Cardiology Challenge for predicting sepsis.
Submitted 23 July, 2021;
originally announced July 2021.
-
Stress Testing of Meta-learning Approaches for Few-shot Learning
Authors:
Aroof Aimen,
Sahil Sidheekh,
Vineet Madan,
Narayanan C. Krishnan
Abstract:
Meta-learning (ML) has emerged as a promising learning method under resource constraints such as few-shot learning. ML approaches typically propose a methodology to learn generalizable models. In this work-in-progress paper, we put the recent ML approaches to a stress test to discover their limitations. Precisely, we measure the performance of ML approaches for few-shot learning against increasing task complexity. Our results show a quick degradation in the performance of initialization strategies for ML (MAML, TAML, and MetaSGD), while surprisingly, approaches that use an optimization strategy (MetaLSTM) perform significantly better. We further demonstrate the effectiveness of an optimization strategy for ML (MetaLSTM++) trained in a MAML manner over a pure optimization strategy. Our experiments also show that the optimization strategies for ML achieve higher transferability from simple to complex tasks.
Submitted 21 January, 2021;
originally announced January 2021.
-
On Duality Gap as a Measure for Monitoring GAN Training
Authors:
Sahil Sidheekh,
Aroof Aimen,
Vineet Madan,
Narayanan C. Krishnan
Abstract:
The generative adversarial network (GAN) is among the most popular deep learning models for learning complex data distributions. However, training a GAN is known to be a challenging task. This is often attributed to the lack of correlation between the training progress and the trajectory of the generator and discriminator losses and the need for the GAN's subjective evaluation. A recently proposed measure inspired by game theory, the duality gap, aims to bridge this gap. However, as we demonstrate, the duality gap's capability remains constrained due to limitations posed by its estimation process. This paper presents a theoretical understanding of this limitation and proposes a more dependable estimation process for the duality gap. At the crux of our approach is the idea that local perturbations can help agents in a zero-sum game escape non-Nash saddle points efficiently. Through exhaustive experimentation across GAN models and datasets, we establish the efficacy of our approach in capturing the GAN training progress with minimal increase in computational complexity. Further, we show that our estimate, with its ability to identify model convergence/divergence, is a potential performance measure that can be used to tune the hyperparameters of a GAN.
Submitted 11 December, 2020;
originally announced December 2020.
-
Estimates for initial coefficients of certain bi-univalent functions
Authors:
Vibha Madaan,
Ajay Kumar,
V. Ravichandran
Abstract:
Estimates are obtained for the initial coefficients of a normalized analytic function $f$ in the unit disk $\mathbb{D}$ such that $f$ and the analytic extension of $f^{-1}$ to $\mathbb{D}$ belong to certain subclasses of univalent functions. The bounds obtained improve some existing known bounds.
Submitted 21 June, 2020;
originally announced June 2020.
-
Maximizing Determinants under Matroid Constraints
Authors:
Vivek Madan,
Aleksandar Nikolov,
Mohit Singh,
Uthaipon Tantipongpipat
Abstract:
Given vectors $v_1,\dots,v_n\in\mathbb{R}^d$ and a matroid $M=([n],I)$, we study the problem of finding a basis $S$ of $M$ such that $\det(\sum_{i \in S}v_i v_i^\top)$ is maximized. This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning. The current best results include an $e^{2k}$-estimation for any matroid of rank $k$ and a $(1+ε)^d$-approximation for a uniform matroid of rank $k\ge d+\frac dε$, where the rank $k\ge d$ denotes the desired size of the optimal set. Our main result is a new approximation algorithm with an approximation guarantee that depends only on the dimension $d$ of the vectors and not on the size $k$ of the output set. In particular, we show an $(O(d))^{d}$-estimation and an $(O(d))^{d^3}$-approximation for any matroid, giving a significant improvement over prior work when $k\gg d$.
Our result relies on the existence of an optimal solution to a convex programming relaxation for the problem which has sparse support; in particular, no more than $O(d^2)$ variables of the solution have fractional values. The sparsity results rely on the interplay between the first-order optimality conditions for the convex program and matroid theory. We believe that the techniques introduced to show sparsity of optimal solutions to convex programs will be of independent interest. We also give a randomized algorithm that rounds a sparse fractional solution to a feasible integral solution to the original problem. To show the approximation guarantee, we utilize recent works on strongly log-concave polynomials and show new relationships between different convex programs studied for the problem. Finally, we use the estimation algorithm and sparsity results to give an efficient deterministic approximation algorithm with an approximation guarantee that depends solely on the dimension $d$.
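To make the objective concrete, here is a small brute-force Python sketch for the special case of a uniform matroid of rank $k$ with $d=2$. It is exhaustive enumeration, not the approximation algorithm of the paper, and the helper names and sample vectors are hypothetical.

```python
from itertools import combinations

def objective(vectors, subset):
    """Compute det(sum_{i in S} v_i v_i^T) for 2-dimensional vectors."""
    a = b = c = 0  # entries of the symmetric 2x2 matrix [[a, b], [b, c]]
    for i in subset:
        x, y = vectors[i]
        a += x * x
        b += x * y
        c += y * y
    return a * c - b * b

def best_basis_uniform(vectors, k):
    """Enumerate all size-k subsets (the bases of a uniform matroid) and keep a maximizer."""
    return max(combinations(range(len(vectors)), k),
               key=lambda s: objective(vectors, s))

# Hypothetical input: four vectors in R^2, choose k = 2 of them.
vectors = [(1, 0), (0, 1), (1, 1), (2, 0)]
best = best_basis_uniform(vectors, 2)
print(best, objective(vectors, best))
```

Enumeration takes time exponential in $k$, which is why approximation guarantees of the kind described in the abstract matter for larger instances.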
Submitted 16 April, 2020;
originally announced April 2020.
-
Critical group structure from the parameters of a strongly regular graph
Authors:
Joshua E. Ducey,
David L. Duncan,
Wesley J. Engelbrecht,
Jawahar V. Madan,
Eric Piato,
Christina S. Shatford,
Angela Vichitbandha
Abstract:
We give simple arithmetic conditions that force the Sylow $p$-subgroup of the critical group of a strongly regular graph to take a specific form. These conditions depend only on the parameters $(v, k, λ, μ)$ of the strongly regular graph under consideration. We give many examples, including how the theory can be used to compute the critical group of Conway's $99$-graph and to give an elementary argument that no $srg(28,9,0,4)$ exists.
Submitted 16 October, 2019;
originally announced October 2019.
-
Radii of Starlikeness and Convexity of Bessel Functions
Authors:
Vibha Madaan,
Ajay Kumar,
V. Ravichandran
Abstract:
The radii of starlikeness and convexity associated with lemniscate of Bernoulli and the Janowski function, $(1+Az)/(1+Bz)$ for $-1\leq B<A\leq 1$, have been determined for normalizations of $q$-Bessel function, Bessel function of first kind of order $ν$, Lommel function of first kind and Legendre polynomial of odd degree.
Submitted 13 June, 2019;
originally announced June 2019.
-
Lemniscate Convexity and Other Properties of Generalized Bessel Functions
Authors:
Vibha Madaan,
Ajay Kumar,
V. Ravichandran
Abstract:
Sufficient conditions on associated parameters $p,b$ and $c$ are obtained so that the generalized and "normalized" Bessel function $u_p(z)=u_{p,b,c}(z)$ satisfies $|(1+(zu''_p(z)/u'_p(z)))^2-1|<1$ or $|((zu_p(z))'/u_p(z))^2-1|<1$. We also determine the condition on these parameters so that $-(4(p+(b+1)/2)/c)u'_p(z)\prec\sqrt{1+z}$. Relations between the parameters $μ$ and $p$ are obtained such that the normalized Lommel function of first kind $h_{μ,p}(z)$ satisfies the subordination $1+(zh''_{μ,p}(z)/h'_{μ,p}(z))\prec\sqrt{1+z}$. Moreover, the properties of Alexander transform of the function $h_{μ,p}(z)$ are discussed.
Submitted 12 February, 2019;
originally announced February 2019.
-
Improving the Integrality Gap for Multiway Cut
Authors:
Kristóf Bérczi,
Karthekeyan Chandrasekaran,
Tamás Király,
Vivek Madan
Abstract:
In the multiway cut problem, we are given an undirected graph with non-negative edge weights and a collection of $k$ terminal nodes, and the goal is to partition the node set of the graph into $k$ non-empty parts each containing exactly one terminal so that the total weight of the edges crossing the partition is minimized. The multiway cut problem for $k\ge 3$ is APX-hard. For arbitrary $k$, the best-known approximation factor is $1.2965$ due to [Sharma and Vondrák, 2014] while the best known inapproximability factor is $1.2$ due to [Angelidakis, Makarychev and Manurangsi, 2017]. In this work, we improve on the lower bound to $1.20016$ by constructing an integrality gap instance for the CKR relaxation.
A technical challenge in improving the gap has been the lack of geometric tools to understand higher-dimensional simplices. Our instance is a non-trivial $3$-dimensional instance that overcomes this technical challenge. We analyze the gap of the instance by viewing it as a convex combination of $2$-dimensional instances and a uniform 3-dimensional instance. We believe that this technique could be exploited further to construct instances with larger integrality gap. One of the ingredients of our proof technique is a generalization of a result on \emph{Sperner admissible labelings} due to [Mirzakhani and Vondrák, 2015] that might be of independent combinatorial interest.
Submitted 21 November, 2018; v1 submitted 25 July, 2018;
originally announced July 2018.
-
Starlikeness associated with lemniscate of Bernoulli
Authors:
Vibha Madaan,
Ajay Kumar,
V. Ravichandran
Abstract:
For an analytic function $f$ on the unit disk $\mathbb{D}=\{z:|z|<1\}$ satisfying $f(0)=0=f'(0)-1,$ we obtain sufficient conditions so that $f$ satisfies $|(zf'(z)/f(z))^2-1|<1.$ The technique of differential subordination of first or second order is used. The admissibility conditions for lemniscate of Bernoulli are derived and employed in order to prove the main results.
Submitted 13 June, 2018;
originally announced June 2018.
-
Spectrally Robust Graph Isomorphism
Authors:
Alexandra Kolla,
Ioannis Koutis,
Vivek Madan,
Ali Kemal Sinop
Abstract:
We initiate the study of spectral generalizations of the graph isomorphism problem.
(a) The Spectral Graph Dominance (SGD) problem: On input of two graphs $G$ and $H$, does there exist a permutation $π$ such that $G\preceq π(H)$?
(b) The Spectrally Robust Graph Isomorphism (SRGI) problem: On input of two graphs $G$ and $H$, find the smallest number $κ$ over all permutations $π$ such that $π(H) \preceq G\preceq κc π(H)$ for some $c$. SRGI is a natural formulation of the network alignment problem that has various applications, most notably in computational biology.
Here $G\preceq c H$ means that for all vectors $x$ we have $x^T L_G x \leq c x^T L_H x$, where $L_G$ is the Laplacian of $G$.
We prove NP-hardness for SGD. We also present a $κ$-approximation algorithm for SRGI for the case when both $G$ and $H$ are bounded-degree trees. The algorithm runs in polynomial time when $κ$ is a constant.
Submitted 1 May, 2018;
originally announced May 2018.
-
Approximating Multicut and the Demand Graph
Authors:
Chandra Chekuri,
Vivek Madan
Abstract:
In the minimum Multicut problem, the input is an edge-weighted supply graph $G=(V,E)$ and a simple demand graph $H=(V,F)$. Either $G$ and $H$ are directed (DMulC) or both are undirected (UMulC). The goal is to remove a minimum weight set of edges in $G$ such that there is no path from $s$ to $t$ in the remaining graph for any $(s,t) \in F$. UMulC admits an $O(\log k)$-approximation where $k$ is the vertex cover size of $H$ while the best known approximation for DMulC is $\min\{k, \tilde{O}(n^{11/23})\}$. These approximations are obtained by proving corresponding results on the multicommodity flow-cut gap. In contrast to these results some special cases of Multicut, such as the well-studied Multiway Cut problem, admit a constant factor approximation in both undirected and directed graphs. Motivated by both concrete instances from applications and abstract considerations, we consider the role that the structure of the demand graph $H$ plays in determining the approximability of Multicut.
In undirected graphs our main result is a $2$-approximation in $n^{O(t)}$ time when the demand graph $H$ excludes an induced matching of size $t$. This gives a constant factor approximation for a specific demand graph that motivated this work.
In contrast to undirected graphs, we prove that in directed graphs such approximation algorithms cannot exist. Assuming the Unique Games Conjecture (UGC), for a large class of fixed demand graphs DMulC cannot be approximated to a factor better than the worst-case flow-cut gap. As a consequence we prove that for any fixed $k$, assuming UGC, DMulC with $k$ demand pairs is hard to approximate to within a factor better than $k$. On the positive side, we prove an approximation of $k$ when the demand graph excludes certain graphs as an induced subgraph. This generalizes the Multiway Cut result to a much larger class of demand graphs.
Submitted 25 July, 2016;
originally announced July 2016.
-
Simple and Fast Rounding Algorithms for Directed and Node-weighted Multiway Cut
Authors:
Chandra Chekuri,
Vivek Madan
Abstract:
In Directed Multiway Cut (Dir-MC) the input is an edge-weighted directed graph $G=(V,E)$ and a set of $k$ terminal nodes $\{s_1,s_2,\ldots,s_k\} \subseteq V$; the goal is to find a min-weight subset of edges whose removal ensures that there is no path from $s_i$ to $s_j$ for any $i \neq j$. In Node-weighted Multiway Cut (Node-MC) the input is a node-weighted undirected graph $G$ and a set of $k$ terminal nodes $\{s_1,s_2,\ldots,s_k\} \subseteq V$; the goal is to remove a min-weight subset of nodes to disconnect each pair of terminals. Dir-MC admits a $2$-approximation [Naor, Zosin '97] and Node-MC admits a $2(1-\frac{1}{k})$-approximation [Garg, Vazirani, Yannakakis '94], both via rounding of LP relaxations. Previous rounding algorithms for these problems, from nearly twenty years ago, are based on careful rounding of an "optimum" solution to an LP relaxation. This is particularly true for Dir-MC for which the rounding relies on a custom LP formulation instead of the natural distance based LP relaxation [Naor, Zosin '97].
In this paper we describe extremely simple and near linear-time rounding algorithms for Dir-MC and Node-MC via a natural distance based LP relaxation. The dual of this relaxation is a special case of the maximum multicommodity flow problem. Our algorithms achieve the same bounds as before but have the significant advantage in that they can work with "any feasible" solution to the relaxation. Consequently, in addition to obtaining "book" proofs of LP rounding for these two basic problems, we also obtain significantly faster approximation algorithms by taking advantage of known algorithms for computing near-optimal solutions for maximum multicommodity flow problems. We also investigate lower bounds for Dir-MC when $k=2$ and in particular prove that the integrality gap of the LP relaxation is $2$ even in directed planar graphs.
Submitted 16 July, 2015;
originally announced July 2015.
-
On the Expansion of Group-Based Lifts
Authors:
Naman Agarwal,
Karthekeyan Chandrasekaran,
Alexandra Kolla,
Vivek Madan
Abstract:
A $k$-lift of an $n$-vertex base graph $G$ is a graph $H$ on $n\times k$ vertices, where each vertex $v$ of $G$ is replaced by $k$ vertices $v_1,\cdots{},v_k$ and each edge $(u,v)$ in $G$ is replaced by a matching representing a bijection $π_{uv}$ so that the edges of $H$ are of the form $(u_i,v_{π_{uv}(i)})$. Lifts have been studied as a means to efficiently construct expanders. In this work, we study lifts obtained from groups and group actions. We derive the spectrum of such lifts via the representation theory principles of the underlying group. Our main results are:
(1) There is a constant $c_1$ such that for every $k\geq 2^{c_1nd}$, there does not exist an abelian $k$-lift $H$ of any $n$-vertex $d$-regular base graph with $H$ being almost Ramanujan (nontrivial eigenvalues of the adjacency matrix at most $O(\sqrt{d})$ in magnitude). This can be viewed as an analogue of the well-known no-expansion result for abelian Cayley graphs.
(2) A uniform random lift in a cyclic group of order $k$ of any $n$-vertex $d$-regular base graph $G$, with the nontrivial eigenvalues of the adjacency matrix of $G$ bounded by $λ$ in magnitude, has the new nontrivial eigenvalues also bounded by $λ+O(\sqrt{d})$ in magnitude with probability $1-ke^{-Ω(n/d^2)}$. In particular, there is a constant $c_2$ such that for every $k\leq 2^{c_2n/d^2}$, there exists a lift $H$ of every Ramanujan graph in a cyclic group of order $k$ with $H$ being almost Ramanujan. We use this to design a quasi-polynomial time algorithm to construct almost Ramanujan expanders deterministically.
The existence of expanding lifts in cyclic groups of order $k=2^{O(n/d^2)}$ can be viewed as a lower bound on the order $k_0$ of the largest abelian group that produces expanding lifts. Our results show that the lower bound matches the upper bound for $k_0$ (up to $d^3$ in the exponent).
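The definition of a $k$-lift at the start of this abstract can be sketched in a few lines of Python. This is an illustrative construction only, assuming the cyclic-group case where each bijection is a shift, $π_{uv}(i) = (i + s_{uv}) \bmod k$; the function names, the base graph and the shift values are hypothetical.

```python
from collections import Counter

def cyclic_lift(edges, shifts, k):
    """Replace each base edge (u, v) by the matching (u, i) -- (v, (i + s_uv) mod k)."""
    lifted = []
    for (u, v) in edges:
        s = shifts[(u, v)] % k
        for i in range(k):
            lifted.append(((u, i), (v, (i + s) % k)))
    return lifted

def degrees(lifted_edges):
    """Count the degree of every vertex of the lifted graph."""
    deg = Counter()
    for a, b in lifted_edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Hypothetical base graph: a triangle (2-regular on n = 3 vertices), lifted with k = 3.
base_edges = [(0, 1), (1, 2), (0, 2)]
shifts = {(0, 1): 0, (1, 2): 1, (0, 2): 2}
k = 3
lift = cyclic_lift(base_edges, shifts, k)
deg = degrees(lift)
print(len(deg), "vertices,", len(lift), "edges")
```

Because every base edge becomes a perfect matching between two fibers, the lift has $n \times k$ vertices and keeps the degree of the base graph.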
Submitted 17 December, 2016; v1 submitted 13 November, 2013;
originally announced November 2013.
-
Preservation under Substructures modulo Bounded Cores
Authors:
Abhisekh Sankaran,
Bharat Adsul,
Vivek Madan,
Pritish Kamath,
Supratik Chakraborty
Abstract:
We investigate a model-theoretic property that generalizes the classical notion of "preservation under substructures". We call this property \emph{preservation under substructures modulo bounded cores}, and present a syntactic characterization via $Σ_2^0$ sentences for properties of arbitrary structures definable by FO sentences. As a sharper characterization, we further show that the count of existential quantifiers in the $Σ_2^0$ sentence equals the size of the smallest bounded core. We also present our results on the sharper characterization for special fragments of FO and also over special classes of structures. We present a (not FO-definable) class of finite structures for which the sharper characterization fails, but for which the classical Łoś-Tarski preservation theorem holds. As a fallout of our studies, we obtain combinatorial proofs of the Łoś-Tarski theorem for some of the aforementioned cases.
Submitted 12 July, 2012; v1 submitted 7 May, 2012;
originally announced May 2012.