Search | arXiv e-print repository

Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?

Authors: Rylan Schaeffer, Hailey Schoelkopf, Brando Miranda, Gabriel Mukobi, Varun Madan, Adam Ibrahim, Herbie Bradley, Stella Biderman, Sanmi Koyejo

Abstract: Predictable behavior from scaling advanced AI systems is an extremely desirable property. Although a well-established literature exists on how pretraining performance scales, the literature on how particular downstream capabilities scale is significantly muddier. In this work, we take a step back and ask: why has predicting specific downstream capabilities with scale remained elusive? While many factors are certainly responsible, we identify a new factor that makes modeling scaling behavior on widely used multiple-choice question-answering benchmarks challenging. Using five model families and twelve well-established multiple-choice benchmarks, we show that downstream performance is computed from negative log likelihoods via a sequence of transformations that progressively degrade the statistical relationship between performance and scale. We then reveal the mechanism causing this degradation: downstream metrics require comparing the correct choice against a small number of specific incorrect choices, meaning accurately predicting downstream capabilities requires predicting not just how probability mass concentrates on the correct choice with scale, but also how probability mass fluctuates on specific incorrect choices with scale. We empirically study how probability mass on the correct choice co-varies with probability mass on incorrect choices with increasing compute, suggesting that scaling laws for incorrect choices might be achievable. Our work also explains why pretraining scaling laws are commonly regarded as more predictable than downstream capabilities and contributes towards establishing scaling-predictable evaluations of frontier AI models.

Submitted 6 June, 2024; originally announced June 2024.
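
The chain of transformations described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `choice_metrics`, the dictionary keys, and the specific three steps (exponentiate, renormalize over the listed choices, take the argmax) are assumptions made for the example.

```python
import math


def choice_metrics(nll_per_choice, correct_index):
    """Map per-choice negative log likelihoods to progressively coarser metrics.

    Hypothetical helper: the step order here is an assumed reading of the
    "sequence of transformations" mentioned in the abstract.
    """
    # Step 1: probability assigned to each choice, p = exp(-NLL).
    probs = [math.exp(-nll) for nll in nll_per_choice]
    # Step 2: renormalize so that mass is shared among the listed choices only.
    total = sum(probs)
    choice_probs = [p / total for p in probs]
    # Step 3: accuracy is 1 only if the correct choice holds the most mass.
    predicted = max(range(len(choice_probs)), key=choice_probs.__getitem__)
    return {
        "p_correct_raw": probs[correct_index],
        "p_correct_among_choices": choice_probs[correct_index],
        "accuracy": 1.0 if predicted == correct_index else 0.0,
    }


# Same NLL on the correct choice (index 0); only one incorrect choice differs.
a = choice_metrics([1.0, 3.0, 3.0, 3.0], correct_index=0)
b = choice_metrics([1.0, 0.5, 3.0, 3.0], correct_index=0)
```

In this toy case the raw probability of the correct choice is identical in `a` and `b`, yet accuracy differs because a single incorrect choice gained mass, which is the kind of dependence on incorrect choices the abstract points to.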

arXiv:2205.11603 [pdf, other]

Representation Projection Invariance Mitigates Representation Collapse

Authors: Anastasia Razdaibiedina, Ashish Khetan, Zohar Karnin, Daniel Khashabi, Vishaal Kapoor, Vivek Madan

Abstract: Fine-tuning contextualized representations learned by pre-trained language models remains a prevalent practice in NLP. However, fine-tuning can lead to representation degradation (also known as representation collapse), which may result in instability, sub-optimal performance, and weak generalization. In this paper, we propose Representation Projection Invariance (REPINA), a novel regularization method to maintain the information content of representation and reduce representation collapse during fine-tuning by discouraging undesirable changes in the representations. We study the empirical behavior of the proposed regularization in comparison to 5 comparable baselines across 13 language understanding tasks (GLUE benchmark and six additional datasets). When evaluating in-domain performance, REPINA consistently outperforms other baselines on most tasks (10 out of 13). We also demonstrate its effectiveness in few-shot settings and robustness to label perturbation. As a by-product, we extend previous studies of representation collapse and propose several metrics to quantify it. Our empirical findings show that our approach is significantly more effective at mitigating representation collapse.

Submitted 21 November, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

Comments: 41 pages, 6 figures

arXiv:2107.11094 [pdf, other]

Improving Early Sepsis Prediction with Multi Modal Learning

Authors: Fred Qin, Vivek Madan, Ujjwal Ratan, Zohar Karnin, Vishaal Kapoor, Parminder Bhatia, Taha Kass-Hout

Abstract: Sepsis is a life-threatening disease with high morbidity, mortality and healthcare costs. The early prediction and administration of antibiotics and intravenous fluids is considered crucial for the treatment of sepsis and can save potentially millions of lives and billions in health care costs. Professional clinical care practitioners have proposed clinical criteria which aid in early detection of sepsis; however, performance of these criteria is often limited. Clinical text provides essential information to estimate the severity of the sepsis in addition to structured clinical data. In this study, we explore how clinical text can complement structured data towards the early sepsis prediction task. In this paper, we propose a multi modal model which incorporates both structured data in the form of patient measurements as well as textual notes on the patient. We employ state-of-the-art NLP models such as BERT and a highly specialized NLP model in Amazon Comprehend Medical to represent the text. On the MIMIC-III dataset containing records of ICU admissions, we show that by using these notes, one achieves an improvement of 6.07 points in a standard utility score for Sepsis prediction and 2.89% in AUROC score. Our method significantly outperforms a clinical criterion suggested by experts, qSOFA, as well as the winning model of the PhysioNet Computing in Cardiology Challenge for predicting Sepsis.

Submitted 23 July, 2021; originally announced July 2021.

arXiv:2101.08587 [pdf, other]

Stress Testing of Meta-learning Approaches for Few-shot Learning

Authors: Aroof Aimen, Sahil Sidheekh, Vineet Madan, Narayanan C. Krishnan

Abstract: Meta-learning (ML) has emerged as a promising learning method under resource constraints such as few-shot learning. ML approaches typically propose a methodology to learn generalizable models. In this work-in-progress paper, we put the recent ML approaches to a stress test to discover their limitations. Precisely, we measure the performance of ML approaches for few-shot learning against increasing task complexity. Our results show a quick degradation in the performance of initialization strategies for ML (MAML, TAML, and MetaSGD), while surprisingly, approaches that use an optimization strategy (MetaLSTM) perform significantly better. We further demonstrate the effectiveness of an optimization strategy for ML (MetaLSTM++) trained in a MAML manner over a pure optimization strategy. Our experiments also show that the optimization strategies for ML achieve higher transferability from simple to complex tasks.

Submitted 21 January, 2021; originally announced January 2021.

arXiv:2012.06723 [pdf, other]

On Duality Gap as a Measure for Monitoring GAN Training

Authors: Sahil Sidheekh, Aroof Aimen, Vineet Madan, Narayanan C. Krishnan

Abstract: Generative adversarial network (GAN) is among the most popular deep learning models for learning complex data distributions. However, training a GAN is known to be a challenging task. This is often attributed to the lack of correlation between the training progress and the trajectory of the generator and discriminator losses and the need for the GAN's subjective evaluation. A recently proposed measure inspired by game theory - the duality gap, aims to bridge this gap. However, as we demonstrate, the duality gap's capability remains constrained due to limitations posed by its estimation process. This paper presents a theoretical understanding of this limitation and proposes a more dependable estimation process for the duality gap. At the crux of our approach is the idea that local perturbations can help agents in a zero-sum game escape non-Nash saddle points efficiently. Through exhaustive experimentation across GAN models and datasets, we establish the efficacy of our approach in capturing the GAN training progress with minimal increase to the computational complexity. Further, we show that our estimate, with its ability to identify model convergence/divergence, is a potential performance measure that can be used to tune the hyperparameters of a GAN.

Submitted 11 December, 2020; originally announced December 2020.

arXiv:2006.11742 [pdf, ps, other]

Estimates for initial coefficients of certain bi-univalent functions

Authors: Vibha Madaan, Ajay Kumar, V. Ravichandran

Abstract: Estimates are obtained for the initial coefficients of a normalized analytic function $f$ in the unit disk $\mathbb{D}$ such that $f$ and the analytic extension of $f^{-1}$ to $\mathbb{D}$ belong to certain subclasses of univalent functions. The bounds obtained improve some existing known bounds.

Submitted 21 June, 2020; originally announced June 2020.

MSC Class: 30C45; 30C80

arXiv:2004.07886 [pdf, ps, other]

Maximizing Determinants under Matroid Constraints

Authors: Vivek Madan, Aleksandar Nikolov, Mohit Singh, Uthaipon Tantipongpipat

Abstract: Given vectors $v_1,\dots,v_n\in\mathbb{R}^d$ and a matroid $M=([n],I)$, we study the problem of finding a basis $S$ of $M$ such that $\det(\sum_{i \in S}v_i v_i^\top)$ is maximized. This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning. The current best results include an $e^{2k}$-estimation for any matroid of rank $k$ and a $(1+ε)^d$-approximation for a uniform matroid of rank $k\ge d+\frac dε$, where the rank $k\ge d$ denotes the desired size of the optimal set. Our main result is a new approximation algorithm with an approximation guarantee that depends only on the dimension $d$ of the vectors and not on the size $k$ of the output set. In particular, we show an $(O(d))^{d}$-estimation and an $(O(d))^{d^3}$-approximation for any matroid, giving a significant improvement over prior work when $k\gg d$. Our result relies on the existence of an optimal solution to a convex programming relaxation for the problem which has sparse support; in particular, no more than $O(d^2)$ variables of the solution have fractional values. The sparsity results rely on the interplay between the first-order optimality conditions for the convex program and matroid theory. We believe that the techniques introduced to show sparsity of optimal solutions to convex programs will be of independent interest. We also give a randomized algorithm that rounds a sparse fractional solution to a feasible integral solution to the original problem. To show the approximation guarantee, we utilize recent works on strongly log-concave polynomials and show new relationships between different convex programs studied for the problem. Finally, we use the estimation algorithm and sparsity results to give an efficient deterministic approximation algorithm with an approximation guarantee that depends solely on the dimension $d$.

Submitted 16 April, 2020; originally announced April 2020.
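
To make the objective in the abstract above concrete, the sketch below evaluates $\det(\sum_{i \in S}v_i v_i^\top)$ and finds the best set by exhaustive search for the special case of a uniform matroid, where every $k$-subset is a basis. It is only an illustration under that assumption and is not the paper's algorithm, which works with a convex relaxation; the helper names `det`, `objective`, and `best_basis_uniform` are hypothetical.

```python
from itertools import combinations


def det(matrix):
    """Determinant by cofactor expansion; adequate for the tiny d used here."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def objective(vectors, subset):
    """Return det(sum_{i in subset} v_i v_i^T)."""
    d = len(vectors[0])
    gram = [
        [sum(vectors[i][r] * vectors[i][c] for i in subset) for c in range(d)]
        for r in range(d)
    ]
    return det(gram)


def best_basis_uniform(vectors, k):
    """Brute force over all k-subsets (bases of a uniform matroid of rank k)."""
    candidates = combinations(range(len(vectors)), k)
    return max(candidates, key=lambda s: objective(vectors, s))


vectors = [(1, 0), (0, 2), (1, 1), (3, 0)]
best = best_basis_uniform(vectors, k=2)
best_value = objective(vectors, best)
```

Exhaustive search takes time exponential in $k$, so it only serves to show what is being maximized on a small instance.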

arXiv:1910.07686 [pdf, ps, other]

Critical group structure from the parameters of a strongly regular graph

Authors: Joshua E. Ducey, David L. Duncan, Wesley J. Engelbrecht, Jawahar V. Madan, Eric Piato, Christina S. Shatford, Angela Vichitbandha

Abstract: We give simple arithmetic conditions that force the Sylow $p$-subgroup of the critical group of a strongly regular graph to take a specific form. These conditions depend only on the parameters $(v, k, λ, μ)$ of the strongly regular graph under consideration. We give many examples, including how the theory can be used to compute the critical group of Conway's $99$-graph and to give an elementary argument that no $srg(28,9,0,4)$ exists.

Submitted 16 October, 2019; originally announced October 2019.

Comments: 20 pages

MSC Class: 05C50

arXiv:1906.05547 [pdf, ps, other]

Radii of Starlikeness and Convexity of Bessel Functions

Authors: Vibha Madaan, Ajay Kumar, V. Ravichandran

Abstract: The radii of starlikeness and convexity associated with lemniscate of Bernoulli and the Janowski function, $(1+Az)/(1+Bz)$ for $-1\leq B<A\leq 1$, have been determined for normalizations of $q$-Bessel function, Bessel function of first kind of order $ν$, Lommel function of first kind and Legendre polynomial of odd degree.

Submitted 13 June, 2019; originally announced June 2019.

MSC Class: 30C10; 30C15; 30C45

arXiv:1902.04277 [pdf, ps, other]

Lemniscate Convexity and Other Properties of Generalized Bessel Functions

Authors: Vibha Madaan, Ajay Kumar, V. Ravichandran

Abstract: Sufficient conditions on associated parameters $p,b$ and $c$ are obtained so that the generalized and "normalized" Bessel function $u_p(z)=u_{p,b,c}(z)$ satisfies $|(1+(zu''_p(z)/u'_p(z)))^2-1|<1$ or $|((zu_p(z))'/u_p(z))^2-1|<1$. We also determine the condition on these parameters so that $-(4(p+(b+1)/2)/c)u'_p(z)\prec\sqrt{1+z}$. Relations between the parameters $μ$ and $p$ are obtained such that the normalized Lommel function of first kind $h_{μ,p}(z)$ satisfies the subordination $1+(zh''_{μ,p}(z)/h'_{μ,p}(z))\prec\sqrt{1+z}$. Moreover, the properties of Alexander transform of the function $h_{μ,p}(z)$ are discussed.

Submitted 12 February, 2019; originally announced February 2019.

MSC Class: 30C10; 30C45

arXiv:1807.09735 [pdf, other]

Improving the Integrality Gap for Multiway Cut

Authors: Kristóf Bérczi, Karthekeyan Chandrasekaran, Tamás Király, Vivek Madan

Abstract: In the multiway cut problem, we are given an undirected graph with non-negative edge weights and a collection of $k$ terminal nodes, and the goal is to partition the node set of the graph into $k$ non-empty parts each containing exactly one terminal so that the total weight of the edges crossing the partition is minimized. The multiway cut problem for $k\ge 3$ is APX-hard. For arbitrary $k$, the best-known approximation factor is $1.2965$ due to [Sharma and Vondrák, 2014] while the best-known inapproximability factor is $1.2$ due to [Angelidakis, Makarychev and Manurangsi, 2017]. In this work, we improve on the lower bound to $1.20016$ by constructing an integrality gap instance for the CKR relaxation. A technical challenge in improving the gap has been the lack of geometric tools to understand higher-dimensional simplices. Our instance is a non-trivial $3$-dimensional instance that overcomes this technical challenge. We analyze the gap of the instance by viewing it as a convex combination of $2$-dimensional instances and a uniform 3-dimensional instance. We believe that this technique could be exploited further to construct instances with larger integrality gap. One of the ingredients of our proof technique is a generalization of a result on \emph{Sperner admissible labelings} due to [Mirzakhani and Vondrák, 2015] that might be of independent combinatorial interest.

Submitted 21 November, 2018; v1 submitted 25 July, 2018; originally announced July 2018.

Comments: 28 pages

arXiv:1806.05136 [pdf, ps, other]

Starlikeness associated with lemniscate of Bernoulli

Authors: Vibha Madaan, Ajay Kumar, V. Ravichandran

Abstract: For an analytic function $f$ on the unit disk $\mathbb{D}=\{z:|z|<1\}$ satisfying $f(0)=0=f'(0)-1,$ we obtain sufficient conditions so that $f$ satisfies $|(zf'(z)/f(z))^2-1|<1.$ The technique of differential subordination of first or second order is used. The admissibility conditions for lemniscate of Bernoulli are derived and employed in order to prove the main results.

Submitted 13 June, 2018; originally announced June 2018.

Comments: 20 pages

arXiv:1805.00181 [pdf, ps, other]

Spectrally Robust Graph Isomorphism

Authors: Alexandra Kolla, Ioannis Koutis, Vivek Madan, Ali Kemal Sinop

Abstract: We initiate the study of spectral generalizations of the graph isomorphism problem. (a) The Spectral Graph Dominance (SGD) problem: On input of two graphs $G$ and $H$ does there exist a permutation $π$ such that $G\preceq π(H)$? (b) The Spectrally Robust Graph Isomorphism (SRGI) problem: On input of two graphs $G$ and $H$, find the smallest number $κ$ over all permutations $π$ such that $ π(H) \preceq G\preceq κc π(H)$ for some $c$. SRGI is a natural formulation of the network alignment problem that has various applications, most notably in computational biology. Here $G\preceq c H$ means that for all vectors $x$ we have $x^T L_G x \leq c x^T L_H x$, where $L_G$ is the Laplacian of $G$. We prove NP-hardness for SGD. We also present a $κ$-approximation algorithm for SRGI for the case when both $G$ and $H$ are bounded-degree trees. The algorithm runs in polynomial time when $κ$ is a constant.

Submitted 1 May, 2018; originally announced May 2018.

Comments: Extended version of a paper appearing in the proceedings of ICALP 2018
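
The dominance relation $G\preceq c H$ defined in the abstract above can be probed numerically with a small sketch. It assumes unweighted graphs given as edge lists and uses the identity $x^T L_G x = \sum_{(u,v)\in E(G)} (x_u - x_v)^2$. Checking a finite set of test vectors can only refute dominance, never prove it, so `dominated_on` is a necessary-condition check; all function names are hypothetical.

```python
def laplacian_form(edges, x):
    """x^T L x for an unweighted graph, computed edge by edge."""
    return sum((x[u] - x[v]) ** 2 for u, v in edges)


def permute(edges, pi):
    """Relabel vertices of a graph according to the permutation pi."""
    return [(pi[u], pi[v]) for u, v in edges]


def dominated_on(g_edges, h_edges, c, vectors):
    """True if x^T L_G x <= c * x^T L_H x holds for every supplied test vector."""
    return all(
        laplacian_form(g_edges, x) <= c * laplacian_form(h_edges, x)
        for x in vectors
    )


path = [(0, 1), (1, 2)]              # path on three vertices
triangle = [(0, 1), (1, 2), (0, 2)]  # contains the path as a subgraph
tests = [(1, 0, 0), (0, 1, 0), (1, -1, 1), (2, 0, -1)]

path_below_triangle = dominated_on(path, triangle, 1, tests)
triangle_below_path = dominated_on(triangle, path, 1, tests)
relabeled = permute(path, {0: 2, 1: 1, 2: 0})
```

Because the path is a subgraph of the triangle, every term of its quadratic form also appears in the triangle's, so the first check passes on any test vector, while the vector $(1,0,0)$ already refutes the reverse direction with $c=1$.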

arXiv:1607.07200 [pdf, other]

Approximating Multicut and the Demand Graph

Authors: Chandra Chekuri, Vivek Madan

Abstract: In the minimum Multicut problem, the input is an edge-weighted supply graph $G=(V,E)$ and a simple demand graph $H=(V,F)$. Either $G$ and $H$ are directed (DMulC) or both are undirected (UMulC). The goal is to remove a minimum weight set of edges in $G$ such that there is no path from $s$ to $t$ in the remaining graph for any $(s,t) \in F$. UMulC admits an $O(\log k)$-approximation where $k$ is the vertex cover size of $H$ while the best known approximation for DMulC is $\min\{k, \tilde{O}(n^{11/23})\}$. These approximations are obtained by proving corresponding results on the multicommodity flow-cut gap. In contrast to these results some special cases of Multicut, such as the well-studied Multiway Cut problem, admit a constant factor approximation in both undirected and directed graphs. Motivated by both concrete instances from applications and abstract considerations, we consider the role that the structure of the demand graph $H$ plays in determining the approximability of Multicut. In undirected graphs our main result is a $2$-approximation in $n^{O(t)}$ time when the demand graph $H$ excludes an induced matching of size $t$. This gives a constant factor approximation for a specific demand graph that motivated this work. In contrast to undirected graphs, we prove that in directed graphs such approximation algorithms cannot exist. Assuming the Unique Games Conjecture (UGC), for a large class of fixed demand graphs DMulC cannot be approximated to a factor better than the worst-case flow-cut gap. As a consequence we prove that for any fixed $k$, assuming UGC, DMulC with $k$ demand pairs is hard to approximate to within a factor better than $k$. On the positive side, we prove an approximation of $k$ when the demand graph excludes certain graphs as an induced subgraph. This generalizes the Multiway Cut result to a much larger class of demand graphs.

Submitted 25 July, 2016; originally announced July 2016.

arXiv:1507.04674 [pdf, other]

Simple and Fast Rounding Algorithms for Directed and Node-weighted Multiway Cut

Authors: Chandra Chekuri, Vivek Madan

Abstract: In Directed Multiway Cut (Dir-MC) the input is an edge-weighted directed graph $G=(V,E)$ and a set of $k$ terminal nodes $\{s_1,s_2,\ldots,s_k\} \subseteq V$; the goal is to find a min-weight subset of edges whose removal ensures that there is no path from $s_i$ to $s_j$ for any $i \neq j$. In Node-weighted Multiway Cut (Node-MC) the input is a node-weighted undirected graph $G$ and a set of $k$ terminal nodes $\{s_1,s_2,\ldots,s_k\} \subseteq V$; the goal is to remove a min-weight subset of nodes to disconnect each pair of terminals. Dir-MC admits a $2$-approximation [Naor, Zosin '97] and Node-MC admits a $2(1-\frac{1}{k})$-approximation [Garg, Vazirani, Yannakakis '94], both via rounding of LP relaxations. Previous rounding algorithms for these problems, from nearly twenty years ago, are based on careful rounding of an "optimum" solution to an LP relaxation. This is particularly true for Dir-MC for which the rounding relies on a custom LP formulation instead of the natural distance based LP relaxation [Naor, Zosin '97]. In this paper we describe extremely simple and near linear-time rounding algorithms for Dir-MC and Node-MC via a natural distance based LP relaxation. The dual of this relaxation is a special case of the maximum multicommodity flow problem. Our algorithms achieve the same bounds as before but have the significant advantage in that they can work with "any feasible" solution to the relaxation. Consequently, in addition to obtaining "book" proofs of LP rounding for these two basic problems, we also obtain significantly faster approximation algorithms by taking advantage of known algorithms for computing near-optimal solutions for maximum multicommodity flow problems. We also investigate lower bounds for Dir-MC when $k=2$ and in particular prove that the integrality gap of the LP relaxation is $2$ even in directed planar graphs.

Submitted 16 July, 2015; originally announced July 2015.

arXiv:1311.3268 [pdf, ps, other]

On the Expansion of Group-Based Lifts

Authors: Naman Agarwal, Karthekeyan Chandrasekaran, Alexandra Kolla, Vivek Madan

Abstract: A $k$-lift of an $n$-vertex base graph $G$ is a graph $H$ on $n\times k$ vertices, where each vertex $v$ of $G$ is replaced by $k$ vertices $v_1,\cdots{},v_k$ and each edge $(u,v)$ in $G$ is replaced by a matching representing a bijection $π_{uv}$ so that the edges of $H$ are of the form $(u_i,v_{π_{uv}(i)})$. Lifts have been studied as a means to efficiently construct expanders. In this work, we study lifts obtained from groups and group actions. We derive the spectrum of such lifts via the representation theory principles of the underlying group. Our main results are: (1) There is a constant $c_1$ such that for every $k\geq 2^{c_1nd}$, there does not exist an abelian $k$-lift $H$ of any $n$-vertex $d$-regular base graph with $H$ being almost Ramanujan (nontrivial eigenvalues of the adjacency matrix at most $O(\sqrt{d})$ in magnitude). This can be viewed as an analogue of the well-known no-expansion result for abelian Cayley graphs. (2) A uniform random lift in a cyclic group of order $k$ of any $n$-vertex $d$-regular base graph $G$, with the nontrivial eigenvalues of the adjacency matrix of $G$ bounded by $λ$ in magnitude, has the new nontrivial eigenvalues also bounded by $λ+O(\sqrt{d})$ in magnitude with probability $1-ke^{-Ω(n/d^2)}$. In particular, there is a constant $c_2$ such that for every $k\leq 2^{c_2n/d^2}$, there exists a lift $H$ of every Ramanujan graph in a cyclic group of order $k$ with $H$ being almost Ramanujan. We use this to design a quasi-polynomial time algorithm to construct almost Ramanujan expanders deterministically. The existence of expanding lifts in cyclic groups of order $k=2^{O(n/d^2)}$ can be viewed as a lower bound on the order $k_0$ of the largest abelian group that produces expanding lifts. Our results show that the lower bound matches the upper bound for $k_0$ (up to $d^3$ in the exponent).

Submitted 17 December, 2016; v1 submitted 13 November, 2013; originally announced November 2013.
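
The lift construction defined in the abstract above can be illustrated with a short sketch for the cyclic-group case, where each bijection is a shift, $π_{uv}(i) = i + s_{uv} \bmod k$. The representation of lifted vertices as pairs `(v, i)`, the `shifts` dictionary, and the function name `cyclic_lift` are assumptions made for the example; indices run from 0 to k-1 rather than 1 to k.

```python
from collections import Counter


def cyclic_lift(base_edges, k, shifts):
    """Build a k-lift in which every base edge (u, v) becomes the matching
    (u, i) -- (v, (i + shifts[(u, v)]) % k) for i = 0, ..., k-1."""
    lifted = []
    for u, v in base_edges:
        s = shifts[(u, v)]
        for i in range(k):
            lifted.append(((u, i), (v, (i + s) % k)))
    return lifted


# Base graph: a triangle (2-regular, n = 3), lifted with k = 3.
base = [(0, 1), (1, 2), (2, 0)]
shifts = {(0, 1): 0, (1, 2): 1, (2, 0): 2}
lift = cyclic_lift(base, k=3, shifts=shifts)

# Degree of every lifted vertex, to confirm the lift stays 2-regular.
degree = Counter()
for a, b in lift:
    degree[a] += 1
    degree[b] += 1
```

Each base edge contributes a perfect matching between two fibers, so the lift has $n \times k$ vertices and keeps the degree of the base graph, as the definition requires.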

arXiv:1205.1358 [pdf, ps, other]

Preservation under Substructures modulo Bounded Cores

Authors: Abhisekh Sankaran, Bharat Adsul, Vivek Madan, Pritish Kamath, Supratik Chakraborty

Abstract: We investigate a model-theoretic property that generalizes the classical notion of "preservation under substructures". We call this property \emph{preservation under substructures modulo bounded cores}, and present a syntactic characterization via $Σ_2^0$ sentences for properties of arbitrary structures definable by FO sentences. As a sharper characterization, we further show that the count of existential quantifiers in the $Σ_2^0$ sentence equals the size of the smallest bounded core. We also present our results on the sharper characterization for special fragments of FO and also over special classes of structures. We present a (not FO-definable) class of finite structures for which the sharper characterization fails, but for which the classical Łoś-Tarski preservation theorem holds. As a fallout of our studies, we obtain combinatorial proofs of the Łoś-Tarski theorem for some of the aforementioned cases.

Submitted 12 July, 2012; v1 submitted 7 May, 2012; originally announced May 2012.

Comments: From v2 to v3: Corrected typos, edited sentences for better readability; Conjecture 1 of v2 is now resolved so it is now Theorem 4, its proof is included in a new section (Section 7), Thm i in v2 is now Thm i+1 for i >= 4; everything else remains the same. From v1 to v2: Thm i is now Thm i-1 for i >= 7, Corrected the proof of Theorem 10 (now Theorem 9) for B > 2 (statement is still correct)

Showing 1–17 of 17 results for author: Madan, V