
Showing 1–20 of 20 results for author: Rajendran, G

  1. arXiv:2406.18400  [pdf, other]

    cs.CL cs.LG stat.ML

    Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers

    Authors: Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: Large Language Models (LLMs) have the capacity to store and recall facts. Through experimentation with open-source models, we observe that this ability to retrieve facts can be easily manipulated by changing contexts, even without altering their factual meanings. These findings highlight that LLMs might behave like an associative memory model where certain tokens in the contexts serve as clues to…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2405.15084  [pdf, other]

    cs.DS cs.LG stat.ML

    Efficient Certificates of Anti-Concentration Beyond Gaussians

    Authors: Ainesh Bakshi, Pravesh Kothari, Goutham Rajendran, Madhur Tulsiani, Aravindan Vijayaraghavan

    Abstract: A set of high dimensional points $X=\{x_1, x_2,\ldots, x_n\} \subset R^d$ in isotropic position is said to be $δ$-anti concentrated if for every direction $v$, the fraction of points in $X$ satisfying $|\langle x_i,v \rangle |\leq δ$ is at most $O(δ)$. Motivated by applications to list-decodable learning and clustering, recent works have considered the problem of constructing efficient certificate…

    Submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2403.03867  [pdf, other]

    cs.CL cs.LG stat.ML

    On the Origins of Linear Representations in Large Language Models

    Authors: Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam, Victor Veitch

    Abstract: Recent works have argued that high-level semantic concepts are encoded "linearly" in the representation space of large language models. In this work, we study the origins of such linear representations. To that end, we introduce a simple latent variable model to abstract and formalize the concept dynamics of the next token prediction. We use this formalism to show that the next token prediction ob…

    Submitted 6 March, 2024; originally announced March 2024.

  4. arXiv:2402.09236  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models

    Authors: Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: To build intelligent machine learning systems, there are two broad approaches. One approach is to build inherently interpretable models, as endeavored by the growing field of causal representation learning. The other approach is to build highly-performant foundation models and then invest efforts into understanding how they work. In this work, we relate these two approaches and study how to learn…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 36 pages

  5. arXiv:2311.18048  [pdf, other]

    cs.LG cs.CE eess.SY stat.ME

    An Interventional Perspective on Identifiability in Gaussian LTI Systems with Independent Component Analysis

    Authors: Goutham Rajendran, Patrik Reizinger, Wieland Brendel, Pradeep Ravikumar

    Abstract: We investigate the relationship between system identification and intervention design in dynamical systems. While previous research demonstrated how identifiable representation learning methods, such as Independent Component Analysis (ICA), can reveal cause-effect relationships, it relied on a passive perspective without considering how to collect data. Our work shows that in Gaussian Linear Time-…

    Submitted 16 February, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: CLeaR2024 camera ready. Code available at https://github.com/rpatrik96/lti-ica

  6. Performance Evaluation of Video Streaming Applications with Target Wake Time in Wi-Fi 6

    Authors: Govind Rajendran, Rishabh Roy, Preyas Hathi, Nadeem Akhtar, Samar Agnihotri

    Abstract: The Target Wake Time (TWT) feature, introduced in Wi-Fi 6, was primarily meant as an advanced power save mechanism. However, it has some interesting applications in scheduling and resource allocation. TWT-based resource allocation can be used to improve the user experience for certain applications, e.g., VoIP, IoT, video streaming, etc. In this work, we analyze the packet arrival pattern for strea…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: This paper was part of 15th International Conference on COMmunication Systems & NETworkS (COMSNETS), Bangalore, India, 2023

  7. arXiv:2306.02235  [pdf, other]

    cs.LG cs.AI math.ST stat.ME stat.ML

    Learning Linear Causal Representations from Interventions under General Nonlinear Mixing

    Authors: Simon Buchholz, Goutham Rajendran, Elan Rosenfeld, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker cl…

    Submitted 18 December, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: Accepted as Oral paper at NeurIPS 2023

  8. arXiv:2304.13531  [pdf, other]

    cs.ET

    Integrated Architecture for Neural Networks and Security Primitives using RRAM Crossbar

    Authors: Simranjeet Singh, Furqan Zahoor, Gokulnath Rajendran, Vikas Rana, Sachin Patkar, Anupam Chattopadhyay, Farhad Merchant

    Abstract: This paper proposes an architecture that integrates neural networks (NNs) and hardware security modules using a single resistive random access memory (RRAM) crossbar. The proposed architecture enables using a single crossbar to implement NN, true random number generator (TRNG), and physical unclonable function (PUF) applications while exploiting the multi-state storage characteristic of the RRAM c…

    Submitted 1 May, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

  9. arXiv:2303.17506  [pdf, other]

    cs.CC cs.DS

    Sum-of-Squares Lower Bounds for Densest $k$-Subgraph

    Authors: Chris Jones, Aaron Potechin, Goutham Rajendran, Jeff Xu

    Abstract: Given a graph and an integer $k$, Densest $k$-Subgraph is the algorithmic task of finding the subgraph on $k$ vertices with the maximum number of edges. This is a fundamental problem that has been subject to intense study for decades, with applications spanning a wide variety of fields. The state-of-the-art algorithm is an $O(n^{1/4 + ε})$-factor approximation (for any $ε> 0$) due to Bhaskara et a…

    Submitted 30 March, 2023; originally announced March 2023.

    ACM Class: F.2.2

  10. arXiv:2302.04462  [pdf, other]

    cs.CC cs.DS cs.LG stat.ML

    Nonlinear Random Matrices and Applications to the Sum of Squares Hierarchy

    Authors: Goutham Rajendran

    Abstract: We develop new tools in the theory of nonlinear random matrices and apply them to study the performance of the Sum of Squares (SoS) hierarchy on average-case problems. The SoS hierarchy is a powerful optimization technique that has achieved tremendous success for various problems in combinatorial optimization, robust statistics and machine learning. It's a family of convex relaxations that lets…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: Dissertation, University of Chicago

  11. arXiv:2211.03526  [pdf, other]

    cs.CR cs.AR cs.ET

    Hardware Security Primitives using Passive RRAM Crossbar Array: Novel TRNG and PUF Designs

    Authors: Simranjeet Singh, Furqan Zahoor, Gokulnath Rajendran, Sachin Patkar, Anupam Chattopadhyay, Farhad Merchant

    Abstract: With rapid advancements in electronic gadgets, the security and privacy aspects of these devices are significant. For the design of secure systems, physical unclonable function (PUF) and true random number generator (TRNG) are critical hardware security primitives for security applications. This paper proposes novel implementations of PUF and TRNGs on the RRAM crossbar structure. Firstly, two tech…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: To appear at ASP-DAC 2023

  12. arXiv:2209.02655  [pdf, other]

    cs.CC cs.DM cs.LG math.PR math.ST

    Concentration of polynomial random matrices via Efron-Stein inequalities

    Authors: Goutham Rajendran, Madhur Tulsiani

    Abstract: Analyzing concentration of large random matrices is a common task in a wide variety of fields. Given independent random variables, many tools are available to analyze random matrices whose entries are linear in the variables, e.g. the matrix-Bernstein inequality. However, in many applications, we need to analyze random matrices whose entries are polynomials in the variables. These arise naturally…

    Submitted 17 January, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: To appear at SODA 2023. 41 pages, 6 figures

  13. arXiv:2208.08509  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Analyzing Robustness of End-to-End Neural Models for Automatic Speech Recognition

    Authors: Goutham Rajendran, Wei Zou

    Abstract: We investigate robustness properties of pre-trained neural models for automatic speech recognition. Real life data in machine learning is usually very noisy and almost never clean, which can be attributed to various factors depending on the domain, e.g. outliers, random noise and adversarial noise. Therefore, the models we develop for various tasks should be robust to such kinds of noisy data, whi…

    Submitted 17 August, 2022; originally announced August 2022.

    Comments: 5 pages, 14 figures

  14. arXiv:2208.04374  [pdf, ps, other]

    cs.CC cs.DS

    Combinatorial Optimization via the Sum of Squares Hierarchy

    Authors: Goutham Rajendran

    Abstract: We study the Sum of Squares (SoS) Hierarchy with a view towards combinatorial optimization. We survey the use of the SoS hierarchy to obtain approximation algorithms on graphs using their spectral properties. We present a simplified proof of the result of Feige and Krauthgamer on the performance of the hierarchy for the Maximum Clique problem on random graphs. We also present a result of Guruswami…

    Submitted 1 September, 2022; v1 submitted 8 August, 2022; originally announced August 2022.

    Comments: Master's thesis, University of Chicago

  15. arXiv:2206.10044  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Identifiability of deep generative models without auxiliary information

    Authors: Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: We prove identifiability of a broad class of deep latent variable models that (a) have universal approximation capabilities and (b) are the decoders of variational autoencoders that are commonly used in practice. Unlike existing work, our analysis does not require weak supervision, auxiliary information, or conditioning in the latent space. Specifically, we show that for a broad class of generativ…

    Submitted 18 October, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: 34 pages, 9 figures, to appear in NeurIPS 2022

  16. arXiv:2111.09250  [pdf, other]

    cs.CC

    Sum-of-Squares Lower Bounds for Sparse Independent Set

    Authors: Chris Jones, Aaron Potechin, Goutham Rajendran, Madhur Tulsiani, Jeff Xu

    Abstract: The Sum-of-Squares (SoS) hierarchy of semidefinite programs is a powerful algorithmic paradigm which captures state-of-the-art algorithmic guarantees for a wide array of problems. In the average case setting, SoS lower bounds provide strong evidence of algorithmic hardness or information-computation gaps. Prior to this work, SoS lower bounds have been obtained for problems in the "dense" input reg…

    Submitted 17 November, 2021; originally announced November 2021.

  17. arXiv:2110.04719  [pdf, other]

    cs.LG cs.AI stat.ML

    Structure learning in polynomial time: Greedy algorithms, Bregman information, and exponential families

    Authors: Goutham Rajendran, Bohdan Kivva, Ming Gao, Bryon Aragam

    Abstract: Greedy algorithms have long been a workhorse for learning graphical models, and more broadly for learning statistical models with sparse structure. In the context of learning directed acyclic graphs, greedy algorithms are popular despite their worst-case exponential runtime. In practice, however, they are very efficient. We provide new insight into this phenomenon by studying a general greedy scor…

    Submitted 28 October, 2021; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021; 27 pages, 9 figures

  18. arXiv:2106.15563  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning latent causal graphs via mixture oracles

    Authors: Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: We study the problem of reconstructing a causal graphical model from data in the presence of latent variables. The main problem of interest is recovering the causal structure over the latent variables while allowing for general, potentially nonlinear dependence between the variables. In many practical problems, the dependence between raw observations (e.g. pixels in an image) is much less relevant…

    Submitted 21 November, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: To appear at NeurIPS 2021. 41 pages

  19. arXiv:2011.04253  [pdf, other]

    cs.CC

    Machinery for Proving Sum-of-Squares Lower Bounds on Certification Problems

    Authors: Aaron Potechin, Goutham Rajendran

    Abstract: In this paper, we construct general machinery for proving Sum-of-Squares lower bounds on certification problems by generalizing the techniques used by Barak et al. [FOCS 2016] to prove Sum-of-Squares lower bounds for planted clique. Using this machinery, we prove degree $n^ε$ Sum-of-Squares lower bounds for tensor PCA, the Wishart model of sparse PCA, and a variant of planted clique which we call…

    Submitted 9 February, 2023; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: 143 pages

  20. arXiv:2009.01874  [pdf, other]

    cs.CC math.CO

    Sum-of-Squares Lower Bounds for Sherrington-Kirkpatrick via Planted Affine Planes

    Authors: Mrinalkanti Ghosh, Fernando Granha Jeronimo, Chris Jones, Aaron Potechin, Goutham Rajendran

    Abstract: The Sum-of-Squares (SoS) hierarchy is a semi-definite programming meta-algorithm that captures state-of-the-art polynomial time guarantees for many optimization problems such as Max-$k$-CSPs and Tensor PCA. On the flip side, a SoS lower bound provides evidence of hardness, which is particularly relevant to average-case problems for which NP-hardness may not be available. In this paper, we consid…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: 68 pages