Showing 1–6 of 6 results for author: Choraria, M

  1. arXiv:2311.07449

    cs.CV

    Language Grounded QFormer for Efficient Vision Language Understanding

    Authors: Moulik Choraria, Nitesh Sekhar, Yue Wu, Xu Zhang, Prateek Singhal, Lav R. Varshney

    Abstract: Large-scale pretraining and instruction tuning have been successful for training general-purpose language models with broad competencies. However, extending to general-purpose vision-language models is challenging due to the distributional diversity in visual inputs. A recent line of work explores vision-language instruction tuning, taking inspiration from the Query Transformer (QFormer) approach…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Preprint Under Review

  2. arXiv:2307.07843

    cs.LG cs.CL

    Transformers are Universal Predictors

    Authors: Sourya Basu, Moulik Choraria, Lav R. Varshney

    Abstract: We find limits to the Transformer architecture for language modeling and show it has a universal prediction property in an information-theoretic sense. We further analyze performance in non-asymptotic data regimes to understand the role of various components of the Transformer architecture, especially in the context of data-efficient training. We validate our theoretical analysis with experiments…

    Submitted 15 July, 2023; originally announced July 2023.

    Comments: Neural Compression Workshop (ICML 2023)

  3. arXiv:2301.12067

    cs.LG cs.CV

    Learning Optimal Features via Partial Invariance

    Authors: Moulik Choraria, Ibtihal Ferwana, Ankur Mani, Lav R. Varshney

    Abstract: Learning models that are robust to distribution shifts is a key concern in the context of their real-life applicability. Invariant Risk Minimization (IRM) is a popular framework that aims to learn robust models from multiple environments. The success of IRM requires an important assumption: the underlying causal mechanisms/features remain invariant across environments. When not satisfied, we show…

    Submitted 3 April, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: Presented at the 37th AAAI Conference on Artificial Intelligence, 2023

  4. arXiv:2202.13473

    cs.LG cs.CV

    The Spectral Bias of Polynomial Neural Networks

    Authors: Moulik Choraria, Leello Tadesse Dadi, Grigorios Chrysos, Julien Mairal, Volkan Cevher

    Abstract: Polynomial neural networks (PNNs) have been recently shown to be particularly effective at image generation and face recognition, where high-frequency information is critical. Previous studies have revealed that neural networks demonstrate a $\textit{spectral bias}$ towards low-frequency functions, which yields faster learning of low-frequency components during training. Inspired by such studies,…

    Submitted 27 February, 2022; originally announced February 2022.

    Comments: Accepted at the International Conference on Learning Representations (ICLR) 2022

  5. arXiv:2112.09346

    cs.LG

    Balancing Fairness and Robustness via Partial Invariance

    Authors: Moulik Choraria, Ibtihal Ferwana, Ankur Mani, Lav R. Varshney

    Abstract: The Invariant Risk Minimization (IRM) framework aims to learn invariant features from a set of environments for solving the out-of-distribution (OOD) generalization problem. The underlying assumption is that the causal components of the data generating distributions remain constant across the environments or alternately, the data "overlaps" across environments to find meaningful invariant features…

    Submitted 24 December, 2021; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: Accepted at the Algorithmic Fairness through the Lens of Causality and Robustness (AFCR) Workshop, NeurIPS 2021

  6. arXiv:2101.05567

    eess.SY cs.MA

    Design of false data injection attack on distributed process estimation

    Authors: Moulik Choraria, Arpan Chattopadhyay, Urbashi Mitra, Erik Strom

    Abstract: Herein, design of false data injection attack on a distributed cyber-physical system is considered. A stochastic process with linear dynamics and Gaussian noise is measured by multiple agent nodes, each equipped with multiple sensors. The agent nodes form a multi-hop network among themselves. Each agent node computes an estimate of the process by using its sensor observation and messages obtained…

    Submitted 14 January, 2021; originally announced January 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2002.01545