Showing 1–6 of 6 results for author: Ryali, C

  1. arXiv:2311.05613  [pdf, other]

    cs.CV

    Window Attention is Bugged: How not to Interpolate Position Embeddings

    Authors: Daniel Bolya, Chaitanya Ryali, Judy Hoffman, Christoph Feichtenhofer

    Abstract: Window attention, position embeddings, and high resolution finetuning are core concepts in the modern transformer era of computer vision. However, we find that naively combining these near ubiquitous components can have a detrimental effect on performance. The issue is simple: interpolating position embeddings while using window attention is wrong. We study two state-of-the-art methods that have t…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Preprint. Code release will be coming in the future

  2. arXiv:2306.00989  [pdf, other]

    cs.CV cs.LG

    Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

    Authors: Chaitanya Ryali, Yuan-Ting Hu, Daniel Bolya, Chen Wei, Haoqi Fan, Po-Yao Huang, Vaibhav Aggarwal, Arkabandhu Chowdhury, Omid Poursaeed, Judy Hoffman, Jitendra Malik, Yanghao Li, Christoph Feichtenhofer

    Abstract: Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance. While these components lead to effective accuracies and attractive FLOP counts, the added complexity actually makes these transformers slower than their vanilla ViT counterparts. In this paper, we argue that this additional bulk is unnecessary. By pretraini…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: ICML 2023 Oral version. Code+Models: https://github.com/facebookresearch/hiera

  3. arXiv:2212.08071  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    MAViL: Masked Audio-Video Learners

    Authors: Po-Yao Huang, Vasu Sharma, Hu Xu, Chaitanya Ryali, Haoqi Fan, Yanghao Li, Shang-Wen Li, Gargi Ghosh, Jitendra Malik, Christoph Feichtenhofer

    Abstract: We present Masked Audio-Video Learners (MAViL) to train audio-visual representations. Our approach learns with three complementary forms of self-supervision: (1) reconstruction of masked audio and video input data, (2) intra- and inter-modal contrastive learning with masking, and (3) self-training by reconstructing joint audio-video contextualized features learned from the first two objectives. Pr…

    Submitted 17 July, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: Technical report

  4. arXiv:2103.12719  [pdf, other]

    cs.CV cs.AI

    Characterizing and Improving the Robustness of Self-Supervised Learning through Background Augmentations

    Authors: Chaitanya K. Ryali, David J. Schwab, Ari S. Morcos

    Abstract: Recent progress in self-supervised learning has demonstrated promising results in multiple visual tasks. An important ingredient in high-performing self-supervised methods is the use of data augmentation by training models to place different augmented views of the same image nearby in embedding space. However, commonly used augmentation pipelines treat images holistically, ignoring the semantic re…

    Submitted 12 November, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: Technical Report; Additional Results

  5. arXiv:2101.06887  [pdf, other]

    cs.CL cs.LG cs.NE q-bio.NC stat.ML

    Can a Fruit Fly Learn Word Embeddings?

    Authors: Yuchen Liang, Chaitanya K. Ryali, Benjamin Hoover, Leopold Grinberg, Saket Navlakha, Mohammed J. Zaki, Dmitry Krotov

    Abstract: The mushroom body of the fruit fly brain is one of the best studied systems in neuroscience. At its core it consists of a population of Kenyon cells, which receive inputs from multiple sensory modalities. These cells are inhibited by the anterior paired lateral neuron, thus creating a sparse high dimensional representation of the inputs. In this work we study a mathematical formalization of this n…

    Submitted 14 March, 2021; v1 submitted 18 January, 2021; originally announced January 2021.

    Comments: Accepted for publication at ICLR 2021

  6. arXiv:2001.04907  [pdf, other]

    cs.LG cs.DB cs.IR q-bio.NC stat.ML

    Bio-Inspired Hashing for Unsupervised Similarity Search

    Authors: Chaitanya K. Ryali, John J. Hopfield, Leopold Grinberg, Dmitry Krotov

    Abstract: The fruit fly Drosophila's olfactory circuit has inspired a new locality sensitive hashing (LSH) algorithm, FlyHash. In contrast with classical LSH algorithms that produce low dimensional hash codes, FlyHash produces sparse high-dimensional hash codes and has also been shown to have superior empirical performance compared to classical LSH algorithms in similarity search. However, FlyHash uses rand…

    Submitted 30 June, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: Accepted for publication in ICML 2020

    Journal ref: Proceedings of the International Conference on Machine Learning, 2020, pp.8739-8750