Showing 1–3 of 3 results for author: Aladago, M

  1. arXiv:2407.01408  [pdf, other]

    cs.CV cs.AI cs.LG

    Semantic Compositions Enhance Vision-Language Contrastive Learning

    Authors: Maxwell Aladago, Lorenzo Torresani, Soroush Vosoughi

    Abstract: In the field of vision-language contrastive learning, models such as CLIP capitalize on matched image-caption pairs as positive examples and leverage within-batch non-matching pairs as negatives. This approach has led to remarkable outcomes in zero-shot image classification, cross-modal retrieval, and linear evaluation tasks. We show that the zero-shot classification and retrieval capabilities of…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2212.01447  [pdf, other]

    cs.CV cs.LG

    Compound Tokens: Channel Fusion for Vision-Language Representation Learning

    Authors: Maxwell Mbabilla Aladago, AJ Piergiovanni

    Abstract: We present an effective method for fusing visual-and-language representations for several question answering tasks including visual question answering and visual entailment. In contrast to prior works that concatenate unimodal representations or use only cross-attention, we compose multimodal representations via channel fusion. By fusing on the channels, the model is able to more effectively align…

    Submitted 2 December, 2022; originally announced December 2022.

  3. arXiv:2101.06475  [pdf, other]

    cs.LG cs.AI

    Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks

    Authors: Maxwell Mbabilla Aladago, Lorenzo Torresani

    Abstract: In contrast to traditional weight optimization in a continuous space, we demonstrate the existence of effective random networks whose weights are never updated. By selecting a weight among a fixed set of random values for each individual connection, our method uncovers combinations of random weights that match the performance of traditionally-trained networks of the same capacity. We refer to our…

    Submitted 8 June, 2021; v1 submitted 16 January, 2021; originally announced January 2021.