
Showing 1–8 of 8 results for author: Dhekane, E

  1. arXiv:2403.05490  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT stat.ML

    Poly-View Contrastive Learning

    Authors: Amitis Shidani, Devon Hjelm, Jason Ramapuram, Russ Webb, Eeshan Gunesh Dhekane, Dan Busbridge

    Abstract: Contrastive learning typically matches pairs of related views among a number of unrelated negative views. Views can be generated (e.g. by augmentations) or be observed. We investigate matching when there are more than two related views which we call poly-view tasks, and derive new representation learning objectives using information maximization and sufficient statistics. We show that with unlimit…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024. 42 pages, 7 figures, 3 tables, loss pseudo-code included in appendix

  2. arXiv:2312.03213  [pdf, other]

    cs.LG stat.ML

    Bootstrap Your Own Variance

    Authors: Polina Turishcheva, Jason Ramapuram, Sinead Williamson, Dan Busbridge, Eeshan Dhekane, Russ Webb

    Abstract: Understanding model uncertainty is important for many applications. We propose Bootstrap Your Own Variance (BYOV), combining Bootstrap Your Own Latent (BYOL), a negative-free Self-Supervised Learning (SSL) algorithm, with Bayes by Backprop (BBB), a Bayesian method for estimating model posteriors. We find that the learned predictive std of BYOV vs. a supervised BBB model is well captured by a Gauss…

    Submitted 5 December, 2023; originally announced December 2023.

    Journal ref: NeurIPS 2023 Workshop: Self-Supervised Learning - Theory and Practice

  3. arXiv:2307.13813  [pdf, other]

    stat.ML cs.AI cs.LG

    How to Scale Your EMA

    Authors: Dan Busbridge, Jason Ramapuram, Pierre Ablin, Tatiana Likhomanenko, Eeshan Gunesh Dhekane, Xavier Suau, Russ Webb

    Abstract: Preserving training dynamics across batch sizes is an important tool for practical machine learning as it enables the trade-off between batch size and wall-clock time. This trade-off is typically enabled by a scaling rule, for example, in stochastic gradient descent, one should scale the learning rate linearly with the batch size. Another important machine learning tool is the model EMA, a functio…

    Submitted 7 November, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Spotlight at NeurIPS 2023, 53 pages, 32 figures, 17 tables

  4. arXiv:2210.16365  [pdf, other]

    cs.LG

    Elastic Weight Consolidation Improves the Robustness of Self-Supervised Learning Methods under Transfer

    Authors: Andrius Ovsianas, Jason Ramapuram, Dan Busbridge, Eeshan Gunesh Dhekane, Russ Webb

    Abstract: Self-supervised representation learning (SSL) methods provide an effective label-free initial condition for fine-tuning downstream tasks. However, in numerous realistic scenarios, the downstream task might be biased with respect to the target label distribution. This in turn moves the learned fine-tuned model posterior away from the initial (label) bias-free self-supervised model posterior. In thi…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022 Workshop: Self-Supervised Learning - Theory and Practice

  5. arXiv:2105.01119  [pdf, other]

    cs.LG

    Iterated learning for emergent systematicity in VQA

    Authors: Ankit Vani, Max Schwarzer, Yuchen Lu, Eeshan Dhekane, Aaron Courville

    Abstract: Although neural module networks have an architectural bias towards compositionality, they require gold standard layouts to generalize systematically in practice. When instead learning layouts and modules jointly, compositionality does not arise automatically and an explicit pressure is necessary for the emergence of layouts exhibiting the right structure. We propose to address this problem using i…

    Submitted 3 May, 2021; originally announced May 2021.

    Comments: Published as a conference paper at ICLR 2021. 9 pages main, 21 pages total including references and appendix

    ACM Class: I.2.6

    Journal ref: 9th International Conference on Learning Representations (ICLR 2021)

  6. arXiv:1906.03574  [pdf, other]

    cs.LG cs.AI stat.ML

    Transfer Learning by Modeling a Distribution over Policies

    Authors: Disha Shrivastava, Eeshan Gunesh Dhekane, Riashat Islam

    Abstract: Exploration and adaptation to new tasks in a transfer learning setup is a central challenge in reinforcement learning. In this work, we build on the idea of modeling a distribution over policies in a Bayesian deep reinforcement learning setup to propose a transfer strategy. Recent works have shown to induce diversity in the learned policies by maximizing the entropy of a distribution of policies (…

    Submitted 9 June, 2019; originally announced June 2019.

    Comments: Accepted at the ICML 2019 workshop on Multi-Task and Lifelong Reinforcement Learning

  7. arXiv:1905.04866  [pdf, other]

    cs.LG stat.ML

    Hierarchical Importance Weighted Autoencoders

    Authors: Chin-Wei Huang, Kris Sankaran, Eeshan Dhekane, Alexandre Lacoste, Aaron Courville

    Abstract: Importance weighted variational inference (Burda et al., 2015) uses multiple i.i.d. samples to have a tighter variational lower bound. We believe a joint proposal has the potential of reducing the number of redundant samples, and introduce a hierarchical structure to induce correlation. The hope is that the proposals would coordinate to make up for the error made by one another to reduce the varia…

    Submitted 13 May, 2019; originally announced May 2019.

    Comments: Accepted by ICML 2019. 17 pages

  8. arXiv:1904.00150  [pdf, other]

    cs.MM cs.LG cs.SD eess.AS

    Learning Affective Correspondence between Music and Image

    Authors: Gaurav Verma, Eeshan Gunesh Dhekane, Tanaya Guha

    Abstract: We introduce the problem of learning affective correspondence between audio (music) and visual data (images). For this task, a music clip and an image are considered similar (having true correspondence) if they have similar emotion content. In order to estimate this crossmodal, emotion-centric similarity, we propose a deep neural network architecture that learns to project the data from the two mo…

    Submitted 16 April, 2019; v1 submitted 30 March, 2019; originally announced April 2019.

    Comments: 5 pages, International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019