Showing 1–8 of 8 results for author: Webb, R

  1. arXiv:2403.05490

    cs.LG cs.AI cs.CV cs.IT stat.ML

    Poly-View Contrastive Learning

    Authors: Amitis Shidani, Devon Hjelm, Jason Ramapuram, Russ Webb, Eeshan Gunesh Dhekane, Dan Busbridge

    Abstract: Contrastive learning typically matches pairs of related views among a number of unrelated negative views. Views can be generated (e.g. by augmentations) or be observed. We investigate matching when there are more than two related views which we call poly-view tasks, and derive new representation learning objectives using information maximization and sufficient statistics. We show that with unlimit…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024. 42 pages, 7 figures, 3 tables, loss pseudo-code included in appendix

  2. arXiv:2312.03213

    cs.LG stat.ML

    Bootstrap Your Own Variance

    Authors: Polina Turishcheva, Jason Ramapuram, Sinead Williamson, Dan Busbridge, Eeshan Dhekane, Russ Webb

    Abstract: Understanding model uncertainty is important for many applications. We propose Bootstrap Your Own Variance (BYOV), combining Bootstrap Your Own Latent (BYOL), a negative-free Self-Supervised Learning (SSL) algorithm, with Bayes by Backprop (BBB), a Bayesian method for estimating model posteriors. We find that the learned predictive std of BYOV vs. a supervised BBB model is well captured by a Gauss…

    Submitted 5 December, 2023; originally announced December 2023.

    Journal ref: NeurIPS 2023 Workshop: Self-Supervised Learning - Theory and Practice

  3. arXiv:2307.13813

    stat.ML cs.AI cs.LG

    How to Scale Your EMA

    Authors: Dan Busbridge, Jason Ramapuram, Pierre Ablin, Tatiana Likhomanenko, Eeshan Gunesh Dhekane, Xavier Suau, Russ Webb

    Abstract: Preserving training dynamics across batch sizes is an important tool for practical machine learning as it enables the trade-off between batch size and wall-clock time. This trade-off is typically enabled by a scaling rule, for example, in stochastic gradient descent, one should scale the learning rate linearly with the batch size. Another important machine learning tool is the model EMA, a functio…

    Submitted 7 November, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Spotlight at NeurIPS 2023, 53 pages, 32 figures, 17 tables

  4. arXiv:2110.00528

    cs.CV cs.LG stat.ML

    Do Self-Supervised and Supervised Methods Learn Similar Visual Representations?

    Authors: Tom George Grigg, Dan Busbridge, Jason Ramapuram, Russ Webb

    Abstract: Despite the success of a number of recent techniques for visual self-supervised deep learning, there has been limited investigation into the representations that are ultimately learned. By leveraging recent advances in the comparison of neural representations, we explore in this direction by comparing a contrastive self-supervised algorithm to supervision for simple image data in a common architec…

    Submitted 2 December, 2021; v1 submitted 1 October, 2021; originally announced October 2021.

    Comments: Accepted to 2nd Workshop on Self-Supervised Learning: Theory and Practice (NeurIPS 2021), Sydney, Australia. Fixed typos, added acknowledgements. 5 pages + 2 pages of appendices, 5 figures, 1 table

  5. arXiv:1912.08444

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Relational Mimic for Visual Adversarial Imitation Learning

    Authors: Lionel Blondé, Yichuan Charlie Tang, Jian Zhang, Russ Webb

    Abstract: In this work, we introduce a new method for imitation learning from video demonstrations. Our method, Relational Mimic (RM), improves on previous visual imitation learning methods by combining generative adversarial networks and relational learning. RM is flexible and can be used in conjunction with other recent advances in generative adversarial imitation learning to better address the need for m…

    Submitted 18 December, 2019; originally announced December 2019.

  6. arXiv:1905.03658

    cs.LG cs.AI cs.CV stat.ML

    Improving Discrete Latent Representations With Differentiable Approximation Bridges

    Authors: Jason Ramapuram, Russ Webb

    Abstract: Modern neural network training relies on piece-wise (sub-)differentiable functions in order to use backpropagation to update model parameters. In this work, we introduce a novel method to allow simple non-differentiable functions at intermediary layers of deep neural networks. We do so by training with a differentiable approximation bridge (DAB) neural network which approximates the non-differenti…

    Submitted 25 October, 2019; v1 submitted 9 May, 2019; originally announced May 2019.

  7. arXiv:1812.03170

    cs.CV cs.LG stat.ML

    Variational Saccading: Efficient Inference for Large Resolution Images

    Authors: Jason Ramapuram, Maurits Diephuis, Frantzeska Lavda, Russ Webb, Alexandros Kalousis

    Abstract: Image classification with deep neural networks is typically restricted to images of small dimensionality such as 224 x 244 in Resnet models [24]. This limitation excludes the 4000 x 3000 dimensional images that are taken by modern smartphone cameras and smart devices. In this work, we aim to mitigate the prohibitive inferential and memory costs of operating in such large dimensional spaces. To sam…

    Submitted 6 September, 2019; v1 submitted 8 December, 2018; originally announced December 2018.

    Comments: Published BMVC 2019 & NIPS 2018 Bayesian Deep Learning Workshop

  8. arXiv:1807.00126

    cs.LG stat.ML

    A New Benchmark and Progress Toward Improved Weakly Supervised Learning

    Authors: Jason Ramapuram, Russ Webb

    Abstract: Knowledge Matters: Importance of Prior Information for Optimization [7], by Gulcehre et al., sought to establish the limits of current black-box, deep learning techniques by posing problems which are difficult to learn without engineering knowledge into the model or training procedure. In our work, we completely solve the previous Knowledge Matters problem using a generic model, pose a more diffi…

    Submitted 18 September, 2018; v1 submitted 30 June, 2018; originally announced July 2018.