Showing 1–6 of 6 results for author: Assran, M

  1. arXiv:2302.14483  [pdf, other]

    cs.LG cs.CV stat.ML

    RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data

    Authors: Sangwoo Mo, Jong-Chyi Su, Chih-Yao Ma, Mido Assran, Ishan Misra, Licheng Yu, Sean Bell

    Abstract: Semi-supervised learning aims to train a model using limited labels. State-of-the-art semi-supervised methods for image classification such as PAWS rely on self-supervised representations learned with large-scale unlabeled but curated data. However, PAWS is often less effective when using real-world unlabeled data that is uncurated, e.g., contains out-of-class data. We propose RoPAWS, a robust ext…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: ICLR 2023

  2. arXiv:2006.13838  [pdf, other]

    cs.LG math.OC stat.ML

    Advances in Asynchronous Parallel and Distributed Optimization

    Authors: Mahmoud Assran, Arda Aytekin, Hamid Feyzmahdavian, Mikael Johansson, Michael Rabbat

    Abstract: Motivated by large-scale optimization problems arising in the context of machine learning, there have been several advances in the study of asynchronous parallel and distributed optimization methods during the past decade. Asynchronous methods do not require all processors to maintain a consistent view of the optimization variables. Consequently, they generally can make more efficient use of compu…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: 33 pages, 4 figures

  3. arXiv:2006.10803  [pdf, other]

    cs.LG cs.CV stat.ML

    Supervision Accelerates Pre-training in Contrastive Semi-Supervised Learning of Visual Representations

    Authors: Mahmoud Assran, Nicolas Ballas, Lluis Castrejon, Michael Rabbat

    Abstract: We investigate a strategy for improving the efficiency of contrastive learning of visual representations by leveraging a small amount of supervised information during pre-training. We propose a semi-supervised loss, SuNCEt, based on noise-contrastive estimation and neighbourhood component analysis, that aims to distinguish examples of different classes in addition to the self-supervised instance-w…

    Submitted 1 December, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

  4. arXiv:2002.12414  [pdf, other]

    cs.LG math.OC stat.ML

    On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings

    Authors: Mahmoud Assran, Michael Rabbat

    Abstract: We study Nesterov's accelerated gradient method with constant step-size and momentum parameters in the stochastic approximation setting (unbiased gradients with bounded variance) and the finite-sum setting (where randomness is due to sampling mini-batches). To build better insight into the behavior of Nesterov's method in stochastic settings, we focus throughout on objectives that are smooth, stro…

    Submitted 27 June, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    Journal ref: International Conference on Machine Learning (ICML 2020)

  5. arXiv:1906.04585  [pdf, other]

    cs.LG cs.AI cs.MA math.OC stat.ML

    Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

    Authors: Mahmoud Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, Michael Rabbat

    Abstract: Multi-simulator training has contributed to the recent success of Deep Reinforcement Learning by stabilizing learning and allowing for higher training throughputs. We propose Gossip-based Actor-Learner Architectures (GALA) where several actor-learners (such as A2C agents) are organized in a peer-to-peer communication topology, and exchange information through asynchronous gossip in order to take a…

    Submitted 21 April, 2020; v1 submitted 9 June, 2019; originally announced June 2019.

    Journal ref: Advances in Neural Information Processing Systems (2019) 13299-13309

  6. arXiv:1811.10792  [pdf, other]

    cs.LG cs.AI cs.DC cs.MA math.OC stat.ML

    Stochastic Gradient Push for Distributed Deep Learning

    Authors: Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, Michael Rabbat

    Abstract: Distributed data-parallel algorithms aim to accelerate the training of deep neural networks by parallelizing the computation of large mini-batch gradient updates across multiple nodes. Approaches that synchronize nodes using exact distributed averaging (e.g., via AllReduce) are sensitive to stragglers and communication delays. The PushSum gossip algorithm is robust to these issues, but only perfor…

    Submitted 14 May, 2019; v1 submitted 26 November, 2018; originally announced November 2018.

    Comments: ICML 2019

    Journal ref: International Conference on Machine Learning 97 (2019) 344-353