Showing 1–4 of 4 results for author: Pumir, T

  1. arXiv:2406.02052  [pdf, other]

    cs.LG stat.ML

    PETRA: Parallel End-to-end Training with Reversible Architectures

    Authors: Stéphane Rivaud, Louis Fournier, Thomas Pumir, Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon

    Abstract: Reversible architectures have been shown to be capable of performing on par with their non-reversible architectures, being applied in deep learning for memory savings and generative modeling. In this work, we show how reversible architectures can solve challenges in parallelizing deep model training. We introduce PETRA, a novel alternative to backpropagation for parallelizing gradient computations…

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2011.03358  [pdf, ps, other]

    math.OC math.NA

    Generalization of Quasi-Newton Methods: Application to Robust Symmetric Multisecant Updates

    Authors: Damien Scieur, Lewis Liu, Thomas Pumir, Nicolas Boumal

    Abstract: Quasi-Newton techniques approximate the Newton step by estimating the Hessian using the so-called secant equations. Some of these methods compute the Hessian using several secant equations but produce non-symmetric updates. Other quasi-Newton schemes, such as BFGS, enforce symmetry but cannot satisfy more than one secant equation. We propose a new type of quasi-Newton symmetric update using severa…

    Submitted 8 February, 2021; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: AISTATS 2021

  3. The generalized orthogonal Procrustes problem in the high noise regime

    Authors: Thomas Pumir, Amit Singer, Nicolas Boumal

    Abstract: We consider the problem of estimating a cloud of points from numerous noisy observations of that cloud after unknown rotations, and possibly reflections. This is an instance of the general problem of estimation under group action, originally inspired by applications in 3-D imaging and computer vision. We focus on a regime where the noise level is larger than the magnitude of the signal, so much so…

    Submitted 23 May, 2021; v1 submitted 1 July, 2019; originally announced July 2019.

    MSC Class: 34K30; 35K57; 35Q80; 92D25

    Journal ref: Information and Inference: A Journal of the IMA, iaaa035, 2021

  4. arXiv:1806.03763  [pdf, other]

    stat.ML cs.LG math.OC

    Smoothed analysis of the low-rank approach for smooth semidefinite programs

    Authors: Thomas Pumir, Samy Jelassi, Nicolas Boumal

    Abstract: We consider semidefinite programs (SDPs) of size $n$ with equality constraints. In order to overcome scalability issues, Burer and Monteiro proposed a factorized approach based on optimizing over a matrix $Y$ of size $n$ by $k$ such that $X = YY^*$ is the SDP variable. The advantages of such formulation are twofold: the dimension of the optimization variable is reduced and positive semidefiniteness is…

    Submitted 27 November, 2018; v1 submitted 10 June, 2018; originally announced June 2018.