
Showing 1–8 of 8 results for author: Mulayoff, R

  1. arXiv:2405.15719

    cs.CV cs.LG eess.IV stat.ML

    Hierarchical Uncertainty Exploration via Feedforward Posterior Trees

    Authors: Elias Nehme, Rotem Mulayoff, Tomer Michaeli

    Abstract: When solving ill-posed inverse problems, one often desires to explore the space of potential solutions rather than be presented with a single plausible reconstruction. Valuable insights into these feasible solutions and their associated probabilities are embedded in the posterior distribution. However, when confronted with data of high dimensionality (such as images), visualizing this distribution…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 32 pages, 21 figures

  2. arXiv:2402.13810

    cs.LG

    The Expected Loss of Preconditioned Langevin Dynamics Reveals the Hessian Rank

    Authors: Amitay Bar, Rotem Mulayoff, Tomer Michaeli, Ronen Talmon

    Abstract: Langevin dynamics (LD) is widely used for sampling from distributions and for optimization. In this work, we derive a closed-form expression for the expected loss of preconditioned LD near stationary points of the objective function. We use the fact that at the vicinity of such points, LD reduces to an Ornstein-Uhlenbeck process, which is amenable to convenient mathematical treatment. Our analysis…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted to AAAI-24 main track

  3. arXiv:2306.17499

    cs.LG

    The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks

    Authors: Mor Shpigel Nacson, Rotem Mulayoff, Greg Ongie, Tomer Michaeli, Daniel Soudry

    Abstract: We study the type of solutions to which stochastic gradient descent converges when used to train a single hidden-layer multivariate ReLU network with the quadratic loss. Our results are based on a dynamical stability analysis. In the univariate case, it was shown that linearly stable minima correspond to network functions (predictors), whose second derivative has a bounded weighted $L^1$ norm. Not…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: Published at ICLR 2023. Fixed statements and proofs of Proposition 3 and Theorem 2

  4. arXiv:2306.07850

    cs.LG

    Exact Mean Square Linear Stability Analysis for SGD

    Authors: Rotem Mulayoff, Tomer Michaeli

    Abstract: The dynamical stability of optimization methods at the vicinity of minima of the loss has recently attracted significant attention. For gradient descent (GD), stable convergence is possible only to minima that are sufficiently flat w.r.t. the step size, and those have been linked with favorable properties of the trained model. However, while the stability threshold of GD is well-known, to date, no…

    Submitted 16 June, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: In Conference on Learning Theory (COLT) 2024

  5. arXiv:2303.11073

    cs.CV cs.LG

    Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models

    Authors: René Haas, Inbar Huberman-Spiegelglas, Rotem Mulayoff, Stella Graßhof, Sami S. Brandt, Tomer Michaeli

    Abstract: Denoising Diffusion Models (DDMs) have emerged as a strong competitor to Generative Adversarial Networks (GANs). However, despite their widespread use in image synthesis and editing applications, their latent space is still not as well understood. Recently, a semantic latent space for DDMs, coined `$h$-space', was shown to facilitate semantic image editing in a way reminiscent of GANs. The $h$-spa…

    Submitted 29 May, 2024; v1 submitted 20 March, 2023; originally announced March 2023.

  6. arXiv:2004.04386

    cs.LG stat.ML

    Spectral Discovery of Jointly Smooth Features for Multimodal Data

    Authors: Felix Dietrich, Or Yair, Rotem Mulayoff, Ronen Talmon, Ioannis G. Kevrekidis

    Abstract: In this paper, we propose a spectral method for deriving functions that are jointly smooth on multiple observed manifolds. This allows us to register measurements of the same phenomenon by heterogeneous sensors, and to reject sensor-specific noise. Our method is unsupervised and primarily consists of two steps. First, using kernels, we obtain a subspace spanning smooth functions on each separate m…

    Submitted 29 April, 2021; v1 submitted 9 April, 2020; originally announced April 2020.

    MSC Class: 46Nxx; 47Nxx; 35K08; 58C05

  7. arXiv:2002.04710

    cs.LG stat.ML

    Unique Properties of Flat Minima in Deep Networks

    Authors: Rotem Mulayoff, Tomer Michaeli

    Abstract: It is well known that (stochastic) gradient descent has an implicit bias towards flat minima. In deep neural network training, this mechanism serves to screen out minima. However, the precise effect that this has on the trained network is not yet fully understood. In this paper, we characterize the flat minima in linear neural networks trained with a quadratic loss. First, we show that linear ResN…

    Submitted 8 August, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: Presented at ICML2020

  8. arXiv:1804.04897

    eess.SP cs.IT

    On the Minimal Overcompleteness Allowing Universal Sparse Representation

    Authors: Rotem Mulayoff, Tomer Michaeli

    Abstract: Sparse representation over redundant dictionaries constitutes a good model for many classes of signals (e.g., patches of natural images, segments of speech signals, etc.). However, despite its popularity, very little is known about the representation capacity of this model. In this paper, we study how redundant a dictionary must be so as to allow any vector to admit a sparse approximation with a p…

    Submitted 6 March, 2019; v1 submitted 13 April, 2018; originally announced April 2018.

    Comments: To appear in IEEE Transactions on Information Theory