Showing 1–50 of 64 results for author: Michaeli, T

  1. arXiv:2407.08674  [pdf, other]

    cs.CV

    Still-Moving: Customized Video Generation without Customized Video Data

    Authors: Hila Chefer, Shiran Zada, Roni Paiss, Ariel Ephrat, Omer Tov, Michael Rubinstein, Lior Wolf, Tali Dekel, Tomer Michaeli, Inbar Mosseri

    Abstract: Customizing text-to-image (T2I) models has seen tremendous progress recently, particularly in areas such as personalization, stylization, and conditional generation. However, expanding this progress to video generation is still in its infancy, primarily due to the lack of customized video data. In this work, we introduce Still-Moving, a novel generic framework for customizing a text-to-video (T2V)… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Webpage: https://still-moving.github.io/ | Video: https://www.youtube.com/watch?v=U7UuV_VIjnA

  2. arXiv:2407.08256  [pdf, other]

    eess.IV cs.CV cs.LG

    Adaptive Compressed Sensing with Diffusion-Based Posterior Sampling

    Authors: Noam Elata, Tomer Michaeli, Michael Elad

    Abstract: Compressed Sensing (CS) facilitates rapid image acquisition by selecting a small subset of measurements sufficient for high-fidelity reconstruction. Adaptive CS seeks to further enhance this process by dynamically choosing future measurements based on information gleaned from data that is already acquired. However, many existing frameworks are often tailored to specific tasks and require intricate… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Published in European Conference on Computer Vision (ECCV) 2024

  3. arXiv:2406.00828  [pdf, other]

    cs.CV

    Stealing Image-to-Image Translation Models With a Single Query

    Authors: Nurit Spingarn-Eliezer, Tomer Michaeli

    Abstract: Training deep neural networks requires significant computational resources and large datasets that are often confidential or expensive to collect. As a result, owners tend to protect their models by allowing access only via an API. Many works demonstrated the possibility of stealing such protected models by repeatedly querying the API. However, to date, research has predominantly focused on steali… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

  4. arXiv:2405.15719  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Hierarchical Uncertainty Exploration via Feedforward Posterior Trees

    Authors: Elias Nehme, Rotem Mulayoff, Tomer Michaeli

    Abstract: When solving ill-posed inverse problems, one often desires to explore the space of potential solutions rather than be presented with a single plausible reconstruction. Valuable insights into these feasible solutions and their associated probabilities are embedded in the posterior distribution. However, when confronted with data of high dimensionality (such as images), visualizing this distribution… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 32 pages, 21 figures

  5. arXiv:2405.13805  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Perceptual Fairness in Image Restoration

    Authors: Guy Ohayon, Michael Elad, Tomer Michaeli

    Abstract: Fairness in image restoration tasks is the desire to treat different sub-groups of images equally well. Existing definitions of fairness in image restoration are highly restrictive. They consider a reconstruction to be a correct outcome for a group (e.g., women) only if it falls within the group's set of ground truth images (e.g., natural images of women); otherwise, it is considered entirely inco… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  6. arXiv:2405.12211  [pdf, other]

    cs.CV

    Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices

    Authors: Nathaniel Cohen, Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli

    Abstract: Text-to-image (T2I) diffusion models achieve state-of-the-art results in image synthesis and editing. However, leveraging such pretrained models for video editing is considered a major challenge. Many existing works attempt to enforce temporal consistency in the edited video through explicit correspondence mechanisms, either in pixel space or between deep features. These methods, however, struggle… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: ICML 2024. Code and examples are available at https://matankleiner.github.io/slicedit/

  7. Exploring Communication Dynamics: Eye-tracking Analysis in Pair Programming of Computer Science Education

    Authors: Wunmin Jang, Hong Gao, Tilman Michaeli, Enkelejda Kasneci

    Abstract: Pair programming is widely recognized as an effective educational tool in computer science that promotes collaborative learning and mirrors real-world work dynamics. However, communication breakdowns within pairs significantly challenge this learning process. In this study, we use eye-tracking data recorded during pair programming sessions to study communication dynamics between various pair progr… ▽ More

    Submitted 2 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  8. arXiv:2402.13810  [pdf, other]

    cs.LG

    The Expected Loss of Preconditioned Langevin Dynamics Reveals the Hessian Rank

    Authors: Amitay Bar, Rotem Mulayoff, Tomer Michaeli, Ronen Talmon

    Abstract: Langevin dynamics (LD) is widely used for sampling from distributions and for optimization. In this work, we derive a closed-form expression for the expected loss of preconditioned LD near stationary points of the objective function. We use the fact that at the vicinity of such points, LD reduces to an Ornstein-Uhlenbeck process, which is amenable to convenient mathematical treatment. Our analysis… ▽ More

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted to AAAI-24 main track

  9. arXiv:2402.10095  [pdf, other]

    cs.LG

    Classification Diffusion Models: Revitalizing Density Ratio Estimation

    Authors: Shahar Yadin, Noam Elata, Tomer Michaeli

    Abstract: A prominent family of methods for learning data distributions relies on density ratio estimation (DRE), where a model is trained to $\textit{classify}$ between data samples and samples from some reference distribution. DRE-based models can directly output the likelihood for any given input, a highly desired property that is lacking in most generative techniques. Nevertheless, to date, DRE methods… ▽ More

    Submitted 12 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  10. arXiv:2402.10009  [pdf, other]

    cs.SD cs.LG eess.AS

    Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion

    Authors: Hila Manor, Tomer Michaeli

    Abstract: Editing signals using large pre-trained models, in a zero-shot manner, has recently seen rapid advancements in the image domain. However, this wave has yet to reach the audio domain. In this paper, we explore two zero-shot editing techniques for audio signals, which use DDPM inversion with pre-trained diffusion models. The first, which we coin ZEro-shot Text-based Audio (ZETA) editing, is adopted… ▽ More

    Submitted 29 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: Accepted for ICML 2024; Examples and code available in https://hilamanor.github.io/AudioEditing/

  11. arXiv:2401.12945  [pdf, other]

    cs.CV

    Lumiere: A Space-Time Diffusion Model for Video Generation

    Authors: Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Guanghui Liu, Amit Raj, Yuanzhen Li, Michael Rubinstein, Tomer Michaeli, Oliver Wang, Deqing Sun, Tali Dekel, Inbar Mosseri

    Abstract: We introduce Lumiere -- a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion -- a pivotal challenge in video synthesis. To this end, we introduce a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, through a single pass in the model. This is in contrast to existing video models which synth… ▽ More

    Submitted 5 February, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: Webpage: https://lumiere-video.github.io/ | Video: https://www.youtube.com/watch?v=wxLr02Dz2Sc

  12. arXiv:2312.07804  [pdf, other]

    cs.CV

    Uncertainty Visualization via Low-Dimensional Posterior Projections

    Authors: Omer Yair, Elias Nehme, Tomer Michaeli

    Abstract: In ill-posed inverse problems, it is commonly desirable to obtain insight into the full spectrum of plausible solutions, rather than extracting only a single reconstruction. Information about the plausible solutions and their likelihoods is encoded in the posterior distribution. However, for high-dimensional data, this distribution is challenging to visualize. In this work, we introduce a new appr… ▽ More

    Submitted 12 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  13. arXiv:2311.09253  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP

    The Perception-Robustness Tradeoff in Deterministic Image Restoration

    Authors: Guy Ohayon, Tomer Michaeli, Michael Elad

    Abstract: We study the behavior of deterministic methods for solving inverse problems in imaging. These methods are commonly designed to achieve two goals: (1) attaining high perceptual quality, and (2) generating reconstructions that are consistent with the measurements. We provide a rigorous proof that the better a predictor satisfies these two requirements, the larger its Lipschitz constant must be, rega… ▽ More

    Submitted 8 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  14. arXiv:2310.16047  [pdf, other]

    cs.CV cs.LG eess.IV

    From Posterior Sampling to Meaningful Diversity in Image Restoration

    Authors: Noa Cohen, Hila Manor, Yuval Bahat, Tomer Michaeli

    Abstract: Image restoration problems are typically ill-posed in the sense that each degraded image can be restored in infinitely many valid ways. To accommodate this, many works generate a diverse set of outputs by attempting to randomly sample from the posterior distribution of natural images given the degraded input. Here we argue that this strategy is commonly of limited practical value because of the he… ▽ More

    Submitted 11 March, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted for ICLR 2024. Code and examples are available at https://noa-cohen.github.io/MeaningfulDiversityInIR

  15. arXiv:2309.15533  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Uncertainty Quantification via Neural Posterior Principal Components

    Authors: Elias Nehme, Omer Yair, Tomer Michaeli

    Abstract: Uncertainty quantification is crucial for the deployment of image restoration models in safety-critical domains, like autonomous driving and biological imaging. To date, methods for uncertainty visualization have mainly focused on per-pixel estimates. Yet, a heatmap of per-pixel variances is typically of little practical use, as it does not capture the strong correlations between pixels. A more na… ▽ More

    Submitted 4 November, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: NeurIPS 2023 Camera Ready, interactive examples at https://eliasnehme.github.io/NPPC/

  16. arXiv:2309.13598  [pdf, other]

    cs.CV cs.LG stat.ML

    On the Posterior Distribution in Denoising: Application to Uncertainty Quantification

    Authors: Hila Manor, Tomer Michaeli

    Abstract: Denoisers play a central role in many applications, from noise suppression in low-grade imaging sensors, to empowering score-based generative models. The latter category of methods makes use of Tweedie's formula, which links the posterior mean in Gaussian denoising (i.e., the minimum MSE denoiser) with the score of the data distribution. Here, we derive a fundamental relation between the higher-orde… ▽ More

    Submitted 19 February, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: Accepted for ICLR 2024. Code and examples are available on the project webpage in https://hilamanor.github.io/GaussianDenoisingPosterior/

  17. arXiv:2306.17499  [pdf, other]

    cs.LG

    The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks

    Authors: Mor Shpigel Nacson, Rotem Mulayoff, Greg Ongie, Tomer Michaeli, Daniel Soudry

    Abstract: We study the type of solutions to which stochastic gradient descent converges when used to train a single hidden-layer multivariate ReLU network with the quadratic loss. Our results are based on a dynamical stability analysis. In the univariate case, it was shown that linearly stable minima correspond to network functions (predictors), whose second derivative has a bounded weighted $L^1$ norm. Not… ▽ More

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: Published at ICLR 2023. Fixed statements and proofs of Proposition 3 and Theorem 2

  18. arXiv:2306.07850  [pdf, other]

    cs.LG

    Exact Mean Square Linear Stability Analysis for SGD

    Authors: Rotem Mulayoff, Tomer Michaeli

    Abstract: The dynamical stability of optimization methods at the vicinity of minima of the loss has recently attracted significant attention. For gradient descent (GD), stable convergence is possible only to minima that are sufficiently flat w.r.t. the step size, and those have been linked with favorable properties of the trained model. However, while the stability threshold of GD is well-known, to date, no… ▽ More

    Submitted 16 June, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: In Conference on Learning Theory (COLT) 2024

  19. arXiv:2306.02400  [pdf, other]

    cs.LG cs.IT stat.ML

    Perceptual Kalman Filters: Online State Estimation under a Perfect Perceptual-Quality Constraint

    Authors: Dror Freirich, Tomer Michaeli, Ron Meir

    Abstract: Many practical settings call for the reconstruction of temporal signals from corrupted or missing data. Classic examples include decoding, tracking, signal enhancement and denoising. Since the reconstructed signals are ultimately viewed by humans, it is desirable to achieve reconstructions that are pleasing to human perception. Mathematically, perfect perceptual-quality is achieved when the distri… ▽ More

    Submitted 4 June, 2023; originally announced June 2023.

  20. arXiv:2306.02342  [pdf, other]

    cs.AI

    Deep Optimal Transport: A Practical Algorithm for Photo-realistic Image Restoration

    Authors: Theo Adrai, Guy Ohayon, Tomer Michaeli, Michael Elad

    Abstract: We propose an image restoration algorithm that can control the perceptual quality and/or the mean square error (MSE) of any pre-trained model, trading one over the other at test time. Our algorithm is few-shot: Given about a dozen images restored by the model, it can significantly improve the perceptual quality and/or the MSE of the model for newly restored images without further training. Our app… ▽ More

    Submitted 4 June, 2023; originally announced June 2023.

  21. arXiv:2305.19066  [pdf, other]

    cs.CV

    Nested Diffusion Processes for Anytime Image Generation

    Authors: Noam Elata, Bahjat Kawar, Tomer Michaeli, Michael Elad

    Abstract: Diffusion models are the current state-of-the-art in image generation, synthesizing high-quality images by breaking down the generation process into many fine-grained denoising steps. Despite their good performance, diffusion models are computationally expensive, requiring many neural function evaluations (NFEs). In this work, we propose an anytime diffusion-based method that can generate viable i… ▽ More

    Submitted 30 October, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  22. arXiv:2305.13128  [pdf, other]

    eess.IV cs.CV cs.LG

    GSURE-Based Diffusion Model Training with Corrupted Data

    Authors: Bahjat Kawar, Noam Elata, Tomer Michaeli, Michael Elad

    Abstract: Diffusion models have demonstrated impressive results in both data generation and downstream tasks such as inverse problems, text-based editing, classification, and more. However, training such models usually requires large amounts of clean signals which are often difficult or impossible to obtain. In this work, we propose a novel training technique for generative diffusion models based only on co… ▽ More

    Submitted 13 June, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Code: https://github.com/bahjat-kawar/gsure-diffusion

  23. arXiv:2305.06450  [pdf]

    cs.CY cs.AI

    What Students Can Learn About Artificial Intelligence -- Recommendations for K-12 Computing Education

    Authors: Tilman Michaeli, Stefan Seegerer, Ralf Romeike

    Abstract: Technological advances in the context of digital transformation are the basis for rapid developments in the field of artificial intelligence (AI). Although AI is not a new topic in computer science (CS), recent developments are having an immense impact on everyday life and society. In consequence, everyone needs competencies to be able to adequately and competently analyze, discuss and help shape… ▽ More

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: To be published in the proceedings of the World Conference on Computers in Education, WCCE 2022

    ACM Class: K.3.2; I.2.0

  24. arXiv:2305.06442  [pdf, other]

    cs.CY cs.LG

    Data, Trees, and Forests -- Decision Tree Learning in K-12 Education

    Authors: Tilman Michaeli, Stefan Seegerer, Lennard Kerber, Ralf Romeike

    Abstract: As a consequence of the increasing influence of machine learning on our lives, everyone needs competencies to understand corresponding phenomena, but also to get involved in shaping our world and making informed decisions regarding the influences on our society. Therefore, in K-12 education, students need to learn about core ideas and principles of machine learning. However, for this target group,… ▽ More

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: To be published in the proceedings of the 3rd Teaching in Machine Learning Workshop, PMLR, 2022

    ACM Class: K.3.2; I.2.6

  25. arXiv:2304.06140  [pdf, other]

    cs.CV cs.LG

    An Edit Friendly DDPM Noise Space: Inversion and Manipulations

    Authors: Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli

    Abstract: Denoising diffusion probabilistic models (DDPMs) employ a sequence of white Gaussian noise samples to generate an image. In analogy with GANs, those noise maps could be considered as the latent code associated with the generated image. However, this native noise space does not possess a convenient structure, and is thus challenging to work with in editing tasks. Here, we propose an alternative lat… ▽ More

    Submitted 9 April, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: CVPR 2024. Code and examples are available at https://github.com/inbarhub/DDPM_inversion

  26. arXiv:2303.11073  [pdf, other]

    cs.CV cs.LG

    Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models

    Authors: René Haas, Inbar Huberman-Spiegelglas, Rotem Mulayoff, Stella Graßhof, Sami S. Brandt, Tomer Michaeli

    Abstract: Denoising Diffusion Models (DDMs) have emerged as a strong competitor to Generative Adversarial Networks (GANs). However, despite their widespread use in image synthesis and editing applications, their latent space is still not as well understood. Recently, a semantic latent space for DDMs, coined `$h$-space', was shown to facilitate semantic image editing in a way reminiscent of GANs. The $h$-spa… ▽ More

    Submitted 29 May, 2024; v1 submitted 20 March, 2023; originally announced March 2023.

  27. arXiv:2303.08085  [pdf, other]

    cs.CV eess.IV

    Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations

    Authors: Hagay Michaeli, Tomer Michaeli, Daniel Soudry

    Abstract: Although CNNs are believed to be invariant to translations, recent works have shown this is not the case, due to aliasing effects that stem from downsampling layers. The existing architectural solutions to prevent aliasing are partial since they do not solve these effects, that originate in non-linearities. We propose an extended anti-aliasing method that tackles both downsampling and non-linear l… ▽ More

    Submitted 15 March, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: The paper was accepted to CVPR 2023. Our code is available at https://github.com/hmichaeli/alias_free_convnets/

  28. arXiv:2212.10280  [pdf, other]

    cs.CV

    Internal Diverse Image Completion

    Authors: Noa Alkobi, Tamar Rott Shaham, Tomer Michaeli

    Abstract: Image completion is widely used in photo restoration and editing applications, e.g. for object removal. Recently, there has been a surge of research on generating diverse completions for missing regions. However, existing methods require large training sets from a specific domain of interest, and often fail on general-content images. In this paper, we propose a diverse completion method that does… ▽ More

    Submitted 18 December, 2022; originally announced December 2022.

  29. arXiv:2212.01589  [pdf, other]

    cs.CV

    BlendGAN: Learning and Blending the Internal Distributions of Single Images by Spatial Image-Identity Conditioning

    Authors: Idan Kligvasser, Tamar Rott Shaham, Noa Alkobi, Tomer Michaeli

    Abstract: Training a generative model on a single image has drawn significant attention in recent years. Single image generative methods are designed to learn the internal patch distribution of a single natural image at multiple scales. These models can be used for drawing diverse samples that semantically resemble the training image, as well as for solving many image editing and restoration tasks that invo… ▽ More

    Submitted 3 December, 2022; originally announced December 2022.

  30. arXiv:2211.16582  [pdf, other]

    cs.CV cs.LG eess.IV

    SinDDM: A Single Image Denoising Diffusion Model

    Authors: Vladimir Kulikov, Shahar Yadin, Matan Kleiner, Tomer Michaeli

    Abstract: Denoising diffusion models (DDMs) have led to staggering performance leaps in image generation, editing and restoration. However, existing DDMs use very large datasets for training. Here, we introduce a framework for training a DDM on a single image. Our method, which we coin SinDDM, learns the internal statistics of the training image by using a multi-scale diffusion process. To drive the reverse… ▽ More

    Submitted 6 June, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Updated for ICML 2023 and added the Appendix. Note that the images are lightly compressed. Visit our project page for uncompressed results: https://matankleiner.github.io/sinddm/

  31. arXiv:2211.08944  [pdf, other]

    eess.IV cs.CV cs.LG

    Reasons for the Superiority of Stochastic Estimators over Deterministic Ones: Robustness, Consistency and Perceptual Quality

    Authors: Guy Ohayon, Theo Adrai, Michael Elad, Tomer Michaeli

    Abstract: Stochastic restoration algorithms allow to explore the space of solutions that correspond to the degraded input. In this paper we reveal additional fundamental advantages of stochastic methods over deterministic ones, which further motivate their use. First, we prove that any restoration algorithm that attains perfect perceptual quality and whose outputs are consistent with the input must be a pos… ▽ More

    Submitted 26 July, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:26474-26494, 2023

  32. arXiv:2202.02783  [pdf, other]

    cs.LG cs.CV

    Energy awareness in low precision neural networks

    Authors: Nurit Spingarn Eliezer, Ron Banner, Elad Hoffer, Hilla Ben-Yaakov, Tomer Michaeli

    Abstract: Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices. Existing approaches for reducing power consumption rely on quite general principles, including avoidance of multiplication operations and aggressive quantization of weights and activations. However, these methods do not take into account the precise power consumed by each module in the network, a… ▽ More

    Submitted 6 February, 2022; originally announced February 2022.

  33. arXiv:2107.02555  [pdf, other]

    eess.IV cs.CV

    A Theory of the Distortion-Perception Tradeoff in Wasserstein Space

    Authors: Dror Freirich, Tomer Michaeli, Ron Meir

    Abstract: The lower the distortion of an estimator, the more the distribution of its outputs generally deviates from the distribution of the signals it attempts to estimate. This phenomenon, known as the perception-distortion tradeoff, has captured significant attention in image restoration, where it implies that fidelity to ground truth images comes at the expense of perceptual quality (deviation from stat… ▽ More

    Submitted 6 July, 2021; originally announced July 2021.

  34. arXiv:2106.06426  [pdf, other]

    cs.SD cs.LG eess.AS

    Catch-A-Waveform: Learning to Generate Audio from a Single Short Example

    Authors: Gal Greshler, Tamar Rott Shaham, Tomer Michaeli

    Abstract: Models for audio generation are typically trained on hours of recordings. Here, we illustrate that capturing the essence of an audio source is typically possible from as little as a few tens of seconds from a single training signal. Specifically, we present a GAN-based generative model that can be trained on one short audio signal from any domain (e.g. speech, music, etc.) and does not require pre… ▽ More

    Submitted 26 October, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

  35. On information rates over a binary input channel

    Authors: Michael Peleg, Tomer Michaeli, Shlomo Shamai

    Abstract: We study communication systems over band-limited Additive White Gaussian Noise (AWGN) channels in which the transmitter output is constrained to be symmetric binary (bi-polar). In this work we improve the original Ozarow-Wyner-Ziv (OWZ) lower bound on capacity by introducing new achievability schemes with two advantages over the studied OWZ scheme which is based on peak-power constrained pulse-amp… ▽ More

    Submitted 6 October, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

    Comments: in IEEE Open Journal of the Communications Society, 2021

  36. arXiv:2103.02458  [pdf, other]

    cs.CV eess.IV

    Sparsity Aware Normalization for GANs

    Authors: Idan Kligvasser, Tomer Michaeli

    Abstract: Generative adversarial networks (GANs) are known to benefit from regularization or normalization of their critic (discriminator) network during training. In this paper, we analyze the popular spectral normalization scheme, find a significant drawback and introduce sparsity aware normalization (SAN), a new alternative approach for stabilizing GAN training. As opposed to other normalization methods,… ▽ More

    Submitted 28 April, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: AAAI Conference on Artificial Intelligence (AAAI-21)

  37. arXiv:2012.05328  [pdf, other]

    cs.CV

    GAN "Steerability" without optimization

    Authors: Nurit Spingarn-Eliezer, Ron Banner, Tomer Michaeli

    Abstract: Recent research has shown remarkable success in revealing "steering" directions in the latent spaces of pre-trained GANs. These directions correspond to semantically meaningful image transformations (e.g., shift, zoom, color manipulations), and have similar interpretable effects across all categories that the GAN can generate. Some methods focus on user-specified transformations, while others disco… ▽ More

    Submitted 24 January, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

  38. arXiv:2012.03295  [pdf, other]

    cs.LG stat.ML

    Contrastive Divergence Learning is a Time Reversal Adversarial Game

    Authors: Omer Yair, Tomer Michaeli

    Abstract: Contrastive divergence (CD) learning is a classical method for fitting unnormalized statistical models to data samples. Despite its wide-spread use, the convergence properties of this algorithm are still not well understood. The main source of difficulty is an unjustified approximation which has been used to derive the gradient of the loss. In this paper, we present an alternative derivation of CD… ▽ More

    Submitted 15 March, 2021; v1 submitted 6 December, 2020; originally announced December 2020.

  39. arXiv:2012.02992  [pdf, other]

    cs.CV

    Spatially-Adaptive Pixelwise Networks for Fast Image Translation

    Authors: Tamar Rott Shaham, Michael Gharbi, Richard Zhang, Eli Shechtman, Tomer Michaeli

    Abstract: We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation. We design the generator to be an extremely lightweight function of the full-resolution image. In fact, we use pixel-wise networks; that is, each pixel is processed independently of others, through a composition of simple affine transformations and nonlinearities. We take three importa… ▽ More

    Submitted 5 December, 2020; originally announced December 2020.

  40. arXiv:2009.14303  [pdf, other]

    eess.IV cs.CV physics.bio-ph physics.optics

    Learning an optimal PSF-pair for ultra-dense 3D localization microscopy

    Authors: Elias Nehme, Boris Ferdman, Lucien E. Weiss, Tal Naor, Daniel Freedman, Tomer Michaeli, Yoav Shechtman

    Abstract: A long-standing challenge in multiple-particle-tracking is the accurate and precise 3D localization of individual particles at close proximity. One established approach for snapshot 3D imaging is point-spread-function (PSF) engineering, in which the PSF is modified to encode the axial information. However, engineered PSFs are challenging to localize at high densities due to lateral PSF overlaps. H… ▽ More

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: 20 pages, 20 figures

  41. arXiv:2006.09332  [pdf, other]

    eess.IV cs.CV

    What's in the Image? Explorable Decoding of Compressed Images

    Authors: Yuval Bahat, Tomer Michaeli

    Abstract: The ever-growing amounts of visual contents captured on a daily basis necessitate the use of lossy compression methods in order to save storage space and transmission bandwidth. While extensive research efforts are devoted to improving compression techniques, every method inevitably discards information. Especially at low bit rates, this information often corresponds to semantically meaningful vis… ▽ More

    Submitted 27 March, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

  42. arXiv:2002.04710  [pdf, other]

    cs.LG stat.ML

    Unique Properties of Flat Minima in Deep Networks

    Authors: Rotem Mulayoff, Tomer Michaeli

    Abstract: It is well known that (stochastic) gradient descent has an implicit bias towards flat minima. In deep neural network training, this mechanism serves to screen out minima. However, the precise effect that this has on the trained network is not yet fully understood. In this paper, we characterize the flat minima in linear neural networks trained with a quadratic loss. First, we show that linear ResN… ▽ More

    Submitted 8 August, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: Presented at ICML2020

  43. arXiv:1912.01839  [pdf, other]

    cs.CV

    Explorable Super Resolution

    Authors: Yuval Bahat, Tomer Michaeli

    Abstract: Single image super resolution (SR) has seen major performance leaps in recent years. However, existing methods do not allow exploring the infinitely many plausible reconstructions that might have given rise to the observed low-resolution (LR) image. These different explanations to the LR image may dramatically vary in their textures and fine details, and may often encode completely different seman…

    Submitted 22 June, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020) 2716-2725

  44. arXiv:1906.09957  [pdf, other]

    eess.IV physics.optics

    DeepSTORM3D: dense three dimensional localization microscopy and point spread function design by deep learning

    Authors: Elias Nehme, Daniel Freedman, Racheli Gordon, Boris Ferdman, Lucien E. Weiss, Onit Alalouf, Reut Orange, Tomer Michaeli, Yoav Shechtman

    Abstract: Localization microscopy is an imaging technique in which the positions of individual nanoscale point emitters (e.g. fluorescent molecules) are determined at high precision from their images. This is the key ingredient in single/multiple-particle-tracking and several super-resolution microscopy approaches. Localization in three-dimensions (3D) can be performed by modifying the image that a point-so…

    Submitted 12 September, 2019; v1 submitted 21 June, 2019; originally announced June 2019.

    Comments: main text: 9 pages, 5 figures, supplementary information: 29 pages, 20 figures

    Journal ref: Nature Methods 17 (2020) 734-740

  45. arXiv:1905.01164  [pdf, other]

    cs.CV

    SinGAN: Learning a Generative Model from a Single Natural Image

    Authors: Tamar Rott Shaham, Tali Dekel, Tomer Michaeli

    Abstract: We introduce SinGAN, an unconditional generative model that can be learned from a single natural image. Our model is trained to capture the internal distribution of patches within the image, and is then able to generate high quality, diverse samples that carry the same visual content as the image. SinGAN contains a pyramid of fully convolutional GANs, each responsible for learning the patch distri…

    Submitted 4 September, 2019; v1 submitted 2 May, 2019; originally announced May 2019.

    Comments: ICCV 2019

  46. arXiv:1901.07821  [pdf, other]

    cs.LG cs.CV cs.IT stat.ML

    Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff

    Authors: Yochai Blau, Tomer Michaeli

    Abstract: Lossy compression algorithms are typically designed and analyzed through the lens of Shannon's rate-distortion theory, where the goal is to achieve the lowest possible distortion (e.g., low MSE or high SSIM) at any given bit rate. However, in recent years, it has become increasingly accepted that "low distortion" is not a synonym for "high perceptual quality", and in fact optimization of one often…

    Submitted 30 July, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

    Comments: ICML 2019 (long oral presentation), see talk at: https://slideslive.com/38917633/applications-computer-vision

    Journal ref: Proceedings of the 36th International Conference on Machine Learning, PMLR 97:675-685, 2019

  47. arXiv:1811.11051  [pdf, other]

    cs.CV

    Dense xUnit Networks

    Authors: Idan Kligvasser, Tomer Michaeli

    Abstract: Deep net architectures have constantly evolved over the past few years, leading to significant advancements in a wide array of computer vision tasks. However, besides high accuracy, many applications also require a low computational load and limited memory footprint. To date, efficiency has typically been achieved either by architectural choices at the macro level (e.g. using skip connections or p…

    Submitted 27 November, 2018; originally announced November 2018.

  48. arXiv:1809.07517  [pdf, other]

    cs.CV

    The 2018 PIRM Challenge on Perceptual Image Super-resolution

    Authors: Yochai Blau, Roey Mechrez, Radu Timofte, Tomer Michaeli, Lihi Zelnik-Manor

    Abstract: This paper reports on the 2018 PIRM challenge on perceptual super-resolution (SR), held in conjunction with the Perceptual Image Restoration and Manipulation (PIRM) workshop at ECCV 2018. In contrast to previous SR challenges, our evaluation methodology jointly quantifies accuracy and perceptual quality, therefore enabling perceptual-driven methods to compete alongside algorithms that target PSNR…

    Submitted 31 January, 2019; v1 submitted 20 September, 2018; originally announced September 2018.

    Comments: Workshop and Challenge on Perceptual Image Restoration and Manipulation in conjunction with ECCV 2018 webpage: https://www.pirm2018.org/

    Journal ref: Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018

  49. arXiv:1807.01637  [pdf]

    physics.optics

    Multicolor localization microscopy by deep learning

    Authors: Eran Hershko*, Lucien E. Weiss*, Tomer Michaeli, Yoav Shechtman

    Abstract: Deep learning has become an extremely effective tool for image classification and image restoration problems. Here, we apply deep learning to microscopy, and demonstrate how neural networks can exploit the chromatic dependence of the point-spread function to classify the colors of single emitters imaged on a grayscale camera. While existing single-molecule methods for spectral classification requi…

    Submitted 4 July, 2018; originally announced July 2018.

    Comments: 34 pages, 16 figures

  50. arXiv:1804.04897  [pdf, other]

    eess.SP cs.IT

    On the Minimal Overcompleteness Allowing Universal Sparse Representation

    Authors: Rotem Mulayoff, Tomer Michaeli

    Abstract: Sparse representation over redundant dictionaries constitutes a good model for many classes of signals (e.g., patches of natural images, segments of speech signals, etc.). However, despite its popularity, very little is known about the representation capacity of this model. In this paper, we study how redundant a dictionary must be so as to allow any vector to admit a sparse approximation with a p…

    Submitted 6 March, 2019; v1 submitted 13 April, 2018; originally announced April 2018.

    Comments: To appear in IEEE Transactions on Information Theory