Showing 1–18 of 18 results for author: Alaluf, Y

  1. arXiv:2406.01300  [pdf, other]

    cs.CV

    pOps: Photo-Inspired Diffusion Operators

    Authors: Elad Richardson, Yuval Alaluf, Ali Mahdavi-Amiri, Daniel Cohen-Or

    Abstract: Text-guided image generation enables the creation of visual content from textual descriptions. However, certain visual concepts cannot be effectively conveyed through language alone. This has sparked a renewed interest in utilizing the CLIP image embedding space for more visually-oriented tasks through methods such as IP-Adapter. Interestingly, the CLIP image embedding space has been shown to be s…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Project Page: https://popspaper.github.io/pOps/

  2. arXiv:2403.14599  [pdf, other]

    cs.CV

    MyVLM: Personalizing VLMs for User-Specific Queries

    Authors: Yuval Alaluf, Elad Richardson, Sergey Tulyakov, Kfir Aberman, Daniel Cohen-Or

    Abstract: Recent large-scale vision-language models (VLMs) have demonstrated remarkable capabilities in understanding and generating textual descriptions for visual content. However, these models lack an understanding of user-specific concepts. In this work, we take a first step toward the personalization of VLMs, enabling them to learn and reason over user-provided concepts. For example, we explore whether…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Project page: https://snap-research.github.io/MyVLM/

  3. arXiv:2311.13608  [pdf, other]

    cs.CV cs.GR cs.LG

    Breathing Life Into Sketches Using Text-to-Video Priors

    Authors: Rinon Gal, Yael Vinker, Yuval Alaluf, Amit H. Bermano, Daniel Cohen-Or, Ariel Shamir, Gal Chechik

    Abstract: A sketch is one of the most intuitive and versatile tools humans use to convey their ideas visually. An animated sketch opens another dimension to the expression of ideas and is widely used by designers for a variety of purposes. Animating sketches is a laborious process, requiring extensive experience and professional design skills. In this work, we present a method that automatically adds motion…

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: Project page: https://livesketch.github.io/

  4. arXiv:2311.03335  [pdf, other]

    cs.CV cs.GR

    Cross-Image Attention for Zero-Shot Appearance Transfer

    Authors: Yuval Alaluf, Daniel Garibi, Or Patashnik, Hadar Averbuch-Elor, Daniel Cohen-Or

    Abstract: Recent advancements in text-to-image generative models have demonstrated a remarkable ability to capture a deep semantic understanding of images. In this work, we leverage this semantic knowledge to transfer the visual appearance between objects that share similar semantics but may differ significantly in shape. To achieve this, we build upon the self-attention layers of these generative models an…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Project page: https://garibida.github.io/cross-image-attention

  5. arXiv:2308.02669  [pdf, other]

    cs.CV

    ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints

    Authors: Elad Richardson, Kfir Goldberg, Yuval Alaluf, Daniel Cohen-Or

    Abstract: Recent text-to-image generative models have enabled us to transform our words into vibrant, captivating imagery. The surge of personalization techniques that has followed has also allowed us to imagine unique concepts in new scenes. However, an intriguing question remains: How can we generate a new, imaginary concept that has never been seen before? In this paper, we present the task of creative t…

    Submitted 17 December, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: Project page: https://kfirgoldberg.github.io/ConceptLab/

  6. arXiv:2305.15391  [pdf, other]

    cs.CV

    A Neural Space-Time Representation for Text-to-Image Personalization

    Authors: Yuval Alaluf, Elad Richardson, Gal Metzer, Daniel Cohen-Or

    Abstract: A key aspect of text-to-image personalization methods is the manner in which the target concept is represented within the generative process. This choice greatly affects the visual fidelity, downstream editability, and disk space needed to store the learned concept. In this paper, we explore a new text-conditioning space that is dependent on both the denoising process timestep (time) and the denoi…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Project page available at https://neuraltextualinversion.github.io/NeTI/

  7. arXiv:2302.01721  [pdf, other]

    cs.CV cs.GR

    TEXTure: Text-Guided Texturing of 3D Shapes

    Authors: Elad Richardson, Gal Metzer, Yuval Alaluf, Raja Giryes, Daniel Cohen-Or

    Abstract: In this paper, we present TEXTure, a novel method for text-guided generation, editing, and transfer of textures for 3D shapes. Leveraging a pretrained depth-to-image diffusion model, TEXTure applies an iterative scheme that paints a 3D model from different viewpoints. Yet, while depth-to-image models can create plausible textures from a single viewpoint, the stochastic nature of the generation pro…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: Project page available at https://texturepaper.github.io/TEXTurePaper/

  8. arXiv:2301.13826  [pdf, other]

    cs.CV cs.CL cs.GR cs.LG

    Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models

    Authors: Hila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, Daniel Cohen-Or

    Abstract: Recent text-to-image generative models have demonstrated an unparalleled ability to generate diverse and creative imagery guided by a target text prompt. While revolutionary, current state-of-the-art diffusion models may still fail in generating images that fully convey the semantics in the given text prompt. We analyze the publicly available Stable Diffusion model and assess the existence of cata…

    Submitted 31 May, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: Accepted to SIGGRAPH 2023; Project page available at https://yuval-alaluf.github.io/Attend-and-Excite/

  9. arXiv:2211.17256  [pdf, other]

    cs.CV cs.GR

    CLIPascene: Scene Sketching with Different Types and Levels of Abstraction

    Authors: Yael Vinker, Yuval Alaluf, Daniel Cohen-Or, Ariel Shamir

    Abstract: In this paper, we present a method for converting a given scene image into a sketch using different types and multiple levels of abstraction. We distinguish between two types of abstraction. The first considers the fidelity of the sketch, varying its representation from a more precise portrayal of the input to a looser depiction. The second is defined by the visual simplicity of the sketch, moving…

    Submitted 1 May, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Project page available at https://clipascene.github.io/CLIPascene/

  10. arXiv:2208.01618  [pdf, other]

    cs.CV cs.CL cs.GR cs.LG

    An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

    Authors: Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or

    Abstract: Text-to-image models offer unprecedented freedom to guide creation through natural language. Yet, it is unclear how such freedom can be exercised to generate images of specific unique concepts, modify their appearance, or compose them in new roles and novel scenes. In other words, we ask: how can we use language-guided models to turn our cat into a painting, or imagine a new product based on our f…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: Project page: https://textual-inversion.github.io

  11. arXiv:2202.14020  [pdf, other]

    cs.CV cs.GR cs.LG

    State-of-the-Art in the Architecture, Methods and Applications of StyleGAN

    Authors: Amit H. Bermano, Rinon Gal, Yuval Alaluf, Ron Mokady, Yotam Nitzan, Omer Tov, Or Patashnik, Daniel Cohen-Or

    Abstract: Generative Adversarial Networks (GANs) have established themselves as a prevalent approach to image synthesis. Of these, StyleGAN offers a fascinating case study, owing to its remarkable visual quality and an ability to support a large array of downstream tasks. This state-of-the-art report covers the StyleGAN architecture, and the ways it has been employed since its conception, while also analyzi…

    Submitted 28 February, 2022; originally announced February 2022.

  12. arXiv:2201.13433  [pdf, other]

    cs.CV

    Third Time's the Charm? Image and Video Editing with StyleGAN3

    Authors: Yuval Alaluf, Or Patashnik, Zongze Wu, Asif Zamir, Eli Shechtman, Dani Lischinski, Daniel Cohen-Or

    Abstract: StyleGAN is arguably one of the most intriguing and well-studied generative models, demonstrating impressive performance in image generation, inversion, and manipulation. In this work, we explore the recent StyleGAN3 architecture, compare it to its predecessor, and investigate its unique advantages, as well as drawbacks. In particular, we demonstrate that while StyleGAN3 can be trained on unaligne…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: Project page available at https://yuval-alaluf.github.io/stylegan3-editing/

  13. arXiv:2111.15666  [pdf, other]

    cs.CV

    HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing

    Authors: Yuval Alaluf, Omer Tov, Ron Mokady, Rinon Gal, Amit H. Bermano

    Abstract: The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately represent real images typically suffer from degraded semantic control. Recent work proposes to mitigate this t…

    Submitted 29 March, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

    Comments: Accepted to CVPR 2022; Project page available at http://yuval-alaluf.github.io/hyperstyle/

  14. arXiv:2107.07437  [pdf, other]

    cs.CV

    StyleFusion: A Generative Model for Disentangling Spatial Segments

    Authors: Omer Kafri, Or Patashnik, Yuval Alaluf, Daniel Cohen-Or

    Abstract: We present StyleFusion, a new mapping architecture for StyleGAN, which takes as input a number of latent codes and fuses them into a single style code. Inserting the resulting style code into a pre-trained StyleGAN generator results in a single harmonized image in which each semantic region is controlled by one of the input latent codes. Effectively, StyleFusion yields a disentangled representatio…

    Submitted 15 July, 2021; originally announced July 2021.

    Comments: Code is available at: https://github.com/OmerKafri/StyleFusion

  15. arXiv:2104.02699  [pdf, other]

    cs.CV

    ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement

    Authors: Yuval Alaluf, Or Patashnik, Daniel Cohen-Or

    Abstract: Recently, the power of unconditional image synthesis has significantly advanced through the use of Generative Adversarial Networks (GANs). The task of inverting an image into its corresponding latent code of the trained GAN is of utmost importance as it allows for the manipulation of real images, leveraging the rich semantics learned by the network. Recognizing the limitations of current inversion…

    Submitted 24 August, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: Accepted to ICCV 2021; Project page available at https://yuval-alaluf.github.io/restyle-encoder/

  16. arXiv:2102.02766  [pdf, other]

    cs.CV

    Designing an Encoder for StyleGAN Image Manipulation

    Authors: Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or

    Abstract: Recently, there has been a surge of diverse methods for performing image editing by employing pre-trained unconditional generators. Applying these methods on real images, however, remains a challenge, as it necessarily requires the inversion of the images into their latent space. To successfully invert a real image, one needs to find a latent code that reconstructs the input image accurately, and…

    Submitted 4 February, 2021; originally announced February 2021.

  17. arXiv:2102.02754  [pdf, other]

    cs.CV

    Only a Matter of Style: Age Transformation Using a Style-Based Regression Model

    Authors: Yuval Alaluf, Or Patashnik, Daniel Cohen-Or

    Abstract: The task of age transformation illustrates the change of an individual's appearance over time. Accurately modeling this complex transformation over an input facial image is extremely challenging as it requires making convincing, possibly large changes to facial features and head shape, while still preserving the input identity. In this work, we present an image-to-image translation method that lea…

    Submitted 18 May, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: Accepted to SIGGRAPH 2021, project page available at https://yuval-alaluf.github.io/SAM/

  18. arXiv:2008.00951  [pdf, other]

    cs.CV

    Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

    Authors: Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or

    Abstract: We present a generic image-to-image translation framework, pixel2style2pixel (pSp). Our pSp framework is based on a novel encoder network that directly generates a series of style vectors which are fed into a pretrained StyleGAN generator, forming the extended W+ latent space. We first show that our encoder can directly embed real images into W+, with no additional optimization. Next, we propose u…

    Submitted 21 April, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

    Comments: Accepted to CVPR 2021, project page available at https://eladrich.github.io/pixel2style2pixel/