Showing 1–7 of 7 results for author: Bar-Tal, O

  1. arXiv:2401.12945  [pdf, other]

    cs.CV

    Lumiere: A Space-Time Diffusion Model for Video Generation

    Authors: Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Guanghui Liu, Amit Raj, Yuanzhen Li, Michael Rubinstein, Tomer Michaeli, Oliver Wang, Deqing Sun, Tali Dekel, Inbar Mosseri

    Abstract: We introduce Lumiere -- a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion -- a pivotal challenge in video synthesis. To this end, we introduce a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, through a single pass in the model. This is in contrast to existing video models which synth…

    Submitted 5 February, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: Webpage: https://lumiere-video.github.io/ | Video: https://www.youtube.com/watch?v=wxLr02Dz2Sc

  2. arXiv:2311.17009  [pdf, other]

    cs.CV

    Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer

    Authors: Danah Yatim, Rafail Fridman, Omer Bar-Tal, Yoni Kasten, Tali Dekel

    Abstract: We present a new method for text-driven motion transfer - synthesizing a video that complies with an input text prompt describing the target objects and scene while maintaining an input video's motion and scene layout. Prior methods are confined to transferring motion across two subjects within the same or closely related object categories and are applicable for limited domains (e.g., humans). In…

    Submitted 3 December, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Project page: https://diffusion-motion-transfer.github.io/

  3. Disentangling Structure and Appearance in ViT Feature Space

    Authors: Narek Tumanyan, Omer Bar-Tal, Shir Amir, Shai Bagon, Tali Dekel

    Abstract: We present a method for semantically transferring the visual appearance of one natural image to another. Specifically, our goal is to generate an image in which objects in a source structure image are "painted" with the visual appearance of their semantically related objects in a target appearance image. To integrate semantic information into our framework, our key idea is to leverage a pre-traine…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Accepted to ACM Transactions on Graphics. arXiv admin note: substantial text overlap with arXiv:2201.00424

  4. arXiv:2307.10373  [pdf, other]

    cs.CV

    TokenFlow: Consistent Diffusion Features for Consistent Video Editing

    Authors: Michal Geyer, Omer Bar-Tal, Shai Bagon, Tali Dekel

    Abstract: The generative AI revolution has recently expanded to videos. Nevertheless, current state-of-the-art video models are still lagging behind image models in terms of visual quality and user control over the generated content. In this work, we present a framework that harnesses the power of a text-to-image diffusion model for the task of text-driven video editing. Specifically, given a source video a…

    Submitted 20 November, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

  5. arXiv:2302.08113  [pdf, other]

    cs.CV

    MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation

    Authors: Omer Bar-Tal, Lior Yariv, Yaron Lipman, Tali Dekel

    Abstract: Recent advances in text-to-image generation with diffusion models present transformative capabilities in image quality. However, user controllability of the generated image, and fast adaptation to new tasks still remains an open challenge, currently mostly addressed by costly and long re-training and fine-tuning or ad-hoc adaptations to specific image generation tasks. In this work, we present Mul…

    Submitted 16 February, 2023; originally announced February 2023.

  6. arXiv:2204.02491  [pdf, other]

    cs.CV

    Text2LIVE: Text-Driven Layered Image and Video Editing

    Authors: Omer Bar-Tal, Dolev Ofri-Amar, Rafail Fridman, Yoni Kasten, Tali Dekel

    Abstract: We present a method for zero-shot, text-driven appearance manipulation in natural images and videos. Given an input image or video and a target text prompt, our goal is to edit the appearance of existing objects (e.g., object's texture) or augment the scene with visual effects (e.g., smoke, fire) in a semantically meaningful manner. We train a generator using an internal dataset of training exampl…

    Submitted 25 May, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Project page: https://text2live.github.io

  7. arXiv:2201.00424  [pdf, other]

    cs.CV

    Splicing ViT Features for Semantic Appearance Transfer

    Authors: Narek Tumanyan, Omer Bar-Tal, Shai Bagon, Tali Dekel

    Abstract: We present a method for semantically transferring the visual appearance of one natural image to another. Specifically, our goal is to generate an image in which objects in a source structure image are "painted" with the visual appearance of their semantically related objects in a target appearance image. Our method works by training a generator given only a single structure/appearance image pair a…

    Submitted 2 January, 2022; originally announced January 2022.