Showing 1–3 of 3 results for author: Byeon, W

  1. arXiv:2309.04509  [pdf, other]

    cs.SD cs.CV cs.GR eess.AS

    The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion

    Authors: Yujin Jeong, Wonjeong Ryoo, Seunghyun Lee, Dabin Seo, Wonmin Byeon, Sangpil Kim, Jinkyu Kim

    Abstract: In recent years, video generation has become a prominent generative tool and has drawn significant attention. However, there is little consideration in audio-to-video generation, though audio contains unique qualities like temporal semantics and magnitude. Hence, we propose The Power of Sound (TPoS) model to incorporate audio input that includes both changeable temporal semantics and magnitude. To…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: ICCV2023

  2. arXiv:2211.11381  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    LISA: Localized Image Stylization with Audio via Implicit Neural Representation

    Authors: Seung Hyun Lee, Chanyoung Kim, Wonmin Byeon, Sang Ho Yoon, Jinkyu Kim, Sangpil Kim

    Abstract: We present a novel framework, Localized Image Stylization with Audio (LISA) which performs audio-driven localized image stylization. Sound often provides information about the specific context of the scene and is closely related to a certain part of the scene or object. However, existing image stylization works have focused on stylizing the entire image using an image or text input. Stylizing a pa…

    Submitted 21 November, 2022; originally announced November 2022.

  3. arXiv:2112.00007  [pdf, other]

    cs.GR cs.CV cs.LG cs.SD eess.AS

    Sound-Guided Semantic Image Manipulation

    Authors: Seung Hyun Lee, Wonseok Roh, Wonmin Byeon, Sang Ho Yoon, Chan Young Kim, Jinkyu Kim, Sangpil Kim

    Abstract: The recent success of the generative model shows that leveraging the multi-modal embedding space can manipulate an image using text information. However, manipulating an image with other sources rather than text, such as sound, is not easy due to the dynamic characteristics of the sources. Especially, sound can convey vivid emotions and dynamic expressions of the real world. Here, we propose a fra…

    Submitted 30 November, 2021; originally announced December 2021.