Showing 1–3 of 3 results for author: Ha, S J

  1. arXiv:2311.12467  [pdf, other]

    cs.CV

    GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap

    Authors: Hyogun Lee, Kyungho Bae, Seong Jong Ha, Yumin Ko, Gyeong-Moon Park, Jinwoo Choi

    Abstract: In this work, we tackle the challenging problem of unsupervised video domain adaptation (UVDA) for action recognition. We specifically focus on scenarios with a substantial domain gap, in contrast to existing works that primarily deal with small domain gaps between labeled source domains and unlabeled target domains. To establish a more realistic setting, we introduce a novel UVDA scenario, denoted as…

    Submitted 22 November, 2023; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: This is an accepted WACV 2024 paper. Our code is available at https://github.com/KHUVLL/GLAD

  2. arXiv:2205.14022  [pdf, other]

    cs.CV

    Future Transformer for Long-term Action Anticipation

    Authors: Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho

    Abstract: The task of predicting future actions from a video is crucial for a real-world agent interacting with others. When anticipating actions in the distant future, we humans typically consider long-term relations over the whole sequence of actions, i.e., not only observed actions in the past but also potential actions in the future. In a similar spirit, we propose an end-to-end attention model for acti…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: Accepted to CVPR 2022

  3. arXiv:2110.00428  [pdf, other]

    cs.CL cs.AI cs.CV

    Zero-shot Natural Language Video Localization

    Authors: Jinwoo Nam, Daechul Ahn, Dongyeop Kang, Seong Jong Ha, Jonghyun Choi

    Abstract: Understanding videos to localize moments with natural language often requires large, expensive annotated video regions paired with language queries. To eliminate the annotation costs, we make a first attempt to train a natural language video localization model in a zero-shot manner. Inspired by the unsupervised image captioning setup, we merely require random text corpora, unlabeled video collections, an…

    Submitted 29 August, 2021; originally announced October 2021.

    Comments: 10 pages, 7 figures