Showing 1–7 of 7 results for author: Jepson, A D

  1. arXiv:2307.08507  [pdf, other]

    cs.LG

    Efficient and Accurate Optimal Transport with Mirror Descent and Conjugate Gradients

    Authors: Mete Kemertas, Allan D. Jepson, Amir-massoud Farahmand

    Abstract: We design a novel algorithm for optimal transport by drawing from the entropic optimal transport, mirror descent and conjugate gradients literatures. Our scalable and GPU parallelizable algorithm is able to compute the Wasserstein distance with extreme precision, reaching relative error rates of $10^{-8}$ without numerical stability issues. Empirically, the algorithm converges to high precision so…

    Submitted 31 October, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

  2. arXiv:2304.13265  [pdf, other]

    cs.CV

    StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos

    Authors: Nikita Dvornik, Isma Hadji, Ran Zhang, Konstantinos G. Derpanis, Animesh Garg, Richard P. Wildes, Allan D. Jepson

    Abstract: Instructional videos are an important resource to learn procedural tasks from human demonstrations. However, the instruction steps in such videos are typically short and sparse, with most of the video being irrelevant to the procedure. This motivates the need to temporally localize the instruction steps in such videos, i.e. the task called key-step localization. Traditional methods for key-step lo…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: CVPR'23

  3. arXiv:2210.04996  [pdf, other]

    cs.CV cs.AI

    Graph2Vid: Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization

    Authors: Nikita Dvornik, Isma Hadji, Hai Pham, Dhaivat Bhatt, Brais Martinez, Afsaneh Fazly, Allan D. Jepson

    Abstract: In this work, we consider the problem of weakly-supervised multi-step localization in instructional videos. An established approach to this problem is to rely on a given list of steps. However, in reality, there is often more than one way to execute a procedure successfully, by following the set of steps in slightly varying orders. Thus, for successful localization in a given video, recent works r…

    Submitted 31 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: ECCV'22, oral

    Journal ref: ECCV 2022

  4. arXiv:2205.02300  [pdf, other]

    cs.CV

    P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision

    Authors: He Zhao, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Richard P. Wildes, Allan D. Jepson

    Abstract: In this paper, we study the problem of procedure planning in instructional videos. Here, an agent must produce a plausible sequence of actions that can transform the environment from a given start to a desired goal state. When learning procedure planning from instructional videos, most recent work leverages intermediate visual observations as supervision, which requires expensive annotation effort…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted as an oral paper at CVPR 2022

  5. arXiv:2108.11996  [pdf, other]

    cs.CV

    Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers

    Authors: Nikita Dvornik, Isma Hadji, Konstantinos G. Derpanis, Animesh Garg, Allan D. Jepson

    Abstract: In this work, we consider the problem of sequence-to-sequence alignment for signals containing outliers. Assuming the absence of outliers, the standard Dynamic Time Warping (DTW) algorithm efficiently computes the optimal alignment between two (generally) variable-length sequences. While DTW is robust to temporal shifts and dilations of the signal, it fails to align sequences in a meaningful way i…

    Submitted 26 August, 2021; originally announced August 2021.

  6. arXiv:2105.05217  [pdf, other]

    cs.CV

    Representation Learning via Global Temporal Alignment and Cycle-Consistency

    Authors: Isma Hadji, Konstantinos G. Derpanis, Allan D. Jepson

    Abstract: We introduce a weakly supervised method for representation learning based on aligning temporal sequences (e.g., videos) of the same process (e.g., human action). The main idea is to use the global temporal ordering of latent correspondences across sequence pairs as a supervisory signal. In particular, we propose a loss based on scoring the optimal sequence alignment to train an embedding network.…

    Submitted 11 May, 2021; originally announced May 2021.

    Comments: accepted to CVPR 2021

  7. arXiv:2011.08026  [pdf, other]

    cs.CV cs.LG

    Cycle-Consistent Generative Rendering for 2D-3D Modality Translation

    Authors: Tristan Aumentado-Armstrong, Alex Levinshtein, Stavros Tsogkas, Konstantinos G. Derpanis, Allan D. Jepson

    Abstract: For humans, visual understanding is inherently generative: given a 3D shape, we can postulate how it would look in the world; given a 2D image, we can infer the 3D structure that likely gave rise to it. We can thus translate between the 2D visual and 3D structural modalities of a given object. In the context of computer vision, this corresponds to a learnable module that serves two purposes: (i) g…

    Submitted 16 November, 2020; originally announced November 2020.

    Comments: 3DV 2020 (oral). Project page: https://ttaa9.github.io/genren/

    ACM Class: I.2.10; I.2.6