Showing 1–18 of 18 results for author: Jepson, A

  1. arXiv:2404.09081

    cs.CV cs.LG

    Probabilistic Directed Distance Fields for Ray-Based Shape Representations

    Authors: Tristan Aumentado-Armstrong, Stavros Tsogkas, Sven Dickinson, Allan Jepson

    Abstract: In modern computer vision, the optimal representation of 3D shape continues to be task-dependent. One fundamental operation applied to such representations is differentiable rendering, as it enables inverse graphics approaches in learning frameworks. Standard explicit shape representations (voxels, point clouds, or meshes) are often easily rendered, but can suffer from limited geometric fidelity,…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: Extension of arXiv:2112.05300

    ACM Class: I.2.10

  2. arXiv:2307.08507

    cs.LG

    Efficient and Accurate Optimal Transport with Mirror Descent and Conjugate Gradients

    Authors: Mete Kemertas, Allan D. Jepson, Amir-massoud Farahmand

    Abstract: We design a novel algorithm for optimal transport by drawing from the entropic optimal transport, mirror descent and conjugate gradients literatures. Our scalable and GPU parallelizable algorithm is able to compute the Wasserstein distance with extreme precision, reaching relative error rates of $10^{-8}$ without numerical stability issues. Empirically, the algorithm converges to high precision so…

    Submitted 31 October, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

  3. arXiv:2304.13265

    cs.CV

    StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos

    Authors: Nikita Dvornik, Isma Hadji, Ran Zhang, Konstantinos G. Derpanis, Animesh Garg, Richard P. Wildes, Allan D. Jepson

    Abstract: Instructional videos are an important resource to learn procedural tasks from human demonstrations. However, the instruction steps in such videos are typically short and sparse, with most of the video being irrelevant to the procedure. This motivates the need to temporally localize the instruction steps in such videos, i.e. the task called key-step localization. Traditional methods for key-step lo…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: CVPR'23

  4. arXiv:2301.10759

    cs.CV

    Efficient Flow-Guided Multi-frame De-fencing

    Authors: Stavros Tsogkas, Fengjia Zhang, Allan Jepson, Alex Levinshtein

    Abstract: Taking photographs "in-the-wild" is often hindered by fence obstructions that stand between the camera user and the scene of interest, and which are hard or impossible to avoid. De-fencing is the algorithmic process of automatically removing such obstructions from images, revealing the invisible parts of the scene. While this problem can be formulated as a combination of fence segmentation and i…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: 16 pages, 12 figures. Published at the Winter Conference on Application of Computer Vision (WACV) 2023

    Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023, pp. 1838-1847

  5. arXiv:2210.04996

    cs.CV cs.AI

    Graph2Vid: Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization

    Authors: Nikita Dvornik, Isma Hadji, Hai Pham, Dhaivat Bhatt, Brais Martinez, Afsaneh Fazly, Allan D. Jepson

    Abstract: In this work, we consider the problem of weakly-supervised multi-step localization in instructional videos. An established approach to this problem is to rely on a given list of steps. However, in reality, there is often more than one way to execute a procedure successfully, by following the set of steps in slightly varying orders. Thus, for successful localization in a given video, recent works r…

    Submitted 31 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: ECCV'22, oral

    Journal ref: ECCV 2022

  6. arXiv:2205.02300

    cs.CV

    P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision

    Authors: He Zhao, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Richard P. Wildes, Allan D. Jepson

    Abstract: In this paper, we study the problem of procedure planning in instructional videos. Here, an agent must produce a plausible sequence of actions that can transform the environment from a given start to a desired goal state. When learning procedure planning from instructional videos, most recent work leverages intermediate visual observations as supervision, which requires expensive annotation effort…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted as an oral paper at CVPR 2022

  7. arXiv:2204.09268

    cs.LG cs.CL cs.CV cs.IR

    Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations

    Authors: Leila Pishdad, Ran Zhang, Konstantinos G. Derpanis, Allan Jepson, Afsaneh Fazly

    Abstract: Probabilistic embeddings have proven useful for capturing polysemous word meanings, as well as ambiguity in image matching. In this paper, we study the advantages of probabilistic embeddings in a cross-modal setting (i.e., text and images), and propose a simple approach that replaces the standard vector point embeddings in extant image-text matching models with probabilistic distributions that are…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: 13 pages, 7 figures

  8. arXiv:2202.02881

    cs.LG cs.AI

    Approximate Policy Iteration with Bisimulation Metrics

    Authors: Mete Kemertas, Allan Jepson

    Abstract: Bisimulation metrics define a distance measure between states of a Markov decision process (MDP) based on a comparison of reward sequences. Due to this property they provide theoretical guarantees in value function approximation (VFA). In this work we first prove that bisimulation and $π$-bisimulation metrics can be defined via a more general class of Sinkhorn distances, which unifies various stat…

    Submitted 14 November, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

    Comments: Accepted to Transactions on Machine Learning Research (TMLR)

    ACM Class: I.2.6

  9. arXiv:2112.05300

    cs.CV cs.LG

    Representing 3D Shapes with Probabilistic Directed Distance Fields

    Authors: Tristan Aumentado-Armstrong, Stavros Tsogkas, Sven Dickinson, Allan Jepson

    Abstract: Differentiable rendering is an essential operation in modern vision, allowing inverse graphics approaches to 3D understanding to be utilized in modern machine learning frameworks. Explicit shape representations (voxels, point clouds, or meshes), while relatively easily rendered, often suffer from limited geometric fidelity or topological constraints. On the other hand, implicit representations (oc…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: 22 pages

    ACM Class: I.2.6; I.2.10

  10. GraN-GAN: Piecewise Gradient Normalization for Generative Adversarial Networks

    Authors: Vineeth S. Bhaskara, Tristan Aumentado-Armstrong, Allan Jepson, Alex Levinshtein

    Abstract: Modern generative adversarial networks (GANs) predominantly use piecewise linear activation functions in discriminators (or critics), including ReLU and LeakyReLU. Such models learn piecewise linear mappings, where each piece handles a subset of the input space, and the gradients per subset are piecewise constant. Under such a class of discriminator (or critic) functions, we present Gradient Norma…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: WACV 2022 Main Conference Paper (Submitted: 18 Aug 2021, Accepted: 4 Oct 2021)

    Journal ref: 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 2432-2441

  11. arXiv:2108.11996

    cs.CV

    Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers

    Authors: Nikita Dvornik, Isma Hadji, Konstantinos G. Derpanis, Animesh Garg, Allan D. Jepson

    Abstract: In this work, we consider the problem of sequence-to-sequence alignment for signals containing outliers. Assuming the absence of outliers, the standard Dynamic Time Warping (DTW) algorithm efficiently computes the optimal alignment between two (generally) variable-length sequences. While DTW is robust to temporal shifts and dilations of the signal, it fails to align sequences in a meaningful way i…

    Submitted 26 August, 2021; originally announced August 2021.

  12. arXiv:2106.00133

    cs.AI

    AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning

    Authors: Maayan Shvo, Zhiming Hu, Rodrigo Toro Icarte, Iqbal Mohomed, Allan Jepson, Sheila A. McIlraith

    Abstract: Human beings, even small children, quickly become adept at figuring out how to use applications on their mobile devices. Learning to use a new app is often achieved via trial-and-error, accelerated by transfer of knowledge from past experiences with like apps. The prospect of building a smarter smartphone - one that can learn how to achieve tasks using mobile apps - is tantalizing. In this paper w…

    Submitted 6 June, 2021; v1 submitted 31 May, 2021; originally announced June 2021.

  13. arXiv:2105.05217

    cs.CV

    Representation Learning via Global Temporal Alignment and Cycle-Consistency

    Authors: Isma Hadji, Konstantinos G. Derpanis, Allan D. Jepson

    Abstract: We introduce a weakly supervised method for representation learning based on aligning temporal sequences (e.g., videos) of the same process (e.g., human action). The main idea is to use the global temporal ordering of latent correspondences across sequence pairs as a supervisory signal. In particular, we propose a loss based on scoring the optimal sequence alignment to train an embedding network.…

    Submitted 11 May, 2021; originally announced May 2021.

    Comments: accepted to CVPR 2021

  14. Disentangling Geometric Deformation Spaces in Generative Latent Shape Models

    Authors: Tristan Aumentado-Armstrong, Stavros Tsogkas, Sven Dickinson, Allan Jepson

    Abstract: A complete representation of 3D objects requires characterizing the space of deformations in an interpretable manner, from articulations of a single instance to changes in shape across categories. In this work, we improve on a prior generative model of geometric disentanglement for 3D shapes, wherein the space of object geometry is factorized into rigid orientation, non-rigid pose, and intrinsic s…

    Submitted 18 March, 2023; v1 submitted 27 February, 2021; originally announced March 2021.

    Comments: Accepted to IJCV

    ACM Class: I.2.10; I.5.4

  15. arXiv:2011.08026

    cs.CV cs.LG

    Cycle-Consistent Generative Rendering for 2D-3D Modality Translation

    Authors: Tristan Aumentado-Armstrong, Alex Levinshtein, Stavros Tsogkas, Konstantinos G. Derpanis, Allan D. Jepson

    Abstract: For humans, visual understanding is inherently generative: given a 3D shape, we can postulate how it would look in the world; given a 2D image, we can infer the 3D structure that likely gave rise to it. We can thus translate between the 2D visual and 3D structural modalities of a given object. In the context of computer vision, this corresponds to a learnable module that serves two purposes: (i) g…

    Submitted 16 November, 2020; originally announced November 2020.

    Comments: 3DV 2020 (oral). Project page: https://ttaa9.github.io/genren/

    ACM Class: I.2.10; I.2.6

  16. arXiv:2009.06943

    eess.IV cs.CV

    AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Kai Zhang, Martin Danelljan, Yawei Li, Radu Timofte, Jie Liu, Jie Tang, Gangshan Wu, Yu Zhu, Xiangyu He, Wenjie Xu, Chenghua Li, Cong Leng, Jian Cheng, Guangyang Wu, Wenyi Wang, Xiaohong Liu, Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong, Xiaotong Luo, Liang Chen, Jiangtao Zhang, Maitreya Suin, et al. (60 additional authors not shown)

    Abstract: This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor x4 based on a set of prior examples of low and corresponding high resolution images. The goal is to devise a network that reduces one or several aspects such as runtime, parameter co…

    Submitted 15 September, 2020; originally announced September 2020.

  17. arXiv:1908.06386

    cs.CV cs.LG eess.IV

    Geometric Disentanglement for Generative Latent Shape Models

    Authors: Tristan Aumentado-Armstrong, Stavros Tsogkas, Allan Jepson, Sven Dickinson

    Abstract: Representing 3D shape is a fundamental problem in artificial intelligence, which has numerous applications within computer vision and graphics. One avenue that has recently begun to be explored is the use of latent representations of generative models. However, it remains an open problem to learn a generative model of shape that is interpretable and easily manipulated, particularly in the absence…

    Submitted 18 August, 2019; originally announced August 2019.

    Comments: ICCV 2019

    ACM Class: I.2.10; I.5.4

  18. arXiv:1811.10524

    cs.CV

    Scene Categorization from Contours: Medial Axis Based Salience Measures

    Authors: Morteza Rezanejad, Gabriel Downs, John Wilder, Dirk B. Walther, Allan Jepson, Sven Dickinson, Kaleem Siddiqi

    Abstract: The computer vision community has witnessed recent advances in scene categorization from images, with the state-of-the-art systems now achieving impressive recognition rates on challenging benchmarks such as the Places365 dataset. Such systems have been trained on photographs which include color, texture and shading cues. The geometry of shapes and surfaces, as conveyed by scene contours, is not e…

    Submitted 26 November, 2018; originally announced November 2018.