Showing 1–22 of 22 results for author: Camps, O

  1. arXiv:2407.02403  [pdf, other]

    cs.CV cs.AI

    Face Reconstruction Transfer Attack as Out-of-Distribution Generalization

    Authors: Yoon Gyo Jung, Jaewoo Park, Xingbo Dong, Hojin Park, Andrew Beng Jin Teoh, Octavia Camps

    Abstract: Understanding the vulnerability of face recognition systems to malicious attacks is of critical importance. Previous works have focused on reconstructing face images that can penetrate a targeted verification system. Even in the white-box scenario, however, naively reconstructed images misrepresent the identity information, hence the attacks are easily neutralized once the face system is updated o…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV2024

  2. arXiv:2406.02720  [pdf, other]

    cs.CV cs.GR

    3D-HGS: 3D Half-Gaussian Splatting

    Authors: Haolin Li, Jinyang Liu, Mario Sznaier, Octavia Camps

    Abstract: Photo-realistic 3D Reconstruction is a fundamental problem in 3D computer vision. This domain has seen considerable advancements owing to the advent of recent neural rendering techniques. These techniques predominantly aim to focus on learning volumetric representations of 3D scenes and refining these representations via loss functions derived from rendering. Among these, 3D Gaussian Splatting (3D…

    Submitted 13 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 9 pages, 6 figures

  3. arXiv:2404.07292  [pdf, other]

    cs.CV

    Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers

    Authors: Jinyang Liu, Wondmgezahu Teshome, Sandesh Ghimire, Mario Sznaier, Octavia Camps

    Abstract: Solving image and video jigsaw puzzles poses the challenging task of rearranging image fragments or video frames from unordered sequences to restore meaningful images and video sequences. Existing approaches often hinge on discriminative models tasked with predicting either the absolute positions of puzzle elements or the permutation actions applied to the original data. Unfortunately, these metho…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 8 pages, 7 figures

  4. arXiv:2404.01296  [pdf, other]

    cs.CV

    MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space

    Authors: Armand Comas-Massagué, Di Qiu, Menglei Chai, Marcel Bühler, Amit Raj, Ruiqi Gao, Qiangeng Xu, Mark Matthews, Paulo Gotardo, Octavia Camps, Sergio Orts-Escolano, Thabo Beeler

    Abstract: We introduce a novel framework for 3D human avatar generation and personalization, leveraging text prompts to enhance user engagement and customization. Central to our approach are key innovations aimed at overcoming the challenges in photo-realistic avatar synthesis. Firstly, we utilize a conditional Neural Radiance Fields (NeRF) model, trained on a large-scale unannotated multi-view dataset, to…

    Submitted 1 April, 2024; originally announced April 2024.

  5. arXiv:2310.14466  [pdf, other]

    cs.LG

    Inferring Relational Potentials in Interacting Systems

    Authors: Armand Comas-Massagué, Yilun Du, Christian Fernandez, Sandesh Ghimire, Mario Sznaier, Joshua B. Tenenbaum, Octavia Camps

    Abstract: Systems consisting of interacting agents are prevalent in the world, ranging from dynamical systems in physics to complex biological networks. To build systems which can interact robustly in the real world, it is thus important to be able to infer the precise interactions governing such systems. Existing approaches typically discover such interactions by explicitly modeling the feed-forward dynami…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: Published at ICML 2023 (Oral)

  6. arXiv:2305.11365  [pdf, other]

    cs.CV

    Enhancing Transformer Backbone for Egocentric Video Action Segmentation

    Authors: Sakib Reza, Balaji Sundareshan, Mohsen Moghaddam, Octavia Camps

    Abstract: Egocentric temporal action segmentation in videos is a crucial task in computer vision with applications in various fields such as mixed reality, human behavior analysis, and robotics. Although recent research has utilized advanced visual-language frameworks, transformers remain the backbone of action segmentation models. Therefore, it is necessary to improve transformers to enhance the robustness…

    Submitted 23 May, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Joint 3rd Ego4D and 11th EPIC Workshop on Egocentric Vision at CVPR 2023

  7. arXiv:2305.01733  [pdf, other]

    cs.CV

    Cross-view Action Recognition via Contrastive View-invariant Representation

    Authors: Yuexi Zhang, Dan Luo, Balaji Sundareshan, Octavia Camps, Mario Sznaier

    Abstract: Cross view action recognition (CVAR) seeks to recognize a human action when observed from a previously unseen viewpoint. This is a challenging problem since the appearance of an action changes significantly with the viewpoint. Applications of CVAR include surveillance and monitoring of assisted living facilities where it is not practical or feasible to collect large amounts of training data when addi…

    Submitted 2 May, 2023; originally announced May 2023.

  8. arXiv:2302.06670  [pdf, other]

    cs.LG cs.AI

    Explainable Anomaly Detection in Images and Videos: A Survey

    Authors: Yizhou Wang, Dongliang Guo, Sheng Li, Octavia Camps, Yun Fu

    Abstract: Anomaly detection and localization of visual data, including images and videos, are of great significance in both machine learning academia and applied real-world scenarios. Despite the rapid development of visual anomaly detection techniques in recent years, interpretations of these black-box models and reasonable explanations of why anomalies can be distinguished are scarce. This paper p…

    Submitted 9 April, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

  9. arXiv:2302.04411  [pdf, other]

    cs.LG cs.AI

    Geometry of Score Based Generative Models

    Authors: Sandesh Ghimire, Jinyang Liu, Armand Comas, Davin Hill, Aria Masoomi, Octavia Camps, Jennifer Dy

    Abstract: In this work, we look at Score-based generative models (also called diffusion generative models) from a geometric perspective. From a new viewpoint, we prove that both the forward and backward processes of adding noise and generating from noise are Wasserstein gradient flows in the space of probability measures. We are the first to prove this connection. Our understanding of Score-based (and Diffusi…

    Submitted 8 February, 2023; originally announced February 2023.

  10. arXiv:2302.02272  [pdf, other]

    cs.CV

    Divide and Compose with Score Based Generative Models

    Authors: Sandesh Ghimire, Armand Comas, Davin Hill, Aria Masoomi, Octavia Camps, Jennifer Dy

    Abstract: While score based generative models, or diffusion models, have found success in image synthesis, they are often coupled with text data or image labels to be able to manipulate and conditionally generate images. Even though manipulation of images by changing the text prompt is possible, our understanding of the text embedding and our ability to modify it to edit images is quite limited. Towards the…

    Submitted 4 February, 2023; originally announced February 2023.

  11. arXiv:2212.07495  [pdf, other]

    cs.CV

    SAIF: Sparse Adversarial and Imperceptible Attack Framework

    Authors: Tooba Imtiaz, Morgan Kohler, Jared Miller, Zifeng Wang, Mario Sznaier, Octavia Camps, Jennifer Dy

    Abstract: Adversarial attacks hamper the decision-making ability of neural networks by perturbing the input signal. The addition of calculated small distortion to images, for instance, can deceive a well-trained image classification network. In this work, we propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF). Specifically, we design imperceptible attacks tha…

    Submitted 6 December, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

  12. arXiv:2111.04807  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Unsupervised Approaches for Out-Of-Distribution Dermoscopic Lesion Detection

    Authors: Max Torop, Sandesh Ghimire, Wenqian Liu, Dana H. Brooks, Octavia Camps, Milind Rajadhyaksha, Jennifer Dy, Kivanc Kose

    Abstract: There are limited works showing the efficacy of unsupervised Out-of-Distribution (OOD) methods on complex medical data. Here, we present preliminary findings of our unsupervised OOD detection algorithm, SimCLR-LOF, as well as a recent state of the art approach (SSD), applied on medical images. SimCLR-LOF learns semantically meaningful features using SimCLR and uses LOF for scoring if a test sample…

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: NeurIPS: Medical Imaging Meets NeurIPS Workshop

  13. arXiv:2110.00547  [pdf, other]

    cs.CV

    Self-Supervised Decomposition, Disentanglement and Prediction of Video Sequences while Interpreting Dynamics: A Koopman Perspective

    Authors: Armand Comas, Sandesh Ghimire, Haolin Li, Mario Sznaier, Octavia Camps

    Abstract: Human interpretation of the world encompasses the use of symbols to categorize sensory inputs and compose them in a hierarchical manner. One of the long-term objectives of Computer Vision and Artificial Intelligence is to endow machines with the capacity of structuring and interpreting the world as we do. Towards this goal, recent methods have successfully been able to decompose and disentangle vi…

    Submitted 1 October, 2021; originally announced October 2021.

  14. arXiv:2007.15217  [pdf, other]

    cs.CV

    Key Frame Proposal Network for Efficient Pose Estimation in Videos

    Authors: Yuexi Zhang, Yin Wang, Octavia Camps, Mario Sznaier

    Abstract: Human pose estimation in video relies on local information by either estimating each frame independently or tracking poses across frames. In this paper, we propose a novel method combining local approaches with global context. We introduce a lightweight, unsupervised, key frame proposal network (K-FPN) to select informative frames and a learned dictionary to recover the entire pose sequence fro…

    Submitted 30 July, 2020; originally announced July 2020.

    Comments: Accepted by ECCV2020

  15. arXiv:2006.13391  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning Disentangled Representations of Video with Missing Data

    Authors: Armand Comas-Massagué, Chi Zhang, Zlatan Feric, Octavia Camps, Rose Yu

    Abstract: Missing data poses significant challenges while learning representations of video sequences. We present Disentangled Imputed Video autoEncoder (DIVE), a deep generative model that imputes and predicts future video frames in the presence of missing data. Specifically, DIVE introduces a missingness latent variable, disentangles the hidden video representations into static and dynamic appearance, pos…

    Submitted 3 November, 2020; v1 submitted 23 June, 2020; originally announced June 2020.

    Comments: Published at NeurIPS 2020

  16. arXiv:1911.07389  [pdf, other]

    cs.CV cs.LG

    Towards Visually Explaining Variational Autoencoders

    Authors: Wenqian Liu, Runze Li, Meng Zheng, Srikrishna Karanam, Ziyan Wu, Bir Bhanu, Richard J. Radke, Octavia Camps

    Abstract: Recent advances in Convolutional Neural Network (CNN) model interpretability have led to impressive progress in visualizing and understanding model predictions. In particular, gradient-based visual attention methods have driven much recent effort in using visual attention maps as a means for visual explanations. A key problem, however, is that these methods are designed for classification and categoriz…

    Submitted 14 April, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: 10 pages, 9 figures, 2 tables, CVPR 2020

  17. arXiv:1803.07201  [pdf, other]

    cs.CV

    DYAN: A Dynamical Atoms-Based Network for Video Prediction

    Authors: Wenqian Liu, Abhishek Sharma, Octavia Camps, Mario Sznaier

    Abstract: The ability to anticipate the future is essential when making real time critical decisions, provides valuable information to understand dynamic natural scenes, and can help unsupervised video representation learning. State-of-the-art video prediction is based on LSTM recursive networks and/or generative adversarial network learning. These are complex architectures that need to learn large numbers of p…

    Submitted 14 September, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

  18. arXiv:1802.07303  [pdf, other]

    cs.CV

    MoNet: Moments Embedding Network

    Authors: Mengran Gou, Fei Xiong, Octavia Camps, Mario Sznaier

    Abstract: Bilinear pooling has been recently proposed as a feature encoding layer, which can be used after the convolutional layers of a deep network, to improve performance in multiple vision tasks. Unlike conventional global average pooling or fully connected layers, bilinear pooling gathers 2nd order information in a translation invariant fashion. However, a serious drawback of this family of pool…

    Submitted 29 March, 2018; v1 submitted 20 February, 2018; originally announced February 2018.

    Comments: Accepted in CVPR 2018

  19. arXiv:1709.07065  [pdf, other]

    cs.CV

    Multi-camera Multi-Object Tracking

    Authors: Wenqian Liu, Octavia Camps, Mario Sznaier

    Abstract: In this paper, we propose a pipeline for multi-target visual tracking under a multi-camera system. For the multi-camera tracking problem, efficient data association across cameras and, at the same time, across frames becomes more important than in single-camera tracking. However, most multi-camera tracking algorithms emphasize single-camera, across-frame data association. Thus in our…

    Submitted 20 September, 2017; originally announced September 2017.

  20. arXiv:1605.09653  [pdf, other]

    cs.CV

    A Systematic Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets

    Authors: Srikrishna Karanam, Mengran Gou, Ziyan Wu, Angels Rates-Borras, Octavia Camps, Richard J. Radke

    Abstract: Person re-identification (re-id) is a critical problem in video analytics applications such as security and surveillance. The public release of several datasets and code for vision algorithms has facilitated rapid progress in this area over the last few years. However, directly comparing re-id algorithms reported in the literature has become difficult since a wide variety of features, experimental…

    Submitted 14 February, 2018; v1 submitted 31 May, 2016; originally announced May 2016.

    Comments: Preliminary work on person Re-Id benchmark. S. Karanam and M. Gou contributed equally. 14 pages, 6 figures, 4 tables. For supplementary material, see http://robustsystems.coe.neu.edu/sites/robustsystems.coe.neu.edu/files/systems/supmat/ReID_benchmark_supp.zip

  21. arXiv:1604.00367  [pdf, other]

    cs.CV

    Person Re-identification in Appearance Impaired Scenarios

    Authors: Mengran Gou, Xikang Zhang, Angels Rates-Borras, Sadjad Asghari-Esfeden, Mario Sznaier, Octavia Camps

    Abstract: Person re-identification is critical in surveillance applications. Current approaches rely on appearance based features extracted from a single or multiple shots of the target and candidate matches. These approaches are at a disadvantage when trying to distinguish between candidates dressed in similar colors or when targets change their clothing. In this paper we propose a dynamics-based feature t…

    Submitted 1 April, 2016; originally announced April 2016.

    Comments: 10 pages

  22. arXiv:1504.00905  [pdf, other]

    math.OC cs.CV cs.LG eess.SY

    Robust Anomaly Detection Using Semidefinite Programming

    Authors: Jose A. Lopez, Octavia Camps, Mario Sznaier

    Abstract: This paper presents a new approach, based on polynomial optimization and the method of moments, to the problem of anomaly detection. The proposed technique only requires information about the statistical moments of the normal-state distribution of the features of interest and compares favorably with existing approaches (such as Parzen windows and 1-class SVM). In addition, it provides a succinct d…

    Submitted 30 May, 2015; v1 submitted 3 April, 2015; originally announced April 2015.

    Comments: 13 pages, 11 figures