Showing 1–50 of 75 results for author: Hébert, M

Searching in archive cs.
  1. arXiv:2312.06712  [pdf, other]

    cs.CV cs.AI

    Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models

    Authors: Zhipeng Bao, Yijun Li, Krishna Kumar Singh, Yu-Xiong Wang, Martial Hebert

    Abstract: Despite recent significant strides achieved by diffusion-based Text-to-Image (T2I) models, current systems are still less capable of ensuring decent compositional generation aligned with text prompts, particularly for the multi-object generation. This work illuminates the fundamental reasons for such misalignment, pinpointing issues related to low attention activation scores and mask overlaps. Whi…

    Submitted 31 January, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  2. arXiv:2309.17450  [pdf, other]

    cs.CV

    Multi-task View Synthesis with Neural Radiance Fields

    Authors: Shuhong Zheng, Zhipeng Bao, Martial Hebert, Yu-Xiong Wang

    Abstract: Multi-task visual learning is a critical aspect of computer vision. Current research, however, predominantly concentrates on the multi-task dense prediction setting, which overlooks the intrinsic 3D world and its multi-view consistent structures, and lacks the capability for versatile imagination. In response to these limitations, we present a novel problem setting -- multi-task view synthesis (MT…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: ICCV 2023, Website: https://zsh2000.github.io/mtvs.github.io/

  3. arXiv:2308.14737  [pdf, other]

    cs.CV cs.AI cs.GR

    Flexible Techniques for Differentiable Rendering with 3D Gaussians

    Authors: Leonid Keselman, Martial Hebert

    Abstract: Fast, reliable shape reconstruction is an essential ingredient in many computer vision applications. Neural Radiance Fields demonstrated that photorealistic novel view synthesis is within reach, but was gated by performance requirements for fast reconstruction of real scenes and objects. Several recent approaches have built on alternative shape representations, in particular, 3D Gaussians. We deve…

    Submitted 28 August, 2023; originally announced August 2023.

    ACM Class: I.2.10; I.3.7; I.4.0

  4. arXiv:2308.04571  [pdf, other]

    cs.RO cs.CV cs.HC

    Optimizing Algorithms From Pairwise User Preferences

    Authors: Leonid Keselman, Katherine Shih, Martial Hebert, Aaron Steinfeld

    Abstract: Typical black-box optimization approaches in robotics focus on learning from metric scores. However, that is not always possible, as not all developers have ground truth available. Learning appropriate robot behavior in human-centric contexts often requires querying users, who typically cannot provide precise metric scores. Existing approaches leverage human feedback in an attempt to model an impl…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted at IROS 2023

    ACM Class: I.2.9; H.1.2; I.2.8

  5. arXiv:2306.14035  [pdf, other]

    cs.CV

    Thinking Like an Annotator: Generation of Dataset Labeling Instructions

    Authors: Nadine Chang, Francesco Ferroni, Michael J. Tarr, Martial Hebert, Deva Ramanan

    Abstract: Large-scale datasets are essential to modern day deep learning. Advocates argue that understanding these methods requires dataset transparency (e.g. "dataset curation, motivation, composition, collection process, etc..."). However, almost no one has suggested the release of the detailed definitions and visual category examples provided to annotators - information critical to understanding the stru…

    Submitted 24 June, 2023; originally announced June 2023.

  6. arXiv:2304.12372  [pdf, other]

    cs.CV

    Beyond the Pixel: a Photometrically Calibrated HDR Dataset for Luminance and Color Prediction

    Authors: Christophe Bolduc, Justine Giroux, Marc Hébert, Claude Demers, Jean-François Lalonde

    Abstract: Light plays an important role in human well-being. However, most computer vision tasks treat pixels without considering their relationship to physical luminance. To address this shortcoming, we introduce the Laval Photometric Indoor HDR Dataset, the first large-scale photometrically calibrated dataset of high dynamic range 360° panoramas. Our key contribution is the calibration of an existing, unc…

    Submitted 13 October, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

  7. arXiv:2303.15555  [pdf, other]

    cs.CV

    Object Discovery from Motion-Guided Tokens

    Authors: Zhipeng Bao, Pavel Tokmakov, Yu-Xiong Wang, Adrien Gaidon, Martial Hebert

    Abstract: Object discovery -- separating objects from the background without manual labels -- is a fundamental open challenge in computer vision. Previous methods struggle to go beyond clustering of low-level cues, whether handcrafted (e.g., color, texture) or learned (e.g., from auto-encoders). In this work, we augment the auto-encoder representation learning framework with two key components: motion-guida…

    Submitted 27 March, 2023; originally announced March 2023.

    Journal ref: CVPR 2023

  8. arXiv:2303.07434  [pdf, other]

    cs.AI cs.RO

    Discovering Multiple Algorithm Configurations

    Authors: Leonid Keselman, Martial Hebert

    Abstract: Many practitioners in robotics regularly depend on classic, hand-designed algorithms. Often the performance of these algorithms is tuned across a dataset of annotated examples which represent typical deployment conditions. Automatic tuning of these settings is traditionally known as algorithm configuration. In this work, we extend algorithm configuration to automatically discover multiple modes in…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: 8 pages, accepted to ICRA 2023

    ACM Class: I.2.9; I.2.6; I.2.8

  9. arXiv:2211.11182  [pdf, other]

    cs.CV

    Deep Projective Rotation Estimation through Relative Supervision

    Authors: Brian Okorn, Chuer Pan, Martial Hebert, David Held

    Abstract: Orientation estimation is the core to a variety of vision and robotics tasks such as camera and object pose estimation. Deep learning has offered a way to develop image-based orientation estimators; however, such estimators often require training on a large labeled dataset, which can be time-intensive to collect. In this work, we explore whether self-supervised learning from unlabeled data can be…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: Conference on Robot Learning (CoRL), 2022. Supplementary material is available at https://sites.google.com/view/deep-projective-rotation/home

  10. arXiv:2208.12278  [pdf, other]

    cs.CV cs.AI cs.GR

    Learning Continuous Implicit Representation for Near-Periodic Patterns

    Authors: Bowei Chen, Tiancheng Zhi, Martial Hebert, Srinivasa G. Narasimhan

    Abstract: Near-Periodic Patterns (NPP) are ubiquitous in man-made scenes and are composed of tiled motifs with appearance differences caused by lighting, defects, or design elements. A good NPP representation is useful for many applications including image completion, segmentation, and geometric remapping. But representing NPP is challenging because it needs to maintain global consistency (tiled motifs layo…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: ECCV 2022. Project page: https://armastuschen.github.io/projects/NPP_Net/

  11. arXiv:2207.10606  [pdf, other]

    cs.CV cs.AI cs.GR

    Approximate Differentiable Rendering with Algebraic Surfaces

    Authors: Leonid Keselman, Martial Hebert

    Abstract: Differentiable renderers provide a direct mathematical link between an object's 3D representation and images of that object. In this work, we develop an approximate differentiable renderer for a compact, interpretable representation, which we call Fuzzy Metaballs. Our approximate renderer focuses on rendering shapes via depth maps and silhouettes. It sacrifices fidelity for utility, producing fast…

    Submitted 21 July, 2022; originally announced July 2022.

    Comments: Accepted to the European Conference on Computer Vision (ECCV) 2022

    ACM Class: I.2.10; I.3.7; I.4.0

  12. arXiv:2206.04669  [pdf, other]

    cs.CV

    Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields

    Authors: Mingtong Zhang, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Yu-Xiong Wang

    Abstract: Comprehensive 3D scene understanding, both geometrically and semantically, is important for real-world applications such as robot perception. Most of the existing work has focused on developing data-driven discriminative models for scene understanding. This paper provides a new approach to scene understanding, from a synthesis model perspective, by leveraging the recent progress on implicit 3D rep…

    Submitted 9 June, 2022; originally announced June 2022.

  13. arXiv:2205.13150  [pdf, other]

    cs.GR

    Semantically Supervised Appearance Decomposition for Virtual Staging from a Single Panorama

    Authors: Tiancheng Zhi, Bowei Chen, Ivaylo Boyadzhiev, Sing Bing Kang, Martial Hebert, Srinivasa G. Narasimhan

    Abstract: We describe a novel approach to decompose a single panorama of an empty indoor environment into four appearance components: specular, direct sunlight, diffuse and diffuse ambient without direct sunlight. Our system is weakly supervised by automatically generated semantic maps (with floor, wall, ceiling, lamp, window and door labels) that have shown success on perspective views and are trained for…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: To appear in SIGGRAPH 2022

  14. arXiv:2203.10159  [pdf, other]

    cs.CV

    Discovering Objects that Can Move

    Authors: Zhipeng Bao, Pavel Tokmakov, Allan Jabri, Yu-Xiong Wang, Adrien Gaidon, Martial Hebert

    Abstract: This paper studies the problem of object discovery -- separating objects from the background without manual labels. Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions. However, by relying on appearance alone, these methods fail to separate objects from the background in cluttered scenes. This is a fundamental limitation since…

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022

  15. arXiv:2106.13409  [pdf, other]

    cs.CV

    Generative Modeling for Multi-task Visual Learning

    Authors: Zhipeng Bao, Martial Hebert, Yu-Xiong Wang

    Abstract: Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model that is useful across various visual perception tasks. Correspondingly, we propose a general multi-task or…

    Submitted 24 June, 2021; originally announced June 2021.

  16. arXiv:2106.10766  [pdf, other]

    cs.CV cs.AI

    Learning to Track Object Position through Occlusion

    Authors: Satyaki Chakraborty, Martial Hebert

    Abstract: Occlusion is one of the most significant challenges encountered by object detectors and trackers. While both object detection and tracking have received a lot of attention in the past, most existing methods in this domain do not target detecting or tracking objects when they are occluded. However, being able to detect or track an object of interest through occlusion has been a long standing challen…

    Submitted 20 June, 2021; originally announced June 2021.

  17. arXiv:2104.13526  [pdf, other]

    cs.CV cs.RO

    ZePHyR: Zero-shot Pose Hypothesis Rating

    Authors: Brian Okorn, Qiao Gu, Martial Hebert, David Held

    Abstract: Pose estimation is a basic module in many robot manipulation pipelines. Estimating the pose of objects in the environment can be useful for grasping, motion planning, or manipulation. However, current state-of-the-art methods for pose estimation either rely on large annotated training sets or simulated data. Further, the long training times for these methods prohibit quick interaction with novel o…

    Submitted 30 April, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: 8 pages, 4 figures. Accepted to ICRA 2021. Brian and Qiao have equal contributions

  18. arXiv:2012.09418  [pdf, other]

    cs.CV

    PanoNet3D: Combining Semantic and Geometric Understanding for LiDARPoint Cloud Detection

    Authors: Xia Chen, Jianren Wang, David Held, Martial Hebert

    Abstract: Visual data in autonomous driving perception, such as camera image and LiDAR point cloud, can be interpreted as a mixture of two aspects: semantic feature and geometric structure. Semantics come from the appearance and context of objects to the sensor, while geometric structure is the actual 3D shape of point clouds. Most detectors on LiDAR point clouds focus only on analyzing the geometric struct…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 3DV2020

  19. arXiv:2010.07968  [pdf, other]

    cs.AI cs.LG cs.RO

    Constrained Model-based Reinforcement Learning with Robust Cross-Entropy Method

    Authors: Zuxin Liu, Hongyi Zhou, Baiming Chen, Sicheng Zhong, Martial Hebert, Ding Zhao

    Abstract: This paper studies the constrained/safe reinforcement learning (RL) problem with sparse indicator signals for constraint violations. We propose a model-based approach to enable RL agents to effectively explore the environment with unknown system dynamics and environment constraints given a significantly small number of violation budgets. We employ the neural network ensemble model to estimate the…

    Submitted 6 March, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

    Comments: 8 pages, 5 figures

  20. arXiv:2008.09892  [pdf, other]

    cs.CV

    Few-Shot Learning with Intra-Class Knowledge Transfer

    Authors: Vivek Roy, Yan Xu, Yu-Xiong Wang, Kris Kitani, Ruslan Salakhutdinov, Martial Hebert

    Abstract: We consider the few-shot classification task with an unbalanced dataset, in which some classes have sufficient training samples while other classes only have limited training samples. Recent works have proposed to solve this task by augmenting the training data of the few-shot classes using generative models with the few-shot training samples as the seeds. However, due to the limited number of the…

    Submitted 22 August, 2020; originally announced August 2020.

  21. arXiv:2008.07073  [pdf, other]

    cs.CV

    AlphaNet: Improving Long-Tail Classification By Combining Classifiers

    Authors: Nadine Chang, Jayanth Koushik, Aarti Singh, Martial Hebert, Yu-Xiong Wang, Michael J. Tarr

    Abstract: Methods in long-tail learning focus on improving performance for data-poor (rare) classes; however, performance for such classes remains much lower than performance for more data-rich (frequent) classes. Analyzing the predictions of long-tail methods for rare classes reveals that a large number of errors are due to misclassification of rare items as visually similar frequent classes. To address th…

    Submitted 26 July, 2023; v1 submitted 16 August, 2020; originally announced August 2020.

  22. arXiv:2008.06981  [pdf, other]

    cs.CV

    Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and Novel-View Synthesis

    Authors: Zhipeng Bao, Yu-Xiong Wang, Martial Hebert

    Abstract: We propose a novel task of joint few-shot recognition and novel-view synthesis: given only one or few images of a novel object from arbitrary views with only category annotation, we aim to simultaneously learn an object classifier and generate images of that type of object from new viewpoints. While existing work copes with two or more tasks mainly by multi-task learning of shareable feature repre…

    Submitted 6 April, 2021; v1 submitted 16 August, 2020; originally announced August 2020.

    Comments: Accepted as a Poster paper at ICLR 2021

  23. arXiv:2008.00192  [pdf, other]

    cs.CV

    PanoNet: Real-time Panoptic Segmentation through Position-Sensitive Feature Embedding

    Authors: Xia Chen, Jianren Wang, Martial Hebert

    Abstract: We propose a simple, fast, and flexible framework to generate simultaneously semantic and instance masks for panoptic segmentation. Our method, called PanoNet, incorporates a clean and natural structure design that tackles the problem purely as a segmentation task without the time-consuming detection process. We also introduce position-sensitive embedding for instance grouping by accounting for bo…

    Submitted 1 August, 2020; originally announced August 2020.

  24. arXiv:2007.15724  [pdf, other]

    cs.RO cs.AI cs.LG cs.MA

    MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments

    Authors: Zuxin Liu, Baiming Chen, Hongyi Zhou, Guru Koushik, Martial Hebert, Ding Zhao

    Abstract: Multi-agent navigation in dynamic environments is of great industrial value when deploying a large-scale fleet of robots to real-world applications. This paper proposes a decentralized partially observable multi-agent path planning with evolutionary reinforcement learning (MAPPER) method to learn an effective local planning policy in mixed dynamic environments. Reinforcement learning-based methods…

    Submitted 30 July, 2020; originally announced July 2020.

    Comments: 6 pages, accepted at the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2020)

  25. Learning Orientation Distributions for Object Pose Estimation

    Authors: Brian Okorn, Mengyun Xu, Martial Hebert, David Held

    Abstract: For robots to operate robustly in the real world, they should be aware of their uncertainty. However, most methods for object pose estimation return a single point estimate of the object's pose. In this work, we propose two learned methods for estimating a distribution over an object's orientation. Our methods take into account both the inaccuracies in the pose estimation as well as the object sym…

    Submitted 10 August, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

  26. arXiv:2006.15731  [pdf, other]

    cs.CV

    Unsupervised Learning of Video Representations via Dense Trajectory Clustering

    Authors: Pavel Tokmakov, Martial Hebert, Cordelia Schmid

    Abstract: This paper addresses the task of unsupervised learning of representations for action recognition in videos. Previous works proposed to utilize future prediction, or other domain-specific objectives to train a network, but achieved only limited success. In contrast, in the relevant field of image representation learning, simpler, discrimination-based methods have recently bridged the gap to fully-s…

    Submitted 28 June, 2020; originally announced June 2020.

  27. arXiv:1911.12911  [pdf, other]

    cs.CV

    Unlocking the Full Potential of Small Data with Diverse Supervision

    Authors: Ziqi Pang, Zhiyuan Hu, Pavel Tokmakov, Yu-Xiong Wang, Martial Hebert

    Abstract: Virtually all of deep learning literature relies on the assumption of large amounts of available training data. Indeed, even the majority of few-shot learning methods rely on a large set of "base classes" for pretraining. This assumption, however, does not always hold. For some tasks, annotating a large number of classes can be infeasible, and even collecting the images themselves can be a challen…

    Submitted 26 April, 2021; v1 submitted 28 November, 2019; originally announced November 2019.

    Comments: Learning from Limited and Imperfect Data (L2ID) Workshop @ CVPR 2021

  28. arXiv:1910.07093  [pdf, other]

    cs.RO cs.AI cs.LG

    Explainable Semantic Mapping for First Responders

    Authors: Jean Oh, Martial Hebert, Hae-Gon Jeon, Xavier Perez, Chia Dai, Yeeho Song

    Abstract: One of the key challenges in the semantic mapping problem in postdisaster environments is how to analyze a large amount of data efficiently with minimal supervision. To address this challenge, we propose a deep learning-based semantic mapping tool consisting of three main ideas. First, we develop a frugal semantic segmentation algorithm that uses only a small amount of labeled data. Next, we inves…

    Submitted 15 October, 2019; originally announced October 2019.

    Comments: Artificial Intelligence for Humanitarian Assistance and Disaster Response Workshop at NeurIPS 2019

  29. arXiv:1907.11821  [pdf, other]

    cs.CV

    Quadtree Generating Networks: Efficient Hierarchical Scene Parsing with Sparse Convolutions

    Authors: Kashyap Chitta, Jose M. Alvarez, Martial Hebert

    Abstract: Semantic segmentation with Convolutional Neural Networks is a memory-intensive task due to the high spatial resolution of feature maps and output predictions. In this paper, we present Quadtree Generating Networks (QGNs), a novel approach able to drastically reduce the memory footprint of modern semantic segmentation networks. The key idea is to use quadtrees to represent the predictions and targe…

    Submitted 17 September, 2019; v1 submitted 26 July, 2019; originally announced July 2019.

    Comments: Accepted for IEEE Winter Conference on Applications of Computer Vision (WACV) 2020

  30. arXiv:1907.07844  [pdf, other]

    cs.CV cs.LG

    Growing a Brain: Fine-Tuning by Increasing Model Capacity

    Authors: Yu-Xiong Wang, Deva Ramanan, Martial Hebert

    Abstract: CNNs have made an undeniable impact on computer vision through the ability to learn high-capacity models with large annotated training sets. One of their remarkable properties is the ability to transfer knowledge from a large source dataset to a (typically smaller) target dataset. This is usually accomplished through fine-tuning a fixed-size network on new target data. Indeed, virtually every cont…

    Submitted 17 July, 2019; originally announced July 2019.

    Comments: CVPR

  31. arXiv:1906.04838  [pdf, other]

    cs.CV

    Edge-Direct Visual Odometry

    Authors: Kevin Christensen, Martial Hebert

    Abstract: In this paper we propose an edge-direct visual odometry algorithm that efficiently utilizes edge pixels to find the relative pose that minimizes the photometric error between images. Prior work on exploiting edge pixels instead treats edges as features and employs various techniques to match edge lines or pixels, which adds unnecessary complexity. Direct methods typically operate on all pixel inten…

    Submitted 11 June, 2019; originally announced June 2019.

  32. arXiv:1905.11641  [pdf, other]

    cs.CV

    Image Deformation Meta-Networks for One-Shot Learning

    Authors: Zitian Chen, Yanwei Fu, Yu-Xiong Wang, Lin Ma, Wei Liu, Martial Hebert

    Abstract: Humans can robustly learn novel visual concepts even when images undergo various deformations and lose certain information. Mimicking the same behavior and synthesizing deformed instances of new concepts may help visual recognition systems perform better one-shot learning, i.e., learning concepts from one or few examples. Our key insight is that, while the deformed images may not be visually reali…

    Submitted 17 July, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: Oral at CVPR2019. Code is available at https://github.com/tankche1/IDeMe-Net

  33. arXiv:1905.02706  [pdf, other]

    cs.CV cs.LG

    Learning Unsupervised Multi-View Stereopsis via Robust Photometric Consistency

    Authors: Tejas Khot, Shubham Agrawal, Shubham Tulsiani, Christoph Mertz, Simon Lucey, Martial Hebert

    Abstract: We present a learning based approach for multi-view stereopsis (MVS). While current deep MVS methods achieve impressive results, they crucially rely on ground-truth 3D training data, and acquisition of such precise 3D geometry for supervision is a major hurdle. Our framework instead leverages photometric consistency between multiple views as supervisory signal for learning depth prediction in a wi…

    Submitted 6 June, 2019; v1 submitted 7 May, 2019; originally announced May 2019.

  34. arXiv:1904.12993  [pdf, other]

    cs.CV

    A Study on Action Detection in the Wild

    Authors: Yubo Zhang, Pavel Tokmakov, Martial Hebert, Cordelia Schmid

    Abstract: The recent introduction of the AVA dataset for action detection has caused a renewed interest in this problem. Several approaches have been recently proposed that improved the performance. However, all of them have ignored the main difficulty of the AVA dataset - its realistic distribution of training and test examples. This dataset was collected by exhaustive annotation of human action in uncurat…

    Submitted 9 June, 2019; v1 submitted 29 April, 2019; originally announced April 2019.

  35. arXiv:1904.05537  [pdf, other]

    cs.CV cs.GR

    Direct Fitting of Gaussian Mixture Models

    Authors: Leonid Keselman, Martial Hebert

    Abstract: When fitting Gaussian Mixture Models to 3D geometry, the model is typically fit to point clouds, even when the shapes were obtained as 3D meshes. Here we present a formulation for fitting Gaussian Mixture Models (GMMs) directly to a triangular mesh instead of using points sampled from its surface. Part of this work analyzes a general formulation for evaluating likelihood of geometric objects. This…

    Submitted 11 June, 2019; v1 submitted 11 April, 2019; originally announced April 2019.

    Comments: Accepted to the Conference on Computer and Robot Vision 2019. 8 pages

  36. arXiv:1812.09213  [pdf, other]

    cs.CV

    Learning Compositional Representations for Few-Shot Recognition

    Authors: Pavel Tokmakov, Yu-Xiong Wang, Martial Hebert

    Abstract: One of the key limitations of modern deep learning approaches lies in the amount of data required to train them. Humans, by contrast, can learn to recognize novel categories from just a few examples. Instrumental to this rapid learning ability is the compositional structure of concept representations in the human brain --- something that deep learning models are lacking. In this work, we make a st…

    Submitted 17 August, 2019; v1 submitted 21 December, 2018; originally announced December 2018.

  37. Deep Spectral Reflectance and Illuminant Estimation from Self-Interreflections

    Authors: Rada Deeb, Joost Van De Weijer, Damien Muselet, Mathieu Hebert, Alain Tremeau

    Abstract: In this work, we propose a CNN-based approach to estimate the spectral reflectance of a surface and the spectral power distribution of the light from a single RGB image of a V-shaped surface. Interreflections happening in a concave surface lead to gradients of RGB values over its area. These gradients carry a lot of information concerning the physical properties of the surface and the illuminant.…

    Submitted 9 December, 2018; originally announced December 2018.

    Comments: Accepted by JOSA A

  38. arXiv:1812.03544  [pdf, other]

    cs.CV

    A Structured Model For Action Detection

    Authors: Yubo Zhang, Pavel Tokmakov, Martial Hebert, Cordelia Schmid

    Abstract: A dominant paradigm for learning-based approaches in computer vision is training generic models, such as ResNet for image recognition, or I3D for video understanding, on large datasets and allowing them to discover the optimal representation for the problem at hand. While this is an obviously attractive approach, it is not applicable in all scenarios. We claim that action detection is one such cha…

    Submitted 5 June, 2019; v1 submitted 9 December, 2018; originally announced December 2018.

  39. arXiv:1811.11209  [pdf, other]

    cs.CV cs.LG

    Iterative Transformer Network for 3D Point Cloud

    Authors: Wentao Yuan, David Held, Christoph Mertz, Martial Hebert

    Abstract: 3D point cloud is an efficient and flexible representation of 3D structures. Recently, neural networks operating on point clouds have shown superior performance on 3D understanding tasks such as shape classification and part segmentation. However, performance on such tasks is evaluated on complete shapes aligned in a canonical frame, while real world 3D data are partial and unaligned. A key challe…

    Submitted 17 October, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

  40. arXiv:1811.03542  [pdf, other]

    cs.CV cs.LG

    Adaptive Semantic Segmentation with a Strategic Curriculum of Proxy Labels

    Authors: Kashyap Chitta, Jianwei Feng, Martial Hebert

    Abstract: Training deep networks for semantic segmentation requires annotation of large amounts of data, which can be time-consuming and expensive. Unfortunately, these trained networks still generalize poorly when tested in domains not consistent with the training data. In this paper, we show that by carefully presenting a mixture of labeled source domain and proxy-labeled target domain data to a network,…

    Submitted 8 November, 2018; originally announced November 2018.

  41. Real-Time Object Pose Estimation with Pose Interpreter Networks

    Authors: Jimmy Wu, Bolei Zhou, Rebecca Russell, Vincent Kee, Syler Wagner, Mitchell Hebert, Antonio Torralba, David M. S. Johnson

    Abstract: In this work, we introduce pose interpreter networks for 6-DoF object pose estimation. In contrast to other CNN-based approaches to pose estimation that require expensively annotated object pose data, our pose interpreter network is trained entirely on synthetic pose data. We use object masks as an intermediate representation to bridge real and synthetic. We show that when combined with a segmenta…

    Submitted 3 August, 2018; originally announced August 2018.

    Comments: To appear at 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018). Code available at https://github.com/jimmyyhwu/pose-interpreter-networks

  42. arXiv:1808.00671  [pdf, other]

    cs.CV cs.RO

    PCN: Point Completion Network

    Authors: Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, Martial Hebert

    Abstract: Shape completion, the problem of estimating the complete geometry of objects from partial observations, lies at the core of many vision and robotics applications. In this work, we propose Point Completion Network (PCN), a novel learning-based approach for shape completion. Unlike existing shape completion methods, PCN directly operates on raw point clouds without any structural assumption (e.g. sy…

    Submitted 26 September, 2019; v1 submitted 2 August, 2018; originally announced August 2018.

    Comments: 3DV 2018 oral. Honorable mention for Best Paper award

  43. Spectral reflectance estimation from one RGB image using self-interreflections in a concave object

    Authors: Rada Deeb, Damien Muselet, Mathieu Hebert, Alain Tremeau

    Abstract: Light interreflections occurring in a concave object generate a color gradient which is characteristic of the object's spectral reflectance. In this paper, we use this property in order to estimate the spectral reflectance of matte, uniformly colored, V-shaped surfaces from a single RGB image taken under directional lighting. First, simulations show that using one image of the concave object is eq…

    Submitted 5 March, 2018; originally announced March 2018.

    Comments: submitted to Applied Optics

  44. arXiv:1801.05401  [pdf, other]

    cs.CV

    Low-Shot Learning from Imaginary Data

    Authors: Yu-Xiong Wang, Ross Girshick, Martial Hebert, Bharath Hariharan

    Abstract: Humans can quickly learn new visual concepts, perhaps because they can easily visualize or imagine what novel objects look like from different views. Incorporating this ability to hallucinate novel instances of new concepts might help machine vision systems perform better low-shot learning, i.e., learning concepts from few examples. We present a novel approach to low-shot learning that uses this i…

    Submitted 2 April, 2018; v1 submitted 16 January, 2018; originally announced January 2018.

    Comments: CVPR 2018 camera-ready version

  45. arXiv:1712.01238  [pdf, other]

    cs.CV cs.CL cs.LG

    Learning by Asking Questions

    Authors: Ishan Misra, Ross Girshick, Rob Fergus, Martial Hebert, Abhinav Gupta, Laurens van der Maaten

    Abstract: We introduce an interactive learning framework for the development and testing of intelligent visual systems, called learning-by-asking (LBA). We explore LBA in context of the Visual Question Answering (VQA) task. LBA differs from standard VQA training in that most questions are not observed during training time, and the learner must ask questions it wants answers to. Thus, LBA more closely mimics…

    Submitted 4 December, 2017; originally announced December 2017.

  46. arXiv:1711.02216  [pdf, other]

    cs.RO

    SegICP-DSR: Dense Semantic Scene Reconstruction and Registration

    Authors: Jay M. Wong, Syler Wagner, Connor Lawson, Vincent Kee, Mitchell Hebert, Justin Rooney, Gian-Luca Mariottini, Rebecca Russell, Abraham Schneider, Rahul Chipalkatty, David M. S. Johnson

    Abstract: To enable autonomous robotic manipulation in unstructured environments, we present SegICP-DSR, a real-time, dense, semantic scene reconstruction and pose estimation algorithm that achieves mm-level pose accuracy and standard deviation (7.9 mm, σ=7.6 mm and 1.7 deg, σ=0.7 deg) and successfully identified the object pose in 97% of test cases. This represents a 29% increase in accuracy, and a 14%…

    Submitted 6 November, 2017; originally announced November 2017.

  47. arXiv:1711.00002  [pdf, other]

    cs.CV cs.LG

    Log-DenseNet: How to Sparsify a DenseNet

    Authors: Hanzhang Hu, Debadeepta Dey, Allison Del Giorno, Martial Hebert, J. Andrew Bagnell

    Abstract: Skip connections are increasingly utilized by deep neural networks to improve accuracy and cost-efficiency. In particular, the recent DenseNet is efficient in computation and parameters, and achieves state-of-the-art predictions by directly connecting each feature layer to all previous ones. However, DenseNet's extreme connectivity pattern may hinder its scalability to high depths, and in applicat…

    Submitted 30 October, 2017; originally announced November 2017.

  48. arXiv:1709.08520  [pdf, other]

    stat.ML cs.LG

    Predictive-State Decoders: Encoding the Future into Recurrent Networks

    Authors: Arun Venkatraman, Nicholas Rhinehart, Wen Sun, Lerrel Pinto, Martial Hebert, Byron Boots, Kris M. Kitani, J. Andrew Bagnell

    Abstract: Recurrent neural networks (RNNs) are a vital modeling technique that rely on internal states learned indirectly by optimization of a supervised, unsupervised, or reinforcement training loss. RNNs are used to model dynamic processes that are characterized by underlying latent states whose form is often unknown, precluding its analytic representation inside an RNN. In the Predictive-State Representa…

    Submitted 25 September, 2017; originally announced September 2017.

    Comments: NIPS 2017

  49. arXiv:1709.04549  [pdf, other]

    cs.LG

    Ignoring Distractors in the Absence of Labels: Optimal Linear Projection to Remove False Positives During Anomaly Detection

    Authors: Allison Del Giorno, J. Andrew Bagnell, Martial Hebert

    Abstract: In the anomaly detection setting, the native feature embedding can be a crucial source of bias. We present a technique, Feature Omission using Context in Unsupervised Settings (FOCUS) to learn a feature mapping that is invariant to changes exemplified in training sets while retaining as much descriptive power as possible. While this method could apply to many unsupervised settings, we focus on app…

    Submitted 13 September, 2017; originally announced September 2017.

    Comments: 13 pages, 6 figures

  50. arXiv:1708.06832  [pdf, other]

    cs.LG cs.AI

    Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing

    Authors: Hanzhang Hu, Debadeepta Dey, Martial Hebert, J. Andrew Bagnell

    Abstract: This work considers the trade-off between accuracy and test-time computational cost of deep neural networks (DNNs) via anytime predictions from auxiliary predictions. Specifically, we optimize auxiliary losses jointly in an adaptive weighted sum, where the weights are inversely proportional to average of each loss. Intuitively, this balances the losses to have the same scale. We demo…

    Submitted 25 May, 2018; v1 submitted 22 August, 2017; originally announced August 2017.