
Showing 1–17 of 17 results for author: Zablocki, E

Searching in archive cs.
  1. arXiv:2406.08113  [pdf, other]

    cs.CV cs.RO

    Valeo4Cast: A Modular Approach to End-to-End Forecasting

    Authors: Yihong Xu, Éloi Zablocki, Alexandre Boulch, Gilles Puy, Mickael Chen, Florent Bartoccioni, Nermin Samet, Oriane Siméoni, Spyros Gidaris, Tuan-Hung Vu, Andrei Bursuc, Eduardo Valle, Renaud Marlet, Matthieu Cord

    Abstract: Motion forecasting is crucial in autonomous driving systems to anticipate the future trajectories of surrounding agents such as pedestrians, vehicles, and traffic signals. In end-to-end forecasting, the model must jointly detect from sensor data (cameras or LiDARs) the position and past trajectories of the different elements of the scene and predict their future location. We depart from the curren…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Winning solution of the Argoverse 2 "Unified Detection, Tracking, and Forecasting" challenge, held at CVPR 2024 WAD

  2. arXiv:2403.15098  [pdf, other]

    cs.CV

    UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction

    Authors: Lan Feng, Mohammadhossein Bahari, Kaouther Messaoud Ben Amor, Éloi Zablocki, Matthieu Cord, Alexandre Alahi

    Abstract: Vehicle trajectory prediction has increasingly relied on data-driven solutions, but their ability to scale to different data domains and the impact of larger dataset sizes on their generalization remain under-explored. While these questions can be studied by employing multiple datasets, it is challenging due to several discrepancies, e.g., in data formats, map resolution, and semantic annotation t…

    Submitted 27 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  3. arXiv:2312.00703  [pdf, other]

    cs.CV

    PointBeV: A Sparse Approach to BeV Predictions

    Authors: Loick Chambon, Eloi Zablocki, Mickael Chen, Florent Bartoccioni, Patrick Perez, Matthieu Cord

    Abstract: Bird's-eye View (BeV) representations have emerged as the de-facto shared space in driving applications, offering a unified space for sensor data fusion and supporting various downstream tasks. However, conventional models use grids with fixed resolution and range and face computational inefficiencies due to the uniform allocation of resources across all cells. To address this, we propose PointBeV…

    Submitted 23 May, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: https://github.com/valeoai/PointBeV

  4. arXiv:2310.12904  [pdf, other]

    cs.CV

    Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey

    Authors: Oriane Siméoni, Éloi Zablocki, Spyros Gidaris, Gilles Puy, Patrick Pérez

    Abstract: The recent enthusiasm for open-world vision systems shows the high interest of the community to perform perception tasks outside of the closed-vocabulary benchmark setups which have been so popular until now. Being able to discover objects in images/videos without knowing in advance what objects populate the dataset is an exciting prospect. But how to find objects without knowing anything about the…

    Submitted 19 October, 2023; originally announced October 2023.

  5. arXiv:2306.09281  [pdf, other]

    cs.RO cs.CV

    Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive?

    Authors: Yihong Xu, Loïck Chambon, Éloi Zablocki, Mickaël Chen, Alexandre Alahi, Matthieu Cord, Patrick Pérez

    Abstract: Motion forecasting is crucial in enabling autonomous vehicles to anticipate the future trajectories of surrounding agents. To do so, it requires solving mapping, detection, tracking, and then forecasting problems, in a multi-step pipeline. In this complex system, advances in conventional forecasting methods have been made using curated data, i.e., with the assumption of perfect maps, detection, an…

    Submitted 5 March, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Accepted to ICRA 2024

  6. arXiv:2212.07834  [pdf, other]

    cs.CV

    Unsupervised Object Localization: Observing the Background to Discover Objects

    Authors: Oriane Siméoni, Chloé Sekkat, Gilles Puy, Antonin Vobecky, Éloi Zablocki, Patrick Pérez

    Abstract: Recent advances in self-supervised visual representation learning have paved the way for unsupervised methods tackling tasks such as object discovery and instance segmentation. However, discovering objects in an image with no supervision is a very hard task; what are the desired objects, when to separate them into parts, how many are there, and of what classes? The answers to these questions depen…

    Submitted 29 March, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: CVPR 2023

  7. arXiv:2211.12380  [pdf, other]

    cs.CV cs.AI

    OCTET: Object-aware Counterfactual Explanations

    Authors: Mehdi Zemni, Mickaël Chen, Éloi Zablocki, Hédi Ben-Younes, Patrick Pérez, Matthieu Cord

    Abstract: Nowadays, deep vision models are being widely deployed in safety-critical applications, e.g., autonomous driving, and explainability of such models is becoming a pressing concern. Among explanation methods, counterfactual explanations aim to find minimal and interpretable changes to the input image that would also change the output of the model to be explained. Such explanations point end-users at…

    Submitted 24 March, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: CVPR 2023

  8. arXiv:2206.13294  [pdf, other]

    cs.CV cs.AI cs.RO

    LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation

    Authors: Florent Bartoccioni, Éloi Zablocki, Andrei Bursuc, Patrick Pérez, Matthieu Cord, Karteek Alahari

    Abstract: Recent works in autonomous driving have widely adopted the bird's-eye-view (BEV) semantic map as an intermediate representation of the world. Online prediction of these BEV maps involves non-trivial operations such as multi-camera data extraction as well as fusion and projection into a common topview grid. This is usually done with error-prone geometric operations (e.g., homography or back-project…

    Submitted 26 November, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

    MSC Class: 68T45

    Journal ref: CoRL 2022 https://openreview.net/forum?id=abd_D-iVjk0

  9. arXiv:2111.09094  [pdf, other]

    cs.CV

    STEEX: Steering Counterfactual Explanations with Semantics

    Authors: Paul Jacob, Éloi Zablocki, Hédi Ben-Younes, Mickaël Chen, Patrick Pérez, Matthieu Cord

    Abstract: As deep learning models are increasingly used in safety-critical applications, explainability and trustworthiness become major concerns. For simple images, such as low-resolution face portraits, synthesizing visual counterfactual explanations has recently been proposed as a way to uncover the decision mechanisms of a trained classification model. In this work, we address the problem of producing c…

    Submitted 18 July, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: ECCV 2022 --- 14 pages + supplementary

  10. arXiv:2109.08048  [pdf, other]

    cs.CV cs.AI cs.RO

    Raising context awareness in motion forecasting

    Authors: Hédi Ben-Younes, Éloi Zablocki, Mickaël Chen, Patrick Pérez, Matthieu Cord

    Abstract: Learning-based trajectory prediction models have encountered great success, with the promise of leveraging contextual information in addition to motion history. Yet, we find that state-of-the-art forecasting methods tend to overly rely on the agent's current dynamics, failing to exploit the semantic contextual cues provided at its input. To alleviate this issue, we introduce CAB, a motion forecast…

    Submitted 21 April, 2022; v1 submitted 16 September, 2021; originally announced September 2021.

    Comments: CVPR Workshop on Autonomous Driving - WAD 2022

  11. arXiv:2109.03569  [pdf, other]

    cs.CV cs.AI cs.RO

    LiDARTouch: Monocular metric depth estimation with a few-beam LiDAR

    Authors: Florent Bartoccioni, Éloi Zablocki, Patrick Pérez, Matthieu Cord, Karteek Alahari

    Abstract: Vision-based depth estimation is a key feature in autonomous systems, which often relies on a single camera or several independent ones. In such a monocular setup, dense depth is obtained with either additional input from one or several expensive LiDARs, e.g., with 64 beams, or camera-only methods, which suffer from scale-ambiguity and infinite-depth problems. In this paper, we propose a new alter…

    Submitted 25 November, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

    MSC Class: 68T45

  12. arXiv:2101.05307  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Explainability of deep vision-based autonomous driving systems: Review and challenges

    Authors: Éloi Zablocki, Hédi Ben-Younes, Patrick Pérez, Matthieu Cord

    Abstract: This survey reviews explainability methods for vision-based self-driving systems trained with behavior cloning. The concept of explainability has several facets and the need for explainability is strong in driving, a safety-critical application. Gathering contributions from several research fields, namely computer vision, deep learning, autonomous driving, explainable AI (X-AI), this survey tackle…

    Submitted 19 July, 2022; v1 submitted 13 January, 2021; originally announced January 2021.

    Comments: IJCV 2022

  13. arXiv:2012.04983  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    Driving Behavior Explanation with Multi-level Fusion

    Authors: Hédi Ben-Younes, Éloi Zablocki, Patrick Pérez, Matthieu Cord

    Abstract: In this era of active development of autonomous vehicles, it becomes crucial to provide driving systems with the capacity to explain their decisions. In this work, we focus on generating high-level driving explanations as the vehicle drives. We present BEEF, for BEhavior Explanation with Fusion, a deep architecture which explains the behavior of a trajectory prediction model. Supervised by annotat…

    Submitted 9 December, 2020; originally announced December 2020.

    Comments: Accepted at NeurIPS Workshop ML4AD 2020

    Journal ref: Pattern Recognition, Volume 123, March 2022, 108421

  14. arXiv:2011.06850  [pdf, other]

    cs.CV cs.AI

    Transductive Zero-Shot Learning using Cross-Modal CycleGAN

    Authors: Patrick Bordes, Eloi Zablocki, Benjamin Piwowarski, Patrick Gallinari

    Abstract: In Computer Vision, Zero-Shot Learning (ZSL) aims at classifying unseen classes -- classes for which no matching training image exists. Most of ZSL works learn a cross-modal mapping between images and class labels for seen classes. However, the data distribution of seen and unseen classes might differ, causing a domain shift problem. Following this observation, transductive ZSL (T-ZSL) assumes tha…

    Submitted 13 November, 2020; originally announced November 2020.

  15. arXiv:2002.02734  [pdf, other]

    cs.CL

    Incorporating Visual Semantics into Sentence Representations within a Grounded Space

    Authors: Patrick Bordes, Eloi Zablocki, Laure Soulier, Benjamin Piwowarski, Patrick Gallinari

    Abstract: Language grounding is an active field aiming at enriching textual representations with visual information. Generally, textual and visual elements are embedded in the same representation space, which implicitly assumes a one-to-one correspondence between modalities. This hypothesis does not hold when representing words, and becomes problematic when used to learn sentence representations --- the foc…

    Submitted 7 February, 2020; originally announced February 2020.

  16. arXiv:1904.12638  [pdf, other]

    cs.CV cs.CL cs.LG stat.ML

    Context-Aware Zero-Shot Learning for Object Recognition

    Authors: Eloi Zablocki, Patrick Bordes, Benjamin Piwowarski, Laure Soulier, Patrick Gallinari

    Abstract: Zero-Shot Learning (ZSL) aims at classifying unlabeled objects by leveraging auxiliary knowledge, such as semantic representations. A limitation of previous approaches is that only intrinsic properties of objects, e.g. their visual appearance, are taken into account while their context, e.g. the surrounding objects in the image, is ignored. Following the intuitive principle that objects tend to be…

    Submitted 30 April, 2019; v1 submitted 24 April, 2019; originally announced April 2019.

    Comments: Accepted at ICML 2019

  17. arXiv:1711.03483  [pdf, other]

    cs.CL cs.AI cs.CV

    Learning Multi-Modal Word Representation Grounded in Visual Context

    Authors: Éloi Zablocki, Benjamin Piwowarski, Laure Soulier, Patrick Gallinari

    Abstract: Representing the semantics of words is a long-standing problem for the natural language processing community. Most methods compute word semantics given their textual context in large corpora. More recently, researchers attempted to integrate perceptual and visual features. Most of these works consider the visual appearance of objects to enhance word representations but they ignore the visual envir…

    Submitted 9 November, 2017; originally announced November 2017.