Showing 1–50 of 60 results for author: Wolf, C

Searching in archive cs.
  1. arXiv:2402.13848  [pdf, other]

    cs.CV cs.RO

    Zero-BEV: Zero-shot Projection of Any First-Person Modality to BEV Maps

    Authors: Gianluca Monaci, Leonid Antsfeld, Boris Chidlovskii, Christian Wolf

    Abstract: Bird's-eye view (BEV) maps are an important geometrically structured representation widely used in robotics, in particular self-driving vehicles and terrestrial robots. Existing algorithms either require depth information for the geometric projection, which is not always reliably available, or are trained end-to-end in a fully supervised way to map visual first-person observations to BEV represent…

    Submitted 25 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  2. arXiv:2402.07739  [pdf, other]

    cs.CV cs.LG cs.RO

    Task-conditioned adaptation of visual features in multi-task policy learning

    Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

    Abstract: Successfully addressing a wide variety of tasks is a core ability of autonomous agents, requiring flexibly adapting the underlying decision-making strategies and, as we argue in this work, also adapting the perception modules. An analogical argument would be the human visual system, which uses top-down signals to focus attention determined by the current task. Similarly, we adapt pre-trained large…

    Submitted 6 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  3. arXiv:2401.14349  [pdf, other]

    cs.RO cs.CV

    Learning to navigate efficiently and precisely in real environments

    Authors: Guillaume Bono, Hervé Poirier, Leonid Antsfeld, Gianluca Monaci, Boris Chidlovskii, Christian Wolf

    Abstract: In the context of autonomous navigation of terrestrial robots, the creation of realistic models for agent dynamics and sensing is a widespread habit in the robotics literature and in commercial applications, where they are used for model based control and/or for localization and mapping. The more recent Embodied AI literature, on the other hand, focuses on modular or end-to-end agents trained in s…

    Submitted 25 January, 2024; originally announced January 2024.

  4. arXiv:2401.13800  [pdf, other]

    cs.RO cs.AI

    Multi-Object Navigation in real environments using hybrid policies

    Authors: Assem Sadek, Guillaume Bono, Boris Chidlovskii, Atilla Baskurt, Christian Wolf

    Abstract: Navigation has been classically solved in robotics through the combination of SLAM and planning. More recently, beyond waypoint planning, problems involving significant components of (visual) high-level reasoning have been explored in simulated environments, mostly addressed with large-scale machine learning, in particular RL, offline-RL or imitation learning. These methods require the agent to le…

    Submitted 24 January, 2024; originally announced January 2024.

  5. arXiv:2309.16634  [pdf, other]

    cs.CV

    End-to-End (Instance)-Image Goal Navigation through Correspondence as an Emergent Phenomenon

    Authors: Guillaume Bono, Leonid Antsfeld, Boris Chidlovskii, Philippe Weinzaepfel, Christian Wolf

    Abstract: Most recent work in goal oriented visual navigation resorts to large-scale machine learning in simulated environments. The main challenge lies in learning compact representations generalizable to unseen environments and in learning high-capacity perception modules capable of reasoning on high-dimensional input. The latter is particularly difficult when the goal is not given as a category ("ObjectN…

    Submitted 28 September, 2023; originally announced September 2023.

  6. arXiv:2307.16710  [pdf, other]

    cs.RO

    Learning whom to trust in navigation: dynamically switching between classical and neural planning

    Authors: Sombit Dey, Assem Sadek, Gianluca Monaci, Boris Chidlovskii, Christian Wolf

    Abstract: Navigation of terrestrial robots is typically addressed either with localization and mapping (SLAM) followed by classical planning on the dynamically created maps, or by machine learning (ML), often through end-to-end training with reinforcement learning (RL) or imitation learning (IL). Recently, modular designs have achieved promising results, and hybrid algorithms that combine ML with classical…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: 8 pages including references. International Conference on Intelligent Robots and Systems (IROS 2023)

  7. arXiv:2306.03857  [pdf, other]

    cs.RO cs.CV

    Learning with a Mole: Transferable latent spatial representations for navigation without reconstruction

    Authors: Guillaume Bono, Leonid Antsfeld, Assem Sadek, Gianluca Monaci, Christian Wolf

    Abstract: Agents navigating in 3D environments require some form of memory, which should hold a compact and actionable representation of the history of observations useful for decision taking and planning. In most end-to-end learning approaches the representation is latent and usually does not have a clearly defined interpretation, whereas classical robotics addresses this with scene reconstruction resultin…

    Submitted 29 September, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  8. arXiv:2304.11241  [pdf, other]

    cs.CV cs.LG cs.RO

    AutoNeRF: Training Implicit Scene Representations with Autonomous Agents

    Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Dhruv Batra, Christian Wolf, Devendra Singh Chaplot

    Abstract: Implicit representations such as Neural Radiance Fields (NeRF) have been shown to be very effective at novel view synthesis. However, these models typically require manual and careful human data collection for training. In this paper, we present AutoNeRF, a method to collect data required to train NeRFs using autonomous embodied agents. Our method allows an agent to explore an unseen environment e…

    Submitted 22 December, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

  9. arXiv:2302.10803  [pdf, other]

    cs.LG cs.AI physics.flu-dyn

    Eagle: Large-Scale Learning of Turbulent Fluid Dynamics with Mesh Transformers

    Authors: Steeven Janny, Aurélien Béneteau, Madiha Nadri, Julie Digne, Nicolas Thome, Christian Wolf

    Abstract: Estimating fluid dynamics is classically done through the simulation and integration of numerical models solving the Navier-Stokes equations, which is computationally complex and time-consuming even on high-end hardware. This is a notoriously hard problem to solve, which has recently been addressed with machine learning, in particular graph neural networks (GNN) and variants trained and evaluated…

    Submitted 17 March, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: Published as a conference paper at ICLR 2023

    Journal ref: International Conference on Learning Representations (ICLR) 2023

  10. arXiv:2210.05129  [pdf, other]

    cs.CV cs.LG cs.RO

    Multi-Object Navigation with dynamically learned neural implicit representations

    Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

    Abstract: Understanding and mapping a new environment are core abilities of any autonomously navigating agent. While classical robotics usually estimates maps in a stand-alone manner with SLAM variants, which maintain a topological or metric representation, end-to-end learning of navigation keeps some form of memory in a neural network. Networks are typically imbued with inductive biases, which can range fr…

    Submitted 27 September, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

  11. arXiv:2206.04791  [pdf, other]

    eess.SY cs.AI

    Learning Reduced Nonlinear State-Space Models: an Output-Error Based Canonical Approach

    Authors: Steeven Janny, Quentin Possamai, Laurent Bako, Madiha Nadri, Christian Wolf

    Abstract: The identification of a nonlinear dynamic model is an open topic in control theory, especially from sparse input-output measurements. A fundamental challenge of this problem is that little to no prior knowledge is available on both the state and the nonlinear system model. To cope with this challenge, we investigate the effectiveness of deep learning in the modeling of dynamic systems with non…

    Submitted 19 April, 2022; originally announced June 2022.

  12. arXiv:2206.04367  [pdf, ps, other]

    cs.CG math.CO math.MG

    Distinct Angles in General Position

    Authors: Henry L. Fleischmann, Sergei V. Konyagin, Steven J. Miller, Eyvindur A. Palsson, Ethan Pesikoff, Charles Wolf

    Abstract: The Erdős distinct distance problem is a ubiquitous problem in discrete geometry. Somewhat less well known is Erdős' distinct angle problem, the problem of finding the minimum number of distinct angles between $n$ non-collinear points in the plane. Recent work has introduced bounds on a wide array of variants of this problem, inspired by similar variants in the distance setting. In this short no…

    Submitted 13 June, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: Former Corollary 4.1 upgraded to Theorem 1.2 with improved bounds

    MSC Class: 52C10

  13. arXiv:2203.14726  [pdf, other]

    cs.RO

    Learning to estimate UAV created turbulence from scene structure observed by onboard cameras

    Authors: Quentin Possamaï, Steeven Janny, Madiha Nadri, Laurent Bako, Christian Wolf

    Abstract: Controlling UAV flights precisely requires a realistic dynamic model and accurate state estimates from onboard sensors like UAV, GPS and visual observations. Obtaining a precise dynamic model is extremely difficult, as important aerodynamic effects are hard to model, in particular ground effect and other turbulences. While machine learning has been used in the past to estimate UAV created turbulen…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 8 pages, 6 figures, 2 tables. Submitted to International Conference on Intelligent Robots and Systems

  14. arXiv:2202.06858  [pdf, other]

    cs.CV

    An experimental study of the vision-bottleneck in VQA

    Authors: Pierre Marza, Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: As in many tasks combining vision and language, both modalities play a crucial role in Visual Question Answering (VQA). To properly solve the task, a given model should both understand the content of the proposed image and the nature of the question. While the fusion between modalities, which is another obviously important part of the problem, has been highly studied, the vision part has received…

    Submitted 14 February, 2022; originally announced February 2022.

  15. arXiv:2202.00403  [pdf, other]

    cs.RO

    MoCap-less Quantitative Evaluation of Ego-Pose Estimation Without Ground Truth Measurements

    Authors: Quentin Possamaï, Steeven Janny, Guillaume Bono, Madiha Nadri, Laurent Bako, Christian Wolf

    Abstract: The emergence of data-driven approaches for control and planning in robotics has highlighted the need for developing experimental robotic platforms for data collection. However, their implementation is often complex and expensive, in particular for flying and terrestrial robots where the precise estimation of the position requires motion capture devices (MoCap) or Lidar. In order to simplify the…

    Submitted 1 February, 2022; originally announced February 2022.

    Comments: 7 pages, 6 figures, 1 table. Submitted to International Conference on Pattern Recognition. For associated videos: https://www.youtube.com/playlist?list=PLRsYEUUGzW54jqsfRdkNAYjZUnoEM4uhM

  16. arXiv:2202.00368  [pdf, other]

    cs.CV cs.LG

    Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space

    Authors: Steeven Janny, Fabien Baradel, Natalia Neverova, Madiha Nadri, Greg Mori, Christian Wolf

    Abstract: Learning causal relationships in high-dimensional data (images, videos) is a hard task, as they are often defined on low dimensional manifolds and must be extracted from complex signals dominated by appearance, lighting, textures and also spurious correlations in the data. We present a method for learning counterfactual reasoning of physical processes in pixel space, which requires the prediction…

    Submitted 1 February, 2022; originally announced February 2022.

    Journal ref: International Conference on Learning Representations (2022)

  17. arXiv:2112.11731  [pdf, other]

    cs.LG cs.AI

    Graph augmented Deep Reinforcement Learning in the GameRLand3D environment

    Authors: Edward Beeching, Maxim Peter, Philippe Marcotte, Jilles Debangoye, Olivier Simonin, Joshua Romoff, Christian Wolf

    Abstract: We address planning and navigation in challenging 3D video games featuring maps with disconnected regions reachable by agents using special actions. In this setting, classical symbolic planners are not applicable or difficult to adapt. We introduce a hybrid technique combining a low level policy trained with reinforcement learning and a graph based high level classical planner. In addition to prov…

    Submitted 22 December, 2021; originally announced December 2021.

  18. arXiv:2112.03636  [pdf, other]

    cs.LG

    Godot Reinforcement Learning Agents

    Authors: Edward Beeching, Jilles Debangoye, Olivier Simonin, Christian Wolf

    Abstract: We present Godot Reinforcement Learning (RL) Agents, an open-source interface for developing environments and agents in the Godot Game Engine. The Godot RL Agents interface allows the design, creation and learning of agent behaviors in challenging 2D and 3D environments with various on-policy and off-policy Deep RL algorithms. We provide a standard Gym interface, with wrappers for learning in the…

    Submitted 7 December, 2021; originally announced December 2021.

  19. arXiv:2111.14666  [pdf, other]

    cs.AI cs.RO

    An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments

    Authors: Assem Sadek, Guillaume Bono, Boris Chidlovskii, Christian Wolf

    Abstract: Visual navigation by mobile robots is classically tackled through SLAM plus optimal planning, and more recently through end-to-end training of policies implemented as deep networks. While the former are often limited to waypoint planning, but have proven their efficiency even on real physical environments, the latter solutions are most frequently employed in simulation, but have been shown to be a…

    Submitted 29 November, 2021; originally announced November 2021.

  20. arXiv:2110.05812  [pdf, other]

    cs.CV cs.GR

    Satellite Image Semantic Segmentation

    Authors: Eric Guérin, Killian Oechslin, Christian Wolf, Benoît Martinez

    Abstract: In this paper, we propose a method for the automatic semantic segmentation of satellite images into six classes (sparse forest, dense forest, moor, herbaceous formation, building, and road). We rely on Swin Transformer architecture and build the dataset from IGN open data. We report quantitative and qualitative segmentation results on this dataset and discuss strengths and limitations. The dataset…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: 8 pages, 3 figures

    ACM Class: I.4.6

  21. arXiv:2109.11801  [pdf, other]

    cs.RO cs.CV cs.HC cs.LG

    SIM2REALVIZ: Visualizing the Sim2Real Gap in Robot Ego-Pose Estimation

    Authors: Theo Jaunet, Guillaume Bono, Romain Vuillemot, Christian Wolf

    Abstract: The Robotics community has started to heavily rely on increasingly realistic 3D simulators for large-scale training of robots on massive amounts of data. But once robots are deployed in the real world, the simulation gap, as well as changes in the real world (e.g. lights, object displacements) lead to errors. In this paper, we introduce Sim2RealViz, a visual analytics tool to assist experts in un…

    Submitted 3 December, 2021; v1 submitted 24 September, 2021; originally announced September 2021.

  22. arXiv:2108.12015  [pdf, other]

    math.CO cs.CG

    Distinct Angle Problems and Variants

    Authors: Henry L. Fleischmann, Hongyi B. Hu, Faye Jackson, Steven J. Miller, Eyvindur A. Palsson, Ethan Pesikoff, Charles Wolf

    Abstract: The Erdős distinct distance problem is a ubiquitous problem in discrete geometry. Less well known is Erdős' distinct angle problem, the problem of finding the minimum number of distinct angles between $n$ non-collinear points in the plane. The standard problem is already well understood. However, it admits many of the same variants as the distinct distance problem, many of which are unstudied. W…

    Submitted 26 August, 2021; originally announced August 2021.

    MSC Class: 05

  23. arXiv:2107.13977  [pdf, other]

    cs.AI

    Underwater Acoustic Networks for Security Risk Assessment in Public Drinking Water Reservoirs

    Authors: Jörg Stork, Philip Wenzel, Severin Landwein, Maria-Elena Algorri, Martin Zaefferer, Wolfgang Kusch, Martin Staubach, Thomas Bartz-Beielstein, Hartmut Köhn, Hermann Dejager, Christian Wolf

    Abstract: We have built a novel system for the surveillance of drinking water reservoirs using underwater sensor networks. We implement an innovative AI-based approach to detect, classify and localize underwater events. In this paper, we describe the technology and cognitive AI architecture of the system based on one of the sensor networks, the hydrophone network. We discuss the challenges of installing and…

    Submitted 29 July, 2021; originally announced July 2021.

  24. arXiv:2107.06011  [pdf, other]

    cs.CV cs.LG cs.RO

    Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation

    Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

    Abstract: In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals. This ability can be associated with spatial reasoning, where an agent is able to perceive spatial relationships and regularities, and discover object characteristics. Recent work introduces learnable pol…

    Submitted 25 April, 2023; v1 submitted 13 July, 2021; originally announced July 2021.

  25. arXiv:2107.05186  [pdf]

    cs.CV

    Early warning of pedestrians and cyclists

    Authors: Joerg Christian Wolf

    Abstract: State-of-the-art motor vehicles are able to brake for pedestrians in an emergency. We investigate what it would take to issue an early warning to the driver so he/she has time to react. We have identified that predicting the intention of a pedestrian reliably by position is a particularly hard challenge. This paper describes an early pedestrian warning demonstration system.

    Submitted 12 July, 2021; originally announced July 2021.

  26. arXiv:2106.11576  [pdf, other]

    cs.CV cs.AI

    Universal Domain Adaptation in Ordinal Regression

    Authors: Boris Chidlovskii, Assem Sadek, Christian Wolf

    Abstract: We address the problem of universal domain adaptation (UDA) in ordinal regression (OR), which attempts to solve classification problems in which labels are not independent, but follow a natural order. We show that the UDA techniques developed for classification and based on the clustering assumption, under-perform in OR settings. We propose a method that complements the OR classifier with an auxil…

    Submitted 25 August, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

  27. arXiv:2106.05597  [pdf, other]

    cs.CV cs.LG

    Supervising the Transfer of Reasoning Patterns in VQA

    Authors: Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche, Madiha Nadri

    Abstract: Methods for Visual Question Answering (VQA) are notorious for leveraging dataset biases rather than performing reasoning, hindering generalization. It has been recently shown that better reasoning patterns emerge in attention layers of a state-of-the-art VQA model when they are trained on perfect (oracle) visual inputs. This provides evidence that deep neural networks can learn to reason when train…

    Submitted 10 June, 2021; originally announced June 2021.

  28. arXiv:2104.03656  [pdf, other]

    cs.CV

    How Transferable are Reasoning Patterns in VQA?

    Authors: Corentin Kervadec, Theo Jaunet, Grigory Antipov, Moez Baccouche, Romain Vuillemot, Christian Wolf

    Abstract: Since its inception, Visual Question Answering (VQA) has been notorious as a task in which models are prone to exploit biases in datasets to find shortcuts instead of performing high-level reasoning. Classical methods address this by removing biases from training data, or adding branches to models to detect and remove biases. In this paper, we argue that uncertainty in vision is a dominating facto…

    Submitted 8 April, 2021; originally announced April 2021.

  29. arXiv:2104.00926  [pdf, other]

    cs.CV cs.HC

    VisQA: X-raying Vision and Language Reasoning in Transformers

    Authors: Theo Jaunet, Corentin Kervadec, Romain Vuillemot, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: Visual Question Answering systems target answering open-ended textual questions given input images. They are a testbed for learning high-level reasoning with a primary use in HCI, for instance assistance for the visually impaired. Recent research has shown that state-of-the-art models tend to produce answers exploiting biases and shortcuts in the training data, and sometimes do not even look at th…

    Submitted 20 July, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

  30. Deep KKL: Data-driven Output Prediction for Non-Linear Systems

    Authors: Steeven Janny, Vincent Andrieu, Madiha Nadri, Christian Wolf

    Abstract: We address the problem of output prediction, i.e. designing a model for autonomous nonlinear systems capable of forecasting their future observations. We first define a general framework bringing together the necessary properties for the development of such an output predictor. In particular, we look at this problem from two different viewpoints, control theory and data-driven techniques (machine l…

    Submitted 1 February, 2022; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: Conference on Decision and Control (CDC 2021)

    MSC Class: Conference on Decision and Control (CDC 2021)

    Journal ref: Conference on Decision and Control (CDC 2021)

  31. arXiv:2101.08833  [pdf, other]

    cs.CV

    SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation

    Authors: Brendan Duke, Abdalla Ahmed, Christian Wolf, Parham Aarabi, Graham W. Taylor

    Abstract: In this paper we introduce a Transformer-based approach to video object segmentation (VOS). To address compounding error and scalability issues of prior work, we propose a scalable, end-to-end method for VOS called Sparse Spatiotemporal Transformers (SST). SST extracts per-pixel representations for each object in a video using sparse attention over spatiotemporal features. Our attention-based form…

    Submitted 28 March, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Comments: CVPR 2021 (Oral)

  32. arXiv:2007.05270  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning to plan with uncertain topological maps

    Authors: Edward Beeching, Jilles Dibangoye, Olivier Simonin, Christian Wolf

    Abstract: We train an agent to navigate in 3D environments using a hierarchical strategy including a high-level graph based planner and a local policy. Our main contribution is a data driven learning based approach for planning under uncertainty in topological maps, requiring an estimate of shortest paths in valued graphs with a probabilistic structure. Whereas classical symbolic algorithms achieve optimal…

    Submitted 10 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  33. arXiv:2006.05726  [pdf, other]

    cs.CV cs.CL

    Estimating semantic structure for the VQA answer space

    Authors: Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: Since its appearance, Visual Question Answering (VQA, i.e. answering a question posed over an image), has always been treated as a classification problem over a set of predefined answers. Despite its convenience, this classification approach poorly reflects the semantics of the problem limiting the answering to a choice between independent proposals, without taking into account the similarity betw…

    Submitted 8 April, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: [WARNING] We want to notify the reader that additional experiments (not in the paper) have shown that using a `random' semantic space performs as well as the proposed semantic loss. This additional result questions the effectiveness of our method

  34. arXiv:2006.05121  [pdf, other]

    cs.CV

    Roses Are Red, Violets Are Blue... but Should Vqa Expect Them To?

    Authors: Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: Models for Visual Question Answering (VQA) are notorious for their tendency to rely on dataset biases, as the large and unbalanced diversity of questions and concepts involved tends to prevent models from learning to reason, leading them to perform educated guesses instead. In this paper, we claim that the standard evaluation metric, which consists in measuring the overall in-domain accuracy,…

    Submitted 7 April, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

  35. arXiv:2005.07841  [pdf, ps, other]

    cs.SE eess.SY

    Research Challenges for Heterogeneous CPS Design

    Authors: Shuvra S. Bhattacharyya, Marilyn C. Wolf

    Abstract: Heterogeneous computing is widely used at all levels of computing from data center to edge due to its power/performance characteristics. However, heterogeneity presents challenges. Interoperability---the management of workloads across heterogeneous resources---requires more careful design than is the case for homogeneous platforms. Cyber-physical systems present additional challenges. This article…

    Submitted 15 May, 2020; originally announced May 2020.

    Comments: This is a pre-publication version of a paper that has been accepted for publication in IEEE Computer. The official/final version of the paper will be posted on IEEE Xplore

  36. arXiv:2002.07535  [pdf, other]

    cs.NI

    Adaptive Real-Time Scheduling for Cooperative Cyber-Physical Systems

    Authors: Georg von Zengen, Jingjing Yu, Lars C. Wolf

    Abstract: CPSs are widely used in all sorts of applications ranging from industrial automation to search-and-rescue. So far, in these applications they work either isolated with a high mobility or operate in a static network setup. If mobile CPSs work cooperatively, it is in applications with relaxed real-time requirements. To enable such cooperation also in hard real-time applications we present a schedul…

    Submitted 18 February, 2020; originally announced February 2020.

    Comments: 12 pages, 16 figures

  37. arXiv:2002.02286  [pdf, other]

    cs.LG cs.AI

    EgoMap: Projective mapping and structured egocentric memory for Deep RL

    Authors: Edward Beeching, Christian Wolf, Jilles Dibangoye, Olivier Simonin

    Abstract: Tasks involving localization, memorization and planning in partially observable 3D environments are an ongoing challenge in Deep Reinforcement Learning. We present EgoMap, a spatially structured neural memory architecture. EgoMap augments a deep reinforcement learning agent's performance in 3D environments on challenging tasks with multi-step objectives. The EgoMap architecture incorporates severa…

    Submitted 7 February, 2020; v1 submitted 24 January, 2020; originally announced February 2020.

  38. arXiv:1912.03063  [pdf, other]

    cs.CV cs.CL cs.LG cs.NE

    Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks

    Authors: Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: The large adoption of the self-attention (i.e. transformer model) and BERT-like training principles has recently resulted in a number of high performing models on a large panoply of vision-and-language problems (such as Visual Question Answering (VQA), image retrieval, etc.). In this paper we claim that these State-Of-The-Art (SOTA) approaches perform reasonably well in structuring information ins…

    Submitted 6 December, 2019; originally announced December 2019.

  39. arXiv:1909.12000  [pdf, other]

    cs.CV

    CoPhy: Counterfactual Learning of Physical Dynamics

    Authors: Fabien Baradel, Natalia Neverova, Julien Mille, Greg Mori, Christian Wolf

    Abstract: Understanding causes and effects in mechanical systems is an essential component of reasoning in the physical world. This work poses a new problem of counterfactual learning of object mechanics from visual input. We develop the CoPhy benchmark to assess the capacity of the state-of-the-art models for causal physical reasoning in a synthetic 3D environment and propose a model for learning the physi…

    Submitted 7 April, 2020; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: ICLR 2020 - Spotlight presentation

  40. arXiv:1909.02982  [pdf, other]

    cs.LG cs.AI cs.HC cs.NE stat.ML

    DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning

    Authors: Theo Jaunet, Romain Vuillemot, Christian Wolf

    Abstract: We present DRLViz, a visual analytics interface to interpret the internal memory of an agent (e.g. a robot) trained using deep reinforcement learning. This memory is composed of large temporal vectors updated when the agent moves in an environment and is not trivial to understand due to the number of dimensions, dependencies to past vectors, spatial/temporal correlations, and co-correlation betwee…

    Submitted 25 May, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

  41. arXiv:1904.07802  [pdf, other]

    cs.LG cs.AI cs.HC

    Learning 3D Navigation Protocols on Touch Interfaces with Cooperative Multi-Agent Reinforcement Learning

    Authors: Quentin Debard, Jilles Steeve Dibangoye, Stéphane Canu, Christian Wolf

    Abstract: Using touch devices to navigate in virtual 3D environments such as computer assisted design (CAD) models or geographical information systems (GIS) is inherently difficult for humans, as the 3D operations have to be performed by the user on a 2D touch surface. This ill-posed problem is classically solved with a fixed and handcrafted interaction protocol, which must be learned by the user. We propos…

    Submitted 27 August, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

    Comments: 17 pages, 8 figures. Accepted at The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2019 (ECMLPKDD 2019)

  42. arXiv:1904.01806  [pdf, other]

    cs.LG stat.ML

    Deep Reinforcement Learning on a Budget: 3D Control and Reasoning Without a Supercomputer

    Authors: Edward Beeching, Christian Wolf, Jilles Dibangoye, Olivier Simonin

    Abstract: An important goal of research in Deep Reinforcement Learning in mobile robotics is to train agents capable of solving complex tasks, which require a high level of scene understanding and reasoning from an egocentric perspective. When trained from simulations, optimal environments should satisfy a currently unobtainable combination of high-fidelity photographic observations, massive amounts of diff…

    Submitted 3 April, 2019; originally announced April 2019.

  43. arXiv:1903.10407  [pdf]

    cs.DC

    Yosys+nextpnr: an Open Source Framework from Verilog to Bitstream for Commercial FPGAs

    Authors: David Shah, Eddie Hung, Clifford Wolf, Serge Bazanski, Dan Gisselquist, Miodrag Milanović

    Abstract: This paper introduces a fully free and open source software (FOSS) architecture-neutral FPGA framework comprising Yosys for Verilog synthesis, and nextpnr for placement, routing, and bitstream generation. Currently, this flow supports two commercially available FPGA families, Lattice iCE40 (up to 8K logic elements) and Lattice ECP5 (up to 85K elements) and has been hardware-proven for custom-co…

    Submitted 25 March, 2019; originally announced March 2019.

    Comments: 4 page short paper to appear at IEEE FCCM 2019 (https://www.fccm.org)

  44. arXiv:1806.06157  [pdf, other]

    cs.CV

    Object Level Visual Reasoning in Videos

    Authors: Fabien Baradel, Natalia Neverova, Christian Wolf, Julien Mille, Greg Mori

    Abstract: Human activity recognition is typically addressed by detecting key concepts like global and local motion, features related to object classes present in the scene, as well as features related to the global context. The next open challenges in activity recognition require a level of understanding that pushes beyond this and call for models with capabilities for fine distinction and detailed comprehe…

    Submitted 20 September, 2018; v1 submitted 15 June, 2018; originally announced June 2018.

    Comments: Accepted at ECCV 2018 - long version (16 pages + ref)

    Journal ref: ECCV 2018

  45. arXiv:1802.09901  [pdf, other]

    cs.LG stat.ML

    Learning to recognize touch gestures: recurrent vs. convolutional features and dynamic sampling

    Authors: Quentin Debard, Christian Wolf, Stéphane Canu, Julien Arné

    Abstract: We propose a fully automatic method for learning gestures on big touch devices in a potentially multi-user context. The goal is to learn general models capable of adapting to different gestures, user styles and hardware variations (e.g. device sizes, sampling frequencies and regularities). Based on deep neural networks, our method features a novel dynamic sampling and temporal normalization comp…

    Submitted 19 February, 2018; originally announced February 2018.

    Comments: 9 pages, 4 figures, accepted at the 13th IEEE Conference on Automatic Face and Gesture Recognition (FG2018). Dataset available at http://itekube7.itekube.com

  46. arXiv:1802.07898  [pdf, other]

    cs.CV

    Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points

    Authors: Fabien Baradel, Christian Wolf, Julien Mille, Graham W. Taylor

    Abstract: We propose a method for human activity recognition from RGB data that does not rely on any pose information during test time and does not explicitly calculate pose information internally. Instead, a visual attention module learns to predict glimpse sequences in each frame. These glimpses correspond to interest points in the scene that are relevant to the classified activities. No spatial coherence…

    Submitted 21 August, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: CVPR 2018 - project page: https://fabienbaradel.github.io/cvpr18_glimpseclouds/

    Journal ref: CVPR 2018

  47. arXiv:1712.08002  [pdf, other]

    cs.CV

    Human Action Recognition: Pose-based Attention draws focus to Hands

    Authors: Fabien Baradel, Christian Wolf, Julien Mille

    Abstract: We propose a new spatio-temporal attention based mechanism for human action recognition able to automatically attend to the hands most involved in the studied action and detect the most discriminative moments in an action. Attention is handled in a recurrent manner employing Recurrent Neural Network (RNN) and is fully-differentiable. In contrast to standard soft-attention based mechanisms, our a…

    Submitted 20 December, 2017; originally announced December 2017.

    Comments: ICCV 2017 Workshop "Hands in action". arXiv admin note: text overlap with arXiv:1703.10106

    Journal ref: ICCV 2017

  48. arXiv:1710.10444  [pdf, other]

    math.NA cs.IT

    Compressive Time-of-Flight 3D Imaging Using Block-Structured Sensing Matrices

    Authors: Stephan Antholzer, Christoph Wolf, Michael Sandbichler, Markus Dielacher, Markus Haltmeier

    Abstract: Spatially and temporally highly resolved depth information enables numerous applications including human-machine interaction in gaming or safety functions in the automotive industry. In this paper, we address this issue using Time-of-flight (ToF) 3D cameras which are compact devices providing highly resolved depth information. Practical restrictions often require to reduce the amount of data to be…

    Submitted 22 December, 2018; v1 submitted 28 October, 2017; originally announced October 2017.

    Comments: According to a suggestion, we changed the old title "A Framework for Compressive Time-of-Flight 3D Sensing" to "Compressive Time-of-Flight 3D Imaging Using Block-Structured Sensing Matrices"

  49. arXiv:1709.08527  [pdf, other]

    cs.CV

    Multi-view pose estimation with mixtures-of-parts and adaptive viewpoint selection

    Authors: Emre Dogan, Gonen Eren, Christian Wolf, Eric Lombardi, Atilla Baskurt

    Abstract: We propose a new method for human pose estimation which leverages information from multiple views to impose a strong prior on articulated pose. The novelty of the method concerns the types of coherence modelled. Consistency is maximised over the different views through different terms modelling classical geometric information (coherence of the resulting poses) as well as appearance information whi…

    Submitted 25 September, 2017; originally announced September 2017.

    Comments: 8 pages, 7 figures, 4 tables. Second revision to the paper, as submitted to IET Computer Vision on September 24th 2017

  50. arXiv:1707.07958  [pdf, other]

    cs.CV

    Residual Conv-Deconv Grid Network for Semantic Segmentation

    Authors: Damien Fourure, Rémi Emonet, Elisa Fromont, Damien Muselet, Alain Tremeau, Christian Wolf

    Abstract: This paper presents GridNet, a new Convolutional Neural Network (CNN) architecture for semantic image segmentation (full scene labelling). Classical neural networks are implemented as one stream from the input to the output with subsampling operators applied in the stream in order to reduce the feature maps size and to increase the receptive field for the final prediction. However, for semantic im…

    Submitted 26 July, 2017; v1 submitted 25 July, 2017; originally announced July 2017.

    Comments: Accepted for publication at BMVC 2017