Showing 1–50 of 86 results for author: Davison, A

  1. arXiv:2406.09726 [pdf, other]

    cs.CV cs.DC cs.MA cs.RO eess.IV

    PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation

    Authors: Ignacio Alzugaray, Riku Murai, Andrew Davison

    Abstract: Visual sensors are not only becoming better at capturing high-quality images but also they have steadily increased their capabilities in processing data on their own on-chip. Yet the majority of VO pipelines rely on the transmission and processing of full images in a centralized unit (e.g. CPU or GPU), which often contain much redundant and low-quality information for the task. In this paper, we a…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2404.03531 [pdf, other]

    cs.CV

    COMO: Compact Mapping and Odometry

    Authors: Eric Dexheimer, Andrew J. Davison

    Abstract: We present COMO, a real-time monocular mapping and odometry system that encodes dense geometry via a compact set of 3D anchor points. Decoding anchor point projections into dense geometry via per-keyframe depth covariance functions guarantees that depth maps are joined together at visible anchor points. The representation enables joint optimization of camera poses and dense geometry, intrinsic 3D…

    Submitted 4 April, 2024; originally announced April 2024.

  3. arXiv:2403.15583 [pdf, other]

    cs.CV

    U-ARE-ME: Uncertainty-Aware Rotation Estimation in Manhattan Environments

    Authors: Aalok Patwardhan, Callum Rhodes, Gwangbin Bae, Andrew J. Davison

    Abstract: Camera rotation estimation from a single image is a challenging task, often requiring depth data and/or camera intrinsics, which are generally not available for in-the-wild videos. Although external sensors such as inertial measurement units (IMUs) can help, they often suffer from drift and are not applicable in non-inertial reference frames. We present U-ARE-ME, an algorithm that estimates camera…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: For the project page and video see https://callum-rhodes.github.io/U-ARE-ME

  4. arXiv:2403.00712 [pdf, other]

    cs.CV

    Rethinking Inductive Biases for Surface Normal Estimation

    Authors: Gwangbin Bae, Andrew J. Davison

    Abstract: Despite the growing demand for accurate surface normal estimation models, existing methods use general-purpose dense prediction models, adopting the same inductive biases as other tasks. In this paper, we discuss the inductive biases needed for surface normal estimation and propose to (1) utilize the per-pixel ray direction and (2) encode the relationship between neighboring surface normals by lea…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: CVPR 2024 (camera-ready version will be uploaded in March 2024)

  5. arXiv:2402.03908 [pdf, other]

    cs.CV

    EscherNet: A Generative Model for Scalable View Synthesis

    Authors: Xin Kong, Shikun Liu, Xiaoyang Lyu, Marwan Taher, Xiaojuan Qi, Andrew J. Davison

    Abstract: We introduce EscherNet, a multi-view conditioned diffusion model for view synthesis. EscherNet learns implicit and generative 3D representations coupled with a specialised camera positional encoding, allowing precise and continuous relative control of the camera transformation between an arbitrary number of reference and target views. EscherNet offers exceptional generality, flexibility, and scala…

    Submitted 19 March, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: CVPR2024 Project Page: https://kxhit.github.io/EscherNet

  6. Distributed Simultaneous Localisation and Auto-Calibration using Gaussian Belief Propagation

    Authors: Riku Murai, Ignacio Alzugaray, Paul H. J. Kelly, Andrew J. Davison

    Abstract: We present a novel scalable, fully distributed, and online method for simultaneous localisation and extrinsic calibration for multi-robot setups. Individual a priori unknown robot poses are probabilistically inferred as robots sense each other while simultaneously calibrating their sensors and markers extrinsic using Gaussian Belief Propagation. In the presented experiments, we show how our method…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: Published in IEEE Robotics and Automation Letters (RA-L) 2024

    Journal ref: IEEE Robotics and Automation Letters, vol. 9, no. 3, pp. 2136-2143, March 2024

  7. arXiv:2401.02357 [pdf, other]

    cs.CV

    Fit-NGP: Fitting Object Models to Neural Graphics Primitives

    Authors: Marwan Taher, Ignacio Alzugaray, Andrew J. Davison

    Abstract: Accurate 3D object pose estimation is key to enabling many robotic applications that involve challenging object interactions. In this work, we show that the density field created by a state-of-the-art efficient radiance field reconstruction method is suitable for highly accurate and robust pose estimation for objects with known 3D models, even when they are very small and with challenging reflecti…

    Submitted 4 January, 2024; originally announced January 2024.

  8. arXiv:2312.06741 [pdf, other]

    cs.CV cs.RO

    Gaussian Splatting SLAM

    Authors: Hidenobu Matsuki, Riku Murai, Paul H. J. Kelly, Andrew J. Davison

    Abstract: We present the first application of 3D Gaussian Splatting in monocular SLAM, the most fundamental but the hardest setup for Visual SLAM. Our method, which runs live at 3fps, utilises Gaussians as the only 3D representation, unifying the required representation for accurate, efficient tracking, mapping, and high-quality rendering. Designed for challenging monocular settings, our approach is seamles…

    Submitted 14 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: CVPR2024 Highlight. First two authors contributed equally to this work. Project Page: https://rmurai.co.uk/projects/GaussianSplattingSLAM/

  9. arXiv:2312.05889 [pdf, other]

    cs.CV

    SuperPrimitive: Scene Reconstruction at a Primitive Level

    Authors: Kirill Mazur, Gwangbin Bae, Andrew J. Davison

    Abstract: Joint camera pose and dense geometry estimation from a set of images or a monocular video remains a challenging problem due to its computational complexity and inherent visual ambiguities. Most dense incremental reconstruction systems operate directly on image pixels and solve for their 3D positions using multi-view geometry cues. Such pixel-level approaches suffer from ambiguities or violations o…

    Submitted 17 April, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

    Comments: CVPR2024. Project Page: https://makezur.github.io/SuperPrimitive/

  10. arXiv:2311.14649 [pdf, other]

    cs.LG stat.ML

    Learning in Deep Factor Graphs with Gaussian Belief Propagation

    Authors: Seth Nabarro, Mark van der Wilk, Andrew J Davison

    Abstract: We propose an approach to do learning in Gaussian factor graphs. We treat all relevant quantities (inputs, outputs, parameters, latents) as random variables in a graphical model, and view both training and prediction as inference problems with different observed nodes. Our experiments show that these problems can be efficiently solved with belief propagation (BP), whose updates are inherently loca…

    Submitted 28 February, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

  11. arXiv:2310.17712 [pdf, other]

    stat.ML cs.LG cs.SI stat.ME

    Community Detection and Classification Guarantees Using Embeddings Learned by Node2Vec

    Authors: Andrew Davison, S. Carlyle Morgan, Owen G. Ward

    Abstract: Embedding the nodes of a large network into a Euclidean space is a common objective in modern machine learning, with a variety of tools available. These embeddings can then be used as features for tasks such as community detection/node clustering or link prediction, where they achieve state of the art performance. With the exception of spectral clustering methods, there is little theoretical unde…

    Submitted 26 October, 2023; originally announced October 2023.

  12. arXiv:2310.01930 [pdf, other]

    cs.RO

    A Distributed Multi-Robot Framework for Exploration, Information Acquisition and Consensus

    Authors: Aalok Patwardhan, Andrew J. Davison

    Abstract: The distributed coordination of robot teams performing complex tasks is challenging to formulate. The different aspects of a complete task such as local planning for obstacle avoidance, global goal coordination and collaborative mapping are often solved separately, when clearly each of these should influence the others for the most efficient behaviour. In this paper we use the example application…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: We encourage the reader to view our demos at https://aalpatya.github.io/gbpstack

  13. arXiv:2303.12157 [pdf, other]

    cs.CV cs.LG cs.RO

    Learning a Depth Covariance Function

    Authors: Eric Dexheimer, Andrew J. Davison

    Abstract: We propose learning a depth covariance function with applications to geometric vision tasks. Given RGB images as input, the covariance function can be flexibly used to define priors over depth functions, predictive distributions given observations, and methods for active point selection. We leverage these techniques for a selection of downstream tasks: depth completion, bundle adjustment, and mono…

    Submitted 21 March, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: CVPR 2023. Project page: https://edexheim.github.io/DepthCov/

  14. arXiv:2302.01838 [pdf, other]

    cs.CV

    vMAP: Vectorised Object Mapping for Neural Field SLAM

    Authors: Xin Kong, Shikun Liu, Marwan Taher, Andrew J. Davison

    Abstract: We present vMAP, an object-level dense SLAM system using neural field representations. Each object is represented by a small MLP, enabling efficient, watertight object modelling without the need for 3D priors. As an RGB-D camera browses a scene with no prior information, vMAP detects object instances on-the-fly, and dynamically adds them to its map. Specifically, thanks to the power of vectorised…

    Submitted 13 March, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: CVPR2023 Project Page: https://kxhit.github.io/vMAP

  15. arXiv:2210.17325 [pdf, other]

    cs.RO cs.CV

    Real-time Mapping of Physical Scene Properties with an Autonomous Robot Experimenter

    Authors: Iain Haughton, Edgar Sucar, Andre Mouton, Edward Johns, Andrew J. Davison

    Abstract: Neural fields can be trained from scratch to represent the shape and appearance of 3D scenes efficiently. It has also been shown that they can densely map correlated properties such as semantics, via sparse interactions from a human labeller. In this work, we show that a robot can densely annotate a scene with arbitrary discrete or continuous physical properties via its own fully-autonomous experi…

    Submitted 31 October, 2022; originally announced October 2022.

  16. arXiv:2210.03043 [pdf, other]

    cs.CV cs.LG cs.RO

    Feature-Realistic Neural Fusion for Real-Time, Open Set Scene Understanding

    Authors: Kirill Mazur, Edgar Sucar, Andrew J. Davison

    Abstract: General scene understanding for robotics requires flexible semantic representation, so that novel objects and structures which may not have been known at training time can be identified, segmented and grouped. We present an algorithm which fuses general learned features from a standard pre-trained network into a highly efficient 3D geometric neural field representation during real-time SLAM. The f…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: For our project page, see https://makezur.github.io/FeatureRealisticFusion/

  17. arXiv:2208.05067 [pdf, other]

    cs.CV cs.RO

    Learning to Complete Object Shapes for Object-level Mapping in Dynamic Scenes

    Authors: Binbin Xu, Andrew J. Davison, Stefan Leutenegger

    Abstract: In this paper, we propose a novel object-level mapping system that can simultaneously segment, track, and reconstruct objects in dynamic scenes. It can further predict and complete their full geometries by conditioning on reconstructions from depth inputs and a category-level shape prior with the aim that completed object geometry leads to better object reconstruction and tracking accuracy. For ea…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: International Conference on Intelligent Robots and Systems (IROS) 2022

  18. arXiv:2203.11618 [pdf, other]

    cs.RO cs.AI cs.MA

    Distributing Collaborative Multi-Robot Planning with Gaussian Belief Propagation

    Authors: Aalok Patwardhan, Riku Murai, Andrew J. Davison

    Abstract: Precise coordinated planning over a forward time window enables safe and highly efficient motion when many robots must work together in tight spaces, but this would normally require centralised control of all devices which is difficult to scale. We demonstrate GBP Planning, a new purely distributed technique based on Gaussian Belief Propagation for multi-robot planning problems, formulated by a ge…

    Submitted 26 January, 2023; v1 submitted 22 March, 2022; originally announced March 2022.

    Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 2, pp. 552-559, Feb. 2023

  19. arXiv:2203.08040 [pdf, other]

    cs.RO cs.CV

    Simultaneous Localisation and Mapping with Quadric Surfaces

    Authors: Tristan Laidlow, Andrew J. Davison

    Abstract: There are many possibilities for how to represent the map in simultaneous localisation and mapping (SLAM). While sparse, keypoint-based SLAM systems have achieved impressive levels of accuracy and robustness, their maps may not be suitable for many robotic tasks. Dense SLAM systems are capable of producing dense reconstructions, but can be computationally expensive and, like sparse systems, lack h…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: 7 pages, 4 figures

  20. arXiv:2202.11092 [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    ReorientBot: Learning Object Reorientation for Specific-Posed Placement

    Authors: Kentaro Wada, Stephen James, Andrew J. Davison

    Abstract: Robots need the capability of placing objects in arbitrary, specific poses to rearrange the world and achieve various valuable tasks. Object reorientation plays a crucial role in this as objects may not initially be oriented such that the robot can grasp and then immediately place them in a specific goal pose. In this work, we present a vision-based manipulation system, ReorientBot, which consists…

    Submitted 22 February, 2022; originally announced February 2022.

    Comments: 7 pages, 6 figures, IEEE International Conference on Robotics and Automation (ICRA) 2022

  21. arXiv:2202.05832 [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    SafePicking: Learning Safe Object Extraction via Object-Level Mapping

    Authors: Kentaro Wada, Stephen James, Andrew J. Davison

    Abstract: Robots need object-level scene understanding to manipulate objects while reasoning about contact, support, and occlusion among objects. Given a pile of objects, object recognition and reconstruction can identify the boundary of object instances, giving important cues as to how the objects form and support the pile. In this work, we present a system, SafePicking, that integrates object-level mappin…

    Submitted 1 March, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    Comments: 7 pages, 6 figures, IEEE International Conference on Robotics and Automation (ICRA) 2022

  22. arXiv:2202.03314 [pdf, other]

    cs.RO cs.AI cs.MA

    A Robot Web for Distributed Many-Device Localisation

    Authors: Riku Murai, Joseph Ortiz, Sajad Saeedi, Paul H. J. Kelly, Andrew J. Davison

    Abstract: We show that a distributed network of robots or other devices which make measurements of each other can collaborate to globally localise via efficient ad-hoc peer to peer communication. Our Robot Web solution is based on Gaussian Belief Propagation on the fundamental non-linear factor graph describing the probabilistic structure of all of the observations robots make internally or of each other, a…

    Submitted 26 January, 2024; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: Published in IEEE Transactions on Robotics (TRO) 2023

    Journal ref: IEEE Transactions on Robotics, vol. 40, pp. 121-138, 2024

  23. arXiv:2202.03091 [pdf, other]

    cs.LG cs.AI cs.CV

    Auto-Lambda: Disentangling Dynamic Task Relationships

    Authors: Shikun Liu, Stephen James, Andrew J. Davison, Edward Johns

    Abstract: Understanding the structure of multiple related tasks allows for multi-task learning to improve the generalisation ability of one or all of them. However, it usually requires training each pairwise combination of tasks together in order to capture task relationships, at an extremely high computational cost. In this work, we learn task relationships via an automated weighting framework, named Auto-…

    Submitted 2 June, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: Published at TMLR 2022. Project Page: https://shikun.io/projects/auto-lambda Code: https://github.com/lorenmt/auto-lambda

  24. arXiv:2201.01689 [pdf, other]

    stat.ML cs.LG math.ST

    Asymptotics of $\ell_2$ Regularized Network Embeddings

    Authors: Andrew Davison

    Abstract: A common approach to solving prediction tasks on large networks, such as node classification or link prediction, begins by learning a Euclidean embedding of the nodes of the network, from which traditional machine learning methods can then be applied. This includes methods such as DeepWalk and node2vec, which learn embeddings by optimizing stochastic losses formed over subsamples of the graph at ea…

    Submitted 18 December, 2022; v1 submitted 5 January, 2022; originally announced January 2022.

    Comments: Accepted in Neural Information Processing Systems 2022. 44 pages, 2 figures, 2 tables

  25. arXiv:2111.14637 [pdf, other]

    cs.CV

    ILabel: Interactive Neural Scene Labelling

    Authors: Shuaifeng Zhi, Edgar Sucar, Andre Mouton, Iain Haughton, Tristan Laidlow, Andrew J. Davison

    Abstract: Joint representation of geometry, colour and semantics using a 3D neural field enables accurate dense labelling from ultra-sparse interactions as a user reconstructs a scene in real-time using a handheld RGB-D sensor. Our iLabel system requires no training data, yet can densely label scenes more accurately than standard methods trained on large, expensively labelled image datasets. Furthermore, it…

    Submitted 3 December, 2021; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: Project page: https://edgarsucar.github.io/ilabel/ Video: https://youtu.be/bL7RZaMhRbk

  26. arXiv:2109.06241 [pdf, other]

    cs.CV cs.RO

    Incremental Abstraction in Distributed Probabilistic SLAM Graphs

    Authors: Joseph Ortiz, Talfan Evans, Edgar Sucar, Andrew J. Davison

    Abstract: Scene graphs represent the key components of a scene in a compact and semantically rich way, but are difficult to build during incremental SLAM operation because of the challenges of robustly identifying abstract scene elements and optimising continually changing, complex graphs. We present a distributed, graph-based SLAM framework for incrementally building scene graphs based on two novel compone…

    Submitted 4 April, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: Published at ICRA 2022. Project page: https://joeaortiz.github.io/incremental_abstraction/

  27. arXiv:2107.08994 [pdf, other]

    cs.CV cs.RO

    CodeMapping: Real-Time Dense Mapping for Sparse SLAM using Compact Scene Representations

    Authors: Hidenobu Matsuki, Raluca Scona, Jan Czarnowski, Andrew J. Davison

    Abstract: We propose a novel dense mapping framework for sparse visual SLAM systems which leverages a compact scene representation. State-of-the-art sparse visual SLAM systems provide accurate and reliable estimates of the camera trajectory and locations of landmarks. While these sparse maps are useful for localization, they cannot be used for other tasks such as obstacle avoidance or scene understanding. I…

    Submitted 19 July, 2021; originally announced July 2021.

    Comments: Accepted to IEEE Robotics and Automation Letters (RA-L) 2021

  28. arXiv:2107.02363 [pdf, other]

    stat.ML cs.LG math.ST

    Asymptotics of Network Embeddings Learned via Subsampling

    Authors: Andrew Davison, Morgane Austern

    Abstract: Network data are ubiquitous in modern machine learning, with tasks of interest including node classification, node clustering and link prediction. A frequent approach begins by learning a Euclidean embedding of the network, to which algorithms developed for vector-valued data are applied. For large networks, embeddings are learned using stochastic gradient methods where the sub-sampling scheme ca…

    Submitted 17 May, 2023; v1 submitted 5 July, 2021; originally announced July 2021.

    Comments: Accepted at Journal of Machine Learning Research (JMLR). 120 pages, 3 figures, 1 table

    Journal ref: Journal of Machine Learning Research 24 (2023) 1-120. Published 5/23

  29. arXiv:2107.02308 [pdf, other]

    cs.AI cs.CV cs.LG cs.RO

    A visual introduction to Gaussian Belief Propagation

    Authors: Joseph Ortiz, Talfan Evans, Andrew J. Davison

    Abstract: In this article, we present a visual introduction to Gaussian Belief Propagation (GBP), an approximate probabilistic inference algorithm that operates by passing messages between the nodes of arbitrarily structured factor graphs. A special case of loopy belief propagation, GBP updates rely only on local information and will converge independently of the message schedule. Our key argument is that,…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: See online version of this article: https://gaussianbp.github.io/

  30. arXiv:2106.12534 [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation

    Authors: Stephen James, Kentaro Wada, Tristan Laidlow, Andrew J. Davison

    Abstract: We present a coarse-to-fine discretisation method that enables the use of discrete reinforcement learning approaches in place of unstable and data-inefficient actor-critic methods in continuous robotics domains. This approach builds on the recently released ARM algorithm, which replaces the continuous next-best pose agent with a discrete one, with coarse-to-fine Q-attention. Given a voxelised scen…

    Submitted 14 March, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

    Comments: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2022). Videos and code: https://sites.google.com/view/c2f-q-attention

  31. arXiv:2105.14829 [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Q-attention: Enabling Efficient Learning for Vision-based Robotic Manipulation

    Authors: Stephen James, Andrew J. Davison

    Abstract: Despite the success of reinforcement learning methods, they have yet to have their breakthrough moment when applied to a broad range of robotic manipulation tasks. This is partly due to the fact that reinforcement learning algorithms are notoriously difficult and time consuming to train, which is exacerbated when training from images rather than full-state inputs. As humans perform manipulation ta…

    Submitted 3 February, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: IEEE Robotics and Automation Letters, 2022 (+ presentation at ICRA 2022). Videos and code found at: https://sites.google.com/view/q-attention

  32. arXiv:2105.06340 [pdf, other]

    cs.CV

    3D-CNN for Facial Micro- and Macro-expression Spotting on Long Video Sequences using Temporal Oriented Reference Frame

    Authors: Chuin Hong Yap, Moi Hoon Yap, Adrian K. Davison, Connah Kendrick, Jingting Li, Sujing Wang, Ryan Cunningham

    Abstract: Facial expression spotting is the preliminary step for micro- and macro-expression analysis. The task of reliably spotting such expressions in video sequences is currently unsolved. The current best systems depend upon optical flow methods to extract regional motion features, before categorisation of that motion into a specific class of facial movement. Optical flow is susceptible to drift error,…

    Submitted 26 May, 2022; v1 submitted 13 May, 2021; originally announced May 2021.

  33. arXiv:2104.04465 [pdf, other]

    cs.CV cs.LG

    Bootstrapping Semantic Segmentation with Regional Contrast

    Authors: Shikun Liu, Shuaifeng Zhi, Edward Johns, Andrew J. Davison

    Abstract: We present ReCo, a contrastive learning framework designed at a regional level to assist learning in semantic segmentation. ReCo performs semi-supervised or supervised pixel-level contrastive learning on a sparse set of hard negative pixels, with minimal additional memory footprint. ReCo is easy to implement, being built on top of off-the-shelf segmentation networks, and consistently improves perf…

    Submitted 31 January, 2022; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: Published at ICLR 2022. Project Page: https://shikun.io/projects/regional-contrast. Code: https://github.com/lorenmt/reco

  34. arXiv:2103.16442 [pdf, other]

    cs.CV

    SIMstack: A Generative Shape and Instance Model for Unordered Object Stacks

    Authors: Zoe Landgraf, Raluca Scona, Tristan Laidlow, Stephen James, Stefan Leutenegger, Andrew J. Davison

    Abstract: By estimating 3D shape and instances from a single view, we can capture information about an environment quickly, without the need for comprehensive scanning and multi-view fusion. Solving this task for composite scenes (such as object stacks) is challenging: occluded areas are not only ambiguous in shape but also in instance segmentation; multiple decompositions could be valid. We observe that ph…

    Submitted 26 September, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Journal ref: ICCV 2021

  35. arXiv:2103.15875 [pdf, other]

    cs.CV

    In-Place Scene Labelling and Understanding with Implicit Scene Representation

    Authors: Shuaifeng Zhi, Tristan Laidlow, Stefan Leutenegger, Andrew J. Davison

    Abstract: Semantic labelling is highly correlated with geometry and radiance reconstruction, as scene entities with similar shape and appearance are more likely to come from similar classes. Recent implicit neural reconstruction techniques are appealing as they do not require prior training data, but the same fully self-supervised approach is not possible for semantics because labels are human-defined prope…

    Submitted 21 August, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: Camera ready version. To be published in Proceedings of IEEE International Conference on Computer Vision (ICCV 2021) as Oral Presentation. Project page with more videos: https://shuaifengzhi.com/Semantic-NeRF/

  36. arXiv:2103.12352 [pdf, other]

    cs.CV

    iMAP: Implicit Mapping and Positioning in Real-Time

    Authors: Edgar Sucar, Shikun Liu, Joseph Ortiz, Andrew J. Davison

    Abstract: We show for the first time that a multilayer perceptron (MLP) can serve as the only scene representation in a real-time SLAM system for a handheld RGB-D camera. Our network is trained in live operation without prior data, building a dense, scene-specific implicit 3D model of occupancy and colour which is also immediately used for tracking. Achieving real-time SLAM via continual training of a neu…

    Submitted 13 September, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: Typos, make pdf smaller

  37. arXiv:2102.07764 [pdf, other]

    cs.RO cs.AI cs.CV cs.LG cs.NE

    End-to-End Egospheric Spatial Memory

    Authors: Daniel Lenton, Stephen James, Ronald Clark, Andrew J. Davison

    Abstract: Spatial memory, or the ability to remember and recall specific locations and objects, is central to autonomous agents' ability to carry out tasks in real environments. However, most existing artificial memory modules are not very adept at storing spatial information. We propose a parameter-free module, Egospheric Spatial Memory (ESM), which encodes the memory in an ego-sphere around the agent, ena…

    Submitted 17 February, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: Conference paper at ICLR 2021. Implementation: https://github.com/ivy-dl/memory Project page: https://djl11.github.io/ESM/

  38. arXiv:2011.01975 [pdf, other]

    cs.AI cs.CV cs.LG cs.RO

    Rearrangement: A Challenge for Embodied AI

    Authors: Dhruv Batra, Angel X. Chang, Sonia Chernova, Andrew J. Davison, Jia Deng, Vladlen Koltun, Sergey Levine, Jitendra Malik, Igor Mordatch, Roozbeh Mottaghi, Manolis Savva, Hao Su

    Abstract: We describe a framework for research and evaluation in Embodied AI. Our proposal is based on a canonical task: Rearrangement. A standard task can focus the development of new techniques and serve as a source of trained models that can be transferred to other settings. In the rearrangement task, the goal is to bring a given physical environment into a specified state. The goal state can be specifie…

    Submitted 3 November, 2020; originally announced November 2020.

    Comments: Authors are listed in alphabetical order

  39. arXiv:2008.13504 [pdf, other]

    cs.CV cs.RO

    Deep Probabilistic Feature-metric Tracking

    Authors: Binbin Xu, Andrew J. Davison, Stefan Leutenegger

    Abstract: Dense image alignment from RGB-D images remains a critical issue for real-world applications, especially under challenging lighting conditions and in a wide baseline setting. In this paper, we propose a new framework to learn a pixel-wise deep feature map and a deep feature-metric uncertainty map predicted by a Convolutional Neural Network (CNN), which together formulate a deep probabilistic featu…

    Submitted 25 November, 2020; v1 submitted 31 August, 2020; originally announced August 2020.

    Comments: RAL 2020. 8 pages, 9 figures, video link: https://youtu.be/6pMosl6ZAPE

  40. arXiv:2007.05385 [pdf, ps, other]

    stat.ML cs.LG stat.AP

    Next Waves in Veridical Network Embedding

    Authors: Owen G. Ward, Zhen Huang, Andrew Davison, Tian Zheng

    Abstract: Embedding nodes of a large network into a metric (e.g., Euclidean) space has become an area of active research in statistical machine learning, which has found applications in natural and social sciences. Generally, a representation of a network object is learned in a Euclidean geometry and is then used for subsequent tasks regarding the nodes and/or edges of the network, such as community detecti…

    Submitted 12 August, 2021; v1 submitted 10 July, 2020; originally announced July 2020.

  41. arXiv:2004.04485 [pdf, other]

    cs.CV

    NodeSLAM: Neural Object Descriptors for Multi-View Shape Reconstruction

    Authors: Edgar Sucar, Kentaro Wada, Andrew Davison

    Abstract: The choice of scene representation is crucial in both the shape inference algorithms it requires and the smart applications it enables. We present efficient and optimisable multi-class learned object descriptors together with a novel probabilistic and differential rendering engine, for principled full object shape inference from one or more RGB-D images. Our framework allows for accurate and robus…

    Submitted 10 October, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: to be published in 3DV

  42. arXiv:2004.04336  [pdf, other]

    cs.CV cs.RO

    MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

    Authors: Kentaro Wada, Edgar Sucar, Stephen James, Daniel Lenton, Andrew J. Davison

    Abstract: Robots and other smart devices need efficient object-based scene representations from their on-board vision systems to reason about contact, physics and occlusion. Recognized precise object models will play an important role alongside non-parametric reconstructions of unrecognized structures. We present a system which can estimate the accurate poses of multiple known objects in contact and occlusi…

    Submitted 8 April, 2020; originally announced April 2020.

    Comments: 10 pages, 10 figures, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020

  43. arXiv:2003.03134  [pdf, other]

    cs.CV cs.DC

    Bundle Adjustment on a Graph Processor

    Authors: Joseph Ortiz, Mark Pupilli, Stefan Leutenegger, Andrew J. Davison

    Abstract: Graph processors such as Graphcore's Intelligence Processing Unit (IPU) are part of the major new wave of novel computer architecture for AI, and have a general design with massively parallel computation, distributed on-chip memory and very high inter-core communication bandwidth which allows breakthrough performance for message passing algorithms on arbitrary graphs. We show for the first time th…

    Submitted 30 March, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: Published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2020). Video: https://www.youtube.com/watch?v=TqeN8aQNgd0

  44. arXiv:2002.10342  [pdf, other]

    cs.CV

    Comparing View-Based and Map-Based Semantic Labelling in Real-Time SLAM

    Authors: Zoe Landgraf, Fabian Falck, Michael Bloesch, Stefan Leutenegger, Andrew Davison

    Abstract: Generally capable Spatial AI systems must build persistent scene representations where geometric models are combined with meaningful semantic labels. The many approaches to labelling scenes can be divided into two clear groups: view-based which estimate labels from the input view-wise data and then incrementally fuse them into the scene model as it is built; and map-based which label the generated…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: ICRA 2020

  45. DeepFactors: Real-Time Probabilistic Dense Monocular SLAM

    Authors: Jan Czarnowski, Tristan Laidlow, Ronald Clark, Andrew J. Davison

    Abstract: The ability to estimate rich geometry and camera motion from monocular imagery is fundamental to future interactive robotics and augmented reality applications. Different approaches have been proposed that vary in scene geometry representation (sparse landmarks, dense maps), the consistency metric used for optimising the multi-view problem, and the use of learned priors. We present a SLAM system t…

    Submitted 14 January, 2020; originally announced January 2020.

    Comments: RA-L

  46. arXiv:1911.05116  [pdf, other]

    q-fin.RM cs.LG stat.ML

    An Unethical Optimization Principle

    Authors: Nicholas Beale, Heather Battey, Anthony C. Davison, Robert S. MacKay

    Abstract: If an artificial intelligence aims to maximise risk-adjusted return, then under mild conditions it is disproportionately likely to pick an unethical strategy unless the objective function allows sufficiently for this risk. Even if the proportion $η$ of available unethical strategies is small, the probability ${p_U}$ of picking an unethical strategy can become large; indeed unless returns are fat-t…

    Submitted 12 November, 2019; originally announced November 2019.

  47. arXiv:1911.01103  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Learning One-Shot Imitation from Humans without Humans

    Authors: Alessandro Bonardi, Stephen James, Andrew J. Davison

    Abstract: Humans can naturally learn to execute a new task by seeing it performed by other individuals once, and then reproduce it in a variety of configurations. Endowing robots with this ability of imitating humans from third person is a very immediate and natural way of teaching new tasks. Only recently, through meta-learning, there have been successful attempts to one-shot imitation learning from humans…

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: Videos can be found here: https://sites.google.com/view/tecnets-humans

  48. arXiv:1910.14139  [pdf, other]

    cs.AI cs.CV cs.DC cs.RO

    FutureMapping 2: Gaussian Belief Propagation for Spatial AI

    Authors: Andrew J. Davison, Joseph Ortiz

    Abstract: We argue the case for Gaussian Belief Propagation (GBP) as a strong algorithmic framework for the distributed, generic and incremental probabilistic estimation we need in Spatial AI as we aim at high performance smart robots and devices which operate within the constraints of real products. Processor hardware is changing rapidly, and GBP has the right character to take advantage of highly distribu…

    Submitted 7 November, 2022; v1 submitted 30 October, 2019; originally announced October 2019.

  49. arXiv:1909.12271  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    RLBench: The Robot Learning Benchmark & Learning Environment

    Authors: Stephen James, Zicong Ma, David Rovick Arrojo, Andrew J. Davison

    Abstract: We present a challenging new benchmark and learning-environment for robot learning: RLBench. The benchmark features 100 completely unique, hand-designed tasks ranging in difficulty, from simple target reaching and door opening, to longer multi-stage tasks, such as opening an oven and placing a tray in it. We provide an array of both proprioceptive observations and visual observations, which includ…

    Submitted 26 September, 2019; originally announced September 2019.

    Comments: Videos and code: https://sites.google.com/view/rlbench

  50. arXiv:1906.11176  [pdf, other]

    cs.RO cs.CV cs.LG

    PyRep: Bringing V-REP to Deep Robot Learning

    Authors: Stephen James, Marc Freese, Andrew J. Davison

    Abstract: PyRep is a toolkit for robot learning research, built on top of the virtual robotics experimentation platform (V-REP). Through a series of modifications and additions, we have created a tailored version of V-REP built with robot learning in mind. The new PyRep toolkit offers three improvements: (1) a simple and flexible API for robot control and scene manipulation, (2) a new rendering engine, and…

    Submitted 26 June, 2019; originally announced June 2019.