Showing 1–17 of 17 results for author: Golodetz, S

Searching in archive cs.
  1. arXiv:2209.10471  [pdf, other]

    cs.CV cs.LG cs.RO

    Sample, Crop, Track: Self-Supervised Mobile 3D Object Detection for Urban Driving LiDAR

    Authors: Sangyun Shin, Stuart Golodetz, Madhu Vankadari, Kaichen Zhou, Andrew Markham, Niki Trigoni

    Abstract: Deep learning has led to great progress in the detection of mobile (i.e. movement-capable) objects in urban driving scenes in recent years. Supervised approaches typically require the annotation of large training sets; there has thus been great interest in leveraging weakly, semi- or self-supervised methods to avoid this, with much success. Whilst weakly and semi-supervised methods require some an…

    Submitted 21 September, 2022; originally announced September 2022.

    MSC Class: 68T45 ACM Class: I.2.10

  2. arXiv:2206.13850  [pdf, other]

    cs.CV cs.RO

    When the Sun Goes Down: Repairing Photometric Losses for All-Day Depth Estimation

    Authors: Madhu Vankadari, Stuart Golodetz, Sourav Garg, Sangyun Shin, Andrew Markham, Niki Trigoni

    Abstract: Self-supervised deep learning methods for joint depth and ego-motion estimation can yield accurate trajectories without needing ground-truth training data. However, as they typically use photometric losses, their performance can degrade significantly when the assumptions these losses make (e.g. temporal illumination consistency, a static scene, and the absence of noise and occlusions) are violated…

    Submitted 28 June, 2022; originally announced June 2022.

    MSC Class: 68T45 ACM Class: I.2.10

  3. arXiv:2203.02453  [pdf, other]

    cs.CV cs.RO

    Real-Time Hybrid Mapping of Populated Indoor Scenes using a Low-Cost Monocular UAV

    Authors: Stuart Golodetz, Madhu Vankadari, Aluna Everitt, Sangyun Shin, Andrew Markham, Niki Trigoni

    Abstract: Unmanned aerial vehicles (UAVs) have been used for many applications in recent years, from urban search and rescue, to agricultural surveying, to autonomous underground mine exploration. However, deploying UAVs in tight, indoor spaces, especially close to humans, remains a challenge. One solution, when limited payload is required, is to use micro-UAVs, which pose less risk to humans and typically…

    Submitted 4 March, 2022; originally announced March 2022.

    Comments: Submitted to IROS 2022

    MSC Class: 68T45 ACM Class: I.2.10; I.2.9

  4. arXiv:2008.02004  [pdf, other]

    cs.CV

    Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes

    Authors: Johanna Wald, Torsten Sattler, Stuart Golodetz, Tommaso Cavallari, Federico Tombari

    Abstract: Long-term camera re-localization is an important task with numerous computer vision and robotics applications. Whilst various outdoor benchmarks exist that target lighting, weather and seasonal changes, far less attention has been paid to appearance changes that occur indoors. This has led to a mismatch between popular indoor benchmarks, which focus on static scenes, and indoor environments that a…

    Submitted 5 August, 2020; originally announced August 2020.

    Comments: ECCV 2020, project website https://waldjohannau.github.io/RIO10

  5. arXiv:2002.09437  [pdf, other]

    cs.LG cs.CV stat.ML

    Calibrating Deep Neural Networks using Focal Loss

    Authors: Jishnu Mukhoti, Viveka Kulharia, Amartya Sanyal, Stuart Golodetz, Philip H. S. Torr, Puneet K. Dokania

    Abstract: Miscalibration - a mismatch between a model's confidence and its correctness - of Deep Neural Networks (DNNs) makes their predictions hard to rely on. Ideally, we want networks to be accurate, calibrated and confident. We show that, as opposed to the standard cross-entropy loss, focal loss [Lin et al., 2017] allows us to learn models that are already very well calibrated. When combined with tempe…

    Submitted 26 October, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: This paper was accepted at NeurIPS 2020

  6. arXiv:1907.07745  [pdf, other]

    cs.CV eess.IV eess.SP

    Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC

    Authors: Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Thomas Joy, Luigi Di Stefano, Simon Walker, Philip H. S. Torr

    Abstract: Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs. Whilst various stereo algorithms have been deployed on these platforms, usually cut down to better match the embedded architecture, certain key parts of…

    Submitted 17 July, 2019; originally announced July 2019.

    Comments: 6 pages, 7 figures, 2 tables, journal

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 66, no. 5, pp. 773-777, May 2019

  7. arXiv:1906.08744  [pdf, other]

    cs.CV cs.LG cs.RO

    Let's Take This Online: Adapting Scene Coordinate Regression Network Predictions for Online RGB-D Camera Relocalisation

    Authors: Tommaso Cavallari, Luca Bertinetto, Jishnu Mukhoti, Philip Torr, Stuart Golodetz

    Abstract: Many applications require a camera to be relocalised online, without expensive offline training on the target scene. Whilst both keyframe and sparse keypoint matching methods can be used online, the former often fail away from the training trajectory, and the latter can struggle in textureless regions. By contrast, scene coordinate regression (SCoRe) methods generalise to novel poses and can lever…

    Submitted 20 June, 2019; originally announced June 2019.

    Comments: Tommaso Cavallari and Stuart Golodetz contributed equally to this paper

  8. arXiv:1905.11358  [pdf, other]

    cs.CV cs.AI cs.LG

    Straight to Shapes++: Real-time Instance Segmentation Made More Accurate

    Authors: Laurynas Miksys, Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr

    Abstract: Instance segmentation is an important problem in computer vision, with applications in autonomous driving, drone navigation and robotic manipulation. However, most existing methods are not real-time, complicating their deployment in time-sensitive contexts. In this work, we extend an existing approach to real-time instance segmentation, called 'Straight to Shapes' (STS), which makes use of low-dim…

    Submitted 30 July, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: Technical report, 27 pages (12 main, 15 supplementary), 17 figures, 14 tables

    Report number: STS-2018

  9. R$^3$SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems

    Authors: Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Simon Walker, Philip H. S. Torr

    Abstract: Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is…

    Submitted 30 October, 2018; originally announced October 2018.

    Comments: Accepted in FPT 2018 as Oral presentation, 8 pages, 6 figures, 4 tables

    Journal ref: 2018 International Conference on Field-Programmable Technology (FPT)

  10. Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade

    Authors: Tommaso Cavallari, Stuart Golodetz, Nicholas A. Lord, Julien Valentin, Victor A. Prisacariu, Luigi Di Stefano, Philip H. S. Torr

    Abstract: Camera pose estimation is an important problem in computer vision. Common techniques either match the current image against keyframes with known poses, directly regress the pose, or establish correspondences between keypoints in the image and points in the scene to estimate the pose. In recent years, regression forests have become a popular alternative to establish such correspondences. They achie…

    Submitted 2 July, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin assert joint first authorship

    MSC Class: 68T45

  11. Collaborative Large-Scale Dense 3D Reconstruction with Online Inter-Agent Pose Optimisation

    Authors: Stuart Golodetz, Tommaso Cavallari, Nicholas A Lord, Victor A Prisacariu, David W Murray, Philip H S Torr

    Abstract: Reconstructing dense, volumetric models of real-world 3D scenes is important for many tasks, but capturing large scenes can take significant time, and the risk of transient changes to the scene goes up as the capture time increases. These are good reasons to want instead to capture several smaller sub-scenes that can be joined to make the whole scene. Achieving this has traditionally been difficul…

    Submitted 2 July, 2019; v1 submitted 25 January, 2018; originally announced January 2018.

    Comments: Stuart Golodetz, Tommaso Cavallari and Nicholas Lord assert joint first authorship

    MSC Class: 68T45

    Journal ref: IEEE Transactions on Visualization and Computer Graphics 24(11):2895-2905, 2018

  12. arXiv:1708.00783  [pdf, other]

    cs.CV

    InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure

    Authors: Victor Adrian Prisacariu, Olaf Kähler, Stuart Golodetz, Michael Sapienza, Tommaso Cavallari, Philip H S Torr, David W Murray

    Abstract: Volumetric models have become a popular representation for 3D scenes in recent years. One breakthrough leading to their popularity was KinectFusion, which focuses on 3D reconstruction using RGB-D sensors. However, monocular SLAM has since also been tackled with very similar approaches. Representing the reconstruction volumetrically as a TSDF leads to most of the simplicity and efficiency that can…

    Submitted 2 August, 2017; originally announced August 2017.

    Comments: This article largely supersedes arxiv:1410.0925 (it describes version 3 of the InfiniTAM framework)

  13. arXiv:1702.02779  [pdf, other]

    cs.CV

    On-the-Fly Adaptation of Regression Forests for Online Camera Relocalisation

    Authors: Tommaso Cavallari, Stuart Golodetz, Nicholas A. Lord, Julien Valentin, Luigi Di Stefano, Philip H. S. Torr

    Abstract: Camera relocalisation is an important problem in computer vision, with applications in simultaneous localisation and mapping, virtual/augmented reality and navigation. Common techniques either match the current image against keyframes with known poses coming from a tracker, or establish 2D-to-3D correspondences between keypoints in the current image and points in the scene in order to estimate the…

    Submitted 26 June, 2017; v1 submitted 9 February, 2017; originally announced February 2017.

    Comments: To appear in the proceedings of CVPR 2017

  14. arXiv:1611.07932  [pdf, other]

    cs.CV

    Straight to Shapes: Real-time Detection of Encoded Shapes

    Authors: Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr

    Abstract: Current object detection approaches predict bounding boxes, but these provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to directly regress to objects' shapes in addition to their bounding boxes and categories. It is crucial to find an appropriate shape representation that is compact and decodable, and in which objects can be compared fo…

    Submitted 5 July, 2017; v1 submitted 23 November, 2016; originally announced November 2016.

    Comments: 16 pages including appendix; Published at CVPR 2017

  15. arXiv:1601.02220  [pdf, other]

    cs.CV cs.SD

    Joint Object-Material Category Segmentation from Audio-Visual Cues

    Authors: Anurag Arnab, Michael Sapienza, Stuart Golodetz, Julien Valentin, Ondrej Miksik, Shahram Izadi, Philip Torr

    Abstract: It is not always possible to recognise objects and infer material properties for a scene from visual cues alone, since objects can look visually similar whilst being made of very different materials. In this paper, we therefore present an approach that augments the available dense visual cues with sparse auditory cues in order to estimate dense object and material labels. Since estimates of object…

    Submitted 10 January, 2016; originally announced January 2016.

    Comments: Published in British Machine Vision Conference (BMVC) 2015

  16. arXiv:1512.01355  [pdf, other]

    cs.CV

    Staple: Complementary Learners for Real-Time Tracking

    Authors: Luca Bertinetto, Jack Valmadre, Stuart Golodetz, Ondrej Miksik, Philip Torr

    Abstract: Correlation Filter-based trackers have recently achieved excellent performance, showing great robustness to challenging situations exhibiting motion blur and illumination changes. However, since the model that they learn depends strongly on the spatial layout of the tracked object, they are notoriously sensitive to deformation. Models based on colour statistics have complementary traits: they cope…

    Submitted 13 April, 2016; v1 submitted 4 December, 2015; originally announced December 2015.

    Comments: To appear in CVPR 2016

  17. SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes

    Authors: Stuart Golodetz, Michael Sapienza, Julien P. C. Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A. Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W. Murray, Shahram Izadi, Philip H. S. Torr

    Abstract: We present an open-source, real-time implementation of SemanticPaint, a system for geometric reconstruction, object-class segmentation and learning of 3D scenes. Using our system, a user can walk into a room wearing a depth camera and a virtual reality headset, and both densely reconstruct the 3D scene and interactively segment the environment into object classes such as 'chair', 'floor' and 'tabl…

    Submitted 13 October, 2015; originally announced October 2015.

    Comments: 33 pages, Project: http://www.semantic-paint.com, Code: https://github.com/torrvision/spaint

    ACM Class: I.2.10