Showing 1–24 of 24 results for author: Kendall, A

Searching in archive cs.
  1. arXiv:2312.14115  [pdf, other]

    cs.RO cs.AI cs.CV

    LingoQA: Video Question Answering for Autonomous Driving

    Authors: Ana-Maria Marcu, Long Chen, Jan Hünermann, Alice Karnsund, Benoit Hanotte, Prajwal Chidananda, Saurabh Nair, Vijay Badrinarayanan, Alex Kendall, Jamie Shotton, Elahe Arani, Oleg Sinavski

    Abstract: Autonomous driving has long faced a challenge with public acceptance due to the lack of explainability in the decision-making process. Video question-answering (QA) in natural language provides the opportunity for bridging this gap. Nonetheless, evaluating the performance of Video QA models has proved particularly tough due to the absence of comprehensive benchmarks. To fill this gap, we introduce…

    Submitted 19 March, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Benchmark and dataset are available at https://github.com/wayveai/LingoQA/

  2. arXiv:2309.17080  [pdf, other]

    cs.CV cs.AI cs.RO

    GAIA-1: A Generative World Model for Autonomous Driving

    Authors: Anthony Hu, Lloyd Russell, Hudson Yeo, Zak Murez, George Fedoseev, Alex Kendall, Jamie Shotton, Gianluca Corrado

    Abstract: Autonomous driving promises transformative improvements to transportation, but building systems capable of safely navigating the unstructured complexity of real-world scenarios remains challenging. A critical problem lies in effectively predicting the various potential outcomes that may emerge in response to the vehicle's actions as the world evolves. To address this challenge, we introduce GAIA…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: Technical Report

  3. arXiv:2307.07147  [pdf, other]

    cs.CV

    Linking vision and motion for self-supervised object-centric perception

    Authors: Kaylene C. Stocking, Zak Murez, Vijay Badrinarayanan, Jamie Shotton, Alex Kendall, Claire Tomlin, Christopher P. Burgess

    Abstract: Object-centric representations enable autonomous driving algorithms to reason about interactions between many independent agents and scene features. Traditionally these representations have been obtained via supervised learning, but this decouples perception from the downstream driving task and could harm generalization. In this work we adapt a self-supervised object-centric vision model to perfor…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Presented at the CVPR 2023 Vision-Centric Autonomous Driving workshop

  4. arXiv:2210.07729  [pdf, other]

    cs.CV cs.AI cs.RO

    Model-Based Imitation Learning for Urban Driving

    Authors: Anthony Hu, Gianluca Corrado, Nicolas Griffiths, Zak Murez, Corina Gurau, Hudson Yeo, Alex Kendall, Roberto Cipolla, Jamie Shotton

    Abstract: An accurate model of the environment and the dynamic agents acting in it offers great potential for improving motion planning. We present MILE: a Model-based Imitation LEarning approach to jointly learn a model of the world and a policy for autonomous driving. Our method leverages 3D geometry as an inductive bias and learns a highly compact latent space directly from high-resolution videos of expe…

    Submitted 3 November, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  5. arXiv:2108.05805  [pdf, other]

    cs.LG cs.AI cs.RO

    Reimagining an autonomous vehicle

    Authors: Jeffrey Hawke, Haibo E, Vijay Badrinarayanan, Alex Kendall

    Abstract: The self driving challenge in 2021 is this century's technological equivalent of the space race, and is now entering the second major decade of development. Solving the technology will create social change which parallels the invention of the automobile itself. Today's autonomous driving technology is laudable, though rooted in decisions made a decade ago. We argue that a rethink is required, reco…

    Submitted 12 August, 2021; originally announced August 2021.

    Comments: Under review

  6. arXiv:2105.03533  [pdf, other]

    cs.CV

    Video Class Agnostic Segmentation with Contrastive Learning for Autonomous Driving

    Authors: Mennatullah Siam, Alex Kendall, Martin Jagersand

    Abstract: Semantic segmentation in autonomous driving predominantly focuses on learning from large-scale data with a closed set of known classes without considering unknown objects. Motivated by safety reasons, we address the video class agnostic segmentation task, which considers unknown objects outside the closed set of known classes in our training data. We propose a novel auxiliary contrastive loss to l…

    Submitted 10 May, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

  7. arXiv:2104.10490  [pdf, other]

    cs.CV cs.RO

    FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras

    Authors: Anthony Hu, Zak Murez, Nikhil Mohan, Sofía Dudas, Jeffrey Hawke, Vijay Badrinarayanan, Roberto Cipolla, Alex Kendall

    Abstract: Driving requires interacting with road agents and predicting their future behaviour in order to navigate safely. We present FIERY: a probabilistic future prediction model in bird's-eye view from monocular cameras. Our model predicts future instance segmentation and motion of dynamic agents that can be transformed into non-parametric future trajectories. Our approach combines the perception, sensor…

    Submitted 18 October, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

    Comments: ICCV 2021

  8. arXiv:2103.11015  [pdf, other]

    cs.CV

    Video Class Agnostic Segmentation Benchmark for Autonomous Driving

    Authors: Mennatullah Siam, Alex Kendall, Martin Jagersand

    Abstract: Semantic segmentation approaches are typically trained on large-scale data with a closed finite set of known classes without considering unknown objects. In certain safety-critical robotics applications, especially autonomous driving, it is important to segment all objects, including those unknown at training time. We formalize the task of video class agnostic segmentation from monocular video seq…

    Submitted 19 April, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

    Comments: Accepted in WAD workshop, CVPR 2021

  9. arXiv:2102.09144  [pdf, other]

    cs.RO math.OC physics.app-ph

    Stochastic Spatio-Temporal Optimization for Control and Co-Design of Systems in Robotics and Applied Physics

    Authors: Ethan N. Evans, Andrew P. Kendall, Evangelos A. Theodorou

    Abstract: Correlated with the trend of increasing degrees of freedom in robotic systems is a similar trend of rising interest in Spatio-Temporal systems described by Partial Differential Equations (PDEs) among the robotics and control communities. These systems often exhibit dramatic under-actuation, high dimensionality, bifurcations, and multimodal instabilities. Their control represents many of the curren…

    Submitted 17 February, 2021; originally announced February 2021.

    Comments: 34 pages, 10 figures. Submitted to Autonomous Robots special issue of RSS 2020. arXiv admin note: text overlap with arXiv:2002.01397

  10. arXiv:2003.06409  [pdf, other]

    cs.CV cs.LG cs.RO

    Probabilistic Future Prediction for Video Scene Understanding

    Authors: Anthony Hu, Fergal Cotter, Nikhil Mohan, Corina Gurau, Alex Kendall

    Abstract: We present a novel deep learning architecture for probabilistic future prediction from video. We predict the future semantics, geometry and motion of complex real-world urban scenes and use this representation to control an autonomous vehicle. This work is the first to jointly predict ego-motion, static scene, and the motion of dynamic agents in a probabilistic manner, which allows sampling consis…

    Submitted 17 July, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

    Comments: Accepted as a conference paper at ECCV 2020

  11. arXiv:1912.08969  [pdf, other]

    cs.CV

    Learning a Spatio-Temporal Embedding for Video Instance Segmentation

    Authors: Anthony Hu, Alex Kendall, Roberto Cipolla

    Abstract: We present a novel embedding approach for video instance segmentation. Our method learns a spatio-temporal embedding integrating cues from appearance, motion, and geometry; a 3D causal convolutional network models motion, and a monocular self-supervised depth loss models geometry. In this embedding space, video-pixels of the same instance are clustered together while being separated from other ins…

    Submitted 18 December, 2019; originally announced December 2019.

  12. arXiv:1912.00177  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Urban Driving with Conditional Imitation Learning

    Authors: Jeffrey Hawke, Richard Shen, Corina Gurau, Siddharth Sharma, Daniele Reda, Nikolay Nikolov, Przemyslaw Mazur, Sean Micklethwaite, Nicolas Griffiths, Amar Shah, Alex Kendall

    Abstract: Hand-crafting generalised decision-making rules for real-world urban autonomous driving is hard. Alternatively, learning behaviour from easy-to-collect human driving demonstrations is appealing. Prior work has studied imitation learning (IL) for autonomous driving with a number of limitations. Examples include only performing lane-following rather than following a user-defined route, only using a…

    Submitted 5 December, 2019; v1 submitted 30 November, 2019; originally announced December 2019.

    Comments: Under submission; added acknowledgements

  13. arXiv:1901.02709  [pdf]

    cs.CY

    A taxonomy of circular economy indicators

    Authors: Michael Saidani, Bernard Yannou, Yann Leroy, François Cluzel, Alissa Kendall

    Abstract: Implementing circular economy (CE) principles is increasingly recommended as a convenient solution to meet the goals of sustainable development. New tools are required to support practitioners, decision-makers and policy-makers towards more CE practices, as well as to monitor the effects of CE adoption. Worldwide, academics, industrialists and politicians all agree on the need to use CE-related me…

    Submitted 29 December, 2018; originally announced January 2019.

    Journal ref: Journal of Cleaner Production, Elsevier, 2019, 207, pp.542-559

  14. arXiv:1812.03823  [pdf, other]

    cs.CV

    Learning to Drive from Simulation without Real World Labels

    Authors: Alex Bewley, Jessica Rigley, Yuxuan Liu, Jeffrey Hawke, Richard Shen, Vinh-Dieu Lam, Alex Kendall

    Abstract: Simulation can be a powerful tool for understanding machine learning systems and designing methods to solve real-world problems. Training and evaluating methods purely in simulation is often "doomed to succeed" at the desired task in a simulated environment, but the resulting models are incapable of operation in the real world. Here we present and evaluate a method for transferring a vision-based…

    Submitted 13 December, 2018; v1 submitted 10 December, 2018; originally announced December 2018.

  15. arXiv:1811.08188  [pdf, other]

    cs.CV

    Orthographic Feature Transform for Monocular 3D Object Detection

    Authors: Thomas Roddick, Alex Kendall, Roberto Cipolla

    Abstract: 3D object detection from monocular images has proven to be an enormously challenging task, with the performance of leading systems not yet achieving even 10% of that of LiDAR-based counterparts. One explanation for this performance gap is that existing systems are entirely at the mercy of the perspective image-based representation, in which the appearance and scale of objects varies drastically w…

    Submitted 20 November, 2018; originally announced November 2018.

  16. arXiv:1807.00412  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Learning to Drive in a Day

    Authors: Alex Kendall, Jeffrey Hawke, David Janz, Przemyslaw Mazur, Daniele Reda, John-Mark Allen, Vinh-Dieu Lam, Alex Bewley, Amar Shah

    Abstract: We demonstrate the first application of deep reinforcement learning to autonomous driving. From randomly initialised parameters, our model is able to learn a policy for lane following in a handful of training episodes using a single monocular image as input. We provide a general and easy to obtain reward: the distance travelled by the vehicle without the safety driver taking control. We use a cont…

    Submitted 11 September, 2018; v1 submitted 1 July, 2018; originally announced July 2018.

    Comments: Further results and demo videos can be viewed at: https://wayve.ai/blog/l2diad

  17. arXiv:1705.07115  [pdf, other]

    cs.CV

    Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics

    Authors: Alex Kendall, Yarin Gal, Roberto Cipolla

    Abstract: Numerous deep learning applications benefit from multi-task learning with multiple regression and classification objectives. In this paper we make the observation that the performance of such systems is strongly dependent on the relative weighting between each task's loss. Tuning these weights by hand is a difficult and expensive process, making multi-task learning prohibitive in practice. We prop…

    Submitted 24 April, 2018; v1 submitted 19 May, 2017; originally announced May 2017.

    Comments: CVPR 2018

  18. arXiv:1704.00390  [pdf, other]

    cs.CV

    Geometric Loss Functions for Camera Pose Regression with Deep Learning

    Authors: Alex Kendall, Roberto Cipolla

    Abstract: Deep learning has shown to be effective for robust and real-time monocular image relocalisation. In particular, PoseNet is a deep convolutional neural network which learns to regress the 6-DOF camera pose from a single image. It learns to localize using high level features and is robust to difficult lighting, motion blur and unknown camera intrinsics, where point based SIFT registration fails. How…

    Submitted 23 May, 2017; v1 submitted 2 April, 2017; originally announced April 2017.

    Comments: CVPR 2017

  19. arXiv:1703.04977  [pdf, other]

    cs.CV

    What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?

    Authors: Alex Kendall, Yarin Gal

    Abstract: There are two major types of uncertainty one can model. Aleatoric uncertainty captures noise inherent in the observations. On the other hand, epistemic uncertainty accounts for uncertainty in the model -- uncertainty which can be explained away given enough data. Traditionally it has been difficult to model epistemic uncertainty in computer vision, but with new Bayesian deep learning tools this is…

    Submitted 5 October, 2017; v1 submitted 15 March, 2017; originally announced March 2017.

    Comments: NIPS 2017

  20. arXiv:1703.04309  [pdf, other]

    cs.CV cs.NE

    End-to-End Learning of Geometry and Context for Deep Stereo Regression

    Authors: Alex Kendall, Hayk Martirosyan, Saumitro Dasgupta, Peter Henry, Ryan Kennedy, Abraham Bachrach, Adam Bry

    Abstract: We propose a novel deep learning architecture for regressing disparity from a rectified pair of stereo images. We leverage knowledge of the problem's geometry to form a cost volume using deep feature representations. We learn to incorporate contextual information using 3-D convolutions over this volume. Disparity values are regressed from the cost volume using a proposed differentiable soft argmin…

    Submitted 13 March, 2017; originally announced March 2017.

  21. arXiv:1511.02680  [pdf, other]

    cs.CV cs.NE

    Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding

    Authors: Alex Kendall, Vijay Badrinarayanan, Roberto Cipolla

    Abstract: We present a deep learning framework for probabilistic pixel-wise semantic segmentation, which we term Bayesian SegNet. Semantic segmentation is an important tool for visual scene understanding and a meaningful measure of uncertainty is essential for decision making. Our contribution is a practical system which is able to predict pixel-wise class labels with a measure of model uncertainty. We achi…

    Submitted 10 October, 2016; v1 submitted 9 November, 2015; originally announced November 2015.

  22. arXiv:1511.00561  [pdf, other]

    cs.CV cs.LG cs.NE

    SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

    Authors: Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla

    Abstract: We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16…

    Submitted 10 October, 2016; v1 submitted 2 November, 2015; originally announced November 2015.

  23. arXiv:1509.05909  [pdf, other]

    cs.CV cs.RO

    Modelling Uncertainty in Deep Learning for Camera Relocalization

    Authors: Alex Kendall, Roberto Cipolla

    Abstract: We present a robust and real-time monocular six degree of freedom visual relocalization system. We use a Bayesian convolutional neural network to regress the 6-DOF camera pose from a single RGB image. It is trained in an end-to-end manner with no need of additional engineering or graph optimisation. The algorithm can operate indoors and outdoors in real time, taking under 6ms to compute. It obtain…

    Submitted 18 February, 2016; v1 submitted 19 September, 2015; originally announced September 2015.

    Comments: ICRA 2016; Fixed numerical error with rotation results

  24. arXiv:1505.07427  [pdf, other]

    cs.CV cs.NE cs.RO

    PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization

    Authors: Alex Kendall, Matthew Grimes, Roberto Cipolla

    Abstract: We present a robust and real-time monocular six degree of freedom relocalization system. Our system trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need of additional engineering or graph optimisation. The algorithm can operate indoors and outdoors in real time, taking 5ms per frame to compute. It obtains approximately…

    Submitted 18 February, 2016; v1 submitted 27 May, 2015; originally announced May 2015.

    Comments: 9 pages, 13 figures; Corrected numerical error in orientation results