Showing 1–9 of 9 results for author: Renz, K

  1. arXiv:2406.10165  [pdf, other]

    cs.CV cs.RO

    CarLLaVA: Vision language models for camera-only closed-loop driving

    Authors: Katrin Renz, Long Chen, Ana-Maria Marcu, Jan Hünermann, Benoit Hanotte, Alice Karnsund, Jamie Shotton, Elahe Arani, Oleg Sinavski

    Abstract: In this technical report, we present CarLLaVA, a Vision Language Model (VLM) for autonomous driving, developed for the CARLA Autonomous Driving Challenge 2.0. CarLLaVA uses the vision encoder of the LLaVA VLM and the LLaMA architecture as backbone, achieving state-of-the-art closed-loop driving performance with only camera input and without the need for complex or expensive labels. Additionally, w…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Outstanding Champion & Innovation Award @ CARLA Autonomous Driving Challenge 2024; Project video: https://youtu.be/E1nsEgcHRuc

  2. arXiv:2404.07569  [pdf, other]

    cs.RO cs.AI cs.LG

    Can Vehicle Motion Planning Generalize to Realistic Long-tail Scenarios?

    Authors: Marcel Hallgarten, Julian Zapata, Martin Stoll, Katrin Renz, Andreas Zell

    Abstract: Real-world autonomous driving systems must make safe decisions in the face of rare and diverse traffic scenarios. Current state-of-the-art planners are mostly evaluated on real-world datasets like nuScenes (open-loop) or nuPlan (closed-loop). In particular, nuPlan seems to be an expressive evaluation method since it is based on real-world data and closed-loop, yet it mostly covers basic driving sc…

    Submitted 11 April, 2024; originally announced April 2024.

  3. arXiv:2312.14150  [pdf, other]

    cs.CV

    DriveLM: Driving with Graph Visual Question Answering

    Authors: Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Ping Luo, Andreas Geiger, Hongyang Li

    Abstract: We study how vision-language models (VLMs) trained on web-scale data can be integrated into end-to-end driving systems to boost generalization and enable interactivity with human users. While recent approaches adapt VLMs to driving via single-round visual question answering (VQA), human drivers reason about decisions in multiple steps. Starting from the localization of key objects, humans estimate…

    Submitted 21 December, 2023; originally announced December 2023.

  4. arXiv:2308.12779  [pdf, other]

    cs.CV cs.RO

    On Offline Evaluation of 3D Object Detection for Autonomous Driving

    Authors: Tim Schreier, Katrin Renz, Andreas Geiger, Kashyap Chitta

    Abstract: Prior work in 3D object detection evaluates models using offline metrics like average precision since closed-loop online evaluation on the downstream driving task is costly. However, it is unclear how indicative offline results are of driving performance. In this work, we perform the first empirical evaluation measuring how predictive different detection metrics are of driving performance when det…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: Appears in: IEEE International Conference on Computer Vision (ICCV'23) Workshops

  5. arXiv:2210.14222  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    PlanT: Explainable Planning Transformers via Object-Level Representations

    Authors: Katrin Renz, Kashyap Chitta, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata, Andreas Geiger

    Abstract: Planning an optimal route in a complex environment requires efficient reasoning about the surrounding scene. While human drivers prioritize important objects and ignore details not relevant to the decision, learning-based planners typically extract features from dense, high-dimensional grid representations containing all vehicle and road context information. In this paper, we propose PlanT, a nove…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: CoRL 2022. Project Page: https://www.katrinrenz.de/plant/

  6. arXiv:2205.15997  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving

    Authors: Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, Andreas Geiger

    Abstract: How should we integrate representations from complementary sensors for autonomous driving? Geometry-based fusion has shown promise for perception (e.g. object detection, motion forecasting). However, in the context of end-to-end driving, we find that imitation learning based on existing sensor fusion methods underperforms in complex driving scenarios with a high density of dynamic agents. Therefor…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: text overlap with arXiv:2104.09224

  7. arXiv:2204.13683  [pdf, other]

    cs.RO cs.CV cs.LG

    KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients

    Authors: Niklas Hanselmann, Katrin Renz, Kashyap Chitta, Apratim Bhattacharyya, Andreas Geiger

    Abstract: Simulators offer the possibility of safe, low-cost development of self-driving systems. However, current driving simulators exhibit naïve behavior models for background traffic. Hand-tuned scenarios are typically added during simulation to induce safety-critical situations. An alternative approach is to adversarially perturb the background traffic trajectories. In this paper, we study this approac…

    Submitted 28 April, 2022; originally announced April 2022.

  8. arXiv:2104.13817  [pdf, other]

    cs.CV

    Sign Segmentation with Changepoint-Modulated Pseudo-Labelling

    Authors: Katrin Renz, Nicolaj C. Stache, Neil Fox, Gül Varol, Samuel Albanie

    Abstract: The objective of this work is to find temporal boundaries between signs in continuous sign language. Motivated by the paucity of annotation available for this task, we propose a simple yet effective algorithm to improve segmentation performance on unlabelled signing footage from a domain of interest. We make the following contributions: (1) We motivate and introduce the task of source-free domain…

    Submitted 28 April, 2021; originally announced April 2021.

    Comments: Appears in: 2021 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW'21). 11 pages

  9. arXiv:2011.12986  [pdf, other]

    cs.CV

    Sign language segmentation with temporal convolutional networks

    Authors: Katrin Renz, Nicolaj C. Stache, Samuel Albanie, Gül Varol

    Abstract: The objective of this work is to determine the location of temporal boundaries between signs in continuous sign language videos. Our approach employs 3D convolutional neural network representations with iterative temporal segment refinement to resolve ambiguities between sign boundary cues. We demonstrate the effectiveness of our approach on the BSLCORPUS, PHOENIX14 and BSL-1K datasets, showing co…

    Submitted 12 February, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

    Comments: Appears in: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'21). 5 pages