
Showing 1–3 of 3 results for author: Sinavski, O

  1. arXiv:2406.10165  [pdf, other]

    cs.CV cs.RO

    CarLLaVA: Vision language models for camera-only closed-loop driving

    Authors: Katrin Renz, Long Chen, Ana-Maria Marcu, Jan Hünermann, Benoit Hanotte, Alice Karnsund, Jamie Shotton, Elahe Arani, Oleg Sinavski

    Abstract: In this technical report, we present CarLLaVA, a Vision Language Model (VLM) for autonomous driving, developed for the CARLA Autonomous Driving Challenge 2.0. CarLLaVA uses the vision encoder of the LLaVA VLM and the LLaMA architecture as backbone, achieving state-of-the-art closed-loop driving performance with only camera input and without the need for complex or expensive labels. Additionally, w…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Outstanding Champion & Innovation Award @ CARLA Autonomous Driving Challenge 2024; Project video: https://youtu.be/E1nsEgcHRuc

  2. arXiv:2312.14115  [pdf, other]

    cs.RO cs.AI cs.CV

    LingoQA: Video Question Answering for Autonomous Driving

    Authors: Ana-Maria Marcu, Long Chen, Jan Hünermann, Alice Karnsund, Benoit Hanotte, Prajwal Chidananda, Saurabh Nair, Vijay Badrinarayanan, Alex Kendall, Jamie Shotton, Elahe Arani, Oleg Sinavski

    Abstract: Autonomous driving has long faced a challenge with public acceptance due to the lack of explainability in the decision-making process. Video question-answering (QA) in natural language provides the opportunity for bridging this gap. Nonetheless, evaluating the performance of Video QA models has proved particularly tough due to the absence of comprehensive benchmarks. To fill this gap, we introduce…

    Submitted 19 March, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Benchmark and dataset are available at https://github.com/wayveai/LingoQA/

  3. arXiv:2310.01957  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV

    Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving

    Authors: Long Chen, Oleg Sinavski, Jan Hünermann, Alice Karnsund, Andrew James Willmott, Danny Birch, Daniel Maund, Jamie Shotton

    Abstract: Large Language Models (LLMs) have shown promise in the autonomous driving sector, particularly in generalization and interpretability. We introduce a unique object-level multimodal LLM architecture that merges vectorized numeric modalities with a pre-trained LLM to improve context understanding in driving situations. We also present a new dataset of 160k QA pairs derived from 10k driving scenarios…

    Submitted 13 October, 2023; v1 submitted 3 October, 2023; originally announced October 2023.