Showing 1–50 of 67 results for author: Anguelov, D

  1. arXiv:2405.02811 [pdf, other]

    cs.CV

    PVTransformer: Point-to-Voxel Transformer for Scalable 3D Object Detection

    Authors: Zhaoqi Leng, Pei Sun, Tong He, Dragomir Anguelov, Mingxing Tan

    Abstract: 3D object detectors for point clouds often rely on a pooling-based PointNet to encode sparse points into grid-like voxels or pillars. In this paper, we identify that the common PointNet design introduces an information bottleneck that limits 3D object detection accuracy and scalability. To address this limitation, we propose PVTransformer: a transformer-based point-to-voxel architecture for 3D det…

    Submitted 5 May, 2024; originally announced May 2024.

  2. arXiv:2404.19531 [pdf, other]

    cs.CV

    MoST: Multi-modality Scene Tokenization for Motion Prediction

    Authors: Norman Mu, Jingwei Ji, Zhenpei Yang, Nate Harada, Haotian Tang, Kan Chen, Charles R. Qi, Runzhou Ge, Kratarth Goel, Zoey Yang, Scott Ettinger, Rami Al-Rfou, Dragomir Anguelov, Yin Zhou

    Abstract: Many existing motion prediction approaches rely on symbolic perception outputs to generate agent trajectories, such as bounding boxes, road graph information and traffic lights. This symbolic representation is a high-level abstraction of the real world, which may render the motion prediction model vulnerable to perception errors (e.g., failures in detecting open-vocabulary obstacles) while missing…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  3. arXiv:2310.08710 [pdf, other]

    cs.RO cs.LG

    Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

    Authors: Cole Gulino, Justin Fu, Wenjie Luo, George Tucker, Eli Bronstein, Yiren Lu, Jean Harb, Xinlei Pan, Yan Wang, Xiangyu Chen, John D. Co-Reyes, Rishabh Agarwal, Rebecca Roelofs, Yao Lu, Nico Montali, Paul Mougin, Zoey Yang, Brandyn White, Aleksandra Faust, Rowan McAllister, Dragomir Anguelov, Benjamin Sapp

    Abstract: Simulation is an essential tool to develop and benchmark autonomous vehicle planning software in a safe and cost-effective manner. However, realistic simulation requires accurate modeling of nuanced and complex multi-agent interactive behaviors. To address these challenges, we introduce Waymax, a new data-driven simulator for autonomous driving in multi-agent scenes, designed for large-scale simul…

    Submitted 12 October, 2023; originally announced October 2023.

  4. arXiv:2309.16870 [pdf, other]

    cs.CV cs.LG cs.RO

    LEF: Late-to-Early Temporal Fusion for LiDAR 3D Object Detection

    Authors: Tong He, Pei Sun, Zhaoqi Leng, Chenxi Liu, Dragomir Anguelov, Mingxing Tan

    Abstract: We propose a late-to-early recurrent feature fusion scheme for 3D object detection using temporal LiDAR point clouds. Our main motivation is fusing object-aware latent embeddings into the early stages of a 3D object detector. This feature fusion strategy enables the model to better capture the shapes and poses for challenging objects, compared with learning from raw points directly. Our method con…

    Submitted 28 September, 2023; originally announced September 2023.

  5. arXiv:2309.14491 [pdf, other]

    cs.CV

    Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving

    Authors: Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov

    Abstract: Closed-set 3D perception models trained on only a pre-defined set of object categories can be inadequate for safety critical applications such as autonomous driving where new object types can be encountered after deployment. In this paper, we present a multi-modal auto labeling pipeline capable of generating amodal 3D bounding boxes and tracklets for training models on open-set categories without…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: ICCV 2023

  6. arXiv:2306.04745 [pdf, other]

    cs.CV cs.AI

    3D Human Keypoints Estimation From Point Clouds in the Wild Without Human Labels

    Authors: Zhenzhen Weng, Alexander S. Gorban, Jingwei Ji, Mahyar Najibi, Yin Zhou, Dragomir Anguelov

    Abstract: Training a 3D human keypoint detector from point clouds in a supervised manner requires large volumes of high quality labels. While it is relatively easy to capture large amounts of human point clouds, annotating 3D keypoints is expensive, subjective, error prone and especially difficult for long-tail cases (pedestrians with rare poses, scooterists, etc.). In this work, we propose GC-KPL - Geometr…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: CVPR 2023

  7. arXiv:2306.03206 [pdf, other]

    cs.CV

    MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences

    Authors: Yingwei Li, Charles R. Qi, Yin Zhou, Chenxi Liu, Dragomir Anguelov

    Abstract: Occluded and long-range objects are ubiquitous and challenging for 3D object detection. Point cloud sequence data provide unique opportunities to improve such cases, as an occluded or distant object can be observed from different viewpoints or gain better visibility over time. However, the efficiency and effectiveness in encoding long-term sequence data can still be improved. In this work, we prop…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: CVPR 2023

  8. arXiv:2306.03083 [pdf, other]

    cs.RO cs.AI

    MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion

    Authors: Chiyu Max Jiang, Andre Cornman, Cheolho Park, Ben Sapp, Yin Zhou, Dragomir Anguelov

    Abstract: We present MotionDiffuser, a diffusion based representation for the joint distribution of future trajectories over multiple agents. Such representation has several key advantages: first, our model learns a highly multimodal distribution that captures diverse future outcomes. Second, the simple predictor design requires only a single L2 loss training objective, and does not depend on trajectory anc…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted as a highlight paper in CVPR 2023. Walkthrough video: https://youtu.be/IfGTZwm1abg

  9. arXiv:2305.12032 [pdf, other]

    cs.CV cs.LG cs.MA cs.RO

    The Waymo Open Sim Agents Challenge

    Authors: Nico Montali, John Lambert, Paul Mougin, Alex Kuefler, Nick Rhinehart, Michelle Li, Cole Gulino, Tristan Emrich, Zoey Yang, Shimon Whiteson, Brandyn White, Dragomir Anguelov

    Abstract: Simulation with realistic, interactive agents represents a key task for autonomous vehicle software development. In this work, we introduce the Waymo Open Sim Agents Challenge (WOSAC). WOSAC is the first public challenge to tackle this task and propose corresponding metrics. The goal of the challenge is to stimulate the design of realistic simulators that can be used to evaluate and train a behavi…

    Submitted 11 December, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS 2023, Track on Datasets and Benchmarks. Public leaderboard available at https://waymo.com/open/challenges/2023/sim-agents/

  10. arXiv:2304.03834 [pdf, other]

    cs.CV

    WOMD-LiDAR: Raw Sensor Dataset Benchmark for Motion Forecasting

    Authors: Kan Chen, Runzhou Ge, Hang Qiu, Rami Al-Rfou, Charles R. Qi, Xuanyu Zhou, Zoey Yang, Scott Ettinger, Pei Sun, Zhaoqi Leng, Mustafa Baniodeh, Ivan Bogun, Weiyue Wang, Mingxing Tan, Dragomir Anguelov

    Abstract: Widely adopted motion forecasting datasets substitute the observed sensory inputs with higher-level abstractions such as 3D boxes and polylines. These sparse shapes are inferred through annotating the original scenes with perception systems' predictions. Such intermediate representations tie the quality of the motion forecasting models to the performance of computer vision models. Moreover, the hu…

    Submitted 18 February, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: ICRA 2024 camera ready version. Dataset website: https://waymo.com/open/data/motion/

  11. arXiv:2304.02163 [pdf, other]

    cs.CV cs.AI cs.GR cs.RO

    GINA-3D: Learning to Generate Implicit Neural Assets in the Wild

    Authors: Bokui Shen, Xinchen Yan, Charles R. Qi, Mahyar Najibi, Boyang Deng, Leonidas Guibas, Yin Zhou, Dragomir Anguelov

    Abstract: Modeling the 3D world from sensor data for simulation is a scalable way of developing testing and validation environments for robotic learning problems such as autonomous driving. However, manually creating or re-creating real-world-like environments is difficult, expensive, and not scalable. Recent generative model techniques have shown promising progress to address such challenges by learning 3D…

    Submitted 28 August, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: Accepted by CVPR 2023; Our WOD-ObjectAsset can be accessed through waymo.com/open

  12. arXiv:2212.11419 [pdf, other]

    cs.AI cs.RO

    Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios

    Authors: Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Rebecca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine

    Abstract: Imitation learning (IL) is a simple and powerful way to use high-quality human driving data, which can be collected at scale, to produce human-like behavior. However, policies based on imitation learning alone often fail to sufficiently account for safety and reliability concerns. In this paper, we show how imitation learning combined with reinforcement learning using simple rewards can substantia…

    Submitted 10 August, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    ACM Class: I.2.9; I.2.6

  13. arXiv:2212.08710 [pdf, other]

    cs.MA cs.LG cs.RO

    JFP: Joint Future Prediction with Interactive Multi-Agent Modeling for Autonomous Driving

    Authors: Wenjie Luo, Cheolho Park, Andre Cornman, Benjamin Sapp, Dragomir Anguelov

    Abstract: We propose JFP, a Joint Future Prediction model that can learn to generate accurate and consistent multi-agent future trajectories. For this task, many different methods have been proposed to capture social interactions in the encoding part of the model; however, considerably less focus has been placed on representing interactions in the decoder and output stages. As a result, the predicted trajec…

    Submitted 16 December, 2022; originally announced December 2022.

  14. arXiv:2212.07729 [pdf, other]

    cs.CV

    HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving

    Authors: Andrei Zanfir, Mihai Zanfir, Alexander Gorban, Jingwei Ji, Yin Zhou, Dragomir Anguelov, Cristian Sminchisescu

    Abstract: Autonomous driving is an exciting new industry, posing important research questions. Within the perception module, 3D human pose estimation is an emerging technology, which can enable the autonomous vehicle to perceive and understand the subtle and complex behaviors of pedestrians. While hardware systems and sensors have dramatically improved over the decades -- with cars potentially boasting comp…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: Published at the 6th Conference on Robot Learning (CoRL 2022), Auckland, New Zealand

  15. arXiv:2212.03267 [pdf, other]

    cs.CV

    NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors

    Authors: Congyue Deng, Chiyu "Max" Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, Dragomir Anguelov

    Abstract: 2D-to-3D reconstruction is an ill-posed problem, yet humans are good at solving this problem due to their prior knowledge of the 3D world developed over years. Driven by this observation, we propose NeRDi, a single-view NeRF synthesis framework with general image priors from 2D diffusion models. Formulating single-view reconstruction as an image-conditioned 3D generation problem, we optimize the N…

    Submitted 6 December, 2022; originally announced December 2022.

  16. arXiv:2210.13488 [pdf, other]

    cs.CV

    LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations

    Authors: Zhaoqi Leng, Guowang Li, Chenxi Liu, Ekin Dogus Cubuk, Pei Sun, Tong He, Dragomir Anguelov, Mingxing Tan

    Abstract: Data augmentations are important in training high-performance 3D object detectors for point clouds. Despite recent efforts on designing new data augmentations, perhaps surprisingly, most state-of-the-art 3D detectors only use a few simple data augmentations. In particular, different from 2D image data augmentations, 3D data augmentations need to account for different representations of input data…

    Submitted 24 October, 2022; originally announced October 2022.

  17. PseudoAugment: Learning to Use Unlabeled Data for Data Augmentation in Point Clouds

    Authors: Zhaoqi Leng, Shuyang Cheng, Benjamin Caine, Weiyue Wang, Xiao Zhang, Jonathon Shlens, Mingxing Tan, Dragomir Anguelov

    Abstract: Data augmentation is an important technique to improve data efficiency and save labeling cost for 3D detection in point clouds. Yet, existing augmentation policies have so far been designed to only utilize labeled data, which limits the data diversity. In this paper, we recognize that pseudo labeling and data augmentation are complementary, thus propose to leverage unlabeled data for data augmenta…

    Submitted 24 October, 2022; originally announced October 2022.

    Journal ref: ECCV 2022 (pp. 555-572). Springer, Cham

  18. arXiv:2210.09539 [pdf, other]

    cs.RO cs.AI cs.LG

    Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving

    Authors: Eli Bronstein, Mark Palatucci, Dominik Notz, Brandyn White, Alex Kuefler, Yiren Lu, Supratik Paul, Payam Nikdel, Paul Mougin, Hongge Chen, Justin Fu, Austin Abrams, Punit Shah, Evan Racah, Benjamin Frenkel, Shimon Whiteson, Dragomir Anguelov

    Abstract: We demonstrate the first large-scale application of model-based generative adversarial imitation learning (MGAIL) to the task of dense urban self-driving. We augment standard MGAIL using a hierarchical model to enable generalization to arbitrary goal routes, and measure performance using a closed-loop evaluation framework with simulated interactive agents. We train policies from expert trajectorie…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: IROS 2022

    Journal ref: IEEE/RSJ international conference on intelligent robots and systems (IROS) 2022, pages 8652-8659

  19. arXiv:2210.09267 [pdf, other]

    cs.CV cs.LG cs.RO

    CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection

    Authors: Jyh-Jing Hwang, Henrik Kretzschmar, Joshua Manela, Sean Rafferty, Nicholas Armstrong-Crews, Tiffany Chen, Dragomir Anguelov

    Abstract: Robust 3D object detection is critical for safe autonomous driving. Camera and radar sensors are synergistic as they capture complementary information and work well under different environmental conditions. Fusing camera and radar data is challenging, however, as each of the sensors lacks information along a perpendicular axis, that is, depth is unknown to camera and elevation is unknown to radar.…

    Submitted 17 October, 2022; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  20. arXiv:2210.08375 [pdf, other]

    cs.CV cs.LG

    Improving the Intra-class Long-tail in 3D Detection via Rare Example Mining

    Authors: Chiyu Max Jiang, Mahyar Najibi, Charles R. Qi, Yin Zhou, Dragomir Anguelov

    Abstract: Continued improvements in deep learning architectures have steadily advanced the overall performance of 3D object detectors to levels on par with humans for certain tasks and datasets, where the overall performance is mostly driven by common examples. However, even the best performing models suffer from the most naive mistakes when it comes to rare examples that do not appear frequently in the tra…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: Accepted to European Conference on Computer Vision (ECCV) 2022

    MSC Class: 68T45

  21. arXiv:2210.08064 [pdf, other]

    cs.CV cs.RO

    LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds

    Authors: Minghua Liu, Yin Zhou, Charles R. Qi, Boqing Gong, Hao Su, Dragomir Anguelov

    Abstract: Semantic segmentation of LiDAR point clouds is an important task in autonomous driving. However, training deep models via conventional supervised methods requires large datasets which are costly to label. It is critical to have label-efficient segmentation approaches to scale up the model to new operational domains or to improve performance on rare cases. While most prior works focus on indoor sce…

    Submitted 14 October, 2022; originally announced October 2022.

  22. arXiv:2210.08061 [pdf, other]

    cs.CV cs.LG cs.RO

    Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving

    Authors: Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov

    Abstract: Learning-based perception and prediction modules in modern autonomous driving systems typically rely on expensive human annotation and are designed to perceive only a handful of predefined object categories. This closed-set paradigm is insufficient for the safety-critical autonomous driving task, where the autonomous vehicle needs to process arbitrarily many types of traffic participants and their…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  23. arXiv:2210.07372 [pdf, other]

    cs.CV

    SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds

    Authors: Pei Sun, Mingxing Tan, Weiyue Wang, Chenxi Liu, Fei Xia, Zhaoqi Leng, Dragomir Anguelov

    Abstract: 3D object detection in point clouds is a core component for modern robotics and autonomous driving systems. A key challenge in 3D object detection comes from the inherent sparse nature of point occupancy within the 3D scene. In this paper, we propose Sparse Window Transformer (SWFormer), a scalable and accurate model for 3D object detection, which can take full advantage of the sparsity of point…

    Submitted 13 October, 2022; originally announced October 2022.

    Journal ref: ECCV 2022

  24. arXiv:2210.05018 [pdf, other]

    cs.CV

    LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds

    Authors: Chenxi Liu, Zhaoqi Leng, Pei Sun, Shuyang Cheng, Charles R. Qi, Yin Zhou, Mingxing Tan, Dragomir Anguelov

    Abstract: Developing neural models that accurately understand objects in 3D point clouds is essential for the success of robotics and autonomous driving. However, arguably due to the higher-dimensional nature of the data (as compared to images), existing neural architectures exhibit a large variety in their designs, including but not limited to the views considered, the format of the neural features, and th…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  25. arXiv:2206.07705 [pdf, other]

    cs.CV

    LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection

    Authors: Wei-Chih Hung, Vincent Casser, Henrik Kretzschmar, Jyh-Jing Hwang, Dragomir Anguelov

    Abstract: The 3D Average Precision (3D AP) relies on the intersection over union between predictions and ground truth objects. However, camera-only detectors have limited depth accuracy, which may cause otherwise reasonable predictions that suffer from such longitudinal localization errors to be treated as false positives. We therefore propose variants of the 3D AP metric to be more permissive with respect…

    Submitted 3 May, 2024; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: Find the primary metrics for the 2022 Waymo Open Dataset 3D Camera-Only Detection Challenge at https://waymo.com/open/challenges/2022/3d-camera-only-detection/ . Find the code at https://github.com/waymo-research/waymo-open-dataset

  26. arXiv:2206.07704 [pdf, other]

    cs.CV

    Waymo Open Dataset: Panoramic Video Panoptic Segmentation

    Authors: Jieru Mei, Alex Zihao Zhu, Xinchen Yan, Hang Yan, Siyuan Qiao, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar, Dragomir Anguelov

    Abstract: Panoptic image segmentation is the computer vision task of finding groups of pixels in an image and assigning semantic classes and object instance identifiers to them. Research in image segmentation has become increasingly popular due to its critical applications in robotics and autonomous driving. The research community thereby relies on publicly available benchmark datasets to advance the state-o…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: Our dataset can be found at https://waymo.com/open

  27. arXiv:2206.03666 [pdf, other]

    cs.CV

    Depth Estimation Matters Most: Improving Per-Object Depth Estimation for Monocular 3D Detection and Tracking

    Authors: Longlong Jing, Ruichi Yu, Henrik Kretzschmar, Kang Li, Charles R. Qi, Hang Zhao, Alper Ayvaci, Xu Chen, Dillon Cower, Yingwei Li, Yurong You, Han Deng, Congcong Li, Dragomir Anguelov

    Abstract: Monocular image-based 3D perception has become an active research area in recent years owing to its applications in autonomous driving. Approaches to monocular 3D perception including detection and tracking, however, often yield inferior performance when compared to LiDAR-based techniques. Through systematic analysis, we identified that per-object depth estimation accuracy is a major factor boundi…

    Submitted 7 June, 2022; originally announced June 2022.

    Journal ref: ICRA2022

  28. arXiv:2206.01738 [pdf, other]

    eess.IV cs.CV

    RIDDLE: Lidar Data Compression with Range Image Deep Delta Encoding

    Authors: Xuanyu Zhou, Charles R. Qi, Yin Zhou, Dragomir Anguelov

    Abstract: Lidars are depth measuring sensors widely used in autonomous driving and augmented reality. However, the large volume of data produced by lidars can lead to high costs in data storage and transmission. While lidar data can be represented as two interchangeable representations: 3D point clouds and range images, most previous work focuses on compressing the generic 3D point clouds. In this work, we sh…

    Submitted 2 June, 2022; originally announced June 2022.

    Comments: 14 pages, 10 figures; CVPR 2022

  29. arXiv:2206.00991 [pdf, ps, other]

    cs.RO cs.CV

    StopNet: Scalable Trajectory and Occupancy Prediction for Urban Autonomous Driving

    Authors: Jinkyu Kim, Reza Mahjourian, Scott Ettinger, Mayank Bansal, Brandyn White, Ben Sapp, Dragomir Anguelov

    Abstract: We introduce a motion forecasting (behavior prediction) method that meets the latency requirements for autonomous driving in dense urban environments without sacrificing accuracy. A whole-scene sparse input representation allows StopNet to scale to predicting trajectories for hundreds of road agents with reliable latency. In addition to predicting trajectories, our scene encoder lends itself to pr…

    Submitted 2 June, 2022; originally announced June 2022.

    Journal ref: IEEE International Conference on Robotics and Automation 2022

  30. arXiv:2205.05703 [pdf, other]

    cs.CV cs.RO

    Multi-Class 3D Object Detection with Single-Class Supervision

    Authors: Mao Ye, Chenxi Liu, Maoqing Yao, Weiyue Wang, Zhaoqi Leng, Charles R. Qi, Dragomir Anguelov

    Abstract: While multi-class 3D detectors are needed in many robotics applications, training them with fully labeled datasets can be expensive in labeling cost. An alternative approach is to have targeted single-class labels on disjoint data samples. In this paper, we are interested in training a multi-class 3D object detection model, while using these single-class labeled data. We begin by detailing the uni…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: ICRA 2022

  31. arXiv:2205.03195 [pdf, other]

    cs.LG cs.RO

    Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation

    Authors: Maximilian Igl, Daewoo Kim, Alex Kuefler, Paul Mougin, Punit Shah, Kyriacos Shiarlis, Dragomir Anguelov, Mark Palatucci, Brandyn White, Shimon Whiteson

    Abstract: Simulation is a crucial tool for accelerating the development of autonomous vehicles. Making simulation realistic requires models of the human road users who interact with such cars. Such models can be obtained by applying learning from demonstration (LfD) to trajectories observed by cars already on the road. However, existing LfD methods are typically insufficient, yielding policies that frequent…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: Accepted to ICRA-2022

  32. arXiv:2204.12511 [pdf, other]

    cs.CV

    PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions

    Authors: Zhaoqi Leng, Mingxing Tan, Chenxi Liu, Ekin Dogus Cubuk, Xiaojie Shi, Shuyang Cheng, Dragomir Anguelov

    Abstract: Cross-entropy loss and focal loss are the most common choices when training deep neural networks for classification problems. Generally speaking, however, a good loss function can take on much more flexible forms, and should be tailored for different tasks and datasets. Motivated by how functions can be approximated via Taylor expansion, we propose a simple framework, named PolyLoss, to view and d…

    Submitted 10 May, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: Add ablation studies on COCO detection using RetinaNet (Section 8)

    Journal ref: International Conference on Learning Representations. 2021

  33. Occupancy Flow Fields for Motion Forecasting in Autonomous Driving

    Authors: Reza Mahjourian, Jinkyu Kim, Yuning Chai, Mingxing Tan, Ben Sapp, Dragomir Anguelov

    Abstract: We propose Occupancy Flow Fields, a new representation for motion forecasting of multiple agents, an important task in autonomous driving. Our representation is a spatio-temporal grid with each grid cell containing both the probability of the cell being occupied by any agent, and a two-dimensional flow vector representing the direction and magnitude of the motion in that cell. Our method successfu…

    Submitted 8 March, 2022; originally announced March 2022.

    Journal ref: IEEE Robotics and Automation Letters

  34. arXiv:2201.05938 [pdf, other]

    cs.LG cs.CV

    GradTail: Learning Long-Tailed Data Using Gradient-based Sample Weighting

    Authors: Zhao Chen, Vincent Casser, Henrik Kretzschmar, Dragomir Anguelov

    Abstract: We propose GradTail, an algorithm that uses gradients to improve model performance on the fly in the face of long-tailed training data distributions. Unlike conventional long-tail classifiers which operate on converged - and possibly overfit - models, we demonstrate that an approach based on gradient dot product agreement can isolate long-tailed data early on during model training and improve perf…

    Submitted 18 January, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: 15 pages (including Appendix), 8 figures

  35. arXiv:2112.12141 [pdf, other]

    cs.CV

    Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving

    Authors: Jingxiao Zheng, Xinwei Shi, Alexander Gorban, Junhua Mao, Yang Song, Charles R. Qi, Ting Liu, Visesh Chari, Andre Cornman, Yin Zhou, Congcong Li, Dragomir Anguelov

    Abstract: 3D human pose estimation (HPE) in autonomous vehicles (AV) differs from other use cases in many factors, including the 3D resolution and range of data, absence of dense depth maps, failure modes for LiDAR, relative location between the camera and LiDAR, and a high bar for estimation accuracy. Data collected for other use cases (such as virtual reality, gaming, and animation) may therefore not be u…

    Submitted 22 December, 2021; originally announced December 2021.

  36. arXiv:2112.07787 [pdf, other]

    cs.CV cs.RO

    Revisiting 3D Object Detection From an Egocentric Perspective

    Authors: Boyang Deng, Charles R. Qi, Mahyar Najibi, Thomas Funkhouser, Yin Zhou, Dragomir Anguelov

    Abstract: 3D object detection is a key module for safety-critical robotics applications such as autonomous driving. For these applications, we care most about how the detections affect the ego-agent's behavior and safety (the egocentric perspective). Intuitively, we seek more accurate descriptions of object geometry when it's more likely to interfere with the ego-agent's motion trajectory. However, current…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: Published in NeurIPS 2021

  37. arXiv:2111.14973 [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    MultiPath++: Efficient Information Fusion and Trajectory Aggregation for Behavior Prediction

    Authors: Balakrishnan Varadarajan, Ahmed Hefny, Avikalp Srivastava, Khaled S. Refaat, Nigamaa Nayakanti, Andre Cornman, Kan Chen, Bertrand Douillard, Chi Pang Lam, Dragomir Anguelov, Benjamin Sapp

    Abstract: Predicting the future behavior of road users is one of the most challenging and important problems in autonomous driving. Applying deep learning to this problem requires fusing heterogeneous world state in the form of rich perception signals and map information, and inferring highly multi-modal distributions over possible futures. In this paper, we present MultiPath++, a future prediction model th…

    Submitted 21 December, 2021; v1 submitted 29 November, 2021; originally announced November 2021.

  38. arXiv:2108.06709 [pdf, other]

    cs.CV

    SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation

    Authors: Qiangeng Xu, Yin Zhou, Weiyue Wang, Charles R. Qi, Dragomir Anguelov

    Abstract: In autonomous driving, a LiDAR-based object detector should perform reliably at different geographic locations and under various weather conditions. While recent 3D detection research focuses on improving performance within a single domain, our study reveals that the performance of modern detectors can drop drastically cross-domain. In this paper, we investigate unsupervised domain adaptation (UDA…

    Submitted 15 August, 2021; originally announced August 2021.

  39. arXiv:2106.14880 [pdf, other]

    cs.CV

    HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps

    Authors: Lu Mi, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, Chen Sun, Cordelia Schmid, Nir Shavit, Yuning Chai, Dragomir Anguelov

    Abstract: High Definition (HD) maps are maps with precise definitions of road lanes with rich semantics of the traffic rules. They are critical for several key stages in an autonomous driving system, including motion forecasting and planning. However, there is only a small number of real-world road topologies and geometries, which significantly limits our ability to test out the self-driving stack to gener…

    Submitted 28 June, 2021; originally announced June 2021.

  40. arXiv:2106.13381 [pdf, other]

    cs.CV

    To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels

    Authors: Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Benjamin Caine, Vijay Vasudevan, Xiao Zhang, Dragomir Anguelov

    Abstract: 3D object detection is vital for many robotics applications. For tasks where a 2D perspective range image exists, we propose to learn a 3D representation directly from this range image view. To this end, we designed a 2D convolutional network architecture that carries the 3D spherical coordinates of each pixel throughout the network. Its layers can consume any arbitrary convolution kernel in place…

    Submitted 24 June, 2021; originally announced June 2021.

    Journal ref: CVPR 2021

  41. arXiv:2106.13365  [pdf, other]

    cs.CV

    RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection

    Authors: Pei Sun, Weiyue Wang, Yuning Chai, Gamaleldin Elsayed, Alex Bewley, Xiao Zhang, Cristian Sminchisescu, Dragomir Anguelov

    Abstract: The detection of 3D objects from LiDAR data is a critical component in most autonomous driving systems. Safe, high speed driving needs larger detection ranges, which are enabled by new LiDARs. These larger detection ranges require more efficient and accurate detection models. Towards this goal, we propose Range Sparse Net (RSN), a simple, efficient, and accurate 3D object detector in order to tack…

    Submitted 24 June, 2021; originally announced June 2021.

    Journal ref: CVPR 2021

  42. arXiv:2104.10133  [pdf, other]

    cs.CV cs.LG cs.RO

    Large Scale Interactive Motion Forecasting for Autonomous Driving : The Waymo Open Motion Dataset

    Authors: Scott Ettinger, Shuyang Cheng, Benjamin Caine, Chenxi Liu, Hang Zhao, Sabeek Pradhan, Yuning Chai, Ben Sapp, Charles Qi, Yin Zhou, Zoey Yang, Aurelien Chouard, Pei Sun, Jiquan Ngiam, Vijay Vasudevan, Alexander McCauley, Jonathon Shlens, Dragomir Anguelov

    Abstract: As autonomous driving systems mature, motion forecasting has received increasing attention as a critical requirement for planning. Of particular importance are interactive situations such as merges, unprotected turns, etc., where predicting individual object motion is not sufficient. Joint predictions of multiple objects are required for effective route planning. There has been a critical need for…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 15 pages, 10 figures

  43. arXiv:2104.09959  [pdf, other]

    cs.RO

    Identifying Driver Interactions via Conditional Behavior Prediction

    Authors: Ekaterina Tolstaya, Reza Mahjourian, Carlton Downey, Balakrishnan Varadarajan, Benjamin Sapp, Dragomir Anguelov

    Abstract: Interactive driving scenarios, such as lane changes, merges and unprotected turns, are some of the most challenging situations for autonomous driving. Planning in interactive scenarios requires accurately modeling the reactions of other agents to different future actions of the ego agent. We develop end-to-end models for conditional behavior prediction (CBP) that take as an input a query future tr…

    Submitted 1 June, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

  44. arXiv:2103.05073  [pdf, other]

    cs.CV

    Offboard 3D Object Detection from Point Cloud Sequences

    Authors: Charles R. Qi, Yin Zhou, Mahyar Najibi, Pei Sun, Khoa Vo, Boyang Deng, Dragomir Anguelov

    Abstract: While current 3D object recognition research mostly focuses on the real-time, onboard scenario, there are many offboard use cases of perception that are largely under-explored, such as using machines to automatically generate high-quality 3D labels. Existing 3D object detectors fail to satisfy the high-quality requirement for offboard uses due to the limited input and speed constraints. In this pa…

    Submitted 8 March, 2021; originally announced March 2021.

    Comments: 18 pages, 7 figures, 19 tables

  45. arXiv:2010.06808  [pdf, other]

    cs.LG cs.CV

    Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout

    Authors: Zhao Chen, Jiquan Ngiam, Yanping Huang, Thang Luong, Henrik Kretzschmar, Yuning Chai, Dragomir Anguelov

    Abstract: The vast majority of deep models use multiple gradient signals, typically corresponding to a sum of multiple loss terms, to update a shared set of trainable weights. However, these multiple updates can impede optimal training by pulling the model in conflicting directions. We present Gradient Sign Dropout (GradDrop), a probabilistic masking procedure which samples gradients at an activation layer…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: Conference on Neural Information Processing Systems (NeurIPS) 2020

  46. arXiv:2008.08294  [pdf, other]

    cs.CV cs.RO

    TNT: Target-driveN Trajectory Prediction

    Authors: Hang Zhao, Jiyang Gao, Tian Lan, Chen Sun, Benjamin Sapp, Balakrishnan Varadarajan, Yue Shen, Yi Shen, Yuning Chai, Cordelia Schmid, Congcong Li, Dragomir Anguelov

    Abstract: Predicting the future behavior of moving agents is essential for real world applications. It is challenging as the intent of the agent and the corresponding behavior is unknown and intrinsically multimodal. Our key insight is that for prediction within a moderate time horizon, the future modes can be effectively captured by a set of target states. This leads to our target-driven trajectory predict…

    Submitted 21 August, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

  47. arXiv:2008.07725  [pdf, other]

    cs.CV

    SoDA: Multi-Object Tracking with Soft Data Association

    Authors: Wei-Chih Hung, Henrik Kretzschmar, Tsung-Yi Lin, Yuning Chai, Ruichi Yu, Ming-Hsuan Yang, Dragomir Anguelov

    Abstract: Robust multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars. Tracking objects, however, remains a highly challenging problem, especially in cluttered autonomous driving scenes in which objects tend to interact with each other in complex ways and frequently get occluded. We propose a novel approach to MOT that uses attention to compute track embeddings that encode…

    Submitted 19 August, 2020; v1 submitted 17 August, 2020; originally announced August 2020.

  48. arXiv:2005.09927  [pdf, other]

    cs.CV cs.LG cs.RO

    Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection

    Authors: Alex Bewley, Pei Sun, Thomas Mensink, Dragomir Anguelov, Cristian Sminchisescu

    Abstract: This paper presents a novel 3D object detection framework that processes LiDAR data directly on its native representation: range images. Benefiting from the compactness of range images, 2D convolutions can efficiently process dense LiDAR data of a scene. To overcome scale sensitivity in this perspective view, a novel range-conditioned dilation (RCD) layer is proposed to dynamically adjust a contin…

    Submitted 22 January, 2021; v1 submitted 20 May, 2020; originally announced May 2020.

    Comments: CoRL 2020

  49. arXiv:2005.04259  [pdf, other]

    cs.CV cs.LG stat.ML

    VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation

    Authors: Jiyang Gao, Chen Sun, Hang Zhao, Yi Shen, Dragomir Anguelov, Congcong Li, Cordelia Schmid

    Abstract: Behavior prediction in dynamic, multi-agent systems is an important problem in the context of self-driving cars, due to the complex representations and interactions of road components, including moving agents (e.g. pedestrians and vehicles) and road context information (e.g. lanes, traffic lights). This paper introduces VectorNet, a hierarchical graph neural network that first exploits the spatial…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: CVPR 2020

  50. arXiv:2005.04255  [pdf, other]

    cs.CV

    STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction

    Authors: Zhishuai Zhang, Jiyang Gao, Junhua Mao, Yukai Liu, Dragomir Anguelov, Congcong Li

    Abstract: Detecting pedestrians and predicting future trajectories for them are critical tasks for numerous applications, such as autonomous driving. Previous methods either treat the detection and prediction as separate tasks or simply add a trajectory regression head on top of a detector. In this work, we present a novel end-to-end two-stage network: Spatio-Temporal-Interactive Network (STINet). In additi…

    Submitted 8 May, 2020; originally announced May 2020.

    Journal ref: CVPR 2020