
Do More With What You Have: Transferring Depth-Scale from Labeled to Unlabeled Domains

Authors: Alexandra Dana, Nadav Carmel, Amit Shomer, Ofer Manela, Tomer Peleg

Abstract: Transferring the absolute depth prediction capabilities of an estimator to a new domain is a task with significant real-world applications. This task is specifically challenging when images from the new domain are collected without ground-truth depth measurements, and possibly with sensors of different intrinsics. To overcome such limitations, a recent zero-shot solution was trained on an extensive training dataset and encoded the various camera intrinsics. Other solutions generated synthetic data with depth labels that matched the intrinsics of the new target data to enable depth-scale transfer between the domains. In this work we present an alternative solution that can utilize any existing synthetic or real dataset, that has a small number of images annotated with ground truth depth labels. Specifically, we show that self-supervised depth estimators result in up-to-scale predictions that are linearly correlated to their absolute depth values across the domain, a property that we model in this work using a single scalar. In addition, aligning the field-of-view of two datasets prior to training, results in a common linear relationship for both domains. We use this observed property to transfer the depth-scale from source datasets that have absolute depth labels to new target datasets that lack these measurements, enabling absolute depth predictions in the target domain.
The suggested method was successfully demonstrated on the KITTI, DDAD and nuScenes datasets, while using other existing real or synthetic source datasets, that have a different field-of-view, other image style or structural content, achieving comparable or better accuracy than other existing methods that do not use target ground-truth depths.
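The single-scalar idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the scalar is fitted, so the median-ratio estimator, the function names `estimate_depth_scale` and `apply_depth_scale`, and the toy data below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def estimate_depth_scale(pred_up_to_scale, gt_depth, min_depth=1e-3):
    """Fit one scalar s so that s * pred approximates gt.

    The median of per-pixel ratios is an assumed estimator; the abstract
    only states that a single scalar models the linear relationship.
    """
    pred = np.asarray(pred_up_to_scale, dtype=np.float64)
    gt = np.asarray(gt_depth, dtype=np.float64)
    # Keep only pixels that carry a depth label and a positive prediction.
    valid = (gt > min_depth) & (pred > min_depth)
    if not valid.any():
        raise ValueError("no valid labeled pixels")
    return float(np.median(gt[valid] / pred[valid]))

def apply_depth_scale(pred_up_to_scale, scale):
    """Convert up-to-scale predictions to absolute depth with the scalar."""
    return scale * np.asarray(pred_up_to_scale, dtype=np.float64)

# Toy source domain: predictions are the true depths divided by a hidden scale.
rng = np.random.default_rng(0)
hidden_scale = 5.0
source_gt = rng.uniform(2.0, 80.0, size=(4, 8, 8))
source_pred = source_gt / hidden_scale
source_gt[:, ::2, :] = 0.0  # simulate sparse labels (0 means "no label")

scale = estimate_depth_scale(source_pred, source_gt)

# Toy target domain: no labels, so the source scalar is reused as-is.
target_pred = rng.uniform(0.5, 15.0, size=(8, 8))
target_abs = apply_depth_scale(target_pred, scale)
```

Reusing the source scalar on the target domain is only meaningful under the condition the abstract describes, namely that both domains share the same linear relationship after field-of-view alignment.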

Submitted 15 April, 2024; v1 submitted 14 March, 2023; originally announced March 2023.

arXiv:2212.05315 [pdf, other]

Mind The Edge: Refining Depth Edges in Sparsely-Supervised Monocular Depth Estimation

Authors: Lior Talker, Aviad Cohen, Erez Yosef, Alexandra Dana, Michael Dinerstein

Abstract: Monocular Depth Estimation (MDE) is a fundamental problem in computer vision with numerous applications. Recently, LIDAR-supervised methods have achieved remarkable per-pixel depth accuracy in outdoor scenes. However, significant errors are typically found in the proximity of depth discontinuities, i.e., depth edges, which often hinder the performance of depth-dependent applications that are sensitive to such inaccuracies, e.g., novel view synthesis and augmented reality. Since direct supervision for the location of depth edges is typically unavailable in sparse LIDAR-based scenes, encouraging the MDE model to produce correct depth edges is not straightforward. To the best of our knowledge this paper is the first attempt to address the depth edges issue for LIDAR-supervised scenes. In this work we propose to learn to detect the location of depth edges from densely-supervised synthetic data, and use it to generate supervision for the depth edges in the MDE training. To quantitatively evaluate our approach, and due to the lack of depth edges GT in LIDAR-based scenes, we manually annotated subsets of the KITTI and the DDAD datasets with depth edges ground truth. We demonstrate significant gains in the accuracy of the depth edges with comparable per-pixel depth accuracy on several challenging datasets. Code and datasets are available at https://github.com/liortalker/MindTheEdge.

Submitted 3 April, 2024; v1 submitted 10 December, 2022; originally announced December 2022.

Comments: Appears in CVPR '24

arXiv:2107.10050 [pdf, ps, other]

You Better Look Twice: a new perspective for designing accurate detectors with reduced computations

Authors: Alexandra Dana, Maor Shutman, Yotam Perlitz, Ran Vitek, Tomer Peleg, Roy J Jevnisek

Abstract: General object detectors use powerful backbones that uniformly extract features from images for enabling detection of a vast amount of object types. However, utilization of such backbones in object detection applications developed for specific object types can unnecessarily over-process an extensive amount of background. In addition, they are agnostic to object scales, thus redundantly process all image regions at the same resolution. In this work we introduce BLT-net, a new low-computation two-stage object detection architecture designed to process images with a significant amount of background and objects of variate scales. BLT-net reduces computations by separating objects from background using a very lite first-stage. BLT-net then efficiently merges obtained proposals to further decrease processed background and then dynamically reduces their resolution to minimize computations. Resulting image proposals are then processed in the second-stage by a highly accurate model. We demonstrate our architecture on the pedestrian detection problem, where objects are of different sizes, images are of high resolution and object detection is required to run in real-time. We show that our design reduces computations by a factor of x4-x7 on the Citypersons and Caltech datasets with respect to leading pedestrian detectors, on account of a small accuracy degradation. This method can be applied on other object detection applications in scenes with a considerable amount of background and variate object sizes to reduce computations.

Submitted 3 August, 2021; v1 submitted 21 July, 2021; originally announced July 2021.

arXiv:2101.04824 [pdf, ps, other]

doi 10.1109/LSP.2021.3051522

Energy-Efficient Distributed Learning Algorithms for Coarsely Quantized Signals

Authors: A. Danaee, R. C. de Lamare, V. H. Nascimento

Abstract: In this work, we present an energy-efficient distributed learning framework using low-resolution ADCs and coarsely quantized signals for Internet of Things (IoT) networks. In particular, we develop a distributed quantization-aware least-mean square (DQA-LMS) algorithm that can learn parameters in an energy-efficient fashion using signals quantized with few bits while requiring a low computational cost. We also carry out a statistical analysis of the proposed DQA-LMS algorithm that includes a stability condition. Simulations assess the DQA-LMS algorithm against existing techniques for a distributed parameter estimation task where IoT devices operate in a peer-to-peer mode and demonstrate the effectiveness of the DQA-LMS algorithm.
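For orientation, the setting in this abstract (peer-to-peer nodes estimating a common parameter vector from coarsely quantized signals) can be sketched with a standard adapt-then-combine diffusion LMS fed by a uniform quantizer. This is explicitly not the DQA-LMS algorithm, whose quantization-aware update is not given in the abstract; the ring topology, the mid-rise quantizer, and every name and parameter value below are assumptions chosen for illustration.

```python
import numpy as np

def quantize(x, bits=4, clip=3.0):
    """Uniform mid-rise quantizer with 2**bits levels on [-clip, clip] (assumed model)."""
    step = 2.0 * clip / (2 ** bits)
    q = (np.floor(x / step) + 0.5) * step
    return np.clip(q, -clip + step / 2, clip - step / 2)

def diffusion_lms_quantized(num_nodes=10, dim=4, iters=3000, mu=0.01, bits=4, seed=0):
    """Plain adapt-then-combine diffusion LMS on quantized data; returns the MSD curve."""
    rng = np.random.default_rng(seed)
    w_true = rng.standard_normal(dim)
    w_true /= np.linalg.norm(w_true)  # unknown parameter vector, unit norm

    # Ring network: each node averages itself and its two neighbors.
    eye = np.eye(num_nodes)
    C = (eye + np.roll(eye, 1, axis=0) + np.roll(eye, -1, axis=0)) / 3.0

    W = np.zeros((num_nodes, dim))  # one estimate per node, one row each
    msd = np.empty(iters)
    for i in range(iters):
        X = rng.standard_normal((num_nodes, dim))               # regressors
        d = X @ w_true + 0.05 * rng.standard_normal(num_nodes)  # noisy measurements
        Xq, dq = quantize(X, bits), quantize(d, bits)           # coarse quantization

        err = dq - np.sum(W * Xq, axis=1)  # per-node a priori error
        Psi = W + mu * err[:, None] * Xq   # adapt step (standard LMS update)
        W = C @ Psi                        # combine step over neighbors

        # Mean-square deviation from the true vector, averaged over nodes.
        msd[i] = np.mean(np.sum((W - w_true) ** 2, axis=1))
    return msd

msd = diffusion_lms_quantized()
```

Because this baseline ignores the quantizer in its update, its estimates carry a small bias that grows as `bits` shrinks, which is the kind of degradation a quantization-aware design is meant to address.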

Submitted 12 January, 2021; originally announced January 2021.

Comments: 5 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2012.10939

arXiv:2012.10939 [pdf, ps, other]

Study of Energy-Efficient Distributed RLS-based Learning with Coarsely Quantized Signals

Authors: A. Danaee, R. C. de Lamare, V. H. Nascimento

Abstract: In this work, we present an energy-efficient distributed learning framework using coarsely quantized signals for Internet of Things (IoT) networks. In particular, we develop a distributed quantization-aware recursive least squares (DQA-RLS) algorithm that can learn parameters in an energy-efficient fashion using signals quantized with few bits while requiring a low computational cost. Numerical results assess the DQA-RLS algorithm against existing techniques for a distributed parameter estimation task where IoT devices operate in a peer-to-peer mode.

Submitted 20 December, 2020; originally announced December 2020.

Comments: 6 pages, 5 figures

Showing 1–5 of 5 results for author: Dana, A