-
Gaussian Latent Representations for Uncertainty Estimation using Mahalanobis Distance in Deep Classifiers
Authors:
Aishwarya Venkataramanan,
Assia Benbihi,
Martin Laviale,
Cedric Pradalier
Abstract:
Recent works show that the data distribution in a network's latent space is useful for estimating classification uncertainty and detecting Out-of-distribution (OOD) samples. To obtain a well-regularized latent space that is conducive for uncertainty estimation, existing methods bring in significant changes to model architectures and training procedures. In this paper, we present a lightweight, fas…
▽ More
Recent works show that the data distribution in a network's latent space is useful for estimating classification uncertainty and detecting Out-of-distribution (OOD) samples. To obtain a well-regularized latent space that is conducive for uncertainty estimation, existing methods bring in significant changes to model architectures and training procedures. In this paper, we present a lightweight, fast, and high-performance regularization method for Mahalanobis distance-based uncertainty prediction, and that requires minimal changes to the network's architecture. To derive Gaussian latent representation favourable for Mahalanobis Distance calculation, we introduce a self-supervised representation learning method that separates in-class representations into multiple Gaussians. Classes with non-Gaussian representations are automatically identified and dynamically clustered into multiple new classes that are approximately Gaussian. Evaluation on standard OOD benchmarks shows that our method achieves state-of-the-art results on OOD detection with minimal inference time, and is very competitive on predictive probability calibration. Finally, we show the applicability of our method to a real-life computer vision use case on microorganism classification.
△ Less
Submitted 29 September, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Object-Guided Day-Night Visual Localization in Urban Scenes
Authors:
Assia Benbihi,
Cédric Pradalier,
Ondřej Chum
Abstract:
We introduce Object-Guided Localization (OGuL) based on a novel method of local-feature matching. Direct matching of local features is sensitive to significant changes in illumination. In contrast, object detection often survives severe changes in lighting conditions. The proposed method first detects semantic objects and establishes correspondences of those objects between images. Object correspo…
▽ More
We introduce Object-Guided Localization (OGuL) based on a novel method of local-feature matching. Direct matching of local features is sensitive to significant changes in illumination. In contrast, object detection often survives severe changes in lighting conditions. The proposed method first detects semantic objects and establishes correspondences of those objects between images. Object correspondences provide local coarse alignment of the images in the form of a planar homography. These homographies are consequently used to guide the matching of local features. Experiments on standard urban localization datasets (Aachen, Extended-CMU-Season, RobotCar-Season) show that OGuL significantly improves localization results with as simple local features as SIFT, and its performance competes with the state-of-the-art CNN-based methods trained for day-to-night localization.
△ Less
Submitted 9 February, 2022;
originally announced February 2022.
-
Image-Based Place Recognition on Bucolic Environment Across Seasons From Semantic Edge Description
Authors:
Assia Benbihi,
Stéphanie Aravecchia,
Matthieu Geist,
Cédric Pradalier
Abstract:
Most of the research effort on image-based place recognition is designed for urban environments. In bucolic environments such as natural scenes with low texture and little semantic content, the main challenge is to handle the variations in visual appearance across time such as illumination, weather, vegetation state or viewpoints. The nature of the variations is different and this leads to a diffe…
▽ More
Most of the research effort on image-based place recognition is designed for urban environments. In bucolic environments such as natural scenes with low texture and little semantic content, the main challenge is to handle the variations in visual appearance across time such as illumination, weather, vegetation state or viewpoints. The nature of the variations is different and this leads to a different approach to describing a bucolic scene. We introduce a global image descriptor computed from its semantic and topological information. It is built from the wavelet transforms of the image semantic edges. Matching two images is then equivalent to matching their semantic edge descriptors. We show that this method reaches state-of-the-art image retrieval performance on two multi-season environment-monitoring datasets: the CMU-Seasons and the Symphony Lake dataset. It also generalises to urban scenes on which it is on par with the current baselines NetVLAD and DELF.
△ Less
Submitted 1 April, 2021; v1 submitted 28 October, 2019;
originally announced October 2019.
-
ELF: Embedded Localisation of Features in pre-trained CNN
Authors:
Assia Benbihi,
Matthieu Geist,
Cédric Pradalier
Abstract:
This paper introduces a novel feature detector based only on information embedded inside a CNN trained on standard tasks (e.g. classification). While previous works already show that the features of a trained CNN are suitable descriptors, we show here how to extract the feature locations from the network to build a detector. This information is computed from the gradient of the feature map with re…
▽ More
This paper introduces a novel feature detector based only on information embedded inside a CNN trained on standard tasks (e.g. classification). While previous works already show that the features of a trained CNN are suitable descriptors, we show here how to extract the feature locations from the network to build a detector. This information is computed from the gradient of the feature map with respect to the input image. This provides a saliency map with local maxima on relevant keypoint locations. Contrary to recent CNN-based detectors, this method requires neither supervised training nor finetuning. We evaluate how repeatable and how matchable the detected keypoints are with the repeatability and matching scores. Matchability is measured with a simple descriptor introduced for the sake of the evaluation. This novel detector reaches similar performances on the standard evaluation HPatches dataset, as well as comparable robustness against illumination and viewpoint changes on Webcam and photo-tourism images. These results show that a CNN trained on a standard task embeds feature location information that is as relevant as when the CNN is specifically trained for feature detection.
△ Less
Submitted 7 July, 2019;
originally announced July 2019.
-
Semantic Nearest Neighbor Fields Monocular Edge Visual-Odometry
Authors:
Xiaolong Wu,
Assia Benbihi,
Antoine Richard,
Cedric Pradalier
Abstract:
Recent advances in deep learning for edge detection and segmentation opens up a new path for semantic-edge-based ego-motion estimation. In this work, we propose a robust monocular visual odometry (VO) framework using category-aware semantic edges. It can reconstruct large-scale semantic maps in challenging outdoor environments. The core of our approach is a semantic nearest neighbor field that fac…
▽ More
Recent advances in deep learning for edge detection and segmentation opens up a new path for semantic-edge-based ego-motion estimation. In this work, we propose a robust monocular visual odometry (VO) framework using category-aware semantic edges. It can reconstruct large-scale semantic maps in challenging outdoor environments. The core of our approach is a semantic nearest neighbor field that facilitates a robust data association of edges across frames using semantics. This significantly enlarges the convergence radius during tracking phases. The proposed edge registration method can be easily integrated into direct VO frameworks to estimate photometrically, geometrically, and semantically consistent camera motions. Different types of edges are evaluated and extensive experiments demonstrate that our proposed system outperforms state-of-art indirect, direct, and semantic monocular VO systems.
△ Less
Submitted 1 April, 2019;
originally announced April 2019.
-
Semi-Supervised Domain Adaptation with Representation Learning for Semantic Segmentation across Time
Authors:
Assia Benbihi,
Matthieu Geist,
Cédric Pradalier
Abstract:
Deep learning generates state-of-the-art semantic segmentation provided that a large number of images together with pixel-wise annotations are available. To alleviate the expensive data collection process, we propose a semi-supervised domain adaptation method for the specific case of images with similar semantic content but different pixel distributions. A network trained with supervision on a pas…
▽ More
Deep learning generates state-of-the-art semantic segmentation provided that a large number of images together with pixel-wise annotations are available. To alleviate the expensive data collection process, we propose a semi-supervised domain adaptation method for the specific case of images with similar semantic content but different pixel distributions. A network trained with supervision on a past dataset is finetuned on the new dataset to conserve its features maps. The domain adaptation becomes a simple regression between feature maps and does not require annotations on the new dataset. This method reaches performances similar to classic transfer learning on the PASCAL VOC dataset with synthetic transformations.
△ Less
Submitted 6 October, 2019; v1 submitted 10 May, 2018;
originally announced May 2018.