Search | arXiv e-print repository

Terrain Classification Enhanced with Uncertainty for Space Exploration Robots from Proprioceptive Data

Authors: Mariela De Lucas Álvarez, Jichen Guo, Raul Domínguez, Matias Valdenegro-Toro

Abstract: Terrain Classification is an essential task in space exploration, where unpredictable environments are difficult to observe using only exteroceptive sensors such as vision. Implementing Neural Network classifiers can have high performance but can be deemed untrustworthy as they lack transparency, which makes them unreliable for taking high-stakes decisions during mission planning. We address this by proposing Neural Networks with Uncertainty Quantification in Terrain Classification. We enable our Neural Networks with Monte Carlo Dropout, DropConnect, and Flipout in time series-capable architectures using only proprioceptive data as input. We use Bayesian Optimization with Hyperband for efficient hyperparameter optimization to find optimal models for trustworthy terrain classification.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 6 pages, 4 figures. LatinX in AI Workshop @ ICML 2023 Camera Ready

arXiv:2406.18787 [pdf, other]

Unified Uncertainties: Combining Input, Data and Model Uncertainty into a Single Formulation

Authors: Matias Valdenegro-Toro, Ivo Pascal de Jong, Marco Zullich

Abstract: Modelling uncertainty in Machine Learning models is essential for achieving safe and reliable predictions. Most research on uncertainty focuses on output uncertainty (predictions), but minimal attention is paid to uncertainty at inputs. We propose a method for propagating uncertainty in the inputs through a Neural Network that is simultaneously able to estimate input, data, and model uncertainty. Our results show that this propagation of input uncertainty results in a more stable decision boundary even under large amounts of input noise than comparatively simple Monte Carlo sampling. Additionally, we discuss and demonstrate that input uncertainty, when propagated through the model, results in model uncertainty at the outputs. The explicit incorporation of input uncertainty may be beneficial in situations where the amount of input uncertainty is known, though good datasets for this are still needed.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 4 pages, 3 figures, with appendix. LatinX in AI Research Workshop @ ICML 2024 Camera Ready

arXiv:2405.02917 [pdf, other]

Overconfidence is Key: Verbalized Uncertainty Evaluation in Large Language and Vision-Language Models

Authors: Tobias Groot, Matias Valdenegro-Toro

Abstract: Language and Vision-Language Models (LLMs/VLMs) have revolutionized the field of AI by their ability to generate human-like text and understand images, but ensuring their reliability is crucial. This paper aims to evaluate the ability of LLMs (GPT4, GPT-3.5, LLaMA2, and PaLM 2) and VLMs (GPT4V and Gemini Pro Vision) to estimate their verbalized uncertainty via prompting. We propose the new Japanese Uncertain Scenes (JUS) dataset, aimed at testing VLM capabilities via difficult queries and object counting, and the Net Calibration Error (NCE) to measure direction of miscalibration. Results show that both LLMs and VLMs have a high calibration error and are overconfident most of the time, indicating a poor capability for uncertainty estimation. Additionally we develop prompts for regression tasks, and we show that VLMs have poor calibration when producing mean/standard deviation and 95% confidence intervals.

Submitted 5 May, 2024; originally announced May 2024.

Comments: 8 pages, with appendix. To appear in TrustNLP workshop @ NAACL 2024

arXiv:2404.05858 [pdf, other]

A Neuromorphic Approach to Obstacle Avoidance in Robot Manipulation

Authors: Ahmed Faisal Abdelrahman, Matias Valdenegro-Toro, Maren Bennewitz, Paul G. Plöger

Abstract: Neuromorphic computing mimics computational principles of the brain in $\textit{silico}$ and motivates research into event-based vision and spiking neural networks (SNNs). Event cameras (ECs) exclusively capture local intensity changes and offer superior power consumption, response latencies, and dynamic ranges. SNNs replicate biological neuronal dynamics and have demonstrated potential as alternatives to conventional artificial neural networks (ANNs), such as in reducing energy expenditure and inference time in visual classification. Nevertheless, these novel paradigms remain scarcely explored outside the domain of aerial robots. To investigate the utility of brain-inspired sensing and data processing, we developed a neuromorphic approach to obstacle avoidance on a camera-equipped manipulator. Our approach adapts high-level trajectory plans with reactive maneuvers by processing emulated event data in a convolutional SNN, decoding neural activations into avoidance motions, and adjusting plans using a dynamic motion primitive. We conducted experiments with a Kinova Gen3 arm performing simple reaching tasks that involve obstacles in sets of distinct task scenarios and in comparison to a non-adaptive baseline. Our neuromorphic approach facilitated reliable avoidance of imminent collisions in simulated and real-world experiments, where the baseline consistently failed. Trajectory adaptations had low impacts on safety and predictability criteria. Among the notable SNN properties were the correlation of computations with the magnitude of perceived motions and a robustness to different event emulation methods. Tests with a DAVIS346 EC showed similar performance, validating our experimental event emulation. Our results motivate incorporating SNN learning, utilizing neuromorphic processors, and further exploring the potential of neuromorphic methods.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 35 pages, accepted at IJRR, authors' version

arXiv:2403.17224 [pdf, other]

Uncertainty Quantification for Gradient-based Explanations in Neural Networks

Authors: Mihir Mulye, Matias Valdenegro-Toro

Abstract: Explanation methods help understand the reasons for a model's prediction. These methods are increasingly involved in model debugging, performance optimization, and gaining insights into the workings of a model. With such critical applications of these methods, it is imperative to measure the uncertainty associated with the explanations generated by these methods. In this paper, we propose a pipeline to ascertain the explanation uncertainty of neural networks by combining uncertainty estimation methods and explanation methods. We use this pipeline to produce explanation distributions for the CIFAR-10, FER+, and California Housing datasets. By computing the coefficient of variation of these distributions, we evaluate the confidence in the explanation and determine that the explanations generated using Guided Backpropagation have low uncertainty associated with them. Additionally, we compute modified pixel insertion/deletion metrics to evaluate the quality of the generated explanations.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 24 pages, 11 figures
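
The coefficient-of-variation step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `coefficient_of_variation`, the use of the population standard deviation, and the small `eps` stabilizer are all assumptions made for illustration. It takes several attribution vectors for one input (for example, one per stochastic forward pass) and returns a per-feature ratio of standard deviation to absolute mean, where lower values suggest a more stable explanation.

```python
import statistics


def coefficient_of_variation(samples, eps=1e-12):
    """Per-feature coefficient of variation over sampled explanations.

    samples: list of attribution vectors of equal length, one vector per
             sampled explanation (hypothetical input format).
    eps:     small constant (an assumption) to avoid division by zero
             when the mean attribution is exactly zero.
    """
    n_features = len(samples[0])
    cv = []
    for j in range(n_features):
        values = [s[j] for s in samples]
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)  # population std; sample std is an equally valid choice
        cv.append(std / (abs(mean) + eps))
    return cv


# Two sampled explanations over two features: the first feature is stable,
# the second varies between samples.
cv = coefficient_of_variation([[1.0, 2.0], [1.0, 4.0]])
print(cv)
```

A per-feature map like this can be averaged into a single score per input if one number is needed; whether the paper aggregates this way is not stated in the abstract.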

arXiv:2403.17212 [pdf, other]

Sanity Checks for Explanation Uncertainty

Authors: Matias Valdenegro-Toro, Mihir Mulye

Abstract: Explanations for machine learning models can be hard to interpret or be wrong. Combining an explanation method with an uncertainty estimation method produces explanation uncertainty. Evaluating explanation uncertainty is difficult. In this paper we propose sanity checks for uncertainty explanation methods, where weight and data randomization tests are defined for explanations with uncertainty, allowing for quick tests of combinations of uncertainty and explanation methods. We experimentally show the validity and effectiveness of these tests on the CIFAR10 and California Housing datasets, noting that Ensembles seem to consistently pass both tests with Guided Backpropagation, Integrated Gradients, and LIME explanations.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 15 pages, 7 figures, 3 tables

arXiv:2403.15431 [pdf, other]

Transferring BCI models from calibration to control: Observing shifts in EEG features

Authors: Ivo Pascal de Jong, Lüke Luna van den Wittenboer, Matias Valdenegro-Toro, Andreea Ioana Sburlea

Abstract: Public Motor Imagery-based brain-computer interface (BCI) datasets are being used to develop increasingly good classifiers. However, they usually follow discrete paradigms where participants perform Motor Imagery at regularly timed intervals. It is often unclear what changes may happen in the EEG patterns when users attempt to perform a control task with such a BCI. This may lead to generalisation errors. We demonstrate a new paradigm containing a standard calibration session and a novel BCI control session based on EMG. This allows us to observe similarities in sensorimotor rhythms, and observe the additional preparation effects introduced by the control paradigm. In the Movement Related Cortical Potentials we found large differences between the calibration and control sessions. We demonstrate a CSP-based Machine Learning model trained on the calibration data that can make surprisingly good predictions on the BCI-controlled driving data.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.09228 [pdf, other]

Uncertainty Quantification for cross-subject Motor Imagery classification

Authors: Prithviraj Manivannan, Ivo Pascal de Jong, Matias Valdenegro-Toro, Andreea Ioana Sburlea

Abstract: Uncertainty Quantification aims to determine when the prediction from a Machine Learning model is likely to be wrong. Computer Vision research has explored methods for determining epistemic uncertainty (also known as model uncertainty), which should correspond with generalisation error. These methods theoretically make it possible to predict misclassifications due to inter-subject variability. We applied a variety of Uncertainty Quantification methods to predict misclassifications for a Motor Imagery Brain Computer Interface. Deep Ensembles performed best, both in terms of classification performance and cross-subject Uncertainty Quantification performance. However, we found that standard CNNs with Softmax output performed better than some of the more advanced methods.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2401.06583 [pdf, other]

Mapping Transformer Leveraged Embeddings for Cross-Lingual Document Representation

Authors: Tsegaye Misikir Tashu, Eduard-Raul Kontos, Matthia Sabatelli, Matias Valdenegro-Toro

Abstract: Recommendation systems, for documents, have become tools to find relevant content on the Web. However, these systems have limitations when it comes to recommending documents in languages different from the query language, which means they might overlook resources in non-native languages. This research focuses on representing documents across languages by using Transformer Leveraged Document Representations (TLDRs) that are mapped to a cross-lingual domain. Four multilingual pre-trained transformer models (mBERT, mT5, XLM RoBERTa, ErnieM) were evaluated using three mapping methods across 20 language pairs representing combinations of five selected languages of the European Union. Metrics like Mate Retrieval Rate and Reciprocal Rank were used to measure the effectiveness of mapped TLDRs compared to non-mapped ones. The results highlight the power of cross-lingual representations achieved through pre-trained transformers and mapping approaches suggesting a promising direction for expanding beyond language connections, between two specific languages.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2312.09454 [pdf, other]

Uncertainty Quantification in Machine Learning for Biosignal Applications -- A Review

Authors: Ivo Pascal de Jong, Andreea Ioana Sburlea, Matias Valdenegro-Toro

Abstract: Uncertainty Quantification (UQ) has gained traction in an attempt to fix the black-box nature of Deep Learning. Specifically (medical) biosignals such as electroencephalography (EEG), electrocardiography (ECG), electrooculography (EOG) and electromyography (EMG) could benefit from good UQ, since these suffer from a poor signal to noise ratio, and good human interpretability is pivotal for medical applications and Brain Computer Interfaces. In this paper, we review the state of the art at the intersection of Uncertainty Quantification and Biosignal with Machine Learning. We present various methods, shortcomings, uncertainty measures and theoretical frameworks that currently exist in this application domain. Overall it can be concluded that promising UQ methods are available, but that research is needed on how people and systems may interact with an uncertainty model in a (clinical) environment.

Submitted 15 November, 2023; originally announced December 2023.

Comments: 26 pages, 13 figures, 3 tables

arXiv:2311.06427 [pdf, other]

ChatGPT Prompting Cannot Estimate Predictive Uncertainty in High-Resource Languages

Authors: Martino Pelucchi, Matias Valdenegro-Toro

Abstract: ChatGPT took the world by storm for its impressive abilities. Due to its release without documentation, scientists immediately attempted to identify its limits, mainly through its performance in natural language processing (NLP) tasks. This paper aims to join the growing literature regarding ChatGPT's abilities by focusing on its performance in high-resource languages and on its capacity to predict its answers' accuracy by giving a confidence level. The analysis of high-resource languages is of interest as studies have shown that low-resource languages perform worse than English in NLP tasks, but no study so far has analysed whether high-resource languages perform as well as English. The analysis of ChatGPT's confidence calibration has not been carried out before either and is critical to learn about ChatGPT's trustworthiness. In order to study these two aspects, five high-resource languages and two NLP tasks were chosen. ChatGPT was asked to perform both tasks in the five languages and to give a numerical confidence value for each answer. The results show that all the selected high-resource languages perform similarly and that ChatGPT does not have a good confidence calibration, often being overconfident and never giving low confidence values.

Submitted 10 November, 2023; originally announced November 2023.

Comments: 14 pages, 4 figures, with appendix

arXiv:2306.02424 [pdf, other]

Sanity Checks for Saliency Methods Explaining Object Detectors

Authors: Deepan Chakravarthi Padmanabhan, Paul G. Plöger, Octavio Arriaga, Matias Valdenegro-Toro

Abstract: Saliency methods are frequently used to explain Deep Neural Network-based models. Adebayo et al.'s work on evaluating saliency methods for classification models illustrates that certain explanation methods fail the model and data randomization tests. However, on extending the tests for various state of the art object detectors we illustrate that the ability to explain a model is more dependent on the model itself than the explanation method. We perform sanity checks for object detection and define new qualitative criteria to evaluate the saliency explanations, both for object classification and bounding box decisions, using Guided Backpropagation, Integrated Gradients, and their Smoothgrad versions, together with Faster R-CNN, SSD, and EfficientDet-D0, trained on COCO. In addition, the sensitivity of the explanation method to model parameters and data labels varies class-wise, motivating sanity checks for each class. We find that EfficientDet-D0 is the most interpretable method independent of the saliency method, which passes the sanity checks with few problems.

Submitted 4 June, 2023; originally announced June 2023.

Comments: 18 pages, 10 figures, 1st World Conference on eXplainable Artificial Intelligence camera ready

arXiv:2212.11409 [pdf, other]

DExT: Detector Explanation Toolkit

Authors: Deepan Chakravarthi Padmanabhan, Paul G. Plöger, Octavio Arriaga, Matias Valdenegro-Toro

Abstract: State-of-the-art object detectors are treated as black boxes due to their highly non-linear internal computations. Even with unprecedented advancements in detector performance, the inability to explain how their outputs are generated limits their use in safety-critical applications. Previous work fails to produce explanations for both bounding box and classification decisions, and generally makes individual explanations for various detectors. In this paper, we propose an open-source Detector Explanation Toolkit (DExT) which implements the proposed approach to generate a holistic explanation for all detector decisions using certain gradient-based explanation methods. We suggest various multi-object visualization methods to merge the explanations of multiple objects detected in an image as well as the corresponding detections in a single image. The quantitative evaluation shows that the Single Shot MultiBox Detector (SSD) is more faithfully explained compared to other detectors regardless of the explanation methods. Both quantitative and human-centric evaluations identify that SmoothGrad with Guided Backpropagation (GBP) provides more trustworthy explanations among selected methods across all detectors. We expect that DExT will motivate practitioners to evaluate object detectors from the interpretability perspective by explaining both bounding box and classification decisions.

Submitted 4 June, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

Comments: 24 pages, with appendix. 1st World Conference on eXplainable Artificial Intelligence camera ready

arXiv:2211.06250 [pdf, other]

Disentangled Uncertainty and Out of Distribution Detection in Medical Generative Models

Authors: Kumud Lakara, Matias Valdenegro-Toro

Abstract: Trusting the predictions of deep learning models in safety critical settings such as the medical domain is still not a viable option. Disentangled uncertainty quantification in the field of medical imaging has received little attention. In this paper, we study disentangled uncertainties in image to image translation tasks in the medical domain. We compare multiple uncertainty quantification methods, namely Ensembles, Flipout, Dropout, and DropConnect, while using CycleGAN to convert T1-weighted brain MRI scans to T2-weighted brain MRI scans. We further evaluate uncertainty behavior in the presence of out of distribution data (Brain CT and RGB Face Images), showing that epistemic uncertainty can be used to detect out of distribution inputs, which should increase reliability of model outputs.

Submitted 11 November, 2022; originally announced November 2022.

Comments: 3.5 pages, with appendix. Medical Imaging Meets NeurIPS Workshop 2022 Camera Ready

arXiv:2211.06241 [pdf, other]

A Benchmark for Out of Distribution Detection in Point Cloud 3D Semantic Segmentation

Authors: Lokesh Veeramacheneni, Matias Valdenegro-Toro

Abstract: Safety-critical applications like autonomous driving use Deep Neural Networks (DNNs) for object detection and segmentation. The DNNs fail to predict when they observe an Out-of-Distribution (OOD) input leading to catastrophic consequences. Existing OOD detection methods were extensively studied for image inputs but have not been explored much for LiDAR inputs. So in this study, we proposed two datasets for benchmarking OOD detection in 3D semantic segmentation. We used Maximum Softmax Probability and Entropy scores generated using Deep Ensembles and Flipout versions of RandLA-Net as OOD scores. We observed that Deep Ensembles outperform the Flipout model in OOD detection with greater AUROC scores for both datasets.

Submitted 11 November, 2022; originally announced November 2022.

Comments: 4 pages, Robot Learning Workshop @ NeurIPS 2022
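
The two OOD scores named in the abstract above can be sketched for a single point as follows. This is an illustrative sketch only, not the benchmark's code: the helper names (`ensemble_mean_probs`, `msp_score`, `entropy_score`) are hypothetical, averaging member softmax outputs is the usual Deep Ensembles convention and is assumed here, and the MSP score is written as one minus the maximum probability so that, like entropy, higher means more likely out of distribution.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_mean_probs(member_logits):
    """Average the softmax outputs of all ensemble members (assumed convention)."""
    probs = [softmax(logits) for logits in member_logits]
    n_members = len(probs)
    n_classes = len(probs[0])
    return [sum(p[k] for p in probs) / n_members for k in range(n_classes)]


def msp_score(probs):
    """One minus the maximum softmax probability; higher suggests OOD."""
    return 1.0 - max(probs)


def entropy_score(probs):
    """Shannon entropy (in nats) of the predictive distribution; higher suggests OOD."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Members that agree give low scores; members that disagree give high scores.
agree = ensemble_mean_probs([[10.0, 0.0, 0.0], [9.0, 0.0, 0.0]])
disagree = ensemble_mean_probs([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]])
print(msp_score(agree), entropy_score(agree))
print(msp_score(disagree), entropy_score(disagree))
```

For segmentation, the same scores would be computed per point and then compared against OOD labels, for instance with AUROC as the abstract reports.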

arXiv:2211.06233 [pdf, other]

Comparison of Uncertainty Quantification with Deep Learning in Time Series Regression

Authors: Levente Foldesi, Matias Valdenegro-Toro

Abstract: Increasingly high-stakes decisions are made using neural networks in order to make predictions. Specifically, meteorologists and hedge funds apply these techniques to time series data. When it comes to prediction, there are certain limitations for machine learning models (such as lack of expressiveness, vulnerability to domain shifts and overconfidence) which can be solved using uncertainty estimation. There is a set of expectations regarding how uncertainty should "behave". For instance, a wider prediction horizon should lead to more uncertainty or the model's confidence should be proportional to its accuracy. In this paper, different uncertainty estimation methods are compared to forecast meteorological time series data and evaluate these expectations. The results show how each uncertainty estimation method performs on the forecasting task, which partially evaluates the robustness of predicted uncertainty.

Submitted 11 November, 2022; originally announced November 2022.

Comments: 5 pages, with appendix. RobustSeq @ NeurIPS 2022 Camera Ready

arXiv:2209.03032 [pdf, other]

Machine Learning Students Overfit to Overfitting

Authors: Matias Valdenegro-Toro, Matthia Sabatelli

Abstract: Overfitting and generalization is an important concept in Machine Learning as only models that generalize are interesting for general applications. Yet some students have trouble learning this important concept through lectures and exercises. In this paper we describe common examples of students misunderstanding overfitting, and provide recommendations for possible solutions. We cover student misconceptions about overfitting, about solutions to overfitting, and implementation mistakes that are commonly confused with overfitting issues. We expect that our paper can contribute to improving student understanding and lectures about this important topic.

Submitted 7 September, 2022; originally announced September 2022.

Comments: 5 pages, with appendix, TeachML workshop @ ECML 2022

arXiv:2204.09323 [pdf, other]

Self-supervised Learning for Sonar Image Classification

Authors: Alan Preciado-Grijalva, Bilal Wehbe, Miguel Bande Firvida, Matias Valdenegro-Toro

Abstract: Self-supervised learning has proved to be a powerful approach to learn image representations without the need of large labeled datasets. For underwater robotics, it is of great interest to design computer vision algorithms to improve perception capabilities such as sonar image classification. Due to the confidential nature of sonar imaging and the difficulty to interpret sonar images, it is challenging to create public large labeled sonar datasets to train supervised learning algorithms. In this work, we investigate the potential of three self-supervised learning methods (RotNet, Denoising Autoencoders, and Jigsaw) to learn high-quality sonar image representation without the need of human labels. We present pre-training and transfer learning results on real-life sonar image datasets. Our results indicate that self-supervised pre-training yields classification performance comparable to supervised pre-training in a few-shot transfer learning setup across all three methods. Code and self-supervised pre-trained models are available at https://github.com/agrija9/ssl-sonar-images

Submitted 20 April, 2022; originally announced April 2022.

Comments: 8 pages, 10 figures, with supplementary. LatinX in CV Workshop @ CVPR 2022 Camera Ready

arXiv:2204.09308 [pdf, other]

A Deeper Look into Aleatoric and Epistemic Uncertainty Disentanglement

Authors: Matias Valdenegro-Toro, Daniel Saromo

Abstract: Neural networks are ubiquitous in many tasks, but trusting their predictions is an open issue. Uncertainty quantification is required for many applications, and disentangled aleatoric and epistemic uncertainties are best. In this paper, we generalize methods to produce disentangled uncertainties to work with different uncertainty quantification methods, and evaluate their capability to produce disentangled uncertainties. Our results show that: there is an interaction between learning aleatoric and epistemic uncertainty, which is unexpected and violates assumptions on aleatoric uncertainty, some methods like Flipout produce zero epistemic uncertainty, aleatoric uncertainty is unreliable in the out-of-distribution setting, and Ensembles provide overall the best disentangling quality. We also explore the error produced by the number of samples hyper-parameter in the sampling softmax function, recommending N > 100 samples. We expect that our formulation and results help practitioners and researchers choose uncertainty methods and expand the use of disentangled uncertainties, as well as motivate additional research into this topic.

Submitted 20 April, 2022; originally announced April 2022.

Comments: 8 pages, 12 figures, with supplementary. LatinX in CV Workshop @ CVPR 2022 Camera Ready

arXiv:2112.03164 [pdf, other]

Feature Disentanglement of Robot Trajectories

Authors: Matias Valdenegro-Toro, Daniel Harnack, Hendrik Wöhrle

Abstract: Modeling trajectories generated by robot joints is complex and required for high level activities like trajectory generation, clustering, and classification. Disentangled representation learning promises advances in unsupervised learning, but it has not been evaluated on robot-generated trajectories. In this paper we evaluate three disentangling VAEs ($β$-VAE, Decorr VAE, and a new $β$-Decorr VAE) on a dataset of 1M robot trajectories generated from a 3 DoF robot arm. We find that the decorrelation-based formulations perform the best in terms of disentangling metrics, trajectory quality, and correlation with ground truth latent features. We expect that these results increase the use of unsupervised learning in robot control.

Submitted 6 December, 2021; originally announced December 2021.

Comments: 5 pages, 3 figures, 1 table, with supplementary

arXiv:2112.02694 [pdf, other]

Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning

Authors: Aaqib Parvez Mohammed, Matias Valdenegro-Toro

Abstract: Reinforcement Learning (RL) based solutions are being adopted in a variety of domains including robotics, health care and industrial automation. Most focus is given to when these solutions work well, but they fail when presented with out of distribution inputs. RL policies share the same faults as most machine learning models. Out of distribution detection for RL is generally not well covered in the literature, and there is a lack of benchmarks for this task. In this work we propose a benchmark to evaluate OOD detection methods in a Reinforcement Learning setting, by modifying the physical parameters of non-visual standard environments or corrupting the state observation for visual environments. We discuss ways to generate custom RL environments that can produce OOD data, and evaluate three uncertainty methods for the OOD detection task. Our results show that ensemble methods have the best OOD detection performance with a lower standard deviation across multiple environments.

Submitted 5 December, 2021; originally announced December 2021.

Comments: 9 pages, 5 figures, 5 tables, Bayesian Deep Learning Workshop @ NeurIPS 2021

arXiv:2111.09808 [pdf, other]

Exploring the Limits of Epistemic Uncertainty Quantification in Low-Shot Settings

Authors: Matias Valdenegro-Toro

Abstract: Uncertainty quantification in neural networks promises to increase safety of AI systems, but it is not clear how performance might vary with the training set size. In this paper we evaluate seven uncertainty methods on Fashion MNIST and CIFAR10, as we sub-sample and produce varied training set sizes. We find that calibration error and out of distribution detection performance strongly depend on the training set size, with most methods being miscalibrated on the test set with small training sets. Gradient-based methods seem to poorly estimate epistemic uncertainty and are the most affected by training set size. We expect our results can guide future research into uncertainty quantification and help practitioners select methods based on their particular available data.

Submitted 18 November, 2021; originally announced November 2021.

Comments: 7 pages, 3 figures, with supplementary material. LatinX in AI Research Workshop @ NeurIPS 2021

arXiv:2109.13789 [pdf, other]

doi 10.5220/0011612900003417

The VVAD-LRS3 Dataset for Visual Voice Activity Detection

Authors: Adrian Lubitz, Matias Valdenegro-Toro, Frank Kirchner

Abstract: Robots are becoming everyday devices, increasing their interaction with humans. To make human-machine interaction more natural, cognitive features like Visual Voice Activity Detection (VVAD), which can detect whether a person is speaking or not, given visual input of a camera, need to be implemented. Neural networks are state of the art for tasks in Image Processing, Time Series Prediction, Natural Language Processing and other domains. Those Networks require large quantities of labeled data. Currently there are not many datasets for the task of VVAD. In this work we created a large scale dataset called the VVAD-LRS3 dataset, derived by automatic annotations from the LRS3 dataset. The VVAD-LRS3 dataset contains over 44K samples, over three times the next competitive dataset (WildVVAD). We evaluate different baselines on four kinds of features: facial and lip images, and facial and lip landmark features. With a Convolutional Neural Network Long Short Term Memory (CNN LSTM) on facial images an accuracy of 92% was reached on the test set. A study with humans showed that they reach an accuracy of 87.93% on the test set.

Submitted 28 September, 2021; originally announced September 2021.

arXiv:2108.08712 [pdf, other]

Teaching Uncertainty Quantification in Machine Learning through Use Cases

Authors: Matias Valdenegro-Toro

Abstract: Uncertainty in machine learning is not generally taught as general knowledge in Machine Learning course curricula. In this paper we propose a short curriculum for a course about uncertainty in machine learning, and complement the course with a selection of use cases, aimed to trigger discussion and let students play with the concepts of uncertainty in a programming setting. Our use cases cover the concept of output uncertainty, Bayesian neural networks and weight distributions, sources of uncertainty, and out of distribution detection. We expect that this curriculum and set of use cases motivates the community to adopt these important concepts into courses for safety in AI.

Submitted 19 August, 2021; originally announced August 2021.

Comments: 2nd Teaching in Machine Learning Workshop, Camera Ready, 5 pages, 3 figures

arXiv:2108.06800 [pdf, other]

The Marine Debris Dataset for Forward-Looking Sonar Semantic Segmentation

Authors: Deepak Singh, Matias Valdenegro-Toro

Abstract: Accurate detection and segmentation of marine debris is important for keeping the water bodies clean. This paper presents a novel dataset for marine debris segmentation collected using a Forward Looking Sonar (FLS). The dataset consists of 1868 FLS images captured using an ARIS Explorer 3000 sensor. The objects used to produce this dataset contain typical house-hold marine debris and distractor marine objects (tires, hooks, valves, etc.), divided in 11 classes plus a background class. Performance of state of the art semantic segmentation architectures with a variety of encoders has been analyzed on this dataset and presented as baseline results. Since the images are grayscale, no pretrained weights have been used. Comparisons are made using Intersection over Union (IoU). The best performing model is Unet with ResNet34 backbone at 0.7481 mIoU. The dataset is available at https://github.com/mvaldenegro/marine-debris-fls-datasets/

Submitted 15 August, 2021; originally announced August 2021.

Comments: OceanVision 2021 ICCV Workshop, Camera Ready, 9 pages, 13 figures, 6 Tables

arXiv:2108.02665 [pdf, other]

Deep Reinforcement Learning for Continuous Docking Control of Autonomous Underwater Vehicles: A Benchmarking Study

Authors: Mihir Patil, Bilal Wehbe, Matias Valdenegro-Toro

Abstract: Docking control of an autonomous underwater vehicle (AUV) is a task that is integral to achieving persistent long term autonomy. This work explores the application of state-of-the-art model-free deep reinforcement learning (DRL) approaches to the task of AUV docking in the continuous domain. We provide a detailed formulation of the reward function, utilized to successfully dock the AUV onto a fixed docking platform. A major contribution that distinguishes our work from the previous approaches is the usage of a physics simulator to define and simulate the underwater environment as well as the DeepLeng AUV. We propose a new reward function formulation for the docking task, incorporating several components, that outperforms previous reward formulations. We evaluate proximal policy optimization (PPO), twin delayed deep deterministic policy gradients (TD3) and soft actor-critic (SAC) in combination with our reward function. Our evaluation yielded results that conclusively show the TD3 agent to be most efficient and consistent in terms of docking the AUV; over multiple evaluation runs it achieved a 100% success rate and episode return of 10667.1 ± 688.8. We also show how our reward function formulation improves over the state of the art.

Submitted 5 August, 2021; originally announced August 2021.

Comments: Global Oceans 2021 Camera ready, 7 pages, 11 figures

arXiv:2108.01111 [pdf, other]

Pre-trained Models for Sonar Images

Authors: Matias Valdenegro-Toro, Alan Preciado-Grijalva, Bilal Wehbe

Abstract: Machine learning and neural networks are now ubiquitous in sonar perception, but it lags behind the computer vision field due to the lack of data and pre-trained models specifically for sonar images. In this paper we present the Marine Debris Turntable dataset and produce pre-trained neural networks trained on this dataset, meant to fill the gap of missing pre-trained models for sonar images. We train Resnet 20, MobileNets, DenseNet121, SqueezeNet, MiniXception, and an Autoencoder, over several input image sizes, from 32 x 32 to 96 x 96, on the Marine Debris turntable dataset. We evaluate these models using transfer learning for low-shot classification in the Marine Debris Watertank and another dataset captured using a Gemini 720i sonar. Our results show that in both datasets the pre-trained models produce good features that allow good classification accuracy with low samples (10-30 samples per class). The Gemini dataset validates that the features transfer to other kinds of sonar sensors. We expect that the community benefits from the public release of our pre-trained models and the turntable dataset.

Submitted 2 August, 2021; originally announced August 2021.

Comments: Global Oceans 2021, Camera ready, 8 pages, 9 figures

arXiv:2108.01066 [pdf, other]

Forward-Looking Sonar Patch Matching: Modern CNNs, Ensembling, and Uncertainty

Authors: Arka Mallick, Paul Plöger, Matias Valdenegro-Toro

Abstract: Applications of underwater robots are on the rise, most of them are dependent on sonar for underwater vision, but the lack of strong perception capabilities limits them in this task. An important issue in sonar perception is matching image patches, which can enable other techniques like localization, change detection, and mapping. There is a rich literature for this problem in color images, but for acoustic images, it is lacking, due to the physics that produce these images. In this paper we improve on our previous results for this problem (Valdenegro-Toro et al, 2017), instead of modeling features manually, a Convolutional Neural Network (CNN) learns a similarity function and predicts if two input sonar images are similar or not. With the objective of improving the sonar image matching problem further, three state of the art CNN architectures are evaluated on the Marine Debris dataset, namely DenseNet, and VGG, with a siamese or two-channel architecture, and contrastive loss. To ensure a fair evaluation of each network, thorough hyper-parameter optimization is executed. We find that the best performing models are DenseNet Two-Channel network with 0.955 AUC, VGG-Siamese with contrastive loss at 0.949 AUC and DenseNet Siamese with 0.921 AUC. By ensembling the top performing DenseNet two-channel and DenseNet-Siamese models, the overall highest prediction accuracy obtained is 0.978 AUC, showing a large improvement over the 0.91 AUC in the state of the art.

Submitted 2 August, 2021; originally announced August 2021.

Comments: Global Oceans 2021 Camera ready, 7 pages, 8 figures

arXiv:2104.08188 [pdf, other]

I Find Your Lack of Uncertainty in Computer Vision Disturbing

Authors: Matias Valdenegro-Toro

Abstract: Neural networks are used for many real world applications, but often they have problems estimating their own confidence. This is particularly problematic for computer vision applications aimed at making high stakes decisions with humans and their lives. In this paper we make a meta-analysis of the literature, showing that most if not all computer vision applications do not use proper epistemic uncertainty quantification, which means that these models ignore their own limitations. We describe the consequences of using models without proper uncertainty quantification, and motivate the community to adopt versions of the models they use that have proper calibrated epistemic uncertainty, in order to enable out of distribution detection. We close the paper with a summary of challenges on estimating uncertainty for computer vision applications and recommendations.

Submitted 16 April, 2021; originally announced April 2021.

Comments: LatinX in CV Workshop @ CVPR 2021, full paper track, camera ready

arXiv:2012.01281 [pdf, other]

Are Gradient-based Saliency Maps Useful in Deep Reinforcement Learning?

Authors: Matthias Rosynski, Frank Kirchner, Matias Valdenegro-Toro

Abstract: Deep Reinforcement Learning (DRL) connects the classic Reinforcement Learning algorithms with Deep Neural Networks. A problem in DRL is that CNNs are black-boxes and it is hard to understand the decision-making process of agents. In order to be able to use RL agents in highly dangerous environments for humans and machines, the developer needs a debugging tool to assure that the agent does what is expected. Currently, rewards are primarily used to interpret how well an agent is learning. However, this can lead to deceptive conclusions if the agent receives more rewards by memorizing a policy and not learning to respond to the environment. In this work, it is shown that this problem can be recognized with the help of gradient visualization techniques. This work brings some of the best-known visualization methods from the field of image classification to the area of Deep Reinforcement Learning. Furthermore, two new visualization techniques have been developed, one of which provides particularly good results. It is being proven to what extent the algorithms can be used in the area of Reinforcement learning. Also, the question arises on how well the DRL algorithms can be visualized across different environments with varying visualization techniques.

Submitted 2 December, 2020; originally announced December 2020.

Comments: 8 pages, with appendix, 30 pages in total

arXiv:2011.11461 [pdf, other]

Unsupervised Difficulty Estimation with Action Scores

Authors: Octavio Arriaga, Matias Valdenegro-Toro

Abstract: Evaluating difficulty and biases in machine learning models has become of extreme importance as current models are now being applied in real-world situations. In this paper we present a simple method for calculating a difficulty score based on the accumulation of losses for each sample during training. We call this the action score. Our proposed method does not require any modification of the model nor any external supervision, as it can be implemented as a callback that gathers information from the training process. We test and analyze our approach in two different settings: image classification, and object detection, and we show that in both settings the action score can provide insights about model and dataset biases.

Submitted 23 November, 2020; originally announced November 2020.

Comments: 2 pages, 6 figures, with appendix

arXiv:2011.11459 [pdf, other]

Automatic Detection and Classification of Tick-borne Skin Lesions using Deep Learning

Authors: Lauren Michelle Pfeifer, Matias Valdenegro-Toro

Abstract: Around the globe, ticks are the culprit of transmitting a variety of bacterial, viral and parasitic diseases. The incidence of tick-borne diseases has drastically increased within the last decade, with annual cases of Lyme disease soaring to an estimated 300,000 in the United States alone. As a result, more effort in improving lesion identification approaches and diagnostics for tick-borne illnesses is critical. The objective for this study is to build upon the approach used by Burlina et al. by using a variety of convolutional neural network models to detect tick-borne skin lesions. We expanded the data inputs by acquiring images from Google in seven different languages to test if this would diversify training data and improve the accuracy of skin lesion detection. The final dataset included nearly 6,080 images and was trained on a combination of architectures (ResNet 34, ResNet 50, VGG 19, and Dense Net 121). We obtained an accuracy of 80.72% with our model trained on the DenseNet 121 architecture.

Submitted 23 November, 2020; originally announced November 2020.

Comments: 2 pages, 8 figures, with appendix

arXiv:2010.15823 [pdf, other]

Black-Box Optimization of Object Detector Scales

Authors: Mohandass Muthuraja, Octavio Arriaga, Paul Plöger, Frank Kirchner, Matias Valdenegro-Toro

Abstract: Object detectors have improved considerably in the last years by using advanced CNN architectures. However, many detector hyper-parameters are generally manually tuned, or they are used with values set by the detector authors. Automatic Hyper-parameter optimization has not been explored in improving CNN-based object detectors hyper-parameters. In this work, we propose the use of Black-box optimization methods to tune the prior/default box scales in Faster R-CNN and SSD, using Bayesian Optimization, SMAC, and CMA-ES. We show that by tuning the input image size and prior box anchor scale on Faster R-CNN mAP increases by 2% on PASCAL VOC 2007, and by 3% with SSD. On the COCO dataset with SSD there are mAP improvements in the medium and large objects, but mAP decreases by 1% in small objects. We also perform a regression analysis to find the significant hyper-parameters to tune.

Submitted 29 October, 2020; originally announced October 2020.

Comments: 17 pages, 7 figures, with appendix

arXiv:2010.14541 [pdf, other]

Perception for Autonomous Systems (PAZ)

Authors: Octavio Arriaga, Matias Valdenegro-Toro, Mohandass Muthuraja, Sushma Devaramani, Frank Kirchner

Abstract: In this paper we introduce the Perception for Autonomous Systems (PAZ) software library. PAZ is a hierarchical perception library that allows users to manipulate multiple levels of abstraction in accordance to their requirements or skill level. More specifically, PAZ is divided into three hierarchical levels which we refer to as pipelines, processors, and backends. These abstractions allow users to compose functions in a hierarchical modular scheme that can be applied for preprocessing, data-augmentation, prediction and postprocessing of inputs and outputs of machine learning (ML) models. PAZ uses these abstractions to build reusable training and prediction pipelines for multiple robot perception tasks such as: 2D keypoint estimation, 2D object detection, 3D keypoint discovery, 6D pose estimation, emotion classification, face recognition, instance segmentation, and attention mechanisms.

Submitted 27 October, 2020; originally announced October 2020.

arXiv:2010.14444 [pdf, other]

Can Reinforcement Learning for Continuous Control Generalize Across Physics Engines?

Authors: Aaqib Parvez Mohammed, Matias Valdenegro-Toro

Abstract: Reinforcement learning (RL) algorithms should learn as much as possible about the environment but not the properties of the physics engines that generate the environment. There are multiple algorithms that solve the task in a physics engine based environment but there is no work done so far to understand if the RL algorithms can generalize across physics engines. In this work, we compare the generalization performance of various deep reinforcement learning algorithms on a variety of control tasks. Our results show that MuJoCo is the best engine to transfer the learning to other engines. On the other hand, none of the algorithms generalize when trained on PyBullet. We also found out that various algorithms have a promising generalizability if the effect of random seeds can be minimized on their performance.

Submitted 27 October, 2020; originally announced October 2020.

Comments: 10 pages plus appendix

arXiv:2010.14019 [pdf, other]

Know Where To Drop Your Weights: Towards Faster Uncertainty Estimation

Authors: Akshatha Kamath, Dwaraknath Gnaneshwar, Matias Valdenegro-Toro

Abstract: Estimating epistemic uncertainty of models used in low-latency applications and Out-Of-Distribution samples detection is a challenge due to the computationally demanding nature of uncertainty estimation techniques. Estimating model uncertainty using approximation techniques like Monte Carlo Dropout (MCD), DropConnect (MCDC) requires a large number of forward passes through the network, rendering them inapt for low-latency applications. We propose Select-DC which uses a subset of layers in a neural network to model epistemic uncertainty with MCDC. Through our experiments, we show a significant reduction in the GFLOPS required to model uncertainty, compared to Monte Carlo DropConnect, with marginal trade-off in performance. We perform a suite of experiments on CIFAR 10, CIFAR 100, and SVHN datasets with ResNet and VGG models. We further show how applying DropConnect to various layers in the network with different drop probabilities affects the network's performance and the entropy of the predictive distribution.

Submitted 26 October, 2020; originally announced October 2020.

Comments: 8 pages, 6 figures, 1 table, with appendix, submitted to a NeurIPS workshop

arXiv:2008.07426 [pdf, other]

Hey Human, If your Facial Emotions are Uncertain, You Should Use Bayesian Neural Networks!

Authors: Maryam Matin, Matias Valdenegro-Toro

Abstract: Facial emotion recognition is the task to classify human emotions in face images. It is a difficult task due to high aleatoric uncertainty and visual ambiguity. A large part of the literature aims to show progress by increasing accuracy on this task, but this ignores the inherent uncertainty and ambiguity in the task. In this paper we show that Bayesian Neural Networks, as approximated using MC-Dropout, MC-DropConnect, or an Ensemble, are able to model the aleatoric uncertainty in facial emotion recognition, and produce output probabilities that are closer to what a human expects. We also show that calibration metrics show strange behaviors for this task, due to the multiple classes that can be considered correct, which motivates future work. We believe our work will motivate other researchers to move away from Classical and into Bayesian Neural Networks.

Submitted 17 August, 2020; originally announced August 2020.

Comments: 10 pages, 7 figures, Women in Computer Vision @ ECCV 2020 camera ready

arXiv:2007.01787 [pdf, other]

Evaluating Uncertainty Estimation Methods on 3D Semantic Segmentation of Point Clouds

Authors: Swaroop Bhandary K, Nico Hochgeschwender, Paul Plöger, Frank Kirchner, Matias Valdenegro-Toro

Abstract: Deep learning models are extensively used in various safety critical applications. Hence these models along with being accurate need to be highly reliable. One way of achieving this is by quantifying uncertainty. Bayesian methods for UQ have been extensively studied for Deep Learning models applied on images but have been less explored for 3D modalities such as point clouds often used for Robots and Autonomous Systems. In this work, we evaluate three uncertainty quantification methods namely Deep Ensembles, MC-Dropout and MC-DropConnect on the DarkNet21Seg 3D semantic segmentation model and comprehensively analyze the impact of various parameters such as number of models in ensembles or forward passes, and drop probability values, on task performance and uncertainty estimate quality. We find that Deep Ensembles outperforms other methods in both performance and uncertainty metrics. Deep ensembles outperform other methods by a margin of 2.4% in terms of mIOU, 1.3% in terms of accuracy, while providing reliable uncertainty for decision making.

Submitted 3 July, 2020; originally announced July 2020.

Comments: 12 pages, 19 figures, ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning

arXiv:1910.13144 [pdf, other]

Results from the Robocademy ITN: Autonomy, Disturbance Rejection and Perception for Advanced Marine Robotics

Authors: Matias Valdenegro-Toro, Mariela De Lucas Alvarez, Mariia Dmitrieva, Bilal Wehbe, Georgios Salavasidis, Shahab Heshmati-Alamdari, Juan F. Fuentes-Pérez, Veronika Yordanova, Klemen Istenič, Thomas Guerneve

Abstract: Marine and Underwater resources are an important part of the economy of many countries. This requires significant financial resources into their construction and maintenance. Robotics is expected to fill this void, by automating and/or removing humans from hostile environments in order to easily perform maintenance tasks. The Robocademy Marie Sklodowska-Curie Initial Training Network was funded by the European Union's FP7 research program in order to train 13 Fellows into world-leading researchers in Marine and Underwater Robotics. The fellows developed guided research into three areas of key importance: Autonomy, Disturbance Rejection, and Perception. This paper presents a summary of the fellows' research in the three action lines. 71 scientific publications were the primary result of this project, with many other publications currently in the pipeline. Most of the fellows have found employment in Europe, which shows the high demand for this kind of experts. We believe the results from this project are already having an impact in the marine robotics industry, as key technologies are being adopted already.

Submitted 29 October, 2019; originally announced October 2019.

Comments: 19 pages, 20 figures, initial preprint

arXiv:1910.08168 [pdf, other]

Deep Sub-Ensembles for Fast Uncertainty Estimation in Image Classification

Authors: Matias Valdenegro-Toro

Abstract: Fast estimates of model uncertainty are required for many robust robotics applications. Deep Ensembles provides state of the art uncertainty without requiring Bayesian methods, but still it is computationally expensive. In this paper we propose deep sub-ensembles, an approximation to deep ensembles where the core idea is to ensemble only the layers close to the output, and not the whole model. With ResNet-20 on the CIFAR10 dataset, we obtain 1.5-2.5 speedup over a Deep Ensemble, with a small increase in error and NLL, and similarly up to 5-15 speedup with a VGG-like network on the SVHN dataset. Our results show that this idea enables a trade-off between error and uncertainty quality versus computational performance.

Submitted 29 November, 2019; v1 submitted 17 October, 2019; originally announced October 2019.

Comments: 7 pages, 8 figures, Bayesian Deep Learning Workshop 2019 @ NeurIPS 2019, camera ready

arXiv:1907.12902 [pdf, other]

Data augmentation with Symbolic-to-Real Image Translation GANs for Traffic Sign Recognition

Authors: Nour Soufi, Matias Valdenegro-Toro

Abstract: Traffic sign recognition is an important component of many advanced driving assistance systems, and it is required for full autonomous driving. Computational performance is usually the bottleneck in using large scale neural networks for this purpose. SqueezeNet is a good candidate for efficient image classification of traffic signs, but in our experiments it does not reach high accuracy, and we believe this is due to lack of data, requiring data augmentation. Generative adversarial networks can learn the high dimensional distribution of empirical data, allowing the generation of new data points. In this paper we apply pix2pix GANs architecture to generate new traffic sign images and evaluate the use of these images in data augmentation. We were motivated to use pix2pix to translate symbolic sign images to real ones due to the mode collapse in Conditional GANs. Through our experiments we found that data augmentation using GAN can increase classification accuracy for circular traffic signs from 92.1% to 94.0%, and for triangular traffic signs from 93.8% to 95.3%, producing an overall improvement of 2%. However some traditional augmentation techniques can outperform GAN data augmentation, for example contrast variation in circular traffic signs (95.5%) and displacement on triangular traffic signs (96.7%). Our negative results show that while GANs can be naively used for data augmentation, they are not always the best choice, depending on the problem and variability in the data.

Submitted 17 July, 2019; originally announced July 2019.

Comments: 6 pages, 10 figures

arXiv:1907.00734 [pdf, other]

Learning Objectness from Sonar Images for Class-Independent Object Detection

Authors: Matias Valdenegro-Toro

Abstract: Detecting novel objects without class information is not trivial, as it is difficult to generalize from a small training set. This is an interesting problem for underwater robotics, as modeling marine objects is inherently more difficult in sonar images, and training data might not be available a priori. Detection proposal algorithms can be used for this purpose but usually require a large amount of output bounding boxes. In this paper we propose the use of a fully convolutional neural network that regresses an objectness value directly from a Forward-Looking sonar image. By ranking objectness, we can produce high recall (96%) with only 100 proposals per image. In comparison, EdgeBoxes requires 5000 proposals to achieve a slightly better recall of 97%, while Selective Search requires 2000 proposals to achieve 95% recall. We also show that our method outperforms a template matching baseline by a considerable margin, and is able to generalize to completely new objects. We expect that this kind of technique can be used in the field to find lost objects under the sea.

Submitted 1 July, 2019; originally announced July 2019.

Comments: European Conference on Mobile Robots 2019

arXiv:1905.05241 [pdf, other]

Deep Neural Networks for Marine Debris Detection in Sonar Images

Authors: Matias Valdenegro-Toro

Abstract: Garbage and waste disposal is one of the biggest challenges currently faced by mankind. Proper waste disposal and recycling is a must in any sustainable community, and in many coastal areas there is significant water pollution in the form of floating or submerged garbage. This is called marine debris. Submerged marine debris threatens marine life, and for shallow coastal areas, it can also threaten fishing vessels [Iñiguez et al. 2016, Renewable and Sustainable Energy Reviews]. Submerged marine debris typically stays in the environment for a long time (20+ years), and consists of materials that can be recycled, such as metals, plastics, glass, etc. Many of these items should not be disposed in water bodies as this has a negative effect in the environment and human health. This thesis performs a comprehensive evaluation on the use of DNNs for the problem of marine debris detection in FLS images, as well as related problems such as image classification, matching, and detection proposals. We do this in a dataset of 2069 FLS images that we captured with an ARIS Explorer 3000 sensor on marine debris objects lying in the floor of a small water tank. The objects we used to produce this dataset contain typical household marine debris and distractor marine objects (tires, hooks, valves, etc), divided in 10 classes plus a background class. Our results show that for the evaluated tasks, DNNs are a superior technique than the corresponding state of the art. There are large gains particularly for the matching and detection proposal tasks.
We also study the effect of sample complexity and object size in many tasks, which is valuable information for practitioners. We expect that our results will advance the objective of using Autonomous Underwater Vehicles to automatically survey, detect and collect marine debris from underwater environments.

Submitted 13 May, 2019; originally announced May 2019.

Comments: PhD Thesis submitted to Heriot-Watt University

arXiv:1903.12270 [pdf, other]

Implementing Noise with Hash functions for Graphics Processing Units

Authors: Matias Valdenegro-Toro, Hector Pincheira

Abstract: We propose a modification to Perlin noise which uses computable hash functions instead of textures as lookup tables. We implemented the FNV1, Jenkins and Murmur hashes on Shader Model 4.0 Graphics Processing Units for noise generation. Modified versions of the FNV1 and Jenkins hashes provide very close performance compared to a texture based Perlin noise implementation. Our noise modification enables noise function evaluation without any texture fetches, trading computational power for memory bandwidth.

Submitted 28 March, 2019; originally announced March 2019.

Comments: Proceedings XXVIII International Conference of the Chilean Computing Science Society (SCCC, 2009)

arXiv:1807.04109 [pdf, other]

Modeling and Soft-fault Diagnosis of Underwater Thrusters with Recurrent Neural Networks

Authors: Samy Nascimento, Matias Valdenegro-Toro

Abstract: Noncritical soft-faults and model deviations are a challenge for Fault Detection and Diagnosis (FDD) of resident Autonomous Underwater Vehicles (AUVs). Such systems may have a faster performance degradation due to the permanent exposure to the marine environment, and constant monitoring of component conditions is required to ensure their reliability. This work presents an evaluation of Recurrent Neural Networks (RNNs) for a data-driven fault detection and diagnosis scheme for underwater thrusters with empirical data. The nominal behavior of the thruster was modeled using the measured control input, voltage, rotational speed and current signals. We evaluated the performance of fault classification using all the measured signals compared to using the computed residuals from the nominal model as features.

Submitted 11 July, 2018; originally announced July 2018.

Comments: CAMS 2018 camera ready version

ACM Class: I.2.6; I.2.9

arXiv:1805.04756 [pdf, other]

Improving Predictive Uncertainty Estimation using Dropout -- Hamiltonian Monte Carlo

Authors: Diego Vergara, Sergio Hernández, Matias Valdenegro-Toro, Felipe Jorquera

Abstract: Estimating predictive uncertainty is crucial for many computer vision tasks, from image classification to autonomous driving systems. Hamiltonian Monte Carlo (HMC) is a sampling method for performing Bayesian inference. On the other hand, Dropout regularization has been proposed as an approximate model averaging technique that tends to improve generalization in large scale models such as deep neural networks. Although HMC provides convergence guarantees for most standard Bayesian models, it does not handle discrete parameters arising from Dropout regularization. In this paper, we present a robust methodology for improving predictive uncertainty in classification problems, based on Dropout and Hamiltonian Monte Carlo. Even though Dropout induces a non-smooth energy function with no such convergence guarantees, the resulting discretization of the Hamiltonian proves empirical success. The proposed method allows to effectively estimate the predictive accuracy and to provide better generalization for difficult test examples.

Submitted 2 July, 2019; v1 submitted 12 May, 2018; originally announced May 2018.

Comments: 26 Pages, 12 Figures, version 3, to appear in Soft Computing, Author preprint

arXiv:1711.02578 [pdf, other]

Image Captioning and Classification of Dangerous Situations

Authors: Octavio Arriaga, Paul Plöger, Matias Valdenegro-Toro

Abstract: Current robot platforms are being employed to collaborate with humans in a wide range of domestic and industrial tasks. These environments require autonomous systems that are able to classify and communicate anomalous situations such as fires, injured persons, car accidents; or generally, any potentially dangerous situation for humans. In this paper we introduce an anomaly detection dataset for the purpose of robot applications as well as the design and implementation of a deep learning architecture that classifies and describes dangerous situations using only a single image as input. We report a classification accuracy of 97 % and METEOR score of 16.2. We will make the dataset publicly available after this paper is accepted.

Submitted 7 November, 2017; originally announced November 2017.

arXiv:1710.07557 [pdf, other]

Real-time Convolutional Neural Networks for Emotion and Gender Classification

Authors: Octavio Arriaga, Matias Valdenegro-Toro, Paul Plöger

Abstract: In this paper we propose and implement a general convolutional neural network (CNN) building framework for designing real-time CNNs. We validate our models by creating a real-time vision system which accomplishes the tasks of face detection, gender classification and emotion classification simultaneously in one blended step using our proposed CNN architecture. After presenting the details of the training procedure setup we proceed to evaluate on standard benchmark sets. We report accuracies of 96% in the IMDB gender dataset and 66% in the FER-2013 emotion dataset. Along with this we also introduced the very recent real-time enabled guided back-propagation visualization technique. Guided back-propagation uncovers the dynamics of the weight changes and evaluates the learned features. We argue that the careful implementation of modern CNN architectures, the use of the current regularization methods and the visualization of previously hidden features are necessary in order to reduce the gap between slow performances and real-time architectures. Our system has been validated by its deployment on a Care-O-bot 3 robot used during RoboCup@Home competitions. All our code, demos and pre-trained architectures have been released under an open-source license in our public repository.

Submitted 20 October, 2017; originally announced October 2017.

Comments: Submitted to ICRA 2018

arXiv:1709.02601 [pdf, other]

Best Practices in Convolutional Networks for Forward-Looking Sonar Image Recognition

Authors: Matias Valdenegro-Toro

Abstract: Convolutional Neural Networks (CNN) have revolutionized perception for color images, and their application to sonar images has also obtained good results. But in general CNNs are difficult to train without a large dataset, need manual tuning of a considerable number of hyperparameters, and require many careful decisions by a designer. In this work, we evaluate three common decisions that need to be made by a CNN designer, namely the performance of transfer learning, the effect of object/image size and the relation between training set size. We evaluate three CNN models, namely one based on LeNet, and two based on the Fire module from SqueezeNet. Our findings are: Transfer learning with an SVM works very well, even when the train and transfer sets have no classes in common, and high classification performance can be obtained even when the target dataset is small. The ADAM optimizer combined with Batch Normalization can make a high accuracy CNN classifier, even with small image sizes (16 pixels). At least 50 samples per class are required to obtain $90\%$ test accuracy, and using Dropout with a small dataset helps improve performance, but Batch Normalization is better when a large dataset is available.

Submitted 8 September, 2017; originally announced September 2017.

Comments: Author version; IEEE/MTS Oceans 2017 Aberdeen

arXiv:1709.02600 [pdf, other]

doi 10.1007/978-3-319-46182-3_18

Objectness Scoring and Detection Proposals in Forward-Looking Sonar Images with Convolutional Neural Networks

Authors: Matias Valdenegro-Toro

Abstract: Forward-looking sonar can capture high resolution images of underwater scenes, but their interpretation is complex. Generic object detection in such images has not been solved, especially in cases of small and unknown objects. In comparison, detection proposal algorithms have produced top performing object detectors in real-world color images. In this work we develop a Convolutional Neural Network that can reliably score objectness of image windows in forward-looking sonar images and by thresholding objectness, we generate detection proposals. In our dataset of marine garbage objects, we obtain 94% recall, generating around 60 proposals per image. The biggest strength of our method is that it can generalize to previously unseen objects. We show this by detecting chain links, walls and a wrench without previous training in such objects. We strongly believe our method can be used for class-independent object detection, with many real-world applications such as chain following and mine detection.

Submitted 8 September, 2017; originally announced September 2017.

Comments: Author version

Journal ref: Proceedings of ANNPR 2016

Showing 1–50 of 52 results for author: Valdenegro-Toro, M