Search | arXiv e-print repository

SPOT: Point Cloud Based Stereo Visual Place Recognition for Similar and Opposing Viewpoints

Authors: Spencer Carmichael, Rahul Agrawal, Ram Vasudevan, Katherine A. Skinner

Abstract: Recognizing places from an opposing viewpoint during a return trip is a common experience for human drivers. However, the analogous robotics capability, visual place recognition (VPR) with limited field of view cameras under 180 degree rotations, has proven to be challenging to achieve. To address this problem, this paper presents Same Place Opposing Trajectory (SPOT), a technique for opposing vie… ▽ More Recognizing places from an opposing viewpoint during a return trip is a common experience for human drivers. However, the analogous robotics capability, visual place recognition (VPR) with limited field of view cameras under 180 degree rotations, has proven to be challenging to achieve. To address this problem, this paper presents Same Place Opposing Trajectory (SPOT), a technique for opposing viewpoint VPR that relies exclusively on structure estimated through stereo visual odometry (VO). The method extends recent advances in lidar descriptors and utilizes a novel double (similar and opposing) distance matrix sequence matching method. We evaluate SPOT on a publicly available dataset with 6.7-7.6 km routes driven in similar and opposing directions under various lighting conditions. The proposed algorithm demonstrates remarkable improvement over the state-of-the-art, achieving up to 91.7% recall at 100% precision in opposing viewpoint cases, while requiring less storage than all baselines tested and running faster than all but one. Moreover, the proposed method assumes no a priori knowledge of whether the viewpoint is similar or opposing, and also demonstrates competitive performance in similar viewpoint cases. △ Less

Submitted 18 April, 2024; originally announced April 2024.

Comments: Accepted to ICRA 2024, project website: https://umautobots.github.io/spot

arXiv:2404.09094 [pdf, other]

Learning Surface Terrain Classifications from Ground Penetrating Radar

Authors: Anja Sheppard, Jason Brown, Nilton Renno, Katherine A. Skinner

Abstract: Terrain classification is an important problem for mobile robots operating in extreme environments as it can aid downstream tasks such as autonomous navigation and planning. While RGB cameras are widely used for terrain identification, vision-based methods can suffer due to poor lighting conditions and occlusions. In this paper, we propose the novel use of Ground Penetrating Radar (GPR) for terrai… ▽ More Terrain classification is an important problem for mobile robots operating in extreme environments as it can aid downstream tasks such as autonomous navigation and planning. While RGB cameras are widely used for terrain identification, vision-based methods can suffer due to poor lighting conditions and occlusions. In this paper, we propose the novel use of Ground Penetrating Radar (GPR) for terrain characterization for mobile robot platforms. Our approach leverages machine learning for surface terrain classification from GPR data. We collect a new dataset consisting of four different terrain types, and present qualitative and quantitative results. Our results demonstrate that classification networks can learn terrain categories from GPR signals. Additionally, we integrate our GPR-based classification approach into a multimodal semantic map** framework to demonstrate a practical use case of GPR for surface terrain classification on mobile robots. Overall, this work extends the usability of GPR sensors deployed on robots to enable terrain classification in addition to GPR's existing scientific use cases. △ Less

Submitted 13 April, 2024; originally announced April 2024.

Comments: Accepted to the 2024 Conference on Computer Vision and Pattern Recognition (CVPR) Perception Beyond the Visible Spectrum Workshop

arXiv:2403.19104 [pdf, other]

CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation

Authors: Lingjun Zhao, **gyu Song, Katherine A. Skinner

Abstract: In the field of 3D object detection for autonomous driving, LiDAR-Camera (LC) fusion is the top-performing sensor configuration. Still, LiDAR is relatively high cost, which hinders adoption of this technology for consumer automobiles. Alternatively, camera and radar are commonly deployed on vehicles already on the road today, but performance of Camera-Radar (CR) fusion falls behind LC fusion. In t… ▽ More In the field of 3D object detection for autonomous driving, LiDAR-Camera (LC) fusion is the top-performing sensor configuration. Still, LiDAR is relatively high cost, which hinders adoption of this technology for consumer automobiles. Alternatively, camera and radar are commonly deployed on vehicles already on the road today, but performance of Camera-Radar (CR) fusion falls behind LC fusion. In this work, we propose Camera-Radar Knowledge Distillation (CRKD) to bridge the performance gap between LC and CR detectors with a novel cross-modality KD framework. We use the Bird's-Eye-View (BEV) representation as the shared feature space to enable effective knowledge distillation. To accommodate the unique cross-modality KD path, we propose four distillation losses to help the student learn crucial features from the teacher model. We present extensive evaluations on the nuScenes dataset to demonstrate the effectiveness of the proposed CRKD framework. The project page for CRKD is https://song-**gyu.github.io/CRKD. △ Less

Submitted 27 March, 2024; originally announced March 2024.

Comments: Accepted to CVPR 2024

arXiv:2403.17405 [pdf, other]

The recessionary pressures of generative AI: A threat to wellbeing

Authors: Jo-An Occhipinti, Ante Prodan, William Hynes, Roy Green, Sharan Burrow, Harris A Eyre, Adam Skinner, Goran Ujdur, John Buchanan, Ian B Hickie, Mark Heffernan, Christine Song, Marcel Tanner

Abstract: Generative Artificial Intelligence (AI) stands as a transformative force that presents a paradox; it offers unprecedented opportunities for productivity growth while potentially posing significant threats to economic stability and societal wellbeing. Many consider generative AI as akin to previous technological advancements, using historical precedent to argue that fears of widespread job displace… ▽ More Generative Artificial Intelligence (AI) stands as a transformative force that presents a paradox; it offers unprecedented opportunities for productivity growth while potentially posing significant threats to economic stability and societal wellbeing. Many consider generative AI as akin to previous technological advancements, using historical precedent to argue that fears of widespread job displacement are unfounded, while others contend that generative AI`s unique capacity to undertake non-routine cognitive tasks sets it apart from other forms of automation capital and presents a threat to the quality and availability of work that underpin stable societies. This paper explores the conditions under which both may be true. We posit the existence of an AI-capital-to-labour ratio threshold beyond which a self-reinforcing cycle of recessionary pressures could be triggered, exacerbating social disparities, reducing social cohesion, heightening tensions, and requiring sustained government intervention to maintain stability. To prevent this, the paper underscores the urgent need for proactive policy responses, making recommendations to reduce these risks through robust regulatory frameworks and a new social contract characterised by progressive social and economic policies. This approach aims to ensure a sustainable, inclusive, and resilient economic future where human contribution to the economy is retained and integrated with generative AI to enhance the Mental Wealth of nations. △ Less

Submitted 26 March, 2024; originally announced March 2024.

Comments: 7 pages, 3 figures

arXiv:2403.05513 [pdf, other]

A Detection and Filtering Framework for Collaborative Localization

Authors: Thirumalaesh Ashokkumar, Katherine A Skinner, Siddarth Agarwal, Ankit Vora, Ashutosh Bhown

Abstract: Increasingly, autonomous vehicles (AVs) are becoming a reality, such as the Advanced Driver Assistance Systems (ADAS) in vehicles that assist drivers in driving and parking functions with vehicles today. The localization problem for AVs relies primarily on multiple sensors, including cameras, LiDARs, and radars. Manufacturing, installing, calibrating, and maintaining these sensors can be very expe… ▽ More Increasingly, autonomous vehicles (AVs) are becoming a reality, such as the Advanced Driver Assistance Systems (ADAS) in vehicles that assist drivers in driving and parking functions with vehicles today. The localization problem for AVs relies primarily on multiple sensors, including cameras, LiDARs, and radars. Manufacturing, installing, calibrating, and maintaining these sensors can be very expensive, thereby increasing the overall cost of AVs. This research explores the means to improve localization on vehicles belonging to the ADAS category in a platooning context, where an ADAS vehicle follows a lead "Smart" AV equipped with a highly accurate sensor suite. We propose and produce results by using a filtering framework to combine pose information derived from vision and odometry to improve the localization of the ADAS vehicle that follows the smart vehicle. △ Less

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2402.11735 [pdf, other]

LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection

Authors: **gyu Song, Lingjun Zhao, Katherine A. Skinner

Abstract: We propose LiRaFusion to tackle LiDAR-radar fusion for 3D object detection to fill the performance gap of existing LiDAR-radar detectors. To improve the feature extraction capabilities from these two modalities, we design an early fusion module for joint voxel feature encoding, and a middle fusion module to adaptively fuse feature maps via a gated network. We perform extensive evaluation on nuScen… ▽ More We propose LiRaFusion to tackle LiDAR-radar fusion for 3D object detection to fill the performance gap of existing LiDAR-radar detectors. To improve the feature extraction capabilities from these two modalities, we design an early fusion module for joint voxel feature encoding, and a middle fusion module to adaptively fuse feature maps via a gated network. We perform extensive evaluation on nuScenes to demonstrate that LiRaFusion leverages the complementary information of LiDAR and radar effectively and achieves notable improvement over existing methods. △ Less

Submitted 18 February, 2024; originally announced February 2024.

Comments: Accepted to ICRA 2024

arXiv:2402.01106 [pdf, other]

Learning Which Side to Scan: Multi-View Informed Active Perception with Side Scan Sonar for Autonomous Underwater Vehicles

Authors: Advaith V. Sethuraman, Philip Baldoni, Katherine A. Skinner, James McMahon

Abstract: Autonomous underwater vehicles often perform surveys that capture multiple views of targets in order to provide more information for human operators or automatic target recognition algorithms. In this work, we address the problem of choosing the most informative views that minimize survey time while maximizing classifier accuracy. We introduce a novel active perception framework for multi-view ada… ▽ More Autonomous underwater vehicles often perform surveys that capture multiple views of targets in order to provide more information for human operators or automatic target recognition algorithms. In this work, we address the problem of choosing the most informative views that minimize survey time while maximizing classifier accuracy. We introduce a novel active perception framework for multi-view adaptive surveying and reacquisition using side scan sonar imagery. Our framework addresses this challenge by using a graph formulation for the adaptive survey task. We then use Graph Neural Networks (GNNs) to both classify acquired sonar views and to choose the next best view based on the collected data. We evaluate our method using simulated surveys in a high-fidelity side scan sonar simulator. Our results demonstrate that our approach is able to surpass the state-of-the-art in classification accuracy and survey efficiency. This framework is a promising approach for more efficient autonomous missions involving side scan sonar, such as underwater exploration, marine archaeology, and environmental monitoring. △ Less

Submitted 13 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.14546 [pdf, other]

Machine Learning for Shipwreck Segmentation from Side Scan Sonar Imagery: Dataset and Benchmark

Authors: Advaith V. Sethuraman, Anja Sheppard, Onur Bagoren, Christopher Pinnow, Jamey Anderson, Timothy C. Havens, Katherine A. Skinner

Abstract: Open-source benchmark datasets have been a critical component for advancing machine learning for robot perception in terrestrial applications. Benchmark datasets enable the widespread development of state-of-the-art machine learning methods, which require large datasets for training, validation, and thorough comparison to competing approaches. Underwater environments impose several operational cha… ▽ More Open-source benchmark datasets have been a critical component for advancing machine learning for robot perception in terrestrial applications. Benchmark datasets enable the widespread development of state-of-the-art machine learning methods, which require large datasets for training, validation, and thorough comparison to competing approaches. Underwater environments impose several operational challenges that hinder efforts to collect large benchmark datasets for marine robot perception. Furthermore, a low abundance of targets of interest relative to the size of the search space leads to increased time and cost required to collect useful datasets for a specific task. As a result, there is limited availability of labeled benchmark datasets for underwater applications. We present the AI4Shipwrecks dataset, which consists of 24 distinct shipwreck sites totaling 286 high-resolution labeled side scan sonar images to advance the state-of-the-art in autonomous sonar image understanding. We leverage the unique abundance of targets in Thunder Bay National Marine Sanctuary in Lake Huron, MI, to collect and compile a sonar imagery benchmark dataset through surveys with an autonomous underwater vehicle (AUV). We consulted with expert marine archaeologists for the labeling of robotically gathered data. We then leverage this dataset to perform benchmark experiments for comparison of state-of-the-art supervised segmentation methods, and we present insights on opportunities and open challenges for the field. The dataset and benchmarking tools will be released as an open-source benchmark dataset to spur innovation in machine learning for Great Lakes and ocean exploration. The dataset and accompanying software are available at https://umfieldrobotics.github.io/ai4shipwrecks/. △ Less

Submitted 25 January, 2024; originally announced January 2024.

Comments: Project website link: https://umfieldrobotics.github.io/ai4shipwrecks/

arXiv:2401.13853 [pdf, other]

Dataset and Benchmark: Novel Sensors for Autonomous Vehicle Perception

Authors: Spencer Carmichael, Austin Buchan, Mani Ramanagopal, Radhika Ravi, Ram Vasudevan, Katherine A. Skinner

Abstract: Conventional cameras employed in autonomous vehicle (AV) systems support many perception tasks, but are challenged by low-light or high dynamic range scenes, adverse weather, and fast motion. Novel sensors, such as event and thermal cameras, offer capabilities with the potential to address these scenarios, but they remain to be fully exploited. This paper introduces the Novel Sensors for Autonomou… ▽ More Conventional cameras employed in autonomous vehicle (AV) systems support many perception tasks, but are challenged by low-light or high dynamic range scenes, adverse weather, and fast motion. Novel sensors, such as event and thermal cameras, offer capabilities with the potential to address these scenarios, but they remain to be fully exploited. This paper introduces the Novel Sensors for Autonomous Vehicle Perception (NSAVP) dataset to facilitate future research on this topic. The dataset was captured with a platform including stereo event, thermal, monochrome, and RGB cameras as well as a high precision navigation system providing ground truth poses. The data was collected by repeatedly driving two ~8 km routes and includes varied lighting conditions and opposing viewpoint perspectives. We provide benchmarking experiments on the task of place recognition to demonstrate challenges and opportunities for novel sensors to enhance critical AV perception tasks. To our knowledge, the NSAVP dataset is the first to include stereo thermal cameras together with stereo event and monochrome cameras. The dataset and supporting software suite is available at: https://umautobots.github.io/nsavp △ Less

Submitted 24 January, 2024; originally announced January 2024.

Comments: Under review

arXiv:2310.01932 [pdf, other]

Automatic Data Processing for Space Robotics Machine Learning

Authors: Anja Sheppard, Katherine A. Skinner

Abstract: Autonomous terrain classification is an important problem in planetary navigation, whether the goal is to identify scientific sites of interest or to traverse treacherous areas safely. Past Martian rovers have relied on human operators to manually identify a navigable path from transmitted imagery. Our goals on Mars in the next few decades will eventually require rovers that can autonomously move… ▽ More Autonomous terrain classification is an important problem in planetary navigation, whether the goal is to identify scientific sites of interest or to traverse treacherous areas safely. Past Martian rovers have relied on human operators to manually identify a navigable path from transmitted imagery. Our goals on Mars in the next few decades will eventually require rovers that can autonomously move farther, faster, and through more dangerous landscapes--demonstrating a need for improved terrain classification for traversability. Autonomous navigation through extreme environments will enable the search for water on the Moon and Mars as well as preparations for human habitats. Advancements in machine learning techniques have demonstrated potential to improve terrain classification capabilities for ground vehicles on Earth. However, classification results for space applications are limited by the availability of training data suitable for supervised learning methods. This paper contributes an open source automatic data processing pipeline that uses camera geometry to co-locate Curiosity and Perseverance Mastcam image products with Mars overhead maps via ray projection over a terrain model. In future work, this automated data processing pipeline will be leveraged for development of machine learning methods for terrain classification. △ Less

Submitted 3 October, 2023; originally announced October 2023.

Comments: Presented as a poster at IAC 2023

arXiv:2310.01667 [pdf, other]

STARS: Zero-shot Sim-to-Real Transfer for Segmentation of Shipwrecks in Sonar Imagery

Authors: Advaith Venkatramanan Sethuraman, Katherine A. Skinner

Abstract: In this paper, we address the problem of sim-to-real transfer for object segmentation when there is no access to real examples of an object of interest during training, i.e. zero-shot sim-to-real transfer for segmentation. We focus on the application of shipwreck segmentation in side scan sonar imagery. Our novel segmentation network, STARS, addresses this challenge by fusing a predicted deformati… ▽ More In this paper, we address the problem of sim-to-real transfer for object segmentation when there is no access to real examples of an object of interest during training, i.e. zero-shot sim-to-real transfer for segmentation. We focus on the application of shipwreck segmentation in side scan sonar imagery. Our novel segmentation network, STARS, addresses this challenge by fusing a predicted deformation field and anomaly volume, allowing it to generalize better to real sonar images and achieve more effective zero-shot sim-to-real transfer for image segmentation. We evaluate the sim-to-real transfer capabilities of our method on a real, expert-labeled side scan sonar dataset of shipwrecks collected from field work surveys with an autonomous underwater vehicle (AUV). STARS is trained entirely in simulation and performs zero-shot shipwreck segmentation with no additional fine-tuning on real data. Our method provides a significant 20% increase in segmentation performance for the targeted shipwreck class compared to the best baseline. △ Less

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2309.04937 [pdf, other]

LONER: LiDAR Only Neural Representations for Real-Time SLAM

Authors: Seth Isaacson, Pou-Chun Kung, Mani Ramanagopal, Ram Vasudevan, Katherine A. Skinner

Abstract: This paper proposes LONER, the first real-time LiDAR SLAM algorithm that uses a neural implicit scene representation. Existing implicit map** methods for LiDAR show promising results in large-scale reconstruction, but either require groundtruth poses or run slower than real-time. In contrast, LONER uses LiDAR data to train an MLP to estimate a dense map in real-time, while simultaneously estimat… ▽ More This paper proposes LONER, the first real-time LiDAR SLAM algorithm that uses a neural implicit scene representation. Existing implicit map** methods for LiDAR show promising results in large-scale reconstruction, but either require groundtruth poses or run slower than real-time. In contrast, LONER uses LiDAR data to train an MLP to estimate a dense map in real-time, while simultaneously estimating the trajectory of the sensor. To achieve real-time performance, this paper proposes a novel information-theoretic loss function that accounts for the fact that different regions of the map may be learned to varying degrees throughout online training. The proposed method is evaluated qualitatively and quantitatively on two open-source datasets. This evaluation illustrates that the proposed loss function converges faster and leads to more accurate geometry reconstruction than other loss functions used in depth-supervised neural implicit frameworks. Finally, this paper shows that LONER estimates trajectories competitively with state-of-the-art LiDAR SLAM methods, while also producing dense maps competitive with existing real-time implicit map** methods that use groundtruth poses. △ Less

Submitted 23 March, 2024; v1 submitted 10 September, 2023; originally announced September 2023.

Comments: First two authors equally contributed. Webpage: https://umautobots.github.io/loner

arXiv:2307.08647 [pdf, other]

Uncertainty-Aware Acoustic Localization and Map** for Underwater Robots

Authors: **gyu Song, Onur Bagoren, Katherine A. Skinner

Abstract: For underwater vehicles, robotic applications have the added difficulty of operating in highly unstructured and dynamic environments. Environmental effects impact not only the dynamics and controls of the robot but also the perception and sensing modalities. Acoustic sensors, which inherently use mechanically vibrated signals for measuring range or velocity, are particularly prone to the effects t… ▽ More For underwater vehicles, robotic applications have the added difficulty of operating in highly unstructured and dynamic environments. Environmental effects impact not only the dynamics and controls of the robot but also the perception and sensing modalities. Acoustic sensors, which inherently use mechanically vibrated signals for measuring range or velocity, are particularly prone to the effects that such dynamic environments induce. This paper presents an uncertainty-aware localization and map** framework that accounts for induced disturbances in acoustic sensing modalities for underwater robots operating near the surface in dynamic wave conditions. For the state estimation task, the uncertainty is accounted for as the added noise caused by the environmental disturbance. The map** method uses an adaptive kernel-based method to propagate measurement and pose uncertainty into an occupancy map. Experiments are carried out in a wave tank environment to perform qualitative and quantitative evaluations of the proposed method. More details about this project can be found at https://umfieldrobotics.github.io/PUMA.github.io. △ Less

Submitted 17 July, 2023; originally announced July 2023.

Comments: 9 pages, 9 figures

arXiv:2209.13091 [pdf, other]

WaterNeRF: Neural Radiance Fields for Underwater Scenes

Authors: Advaith Venkatramanan Sethuraman, Manikandasriram Srinivasan Ramanagopal, Katherine A. Skinner

Abstract: Underwater imaging is a critical task performed by marine robots for a wide range of applications including aquaculture, marine infrastructure inspection, and environmental monitoring. However, water column effects, such as attenuation and backscattering, drastically change the color and quality of imagery captured underwater. Due to varying water conditions and range-dependency of these effects,… ▽ More Underwater imaging is a critical task performed by marine robots for a wide range of applications including aquaculture, marine infrastructure inspection, and environmental monitoring. However, water column effects, such as attenuation and backscattering, drastically change the color and quality of imagery captured underwater. Due to varying water conditions and range-dependency of these effects, restoring underwater imagery is a challenging problem. This impacts downstream perception tasks including depth estimation and 3D reconstruction. In this paper, we advance state-of-the-art in neural radiance fields (NeRFs) to enable physics-informed dense depth estimation and color correction. Our proposed method, WaterNeRF, estimates parameters of a physics-based model for underwater image formation, leading to a hybrid data-driven and model-based solution. After determining the scene structure and radiance field, we can produce novel views of degraded as well as corrected underwater images, along with dense depth of the scene. We evaluate the proposed method qualitatively and quantitatively on a real underwater dataset. △ Less

Submitted 29 September, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

arXiv:2209.01194 [pdf, other]

doi 10.1109/LRA.2023.3262139

CLONeR: Camera-Lidar Fusion for Occupancy Grid-aided Neural Representations

Authors: Alexandra Carlson, Manikandasriram Srinivasan Ramanagopal, Nathan Tseng, Matthew Johnson-Roberson, Ram Vasudevan, Katherine A. Skinner

Abstract: Recent advances in neural radiance fields (NeRFs) achieve state-of-the-art novel view synthesis and facilitate dense estimation of scene properties. However, NeRFs often fail for large, unbounded scenes that are captured under very sparse views with the scene content concentrated far away from the camera, as is typical for field robotics applications. In particular, NeRF-style algorithms perform p… ▽ More Recent advances in neural radiance fields (NeRFs) achieve state-of-the-art novel view synthesis and facilitate dense estimation of scene properties. However, NeRFs often fail for large, unbounded scenes that are captured under very sparse views with the scene content concentrated far away from the camera, as is typical for field robotics applications. In particular, NeRF-style algorithms perform poorly: (1) when there are insufficient views with little pose diversity, (2) when scenes contain saturation and shadows, and (3) when finely sampling large unbounded scenes with fine structures becomes computationally intensive. This paper proposes CLONeR, which significantly improves upon NeRF by allowing it to model large outdoor driving scenes that are observed from sparse input sensor views. This is achieved by decoupling occupancy and color learning within the NeRF framework into separate Multi-Layer Perceptrons (MLPs) trained using LiDAR and camera data, respectively. In addition, this paper proposes a novel method to build differentiable 3D Occupancy Grid Maps (OGM) alongside the NeRF model, and leverage this occupancy grid for improved sampling of points along a ray for volumetric rendering in metric space. Through extensive quantitative and qualitative experiments on scenes from the KITTI dataset, this paper demonstrates that the proposed method outperforms state-of-the-art NeRF models on both novel view synthesis and dense depth prediction tasks when trained on sparse input data. △ Less

Submitted 4 April, 2023; v1 submitted 2 September, 2022; originally announced September 2022.

Comments: first two authors equally contributed

Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 5, pp. 2812-2819, May 2023

arXiv:2103.16427 [pdf, other]

doi 10.1109/ICRA48506.2021.9561182

Investigation of Multiple Resource Theory Design Principles on Robot Teleoperation and Workload Management

Authors: Zhao Han, Adam Norton, Eric McCann, Lisa Baraniecki, Will Ober, Dave Shane, Anna Skinner, Holly A. Yanco

Abstract: Robot interfaces often only use the visual channel. Inspired by Wickens' Multiple Resource Theory, we investigated if the addition of audio elements would reduce cognitive workload and improve performance. Specifically, we designed a search and threat-defusal task (primary) with a memory test task (secondary). Eleven participants - predominantly first responders - were recruited to control a robot… ▽ More Robot interfaces often only use the visual channel. Inspired by Wickens' Multiple Resource Theory, we investigated if the addition of audio elements would reduce cognitive workload and improve performance. Specifically, we designed a search and threat-defusal task (primary) with a memory test task (secondary). Eleven participants - predominantly first responders - were recruited to control a robot to clear all threats in a combination of four conditions of primary and secondary tasks in visual and auditory channels. We did not find any statistically significant differences in performance or workload across subjects, making it questionable that Multiple Resource Theory could shorten longer-term task completion time and reduce workload. Our results suggest that considering individual differences for splitting interface modalities across multiple channels requires further investigation. △ Less

Submitted 30 March, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

Comments: 7 pages, 13 figures, ICRA 2021

Journal ref: 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi'an, China, 2021, pp. 3858-3864

arXiv:2102.10545 [pdf, other]

Bayesian Deep Learning for Segmentation for Autonomous Safe Planetary Landing

Authors: Kento Tomita, Katherine A. Skinner, Koki Ho

Abstract: Hazard detection is critical for enabling autonomous landing on planetary surfaces. Current state-of-the-art methods leverage traditional computer vision approaches to automate the identification of safe terrain from input digital elevation models (DEMs). However, performance for these methods can degrade for input DEMs with increased sensor noise. In the last decade, deep learning techniques have… ▽ More Hazard detection is critical for enabling autonomous landing on planetary surfaces. Current state-of-the-art methods leverage traditional computer vision approaches to automate the identification of safe terrain from input digital elevation models (DEMs). However, performance for these methods can degrade for input DEMs with increased sensor noise. In the last decade, deep learning techniques have been developed for various applications. Nevertheless, their applicability to safety-critical space missions has often been limited due to concerns regarding their outputs' reliability. In response to these limitations, this paper proposes an application of the Bayesian deep-learning segmentation method for hazard detection. The developed approach enables reliable, safe landing site detection by: (i) generating simultaneously a safety prediction map and its uncertainty map via Bayesian deep learning and semantic segmentation; and (ii) using the uncertainty map to filter out the uncertain pixels in the prediction map so that the safe site identification is performed only based on the certain pixels (i.e., pixels for which the model is certain about its safety prediction). Experiments are presented with simulated data based on a Mars HiRISE digital terrain model by varying uncertainty threshold and noise levels to demonstrate the performance of the proposed approach. △ Less

Submitted 18 March, 2022; v1 submitted 21 February, 2021; originally announced February 2021.

Comments: 18 pages, 9 figures, under review for the AIAA Journal of Spacecraft and Rockets, revised from Paper AAS 21-253 presented at the AAS/AIAA Space Flight Mechanics Meeting in 2021

arXiv:2001.04432 [pdf, other]

A Preliminary Approach for Learning Relational Policies for the Management of Critically Ill Children

Authors: Michael A. Skinner, Lakshmi Raman, Neel Shah, Abdelaziz Farhat, Sriraam Natarajan

Abstract: The increased use of electronic health records has made possible the automated extraction of medical policies from patient records to aid in the development of clinical decision support systems. We adapted a boosted Statistical Relational Learning (SRL) framework to learn probabilistic rules from clinical hospital records for the management of physiologic parameters of children with severe cardiac… ▽ More The increased use of electronic health records has made possible the automated extraction of medical policies from patient records to aid in the development of clinical decision support systems. We adapted a boosted Statistical Relational Learning (SRL) framework to learn probabilistic rules from clinical hospital records for the management of physiologic parameters of children with severe cardiac or respiratory failure who were managed with extracorporeal membrane oxygenation. In this preliminary study, the results were promising. In particular, the algorithm returned logic rules for medical actions that are consistent with medical reasoning. △ Less

Submitted 13 January, 2020; originally announced January 2020.

Comments: 6 pages, 1 figure, presented at the 2020 AAAI StarAI workshop

arXiv:1809.06256 [pdf, other]

Sensor Transfer: Learning Optimal Sensor Effect Image Augmentation for Sim-to-Real Domain Adaptation

Authors: Alexandra Carlson, Katherine A. Skinner, Ram Vasudevan, Matthew Johnson-Roberson

Abstract: Performance on benchmark datasets has drastically improved with advances in deep learning. Still, cross-dataset generalization performance remains relatively low due to the domain shift that can occur between two different datasets. This domain shift is especially exaggerated between synthetic and real datasets. Significant research has been done to reduce this gap, specifically via modeling varia… ▽ More Performance on benchmark datasets has drastically improved with advances in deep learning. Still, cross-dataset generalization performance remains relatively low due to the domain shift that can occur between two different datasets. This domain shift is especially exaggerated between synthetic and real datasets. Significant research has been done to reduce this gap, specifically via modeling variation in the spatial layout of a scene, such as occlusions, and scene environmental factors, such as time of day and weather effects. However, few works have addressed modeling the variation in the sensor domain as a means of reducing the synthetic to real domain gap. The camera or sensor used to capture a dataset introduces artifacts into the image data that are unique to the sensor model, suggesting that sensor effects may also contribute to domain shift. To address this, we propose a learned augmentation network composed of physically-based augmentation functions. Our proposed augmentation pipeline transfers specific effects of the sensor model -- chromatic aberration, blur, exposure, noise, and color temperature -- from a real dataset to a synthetic dataset. We provide experiments that demonstrate that augmenting synthetic training datasets with the proposed learned augmentation framework reduces the domain gap between synthetic and real domains for object detection in urban driving scenes. △ Less

Submitted 7 January, 2019; v1 submitted 17 September, 2018; originally announced September 2018.

arXiv:1809.04734 [pdf, other]

doi 10.1109/LRA.2019.2894913

DispSegNet: Leveraging Semantics for End-to-End Learning of Disparity Estimation from Stereo Imagery

Authors: Junming Zhang, Katherine A. Skinner, Ram Vasudevan, Matthew Johnson-Roberson

Abstract: Recent work has shown that convolutional neural networks (CNNs) can be applied successfully in disparity estimation, but these methods still suffer from errors in regions of low-texture, occlusions and reflections. Concurrently, deep learning for semantic segmentation has shown great progress in recent years. In this paper, we design a CNN architecture that combines these two tasks to improve the… ▽ More Recent work has shown that convolutional neural networks (CNNs) can be applied successfully in disparity estimation, but these methods still suffer from errors in regions of low-texture, occlusions and reflections. Concurrently, deep learning for semantic segmentation has shown great progress in recent years. In this paper, we design a CNN architecture that combines these two tasks to improve the quality and accuracy of disparity estimation with the help of semantic segmentation. Specifically, we propose a network structure in which these two tasks are highly coupled. One key novelty of this approach is the two-stage refinement process. Initial disparity estimates are refined with an embedding learned from the semantic segmentation branch of the network. The proposed model is trained using an unsupervised approach, in which images from one half of the stereo pair are warped and compared against images from the other camera. Another key advantage of the proposed approach is that a single network is capable of outputting disparity estimates and semantic labels. These outputs are of great use in autonomous vehicle operation; with real-time constraints being key, such performance improvements increase the viability of driving applications. Experiments on KITTI and Cityscapes datasets show that our model can achieve state-of-the-art results and that leveraging embedding learned from semantic segmentation improves the performance of disparity estimation. △ Less

Submitted 15 January, 2019; v1 submitted 12 September, 2018; originally announced September 2018.

Comments: Add more description on the architecture of the model. Add more discussion on section IV-C. Fix typo in formula 6

Journal ref: IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1162-1169, April 2019

arXiv:1803.07721 [pdf, other]

Modeling Camera Effects to Improve Visual Learning from Synthetic Data

Authors: Alexandra Carlson, Katherine A. Skinner, Ram Vasudevan, Matthew Johnson-Roberson

Abstract: Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, limiting generalizability of network p… ▽ More Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, limiting generalizability of network performance on visual tasks trained on synthetic data and tested in real environments. This paper proposes an efficient, automatic, physically-based augmentation pipeline to vary sensor effects --chromatic aberration, blur, exposure, noise, and color cast-- for synthetic imagery. In particular, this paper illustrates that augmenting synthetic training datasets with the proposed pipeline reduces the domain gap between synthetic and real domains for the task of object detection in urban driving scenes. △ Less

Submitted 1 October, 2018; v1 submitted 20 March, 2018; originally announced March 2018.

arXiv:1702.07392 [pdf, other]

doi 10.1109/LRA.2017.2730363

WaterGAN: Unsupervised Generative Network to Enable Real-time Color Correction of Monocular Underwater Images

Authors: Jie Li, Katherine A. Skinner, Ryan M. Eustice, Matthew Johnson-Roberson

Abstract: This paper reports on WaterGAN, a generative adversarial network (GAN) for generating realistic underwater images from in-air image and depth pairings in an unsupervised pipeline used for color correction of monocular underwater images. Cameras onboard autonomous and remotely operated vehicles can capture high resolution images to map the seafloor, however, underwater image formation is subject to… ▽ More This paper reports on WaterGAN, a generative adversarial network (GAN) for generating realistic underwater images from in-air image and depth pairings in an unsupervised pipeline used for color correction of monocular underwater images. Cameras onboard autonomous and remotely operated vehicles can capture high resolution images to map the seafloor, however, underwater image formation is subject to the complex process of light propagation through the water column. The raw images retrieved are characteristically different than images taken in air due to effects such as absorption and scattering, which cause attenuation of light at different rates for different wavelengths. While this physical process is well described theoretically, the model depends on many parameters intrinsic to the water column as well as the objects in the scene. These factors make recovery of these parameters difficult without simplifying assumptions or field calibration, hence, restoration of underwater images is a non-trivial problem. Deep learning has demonstrated great success in modeling complex nonlinear systems but requires a large amount of training data, which is difficult to compile in deep sea environments. Using WaterGAN, we generate a large training dataset of paired imagery, both raw underwater and true color in-air, as well as depth data. This data serves as input to a novel end-to-end network for color correction of monocular underwater images. Due to the depth-dependent water column effects inherent to underwater environments, we show that our end-to-end network implicitly learns a coarse depth estimate of the underwater scene from monocular underwater images. Our proposed pipeline is validated with testing on real data collected from both a pure water tank and from underwater surveys in field testing. Source code is made publicly available with sample datasets and pretrained models. △ Less

Submitted 26 October, 2017; v1 submitted 23 February, 2017; originally announced February 2017.

Comments: 8 pages, 16 figures, published by RA-letter 2018. Source code available at: https://github.com/kskin/WaterGAN

Journal ref: IEEE Robotics and Automation Letters IEEE Robotics and Automation Letters IEEE Robotics and Automation Letters 387 - 394 (2018)

Showing 1–22 of 22 results for author: Skinner, A