Search | arXiv e-print repository

doi 10.1109/ICAR58858.2023.10406394

Scalable and Efficient Hierarchical Visual Topological Map**

Authors: Saravanabalagi Ramachandran, Jonathan Horgan, Ganesh Sistu, John McDonald

Abstract: Hierarchical topological representations can significantly reduce search times within map** and localization algorithms. Although recent research has shown the potential for such approaches, limited consideration has been given to the suitability and comparative performance of different global feature representations within this context. In this work, we evaluate state-of-the-art hand-crafted an… ▽ More Hierarchical topological representations can significantly reduce search times within map** and localization algorithms. Although recent research has shown the potential for such approaches, limited consideration has been given to the suitability and comparative performance of different global feature representations within this context. In this work, we evaluate state-of-the-art hand-crafted and learned global descriptors using a hierarchical topological map** technique on benchmark datasets and present results of a comprehensive evaluation of the impact of the global descriptor used. Although learned descriptors have been incorporated into place recognition methods to improve retrieval accuracy and enhance overall recall, the problem of scalability and efficiency when applied to longer trajectories has not been adequately addressed in a majority of research studies. Based on our empirical analysis of multiple runs, we identify that continuity and distinctiveness are crucial characteristics for an optimal global descriptor that enable efficient and scalable hierarchical map**, and present a methodology for quantifying and contrasting these characteristics across different global descriptors. Our study demonstrates that the use of global descriptors based on an unsupervised learned Variational Autoencoder (VAE) excels in these characteristics and achieves significantly lower runtime. It runs on a consumer grade desktop, up to 2.3x faster than the second best global descriptor, NetVLAD, and up to 9.5x faster than the hand-crafted descriptor, PHOG, on the longest track evaluated (St Lucia, 17.6 km), without sacrificing overall recall performance. △ Less

Submitted 7 April, 2024; originally announced April 2024.

Comments: Published in the 21st International Conference on Advanced Robotics (ICAR 2023)

arXiv:2403.13918 [pdf, other]

Automated Calibration of Parallel and Distributed Computing Simulators: A Case Study

Authors: Jesse McDonald, Maximilian Horzela, Frédéric Suter, Henri Casanova

Abstract: Many parallel and distributed computing research results are obtained in simulation, using simulators that mimic real-world executions on some target system. Each such simulator is configured by picking values for parameters that define the behavior of the underlying simulation models it implements. The main concern for a simulator is accuracy: simulated behaviors should be as close as possible to… ▽ More Many parallel and distributed computing research results are obtained in simulation, using simulators that mimic real-world executions on some target system. Each such simulator is configured by picking values for parameters that define the behavior of the underlying simulation models it implements. The main concern for a simulator is accuracy: simulated behaviors should be as close as possible to those observed in the real-world target system. This requires that values for each of the simulator's parameters be carefully picked, or "calibrated," based on ground-truth real-world executions. Examining the current state of the art shows that simulator calibration, at least in the field of parallel and distributed computing, is often undocumented (and thus perhaps often not performed) and, when documented, is described as a labor-intensive, manual process. In this work we evaluate the benefit of automating simulation calibration using simple algorithms. Specifically, we use a real-world case study from the field of High Energy Physics and compare automated calibration to calibration performed by a domain scientist. Our main finding is that automated calibration is on par with or significantly outperforms the calibration performed by the domain scientist. Furthermore, automated calibration makes it straightforward to operate desirable trade-offs between simulation accuracy and simulation speed. △ Less

Submitted 1 July, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

Comments: In Proc. of the 25th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2024)

arXiv:2402.18593 [pdf, other]

doi 10.1145/3620678.3624793

Sustainable Supercomputing for AI: GPU Power Cap** at HPC Scale

Authors: Dan Zhao, Siddharth Samsi, Joseph McDonald, Baolin Li, David Bestor, Michael Jones, Devesh Tiwari, Vijay Gadepally

Abstract: As research and deployment of AI grows, the computational burden to support and sustain its progress inevitably does too. To train or fine-tune state-of-the-art models in NLP, computer vision, etc., some form of AI hardware acceleration is virtually a requirement. Recent large language models require considerable resources to train and deploy, resulting in significant energy usage, potential carbo… ▽ More As research and deployment of AI grows, the computational burden to support and sustain its progress inevitably does too. To train or fine-tune state-of-the-art models in NLP, computer vision, etc., some form of AI hardware acceleration is virtually a requirement. Recent large language models require considerable resources to train and deploy, resulting in significant energy usage, potential carbon emissions, and massive demand for GPUs and other hardware accelerators. However, this surge carries large implications for energy sustainability at the HPC/datacenter level. In this paper, we study the aggregate effect of power-cap** GPUs on GPU temperature and power draw at a research supercomputing center. With the right amount of power-cap**, we show significant decreases in both temperature and power draw, reducing power consumption and potentially improving hardware life-span with minimal impact on job performance. While power-cap** reduces power draw by design, the aggregate system-wide effect on overall energy consumption is less clear; for instance, if users notice job performance degradation from GPU power-caps, they may request additional GPU-jobs to compensate, negating any energy savings or even worsening energy consumption. To our knowledge, our work is the first to conduct and make available a detailed analysis of the effects of GPU power-cap** at the supercomputing scale. We hope our work will inspire HPCs/datacenters to further explore, evaluate, and communicate the impact of power-cap** AI hardware accelerators for more sustainable AI. △ Less

Submitted 24 February, 2024; originally announced February 2024.

arXiv:2401.16437 [pdf, other]

A Benchmark Dataset for Tornado Detection and Prediction using Full-Resolution Polarimetric Weather Radar Data

Authors: Mark S. Veillette, James M. Kurdzo, Phillip M. Stepanian, John Y. N. Cho, Siddharth Samsi, Joseph McDonald

Abstract: Weather radar is the primary tool used by forecasters to detect and warn for tornadoes in near-real time. In order to assist forecasters in warning the public, several algorithms have been developed to automatically detect tornadic signatures in weather radar observations. Recently, Machine Learning (ML) algorithms, which learn directly from large amounts of labeled data, have been shown to be hig… ▽ More Weather radar is the primary tool used by forecasters to detect and warn for tornadoes in near-real time. In order to assist forecasters in warning the public, several algorithms have been developed to automatically detect tornadic signatures in weather radar observations. Recently, Machine Learning (ML) algorithms, which learn directly from large amounts of labeled data, have been shown to be highly effective for this purpose. Since tornadoes are extremely rare events within the corpus of all available radar observations, the selection and design of training datasets for ML applications is critical for the performance, robustness, and ultimate acceptance of ML algorithms. This study introduces a new benchmark dataset, TorNet to support development of ML algorithms in tornado detection and prediction. TorNet contains full-resolution, polarimetric, Level-II WSR-88D data sampled from 10 years of reported storm events. A number of ML baselines for tornado detection are developed and compared, including a novel deep learning (DL) architecture capable of processing raw radar imagery without the need for manual feature extraction required for existing ML algorithms. Despite not benefiting from manual feature engineering or other preprocessing, the DL model shows increased detection performance compared to non-DL and operational baselines. The TorNet dataset, as well as source code and model weights of the DL baseline trained in this work, are made freely available. △ Less

Submitted 26 January, 2024; originally announced January 2024.

Comments: 37 pages, 15 Figures, 2 Tables

arXiv:2401.00910 [pdf, other]

WoodScape Motion Segmentation for Autonomous Driving -- CVPR 2023 OmniCV Workshop Challenge

Authors: Saravanabalagi Ramachandran, Nathaniel Cibik, Ganesh Sistu, John McDonald

Abstract: Motion segmentation is a complex yet indispensable task in autonomous driving. The challenges introduced by the ego-motion of the cameras, radial distortion in fisheye lenses, and the need for temporal consistency make the task more complicated, rendering traditional and standard Convolutional Neural Network (CNN) approaches less effective. The consequent laborious data labeling, representation of… ▽ More Motion segmentation is a complex yet indispensable task in autonomous driving. The challenges introduced by the ego-motion of the cameras, radial distortion in fisheye lenses, and the need for temporal consistency make the task more complicated, rendering traditional and standard Convolutional Neural Network (CNN) approaches less effective. The consequent laborious data labeling, representation of diverse and uncommon scenarios, and extensive data capture requirements underscore the imperative of synthetic data for improving machine learning model performance. To this end, we employ the PD-WoodScape synthetic dataset developed by Parallel Domain, alongside the WoodScape fisheye dataset. Thus, we present the WoodScape fisheye motion segmentation challenge for autonomous driving, held as part of the CVPR 2023 Workshop on Omnidirectional Computer Vision (OmniCV). As one of the first competitions focused on fisheye motion segmentation, we aim to explore and evaluate the potential and impact of utilizing synthetic data in this domain. In this paper, we provide a detailed analysis on the competition which attracted the participation of 112 global teams and a total of 234 submissions. This study delineates the complexities inherent in the task of motion segmentation, emphasizes the significance of fisheye datasets, articulate the necessity for synthetic datasets and the resultant domain gap they engender, outlining the foundational blueprint for devising successful solutions. Subsequently, we delve into the details of the baseline experiments and winning methods evaluating their qualitative and quantitative results, providing with useful insights. △ Less

Submitted 16 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

Comments: CVPR 2023 OmniCV Workshop Challenge

arXiv:2310.03003 [pdf, other]

From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference

Authors: Siddharth Samsi, Dan Zhao, Joseph McDonald, Baolin Li, Adam Michaleas, Michael Jones, William Bergeron, Jeremy Kepner, Devesh Tiwari, Vijay Gadepally

Abstract: Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art. These technologies are increasingly being leveraged in various domains such as law, finance, and medicine. However, these models carry significant computational challenges, especially the compute and energy costs required for inference. Inference energy costs… ▽ More Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art. These technologies are increasingly being leveraged in various domains such as law, finance, and medicine. However, these models carry significant computational challenges, especially the compute and energy costs required for inference. Inference energy costs already receive less attention than the energy costs of training LLMs -- despite how often these large models are called on to conduct inference in reality (e.g., ChatGPT). As these state-of-the-art LLMs see increasing usage and deployment in various domains, a better understanding of their resource utilization is crucial for cost-savings, scaling performance, efficient hardware usage, and optimal inference strategies. In this paper, we describe experiments conducted to study the computational and energy utilization of inference with LLMs. We benchmark and conduct a preliminary analysis of the inference performance and inference energy costs of different sizes of LLaMA -- a recent state-of-the-art LLM -- developed by Meta AI on two generations of popular GPUs (NVIDIA V100 \& A100) and two datasets (Alpaca and GSM8K) to reflect the diverse set of tasks/benchmarks for LLMs in research and practice. We present the results of multi-node, multi-GPU inference using model sharding across up to 32 GPUs. To our knowledge, our work is the one of the first to study LLM inference performance from the perspective of computational and energy resources at this scale. △ Less

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2308.01242 [pdf, ps, other]

Balanced-chromatic number and Hadwiger-like conjectures

Authors: Andrea Jiménez, Jessica Mcdonald, Reza Naserasr, Kathryn Nurse, Daniel A. Quiroz

Abstract: Motivated by different characterizations of planar graphs and the 4-Color Theorem, several structural results concerning graphs of high chromatic number have been obtained. Toward strengthening some of these results, we consider the \emph{balanced chromatic number}, $χ_b(\hat{G})$, of a signed graph $\hat{G}$. This is the minimum number of parts into which the vertices of a signed graph can be par… ▽ More Motivated by different characterizations of planar graphs and the 4-Color Theorem, several structural results concerning graphs of high chromatic number have been obtained. Toward strengthening some of these results, we consider the \emph{balanced chromatic number}, $χ_b(\hat{G})$, of a signed graph $\hat{G}$. This is the minimum number of parts into which the vertices of a signed graph can be partitioned so that none of the parts induces a negative cycle. This extends the notion of the chromatic number of a graph since $χ(G)=χ_b(\tilde{G})$, where $\tilde{G}$ denotes the signed graph obtained from~$G$ by replacing each edge with a pair of (parallel) positive and negative edges. We introduce a signed version of Hadwiger's conjecture as follows. Conjecture: If a signed graph $\hat{G}$ has no negative loop and no $\tilde{K_t}$-minor, then its balanced chromatic number is at most $t-1$. We prove that this conjecture is, in fact, equivalent to Hadwiger's conjecture and show its relation to the Odd Hadwiger Conjecture. Motivated by these results, we also consider the relation between subdivisions and balanced chromatic number. We prove that if $(G, σ)$ has no negative loop and no $\tilde{K_t}$-subdivision, then it admits a balanced $\frac{79}{2}t^2$-coloring. This qualitatively generalizes a result of Kawarabayashi (2013) on totally odd subdivisions. △ Less

Submitted 2 August, 2023; originally announced August 2023.

arXiv:2301.11581 [pdf, other]

doi 10.1109/IPDPSW55747.2022.00126

A Green(er) World for A.I

Authors: Dan Zhao, Nathan C. Frey, Joseph McDonald, Matthew Hubbell, David Bestor, Michael Jones, Andrew Prout, Vijay Gadepally, Siddharth Samsi

Abstract: As research and practice in artificial intelligence (A.I.) grow in leaps and bounds, the resources necessary to sustain and support their operations also grow at an increasing pace. While innovations and applications from A.I. have brought significant advances, from applications to vision and natural language to improvements to fields like medical imaging and materials engineering, their costs sho… ▽ More As research and practice in artificial intelligence (A.I.) grow in leaps and bounds, the resources necessary to sustain and support their operations also grow at an increasing pace. While innovations and applications from A.I. have brought significant advances, from applications to vision and natural language to improvements to fields like medical imaging and materials engineering, their costs should not be neglected. As we embrace a world with ever-increasing amounts of data as well as research and development of A.I. applications, we are sure to face an ever-mounting energy footprint to sustain these computational budgets, data storage needs, and more. But, is this sustainable and, more importantly, what kind of setting is best positioned to nurture such sustainable A.I. in both research and practice? In this paper, we outline our outlook for Green A.I. -- a more sustainable, energy-efficient and energy-aware ecosystem for develo** A.I. across the research, computing, and practitioner communities alike -- and the steps required to arrive there. We present a bird's eye view of various areas for potential changes and improvements from the ground floor of AI's operational and hardware optimizations for datacenters/HPCs to the current incentive structures in the world of A.I. research and practice, and more. We hope these points will spur further discussion, and action, on some of these issues and their potential solutions. △ Less

Submitted 27 January, 2023; originally announced January 2023.

Comments: 8 pages, published in 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Journal ref: D. Zhao et al., "A Green(er) World for A.I.," 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Lyon, France, 2022, pp. 742-750

arXiv:2301.08826 [pdf, other]

doi 10.1109/ACCESS.2022.3177752

A Review of the Trends and Challenges in Adopting Natural Language Processing Methods for Education Feedback Analysis

Authors: Thanveer Shaik, Xiaohui Tao, Yan Li, Christopher Dann, Jacquie Mcdonald, Petrea Redmond, Linda Galligan

Abstract: Artificial Intelligence (AI) is a fast-growing area of study that stretching its presence to many business and research domains. Machine learning, deep learning, and natural language processing (NLP) are subsets of AI to tackle different areas of data processing and modelling. This review article presents an overview of AI impact on education outlining with current opportunities. In the education… ▽ More Artificial Intelligence (AI) is a fast-growing area of study that stretching its presence to many business and research domains. Machine learning, deep learning, and natural language processing (NLP) are subsets of AI to tackle different areas of data processing and modelling. This review article presents an overview of AI impact on education outlining with current opportunities. In the education domain, student feedback data is crucial to uncover the merits and demerits of existing services provided to students. AI can assist in identifying the areas of improvement in educational infrastructure, learning management systems, teaching practices and study environment. NLP techniques play a vital role in analyzing student feedback in textual format. This research focuses on existing NLP methodologies and applications that could be adapted to educational domain applications like sentiment annotations, entity annotations, text summarization, and topic modelling. Trends and challenges in adopting NLP in education were reviewed and explored. Contextbased challenges in NLP like sarcasm, domain-specific language, ambiguity, and aspect-based sentiment analysis are explained with existing methodologies to overcome them. Research community approaches to extract the semantic meaning of emoticons and special characters in feedback which conveys user opinion and challenges in adopting NLP in education are explored. △ Less

Submitted 20 January, 2023; originally announced January 2023.

Journal ref: IEEE Access, vol. 10, pp. 56720-56739, 2022

arXiv:2210.14981 [pdf, other]

doi 10.56541/SUHE3553

Fast and Efficient Scene Categorization for Autonomous Driving using VAEs

Authors: Saravanabalagi Ramachandran, Jonathan Horgan, Ganesh Sistu, John McDonald

Abstract: Scene categorization is a useful precursor task that provides prior knowledge for many advanced computer vision tasks with a broad range of applications in content-based image indexing and retrieval systems. Despite the success of data driven approaches in the field of computer vision such as object detection, semantic segmentation, etc., their application in learning high-level features for scene… ▽ More Scene categorization is a useful precursor task that provides prior knowledge for many advanced computer vision tasks with a broad range of applications in content-based image indexing and retrieval systems. Despite the success of data driven approaches in the field of computer vision such as object detection, semantic segmentation, etc., their application in learning high-level features for scene recognition has not achieved the same level of success. We propose to generate a fast and efficient intermediate interpretable generalized global descriptor that captures coarse features from the image and use a classification head to map the descriptors to 3 scene categories: Rural, Urban and Suburban. We train a Variational Autoencoder in an unsupervised manner and map images to a constrained multi-dimensional latent space and use the latent vectors as compact embeddings that serve as global descriptors for images. The experimental results evidence that the VAE latent vectors capture coarse information from the image, supporting their usage as global descriptors. The proposed global descriptor is very compact with an embedding length of 128, significantly faster to compute, and is robust to seasonal and illuminational changes, while capturing sufficient scene information required for scene categorization. △ Less

Submitted 26 October, 2022; originally announced October 2022.

Comments: Published in the 24th Irish Machine Vision and Image Processing Conference (IMVIP 2022)

Journal ref: The 24th Irish Machine Vision and Image Processing Conference (IMVIP), 2022, 9-16

arXiv:2209.10404 [pdf, other]

GP-net: Flexible Viewpoint Grasp Proposal

Authors: Anna Konrad, John McDonald, Rudi Villing

Abstract: We present the Grasp Proposal Network (GP-net), a Convolutional Neural Network model which can generate 6-DoF grasps from flexible viewpoints, e.g. as experienced by mobile manipulators. To train GP-net, we synthetically generate a dataset containing depth-images and ground-truth grasp information. In real-world experiments, we use the EGAD evaluation benchmark to evaluate GP-net against two commo… ▽ More We present the Grasp Proposal Network (GP-net), a Convolutional Neural Network model which can generate 6-DoF grasps from flexible viewpoints, e.g. as experienced by mobile manipulators. To train GP-net, we synthetically generate a dataset containing depth-images and ground-truth grasp information. In real-world experiments, we use the EGAD evaluation benchmark to evaluate GP-net against two commonly used algorithms, the Volumetric Gras** Network (VGN) and the Grasp Pose Detection package (GPD), on a PAL TIAGo mobile manipulator. In contrast to the state-of-the-art methods in robotic gras**, GP-net can be used for gras** objects from flexible, unknown viewpoints without the need to define the workspace and achieves a grasp success of 54.4% compared to 51.6% for VGN and 44.2% for GPD. We provide a ROS package along with our code and pre-trained models at https://aucoroboticsmu.github.io/GP-net/. △ Less

Submitted 12 October, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

Comments: Accepted to ICAR 2023

arXiv:2209.05300 [pdf, other]

An Evaluation of Low Overhead Time Series Preprocessing Techniques for Downstream Machine Learning

Authors: Matthew L. Weiss, Joseph McDonald, David Bestor, Charles Yee, Daniel Edelman, Michael Jones, Andrew Prout, Andrew Bowne, Lindsey McEvoy, Vijay Gadepally, Siddharth Samsi

Abstract: In this paper we address the application of pre-processing techniques to multi-channel time series data with varying lengths, which we refer to as the alignment problem, for downstream machine learning. The misalignment of multi-channel time series data may occur for a variety of reasons, such as missing data, varying sampling rates, or inconsistent collection times. We consider multi-channel time… ▽ More In this paper we address the application of pre-processing techniques to multi-channel time series data with varying lengths, which we refer to as the alignment problem, for downstream machine learning. The misalignment of multi-channel time series data may occur for a variety of reasons, such as missing data, varying sampling rates, or inconsistent collection times. We consider multi-channel time series data collected from the MIT SuperCloud High Performance Computing (HPC) center, where different job start times and varying run times of HPC jobs result in misaligned data. This misalignment makes it challenging to build AI/ML approaches for tasks such as compute workload classification. Building on previous supervised classification work with the MIT SuperCloud Dataset, we address the alignment problem via three broad, low overhead approaches: sampling a fixed subset from a full time series, performing summary statistics on a full time series, and sampling a subset of coefficients from time series mapped to the frequency domain. Our best performing models achieve a classification accuracy greater than 95%, outperforming previous approaches to multi-channel time series classification with the MIT SuperCloud Dataset by 5%. These results indicate our low overhead approaches to solving the alignment problem, in conjunction with standard machine learning techniques, are able to achieve high levels of classification accuracy, and serve as a baseline for future approaches to addressing the alignment problem, such as kernel methods. △ Less

Submitted 12 September, 2022; originally announced September 2022.

arXiv:2206.12912 [pdf, other]

Woodscape Fisheye Object Detection for Autonomous Driving -- CVPR 2022 OmniCV Workshop Challenge

Authors: Saravanabalagi Ramachandran, Ganesh Sistu, Varun Ravi Kumar, John McDonald, Senthil Yogamani

Abstract: Object detection is a comprehensively studied problem in autonomous driving. However, it has been relatively less explored in the case of fisheye cameras. The strong radial distortion breaks the translation invariance inductive bias of Convolutional Neural Networks. Thus, we present the WoodScape fisheye object detection challenge for autonomous driving which was held as part of the CVPR 2022 Work… ▽ More Object detection is a comprehensively studied problem in autonomous driving. However, it has been relatively less explored in the case of fisheye cameras. The strong radial distortion breaks the translation invariance inductive bias of Convolutional Neural Networks. Thus, we present the WoodScape fisheye object detection challenge for autonomous driving which was held as part of the CVPR 2022 Workshop on Omnidirectional Computer Vision (OmniCV). This is one of the first competitions focused on fisheye camera object detection. We encouraged the participants to design models which work natively on fisheye images without rectification. We used CodaLab to host the competition based on the publicly available WoodScape fisheye dataset. In this paper, we provide a detailed analysis on the competition which attracted the participation of 120 global teams and a total of 1492 submissions. We briefly discuss the details of the winning methods and analyze their qualitative and quantitative results. △ Less

Submitted 26 June, 2022; originally announced June 2022.

Comments: Workshop on Omnidirectional Computer Vision (OmniCV) at Conference on Computer Vision and Pattern Recognition (CVPR) 2021

arXiv:2205.15667 [pdf, other]

ViT-BEVSeg: A Hierarchical Transformer Network for Monocular Birds-Eye-View Segmentation

Authors: Pramit Dutta, Ganesh Sistu, Senthil Yogamani, Edgar Galván, John McDonald

Abstract: Generating a detailed near-field perceptual model of the environment is an important and challenging problem in both self-driving vehicles and autonomous mobile robotics. A Bird Eye View (BEV) map, providing a panoptic representation, is a commonly used approach that provides a simplified 2D representation of the vehicle surroundings with accurate semantic level segmentation for many downstream ta… ▽ More Generating a detailed near-field perceptual model of the environment is an important and challenging problem in both self-driving vehicles and autonomous mobile robotics. A Bird Eye View (BEV) map, providing a panoptic representation, is a commonly used approach that provides a simplified 2D representation of the vehicle surroundings with accurate semantic level segmentation for many downstream tasks. Current state-of-the art approaches to generate BEV-maps employ a Convolutional Neural Network (CNN) backbone to create feature-maps which are passed through a spatial transformer to project the derived features onto the BEV coordinate frame. In this paper, we evaluate the use of vision transformers (ViT) as a backbone architecture to generate BEV maps. Our network architecture, ViT-BEVSeg, employs standard vision transformers to generate a multi-scale representation of the input image. The resulting representation is then provided as an input to a spatial transformer decoder module which outputs segmentation maps in the BEV grid. We evaluate our approach on the nuScenes dataset demonstrating a considerable improvement in the performance relative to state-of-the-art approaches. △ Less

Submitted 31 May, 2022; originally announced May 2022.

Comments: Accepted for 2022 IEEE World Congress on Computational Intelligence (Track: IJCNN)

arXiv:2205.09646 [pdf, other]

doi 10.18653/v1/2022.findings-naacl.151

Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models

Authors: Joseph McDonald, Baolin Li, Nathan Frey, Devesh Tiwari, Vijay Gadepally, Siddharth Samsi

Abstract: The energy requirements of current natural language processing models continue to grow at a rapid, unsustainable pace. Recent works highlighting this problem conclude there is an urgent need for methods that reduce the energy needs of NLP and machine learning more broadly. In this article, we investigate techniques that can be used to reduce the energy consumption of common NLP applications. In pa… ▽ More The energy requirements of current natural language processing models continue to grow at a rapid, unsustainable pace. Recent works highlighting this problem conclude there is an urgent need for methods that reduce the energy needs of NLP and machine learning more broadly. In this article, we investigate techniques that can be used to reduce the energy consumption of common NLP applications. In particular, we focus on techniques to measure energy usage and different hardware and datacenter-oriented settings that can be tuned to reduce energy consumption for training and inference for language models. We characterize the impact of these settings on metrics such as computational performance and energy consumption through experiments conducted on a high performance computing system as well as popular cloud computing platforms. These techniques can lead to significant reduction in energy consumption when training language models or their use for inference. For example, power-cap**, which limits the maximum power a GPU can consume, can enable a 15\% decrease in energy usage with marginal increase in overall computation time when training a transformer-based language model. △ Less

Submitted 19 May, 2022; originally announced May 2022.

Journal ref: Findings of the Association for Computational Linguistics: NAACL 2022

arXiv:2204.05839 [pdf, ps, other]

doi 10.1109/IPDPSW55747.2022.00122

The MIT Supercloud Workload Classification Challenge

Authors: Benny J. Tang, Qiqi Chen, Matthew L. Weiss, Nathan Frey, Joseph McDonald, David Bestor, Charles Yee, William Arcand, Chansup Byun, Daniel Edelman, Matthew Hubbell, Michael Jones, Jeremy Kepner, Anna Klein, Adam Michaleas, Peter Michaleas, Lauren Milechin, Julia Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Andrew Bowne, Lindsey McEvoy, Baolin Li, Devesh Tiwari , et al. (2 additional authors not shown)

Abstract: High-Performance Computing (HPC) centers and cloud providers support an increasingly diverse set of applications on heterogenous hardware. As Artificial Intelligence (AI) and Machine Learning (ML) workloads have become an increasingly larger share of the compute workloads, new approaches to optimized resource usage, allocation, and deployment of new AI frameworks are needed. By identifying compute… ▽ More High-Performance Computing (HPC) centers and cloud providers support an increasingly diverse set of applications on heterogenous hardware. As Artificial Intelligence (AI) and Machine Learning (ML) workloads have become an increasingly larger share of the compute workloads, new approaches to optimized resource usage, allocation, and deployment of new AI frameworks are needed. By identifying compute workloads and their utilization characteristics, HPC systems may be able to better match available resources with the application demand. By leveraging datacenter instrumentation, it may be possible to develop AI-based approaches that can identify workloads and provide feedback to researchers and datacenter operators for improving operational efficiency. To enable this research, we released the MIT Supercloud Dataset, which provides detailed monitoring logs from the MIT Supercloud cluster. This dataset includes CPU and GPU usage by jobs, memory usage, and file system logs. In this paper, we present a workload classification challenge based on this dataset. We introduce a labelled dataset that can be used to develop new approaches to workload classification and present initial results based on existing approaches. The goal of this challenge is to foster algorithmic innovations in the analysis of compute workloads that can achieve higher accuracy than existing methods. Data and code will be made publicly available via the Datacenter Challenge website : https://dcc.mit.edu. △ Less

Submitted 13 April, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

Comments: Accepted at IPDPS ADOPT'22

arXiv:2203.04874 [pdf, other]

doi 10.1109/IJCNN55064.2022.9892763

VGQ-CNN: Moving Beyond Fixed Cameras and Top-Grasps for Grasp Quality Prediction

Authors: A. Konrad, J. McDonald, R. Villing

Abstract: We present the Versatile Grasp Quality Convolutional Neural Network (VGQ-CNN), a grasp quality prediction network for 6-DOF grasps. VGQ-CNN can be used when evaluating grasps for objects seen from a wide range of camera poses or mobile robots without the need to retrain the network. By defining the grasp orientation explicitly as an input to the network, VGQ-CNN can evaluate 6-DOF grasp poses, mov… ▽ More We present the Versatile Grasp Quality Convolutional Neural Network (VGQ-CNN), a grasp quality prediction network for 6-DOF grasps. VGQ-CNN can be used when evaluating grasps for objects seen from a wide range of camera poses or mobile robots without the need to retrain the network. By defining the grasp orientation explicitly as an input to the network, VGQ-CNN can evaluate 6-DOF grasp poses, moving beyond the 4-DOF grasps used in most image-based grasp evaluation methods like GQ-CNN. To train VGQ-CNN, we generate the new Versatile Grasp dataset (VG-dset) containing 6-DOF grasps observed from a wide range of camera poses. VGQ-CNN achieves a balanced accuracy of 82.1% on our test-split while generalising to a variety of camera poses. Meanwhile, it achieves competitive performance for overhead cameras and top-grasps with a balanced accuracy of 74.2% compared to GQ-CNN's 76.6%. We also propose a modified network architecture, FAST-VGQ-CNN, that speeds up inference using a shared encoder architecture and can make 128 grasp quality predictions in 12ms on a CPU. Code and data are available at https://aucoroboticsmu.github.io/vgq-cnn/. △ Less

Submitted 23 June, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

Comments: Accepted for International Joint Conference on Neural Networks (IJCNN) 2022

arXiv:2201.12423 [pdf, other]

Benchmarking Resource Usage for Efficient Distributed Deep Learning

Authors: Nathan C. Frey, Baolin Li, Joseph McDonald, Dan Zhao, Michael Jones, David Bestor, Devesh Tiwari, Vijay Gadepally, Siddharth Samsi

Abstract: Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. Neural architecture searches, hyperparameter sweeps, and rapid prototy** consume immense resources that can prevent resource-constrained researchers from experimenting with large models and carry considerable environmental impact. As such, it becomes essential to understand how… ▽ More Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. Neural architecture searches, hyperparameter sweeps, and rapid prototy** consume immense resources that can prevent resource-constrained researchers from experimenting with large models and carry considerable environmental impact. As such, it becomes essential to understand how different deep neural networks (DNNs) and training leverage increasing compute and energy resources -- especially specialized computationally-intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains/tasks -- natural language processing, computer vision, and chemistry -- on up to 424 graphics processing units (GPUs). During training, our experiments systematically vary compute resource characteristics and energy-saving mechanisms such as power utilization and GPU clock rate limits to capture and illustrate the different trade-offs and scaling behaviors each representative model exhibits under various resource and energy-constrained regimes. We fit power law models that describe how training time scales with available compute resources and energy constraints. We anticipate that these findings will help inform and guide high-performance computing providers in optimizing resource utilization, by selectively reducing energy consumption for different deep learning tasks/workflows with minimal impact on training. △ Less

Submitted 28 January, 2022; originally announced January 2022.

Comments: 14 pages, 17 figures

arXiv:2112.03386 [pdf, other]

Guided Imitation of Task and Motion Planning

Authors: Michael James McDonald, Dylan Hadfield-Menell

Abstract: While modern policy optimization methods can do complex manipulation from sensory data, they struggle on problems with extended time horizons and multiple sub-goals. On the other hand, task and motion planning (TAMP) methods scale to long horizons but they are computationally expensive and need to precisely track world state. We propose a method that draws on the strength of both methods: we train… ▽ More While modern policy optimization methods can do complex manipulation from sensory data, they struggle on problems with extended time horizons and multiple sub-goals. On the other hand, task and motion planning (TAMP) methods scale to long horizons but they are computationally expensive and need to precisely track world state. We propose a method that draws on the strength of both methods: we train a policy to imitate a TAMP solver's output. This produces a feed-forward policy that can accomplish multi-step tasks from sensory data. First, we build an asynchronous distributed TAMP solver that can produce supervision data fast enough for imitation learning. Then, we propose a hierarchical policy architecture that lets us use partially trained control policies to speed up the TAMP solver. In robotic manipulation tasks with 7-DoF joint control, the partially trained policies reduce the time needed for planning by a factor of up to 2.6. Among these tasks, we can learn a policy that solves the RoboSuite 4-object pick-place task 88% of the time from object pose observations and a policy that solves the RoboDesk 9-goal benchmark 79% of the time from RGB images (averaged across the 9 disparate tasks). △ Less

Submitted 6 December, 2021; originally announced December 2021.

Comments: 16 pages, 6 figures, 2 tables, submitted to Conference on Robot Learning 2021, to be published in Proceedings of Machine Learning Research

arXiv:2112.03364 [pdf, other]

Scalable Geometric Deep Learning on Molecular Graphs

Authors: Nathan C. Frey, Siddharth Samsi, Joseph McDonald, Lin Li, Connor W. Coley, Vijay Gadepally

Abstract: Deep learning in molecular and materials sciences is limited by the lack of integration between applied science, artificial intelligence, and high-performance computing. Bottlenecks with respect to the amount of training data, the size and complexity of model architectures, and the scale of the compute infrastructure are all key factors limiting the scaling of deep learning for molecules and mater… ▽ More Deep learning in molecular and materials sciences is limited by the lack of integration between applied science, artificial intelligence, and high-performance computing. Bottlenecks with respect to the amount of training data, the size and complexity of model architectures, and the scale of the compute infrastructure are all key factors limiting the scaling of deep learning for molecules and materials. Here, we present $\textit{LitMatter}$, a lightweight framework for scaling molecular deep learning methods. We train four graph neural network architectures on over 400 GPUs and investigate the scaling behavior of these methods. Depending on the model architecture, training time speedups up to $60\times$ are seen. Empirical neural scaling relations quantify the model-dependent scaling and enable optimal compute resource allocation and the identification of scalable molecular geometric deep learning model implementations. △ Less

Submitted 6 December, 2021; originally announced December 2021.

Comments: 7 pages, 3 figures, NeurIPS 2021 AI for Science workshop

arXiv:2111.08398 [pdf, other]

2.5D Vehicle Odometry Estimation

Authors: Ciaran Eising, Leroy-Francisco Pereira, Jonathan Horgan, Anbuchezhiyan Selvaraju, John McDonald, Paul Moran

Abstract: It is well understood that in ADAS applications, a good estimate of the pose of the vehicle is required. This paper proposes a metaphorically named 2.5D odometry, whereby the planar odometry derived from the yaw rate sensor and four wheel speed sensors is augmented by a linear model of suspension. While the core of the planar odometry is a yaw rate model that is already understood in the literatur… ▽ More It is well understood that in ADAS applications, a good estimate of the pose of the vehicle is required. This paper proposes a metaphorically named 2.5D odometry, whereby the planar odometry derived from the yaw rate sensor and four wheel speed sensors is augmented by a linear model of suspension. While the core of the planar odometry is a yaw rate model that is already understood in the literature, we augment this by fitting a quadratic to the incoming signals, enabling interpolation, extrapolation, and a finer integration of the vehicle position. We show, by experimental results with a DGPS/IMU reference, that this model provides highly accurate odometry estimates, compared with existing methods. Utilising sensors that return the change in height of vehicle reference points with changing suspension configurations, we define a planar model of the vehicle suspension, thus augmenting the odometry model. We present an experimental framework and evaluations criteria by which the goodness of the odometry is evaluated and compared with existing methods. This odometry model has been designed to support low-speed surround-view camera systems that are well-known. Thus, we present some application results that show a performance boost for viewing and computer vision applications using the proposed odometry △ Less

Submitted 16 November, 2021; originally announced November 2021.

Comments: 13 pages, 16 figures, 2 tables

Journal ref: IET Intelligent Transport Systems, 2020

arXiv:2108.07736 [pdf, other]

A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving

Authors: Louis Gallagher, Varun Ravi Kumar, Senthil Yogamani, John B. McDonald

Abstract: In this paper, we present a system for incrementally reconstructing a dense 3D model of the geometry of an outdoor environment using a single monocular camera attached to a moving vehicle. Dense models provide a rich representation of the environment facilitating higher-level scene understanding, perception, and planning. Our system employs dense depth prediction with a hybrid map** architecture… ▽ More In this paper, we present a system for incrementally reconstructing a dense 3D model of the geometry of an outdoor environment using a single monocular camera attached to a moving vehicle. Dense models provide a rich representation of the environment facilitating higher-level scene understanding, perception, and planning. Our system employs dense depth prediction with a hybrid map** architecture combining state-of-the-art sparse features and dense fusion-based visual SLAM algorithms within an integrated framework. Our novel contributions include design of hybrid sparse-dense camera tracking and loop closure, and scale estimation improvements in dense depth prediction. We use the motion estimates from the sparse method to overcome the large and variable inter-frame displacement typical of outdoor vehicle scenarios. Our system then registers the live image with the dense model using whole-image alignment. This enables the fusion of the live frame and dense depth prediction into the model. Global consistency and alignment between the sparse and dense models are achieved by applying pose constraints from the sparse method directly within the deformation of the dense model. We provide qualitative and quantitative results for both trajectory estimation and surface reconstruction accuracy, demonstrating competitive performance on the KITTI dataset. Qualitative results of the proposed approach are illustrated in https://youtu.be/Pn2uaVqjskY. Source code for the project is publicly available at the following repository https://github.com/robotvisionmu/DenseMonoSLAM. △ Less

Submitted 17 August, 2021; originally announced August 2021.

Comments: 8 pages, 5 figures. To be published in the proceedings of the 10th European Conference on Mobile Robotics 2021

ACM Class: I.2.10

arXiv:2108.05163 [pdf, other]

Efficient Surfel Fusion Using Normalised Information Distance

Authors: Louis Gallagher, John B. McDonald

Abstract: We present a new technique that achieves a significant reduction in the quantity of measurements required for a fusion based dense 3D map** system to converge to an accurate, de-noised surface reconstruction. This is achieved through the use of a Normalised Information Distance metric, that computes the novelty of the information contained in each incoming frame with respect to the reconstructio… ▽ More We present a new technique that achieves a significant reduction in the quantity of measurements required for a fusion based dense 3D map** system to converge to an accurate, de-noised surface reconstruction. This is achieved through the use of a Normalised Information Distance metric, that computes the novelty of the information contained in each incoming frame with respect to the reconstruction, and avoids fusing those frames that exceed a redundancy threshold. This provides a principled approach for opitmising the trade-off between surface reconstruction accuracy and the computational cost of processing frames. The technique builds upon the ElasticFusion (EF) algorithm where we report results of the technique's scalability and the accuracy of the resultant maps by applying it to both the ICL-NUIM and TUM RGB-D datasets. These results demonstrate the capabilities of the approach in performing accurate surface reconstructions whilst utilising a fraction of the frames when compared to the original EF algorithm. △ Less

Submitted 11 August, 2021; originally announced August 2021.

Comments: 4 pages, 4 figures, presented at CVPR 2019 Workshop on 3D Scene Understanding for Vision, Graphics, and Robotics

ACM Class: I.2.10

arXiv:2108.02037 [pdf]

The MIT Supercloud Dataset

Authors: Siddharth Samsi, Matthew L Weiss, David Bestor, Baolin Li, Michael Jones, Albert Reuther, Daniel Edelman, William Arcand, Chansup Byun, John Holodnack, Matthew Hubbell, Jeremy Kepner, Anna Klein, Joseph McDonald, Adam Michaleas, Peter Michaleas, Lauren Milechin, Julia Mullen, Charles Yee, Benjamin Price, Andrew Prout, Antonio Rosa, Allan Vanterpool, Lindsey McEvoy, Anson Cheng , et al. (2 additional authors not shown)

Abstract: Artificial intelligence (AI) and Machine learning (ML) workloads are an increasingly larger share of the compute workloads in traditional High-Performance Computing (HPC) centers and commercial cloud systems. This has led to changes in deployment approaches of HPC clusters and the commercial cloud, as well as a new focus on approaches to optimized resource usage, allocations and deployment of new… ▽ More Artificial intelligence (AI) and Machine learning (ML) workloads are an increasingly larger share of the compute workloads in traditional High-Performance Computing (HPC) centers and commercial cloud systems. This has led to changes in deployment approaches of HPC clusters and the commercial cloud, as well as a new focus on approaches to optimized resource usage, allocations and deployment of new AI frame- works, and capabilities such as Jupyter notebooks to enable rapid prototy** and deployment. With these changes, there is a need to better understand cluster/datacenter operations with the goal of develo** improved scheduling policies, identifying inefficiencies in resource utilization, energy/power consumption, failure prediction, and identifying policy violations. In this paper we introduce the MIT Supercloud Dataset which aims to foster innovative AI/ML approaches to the analysis of large scale HPC and datacenter/cloud operations. We provide detailed monitoring logs from the MIT Supercloud system, which include CPU and GPU usage by jobs, memory usage, file system logs, and physical monitoring data. This paper discusses the details of the dataset, collection methodology, data availability, and discusses potential challenge problems being developed using this data. Datasets and future challenge announcements will be available via https://dcc.mit.edu. △ Less

Submitted 4 August, 2021; originally announced August 2021.

arXiv:2107.08246 [pdf, other]

Woodscape Fisheye Semantic Segmentation for Autonomous Driving -- CVPR 2021 OmniCV Workshop Challenge

Authors: Saravanabalagi Ramachandran, Ganesh Sistu, John McDonald, Senthil Yogamani

Abstract: We present the WoodScape fisheye semantic segmentation challenge for autonomous driving which was held as part of the CVPR 2021 Workshop on Omnidirectional Computer Vision (OmniCV). This challenge is one of the first opportunities for the research community to evaluate the semantic segmentation techniques targeted for fisheye camera perception. Due to strong radial distortion standard models don't… ▽ More We present the WoodScape fisheye semantic segmentation challenge for autonomous driving which was held as part of the CVPR 2021 Workshop on Omnidirectional Computer Vision (OmniCV). This challenge is one of the first opportunities for the research community to evaluate the semantic segmentation techniques targeted for fisheye camera perception. Due to strong radial distortion standard models don't generalize well to fisheye images and hence the deformations in the visual appearance of objects and entities needs to be encoded implicitly or as explicit knowledge. This challenge served as a medium to investigate the challenges and new methodologies to handle the complexities with perception on fisheye images. The challenge was hosted on CodaLab and used the recently released WoodScape dataset comprising of 10k samples. In this paper, we provide a summary of the competition which attracted the participation of 71 global teams and a total of 395 submissions. The top teams recorded significantly improved mean IoU and accuracy scores over the baseline PSPNet with ResNet-50 backbone. We summarize the methods of winning algorithms and analyze the failure cases. We conclude by providing future directions for the research. △ Less

Submitted 17 July, 2021; originally announced July 2021.

Comments: Workshop on Omnidirectional Computer Vision (OmniCV) at Conference on Computer Vision and Pattern Recognition (CVPR) 2021. Presentation video is available at https://youtu.be/xa7Fl2mD4CA?t=12253

arXiv:2107.07557 [pdf, other]

OdoViz: A 3D Odometry Visualization and Processing Tool

Authors: Saravanabalagi Ramachandran, John McDonald

Abstract: OdoViz is a reactive web-based tool for 3D visualization and processing of autonomous vehicle datasets designed to support common tasks in visual place recognition research. The system includes functionality for loading, inspecting, visualizing, and processing GPS/INS poses, point clouds and camera images. It supports a number of commonly used driving datasets and can be adapted to load custom dat… ▽ More OdoViz is a reactive web-based tool for 3D visualization and processing of autonomous vehicle datasets designed to support common tasks in visual place recognition research. The system includes functionality for loading, inspecting, visualizing, and processing GPS/INS poses, point clouds and camera images. It supports a number of commonly used driving datasets and can be adapted to load custom datasets with minimal effort. OdoViz's design consists of a slim server to serve the datasets coupled with a rich client frontend. This design supports multiple deployment configurations including single user stand-alone installations, research group installations serving datasets internally across a lab, or publicly accessible web-frontends for providing online interfaces for exploring and interacting with datasets. The tool allows viewing complete vehicle trajectories traversed at multiple different time periods simultaneously, facilitating tasks such as sub-sampling, comparing and finding pose correspondences both across and within sequences. This significantly reduces the effort required in creating subsets of data from existing datasets for machine learning tasks. Further to the above, the system also supports adding custom extensions and plugins to extend the capabilities of the software for other potential data management, visualization and processing tasks. The platform has been open-sourced to promote its use and encourage further contributions from the research community. △ Less

Submitted 15 July, 2021; originally announced July 2021.

Comments: Accepted, ITSC 2021

arXiv:2105.02679 [pdf, other]

A 2.5D Vehicle Odometry Estimation for Vision Applications

Authors: Paul Moran, Leroy-Francisco Periera, Anbuchezhiyan Selvaraju, Tejash Prakash, Pantelis Ermilios, John McDonald, Jonathan Horgan, Ciarán Eising

Abstract: This paper proposes a method to estimate the pose of a sensor mounted on a vehicle as the vehicle moves through the world, an important topic for autonomous driving systems. Based on a set of commonly deployed vehicular odometric sensors, with outputs available on automotive communication buses (e.g. CAN or FlexRay), we describe a set of steps to combine a planar odometry based on wheel sensors wi… ▽ More This paper proposes a method to estimate the pose of a sensor mounted on a vehicle as the vehicle moves through the world, an important topic for autonomous driving systems. Based on a set of commonly deployed vehicular odometric sensors, with outputs available on automotive communication buses (e.g. CAN or FlexRay), we describe a set of steps to combine a planar odometry based on wheel sensors with a suspension model based on linear suspension sensors. The aim is to determine a more accurate estimate of the camera pose. We outline its usage for applications in both visualisation and computer vision. △ Less

Submitted 6 May, 2021; originally announced May 2021.

Journal ref: Proceedings of the 2020 Irish Machine Vision and Image Processing Conference

arXiv:2104.12583 [pdf]

doi 10.1109/ITSC.2015.329

Vision-based Driver Assistance Systems: Survey, Taxonomy and Advances

Authors: Jonathan Horgan, Ciarán Hughes, John McDonald, Senthil Yogamani

Abstract: Vision-based driver assistance systems is one of the rapidly growing research areas of ITS, due to various factors such as the increased level of safety requirements in automotive, computational power in embedded systems, and desire to get closer to autonomous driving. It is a cross disciplinary area encompassing specialised fields like computer vision, machine learning, robotic navigation, embedd… ▽ More Vision-based driver assistance systems is one of the rapidly growing research areas of ITS, due to various factors such as the increased level of safety requirements in automotive, computational power in embedded systems, and desire to get closer to autonomous driving. It is a cross disciplinary area encompassing specialised fields like computer vision, machine learning, robotic navigation, embedded systems, automotive electronics and safety critical software. In this paper, we survey the list of vision based advanced driver assistance systems with a consistent terminology and propose a taxonomy. We also propose an abstract model in an attempt to formalize a top-down view of application development to scale towards autonomous driving system. △ Less

Submitted 26 April, 2021; originally announced April 2021.

Journal ref: 2015 IEEE 18th International Conference on Intelligent Transportation Systems

arXiv:2104.12537 [pdf, other]

doi 10.1016/j.imavis.2017.07.002

Computer vision in automated parking systems: Design, implementation and challenges

Authors: Markus Heimberger, Jonathan Horgan, Ciaran Hughes, John McDonald, Senthil Yogamani

Abstract: Automated driving is an active area of research in both industry and academia. Automated Parking, which is automated driving in a restricted scenario of parking with low speed manoeuvring, is a key enabling product for fully autonomous driving systems. It is also an important milestone from the perspective of a higher end system built from the previous generation driver assistance systems comprisi… ▽ More Automated driving is an active area of research in both industry and academia. Automated Parking, which is automated driving in a restricted scenario of parking with low speed manoeuvring, is a key enabling product for fully autonomous driving systems. It is also an important milestone from the perspective of a higher end system built from the previous generation driver assistance systems comprising of collision warning, pedestrian detection, etc. In this paper, we discuss the design and implementation of an automated parking system from the perspective of computer vision algorithms. Designing a low-cost system with functional safety is challenging and leads to a large gap between the prototype and the end product, in order to handle all the corner cases. We demonstrate how camera systems are crucial for addressing a range of automated parking use cases and also, to add robustness to systems based on active distance measuring sensors, such as ultrasonics and radar. The key vision modules which realize the parking use cases are 3D reconstruction, parking slot marking recognition, freespace and vehicle/pedestrian detection. We detail the important parking use cases and demonstrate how to combine the vision modules to form a robust parking system. To the best of the authors' knowledge, this is the first detailed discussion of a systemic view of a commercial automated parking system. △ Less

Submitted 26 April, 2021; originally announced April 2021.

Journal ref: Image and Vision Computing, Volume 68, December 2017, Pages 88-101

arXiv:2103.00191 [pdf, other]

FisheyeSuperPoint: Keypoint Detection and Description Network for Fisheye Images

Authors: Anna Konrad, Ciarán Eising, Ganesh Sistu, John McDonald, Rudi Villing, Senthil Yogamani

Abstract: Keypoint detection and description is a commonly used building block in computer vision systems particularly for robotics and autonomous driving. However, the majority of techniques to date have focused on standard cameras with little consideration given to fisheye cameras which are commonly used in urban driving and automated parking. In this paper, we propose a novel training and evaluation pipe… ▽ More Keypoint detection and description is a commonly used building block in computer vision systems particularly for robotics and autonomous driving. However, the majority of techniques to date have focused on standard cameras with little consideration given to fisheye cameras which are commonly used in urban driving and automated parking. In this paper, we propose a novel training and evaluation pipeline for fisheye images. We make use of SuperPoint as our baseline which is a self-supervised keypoint detector and descriptor that has achieved state-of-the-art results on homography estimation. We introduce a fisheye adaptation pipeline to enable training on undistorted fisheye images. We evaluate the performance on the HPatches benchmark, and, by introducing a fisheye based evaluation method for detection repeatability and descriptor matching correctness, on the Oxford RobotCar dataset. △ Less

Submitted 29 November, 2021; v1 submitted 27 February, 2021; originally announced March 2021.

Comments: Accepted for Presentation at International Conference on Computer Vision Theory and Applications (VISAPP 2022)

arXiv:2003.13784 [pdf, other]

A Sampling Theorem for Deconvolution in Two Dimensions

Authors: Joseph McDonald, Brett Bernstein, Carlos Fernandez-Granda

Abstract: This work studies the problem of estimating a two-dimensional superposition of point sources or spikes from samples of their convolution with a Gaussian kernel. Our results show that minimizing a continuous counterpart of the $\ell_1$ norm exactly recovers the true spikes if they are sufficiently separated, and the samples are sufficiently dense. In addition, we provide numerical evidence that our… ▽ More This work studies the problem of estimating a two-dimensional superposition of point sources or spikes from samples of their convolution with a Gaussian kernel. Our results show that minimizing a continuous counterpart of the $\ell_1$ norm exactly recovers the true spikes if they are sufficiently separated, and the samples are sufficiently dense. In addition, we provide numerical evidence that our results extend to non-Gaussian kernels relevant to microscopy and telescopy. △ Less

Submitted 4 August, 2020; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: 41 pages, 18 figures; added references and additional comments and description throughout for clarity

arXiv:2001.05051 [pdf, ps, other]

Strong coloring 2-regular graphs: Cycle restrictions and partial colorings

Authors: Jessica McDonald, Gregory J. Puleo

Abstract: Let $H$ be a graph with $Δ(H) \leq 2$, and let $G$ be obtained from $H$ by gluing in vertex-disjoint copies of $K_4$. We prove that if $H$ contains at most one odd cycle of length exceeding $3$, or if $H$ contains at most $3$ triangles, then $χ(G) \leq 4$. This proves the Strong Coloring Conjecture for such graphs $H$. For graphs $H$ with $Δ=2$ that are not covered by our theorem, we prove an appr… ▽ More Let $H$ be a graph with $Δ(H) \leq 2$, and let $G$ be obtained from $H$ by gluing in vertex-disjoint copies of $K_4$. We prove that if $H$ contains at most one odd cycle of length exceeding $3$, or if $H$ contains at most $3$ triangles, then $χ(G) \leq 4$. This proves the Strong Coloring Conjecture for such graphs $H$. For graphs $H$ with $Δ=2$ that are not covered by our theorem, we prove an approximation result towards the conjecture. △ Less

Submitted 6 July, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

Comments: 17 pages, 2 figures

MSC Class: 05C15

arXiv:1907.05966 [pdf, ps, other]

doi 10.20429/tag.2021.080205

Upper bounds for inverse domination in graphs

Authors: Elliot Krop, Jessica McDonald, Gregory J. Puleo

Abstract: In any graph $G$, the domination number $γ(G)$ is at most the independence number $α(G)$. The Inverse Domination Conjecture says that, in any isolate-free $G$, there exists pair of vertex-disjoint dominating sets $D, D'$ with $|D|=γ(G)$ and $|D'| \leq α(G)$. Here we prove that this statement is true if the upper bound $α(G)$ is replaced by $\frac{3}{2}α(G) - 1$ (and $G$ is not a clique). We also p… ▽ More In any graph $G$, the domination number $γ(G)$ is at most the independence number $α(G)$. The Inverse Domination Conjecture says that, in any isolate-free $G$, there exists pair of vertex-disjoint dominating sets $D, D'$ with $|D|=γ(G)$ and $|D'| \leq α(G)$. Here we prove that this statement is true if the upper bound $α(G)$ is replaced by $\frac{3}{2}α(G) - 1$ (and $G$ is not a clique). We also prove that the conjecture holds whenever $γ(G)\leq 5$ or $|V(G)|\leq 16$. △ Less

Submitted 12 July, 2019; originally announced July 2019.

Comments: 9 pages

MSC Class: 05C69

Journal ref: Theory and Applications of Graphs, 8(2): Article 5, (2021)

arXiv:1901.07507 [pdf]

Investigating 3D Printer Residual Data

Authors: Daniel Bradford Miller, Jacob Gatlin, William Bradley Glisson, Mark Yampolskiy, Jeffrey Todd McDonald

Abstract: The continued adoption of Additive Manufacturing technologies is raising concerns in the security, forensics, and intelligence gathering communities. These concerns range from identifying and mitigating compromised devices, to theft of intellectual property, to sabotage, to the production of prohibited objects. Previous research has provided insight into the retrieval of configuration information… ▽ More The continued adoption of Additive Manufacturing technologies is raising concerns in the security, forensics, and intelligence gathering communities. These concerns range from identifying and mitigating compromised devices, to theft of intellectual property, to sabotage, to the production of prohibited objects. Previous research has provided insight into the retrieval of configuration information maintained on the devices, but this work shows that the devices can additionally maintain information about the print process. Comparisons between before and after images taken from an AM device reveal details about the device's activities, including printed designs, menu interactions, and the print history. Patterns in the storage of that information also may be useful for reducing the amount of data that needs to be examined during an investigation. These results provide a foundation for future investigations regarding the tools and processes suitable for examining these devices. △ Less

Submitted 22 January, 2019; originally announced January 2019.

Comments: Presented at the 52nd Hawaii International Conference on System Sciences. Jan 8-11, 2019. Wailea, HI, USA. 10 pages, 3 figures. Persistent link https://hdl.handle.net/10125/60154

arXiv:1811.07632 [pdf, other]

Collaborative Dense SLAM

Authors: Louis Gallagher, John B. McDonald

Abstract: In this paper, we present a new system for live collaborative dense surface reconstruction. Cooperative robotics, multi participant augmented reality and human-robot interaction are all examples of situations where collaborative map** can be leveraged for greater agent autonomy. Our system builds on ElasticFusion to allow a number of cameras starting with unknown initial relative positions to ma… ▽ More In this paper, we present a new system for live collaborative dense surface reconstruction. Cooperative robotics, multi participant augmented reality and human-robot interaction are all examples of situations where collaborative map** can be leveraged for greater agent autonomy. Our system builds on ElasticFusion to allow a number of cameras starting with unknown initial relative positions to maintain local maps utilising the original algorithm. Carrying out visual place recognition across these local maps the system can identify when two maps overlap in space, providing an inter-map constraint from which the system can derive the relative poses of the two maps. Using these resulting pose constraints, our system performs map merging, allowing multiple cameras to fuse their measurements into a single shared reconstruction. The advantage of this approach is that it avoids replication of structures subsequent to loop closures, where multiple cameras traverse the same regions of the environment. Furthermore, it allows cameras to directly exploit and update regions of the environment previously mapped by other cameras within the system. We provide both quantitative and qualitative analyses using the synthetic ICL-NUIM dataset and the real-world Freiburg dataset including the impact of multi-camera map** on surface reconstruction accuracy, camera pose estimation accuracy and overall processing time. We also include qualitative results in the form of sample reconstructions of room sized environments with up to 3 cameras undergoing intersecting and loopy trajectories. △ Less

Submitted 21 November, 2018; v1 submitted 19 November, 2018; originally announced November 2018.

Comments: 10 pages, 3 figures

ACM Class: I.2.10

arXiv:1805.07376 [pdf, other]

Algorithms for Estimating Trends in Global Temperature Volatility

Authors: Arash Khodadadi, Daniel J McDonald

Abstract: Trends in terrestrial temperature variability are perhaps more relevant for species viability than trends in mean temperature. In this paper, we develop methodology for estimating such trends using multi-resolution climate data from polar orbiting weather satellites. We derive two novel algorithms for computation that are tailored for dense, gridded observations over both space and time. We evalua… ▽ More Trends in terrestrial temperature variability are perhaps more relevant for species viability than trends in mean temperature. In this paper, we develop methodology for estimating such trends using multi-resolution climate data from polar orbiting weather satellites. We derive two novel algorithms for computation that are tailored for dense, gridded observations over both space and time. We evaluate our methods with a simulation that mimics these data's features and on a large, publicly available, global temperature dataset with the eventual goal of tracking trends in cloud reflectance temperature variability. △ Less

Submitted 19 January, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

Comments: Published in AAAI-19

arXiv:1709.05026 [pdf]

doi 10.24251/HICSS.2017.441

Attack-Graph Threat Modeling Assessment of Ambulatory Medical Devices

Authors: Patrick Luckett, J Todd McDonald, William Bradley Glisson

Abstract: The continued integration of technology into all aspects of society stresses the need to identify and understand the risk associated with assimilating new technologies. This necessity is heightened when technology is used for medical purposes like ambulatory devices that monitor a patient's vital signs. This integration creates environments that are conducive to malicious activities. The potential… ▽ More The continued integration of technology into all aspects of society stresses the need to identify and understand the risk associated with assimilating new technologies. This necessity is heightened when technology is used for medical purposes like ambulatory devices that monitor a patient's vital signs. This integration creates environments that are conducive to malicious activities. The potential impact presents new challenges for the medical community. Hence, this research presents attack graph modeling as a viable solution to identifying vulnerabilities, assessing risk, and forming mitigation strategies to defend ambulatory medical devices from attackers. Common and frequent vulnerabilities and attack strategies related to the various aspects of ambulatory devices, including Bluetooth enabled sensors and Android applications are identified in the literature. Based on this analysis, this research presents an attack graph modeling example on a theoretical device that highlights vulnerabilities and mitigation strategies to consider when designing ambulatory devices with similar components. △ Less

Submitted 14 September, 2017; originally announced September 2017.

arXiv:1602.07536 [pdf]

Towards Security of Additive Layer Manufacturing

Authors: Mark Yampolskiy, Todd R. Andel, J. Todd McDonald, William B. Glisson, Alec Yasinsac

Abstract: Additive Layer Manufacturing (ALM), also broadly known as 3D printing, is a new technology to produce 3D objects. As an opposite approach to the conventional subtractive manufacturing process, 3D objects are created by adding thin material layers over layers. Until recently, they have been used, mainly, for plastic models. However, the technology has evolved making it possible to use high-quality… ▽ More Additive Layer Manufacturing (ALM), also broadly known as 3D printing, is a new technology to produce 3D objects. As an opposite approach to the conventional subtractive manufacturing process, 3D objects are created by adding thin material layers over layers. Until recently, they have been used, mainly, for plastic models. However, the technology has evolved making it possible to use high-quality printing with metal alloys. Agencies and companies like NASA, ESA, Boeing, Airbus, etc. are investigating various ALM technology application areas. Recently, SpaceX used additive manufacturing to produce engine chambers for the newest Dragon spacecraft. BAE System plans to print on-demand a complete Unmanned Aerial Vehicle (UAV), depending on the operational requirements. Companies expect the implementation of ALM technology will bring a broad variety of technological and economic benefits. This includes, but not limited to, the reduction of the time needed to produce complex parts, reduction of wasted material and thus control of production costs along with minimization of part storage space as companies implement just-in-time and on-demand production solutions. The broad variety of application areas and a high grade of computerization of the manufacturing process will inevitably make ALM an attractive target for various attacks. △ Less

Submitted 13 January, 2015; originally announced February 2016.

arXiv:1308.6401 [pdf]

doi 10.5772/56570

A Synergistic Approach for Recovering Occlusion-Free Textured 3D Maps of Urban Facades from Heterogeneous Cartographic Data

Authors: Karim Hammoudi, Fadi Dornaika, Bahman Soheilian, Bruno Vallet, John McDonald, Nicolas Paparoditis

Abstract: In this paper we present a practical approach for generating an occlusion-free textured 3D map of urban facades by the synergistic use of terrestrial images, 3D point clouds and area-based information. Particularly in dense urban environments, the high presence of urban objects in front of the facades causes significant difficulties for several stages in computational building modeling. Major chal… ▽ More In this paper we present a practical approach for generating an occlusion-free textured 3D map of urban facades by the synergistic use of terrestrial images, 3D point clouds and area-based information. Particularly in dense urban environments, the high presence of urban objects in front of the facades causes significant difficulties for several stages in computational building modeling. Major challenges lie on the one hand in extracting complete 3D facade quadrilateral delimitations and on the other hand in generating occlusion-free facade textures. For these reasons, we describe a straightforward approach for completing and recovering facade geometry and textures by exploiting the data complementarity of terrestrial multi-source imagery and area-based information. △ Less

Submitted 29 August, 2013; originally announced August 2013.

ACM Class: I.4; I.5

Journal ref: International Journal of Advanced Robotic Systems, vol. 10, 10p., 2013

arXiv:1212.0463 [pdf, other]

Nonparametric risk bounds for time-series forecasting

Authors: Daniel J. McDonald, Cosma Rohilla Shalizi, Mark Schervish

Abstract: We derive generalization error bounds for traditional time-series forecasting models. Our results hold for many standard forecasting tools including autoregressive models, moving average models, and, more generally, linear state-space models. These non-asymptotic bounds need only weak assumptions on the data-generating process, yet allow forecasters to select among competing models and to guarante… ▽ More We derive generalization error bounds for traditional time-series forecasting models. Our results hold for many standard forecasting tools including autoregressive models, moving average models, and, more generally, linear state-space models. These non-asymptotic bounds need only weak assumptions on the data-generating process, yet allow forecasters to select among competing models and to guarantee, with high probability, that their chosen model will perform well. We motivate our techniques with and apply them to standard economic and financial forecasting tools---a GARCH model for predicting equity volatility and a dynamic stochastic general equilibrium model (DSGE), the standard tool in macroeconomic forecasting. We demonstrate in particular how our techniques can aid forecasters and policy makers in choosing models which behave well under uncertainty and mis-specification. △ Less

Submitted 10 September, 2016; v1 submitted 3 December, 2012; originally announced December 2012.

Comments: 34 pages, 3 figures

MSC Class: 62M20 (Primary) 91B84; 62G99 (Secondary)

Journal ref: Journal of Machine Learning Research. (2017). Vol 18. p. 1-40

arXiv:1106.0730 [pdf, ps, other]

Rademacher complexity of stationary sequences

Authors: Daniel J. McDonald, Cosma Rohilla Shalizi

Abstract: We show how to control the generalization error of time series models wherein past values of the outcome are used to predict future values. The results are based on a generalization of standard i.i.d. concentration inequalities to dependent data without the mixing assumptions common in the time series setting. Our proof and the result are simpler than previous analyses with dependent data or stoch… ▽ More We show how to control the generalization error of time series models wherein past values of the outcome are used to predict future values. The results are based on a generalization of standard i.i.d. concentration inequalities to dependent data without the mixing assumptions common in the time series setting. Our proof and the result are simpler than previous analyses with dependent data or stochastic adversaries which use sequential Rademacher complexities rather than the expected Rademacher complexity for i.i.d. processes. We also derive empirical Rademacher results without mixing assumptions resulting in fully calculable upper bounds. △ Less

Submitted 22 May, 2017; v1 submitted 3 June, 2011; originally announced June 2011.

Comments: 15 pages, 1 figure

arXiv:1103.0942 [pdf, other]

Generalization error bounds for stationary autoregressive models

Authors: Daniel J. McDonald, Cosma Rohilla Shalizi, Mark Schervish

Abstract: We derive generalization error bounds for stationary univariate autoregressive (AR) models. We show that imposing stationarity is enough to control the Gaussian complexity without further regularization. This lets us use structural risk minimization for model selection. We demonstrate our methods by predicting interest rate movements. We derive generalization error bounds for stationary univariate autoregressive (AR) models. We show that imposing stationarity is enough to control the Gaussian complexity without further regularization. This lets us use structural risk minimization for model selection. We demonstrate our methods by predicting interest rate movements. △ Less

Submitted 3 June, 2011; v1 submitted 4 March, 2011; originally announced March 2011.

Comments: 10 pages, 3 figures. CMU Statistics Technical Report

arXiv:1103.0941 [pdf, ps, other]

Estimating $β$-mixing coefficients

Authors: Daniel J. McDonald, Cosma Rohilla Shalizi, Mark Schervish

Abstract: The literature on statistical learning for time series assumes the asymptotic independence or ``mixing' of the data-generating process. These mixing assumptions are never tested, nor are there methods for estimating mixing rates from data. We give an estimator for the $β$-mixing rate based on a single stationary sample path and show it is $L_1$-risk consistent. The literature on statistical learning for time series assumes the asymptotic independence or ``mixing' of the data-generating process. These mixing assumptions are never tested, nor are there methods for estimating mixing rates from data. We give an estimator for the $β$-mixing rate based on a single stationary sample path and show it is $L_1$-risk consistent. △ Less

Submitted 4 March, 2011; originally announced March 2011.

Comments: 9 pages, accepted by AIStats. CMU Statistics Technical Report

Journal ref: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011), pp. 516--524

Showing 1–43 of 43 results for author: McDonald, J