Segment anything, from space?

Authors: Simiao Ren, Francesco Luzi, Saad Lahrichi, Kaleb Kassaw, Leslie M. Collins, Kyle Bradbury, Jordan M. Malof

Abstract: Recently, the first foundation model developed specifically for image segmentation tasks was developed, termed the "Segment Anything Model" (SAM). SAM can segment objects in input imagery based on cheap input prompts, such as one (or more) points, a bounding box, or a mask. The authors examined the \textit{zero-shot} image segmentation accuracy of SAM on a large number of vision benchmark tasks and found that SAM usually achieved recognition accuracy similar to, or sometimes exceeding, vision models that had been trained on the target tasks. The impressive generalization of SAM for segmentation has major implications for vision researchers working on natural imagery. In this work, we examine whether SAM's performance extends to overhead imagery problems and help guide the community's response to its development. We examine SAM's performance on a set of diverse and widely studied benchmark tasks. We find that SAM does often generalize well to overhead imagery, although it fails in some cases due to the unique characteristics of overhead imagery and its common target objects. We report on these unique systematic failure cases for remote sensing imagery that may comprise useful future research for the community.

Submitted 9 November, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

Comments: Work accepted at WACV 2024, this is only a pre-print, please go to WACV website for the official version

arXiv:2212.12824 [pdf, other]

Meta-Learning for Color-to-Infrared Cross-Modal Style Transfer

Authors: Evelyn A. Stump, Francesco Luzi, Leslie M. Collins, Jordan M. Malof

Abstract: Recent object detection models for infrared (IR) imagery are based upon deep neural networks (DNNs) and require large amounts of labeled training imagery. However, publicly-available datasets that can be used for such training are limited in their size and diversity. To address this problem, we explore cross-modal style transfer (CMST) to leverage large and diverse color imagery datasets so that they can be used to train DNN-based IR image based object detectors. We evaluate six contemporary stylization methods on four publicly-available IR datasets - the first comparison of its kind - and find that CMST is highly effective for DNN-based detectors. Surprisingly, we find that existing data-driven methods are outperformed by a simple grayscale stylization (an average of the color channels). Our analysis reveals that existing data-driven methods are either too simplistic or introduce significant artifacts into the imagery. To overcome these limitations, we propose meta-learning style transfer (MLST), which learns a stylization by composing and tuning well-behaved analytic functions. We find that MLST leads to more complex stylizations without introducing significant image artifacts and achieves the best overall detector performance on our benchmark datasets.

Submitted 24 December, 2022; originally announced December 2022.

arXiv:2211.14366 [pdf, other]

doi 10.1609/aaai.v37i8.26178

Mixture Manifold Networks: A Computationally Efficient Baseline for Inverse Modeling

Authors: Gregory P. Spell, Simiao Ren, Leslie M. Collins, Jordan M. Malof

Abstract: We propose and show the efficacy of a new method to address generic inverse problems. Inverse modeling is the task whereby one seeks to determine the control parameters of a natural system that produce a given set of observed measurements. Recent work has shown impressive results using deep learning, but we note that there is a trade-off between model performance and computational time. For some applications, the computational time at inference for the best performing inverse modeling method may be overly prohibitive to its use. We present a new method that leverages multiple manifolds as a mixture of backward (e.g., inverse) models in a forward-backward model architecture. These multiple backwards models all share a common forward model, and their training is mitigated by generating training examples from the forward model. The proposed method thus has two innovations: 1) the multiple Manifold Mixture Network (MMN) architecture, and 2) the training procedure involving augmenting backward model training data using the forward model. We demonstrate the advantages of our method by comparing to several baselines on four benchmark inverse problems, and we furthermore provide analysis to motivate its design.

Submitted 25 November, 2022; originally announced November 2022.

Comments: This paper has been accepted to AAAI 2023; this is not the final version

arXiv:2209.08685 [pdf, other]

Meta-simulation for the Automated Design of Synthetic Overhead Imagery

Authors: Handi Yu, Simiao Ren, Leslie M. Collins, Jordan M. Malof

Abstract: The use of synthetic (or simulated) data for training machine learning models has grown rapidly in recent years. Synthetic data can often be generated much faster and more cheaply than its real-world counterpart. One challenge of using synthetic imagery however is scene design: e.g., the choice of content and its features and spatial arrangement. To be effective, this design must not only be realistic, but appropriate for the target domain, which (by assumption) is unlabeled. In this work, we propose an approach to automatically choose the design of synthetic imagery based upon unlabeled real-world imagery. Our approach, termed Neural-Adjoint Meta-Simulation (NAMS), builds upon the seminal recent meta-simulation approaches. In contrast to the current state-of-the-art methods, our approach can be pre-trained once offline, and then provides fast design inference for new target imagery. Using both synthetic and real-world problems, we show that NAMS infers synthetic designs that match both the in-domain and out-of-domain target imagery, and that training segmentation models with NAMS-designed imagery yields superior results compared to naïve randomized designs and state-of-the-art meta-simulation methods.

Submitted 26 October, 2022; v1 submitted 18 September, 2022; originally announced September 2022.

arXiv:2108.05929 [pdf]

Parameter Tuning of Time-Frequency Masking Algorithms for Reverberant Artifact Removal within the Cochlear Implant Stimulus

Authors: Lidea K. Shahidi, Leslie M. Collins, Boyla O. Mainsah

Abstract: Cochlear implant users struggle to understand speech in reverberant environments. To restore speech perception, artifacts dominated by reverberant reflections can be removed from the cochlear implant stimulus. Artifacts can be identified and removed by applying a matrix of gain values, a technique referred to as time-frequency masking. Gain values are determined by an oracle algorithm that uses knowledge of the undistorted signal to minimize retention of the signal components dominated by reverberant reflections. In practice, gain values are estimated from the distorted signal, with the oracle algorithm providing the estimation objective. Different oracle techniques exist for determining gain values, and each technique must be parameterized to set the amount of signal retention. This work assesses which oracle masking strategies and parameterizations lead to the best improvements in speech intelligibility for cochlear implant users in reverberant conditions using online speech intelligibility testing of normal-hearing individuals with vocoding.

Submitted 12 August, 2021; originally announced August 2021.

Comments: 5 pages, 4 figures

arXiv:2105.14135 [pdf]

Phoneme-Based Ratio Mask Estimation for Reverberant Speech Enhancement in Cochlear Implant Processors

Authors: Kevin M. Chu, Leslie M. Collins, Boyla O. Mainsah

Abstract: Cochlear implant (CI) users have considerable difficulty in understanding speech in reverberant listening environments. Time-frequency (T-F) masking is a common technique that aims to improve speech intelligibility by multiplying reverberant speech by a matrix of gain values to suppress T-F bins dominated by reverberation. Recently proposed mask estimation algorithms leverage machine learning approaches to distinguish between target speech and reverberant reflections. However, the spectro-temporal structure of speech is highly variable and dependent on the underlying phoneme. One way to potentially overcome this variability is to leverage explicit knowledge of phonemic information during mask estimation. This study proposes a phoneme-based mask estimation algorithm, where separate mask estimation models are trained for each phoneme. Sentence recognition tests were conducted in normal hearing listeners to determine whether a phoneme-based mask estimation algorithm is beneficial in the ideal scenario where perfect knowledge of the phoneme is available. The results showed that the phoneme-based masks improved the intelligibility of vocoded speech when compared to conventional phoneme-independent masks. The results suggest that a phoneme-based speech enhancement strategy may potentially benefit CI users in reverberant listening environments.

Submitted 28 May, 2021; originally announced May 2021.

arXiv:2105.14120 [pdf]

Assessing the intelligibility of vocoded speech using a remote testing framework

Authors: Kevin M. Chu, Leslie M. Collins, Boyla O. Mainsah

Abstract: Over the past year, remote speech intelligibility testing has become a popular and necessary alternative to traditional in-person experiments due to the need for physical distancing during the COVID-19 pandemic. A remote framework was developed for conducting speech intelligibility tests with normal hearing listeners. In this study, subjects used their personal computers to complete sentence recognition tasks in anechoic and reverberant listening environments. The results obtained using this remote framework were compared with previously collected in-lab results, and showed higher levels of speech intelligibility among remote study participants than subjects who completed the test in the laboratory.

Submitted 28 May, 2021; originally announced May 2021.

arXiv:2105.02625 [pdf]

Evaluating the Effect of Longitudinal Dose and INR Data on Maintenance Warfarin Dose Predictions

Authors: Anish Karpurapu, Adam Krekorian, Ye Tian, Leslie M. Collins, Ravi Karra, Aaron Franklin, Boyla O. Mainsah

Abstract: Warfarin, a commonly prescribed drug to prevent blood clots, has a highly variable individual response. Determining a maintenance warfarin dose that achieves a therapeutic blood clotting time, as measured by the international normalized ratio (INR), is crucial in preventing complications. Machine learning algorithms are increasingly being used for warfarin dosing; usually, an initial dose is predicted with clinical and genotype factors, and this dose is revised after a few days based on previous doses and current INR. Since a sequence of prior doses and INR better captures the variability in individual warfarin response, we hypothesized that longitudinal dose response data will improve maintenance dose predictions. To test this hypothesis, we analyzed a dataset from the COAG warfarin dosing study, which includes clinical data, warfarin doses and INR measurements over the study period, and maintenance dose when therapeutic INR was achieved. Various machine learning regression models to predict maintenance warfarin dose were trained with clinical factors and dosing history and INR data as features. Overall, dose revision algorithms with a single dose and INR achieved performance comparable to the baseline dose revision algorithm. In contrast, dose revision algorithms with longitudinal dose and INR data provided maintenance dose predictions that were statistically significantly much closer to the true maintenance dose. Focusing on the best performing model, gradient boosting (GB), the proportion of ideal estimated dose, i.e., defined as within $\pm$20% of the true dose, increased from the baseline (54.92%) to the GB model with the single (63.11%) and longitudinal (75.41%) INR. More accurate maintenance dose predictions with longitudinal dose response data can potentially achieve therapeutic INR faster, reduce drug-related complications and improve patient outcomes with warfarin therapy.

Submitted 6 May, 2021; originally announced May 2021.

arXiv:2101.06390 [pdf, other]

GridTracer: Automatic Mapping of Power Grids using Deep Learning and Overhead Imagery

Authors: Bohao Huang, Jichen Yang, Artem Streltsov, Kyle Bradbury, Leslie M. Collins, Jordan Malof

Abstract: Energy system information valuable for electricity access planning such as the locations and connectivity of electricity transmission and distribution towers, termed the power grid, is often incomplete, outdated, or altogether unavailable. Furthermore, conventional means for collecting this information are costly and limited. We propose to automatically map the grid in overhead remotely sensed imagery using deep learning. Towards this goal, we develop and publicly-release a large dataset ($263km^2$) of overhead imagery with ground truth for the power grid; to our knowledge this is the first dataset of its kind in the public domain. Additionally, we propose scoring metrics and baseline algorithms for two grid mapping tasks: (1) tower recognition and (2) power line interconnection (i.e., estimating a graph representation of the grid). We hope the availability of the training data, scoring metrics, and baselines will facilitate rapid progress on this important problem to help decision-makers address the energy needs of societies around the world.

Submitted 16 January, 2021; originally announced January 2021.

arXiv:2005.05880 [pdf]

The Micro-Randomized Trial for Developing Digital Interventions: Experimental Design Considerations

Authors: Ashley E. Walton, Linda M. Collins, Predrag Klasnja, Inbal Nahum-Shani, Mashfiqui Rabbi, Maureen A. Walton, Susan A. Murphy

Abstract: Just-in-time adaptive interventions (JITAIs) are time-varying adaptive interventions that use frequent opportunities for the intervention to be adapted such as weekly, daily, or even many times a day. This high intensity of adaptation is facilitated by the ability of digital technology to continuously collect information about an individual's current context and deliver treatments adapted to this information. The micro-randomized trial (MRT) has emerged for use in informing the construction of JITAIs. MRTs operate in, and take advantage of, the rapidly time-varying digital intervention environment. MRTs can be used to address research questions about whether and under what circumstances particular components of a JITAI are effective, with the ultimate objective of developing effective and efficient components. The purpose of this article is to clarify why, when, and how to use MRTs; to highlight elements that must be considered when designing and implementing an MRT; and to discuss the possibilities this emerging optimization trial design offers for future research in the behavioral sciences, education, and other fields. We briefly review key elements of JITAIs, and then describe three case studies of MRTs, each of which highlights research questions that can be addressed using the MRT and experimental design considerations that might arise. We also discuss a variety of considerations that go into planning and designing an MRT, using the case studies as examples.

Submitted 23 April, 2020; originally announced May 2020.

MSC Class: 62P15

arXiv:1806.01349 [pdf]

gprHOG and the popularity of Histogram of Oriented Gradients (HOG) for Buried Threat Detection in Ground-Penetrating Radar

Authors: Daniel Reichman, Leslie M. Collins, Jordan M. Malof

Abstract: Substantial research has been devoted to the development of algorithms that automate buried threat detection (BTD) with ground penetrating radar (GPR) data, resulting in a large number of proposed algorithms. One popular algorithm for GPR-based BTD, originally applied by Torrione et al., 2012, is the Histogram of Oriented Gradients (HOG) feature. In a recent large-scale comparison among five veteran institutions, a modified version of HOG, referred to here as "gprHOG", performed poorly compared to other modern algorithms. In this paper, we provide experimental evidence demonstrating that the modifications to HOG that comprise gprHOG result in a substantially better-performing algorithm. The results here, in conjunction with the large-scale algorithm comparison, suggest that HOG is not competitive with modern GPR-based BTD algorithms. Given HOG's popularity, these results raise some questions about many existing studies, and suggest gprHOG (and especially HOG) should be employed with caution in future studies.

Submitted 2 October, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

Comments: 5 pages, 6 figures, letter

arXiv:1805.12219 [pdf]

Tiling and Stitching Segmentation Output for Remote Sensing: Basic Challenges and Recommendations

Authors: Bohao Huang, Daniel Reichman, Leslie M. Collins, Kyle Bradbury, Jordan M. Malof

Abstract: In this work we consider the application of convolutional neural networks (CNNs) for pixel-wise labeling (a.k.a., semantic segmentation) of remote sensing imagery (e.g., aerial color or hyperspectral imagery). Remote sensing imagery is usually stored in the form of very large images, referred to as "tiles", which are too large to be segmented directly using most CNNs and their associated hardware. As a result, during label inference, smaller sub-images, called "patches", are processed individually and then "stitched" (concatenated) back together to create a tile-sized label map. This approach suffers from computational inefficiency and can result in discontinuities at output boundaries. We propose a simple alternative approach in which the input size of the CNN is dramatically increased only during label inference. This does not avoid stitching altogether, but substantially mitigates its limitations. We evaluate the performance of the proposed approach against a conventional stitching approach using two popular segmentation CNN models and two large-scale remote sensing imagery datasets. The results suggest that the proposed approach substantially reduces label inference time, while also yielding modest overall label accuracy increases. This approach contributed to our winning entry (overall performance) in the INRIA building labeling competition.

Submitted 25 February, 2019; v1 submitted 30 May, 2018; originally announced May 2018.

arXiv:1803.03729 [pdf]

doi 10.1109/TGRS.2019.2909665

A Large-Scale Multi-Institutional Evaluation of Advanced Discrimination Algorithms for Buried Threat Detection in Ground Penetrating Radar

Authors: Jordan M. Malof, Daniel Reichman, Andrew Karem, Hichem Frigui, Dominic K. C. Ho, Joseph N. Wilson, Wen-Hsiung Lee, William Cummings, Leslie M. Collins

Abstract: In this paper we consider the development of algorithms for the automatic detection of buried threats using ground penetrating radar (GPR) measurements. GPR is one of the most studied and successful modalities for automatic buried threat detection (BTD), and a large variety of BTD algorithms have been proposed for it. Despite this, large-scale comparisons of GPR-based BTD algorithms are rare in the literature. In this work we report the results of a multi-institutional effort to develop advanced buried threat detection algorithms for a real-world GPR BTD system. The effort involved five institutions with substantial experience with the development of GPR-based BTD algorithms. In this paper we report the technical details of the advanced algorithms submitted by each institution, representing their latest technical advances, and many state-of-the-art GPR-based BTD algorithms. We also report the results of evaluating the algorithms from each institution on the large experimental dataset used for development. The experimental dataset comprised 120,000 m^2 of GPR data using surface area, from 13 different lanes across two US test sites. The data was collected using a vehicle-mounted GPR system, the variants of which have supplied data for numerous publications. Using these results, we identify the most successful and common processing strategies among the submitted algorithms, and make recommendations for GPR-based BTD algorithm design.

Submitted 7 June, 2018; v1 submitted 9 March, 2018; originally announced March 2018.

Comments: IEEE Transactions on Geoscience and Remote Sensing (2019)

arXiv:1801.04018 [pdf]

Application of a semantic segmentation convolutional neural network for accurate automatic detection and mapping of solar photovoltaic arrays in aerial imagery

Authors: Joseph Camilo, Rui Wang, Leslie M. Collins, Kyle Bradbury, Jordan M. Malof

Abstract: We consider the problem of automatically detecting small-scale solar photovoltaic arrays for behind-the-meter energy resource assessment in high resolution aerial imagery. Such algorithms offer a faster and more cost-effective solution to collecting information on distributed solar photovoltaic (PV) arrays, such as their location, capacity, and generated energy. The surface area of PV arrays, a characteristic which can be estimated from aerial imagery, provides an important proxy for array capacity and energy generation. In this work, we employ a state-of-the-art convolutional neural network architecture, called SegNet (Badrinarayanan et al., 2015), to semantically segment (or map) PV arrays in aerial imagery. This builds on previous work focused on identifying the locations of PV arrays, as opposed to their specific shapes and sizes. We measure the ability of our SegNet implementation to estimate the surface area of PV arrays on a large, publicly available dataset that has been employed in several previous studies. The results indicate that the SegNet model yields substantial performance improvements with respect to estimating shape and size as compared to a recently proposed convolutional neural network PV detection algorithm.

Submitted 11 January, 2018; originally announced January 2018.

Comments: Accepted for publication at the 2017 IEEE Applied Imagery Pattern Recognition (AIPR) Workshop. Presented at the conference in Washington D.C., October 10-12

arXiv:1707.00375 [pdf, other]

Adaptive Stimulus Selection in ERP-Based Brain-Computer Interfaces by Maximizing Expected Discrimination Gain

Authors: Dmitry Kalika, Leslie M. Collins, Chandra S. Throckmorton, Boyla O. Mainsah

Abstract: Brain-computer interfaces (BCIs) can provide an alternative means of communication for individuals with severe neuromuscular limitations. The P300-based BCI speller relies on eliciting and detecting transient event-related potentials (ERPs) in electroencephalography (EEG) data, in response to a user attending to rarely occurring target stimuli amongst a series of non-target stimuli. However, in most P300 speller implementations, the stimuli to be presented are randomly selected from a limited set of options and stimulus selection and presentation are not optimized based on previous user data. In this work, we propose a data-driven method for stimulus selection based on the expected discrimination gain metric. The data-driven approach selects stimuli based on previously observed stimulus responses, with the aim of choosing a set of stimuli that will provide the most information about the user's intended target character. Our approach incorporates knowledge of physiological and system constraints imposed due to real-time BCI implementation. Simulations were performed to compare our stimulus selection approach to the row-column paradigm, the conventional stimulus selection method for P300 spellers. Results from the simulations demonstrated that our adaptive stimulus selection approach has the potential to significantly improve performance from the conventional method: up to 34% improvement in accuracy and 43% reduction in the mean number of stimulus presentations required to spell a character in a 72-character grid. In addition, our greedy approach to stimulus selection provides the flexibility to accommodate design constraints.

Submitted 2 July, 2017; originally announced July 2017.

Comments: This paper has been accepted for the 2017 IEEE International Conference on Systems, Man and Cybernetics (SMC)

arXiv:1702.03000 [pdf]

doi 10.1109/TGRS.2017.2751461

A large comparison of feature-based approaches for buried target classification in forward-looking ground-penetrating radar

Authors: Joseph A. Camilo, Leslie M. Collins, Jordan M. Malof

Abstract: Forward-looking ground-penetrating radar (FLGPR) has recently been investigated as a remote sensing modality for buried target detection (e.g., landmines). In this context, raw FLGPR data is beamformed into images and then computerized algorithms are applied to automatically detect subsurface buried targets. Most existing algorithms are supervised, meaning they are trained to discriminate between labeled target and non-target imagery, usually based on features extracted from the imagery. A large number of features have been proposed for this purpose, however thus far it is unclear which are the most effective. The first goal of this work is to provide a comprehensive comparison of detection performance using existing features on a large collection of FLGPR data. Fusion of the decisions resulting from processing each feature is also considered. The second goal of this work is to investigate two modern feature learning approaches from the object recognition literature: the bag-of-visual-words and the Fisher vector for FLGPR processing. The results indicate that the new feature learning approaches outperform existing methods. Results also show that fusion between existing features and new features yields little additional performance improvements.

Submitted 9 February, 2017; originally announced February 2017.

Comments: 11 pages, 14 figures, for submission to IEEE TGARS

arXiv:1612.03477 [pdf]

On Choosing Training and Testing Data for Supervised Algorithms in Ground Penetrating Radar Data for Buried Threat Detection

Authors: Daniël Reichman, Leslie M. Collins, Jordan M. Malof

Abstract: Ground penetrating radar (GPR) is one of the most popular and successful sensing modalities that has been investigated for landmine and subsurface threat detection. Many of the detection algorithms applied to this task are supervised and therefore require labeled examples of target and non-target data for training. Training data most often consists of 2-dimensional images (or patches) of GPR data, from which features are extracted and provided to the classifier during training and testing. Identifying desirable training and testing locations to extract patches, which we term "keypoints", is well established in the literature. In contrast, however, a large variety of strategies have been proposed regarding keypoint utilization (e.g., how many of the identified keypoints should be used at target, or non-target, locations). Given the variety of keypoint utilization strategies that are available, it is unclear (i) which strategies are best, or (ii) whether the choice of strategy has a large impact on classifier performance. We address these questions by presenting a taxonomy of existing utilization strategies, and then evaluating their effectiveness on a large dataset using many different classifiers and features. We analyze the results and propose a new strategy, called PatchSelect, which outperforms other strategies across all experiments.

Submitted 11 December, 2016; originally announced December 2016.

Comments: 9 pages, 8 figures, journal paper

arXiv:1607.06029 [pdf]

doi 10.1016/j.apenergy.2016.08.191

Automatic Detection of Solar Photovoltaic Arrays in High Resolution Aerial Imagery

Authors: Jordan M. Malof, Kyle Bradbury, Leslie M. Collins, Richard G. Newell

Abstract: The quantity of small scale solar photovoltaic (PV) arrays in the United States has grown rapidly in recent years. As a result, there is substantial interest in high quality information about the quantity, power capacity, and energy generated by such arrays, including at a high spatial resolution (e.g., counties, cities, or even smaller regions). Unfortunately, existing methods for obtaining this information, such as surveys and utility interconnection filings, are limited in their completeness and spatial resolution. This work presents a computer algorithm that automatically detects PV panels using very high resolution color satellite imagery. The approach potentially offers a fast, scalable method for obtaining accurate information on PV array location and size, and at much higher spatial resolutions than are currently available. The method is validated using a very large (135 km^2) collection of publicly available [1] aerial imagery, with over 2,700 human annotated PV array locations. The results demonstrate the algorithm is highly effective on a per-pixel basis. It is likewise effective at object-level PV array detection, but with significant potential for improvement in estimating the precise shape/size of the PV arrays. These results are the first of their kind for the detection of solar PV in aerial imagery, demonstrating the feasibility of the approach and establishing a baseline performance for future investigations.

Submitted 20 July, 2016; originally announced July 2016.

Comments: 11 Page manuscript, and 1 page of supplemental information, 10 figures, currently under review as a journal publication

Showing 1–18 of 18 results for author: Collins, L M