-
Streaming Gaussian Dirichlet Random Fields for Spatial Predictions of High Dimensional Categorical Observations
Authors:
J. E. San Soucie,
H. M. Sosik,
Y. Girdhar
Abstract:
We present the Streaming Gaussian Dirichlet Random Field (S-GDRF) model, a novel approach for modeling a stream of spatiotemporally distributed, sparse, high-dimensional categorical observations. The proposed approach efficiently learns global and local patterns in spatiotemporal data, allowing for fast inference and querying with a bounded time complexity. Using a high-resolution data series of p…
▽ More
We present the Streaming Gaussian Dirichlet Random Field (S-GDRF) model, a novel approach for modeling a stream of spatiotemporally distributed, sparse, high-dimensional categorical observations. The proposed approach efficiently learns global and local patterns in spatiotemporal data, allowing for fast inference and querying with a bounded time complexity. Using a high-resolution data series of plankton images classified with a neural network, we demonstrate the ability of the approach to make more accurate predictions compared to a Variational Gaussian Process (VGP), and to learn a predictive distribution of observations from streaming categorical data. S-GDRFs open the door to enabling efficient informative path planning over high-dimensional categorical observations, which until now has not been feasible.
△ Less
Submitted 23 February, 2024;
originally announced February 2024.
-
CUREE: A Curious Underwater Robot for Ecosystem Exploration
Authors:
Yogesh Girdhar,
Nathan McGuire,
Levi Cai,
Stewart Jamieson,
Seth McCammon,
Brian Claus,
John E. San Soucie,
Jessica E. Todd,
T. Aran Mooney
Abstract:
The current approach to exploring and monitoring complex underwater ecosystems, such as coral reefs, is to conduct surveys using diver-held or static cameras, or deploying sensor buoys. These approaches often fail to capture the full variation and complexity of interactions between different reef organisms and their habitat. The CUREE platform presented in this paper provides a unique set of capab…
▽ More
The current approach to exploring and monitoring complex underwater ecosystems, such as coral reefs, is to conduct surveys using diver-held or static cameras, or deploying sensor buoys. These approaches often fail to capture the full variation and complexity of interactions between different reef organisms and their habitat. The CUREE platform presented in this paper provides a unique set of capabilities in the form of robot behaviors and perception algorithms to enable scientists to explore different aspects of an ecosystem. Examples of these capabilities include low-altitude visual surveys, soundscape surveys, habitat characterization, and animal following. We demonstrate these capabilities by describing two field deployments on coral reefs in the US Virgin Islands. In the first deployment, we show that CUREE can identify the preferred habitat type of snap** shrimp in a reef through a combination of a visual survey, habitat characterization, and a soundscape survey. In the second deployment, we demonstrate CUREE's ability to follow arbitrary animals by separately following a barracuda and stingray for several minutes each in midwater and benthic environments, respectively.
△ Less
Submitted 20 April, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Automated, Targeted Testing of Property-Based Testing Predicates
Authors:
Tim Nelson,
Elijah Rivera,
Sam Soucie,
Thomas Del Vecchio,
John Wrenn,
Shriram Krishnamurthi
Abstract:
Context: This work is based on property-based testing (PBT). PBT is an increasingly important form of software testing. Furthermore, it serves as a concrete gateway into the abstract area of formal methods. Specifically, we focus on students learning PBT methods.
Inquiry: How well do students do at PBT? Our goal is to assess the quality of the predicates they write as part of PBT. Prior work int…
▽ More
Context: This work is based on property-based testing (PBT). PBT is an increasingly important form of software testing. Furthermore, it serves as a concrete gateway into the abstract area of formal methods. Specifically, we focus on students learning PBT methods.
Inquiry: How well do students do at PBT? Our goal is to assess the quality of the predicates they write as part of PBT. Prior work introduced the idea of decomposing the predicate's property into a conjunction of independent subproperties. Testing the predicate against each subproperty gives a "semantic" understanding of their performance.
Approach: The notion of independence of subproperties both seems intuitive and was an important condition in prior work. First, we show that this condition is overly restrictive and might hide valuable information: it both undercounts errors and makes it hard to capture misconceptions. Second, we introduce two forms of automation, one based on PBT tools and the other on SAT-solving, to enable testing of student predicates. Third, we compare the output of these automated tools against manually-constructed tests. Fourth, we also measure the performance of those tools. Finally, we re-assess student performance reported in prior work.
Knowledge: We show the difficulty caused by the independent subproperty requirement. We provide insight into how to use automation effectively to assess PBT predicates. In particular, we discuss the steps we had to take to beat human performance. We also provide insight into how to make the automation work efficiently. Finally, we present a much richer account than prior work of how students did.
Grounding: Our methods are grounded in mathematical logic. We also make use of well-understood principles of test generation from more formal specifications. This combination ensures the soundness of our work. We use standard methods to measure performance.
Importance: As both educators and programmers, we believe PBT is a valuable tool for students to learn, and its importance will only grow as more developers appreciate its value. Effective teaching requires a clear understanding of student knowledge and progress. Our methods enable a rich and automated analysis of student performance on PBT that yields insight into their understanding and can capture misconceptions. We therefore expect these results to be valuable to educators.
△ Less
Submitted 19 November, 2021;
originally announced November 2021.
-
Gaussian-Dirichlet Random Fields for Inference over High Dimensional Categorical Observations
Authors:
John E. San Soucie,
Heidi M. Sosik,
Yogesh Girdhar
Abstract:
We propose a generative model for the spatio-temporal distribution of high dimensional categorical observations. These are commonly produced by robots equipped with an imaging sensor such as a camera, paired with an image classifier, potentially producing observations over thousands of categories. The proposed approach combines the use of Dirichlet distributions to model sparse co-occurrence relat…
▽ More
We propose a generative model for the spatio-temporal distribution of high dimensional categorical observations. These are commonly produced by robots equipped with an imaging sensor such as a camera, paired with an image classifier, potentially producing observations over thousands of categories. The proposed approach combines the use of Dirichlet distributions to model sparse co-occurrence relations between the observed categories using a latent variable, and Gaussian processes to model the latent variable's spatio-temporal distribution. Experiments in this paper show that the resulting model is able to efficiently and accurately approximate the temporal distribution of high dimensional categorical measurements such as taxonomic observations of microscopic organisms in the ocean, even in unobserved (held out) locations, far from other samples. This work's primary motivation is to enable deployment of informative path planning techniques over high dimensional categorical fields, which until now have been limited to scalar or low dimensional vector observations.
△ Less
Submitted 26 March, 2020;
originally announced March 2020.