-
Computing Implicitizations of Multi-Graded Polynomial Maps
Authors:
Joseph Cummings,
Benjamin Hollering
Abstract:
In this paper, we focus on computing the kernel of a map of polynomial rings $\varphi$. This core problem in symbolic computation is known as implicitization. While there are extremely effective Gröbner basis methods used to solve this problem, these methods can become infeasible as the number of variables increases. In the case when the map $\varphi$ is multigraded, we consider an alternative approach. We demonstrate how to quickly compute a matrix of maximal rank for which $\varphi$ has a positive multigrading. Then in each graded component we compute the minimal generators of the kernel in that multidegree with linear algebra. We have implemented our techniques in Macaulay2 and show that our implementation can compute many generators of low degree in examples where Gröbner techniques have failed. This includes several examples coming from phylogenetics where even a complete list of quadrics and cubics was unknown. When the multigrading refines total degree, our algorithm is \emph{embarrassingly parallel} and a fully parallelized version of our algorithm will be forthcoming in OSCAR.
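The linear-algebra step described in the abstract can be illustrated with a small sketch. This is not the authors' Macaulay2 implementation: it is a simplified illustration that assumes the coarsest grading (total degree) instead of a finer multigrading, and it returns a basis of the kernel in one degree without filtering for minimal generators. The names `poly_mul`, `nullspace`, and `kernel_in_degree` are hypothetical, and the twisted cubic is used only as a familiar toy map. Polynomials are stored as dictionaries from exponent tuples to coefficients.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def nullspace(rows, n_cols):
    """Basis of {v : rows * v = 0} via Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivots = []
    r = 0
    for c in range(n_cols):
        pr = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pr is None:
            continue  # no pivot here, so column c is free
        m[r], m[pr] = m[pr], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row, pc in zip(m, pivots):
            v[pc] = -row[free]
        basis.append(v)
    return basis

def kernel_in_degree(images, degree):
    """Kernel elements of x_i -> images[i] that are homogeneous of the given total degree."""
    n_params = len(next(iter(images[0])))
    monomials = list(combinations_with_replacement(range(len(images)), degree))
    mapped = []
    for mono in monomials:
        p = {(0,) * n_params: 1}
        for i in mono:
            p = poly_mul(p, images[i])
        mapped.append(p)
    support = sorted({e for p in mapped for e in p})
    # One row per monomial in the parameters, one column per source monomial.
    rows = [[p.get(e, 0) for p in mapped] for e in support]
    return monomials, nullspace(rows, len(monomials))

# Toy map (an assumption for illustration): a -> s^3, b -> s^2 t, c -> s t^2, d -> t^3.
twisted_cubic = [{(3, 0): 1}, {(2, 1): 1}, {(1, 2): 1}, {(0, 3): 1}]
monomials, basis = kernel_in_degree(twisted_cubic, 2)
for v in basis:
    print({m: c for m, c in zip(monomials, v) if c != 0})
```

Each printed dictionary is one quadric in the kernel, keyed by the indices of the source variables in each monomial. Because every degree is handled by an independent linear system, the degrees (or, in the finer setting, the multidegrees) could be processed in parallel.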
Submitted 13 November, 2023;
originally announced November 2023.
-
Detecting Dataset Drift and Non-IID Sampling via k-Nearest Neighbors
Authors:
Jesse Cummings,
Elías Snorrason,
Jonas Mueller
Abstract:
We present a straightforward statistical test to detect certain violations of the assumption that the data are Independent and Identically Distributed (IID). The specific form of violation considered is common across real-world applications: whether the examples are ordered in the dataset such that almost adjacent examples tend to have more similar feature values (e.g. due to distributional drift, or attractive interactions between datapoints). Based on a k-Nearest Neighbors estimate, our approach can be used to audit any multivariate numeric data as well as other data types (image, text, audio, etc.) that can be numerically represented, perhaps with model embeddings. Compared with existing methods to detect drift or auto-correlation, our approach is both applicable to more types of data and also able to detect a wider variety of IID violations in practice. Code: https://github.com/cleanlab/cleanlab
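The idea behind such a test can be sketched in a few lines. This is not the cleanlab implementation: it is a minimal illustration that assumes a brute-force neighbor search and swaps in a simple permutation test on the mean index gap between each example and its k nearest neighbors. The function names `knn_indices`, `mean_index_gap`, and `non_iid_pvalue` are hypothetical. If examples that are close in feature space also tend to be close in dataset order, the observed gap is much smaller than the gap under a random ordering.

```python
import math
import random

def knn_indices(points, k):
    """Brute-force k nearest neighbors (Euclidean distance) of every point."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors

def mean_index_gap(neighbors, position):
    """Average |position[i] - position[j]| over all (point, neighbor) pairs."""
    gaps = [abs(position[i] - position[j])
            for i, nbrs in enumerate(neighbors) for j in nbrs]
    return sum(gaps) / len(gaps)

def non_iid_pvalue(points, k=5, n_perm=200, seed=0):
    """Permutation p-value: small when feature-space neighbors sit close together in the ordering."""
    rng = random.Random(seed)
    n = len(points)
    neighbors = knn_indices(points, k)
    observed = mean_index_gap(neighbors, list(range(n)))
    hits = 0
    for _ in range(n_perm):
        shuffled = list(range(n))
        rng.shuffle(shuffled)  # a random ordering of the same examples
        if mean_index_gap(neighbors, shuffled) <= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)

# Synthetic data with a slow drift in the mean, so the ordering carries information.
data_rng = random.Random(1)
drifting = [(0.1 * t + data_rng.gauss(0, 0.05),) for t in range(100)]
print(non_iid_pvalue(drifting))
```

The neighbor search runs once; only the index labels are shuffled in each permutation, which keeps the null distribution cheap to simulate. For non-numeric data the same sketch would apply to embedding vectors, as the abstract suggests.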
Submitted 25 May, 2023;
originally announced May 2023.
-
Developing a Series of AI Challenges for the United States Department of the Air Force
Authors:
Vijay Gadepally,
Gregory Angelides,
Andrei Barbu,
Andrew Bowne,
Laura J. Brattain,
Tamara Broderick,
Armando Cabrera,
Glenn Carl,
Ronisha Carter,
Miriam Cha,
Emilie Cowen,
Jesse Cummings,
Bill Freeman,
James Glass,
Sam Goldberg,
Mark Hamilton,
Thomas Heldt,
Kuan Wei Huang,
Phillip Isola,
Boris Katz,
Jamie Koerner,
Yen-Chen Lin,
David Mayo,
Kyle McAlpin,
Taylor Perron
, et al. (17 additional authors not shown)
Abstract:
Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Department of the Air Force (DAF). The DAF-MIT AI Accelerator is an initiative between the DAF and MIT to bridge the gap between AI researchers and DAF mission requirements. Several projects supported by the DAF-MIT AI Accelerator are developing public challenge problems that address numerous Federal AI research priorities. These challenges target priorities by making large, AI-ready datasets publicly available, incentivizing open-source solutions, and creating a demand signal for dual-use technologies that can stimulate further research. In this article, we describe these public challenges being developed and how their application contributes to scientific advances.
Submitted 14 July, 2022;
originally announced July 2022.
-
Machine-Assisted Script Curation
Authors:
Manuel R. Ciosici,
Joseph Cummings,
Mitchell DeHaven,
Alex Hedges,
Yash Kankanampati,
Dong-Ho Lee,
Ralph Weischedel,
Marjorie Freedman
Abstract:
We describe Machine-Aided Script Curator (MASC), a system for human-machine collaborative script authoring. Scripts produced with MASC include (1) English descriptions of sub-events that comprise a larger, complex event; (2) event types for each of those events; (3) a record of entities expected to participate in multiple sub-events; and (4) temporal sequencing between the sub-events. MASC automates portions of the script creation process with suggestions for event types, links to Wikidata, and sub-events that may have been forgotten. We illustrate how these automations are useful to the script writer with a few case-study scripts.
Submitted 4 May, 2021; v1 submitted 13 January, 2021;
originally announced January 2021.
-
Synthesizing Property & Casualty Ratemaking Datasets using Generative Adversarial Networks
Authors:
Marie-Pier Cote,
Brian Hartman,
Olivier Mercier,
Joshua Meyers,
Jared Cummings,
Elijah Harmon
Abstract:
Due to confidentiality issues, it can be difficult to access or share interesting datasets for methodological development in actuarial science, or other fields where personal data are important. We show how to design three different types of generative adversarial networks (GANs) that can build a synthetic insurance dataset from a confidential original dataset. The goal is to obtain synthetic data that no longer contains sensitive information but still has the same structure as the original dataset and retains the multivariate relationships. In order to adequately model the specific characteristics of insurance data, we use GAN architectures adapted for multi-categorical data: a Wasserstein GAN with gradient penalty (MC-WGAN-GP), a conditional tabular GAN (CTGAN) and a Mixed Numerical and Categorical Differentially Private GAN (MNCDP-GAN). For transparency, the approaches are illustrated using a public dataset, the French motor third party liability data. We compare the three different GANs on various aspects: ability to reproduce the original data structure and predictive models, privacy, and ease of use. We find that the MC-WGAN-GP synthesizes the best data, the CTGAN is the easiest to use, and the MNCDP-GAN guarantees differential privacy.
Submitted 13 August, 2020;
originally announced August 2020.
-
Developing Computational Models of Social Assistance to Guide Socially Assistive Robots
Authors:
Jason R. Wilson,
Seongsik Kim,
Ulyana Kurylo,
Joseph Cummings,
Eshan Tarneja
Abstract:
While there are many examples in which robots provide social assistance, a lack of theory on how the robots should decide how to assist impedes progress in realizing these technologies. To address this deficiency, we propose a pair of computational models to guide a robot as it provides social assistance. The model of social autonomy helps a robot select an appropriate form of assistance that will help with the task at hand while also maintaining the autonomy of the person being assisted. The model of social alliance describes how to determine whether the robot and the person being assisted are cooperatively working towards the same goal. Each of these models is rooted in social reasoning between people, and we describe here our ongoing work to adapt this social reasoning to human-robot interactions.
Submitted 13 September, 2019;
originally announced September 2019.
-
Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials
Authors:
Nicholas R. Waytowich,
Vernon Lawhern,
Javier O. Garcia,
Jennifer Cummings,
Josef Faller,
Paul Sajda,
Jean M. Vettel
Abstract:
Steady-State Visual Evoked Potentials (SSVEPs) are neural oscillations from the parietal and occipital regions of the brain that are evoked by flickering visual stimuli. SSVEPs are robust signals measurable in the electroencephalogram (EEG) and are commonly used in brain-computer interfaces (BCIs). However, methods for high-accuracy decoding of SSVEPs usually require hand-crafted approaches that leverage domain-specific knowledge of the stimulus signals, such as specific temporal frequencies in the visual stimuli and their relative spatial arrangement. When this knowledge is unavailable, such as when SSVEP signals are acquired asynchronously, such approaches tend to fail. In this paper, we show how a compact convolutional neural network (Compact-CNN), which only requires raw EEG signals for automatic feature extraction, can be used to decode signals from a 12-class SSVEP dataset without the need for any domain-specific knowledge or calibration data. We report an across-subject mean accuracy of approximately 80% (chance being 8.3%) and show this is substantially better than current state-of-the-art hand-crafted approaches using canonical correlation analysis (CCA) and Combined-CCA. Furthermore, we analyze our Compact-CNN to examine the underlying feature representation, discovering that the deep learner extracts additional phase- and amplitude-related features associated with the structure of the dataset. We discuss how our Compact-CNN shows promise for BCI applications that allow users to freely gaze at or attend to any stimulus at any time (e.g., asynchronous BCI), and how it provides a method for analyzing SSVEP signals in a way that might augment our understanding of the basic processing in the visual cortex.
Submitted 9 October, 2018; v1 submitted 12 March, 2018;
originally announced March 2018.