Showing 1–50 of 54 results for author: Noever, D

  1. arXiv:2404.04769  [pdf]

    cs.CR

    Safeguarding Voice Privacy: Harnessing Near-Ultrasonic Interference To Protect Against Unauthorized Audio Recording

    Authors: Forrest McKee, David Noever

    Abstract: The widespread adoption of voice-activated systems has modified routine human-machine interaction but has also introduced new vulnerabilities. This paper investigates the susceptibility of automatic speech recognition (ASR) algorithms in these systems to interference from near-ultrasonic noise. Building upon prior research that demonstrated the ability of near-ultrasonic frequencies (16 kHz - 22 k…

    Submitted 6 April, 2024; originally announced April 2024.

  2. arXiv:2402.10090  [pdf]

    cs.CV cs.IR cs.LG

    PICS: Pipeline for Image Captioning and Search

    Authors: Grant Rosario, David Noever

    Abstract: The growing volume of digital images necessitates advanced systems for efficient categorization and retrieval, presenting a significant challenge in database management and information retrieval. This paper introduces PICS (Pipeline for Image Captioning and Search), a novel approach designed to address the complexities inherent in organizing large-scale image repositories. PICS leverages the advan…

    Submitted 31 January, 2024; originally announced February 2024.

  3. arXiv:2402.09671  [pdf]

    cs.CV cs.LG

    Exploiting Alpha Transparency In Language And Vision-Based AI Systems

    Authors: David Noever, Forrest McKee

    Abstract: This investigation reveals a novel exploit derived from PNG image file formats, specifically their alpha transparency layer, and its potential to fool multiple AI vision systems. Our method uses this alpha layer as a clandestine channel invisible to human observers but fully actionable by AI image processors. The scope tested for the vulnerability spans representative vision systems from Apple, Mi…

    Submitted 14 February, 2024; originally announced February 2024.

  4. arXiv:2401.15817  [pdf]

    cs.CV cs.CR cs.LG

    Transparency Attacks: How Imperceptible Image Layers Can Fool AI Perception

    Authors: Forrest McKee, David Noever

    Abstract: This paper investigates a novel algorithmic vulnerability when imperceptible image layers confound multiple vision models into arbitrary label assignments and captions. We explore image preprocessing methods to introduce stealth transparency, which triggers AI misinterpretation of what the human eye perceives. The research compiles a broad attack surface to investigate the consequences ranging fro…

    Submitted 28 January, 2024; originally announced January 2024.

  5. arXiv:2312.12383  [pdf]

    cs.AI cs.CL cs.CV

    Visual AI and Linguistic Intelligence Through Steerability and Composability

    Authors: David Noever, Samantha Elizabeth Miller Noever

    Abstract: This study explores the capabilities of multimodal large language models (LLMs) in handling challenging multistep tasks that integrate language and vision, focusing on model steerability, composability, and the application of long-term memory and context understanding. The problem addressed is the LLM's ability (Nov 2023 GPT-4 Vision Preview) to manage tasks that require synthesizing visual and te…

    Submitted 18 November, 2023; originally announced December 2023.

  6. arXiv:2312.10905  [pdf]

    cs.LG cs.CL cs.CV

    Satellite Captioning: Large Language Models to Augment Labeling

    Authors: Grant Rosario, David Noever

    Abstract: With the growing capabilities of modern object detection networks and datasets to train them, it has gotten more straightforward and, importantly, less laborious to get up and running with a model that is quite adept at detecting any number of various objects. However, while image datasets for object detection have grown and continue to proliferate (the current most extensive public set, ImageNet,…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: 9 pages, 4 figures, 4 tables

  7. arXiv:2312.10603  [pdf]

    cs.LG cs.AI

    Evaluating AI Vocational Skills Through Professional Testing

    Authors: David Noever, Matt Ciolino

    Abstract: Using a novel professional certification survey, the study focuses on assessing the vocational skills of two highly cited AI models, GPT-3 and Turbo-GPT3.5. The approach emphasizes the importance of practical readiness over academic performance by examining the models' performances on a benchmark dataset consisting of 1149 professional certifications. This study also includes a comparison with hum…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2305.05377

  8. arXiv:2312.00039  [pdf]

    cs.CR cs.LG cs.SD eess.AS

    Acoustic Cybersecurity: Exploiting Voice-Activated Systems

    Authors: Forrest McKee, David Noever

    Abstract: In this study, we investigate the emerging threat of inaudible acoustic attacks targeting digital voice assistants, a critical concern given their projected prevalence to exceed the global population by 2024. Our research extends the feasibility of these attacks across various platforms like Amazon's Alexa, Android, iOS, and Cortana, revealing significant vulnerabilities in smart devices. The twel…

    Submitted 22 November, 2023; originally announced December 2023.

  9. arXiv:2309.16705  [pdf]

    cs.CV cs.CL cs.LG

    Multimodal Analysis Of Google Bard And GPT-Vision: Experiments In Visual Reasoning

    Authors: David Noever, Samantha Elizabeth Miller Noever

    Abstract: Addressing the gap in understanding visual comprehension in Large Language Models (LLMs), we designed a challenge-response study, subjecting Google Bard and GPT-Vision to 64 visual tasks, spanning categories like "Visual Situational Reasoning" and "Next Scene Prediction." Previous models, such as GPT4, leaned heavily on optical character recognition tools like Tesseract, whereas Bard and GPT-Visio…

    Submitted 14 October, 2023; v1 submitted 16 August, 2023; originally announced September 2023.

  10. arXiv:2308.10345  [pdf]

    cs.SE cs.LG

    Can Large Language Models Find And Fix Vulnerable Software?

    Authors: David Noever

    Abstract: In this study, we evaluated the capability of Large Language Models (LLMs), particularly OpenAI's GPT-4, in detecting software vulnerabilities, comparing their performance against traditional static code analyzers like Snyk and Fortify. Our analysis covered numerous repositories, including those from NASA and the Department of Defense. GPT-4 identified approximately four times the vulnerabilities…

    Submitted 20 August, 2023; originally announced August 2023.

  11. arXiv:2308.07326  [pdf]

    cs.AI cs.CL cs.LG

    AI Text-to-Behavior: A Study In Steerability

    Authors: David Noever, Sam Hyams

    Abstract: The research explores the steerability of Large Language Models (LLMs), particularly OpenAI's ChatGPT iterations. By employing a behavioral psychology framework called OCEAN (Openness, Conscientiousness, Extroversion, Agreeableness, Neuroticism), we quantitatively gauged the model's responsiveness to tailored prompts. When asked to generate text mimicking an extroverted personality, OCEAN scored t…

    Submitted 7 August, 2023; originally announced August 2023.

  12. arXiv:2307.12204  [pdf]

    cs.LG cs.SD eess.AS

    Adversarial Agents For Attacking Inaudible Voice Activated Devices

    Authors: Forrest McKee, David Noever

    Abstract: The paper applies reinforcement learning to novel Internet of Things configurations. Our analysis of inaudible attacks on voice-activated devices confirms the alarming risk factor of 7.6 out of 10, underlining significant security vulnerabilities scored independently by NIST National Vulnerability Database (NVD). Our baseline network model showcases a scenario in which an attacker uses inaudible vo…

    Submitted 25 July, 2023; v1 submitted 22 July, 2023; originally announced July 2023.

  13. arXiv:2305.10358  [pdf]

    cs.CR cs.LG cs.SD eess.AS

    NUANCE: Near Ultrasound Attack On Networked Communication Environments

    Authors: Forrest McKee, David Noever

    Abstract: This study investigates a primary inaudible attack vector on Amazon Alexa voice services using near ultrasound trojans and focuses on characterizing the attack surface and examining the practical implications of issuing inaudible voice commands. The research maps each attack vector to a tactic or technique from the MITRE ATT&CK matrix, covering enterprise, mobile, and Industrial Control System (IC…

    Submitted 22 May, 2023; v1 submitted 25 April, 2023; originally announced May 2023.

  14. arXiv:2305.05377  [pdf]

    cs.AI cs.LG

    Professional Certification Benchmark Dataset: The First 500 Jobs For Large Language Models

    Authors: David Noever, Matt Ciolino

    Abstract: The research creates a professional certification survey to test large language models and evaluate their employable skills. It compares the performance of two AI models, GPT-3 and Turbo-GPT3.5, on a benchmark dataset of 1149 professional certifications, emphasizing vocational readiness rather than academic performance. GPT-3 achieved a passing score (>70% correct) in 39% of the professional certi…

    Submitted 6 May, 2023; originally announced May 2023.

  15. arXiv:2304.02016  [pdf]

    cs.CL cs.CV cs.LG

    The Multimodal And Modular Ai Chef: Complex Recipe Generation From Imagery

    Authors: David Noever, Samantha Elizabeth Miller Noever

    Abstract: The AI community has embraced multi-sensory or multi-modal approaches to advance this generation of AI models to resemble expected intelligent understanding. Combining language and imagery represents a familiar method for specific tasks like image captioning or generation from descriptions. This paper compares these monolithic approaches to a lightweight and specialized method based on employing i…

    Submitted 19 March, 2023; originally announced April 2023.

  16. arXiv:2303.12038  [pdf]

    cs.CL cs.LG

    Grading Conversational Responses Of Chatbots

    Authors: Grant Rosario, David Noever

    Abstract: Chatbots have long been capable of answering basic questions and even responding to obscure prompts, but recently their improvements have been far more significant. Modern chatbots like OpenAI's ChatGPT3 not only have the ability to answer basic questions but can write code and movie scripts and imitate well-known people. In this paper, we analyze ChatGPT's responses to various questions from a da…

    Submitted 31 January, 2023; originally announced March 2023.

  17. arXiv:2301.13382  [pdf]

    cs.CL

    Numeracy from Literacy: Data Science as an Emergent Skill from Large Language Models

    Authors: David Noever, Forrest McKee

    Abstract: Large language models (LLM) such as OpenAI's ChatGPT and GPT-3 offer unique testbeds for exploring the translation challenges of turning literacy into numeracy. Previous publicly-available transformer models from eighteen months prior and 1000 times smaller failed to provide basic arithmetic. The statistical analysis of four complex datasets described here combines arithmetic manipulations that ca…

    Submitted 30 January, 2023; originally announced January 2023.

  18. arXiv:2301.03771  [pdf]

    cs.CR cs.CY cs.LG

    Chatbots in a Honeypot World

    Authors: Forrest McKee, David Noever

    Abstract: Question-and-answer agents like ChatGPT offer a novel tool for use as a potential honeypot interface in cyber security. By imitating Linux, Mac, and Windows terminal commands and providing an interface for TeamViewer, nmap, and ping, it is possible to create a dynamic environment that can adapt to the actions of attackers and provide insight into their tactics, techniques, and procedures (TTPs). T…

    Submitted 9 January, 2023; originally announced January 2023.

  19. arXiv:2301.03373  [pdf]

    cs.LG cs.CL cs.SE

    Chatbots As Fluent Polyglots: Revisiting Breakthrough Code Snippets

    Authors: David Noever, Kevin Williams

    Abstract: The research applies AI-driven code assistants to analyze a selection of influential computer code that has shaped modern technology, including email, internet browsing, robotics, and malicious software. The original contribution of this study was to examine half of the most significant code advances in the last 50 years and, in some cases, to provide notable improvements in clarity or performance…

    Submitted 5 January, 2023; originally announced January 2023.

  20. arXiv:2301.01743  [pdf]

    cs.AI cs.CL

    Chatbots as Problem Solvers: Playing Twenty Questions with Role Reversals

    Authors: David Noever, Forrest McKee

    Abstract: New chat AI applications like ChatGPT offer an advanced understanding of question context and memory across multi-step tasks, such that experiments can test its deductive reasoning. This paper proposes a multi-role and multi-step challenge, where ChatGPT plays the classic twenty-questions game but innovatively switches roles from the questioner to the answerer. The main empirical result establishe…

    Submitted 31 December, 2022; originally announced January 2023.

  21. arXiv:2212.11126  [pdf]

    cs.CR cs.CL cs.LG

    Chatbots in a Botnet World

    Authors: Forrest McKee, David Noever

    Abstract: Question-and-answer formats provide a novel experimental platform for investigating cybersecurity questions. Unlike previous chatbots, the latest ChatGPT model from OpenAI supports an advanced understanding of complex coding questions. The research demonstrates thirteen coding tasks that generally qualify as stages in the MITRE ATT&CK framework, ranging from credential access to defense evasion. W…

    Submitted 22 December, 2022; v1 submitted 18 December, 2022; originally announced December 2022.

  22. arXiv:2212.06721  [pdf]

    cs.LG cs.AI cs.CL

    The Turing Deception

    Authors: David Noever, Matt Ciolino

    Abstract: This research revisits the classic Turing test and compares recent large language models such as ChatGPT for their abilities to reproduce human-level comprehension and compelling text generation. Two task challenges -- summarization, and question answering -- prompt ChatGPT to produce original content (98-99%) from a single text entry and also sequential questions originally posed by Turing in 195…

    Submitted 23 December, 2022; v1 submitted 9 December, 2022; originally announced December 2022.

  23. arXiv:2212.00585  [pdf, other]

    cs.CV cs.LG

    Soft Labels for Rapid Satellite Object Detection

    Authors: Matthew Ciolino, Grant Rosario, David Noever

    Abstract: Soft labels in image classification are vector representations of an image's true classification. In this paper, we investigate soft labels in the context of satellite object detection. We propose using detections as the basis for a new dataset of soft labels. Much of the effort in creating a high-quality model is gathering and annotating the training data. If we could use a model to generate a da…

    Submitted 27 January, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: 5 Pages, 5 Figures, 1 Table, 22 References

  24. arXiv:2209.12684  [pdf]

    cs.LG cs.CV

    Soft-labeling Strategies for Rapid Sub-Typing

    Authors: Grant Rosario, David Noever, Matt Ciolino

    Abstract: The challenge of labeling large example datasets for computer vision continues to limit the availability and scope of image repositories. This research provides a new method for automated data collection, curation, labeling, and iterative training with minimal human intervention for the case of overhead satellite imagery and object detection. The new operational scale effectively scanned an entire…

    Submitted 19 January, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

  25. arXiv:2207.13702  [pdf]

    cs.LG

    Physical Systems Modeled Without Physical Laws

    Authors: David Noever, Samuel Hyams

    Abstract: Physics-based simulations typically operate with a combination of complex differentiable equations and many scientific and geometric inputs. Our work involves gathering data from those simulations and seeing how well tree-based machine learning methods can emulate desired outputs without "knowing" the complex backing involved in the simulations. The selected physics-based simulations included Navi…

    Submitted 26 July, 2022; originally announced July 2022.

  26. arXiv:2207.08766  [pdf]

    cs.LG

    Word Play for Playing Othello (Reverses)

    Authors: Samantha E. Miller Noever, David Noever

    Abstract: Language models like OpenAI's Generative Pre-Trained Transformers (GPT-2/3) capture the long-term correlations needed to generate text in a variety of domains (such as language translators) and recently in gameplay (chess, Go, and checkers). The present research applies both the larger (GPT-3) and smaller (GPT-2) language models to explore the complex strategies for the game of Othello (or Reverse…

    Submitted 18 July, 2022; originally announced July 2022.

  27. arXiv:2203.00116  [pdf, other]

    cs.CV cs.LG eess.IV

    Enhancing Satellite Imagery using Deep Learning for the Sensor To Shooter Timeline

    Authors: Matthew Ciolino, Dominick Hambrick, David Noever

    Abstract: The sensor to shooter timeline is affected by two main variables: satellite positioning and asset positioning. Speeding up satellite positioning by adding more sensors or by decreasing processing time is important only if there is a prepared shooter, otherwise the main source of time is getting the shooter into position. However, the intelligence community should work towards the exploitation of s…

    Submitted 30 March, 2022; v1 submitted 28 February, 2022; originally announced March 2022.

    Comments: 5 Pages, 3 Figures, 1 Table, 39 References

  28. arXiv:2201.00848  [pdf]

    cs.CV cs.LG

    Runway Extraction and Improved Mapping from Space Imagery

    Authors: David A. Noever

    Abstract: Change detection methods applied to monitoring key infrastructure like airport runways represent an important capability for disaster relief and urban planning. The present work identifies two generative adversarial networks (GAN) architectures that translate reversibly between plausible runway maps and satellite imagery. We illustrate the training capability using paired images (satellite-map) fr…

    Submitted 29 December, 2021; originally announced January 2022.

  29. arXiv:2110.10601  [pdf]

    cs.LG cs.CR cs.SE

    Color Teams for Machine Learning Development

    Authors: Josh Kalin, David Noever, Matthew Ciolino

    Abstract: Machine learning and software development share processes and methodologies for reliably delivering products to customers. This work proposes the use of a new teaming construct for forming machine learning teams for better combatting adversarial attackers. In cybersecurity, infrastructure uses these teams to protect their systems by using system builders and programmers to also offer more robustne…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: 8 Pages, 6 Figures

  30. arXiv:2110.07636  [pdf]

    cs.LG cs.CR

    A Survey of Machine Learning Algorithms for Detecting Ransomware Encryption Activity

    Authors: Erik Larsen, David Noever, Korey MacVittie

    Abstract: A survey of machine learning techniques trained to detect ransomware is presented. This work builds upon the efforts of Taylor et al. in using sensor-based methods that utilize data collected from built-in instruments like CPU power and temperature monitors to identify encryption activity. Exploratory data analysis (EDA) shows the features most useful from this simulated data are clock speed, temp…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: 9 pages, 8 figures, 3 tables

  31. arXiv:2109.12162  [pdf]

    cs.CR cs.LG

    POSSE: Patterns of Systems During Software Encryption

    Authors: David Noever, Samantha Miller Noever

    Abstract: This research recasts ransomware detection using performance monitoring and statistical machine learning. The work builds a test environment with 41 input variables to label and compares three computing states: idle, encryption and compression. A common goal of this behavioral detector seeks to anticipate and short-circuit the final step of hard-drive locking with encryption and the demand for pay…

    Submitted 24 September, 2021; originally announced September 2021.

  32. arXiv:2109.02797  [pdf]

    cs.LG cs.AI cs.CL

    Puzzle Solving without Search or Human Knowledge: An Unnatural Language Approach

    Authors: David Noever, Ryerson Burdick

    Abstract: The application of Generative Pre-trained Transformer (GPT-2) to learn text-archived game notation provides a model environment for exploring sparse reward gameplay. The transformer architecture proves amenable to training on solved text archives describing mazes, Rubik's Cube, and Sudoku solvers. The method benefits from fine-tuning the transformer architecture to visualize plausible strategies d…

    Submitted 6 September, 2021; originally announced September 2021.

  33. arXiv:2107.00436  [pdf]

    cs.CV cs.AI cs.LG

    Overhead-MNIST: Machine Learning Baselines for Image Classification

    Authors: Erik Larsen, David Noever, Korey MacVittie, John Lilly

    Abstract: Twenty-three machine learning algorithms were trained then scored to establish baseline comparison metrics and to select an image classification algorithm worthy of embedding into mission-critical satellite imaging systems. The Overhead-MNIST dataset is a collection of satellite images similar in style to the ubiquitous MNIST hand-written digits found in the machine learning literature. The CatBoo…

    Submitted 19 October, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: 6 pages; 8 figures, 2 tables

  34. arXiv:2104.04359  [pdf]

    cs.CV cs.LG

    Rock Hunting With Martian Machine Vision

    Authors: David Noever, Samantha E. Miller Noever

    Abstract: The Mars Perseverance rover applies computer vision for navigation and hazard avoidance. The challenge to do onboard object recognition highlights the need for low-power, customized training, often including low-contrast backgrounds. We investigate deep learning methods for the classification and detection of Martian rocks. We report greater than 97% accuracy for binary classifications (rock vs. r…

    Submitted 9 April, 2021; originally announced April 2021.

  35. arXiv:2103.15897  [pdf]

    cs.CR cs.CV

    Automating Defense Against Adversarial Attacks: Discovery of Vulnerabilities and Application of Multi-INT Imagery to Protect Deployed Models

    Authors: Josh Kalin, David Noever, Matthew Ciolino, Dominick Hambrick, Gerry Dozier

    Abstract: Image classification is a common step in image recognition for machine learning in overhead applications. When applying popular model architectures like MobileNetV2, known vulnerabilities expose the model to counter-attacks, either mislabeling a known class or altering box location. This work proposes an automated approach to defend these models. We evaluate the use of multi-spectral image arrays…

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: SPIE 2021, 8 Pages, 6 Figures

  36. arXiv:2103.10480  [pdf]

    cs.LG cs.CL cs.CV

    Reading Isn't Believing: Adversarial Attacks On Multi-Modal Neurons

    Authors: David A. Noever, Samantha E. Miller Noever

    Abstract: With OpenAI's publishing of their CLIP model (Contrastive Language-Image Pre-training), multi-modal neural networks now provide accessible models that combine reading with visual recognition. Their network offers novel ways to probe its dual abilities to read text while classifying visual objects. This paper demonstrates several new categories of adversarial attacks, spanning basic typographical,…

    Submitted 18 March, 2021; originally announced March 2021.

  37. arXiv:2103.07765  [pdf]

    cs.CR cs.LG

    Image Classifiers for Network Intrusions

    Authors: David A. Noever, Samantha E. Miller Noever

    Abstract: This research recasts the network attack dataset from UNSW-NB15 as an intrusion detection problem in image space. Using one-hot-encodings, the resulting grayscale thumbnails provide a quarter-million examples for deep learning algorithms. Applying the MobileNetV2's convolutional neural network architecture, the work demonstrates a 97% accuracy in distinguishing normal and attack traffic. Further c…

    Submitted 13 March, 2021; originally announced March 2021.

  38. arXiv:2103.02718  [pdf]

    cs.LG cs.CR

    A Modified Drake Equation for Assessing Adversarial Risk to Machine Learning Models

    Authors: Josh Kalin, David Noever, Matthew Ciolino

    Abstract: Machine learning models present a risk of adversarial attack when deployed in production. Quantifying the contributing factors and uncertainties using empirical measures could assist the industry with assessing the risk of downloading and deploying common model types. This work proposes modifying the traditional Drake Equation's formalism to estimate the number of potentially successful adversaria…

    Submitted 7 July, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: 8 Pages, 2 Figures, 3 Equations, 27 References, SAIM 2021

  39. arXiv:2103.00602  [pdf]

    cs.CR cs.LG

    Virus-MNIST: A Benchmark Malware Dataset

    Authors: David Noever, Samantha E. Miller Noever

    Abstract: The short note presents an image classification dataset consisting of 10 executable code varieties and approximately 50,000 virus examples. The malicious classes include 9 families of computer viruses and one benign set. The image formatting for the first 1024 bytes of the Portable Executable (PE) mirrors the familiar MNIST handwriting dataset, such that most of the previously explored algorithmic…

    Submitted 28 February, 2021; originally announced March 2021.

  40. arXiv:2102.09708  [pdf, other]

    cs.CL

    Back Translation Survey for Improving Text Augmentation

    Authors: Matthew Ciolino, David Noever, Josh Kalin

    Abstract: Natural Language Processing (NLP) relies heavily on training data. Transformers, as they have gotten bigger, have required massive amounts of training data. To satisfy this requirement, text augmentation should be looked at as a way to expand your current dataset and to generalize your models. One text augmentation we will look at is translation augmentation. We take an English sentence and transl…

    Submitted 16 November, 2022; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: 18 Pages, 10 Figures, 4 Tables, 37 References

  41. arXiv:2102.09695  [pdf, other]

    cs.LG cs.CR

    Fortify Machine Learning Production Systems: Detect and Classify Adversarial Attacks

    Authors: Matthew Ciolino, Josh Kalin, David Noever

    Abstract: Production machine learning systems are consistently under attack by adversarial actors. Various deep learning models must be capable of accurately detecting fake or adversarial input while maintaining speed. In this work, we propose one piece of the production protection system: detecting an incoming adversarial attack and its characteristics. Detecting types of adversarial attacks has two primar…

    Submitted 14 June, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: 5 Pages, 5 Figures, 5 Tables, 17 References, ICMLA 2021, IEEE Conference Format

  42. arXiv:2102.04266  [pdf]

    cs.CV cs.LG

    Overhead MNIST: A Benchmark Satellite Dataset

    Authors: David Noever, Samantha E. Miller Noever

    Abstract: The research presents an overhead view of 10 important objects and follows the general formatting requirements of the most popular machine learning task: digit recognition with MNIST. This dataset offers a public benchmark extracted from over a million human-labelled and curated examples. The work outlines the key multi-class object identification task while matching with prior work in handwriting…

    Submitted 8 February, 2021; originally announced February 2021.

  43. arXiv:2101.01628  [pdf]

    cs.CL cs.LG

    Local Translation Services for Neglected Languages

    Authors: David Noever, Josh Kalin, Matt Ciolino, Dom Hambrick, Gerry Dozier

    Abstract: Taking advantage of computationally lightweight but high-quality translators prompts consideration of new applications that address neglected languages. Locally run translators for less popular languages may assist data projects with protected or personal data that may require specific compliance checks before posting to a public translation API, but which could render reasonable, cost-effective s…

    Submitted 13 January, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

  44. arXiv:2009.03136  [pdf, other]

    cs.LG stat.ML

    Black Box to White Box: Discover Model Characteristics Based on Strategic Probing

    Authors: Josh Kalin, Matthew Ciolino, David Noever, Gerry Dozier

    Abstract: In Machine Learning, White Box Adversarial Attacks rely on knowing underlying knowledge about the model attributes. This work focuses on discovering two distinct pieces of model information: the underlying architecture and primary training dataset. With the process in this paper, a structured set of input probes and the output of the model become the training data for a deep classifier. Two subdo…

    Submitted 7 September, 2020; originally announced September 2020.

    Comments: 4 Pages, 3 Figures, IEEE Format, Ai4i 2020

  45. arXiv:2008.04057  [pdf]

    cs.AI cs.CL cs.GT cs.LG

    The Chess Transformer: Mastering Play using Generative Language Models

    Authors: David Noever, Matt Ciolino, Josh Kalin

    Abstract: This work demonstrates that natural language transformers can support more generic strategic modeling, particularly for text-archived games. In addition to learning natural language skills, the abstract transformer architecture can generate meaningful moves on a chessboard. With further fine-tuning, the transformer learns complex gameplay by training on 2.8 million chess games in Portable Game Not…

    Submitted 18 September, 2020; v1 submitted 2 August, 2020; originally announced August 2020.

    Comments: 7 Pages, 6 Figures, AAAI Format, AAAI 21

  46. arXiv:2007.03500  [pdf]

    cs.CL cs.LG

    The Go Transformer: Natural Language Modeling for Game Play

    Authors: Matthew Ciolino, David Noever, Josh Kalin

    Abstract: This work applies natural language modeling to generate plausible strategic moves in the ancient game of Go. We train the Generative Pretrained Transformer (GPT-2) to mimic the style of Go champions as archived in Smart Game Format (SGF), which offers a text description of move sequences. The trained model further generates valid but previously unseen strategies for Go. Because GPT-2 preserves pun…

    Submitted 7 September, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

    Comments: 8 Pages, 5 Figures, 1 Table, IEEE Format, Ai4i 2020

  47. arXiv:2006.11130  [pdf]

    cs.CR cs.LG stat.ML

    Systematic Attack Surface Reduction For Deployed Sentiment Analysis Models

    Authors: Josh Kalin, David Noever, Gerry Dozier

    Abstract: This work proposes a structured approach to baselining a model, identifying attack vectors, and securing the machine learning models after deployment. This method for securing each model post deployment is called the BAD (Build, Attack, and Defend) Architecture. Two implementations of the BAD architecture are evaluated to quantify the adversarial life cycle for a black box Sentiment Analysis syste…

    Submitted 19 June, 2020; originally announced June 2020.

    Comments: 11 pages, 4 figures, 6th International Conference on Data Mining

  48. arXiv:2004.03366  [pdf]

    cs.CV cs.LG

    Knife and Threat Detectors

    Authors: David A. Noever, Sam E. Miller Noever

    Abstract: Despite rapid advances in image-based machine learning, the threat identification of a knife-wielding attacker has not garnered substantial academic attention. This relative research gap appears less understandable given the high knife assault rate (>100,000 annually) and the increasing availability of public video surveillance to analyze and forensically document. We present three complementary m…

    Submitted 8 April, 2020; v1 submitted 4 April, 2020; originally announced April 2020.

  49. arXiv:2001.10374  [pdf]

    cs.IR cs.LG cs.SI

    The Enron Corpus: Where the Email Bodies are Buried?

    Authors: David Noever

    Abstract: To probe the largest public-domain email database for indicators of fraud, we apply machine learning and accomplish four investigative tasks. First, we identify persons of interest (POI), using financial records and email, and report a peak accuracy of 95.7%. Secondly, we find any publicly exposed personally identifiable information (PII) and discover 50,000 previously unreported instances. Thirdl…

    Submitted 24 January, 2020; originally announced January 2020.

  50. arXiv:2001.05839  [pdf]

    cs.CV cs.CL cs.LG stat.ML

    Discoverability in Satellite Imagery: A Good Sentence is Worth a Thousand Pictures

    Authors: David Noever, Wes Regian, Matt Ciolino, Josh Kalin, Dom Hambrick, Kaye Blankenship

    Abstract: Small satellite constellations provide daily global coverage of the earth's landmass, but image enrichment relies on automating key tasks like change detection or feature searches. For example, to extract text annotations from raw pixels requires two dependent machine learning models, one to analyze the overhead image and the other to generate a descriptive caption. We evaluate seven models on the…

    Submitted 3 January, 2020; originally announced January 2020.