-
My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack
Authors:
Matthias Gazzari,
Annemarie Mattmann,
Max Maass,
Matthias Hollick
Abstract:
Wearables that constantly collect various sensor data of their users increase the chances for inferences of unintentional and sensitive information such as passwords typed on a physical keyboard. We take a thorough look at the potential of using electromyographic (EMG) data, a sensor modality which is new to the market but has lately gained attention in the context of wearables for augmented reality (AR), for a keylogging side-channel attack. Our approach is based on neural networks for a between-subject attack in a realistic scenario using the Myo Armband to collect the sensor data. In our approach, the EMG data has proven to be the most prominent source of information compared to the accelerometer and gyroscope, increasing the keystroke detection performance. For our end-to-end approach on raw data, we report a mean balanced accuracy of about 76 % for the keystroke detection and a mean top-3 key accuracy of about 32 % on 52 classes for the key identification on passwords of varying strengths. We have created an extensive dataset including more than 310 000 keystrokes recorded from 37 volunteers, which is available as open access along with the source code used to create the given results.
Submitted 4 December, 2021;
originally announced December 2021.
-
Many-to-English Machine Translation Tools, Data, and Pretrained Models
Authors:
Thamme Gowda,
Zhao Zhang,
Chris A Mattmann,
Jonathan May
Abstract:
While there are more than 7000 languages in the world, most translation research efforts have targeted a few high-resource languages. Commercial translation systems support only one hundred languages or fewer, and do not make these models available for transfer to low-resource languages. In this work, we present useful tools for machine translation research: MTData, NLCodec, and RTG. We demonstrate their usefulness by creating a multilingual neural machine translation model capable of translating from 500 source languages to English. We make this multilingual model readily downloadable and usable as a service, or as a parent model for transfer-learning to even lower-resource languages.
Submitted 1 July, 2021; v1 submitted 1 April, 2021;
originally announced April 2021.
-
MARVIN: An Open Machine Learning Corpus and Environment for Automated Machine Learning Primitive Annotation and Execution
Authors:
Chris A. Mattmann,
Sujen Shah,
Brian Wilson
Abstract:
In this demo paper, we introduce the DARPA D3M program for automatic machine learning (ML) and JPL's MARVIN tool that provides an environment to locate, annotate, and execute machine learning primitives for use in ML pipelines. MARVIN is a web-based application and associated back-end interface written in Python that enables composition of ML pipelines from hundreds of primitives from the world of Scikit-Learn, Keras, DL4J and other widely used libraries. MARVIN allows for the creation of Docker containers that run on Kubernetes clusters within DARPA to provide an execution environment for automated machine learning. MARVIN currently contains over 400 datasets and challenge problems from a wide array of ML domains, ranging from routine classification and regression to advanced video/image classification and remote sensing.
Submitted 11 August, 2018;
originally announced August 2018.
-
Measurement Context Extraction from Text: Discovering Opportunities and Gaps in Earth Science
Authors:
Kyle Hundman,
Chris A. Mattmann
Abstract:
We propose Marve, a system for extracting measurement values, units, and related words from natural language text. Marve uses conditional random fields (CRF) to identify measurement values and units, followed by a rule-based system to find related entities, descriptors and modifiers within a sentence. Sentence tokens are represented by an undirected graphical model, and rules are based on part-of-speech and word dependency patterns connecting values and units to contextual words. Marve is unique in its focus on measurement context, and early experimentation demonstrates Marve's ability to generate high-precision extractions with strong recall. We also discuss Marve's role in refining measurement requirements for NASA's proposed HyspIRI mission, a hyperspectral infrared imaging satellite that will study the world's ecosystems. In general, our work with HyspIRI demonstrates the value of semantic measurement extractions in characterizing quantitative discussion contained in large corpora of natural language text. These extractions accelerate broad, cross-cutting research and expose scientists to new algorithmic approaches and experimental nuances. They also facilitate identification of scientific opportunities enabled by HyspIRI, leading to more efficient scientific investment and research.
Submitted 11 October, 2017;
originally announced October 2017.