-
Neuromorphic Co-Design as a Game
Authors:
Craig M. Vineyard,
William M. Severa,
James B. Aimone
Abstract:
Co-design is a prominent topic presently in computing, speaking to the mutual benefit of coordinating design choices of several layers in the technology stack. For example, this may be designing algorithms which can most efficiently take advantage of the acceleration properties of a given architecture, while simultaneously designing the hardware to support the structural needs of a class of computation. The implications of these design decisions are influential enough to be deemed a lottery, enabling an idea to win out over others irrespective of the individual merits. Coordination is a well studied topic in the mathematics of game theory, where in many cases without a coordination mechanism the outcome is sub-optimal. Here we consider what insights game theoretic analysis can offer for computer architecture co-design. In particular, we consider the interplay between algorithm and architecture advances in the field of neuromorphic computing. Analyzing developments of spiking neural network algorithms and neuromorphic hardware as a co-design game we use the Stag Hunt model to illustrate challenges for spiking algorithms or architectures to advance the field independently and advocate for a strategic pursuit to advance neuromorphic computing.
Submitted 11 December, 2023;
originally announced December 2023.
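As a minimal illustration of the Stag Hunt model named in the abstract above, the sketch below enumerates the pure-strategy Nash equilibria of a two-player coordination game. The payoff values and the strategy labels ("stag" for coordinated co-design, "hare" for independent progress) are illustrative assumptions, not figures taken from the paper.

```python
from itertools import product

# Hypothetical payoffs as (row player, column player); values are illustrative only.
# "stag" = invest in coordinated co-design, "hare" = advance independently.
STRATEGIES = ("stag", "hare")
PAYOFFS = {
    ("stag", "stag"): (4, 4),
    ("stag", "hare"): (0, 3),
    ("hare", "stag"): (3, 0),
    ("hare", "hare"): (3, 3),
}


def pure_nash_equilibria(payoffs, strategies):
    """Return strategy pairs where neither player gains by deviating alone."""
    equilibria = []
    for row, col in product(strategies, repeat=2):
        row_pay, col_pay = payoffs[(row, col)]
        # Row player cannot do better by switching while the column is fixed.
        row_ok = all(payoffs[(alt, col)][0] <= row_pay for alt in strategies)
        # Column player cannot do better by switching while the row is fixed.
        col_ok = all(payoffs[(row, alt)][1] <= col_pay for alt in strategies)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFFS, STRATEGIES)
print(equilibria)
```

With these assumed payoffs, both mutual cooperation and mutual independence are equilibria, which is the coordination problem the Stag Hunt is used to illustrate: the safer outcome is stable even though the cooperative one pays more.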
-
Synaptic Sampling of Neural Networks
Authors:
James B. Aimone,
William Severa,
J. Darby Smith
Abstract:
Probabilistic artificial neural networks offer intriguing prospects for enabling the uncertainty of artificial intelligence methods to be described explicitly in their function; however, the development of techniques that quantify uncertainty by well-understood methods such as Monte Carlo sampling has been limited by the high costs of stochastic sampling on deterministic computing hardware. Emerging computing systems that are amenable to hardware-level probabilistic computing, such as those that leverage stochastic devices, may make probabilistic neural networks more feasible in the not-too-distant future. This paper describes the scANN technique, sampling (by coinflips) artificial neural networks, which enables neural networks to be sampled directly by treating the weights as Bernoulli coin flips. This method is natively well suited for probabilistic computing techniques that focus on tunable stochastic devices, and it nearly matches fully deterministic performance while also describing the uncertainty of correct and incorrect neural network outputs.
Submitted 21 November, 2023;
originally announced November 2023.
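The idea of treating weights as Bernoulli coin flips can be sketched roughly as follows. This is not the scANN implementation: the mapping from a weight to a coin probability (magnitude divided by an assumed `scale`), the function names, and the example values are all assumptions made for illustration. Each sampled weight is either zero or a full-magnitude signed value, so its expectation equals the original weight.

```python
import random


def sample_weights(weights, scale, rng):
    """Replace each weight by sign(w) * scale with probability |w| / scale, else 0."""
    sampled = []
    for w in weights:
        p = abs(w) / scale  # assumed mapping from weight to coin probability
        sign = 1.0 if w >= 0 else -1.0
        sampled.append(sign * scale if rng.random() < p else 0.0)
    return sampled


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sampled_outputs(weights, inputs, scale, n_samples, seed=0):
    """Evaluate one linear unit repeatedly, drawing fresh coin-flip weights each time."""
    rng = random.Random(seed)
    return [dot(sample_weights(weights, scale, rng), inputs) for _ in range(n_samples)]


# Hypothetical example values.
weights = [0.5, -0.25, 1.0, 0.75]
inputs = [1.0, 2.0, -1.0, 0.5]

outputs = sampled_outputs(weights, inputs, scale=1.0, n_samples=10000)
mean_output = sum(outputs) / len(outputs)
deterministic_output = dot(weights, inputs)
print(mean_output, deterministic_output)
```

The sample mean approaches the deterministic output, while the spread of `outputs` gives a simple handle on uncertainty for that unit.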
-
Design Principles for Lifelong Learning AI Accelerators
Authors:
Dhireesha Kudithipudi,
Anurag Daram,
Abdullah M. Zyarah,
Fatima Tuz Zohora,
James B. Aimone,
Angel Yanguas-Gil,
Nicholas Soures,
Emre Neftci,
Matthew Mattina,
Vincenzo Lomonaco,
Clare D. Thiem,
Benjamin Epstein
Abstract:
Lifelong learning - an agent's ability to learn throughout its lifetime - is a hallmark of biological learning systems and a central challenge for artificial intelligence (AI). The development of lifelong learning algorithms could lead to a range of novel AI applications, but this will also require the development of appropriate hardware accelerators, particularly if the models are to be deployed on edge platforms, which have strict size, weight, and power constraints. Here, we explore the design of lifelong learning AI accelerators that are intended for deployment in untethered environments. We identify key desirable capabilities for lifelong learning accelerators and highlight metrics to evaluate such accelerators. We then discuss current edge AI accelerators and explore the future design of lifelong learning accelerators, considering the role that different emerging technologies could play.
Submitted 5 October, 2023;
originally announced October 2023.
-
Decomposing spiking neural networks with Graphical Neural Activity Threads
Authors:
Bradley H. Theilman,
Felix Wang,
Fred Rothganger,
James B. Aimone
Abstract:
A satisfactory understanding of information processing in spiking neural networks requires appropriate computational abstractions of neural activity. Traditionally, the neural population state vector has been the most common abstraction applied to spiking neural networks, but this requires artificially partitioning time into bins that are not obviously relevant to the network itself. We introduce a distinct set of techniques for analyzing spiking neural networks that decomposes neural activity into multiple, disjoint, parallel threads of activity. We construct these threads by estimating the degree of causal relatedness between pairs of spikes, then use these estimates to construct a directed acyclic graph that traces how the network activity evolves through individual spikes. We find that this graph of spiking activity naturally decomposes into disjoint connected components that overlap in space and time, which we call Graphical Neural Activity Threads (GNATs). We provide an efficient algorithm for finding analogous threads that reoccur in large spiking datasets, revealing that seemingly distinct spike trains are composed of similar underlying threads of activity, a hallmark of compositionality. The picture of spiking neural networks provided by our GNAT analysis points to new abstractions for spiking neural computation that are naturally adapted to the spatiotemporally distributed dynamics of spiking neural networks.
Submitted 29 June, 2023;
originally announced June 2023.
-
Probabilistic Neural Circuits leveraging AI-Enhanced Codesign for Random Number Generation
Authors:
Suma G. Cardwell,
Catherine D. Schuman,
J. Darby Smith,
Karan Patel,
Jaesuk Kwon,
Samuel Liu,
Christopher Allemang,
Shashank Misra,
Jean Anne Incorvia,
James B. Aimone
Abstract:
Stochasticity is ubiquitous in the world around us. However, our predominant computing paradigm is deterministic. Random number generation (RNG) can be a computationally inefficient operation in this system especially for larger workloads. Our work leverages the underlying physics of emerging devices to develop probabilistic neural circuits for RNGs from a given distribution. However, codesign for novel circuits and systems that leverage inherent device stochasticity is a hard problem. This is mostly due to the large design space and complexity of doing so. It requires concurrent input from multiple areas in the design stack from algorithms, architectures, circuits, to devices. In this paper, we present examples of optimal circuits developed leveraging AI-enhanced codesign techniques using constraints from emerging devices and algorithms. Our AI-enhanced codesign approach accelerated design and enabled interactions between experts from different areas of the microelectronics design stack including theory, algorithms, circuits, and devices. We demonstrate optimal probabilistic neural circuits using magnetic tunnel junction and tunnel diode devices that generate an RNG from a given distribution.
Submitted 1 December, 2022;
originally announced December 2022.
-
Stochastic Neuromorphic Circuits for Solving MAXCUT
Authors:
Bradley H. Theilman,
Yipu Wang,
Ojas D. Parekh,
William Severa,
J. Darby Smith,
James B. Aimone
Abstract:
Finding the maximum cut of a graph (MAXCUT) is a classic optimization problem that has motivated parallel algorithm development. While approximate algorithms to MAXCUT offer attractive theoretical guarantees and demonstrate compelling empirical performance, such approximation approaches can shift the dominant computational cost to the stochastic sampling operations. Neuromorphic computing, which uses the organizing principles of the nervous system to inspire new parallel computing architectures, offers a possible solution. One ubiquitous feature of natural brains is stochasticity: the individual elements of biological neural networks possess an intrinsic randomness that serves as a resource enabling their unique computational capacities. By designing circuits and algorithms that make use of randomness similarly to natural brains, we hypothesize that the intrinsic randomness in microelectronics devices could be turned into a valuable component of a neuromorphic architecture enabling more efficient computations. Here, we present neuromorphic circuits that transform the stochastic behavior of a pool of random devices into useful correlations that drive stochastic solutions to MAXCUT. We show that these circuits perform favorably in comparison to software solvers and argue that this neuromorphic hardware implementation provides a path for scaling advantages. This work demonstrates the utility of combining neuromorphic principles with intrinsic randomness as a computational resource for new computational architectures.
Submitted 5 October, 2022;
originally announced October 2022.
-
Sequence Learning and Consolidation on Loihi using On-chip Plasticity
Authors:
Jack Lindsey,
James B Aimone
Abstract:
In this work we develop a model of predictive learning on neuromorphic hardware. Our model uses the on-chip plasticity capabilities of the Loihi chip to remember observed sequences of events and use this memory to generate predictions of future events in real time. Given the locality constraints of on-chip plasticity rules, generating predictions without interfering with the ongoing learning process is nontrivial. We address this challenge with a memory consolidation approach inspired by hippocampal replay. Sequence memory is stored in an initial memory module using spike-timing dependent plasticity. Later, during an offline period, memories are consolidated into a distinct prediction module. This second module is then able to represent predicted future events without interfering with the activity, and plasticity, in the first module, enabling online comparison between predictions and ground-truth observations. Our model serves as a proof-of-concept that online predictive learning models can be deployed on neuromorphic hardware with on-chip plasticity.
Submitted 2 May, 2022;
originally announced May 2022.
-
Spiking Neural Streaming Binary Arithmetic
Authors:
James B. Aimone,
Aaron J. Hill,
William M. Severa,
Craig M. Vineyard
Abstract:
Boolean functions and binary arithmetic operations are central to standard computing paradigms. Accordingly, many advances in computing have focused upon how to make these operations more efficient as well as exploring what they can compute. To best leverage the advantages of novel computing paradigms it is important to consider what unique computing approaches they offer. However, for any special-purpose co-processor, Boolean functions and binary arithmetic operations are useful for, among other things, avoiding unnecessary I/O on-and-off the co-processor by pre- and post-processing data on-device. This is especially true for spiking neuromorphic architectures where these basic operations are not fundamental low-level operations. Instead, these functions require specific implementation. Here we discuss the implications of an advantageous streaming binary encoding method as well as a handful of circuits designed to exactly compute elementary Boolean and binary operations.
Submitted 23 March, 2022;
originally announced March 2022.
-
Neuromorphic scaling advantages for energy-efficient random walk computation
Authors:
J. Darby Smith,
Aaron J. Hill,
Leah E. Reeder,
Brian C. Franke,
Richard B. Lehoucq,
Ojas Parekh,
William Severa,
James B. Aimone
Abstract:
Computing stands to be radically improved by neuromorphic computing (NMC) approaches inspired by the brain's incredible efficiency and capabilities. Most NMC research, which aims to replicate the brain's computational structure and architecture in man-made hardware, has focused on artificial intelligence; however, less explored is whether this brain-inspired hardware can provide value beyond cognitive tasks. We demonstrate that the high-degree parallelism and configurability of spiking neuromorphic architectures make them well-suited to implement random walks via discrete-time Markov chains. Such random walks are useful in Monte Carlo methods, which represent a fundamental computational tool for solving a wide range of numerical computing tasks. Additionally, we show how the mathematical basis for a probabilistic solution involving a class of stochastic differential equations can leverage those simulations to provide solutions for a range of broadly applicable computational tasks. Despite being in an early development stage, we find that NMC platforms, at a sufficient scale, can drastically reduce the energy demands of high-performance computing (HPC) platforms.
Submitted 27 July, 2021;
originally announced July 2021.
-
Constant-Depth and Subcubic-Size Threshold Circuits for Matrix Multiplication
Authors:
Ojas Parekh,
Cynthia A. Phillips,
Conrad D. James,
James B. Aimone
Abstract:
Boolean circuits of McCulloch-Pitts threshold gates are a classic model of neural computation studied heavily in the late 20th century as a model of general computation. Recent advances in large-scale neural computing hardware have made their practical implementation a near-term possibility. We describe a theoretical approach for multiplying two $N$ by $N$ matrices that integrates threshold gate logic with conventional fast matrix multiplication algorithms that perform $O(N^\omega)$ arithmetic operations for a positive constant $\omega < 3$. Our approach converts such a fast matrix multiplication algorithm into a constant-depth threshold circuit with approximately $O(N^\omega)$ gates. Prior to our work, it was not known whether the $\Theta(N^3)$-gate barrier for matrix multiplication was surmountable by constant-depth threshold circuits.
Dense matrix multiplication is a core operation in convolutional neural network training. Performing this work on a neural architecture instead of off-loading it to a GPU may be an appealing option.
Submitted 25 June, 2020;
originally announced June 2020.
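For readers unfamiliar with the McCulloch-Pitts threshold gate mentioned in the abstract above, a minimal sketch of a single gate follows. It shows only the basic gate model (fire when a weighted sum reaches a threshold) and a few standard Boolean functions built from it; it is not the paper's matrix-multiplication construction, and the function names are chosen here for illustration.

```python
def threshold_gate(inputs, weights, threshold):
    """McCulloch-Pitts gate: output 1 when the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def and_gate(a, b):
    # Both inputs must be 1 for the sum to reach 2.
    return threshold_gate((a, b), (1, 1), 2)


def or_gate(a, b):
    # A single active input is enough to reach 1.
    return threshold_gate((a, b), (1, 1), 1)


def majority(bits):
    # Fires when strictly more than half of the inputs are 1.
    return threshold_gate(bits, [1] * len(bits), len(bits) // 2 + 1)


print(and_gate(1, 1), or_gate(0, 1), majority([1, 1, 0]))
```

A single gate of this kind computes majority over any number of inputs in one step, which is part of why constant-depth threshold circuits are more expressive than constant-depth circuits of AND and OR gates alone.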
-
Solving a steady-state PDE using spiking networks and neuromorphic hardware
Authors:
J. Darby Smith,
William Severa,
Aaron J. Hill,
Leah Reeder,
Brian Franke,
Richard B. Lehoucq,
Ojas D. Parekh,
James B. Aimone
Abstract:
The widely parallel, spiking neural networks of neuromorphic processors can enable computationally powerful formulations. While recent interest has focused primarily on machine learning tasks, the space of appropriate applications is wide and continually expanding. Here, we leverage the parallel and event-driven structure to solve a steady-state heat equation using a random walk method. The random walk can be executed fully within a spiking neural network using stochastic neuron behavior, and we provide results from both IBM TrueNorth and Intel Loihi implementations. Additionally, we position this algorithm as a potential scalable benchmark for neuromorphic systems.
Submitted 21 May, 2020;
originally announced May 2020.
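The random walk method for a steady-state heat equation can be illustrated in ordinary software with a one-dimensional toy problem. The sketch below is a conventional Monte Carlo version, not the spiking implementation from the paper; the grid size, boundary temperatures, and walker count are assumptions chosen for illustration. The temperature at an interior node is estimated as the average boundary temperature seen by unbiased walkers started from that node.

```python
import random


def walk_to_boundary(start, n_cells, rng):
    """Run one unbiased walk on nodes 0..n_cells and return the boundary node it exits at."""
    pos = start
    while 0 < pos < n_cells:
        pos += 1 if rng.random() < 0.5 else -1
    return pos


def estimate_temperature(start, n_cells, left_temp, right_temp, n_walkers, seed=0):
    """Average the boundary temperature reached by walkers released from `start`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walkers):
        exit_node = walk_to_boundary(start, n_cells, rng)
        total += right_temp if exit_node == n_cells else left_temp
    return total / n_walkers


def exact_temperature(start, n_cells, left_temp, right_temp):
    """Steady-state solution in 1D with fixed end temperatures is linear in position."""
    return left_temp + (right_temp - left_temp) * start / n_cells


# Hypothetical problem: rod with 10 cells, ends held at 0 and 100 degrees.
estimate = estimate_temperature(3, 10, 0.0, 100.0, n_walkers=20000)
exact = exact_temperature(3, 10, 0.0, 100.0)
print(estimate, exact)
```

Each walker is independent of the others, which is what makes this formulation a natural fit for highly parallel hardware.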
-
Composing Neural Algorithms with Fugu
Authors:
James B Aimone,
William Severa,
Craig M Vineyard
Abstract:
Neuromorphic hardware architectures represent a growing family of potential post-Moore's Law Era platforms. Largely due to event-driven processing inspired by the human brain, these computer platforms can offer significant energy benefits compared to traditional von Neumann processors. Unfortunately, there still remains considerable difficulty in successfully programming, configuring and deploying neuromorphic systems. We present the Fugu framework as an answer to this need. Rather than necessitating a developer attain intricate knowledge of how to program and exploit spiking neural dynamics to utilize the potential benefits of neuromorphic computing, Fugu is designed to provide a higher level abstraction as a hardware-independent mechanism for linking a variety of scalable spiking neural algorithms from a variety of sources. Individual kernels linked together provide sophisticated processing through compositionality. Fugu is intended to be suitable for a wide range of neuromorphic applications, including machine learning, scientific computing, and more brain-inspired neural algorithms. Ultimately, we hope the community adopts this and other open standardization attempts allowing for free exchange and easy implementations of the ever-growing list of spiking neural algorithms.
Submitted 28 May, 2019;
originally announced May 2019.
-
Whetstone: A Method for Training Deep Artificial Neural Networks for Binary Communication
Authors:
William Severa,
Craig M. Vineyard,
Ryan Dellana,
Stephen J. Verzi,
James B. Aimone
Abstract:
This paper presents a new technique for training networks for low-precision communication. Targeting minimal communication between nodes not only enables the use of emerging spiking neuromorphic platforms, but may additionally streamline processing conventionally. Low-power and embedded neuromorphic processors potentially offer dramatic performance-per-Watt improvements over traditional von Neumann processors; however, programming these brain-inspired platforms generally requires platform-specific expertise, which limits their applicability. To date, the majority of artificial neural networks have not operated using discrete spike-like communication.
We present a method for training deep spiking neural networks using an iterative modification of the backpropagation optimization algorithm. This method, which we call Whetstone, effectively and reliably configures a network for a spiking hardware target with little, if any, loss in performance. Whetstone networks use single time step binary communication and do not require a rate code or other spike-based coding scheme, thus producing networks comparable in timing and size to conventional ANNs, albeit with binarized communication. We demonstrate Whetstone on a number of image classification networks, describing how the sharpening process interacts with different training optimizers and changes the distribution of activity within the network. We further note that Whetstone is compatible with several non-classification neural network applications, such as autoencoders and semantic segmentation. Whetstone is widely extendable and currently implemented using custom activation functions within the Keras wrapper to the popular TensorFlow machine learning framework.
Submitted 26 October, 2018;
originally announced October 2018.
-
Resilient Computing with Reinforcement Learning on a Dynamical System: Case Study in Sorting
Authors:
Aleksandra Faust,
James B. Aimone,
Conrad D. James,
Lydia Tapia
Abstract:
Robots and autonomous agents often complete goal-based tasks with limited resources, relying on imperfect models and sensor measurements. In particular, reinforcement learning (RL) and feedback control can be used to help a robot achieve a goal. Taking advantage of this body of work, this paper formulates general computation as a feedback-control problem, which allows the agent to autonomously overcome some limitations of standard procedural language programming: resilience to errors and early program termination. Our formulation considers computation to be trajectory generation in the program's variable space. The computing then becomes a sequential decision-making problem, solved with RL and analyzed with Lyapunov stability theory to assess the agent's resilience and progression to the goal. We do this through a case study on a quintessential computer science problem, array sorting. Evaluations show that our RL sorting agent makes steady progress to an asymptotically stable goal, is resilient to faulty components, and performs fewer array manipulations than traditional Quicksort and Bubble sort.
Submitted 24 September, 2018;
originally announced September 2018.
-
Spiking Neural Algorithms for Markov Process Random Walk
Authors:
William Severa,
Rich Lehoucq,
Ojas Parekh,
James B. Aimone
Abstract:
The random walk is a fundamental stochastic process that underlies many numerical tasks in scientific computing applications. We consider here two neural algorithms that can be used to efficiently implement random walks on spiking neuromorphic hardware. The first method tracks the positions of individual walkers independently by using a modular code inspired by the grid cell spatial representation in the brain. The second method tracks the densities of random walkers at each spatial location directly. We analyze the scaling complexity of each of these methods and illustrate their ability to model random walkers under different probabilistic conditions.
Submitted 1 May, 2018;
originally announced May 2018.
-
Tracking Cyber Adversaries with Adaptive Indicators of Compromise
Authors:
Justin E. Doak,
Joe B. Ingram,
Sam A. Mulder,
John H. Naegle,
Jonathan A. Cox,
James B. Aimone,
Kevin R. Dixon,
Conrad D. James,
David R. Follett
Abstract:
A forensics investigation after a breach often uncovers network and host indicators of compromise (IOCs) that can be deployed to sensors to allow early detection of the adversary in the future. Over time, the adversary will change tactics, techniques, and procedures (TTPs), which will also change the data generated. If the IOCs are not kept up-to-date with the adversary's new TTPs, the adversary will no longer be detected once all of the IOCs become invalid. Tracking the Known (TTK) is the problem of keeping IOCs, in this case regular expressions (regexes), up-to-date with a dynamic adversary. Our framework solves the TTK problem in an automated, cyclic fashion to bracket a previously discovered adversary. This tracking is accomplished through a data-driven approach of self-adapting a given model based on its own detection capabilities.
In our initial experiments, we found that the true positive rate (TPR) of the adaptive solution degrades much less significantly over time than the naive solution, suggesting that self-updating the model allows the continued detection of positives (i.e., adversaries). The cost for this performance is in the false positive rate (FPR), which increases over time for the adaptive solution, but remains constant for the naive solution. However, the difference in overall detection performance, as measured by the area under the curve (AUC), between the two methods is negligible. This result suggests that self-updating the model over time should be done in practice to continue to detect known, evolving adversaries.
Submitted 20 December, 2017;
originally announced December 2017.
-
Context-modulation of hippocampal dynamics and deep convolutional networks
Authors:
James B. Aimone,
William M. Severa
Abstract:
Complex architectures of biological neural circuits, such as parallel processing pathways, have been behaviorally implicated in many cognitive studies. However, the theoretical consequences of circuit complexity on neural computation have only been explored in limited cases. Here, we introduce a mechanism by which direct and indirect pathways from cortex to the CA3 region of the hippocampus can balance both contextual gating of memory formation and driving network activity. We implement this concept in a deep artificial neural network by enabling a context-sensitive bias. The motivation for this is to improve performance of a size-constrained network. Using direct knowledge of the superclass information in the CIFAR-100 and Fashion-MNIST datasets, we show a dramatic increase in performance without an increase in network size.
Submitted 27 November, 2017;
originally announced November 2017.
-
Dynamic Analysis of Executables to Detect and Characterize Malware
Authors:
Michael R. Smith,
Joe B. Ingram,
Christopher C. Lamb,
Timothy J. Draelos,
Justin E. Doak,
James B. Aimone,
Conrad D. James
Abstract:
Ensuring the integrity of systems that process sensitive information and control many aspects of everyday life is essential. We examine the use of machine learning algorithms to detect malware using the system calls generated by executables, alleviating attempts at obfuscation as the behavior is monitored rather than the bytes of an executable. We examine several machine learning techniques for detecting malware including random forests, deep learning techniques, and liquid state machines. The experiments examine the effects of concept drift on each algorithm to understand how well the algorithms generalize to novel malware samples by testing them on data that was collected after the training data. The results suggest that each of the examined machine learning algorithms is a viable solution to detect malware, achieving between 90% and 95% class-averaged accuracy (CAA). In real-world scenarios, the performance evaluation on an operational network may not match the performance achieved in training. Namely, the CAA may be about the same, but the values for precision and recall over the malware can change significantly. We structure experiments to highlight these caveats and offer insights into expected performance in operational environments. In addition, we use the induced models to gain a better understanding about what differentiates the malware samples from the goodware, which can further be used as a forensics tool to understand what the malware (or goodware) was doing to provide directions for investigation and remediation.
Submitted 28 September, 2018; v1 submitted 10 November, 2017;
originally announced November 2017.
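As a rough illustration of the caveat in the abstract above (CAA staying about the same while precision over the malware class shifts), the sketch below holds the per-class detection rates fixed and changes only the class mix. All counts and rates are hypothetical numbers chosen for illustration; they are not results from the paper, and the helper name `metrics` is an assumption of this sketch. Recall is held fixed here by construction, so only the precision effect is shown.

```python
def metrics(n_malware, n_goodware, tpr, tnr):
    """Return (CAA, precision, recall) for the malware class.

    tpr: fraction of malware correctly flagged (recall on malware)
    tnr: fraction of goodware correctly passed (recall on goodware)
    All inputs are hypothetical illustration values.
    """
    true_pos = tpr * n_malware             # malware flagged as malware
    false_pos = (1.0 - tnr) * n_goodware   # goodware flagged as malware
    caa = (tpr + tnr) / 2.0                # mean of the per-class recalls
    precision = true_pos / (true_pos + false_pos)
    return caa, precision, tpr


# Balanced evaluation set, as is common in training and testing.
balanced = metrics(n_malware=1000, n_goodware=1000, tpr=0.92, tnr=0.92)

# Hypothetical operational mix where malware is rare.
operational = metrics(n_malware=10, n_goodware=1990, tpr=0.92, tnr=0.92)

for name, (caa, precision, recall) in [("balanced", balanced),
                                       ("operational", operational)]:
    print(f"{name}: CAA={caa:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Because CAA averages per-class recall, it does not depend on how many samples each class contributes, whereas precision depends directly on the ratio of goodware to malware in the evaluated traffic.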
-
Data-driven Feature Sampling for Deep Hyperspectral Classification and Segmentation
Authors:
William M. Severa,
Jerilyn A. Timlin,
Suraj Kholwadwala,
Conrad D. James,
James B. Aimone
Abstract:
The high dimensionality of hyperspectral imaging poses unique challenges in scope, size and processing requirements. Motivated by the potential for an in-the-field cell sorting detector, we examine a $\textit{Synechocystis sp.}$ PCC 6803 dataset wherein cells are grown alternatively in nitrogen-rich or nitrogen-deplete cultures. We use deep learning techniques to both successfully classify cells and generate a mask segmenting the cells/condition from the background. Further, we use the classification accuracy to guide a data-driven, iterative feature selection method, allowing the design of neural networks requiring 90% fewer input features with little accuracy degradation.
Submitted 26 October, 2017;
originally announced October 2017.
-
Exponential scaling of neural algorithms - a future beyond Moore's Law?
Authors:
James B. Aimone
Abstract:
Although the brain has long been considered a potential inspiration for future computing, Moore's Law - the scaling property that has seen revolutions in technologies ranging from supercomputers to smart phones - has largely been driven by advances in materials science. As the ability to miniaturize transistors is coming to an end, there is increasing attention on new approaches to computation, including renewed enthusiasm around the potential of neural computation. This paper describes how recent advances in neurotechnologies, many of which have been aided by computing's rapid progression over recent decades, are now reigniting this opportunity to bring neural computation insights into broader computing applications. As we understand more about the brain, our ability to motivate new computing paradigms will continue to progress. These new approaches to computing, which we are already seeing in techniques such as deep learning and neuromorphic hardware, will themselves improve our ability to learn about the brain and accordingly can be projected to give rise to even further insights. This paper describes how this positive feedback has the potential to change the complexion of how computing sciences and neurosciences interact, and suggests that the next form of exponential scaling in computing may emerge from our progressive understanding of the brain.
Submitted 13 July, 2017; v1 submitted 4 May, 2017;
originally announced May 2017.
-
A Digital Neuromorphic Architecture Efficiently Facilitating Complex Synaptic Response Functions Applied to Liquid State Machines
Authors:
Michael R. Smith,
Aaron J. Hill,
Kristofor D. Carlson,
Craig M. Vineyard,
Jonathon Donaldson,
David R. Follett,
Pamela L. Follett,
John H. Naegle,
Conrad D. James,
James B. Aimone
Abstract:
Information in neural networks is represented as weighted connections, or synapses, between neurons. This poses a problem as the primary computational bottleneck for neural networks is the vector-matrix multiply when inputs are multiplied by the neural network weights. Conventional processing architectures are not well suited for simulating neural networks, often requiring large amounts of energy and time. Additionally, synapses in biological neural networks are not binary connections, but exhibit a nonlinear response function as neurotransmitters are emitted and diffuse between neurons. Inspired by neuroscience principles, we present a digital neuromorphic architecture, the Spiking Temporal Processing Unit (STPU), capable of modeling arbitrary complex synaptic response functions without requiring additional hardware components. We consider the paradigm of spiking neurons with temporally coded information as opposed to the non-spiking rate-coded neurons used in most neural networks. In this paradigm we examine liquid state machines applied to speech recognition and show how a liquid state machine with temporal dynamics maps onto the STPU, demonstrating the flexibility and efficiency of the STPU for instantiating neural algorithms.
Submitted 21 March, 2017;
originally announced April 2017.
-
Neurogenesis Deep Learning
Authors:
Timothy J. Draelos,
Nadine E. Miner,
Christopher C. Lamb,
Jonathan A. Cox,
Craig M. Vineyard,
Kristofor D. Carlson,
William M. Severa,
Conrad D. James,
James B. Aimone
Abstract:
Neural machine learning methods, such as deep neural networks (DNN), have achieved remarkable success in a number of complex data processing tasks. These methods have arguably had their strongest impact on tasks such as image and audio processing - data processing domains in which humans have long held clear advantages over conventional algorithms. In contrast to biological neural systems, which are capable of learning continuously, deep artificial networks have a limited ability for incorporating new information in an already trained network. As a result, methods for continuous learning are potentially highly impactful in enabling the application of deep networks to dynamic data sets. Here, inspired by the process of adult neurogenesis in the hippocampus, we explore the potential for adding new neurons to deep layers of artificial neural networks in order to facilitate their acquisition of novel information while preserving previously trained data representations. Our results on the MNIST handwritten digit dataset and the NIST SD 19 dataset, which includes lower and upper case letters and digits, demonstrate that neurogenesis is well suited for addressing the stability-plasticity dilemma that has long challenged adaptive machine learning algorithms.
Submitted 28 March, 2017; v1 submitted 12 December, 2016;
originally announced December 2016.