Search | arXiv e-print repository

High Order Dynamic Mode Decomposition for Mechanical Vibrations and Modal Analysis

Authors: Andreas Tuor, Nico Canzani, Tobias Rüggeberg, Stefan Gorenflo, Gerd Simons, Bruno Bättig, Daniel Iseli

Abstract: In many mechanical, electrical, and general physical systems evolving over time or space, spectral analysis methods such as the Fast Fourier Transform (FFT), Short Term Fourier Transform (STFT), and Power Spectrum Density (PSD) play a very important role. They allow the required information content to be extracted from signals in another basis by decomposing them into their spectral components for further processing. In theory this approach is very powerful, and in some 'simple' or 'not too complicated' practical cases it has proven its utility and efficiency. However, for real-world applications such as mechanical modal analysis of large-dimension systems including damping, noise, and unpredictable excitation, those signals are often so complex that it can be almost impossible to obtain a high-resolution spectral decomposition with these methods due to the time-bandwidth limitation. In this paper we describe an alternative approach for spectral analysis based on the High Order Dynamical Mode Decomposition (HODMD) and Kernel Density Spectrum (KDS). We will show that this method overcomes some limitations of the FFT and may be a promising approach for a much more precise spectral decomposition.

Submitted 19 June, 2023; originally announced June 2023.

arXiv:2208.07333 [pdf, other]

Domain-aware Control-oriented Neural Models for Autonomous Underwater Vehicles

Authors: Wenceslao Shaw Cortez, Soumya Vasisht, Aaron Tuor, Ján Drgoňa, Draguna Vrabie

Abstract: Conventional physics-based modeling is a time-consuming bottleneck in control design for complex nonlinear systems like autonomous underwater vehicles (AUVs). In contrast, purely data-driven models, though convenient and quick to obtain, require a large number of observations and lack operational guarantees for safety-critical systems. Data-driven models leveraging available partially characterized dynamics have potential to provide reliable systems models in a typical data-limited scenario for high value complex systems, thereby avoiding months of expensive expert modeling time. In this work we explore this middle-ground between expert-modeled and pure data-driven modeling. We present control-oriented parametric models with varying levels of domain-awareness that exploit known system structure and prior physics knowledge to create constrained deep neural dynamical system models. We employ universal differential equations to construct data-driven blackbox and graybox representations of the AUV dynamics. In addition, we explore a hybrid formulation that explicitly models the residual error related to imperfect graybox models. We compare the prediction performance of the learned models for different distributions of initial conditions and control inputs to assess their accuracy, generalization, and suitability for control.

Submitted 15 August, 2022; originally announced August 2022.

Comments: 6 pages, 6 figures, submitted to 12th IFAC Symposium on Nonlinear Control Systems 2022

arXiv:2208.02319 [pdf, other]

Differentiable Predictive Control with Safety Guarantees: A Control Barrier Function Approach

Authors: Wenceslao Shaw Cortez, Jan Drgona, Aaron Tuor, Mahantesh Halappanavar, Draguna Vrabie

Abstract: We develop a novel form of differentiable predictive control (DPC) with safety and robustness guarantees based on control barrier functions. DPC is an unsupervised learning-based method for obtaining approximate solutions to explicit model predictive control (MPC) problems. In DPC, the predictive control policy parametrized by a neural network is optimized offline via direct policy gradients obtained by automatic differentiation of the MPC problem. The proposed approach exploits a new form of sampled-data barrier function to enforce offline and online safety requirements in DPC settings while only interrupting the neural network-based controller near the boundary of the safe set. The effectiveness of the proposed approach is demonstrated in simulation.

Submitted 3 August, 2022; originally announced August 2022.

Comments: Accepted to IEEE Conference on Decision and Control 2022

arXiv:2207.04962 [pdf, other]

doi 10.1063/5.0109093

Structural Inference of Networked Dynamical Systems with Universal Differential Equations

Authors: James Koch, Zhao Chen, Aaron Tuor, Jan Drgona, Draguna Vrabie

Abstract: Networked dynamical systems are common throughout science and engineering; e.g., biological networks, reaction networks, power systems, and the like. For many such systems, nonlinearity drives populations of identical (or near-identical) units to exhibit a wide range of nontrivial behaviors, such as the emergence of coherent structures (e.g., waves and patterns) or otherwise notable dynamics (e.g., synchrony and chaos). In this work, we seek to infer (i) the intrinsic physics of a base unit of a population, (ii) the underlying graphical structure shared between units, and (iii) the coupling physics of a given networked dynamical system given observations of nodal states. These tasks are formulated around the notion of the Universal Differential Equation, whereby unknown dynamical systems can be approximated with neural networks, mathematical terms known a priori (albeit with unknown parameterizations), or combinations of the two. We demonstrate the value of these inference tasks by investigating not only future state predictions but also the inference of system behavior on varied network topologies. The effectiveness and utility of these methods are shown with their application to canonical networked nonlinear coupled oscillators.

Submitted 11 July, 2022; originally announced July 2022.

arXiv:2205.10728 [pdf, other]

Neural Lyapunov Differentiable Predictive Control

Authors: Sayak Mukherjee, Ján Drgoňa, Aaron Tuor, Mahantesh Halappanavar, Draguna Vrabie

Abstract: We present a learning-based predictive control methodology using the differentiable programming framework with probabilistic Lyapunov-based stability guarantees. The neural Lyapunov differentiable predictive control (NLDPC) learns the policy by constructing a computational graph encompassing the system dynamics, state and input constraints, and the necessary Lyapunov certification constraints, and thereafter using the automatic differentiation to update the neural policy parameters. In conjunction, our approach jointly learns a Lyapunov function that certifies the regions of state-space with stable dynamics. We also provide a sampling-based statistical guarantee for the training of NLDPC from the distribution of initial conditions. Our offline training approach provides a computationally efficient and scalable alternative to classical explicit model predictive control solutions. We substantiate the advantages of the proposed approach with simulations to stabilize the double integrator model and on an example of controlling an aircraft model.

Submitted 21 May, 2022; originally announced May 2022.

Comments: 8 pages; 9 figures

arXiv:2203.10582 [pdf, other]

Neuro-physical dynamic load modeling using differentiable parametric optimization

Authors: Shrirang Abhyankar, Jan Drgona, Andrew August, Elliot Skomski, Aaron Tuor

Abstract: In this work, we investigate a data-driven approach for obtaining a reduced equivalent load model of distribution systems for electromechanical transient stability analysis. The proposed reduced equivalent is a neuro-physical model comprising a traditional ZIP load model augmented with a neural network. This neuro-physical model is trained through differentiable programming. We discuss the formulation, modeling details, and training of the proposed model set up as a differential parametric program. The performance and accuracy of this neuro-physical ZIP load model are presented on a medium-scale 350-bus transmission-distribution network.

Submitted 20 March, 2022; originally announced March 2022.

Comments: 7 pages, 9 figures

Report number: PNNL-SA-167146

arXiv:2203.08984 [pdf, other]

Koopman-based Differentiable Predictive Control for the Dynamics-Aware Economic Dispatch Problem

Authors: Ethan King, Jan Drgona, Aaron Tuor, Shrirang Abhyankar, Craig Bakker, Arnab Bhattacharya, Draguna Vrabie

Abstract: The dynamics-aware economic dispatch (DED) problem embeds low-level generator dynamics and operational constraints to enable near real-time scheduling of generation units in a power network. DED produces a more dynamic supervisory control policy than traditional economic dispatch (T-ED) that leads to reduced overall generation costs. However, the incorporation of differential equations that govern the system dynamics makes DED an optimization problem that is computationally prohibitive to solve. In this work, we present a new data-driven approach based on differentiable programming to efficiently obtain parametric solutions to the underlying DED problem. In particular, we employ the recently proposed differentiable predictive control (DPC) for offline learning of explicit neural control policies using an identified Koopman operator (KO) model of the power system dynamics. We demonstrate the high solution quality and five orders of magnitude computational-time savings of the DPC method over the original online optimization-based DED approach on a 9-bus test power grid network.

Submitted 16 March, 2022; originally announced March 2022.

Comments: The code for producing this work is available in the repo: https://github.com/pnnl/neuromancer/tree/DED_DPC

arXiv:2203.01447 [pdf, other]

Learning Stochastic Parametric Differentiable Predictive Control Policies

Authors: Ján Drgoňa, Sayak Mukherjee, Aaron Tuor, Mahantesh Halappanavar, Draguna Vrabie

Abstract: The problem of synthesizing stochastic explicit model predictive control policies is known to be quickly intractable even for systems of modest complexity when using classical control-theoretic methods. To address this challenge, we present a scalable alternative called stochastic parametric differentiable predictive control (SP-DPC) for unsupervised learning of neural control policies governing stochastic linear systems subject to nonlinear chance constraints. SP-DPC is formulated as a deterministic approximation to the stochastic parametric constrained optimal control problem. This formulation allows us to directly compute the policy gradients via automatic differentiation of the problem's value function, evaluated over sampled parameters and uncertainties. In particular, the computed expectation of the SP-DPC problem's value function is backpropagated through the closed-loop system rollouts parametrized by a known nominal system dynamics model and neural control policy which allows for direct model-based policy optimization. We provide theoretical probabilistic guarantees for policies learned via the SP-DPC method on closed-loop stability and chance constraints satisfaction. Furthermore, we demonstrate the computational efficiency and scalability of the proposed policy optimization algorithm in three numerical examples, including systems with a large number of states or subject to nonlinear constraints.

Submitted 21 May, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

Comments: Full version for the paper accepted at the 10th IFAC Symposium on Robust Control Design (ROCOND) 2022

arXiv:2203.00120 [pdf, other]

Neural Ordinary Differential Equations for Nonlinear System Identification

Authors: Aowabin Rahman, Ján Drgoňa, Aaron Tuor, Jan Strube

Abstract: Neural ordinary differential equations (NODE) have been recently proposed as a promising approach for nonlinear system identification tasks. In this work, we systematically compare their predictive performance with current state-of-the-art nonlinear and classical linear methods. In particular, we present a quantitative study comparing NODE's performance against neural state-space models and classical linear system identification methods. We evaluate the inference speed and prediction performance of each method on open-loop errors across eight different dynamical systems. The experiments show that NODEs can consistently improve the prediction accuracy by an order of magnitude compared to benchmark methods. Besides improved accuracy, we also observed that NODEs are less sensitive to hyperparameters compared to neural state-space models. On the other hand, these performance gains come with a slight increase of computation at the inference time.

Submitted 15 March, 2022; v1 submitted 28 February, 2022; originally announced March 2022.

arXiv:2107.11843 [pdf, other]

Deep Learning Explicit Differentiable Predictive Control Laws for Buildings

Authors: Jan Drgona, Aaron Tuor, Soumya Vasisht, Elliott Skomski, Draguna Vrabie

Abstract: We present a differentiable predictive control (DPC) methodology for learning constrained control laws for unknown nonlinear systems. DPC poses an approximate solution to multiparametric programming problems emerging from explicit nonlinear model predictive control (MPC). Contrary to approximate MPC, DPC does not require supervision by an expert controller. Instead, a system dynamics model is learned from the observed system's dynamics, and the neural control law is optimized offline by leveraging the differentiable closed-loop system model. The combination of a differentiable closed-loop system and penalty methods for constraint handling of system outputs and inputs allows us to optimize the control law's parameters directly by backpropagating economic MPC loss through the learned system model. The control performance of the proposed DPC method is demonstrated in simulation using a learned model of multi-zone building thermal dynamics.

Submitted 25 July, 2021; originally announced July 2021.

arXiv:2104.03496 [pdf, other]

Prototypical Region Proposal Networks for Few-Shot Localization and Classification

Authors: Elliott Skomski, Aaron Tuor, Andrew Avila, Lauren Phillips, Zachary New, Henry Kvinge, Courtney D. Corley, Nathan Hodas

Abstract: Recently proposed few-shot image classification methods have generally focused on use cases where the objects to be classified are the central subject of images. Despite success on benchmark vision datasets aligned with this use case, these methods typically fail on use cases involving densely-annotated, busy images: images common in the wild where objects of relevance are not the central subject, instead appearing potentially occluded, small, or among other incidental objects belonging to other classes of potential interest. To localize relevant objects, we employ a prototype-based few-shot segmentation model which compares the encoded features of unlabeled query images with support class centroids to produce region proposals indicating the presence and location of support set classes in a query image. These region proposals are then used as additional conditioning input to few-shot image classifiers. We develop a framework to unify the two stages (segmentation and classification) into an end-to-end classification model -- PRoPnet -- and empirically demonstrate that our methods improve accuracy on image datasets with natural scenes containing multiple object classes.

Submitted 8 April, 2021; originally announced April 2021.

Comments: 9 pages, 1 figure. Submitted to 4th Workshop on Meta-Learning at NeurIPS 2020

arXiv:2101.01864 [pdf, other]

Constrained Block Nonlinear Neural Dynamical Models

Authors: Elliott Skomski, Soumya Vasisht, Colby Wight, Aaron Tuor, Jan Drgona, Draguna Vrabie

Abstract: Neural network modules conditioned by known priors can be effectively trained and combined to represent systems with nonlinear dynamics. This work explores a novel formulation for data-efficient learning of deep control-oriented nonlinear dynamical models by embedding local model structure and constraints. The proposed method consists of neural network blocks that represent input, state, and output dynamics with constraints placed on the network weights and system variables. For handling partially observable dynamical systems, we utilize a state observer neural network to estimate the states of the system's latent dynamics. We evaluate the performance of the proposed architecture and training methods on system identification tasks for three nonlinear systems: a continuous stirred tank reactor, a two tank interacting system, and an aerodynamics body. Models optimized with a few thousand system state observations accurately represent system dynamics in open loop simulation over thousands of time steps from a single set of initial conditions. Experimental results demonstrate an order of magnitude reduction in open-loop simulation mean squared error for our constrained, block-structured neural models when compared to traditional unstructured and unconstrained neural network models.

Submitted 5 January, 2021; originally announced January 2021.

Comments: 10 pages. Submitted to American Control Conference ACC 2020. Under review

arXiv:2011.13497 [pdf, other]

Physics-Informed Neural State Space Models via Learning and Evolution

Authors: Elliott Skomski, Jan Drgona, Aaron Tuor

Abstract: Recent works exploring deep learning application to dynamical systems modeling have demonstrated that embedding physical priors into neural networks can yield more effective, physically-realistic, and data-efficient models. However, in the absence of complete prior knowledge of a dynamical system's physical characteristics, determining the optimal structure and optimization strategy for these models can be difficult. In this work, we explore methods for discovering neural state space dynamics models for system identification. Starting with a design space of block-oriented state space models and structured linear maps with strong physical priors, we encode these components into a model genome alongside network structure, penalty constraints, and optimization hyperparameters. Demonstrating the overall utility of the design space, we employ an asynchronous genetic search algorithm that alternates between model selection and optimization and obtains accurate physically consistent models of three physical systems: an aerodynamics body, a continuous stirred tank reactor, and a two tank interacting system.

Submitted 26 November, 2020; originally announced November 2020.

Comments: Submitted to 3rd Annual Learning for Dynamics & Control Conference. 9 pages. 4 figures

arXiv:2011.13492 [pdf, other]

Dissipative Deep Neural Dynamical Systems

Authors: Jan Drgona, Soumya Vasisht, Aaron Tuor, Draguna Vrabie

Abstract: In this paper, we provide sufficient conditions for dissipativity and local asymptotic stability of discrete-time dynamical systems parametrized by deep neural networks. We leverage the representation of neural networks as pointwise affine maps, thus exposing their local linear operators and making them accessible to classical system analytic and design methods. This allows us to "crack open the black box" of the neural dynamical system's behavior by evaluating their dissipativity, and estimating their stationary points and state-space partitioning. We relate the norms of these local linear operators to the energy stored in the dissipative system with supply rates represented by their aggregate bias terms. Empirically, we analyze the variance in dynamical behavior and eigenvalue spectra of these local linear operators with varying weight factorizations, activation functions, bias terms, and depths.

Submitted 8 June, 2022; v1 submitted 26 November, 2020; originally announced November 2020.

Comments: Under review at IEEE Open Journal of Control Systems

arXiv:2011.05987 [pdf, other]

Physics-constrained Deep Learning of Multi-zone Building Thermal Dynamics

Authors: Jan Drgona, Aaron R. Tuor, Vikas Chandan, Draguna L. Vrabie

Abstract: We present a physics-constrained control-oriented deep learning method for modeling building thermal dynamics. The proposed method is based on the systematic encoding of physics-based prior knowledge into a structured recurrent neural architecture. Specifically, our method incorporates structural priors from traditional physics-based building modeling into the neural network thermal dynamics model structure. Further, we leverage penalty methods to provide inequality constraints, thereby bounding predictions within physically realistic and safe operating ranges. Observing that stable eigenvalues accurately characterize the dissipativeness of the system, we additionally use a constrained matrix parameterization based on the Perron-Frobenius theorem to bound the dominant eigenvalues of the building thermal model parameter matrices. We demonstrate the proposed data-driven modeling approach's effectiveness and physical interpretability on a dataset obtained from a real-world office building with 20 thermal zones. Using only 10 days' measurements for training, we demonstrate generalization over 20 consecutive days, significantly improving the accuracy compared to prior state-of-the-art results reported in the literature.

Submitted 11 November, 2020; originally announced November 2020.

arXiv:2011.03699 [pdf, other]

Deep Learning Alternative to Explicit Model Predictive Control for Unknown Nonlinear Systems

Authors: Jan Drgona, Karol Kis, Aaron Tuor, Draguna Vrabie, Martin Klauco

Abstract: We present differentiable predictive control (DPC) as a deep learning-based alternative to the explicit model predictive control (MPC) for unknown nonlinear systems. In the DPC framework, a neural state-space model is learned from time-series measurements of the system dynamics. The neural control policy is then optimized via a stochastic gradient descent approach by differentiating the MPC loss function through the closed-loop system dynamics model. The proposed DPC method learns model-based control policies with state and input constraints, while supporting time-varying references and constraints. In an embedded implementation using a Raspberry-Pi platform, we experimentally demonstrate that it is possible to train constrained control policies purely based on the measurements of the unknown nonlinear system. We compare the control performance of the DPC method against explicit MPC and report efficiency gains in online computational demands, memory requirements, policy complexity, and construction time. In particular, we show that our method scales linearly compared to exponential scalability of the explicit MPC solved via multiparametric programming.

Submitted 26 July, 2021; v1 submitted 7 November, 2020; originally announced November 2020.

arXiv:2009.11253 [pdf, other]

Fuzzy Simplicial Networks: A Topology-Inspired Model to Improve Task Generalization in Few-shot Learning

Authors: Henry Kvinge, Zachary New, Nico Courts, Jung H. Lee, Lauren A. Phillips, Courtney D. Corley, Aaron Tuor, Andrew Avila, Nathan O. Hodas

Abstract: Deep learning has shown great success in settings with massive amounts of data but has struggled when data is limited. Few-shot learning algorithms, which seek to address this limitation, are designed to generalize well to new tasks with limited data. Typically, models are evaluated on unseen classes and datasets that are defined by the same fundamental task as they are trained for (e.g. category membership). One can also ask how well a model can generalize to fundamentally different tasks within a fixed dataset (for example: moving from category membership to tasks that involve detecting object orientation or quantity). To formalize this kind of shift we define a notion of "independence of tasks" and identify three new sets of labels for established computer vision datasets that test a model's ability to generalize to tasks which draw on orthogonal attributes in the data. We use these datasets to investigate the failure modes of metric-based few-shot models. Based on our findings, we introduce a new few-shot model called Fuzzy Simplicial Networks (FSN) which leverages a construction from topology to more flexibly represent each class from limited data. In particular, FSN models can not only form multiple representations for a given class but can also begin to capture the low-dimensional structure which characterizes class manifolds in the encoded space of deep networks. We show that FSN outperforms state-of-the-art models on the challenging tasks we introduce in this paper while remaining competitive on standard few-shot benchmarks.

Submitted 23 September, 2020; originally announced September 2020.

Comments: 17 pages

arXiv:2004.11514 [pdf, other]

Systematic Evaluation of Backdoor Data Poisoning Attacks on Image Classifiers

Authors: Loc Truong, Chace Jones, Brian Hutchinson, Andrew August, Brenda Praggastis, Robert Jasper, Nicole Nichols, Aaron Tuor

Abstract: Backdoor data poisoning attacks have recently been demonstrated in computer vision research as a potential safety risk for machine learning (ML) systems. Traditional data poisoning attacks manipulate training data to induce unreliability of an ML model, whereas backdoor data poisoning attacks maintain system performance unless the ML model is presented with an input containing an embedded "trigger" that provides a predetermined response advantageous to the adversary. Our work builds upon prior backdoor data-poisoning research for ML image classifiers and systematically assesses different experimental conditions including types of trigger patterns, persistence of trigger patterns during retraining, poisoning strategies, architectures (ResNet-50, NasNet, NasNet-Mobile), datasets (Flowers, CIFAR-10), and potential defensive regularization techniques (Contrastive Loss, Logit Squeezing, Manifold Mixup, Soft-Nearest-Neighbors Loss). Experiments yield four key findings. First, the success rate of backdoor poisoning attacks varies widely, depending on several factors, including model architecture, trigger pattern and regularization technique. Second, we find that poisoned models are hard to detect through performance inspection alone. Third, regularization typically reduces backdoor success rate, although it can have no effect or even slightly increase it, depending on the form of regularization. Finally, backdoors inserted through data poisoning can be rendered ineffective after just a few epochs of additional training on a small set of clean data without affecting the model's performance.

Submitted 23 April, 2020; originally announced April 2020.

arXiv:2004.11184 [pdf, other]

Learning Constrained Adaptive Differentiable Predictive Control Policies With Guarantees

Authors: Jan Drgona, Aaron Tuor, Draguna Vrabie

Abstract: We present differentiable predictive control (DPC), a method for learning constrained neural control policies for linear systems with probabilistic performance guarantees. We employ automatic differentiation to obtain direct policy gradients by backpropagating the model predictive control (MPC) loss function and constraints penalties through a differentiable closed-loop system dynamics model. We demonstrate that the proposed method can learn parametric constrained control policies to stabilize systems with unstable dynamics, track time-varying references, and satisfy nonlinear state and input constraints. In contrast with imitation learning-based approaches, our method does not depend on a supervisory controller. Most importantly, we demonstrate that, without losing performance, our method is scalable and computationally more efficient than implicit, explicit, and approximate MPC. Under review at IEEE Transactions on Automatic Control.

Submitted 27 January, 2022; v1 submitted 23 April, 2020; originally announced April 2020.

Comments: 31 pages. Code for reproducing our experiments is available at: https://github.com/pnnl/deps_arXiv20204 Under review at IEEE Transactions on Automatic Control

arXiv:2004.10883 [pdf, other]

Constrained Neural Ordinary Differential Equations with Stability Guarantees

Authors: Aaron Tuor, Jan Drgona, Draguna Vrabie

Abstract: Differential equations are frequently used in engineering domains, such as modeling and control of industrial systems, where safety and performance guarantees are of paramount importance. Traditional physics-based modeling approaches require domain expertise and are often difficult to tune or adapt to new systems. In this paper, we show how to model discrete ordinary differential equations (ODE) with algebraic nonlinearities as deep neural networks with varying degrees of prior knowledge. We derive the stability guarantees of the network layers based on the implicit constraints imposed on the weight's eigenvalues. Moreover, we show how to use barrier methods to generically handle additional inequality constraints. We demonstrate the prediction accuracy of learned neural ODEs evaluated on open-loop simulations compared to ground truth dynamics with bi-linear terms.

Submitted 22 April, 2020; originally announced April 2020.

Comments: 4 pages, Appendix

Journal ref: Presented at DEEPDIFFEQ 2020 : ICLR Workshop on Integration of Deep Neural Models and Differential Equations

arXiv:1902.06231 [pdf, other]

Multiple Document Representations from News Alerts for Automated Bio-surveillance Event Detection

Authors: Aaron Tuor, Fnu Anubhav, Lauren Charles

Abstract: Due to globalization, geographic boundaries no longer serve as effective shields for the spread of infectious diseases. In order to aid bio-surveillance analysts in disease tracking, recent research has been devoted to developing information retrieval and analysis methods utilizing the vast corpora of publicly available documents on the internet. In this work, we present methods for the automated retrieval and classification of documents related to active public health events. We demonstrate classification performance on an auto-generated corpus, using recurrent neural network, TF-IDF, and Naive Bayes log count ratio document representations. By jointly modeling the title and description of a document, we achieve 97% recall and 93.3% accuracy with our best performing bio-surveillance event classification model: logistic regression on the combined output from a pair of bidirectional recurrent neural networks.

Submitted 17 February, 2019; originally announced February 2019.

Comments: Presented at the 5th Pacific Northwest Regional NLP Workshop: NW-NLP 2018

arXiv:1803.04967 [pdf, other]

Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection

Authors: Andy Brown, Aaron Tuor, Brian Hutchinson, Nicole Nichols

Abstract: Deep learning has recently demonstrated state-of-the-art performance on key tasks related to the maintenance of computer systems, such as intrusion detection, denial of service attack detection, hardware and software system failures, and malware detection. In these contexts, model interpretability is vital for administrators and analysts to trust and act on the automated analysis of machine learning models. Deep learning methods have been criticized as black box oracles which allow limited insight into decision factors. In this work we seek to "bridge the gap" between the impressive performance of deep learning models and the need for interpretable model introspection. To this end we present recurrent neural network (RNN) language models augmented with attention for anomaly detection in system logs. Our methods are generally applicable to any computer system and logging source. By incorporating attention variants into our RNN language models we create opportunities for model introspection and analysis without sacrificing state-of-the-art performance. We demonstrate model performance and illustrate model interpretability on an intrusion detection task using the Los Alamos National Laboratory (LANL) cyber security dataset, reporting upward of 0.99 area under the receiver operator characteristic curve despite being trained only on a single day's worth of data.

Submitted 13 March, 2018; originally announced March 2018.

Comments: Submitted to the First Workshop On Machine Learning for Computer Systems, ACM HPDC 2018

arXiv:1803.04659 [pdf, other]

Protein Mutation Stability Ternary Classification using Neural Networks and Rigidity Analysis

Authors: Richard Olney, Aaron Tuor, Filip Jagodzinski, Brian Hutchinson

Abstract: Discerning how a mutation affects the stability of a protein is central to the study of a wide range of diseases. Machine learning and statistical analysis techniques can inform how to allocate limited resources to the considerable time and cost associated with wet lab mutagenesis experiments. In this work we explore the effectiveness of using a neural network classifier to predict the change in the stability of a protein due to a mutation. Assessing the accuracy of our approach is dependent on the use of experimental data about the effects of mutations performed in vitro. Because the experimental data is prone to discrepancies when similar experiments have been performed by multiple laboratories, the use of the data near the juncture of stabilizing and destabilizing mutations is questionable. We address this latter problem via a systematic approach in which we explore the use of a three-way classification scheme with stabilizing, destabilizing, and inconclusive labels. For a systematic search of potential classification cutoff values our classifier achieved 68 percent accuracy on ternary classification for cutoff values of -0.6 and 0.7 with a low rate of classifying stabilizing as destabilizing and vice versa.

Submitted 13 March, 2018; originally announced March 2018.

Comments: To appear in the Proceedings of 10th International Conference on Bioinformatics and Computational Biology (BICOB 2018)

arXiv:1712.00557 [pdf, other]

Recurrent Neural Network Language Models for Open Vocabulary Event-Level Cyber Anomaly Detection

Authors: Aaron Tuor, Ryan Baerwolf, Nicolas Knowles, Brian Hutchinson, Nicole Nichols, Rob Jasper

Abstract: Automated analysis methods are crucial aids for monitoring and defending a network to protect the sensitive or confidential data it hosts. This work introduces a flexible, powerful, and unsupervised approach to detecting anomalous behavior in computer and network logs, one that largely eliminates domain-dependent feature engineering employed by existing methods. By treating system logs as threads of interleaved "sentences" (event log lines) to train online unsupervised neural network language models, our approach provides an adaptive model of normal network behavior. We compare the effectiveness of both standard and bidirectional recurrent neural network language models at detecting malicious activity within network log data. Extending these models, we introduce a tiered recurrent architecture, which provides context by modeling sequences of users' actions over time. Compared to Isolation Forest and Principal Components Analysis, two popular anomaly detection algorithms, we observe superior performance on the Los Alamos National Laboratory Cyber Security dataset. For log-line-level red team detection, our best performing character-based model provides test set area under the receiver operator characteristic curve of 0.98, demonstrating the strong fine-grained anomaly detection performance of this approach on open vocabulary logging sources.

Submitted 2 December, 2017; originally announced December 2017.

Comments: 8 pages, To appear in proceedings of AAAI-2018 Artificial Intelligence in Cyber Security Workshop

arXiv:1710.00811 [pdf, other]

Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams

Authors: Aaron Tuor, Samuel Kaplan, Brian Hutchinson, Nicole Nichols, Sean Robinson

Abstract: Analysis of an organization's computer network activity is a key component of early detection and mitigation of insider threat, a growing concern for many organizations. Raw system logs are a prototypical example of streaming data that can quickly scale beyond the cognitive power of a human analyst. As a prospective filter for the human analyst, we present an online unsupervised deep learning approach to detect anomalous network activity from system logs in real time. Our models decompose anomaly scores into the contributions of individual user behavior features for increased interpretability to aid analysts reviewing potential cases of insider threat. Using the CERT Insider Threat Dataset v6.2 and threat detection recall as our performance metric, our novel deep and recurrent neural network models outperform Principal Component Analysis, Support Vector Machine and Isolation Forest based anomaly detection baselines. For our best model, the events labeled as insider threat activity in our dataset had an average anomaly score in the 95.53 percentile, demonstrating our approach's potential to greatly reduce analyst workloads.

Submitted 15 December, 2017; v1 submitted 2 October, 2017; originally announced October 2017.

Comments: Proceedings of AI for Cyber Security Workshop at AAAI 2017

MSC Class: 62-07

Showing 1–25 of 25 results for author: Tuor, A