-
LLark: A Multimodal Instruction-Following Language Model for Music
Authors:
Josh Gardner,
Simon Durand,
Daniel Stoller,
Rachel M. Bittner
Abstract:
Music has a unique and complex structure which is challenging for both expert humans and existing AI systems to understand, and presents unique challenges relative to other forms of audio. We present LLark, an instruction-tuned multimodal model for \emph{music} understanding. We detail our process for dataset creation, which involves augmenting the annotations of diverse open-source music datasets…
▽ More
Music has a unique and complex structure which is challenging for both expert humans and existing AI systems to understand, and presents unique challenges relative to other forms of audio. We present LLark, an instruction-tuned multimodal model for \emph{music} understanding. We detail our process for dataset creation, which involves augmenting the annotations of diverse open-source music datasets and converting them to a unified instruction-tuning format. We propose a multimodal architecture for LLark, integrating a pretrained generative model for music with a pretrained language model. In evaluations on three types of tasks (music understanding, captioning, reasoning), we show that LLark matches or outperforms existing baselines in music understanding, and that humans show a high degree of agreement with its responses in captioning and reasoning tasks. LLark is trained entirely from open-source music data and models, and we make our training code available along with the release of this paper. Additional results and audio examples are at https://bit.ly/llark, and our source code is available at https://github.com/spotify-research/llark .
△ Less
Submitted 2 June, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Benchmarking for Integrating Logic Rules with Everything Else
Authors:
Yanhong A. Liu,
Scott D. Stoller,
Yi Tong,
K. Tuncay Tekle
Abstract:
Integrating logic rules with other language features is increasingly sought after for advanced applications that require knowledge-base capabilities. To address this demand, increasingly more languages and extensions for such integration have been developed. How to evaluate such languages?
This paper describes a set of programming and performance benchmarks for evaluating languages supporting…
▽ More
Integrating logic rules with other language features is increasingly sought after for advanced applications that require knowledge-base capabilities. To address this demand, increasingly more languages and extensions for such integration have been developed. How to evaluate such languages?
This paper describes a set of programming and performance benchmarks for evaluating languages supporting integrated use of rules and other features, and the results of evaluating such an integrated language together with logic languages and languages not supporting logic rules.
△ Less
Submitted 30 August, 2023;
originally announced August 2023.
-
Contrastive Learning-Based Audio to Lyrics Alignment for Multiple Languages
Authors:
Simon Durand,
Daniel Stoller,
Sebastian Ewert
Abstract:
Lyrics alignment gained considerable attention in recent years. State-of-the-art systems either re-use established speech recognition toolkits, or design end-to-end solutions involving a Connectionist Temporal Classification (CTC) loss. However, both approaches suffer from specific weaknesses: toolkits are known for their complexity, and CTC systems use a loss designed for transcription which can…
▽ More
Lyrics alignment gained considerable attention in recent years. State-of-the-art systems either re-use established speech recognition toolkits, or design end-to-end solutions involving a Connectionist Temporal Classification (CTC) loss. However, both approaches suffer from specific weaknesses: toolkits are known for their complexity, and CTC systems use a loss designed for transcription which can limit alignment accuracy. In this paper, we use instead a contrastive learning procedure that derives cross-modal embeddings linking the audio and text domains. This way, we obtain a novel system that is simple to train end-to-end, can make use of weakly annotated training data, jointly learns a powerful text model, and is tailored to alignment. The system is not only the first to yield an average absolute error below 0.2 seconds on the standard Jamendo dataset but it is also robust to other languages, even when trained on English data only. Finally, we release word-level alignments for the JamendoLyrics Multi-Lang dataset.
△ Less
Submitted 13 June, 2023;
originally announced June 2023.
-
Integrating Logic Rules with Everything Else, Seamlessly
Authors:
Yanhong A. Liu,
Scott D. Stoller,
Yi Tong,
Bo Lin
Abstract:
This paper presents a language, Alda, that supports all of logic rules, sets, functions, updates, and objects as seamlessly integrated built-ins. The key idea is to support predicates in rules as set-valued variables that can be used and updated in any scope, and support queries using rules as either explicit or implicit automatic calls to an inference function.
We have defined a formal semantic…
▽ More
This paper presents a language, Alda, that supports all of logic rules, sets, functions, updates, and objects as seamlessly integrated built-ins. The key idea is to support predicates in rules as set-valued variables that can be used and updated in any scope, and support queries using rules as either explicit or implicit automatic calls to an inference function.
We have defined a formal semantics of the language, implemented a prototype compiler that builds on an object-oriented language that supports concurrent and distributed programming and on an efficient logic rule system, and successfully used the language and implementation on benchmarks and problems from a wide variety of application domains. We describe the compilation method and results of experimental evaluation.
△ Less
Submitted 30 May, 2023;
originally announced May 2023.
-
Programming with rules and everything else, seamlessly
Authors:
Yanhong A. Liu,
Scott D. Stoller,
Yi Tong,
Bo Lin,
K. Tuncay Tekle
Abstract:
Logic rules are powerful for expressing complex reasoning and analysis problems. At the same time, they are inconvenient or impossible to use for many other aspects of applications. Integrating rules in a language with sets and functions, and furthermore with updates to objects, has been a subject of significant study. What's lacking is a language that integrates all constructs seamlessly.
This…
▽ More
Logic rules are powerful for expressing complex reasoning and analysis problems. At the same time, they are inconvenient or impossible to use for many other aspects of applications. Integrating rules in a language with sets and functions, and furthermore with updates to objects, has been a subject of significant study. What's lacking is a language that integrates all constructs seamlessly.
This paper presents a language, Alda, that supports all of rules, sets, functions, updates, and objects as seamlessly integrated built-ins, including concurrent and distributed processes. The key idea is to support predicates as set-valued variables that can be used and updated in any scope, and support queries and inference with both explicit and automatic calls to an inference function. We develop a complete formal semantics for Alda. We design a compilation framework that ensures the declarative semantics of rules, while also being able to exploit available optimizations. We describe a prototype implementation that builds on a powerful extension of Python and employs an efficient logic rule engine. We develop a range of benchmarks and present results of experiments to demonstrate Alda's power for programming and generally good performance.
△ Less
Submitted 30 May, 2022;
originally announced May 2022.
-
Few-Shot Musical Source Separation
Authors:
Yu Wang,
Daniel Stoller,
Rachel M. Bittner,
Juan Pablo Bello
Abstract:
Deep learning-based approaches to musical source separation are often limited to the instrument classes that the models are trained on and do not generalize to separate unseen instruments. To address this, we propose a few-shot musical source separation paradigm. We condition a generic U-Net source separation model using few audio examples of the target instrument. We train a few-shot conditioning…
▽ More
Deep learning-based approaches to musical source separation are often limited to the instrument classes that the models are trained on and do not generalize to separate unseen instruments. To address this, we propose a few-shot musical source separation paradigm. We condition a generic U-Net source separation model using few audio examples of the target instrument. We train a few-shot conditioning encoder jointly with the U-Net to encode the audio examples into a conditioning vector to configure the U-Net via feature-wise linear modulation (FiLM). We evaluate the trained models on real musical recordings in the MUSDB18 and MedleyDB datasets. We show that our proposed few-shot conditioning paradigm outperforms the baseline one-hot instrument-class conditioned model for both seen and unseen instruments. To extend the scope of our approach to a wider variety of real-world scenarios, we also experiment with different conditioning example characteristics, including examples from different recordings, with multiple sources, or negative conditioning examples.
△ Less
Submitted 2 May, 2022;
originally announced May 2022.
-
Multi-Agent Spatial Predictive Control with Application to Drone Flocking (Extended Version)
Authors:
Andreas Brandstätter,
Scott A. Smolka,
Scott D. Stoller,
Ashish Tiwari,
Radu Grosu
Abstract:
We introduce the novel concept of Spatial Predictive Control (SPC) to solve the following problem: given a collection of agents (e.g., drones) with positional low-level controllers (LLCs) and a mission-specific distributed cost function, how can a distributed controller achieve and maintain cost-function minimization without a plant model and only positional observations of the environment? Our fu…
▽ More
We introduce the novel concept of Spatial Predictive Control (SPC) to solve the following problem: given a collection of agents (e.g., drones) with positional low-level controllers (LLCs) and a mission-specific distributed cost function, how can a distributed controller achieve and maintain cost-function minimization without a plant model and only positional observations of the environment? Our fully distributed SPC controller is based strictly on the position of the agent itself and on those of its neighboring agents. This information is used in every time-step to compute the gradient of the cost function and to perform a spatial look-ahead to predict the best next target position for the LLC. Using a high-fidelity simulation environment, we show that SPC outperforms the most closely related class of controllers, Potential Field Controllers, on the drone flocking problem. We also show that SPC is able to cope with a potential sim-to-real transfer gap by demonstrating its performance on real hardware, namely our implementation of flocking using nine Crazyflie 2.1 drones.
△ Less
Submitted 31 March, 2022;
originally announced March 2022.
-
A Barrier Certificate-based Simplex Architecture with Application to Microgrids
Authors:
Amol Damare,
Shouvik Roy,
Scott A. Smolka,
Scott D. Stoller
Abstract:
We present Barrier Certificate-based Simplex (BC-Simplex), a new, provably correct design for runtime assurance of continuous dynamical systems. BC-Simplex is centered around the Simplex Control Architecture, which consists of a high-performance advanced controller which is not guaranteed to maintain safety of the plant, a verified-safe baseline controller, and a decision module that switches cont…
▽ More
We present Barrier Certificate-based Simplex (BC-Simplex), a new, provably correct design for runtime assurance of continuous dynamical systems. BC-Simplex is centered around the Simplex Control Architecture, which consists of a high-performance advanced controller which is not guaranteed to maintain safety of the plant, a verified-safe baseline controller, and a decision module that switches control of the plant between the two controllers to ensure safety without sacrificing performance. In BC-Simplex, Barrier certificates are used to prove that the baseline controller ensures safety. Furthermore, BC-Simplex features a new automated method for deriving, from the barrier certificate, the conditions for switching between the controllers. Our method is based on the Taylor expansion of the barrier certificate and yields computationally inexpensive switching conditions. We consider a significant application of BC-Simplex to a microgrid featuring an advanced controller in the form of a neural network trained using reinforcement learning. The microgrid is modeled in RTDS, an industry-standard high-fidelity, real-time power systems simulator. Our results demonstrate that BC-Simplex can automatically derive switching conditions for complex systems, the switching conditions are not overly conservative, and BC-Simplex ensures safety even in the presence of adversarial attacks on the neural controller.
△ Less
Submitted 2 June, 2022; v1 submitted 19 February, 2022;
originally announced February 2022.
-
The Black-Box Simplex Architecture for Runtime Assurance of Autonomous CPS
Authors:
Usama Mehmood,
Sanaz Sheikhi,
Stanley Bak,
Scott A. Smolka,
Scott D. Stoller
Abstract:
The Simplex Architecture is a runtime assurance framework where control authority may switch from an unverified and potentially unsafe advanced controller to a backup baseline controller in order to maintain the safety of an autonomous cyber-physical system. In this work, we show that runtime checks can replace the requirement to statically verify safety of the baseline controller. This is importa…
▽ More
The Simplex Architecture is a runtime assurance framework where control authority may switch from an unverified and potentially unsafe advanced controller to a backup baseline controller in order to maintain the safety of an autonomous cyber-physical system. In this work, we show that runtime checks can replace the requirement to statically verify safety of the baseline controller. This is important as there are many powerful control techniques, such as model-predictive control and neural network controllers, that work well in practice but are difficult to statically verify. Since the method does not use internal information about the advanced or baseline controller, we call the approach the Black-Box Simplex Architecture. We prove the architecture is safe and present two case studies where (i) model-predictive control provides safe multi-robot coordination, and (ii) neural networks provably prevent collisions in groups of F-16 aircraft, despite the controllers occasionally outputting unsafe commands.
△ Less
Submitted 31 May, 2022; v1 submitted 24 February, 2021;
originally announced February 2021.
-
A Distributed Simplex Architecture for Multi-Agent Systems
Authors:
Usama Mehmood,
Scott D. Stoller,
Radu Grosu,
Shouvik Roy,
Amol Damare,
Scott A. Smolka
Abstract:
We present Distributed Simplex Architecture (DSA), a new runtime assurance technique that provides safety guarantees for multi-agent systems (MASs). DSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. The traditional Simplex approach is limited to single-agent systems or a MAS with a centralized control scheme. DSA addresses this limitation by…
▽ More
We present Distributed Simplex Architecture (DSA), a new runtime assurance technique that provides safety guarantees for multi-agent systems (MASs). DSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. The traditional Simplex approach is limited to single-agent systems or a MAS with a centralized control scheme. DSA addresses this limitation by extending the scope of Simplex to include MASs under distributed control. In DSA, each agent has a local instance of traditional Simplex such that the preservation of safety in the local instances implies safety for the entire MAS. We provide a proof of safety for DSA, and present experimental results for several case studies, including flocking with collision avoidance, safe navigation of ground rovers through way-points, and the safe operation of a microgrid.
△ Less
Submitted 18 December, 2020;
originally announced December 2020.
-
Assurance of Distributed Algorithms and Systems: Runtime Checking of Safety and Liveness
Authors:
Yanhong A. Liu,
Scott D. Stoller
Abstract:
This paper presents a general framework and methods for complete programming and checking of distributed algorithms at a high-level, as in pseudocode languages, but precisely specified and directly executable, as in formal specification languages and practical programming languages, respectively. The checking framework, as well as the writing of distributed algorithms and specification of their sa…
▽ More
This paper presents a general framework and methods for complete programming and checking of distributed algorithms at a high-level, as in pseudocode languages, but precisely specified and directly executable, as in formal specification languages and practical programming languages, respectively. The checking framework, as well as the writing of distributed algorithms and specification of their safety and liveness properties, use DistAlgo, a high-level language for distributed algorithms. We give a complete executable specification of the checking framework, with a complete example algorithm and example safety and liveness properties.
△ Less
Submitted 23 December, 2020; v1 submitted 21 August, 2020;
originally announced August 2020.
-
Learning Attribute-Based and Relationship-Based Access Control Policies with Unknown Values
Authors:
Thang Bui,
Scott D. Stoller
Abstract:
Attribute-Based Access Control (ABAC) and Relationship-based access control (ReBAC) provide a high level of expressiveness and flexibility that promote security and information sharing, by allowing policies to be expressed in terms of attributes of and chains of relationships between entities. Algorithms for learning ABAC and ReBAC policies from legacy access control information have the potential…
▽ More
Attribute-Based Access Control (ABAC) and Relationship-based access control (ReBAC) provide a high level of expressiveness and flexibility that promote security and information sharing, by allowing policies to be expressed in terms of attributes of and chains of relationships between entities. Algorithms for learning ABAC and ReBAC policies from legacy access control information have the potential to significantly reduce the cost of migration to ABAC or ReBAC.
This paper presents the first algorithms for mining ABAC and ReBAC policies from access control lists (ACLs) and incomplete information about entities, where the values of some attributes of some entities are unknown. We show that the core of this problem can be viewed as learning a concise three-valued logic formula from a set of labeled feature vectors containing unknowns, and we give the first algorithm (to the best of our knowledge) for that problem.
△ Less
Submitted 23 November, 2020; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Recursive Rules with Aggregation: A Simple Unified Semantics
Authors:
Yanhong A. Liu,
Scott D. Stoller
Abstract:
Complex reasoning problems are most clearly and easily specified using logical rules, but require recursive rules with aggregation such as count and sum for practical applications. Unfortunately, the meaning of such rules has been a significant challenge, leading to many disagreeing semantics.
This paper describes a unified semantics for recursive rules with aggregation, extending the unified fo…
▽ More
Complex reasoning problems are most clearly and easily specified using logical rules, but require recursive rules with aggregation such as count and sum for practical applications. Unfortunately, the meaning of such rules has been a significant challenge, leading to many disagreeing semantics.
This paper describes a unified semantics for recursive rules with aggregation, extending the unified founded semantics and constraint semantics for recursive rules with negation. The key idea is to support simple expression of the different assumptions underlying different semantics, and orthogonally interpret aggregation operations using their simple usual meaning. We present a formal definition of the semantics, prove important properties of the semantics, and compare with prior semantics. In particular, we present an efficient inference over aggregation that gives precise answers to all examples we have studied from the literature. We also apply our semantics to a wide range of challenging examples, and show that our semantics is simple and matches the desired results in all cases. Finally, we describe experiments on the most challenging examples, exhibiting unexpectedly superior performance over well-known systems when they can compute correct answers.
△ Less
Submitted 21 September, 2022; v1 submitted 26 July, 2020;
originally announced July 2020.
-
Learning Distributed Controllers for V-Formation
Authors:
Shouvik Roy,
Usama Mehmood,
Radu Grosu,
Scott A. Smolka,
Scott D. Stoller,
Ashish Tiwari
Abstract:
We show how a high-performing, fully distributed and symmetric neural V-formation controller can be synthesized from a Centralized MPC (Model Predictive Control) controller using Deep Learning. This result is significant as we also establish that under very reasonable conditions, it is impossible to achieve V-formation using a deterministic, distributed, and symmetric controller. The learning proc…
▽ More
We show how a high-performing, fully distributed and symmetric neural V-formation controller can be synthesized from a Centralized MPC (Model Predictive Control) controller using Deep Learning. This result is significant as we also establish that under very reasonable conditions, it is impossible to achieve V-formation using a deterministic, distributed, and symmetric controller. The learning process we use for the neural V-formation controller is significantly enhanced by CEGkR, a Counterexample-Guided k-fold Retraining technique we introduce, which extends prior work in this direction in important ways. Our experimental results show that our neural V-formation controller generalizes to a significantly larger number of agents than for which it was trained (from 7 to 15), and exhibits substantial speedup over the MPC-based controller. We use a form of statistical model checking to compute confidence intervals for our neural V-formation controller's convergence rate and time to convergence.
△ Less
Submitted 31 May, 2020;
originally announced June 2020.
-
Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling
Authors:
Daniel Stoller,
Mi Tian,
Sebastian Ewert,
Simon Dixon
Abstract:
Convolutional neural networks (CNNs) with dilated filters such as the Wavenet or the Temporal Convolutional Network (TCN) have shown good results in a variety of sequence modelling tasks. However, efficiently modelling long-term dependencies in these sequences is still challenging. Although the receptive field of these models grows exponentially with the number of layers, computing the convolution…
▽ More
Convolutional neural networks (CNNs) with dilated filters such as the Wavenet or the Temporal Convolutional Network (TCN) have shown good results in a variety of sequence modelling tasks. However, efficiently modelling long-term dependencies in these sequences is still challenging. Although the receptive field of these models grows exponentially with the number of layers, computing the convolutions over very long sequences of features in each layer is time and memory-intensive, prohibiting the use of longer receptive fields in practice. To increase efficiency, we make use of the "slow feature" hypothesis stating that many features of interest are slowly varying over time. For this, we use a U-Net architecture that computes features at multiple time-scales and adapt it to our auto-regressive scenario by making convolutions causal. We apply our model ("Seq-U-Net") to a variety of tasks including language and audio generation. In comparison to TCN and Wavenet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving a comparable performance in all tasks.
△ Less
Submitted 14 November, 2019;
originally announced November 2019.
-
Knowledge of Uncertain Worlds: Programming with Logical Constraints
Authors:
Yanhong A. Liu,
Scott D. Stoller
Abstract:
Programming with logic for sophisticated applications must deal with recursion and negation, which together have created significant challenges in logic, leading to many different, conflicting semantics of rules. This paper describes a unified language, DA logic, for design and analysis logic, based on the unifying founded semantics and constraint semantics, that support the power and ease of prog…
▽ More
Programming with logic for sophisticated applications must deal with recursion and negation, which together have created significant challenges in logic, leading to many different, conflicting semantics of rules. This paper describes a unified language, DA logic, for design and analysis logic, based on the unifying founded semantics and constraint semantics, that support the power and ease of programming with different intended semantics. The key idea is to provide meta-constraints, supports the use of uncertain information in the form of either undefined values or possible combinations of values or both, and promote the use of knowledge units that can be instantiated by any new predicates, including predicates with additional arguments.
△ Less
Submitted 10 December, 2020; v1 submitted 23 October, 2019;
originally announced October 2019.
-
A Decision Tree Learning Approach for Mining Relationship-Based Access Control Policies
Authors:
Thang Bui,
Scott D. Stoller
Abstract:
Relationship-based access control (ReBAC) provides a high level of expressiveness and flexibility that promotes security and information sharing, by allowing policies to be expressed in terms of chains of relationships between entities. ReBAC policy mining algorithms have the potential to significantly reduce the cost of migration from legacy access control systems to ReBAC, by partially automatin…
▽ More
Relationship-based access control (ReBAC) provides a high level of expressiveness and flexibility that promotes security and information sharing, by allowing policies to be expressed in terms of chains of relationships between entities. ReBAC policy mining algorithms have the potential to significantly reduce the cost of migration from legacy access control systems to ReBAC, by partially automating the development of a ReBAC policy.
This paper presents new algorithms, called DTRM (Decision Tree ReBAC Miner) and DTRM$^-$, based on decision trees, for mining ReBAC policies from access control lists (ACLs) and information about entities. Compared to state-of-the-art ReBAC mining algorithms, our algorithms are significantly faster, achieve comparable policy quality, and can mine policies in a richer language.
△ Less
Submitted 12 May, 2020; v1 submitted 24 September, 2019;
originally announced September 2019.
-
Neural Flocking: MPC-based Supervised Learning of Flocking Controllers
Authors:
Shouvik Roy,
Usama Mehmood,
Radu Grosu,
Scott A. Smolka,
Scott D. Stoller,
Ashish Tiwari
Abstract:
We show how a distributed flocking controller can be synthesized using deep learning from a centralized controller which generates the trajectories of the flock. Our approach is based on supervised learning, with the centralized controller providing the training data to the learning agent, i.e., the synthesized distributed controller. We use Model Predictive Control (MPC) for the centralized contr…
▽ More
We show how a distributed flocking controller can be synthesized using deep learning from a centralized controller which generates the trajectories of the flock. Our approach is based on supervised learning, with the centralized controller providing the training data to the learning agent, i.e., the synthesized distributed controller. We use Model Predictive Control (MPC) for the centralized controller, an approach that has been successfully demonstrated on flocking problems. MPC-based flocking controllers are high-performing but also computationally expensive. By learning a symmetric distributed neural flocking controller from a centralized MPC-based flocking controller, we achieve the best of both worlds: the neural controllers have high performance (on par with the MPC controllers) and high efficiency. Our experimental results demonstrate the sophisticated nature of the distributed controllers we learn. In particular, the neural controllers are capable of achieving myriad flocking-oriented control objectives, including flocking formation, collision avoidance, obstacle avoidance, predator avoidance, and target seeking. Moreover, they generalize the behavior seen in the training data in order to achieve these objectives in a significantly broader range of scenarios.
△ Less
Submitted 17 January, 2020; v1 submitted 26 August, 2019;
originally announced August 2019.
-
Neural Simplex Architecture
Authors:
Dung T. Phan,
Radu Grosu,
Nils Jansen,
Nicola Paoletti,
Scott A. Smolka,
Scott D. Stoller
Abstract:
We present the Neural Simplex Architecture (NSA), a new approach to runtime assurance that provides safety guarantees for neural controllers (obtained e.g. using reinforcement learning) of autonomous and other complex systems without unduly sacrificing performance. NSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. In the traditional approach,…
▽ More
We present the Neural Simplex Architecture (NSA), a new approach to runtime assurance that provides safety guarantees for neural controllers (obtained e.g. using reinforcement learning) of autonomous and other complex systems without unduly sacrificing performance. NSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. In the traditional approach, the advanced controller (AC) is treated as a black box; when the decision module switches control to the baseline controller (BC), the BC remains in control forever. There is relatively little work on switching control back to the AC, and there are no techniques for correcting the AC's behavior after it generates a potentially unsafe control input that causes a failover to the BC. Our NSA addresses both of these limitations. NSA not only provides safety assurances in the presence of a possibly unsafe neural controller, but can also improve the safety of such a controller in an online setting via retraining, without overly degrading its performance. To demonstrate NSA's benefits, we have conducted several significant case studies in the continuous control domain. These include a target-seeking ground rover navigating an obstacle field, and a neural controller for an artificial pancreas system.
△ Less
Submitted 24 March, 2020; v1 submitted 1 August, 2019;
originally announced August 2019.
-
Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators
Authors:
Daniel Stoller,
Sebastian Ewert,
Simon Dixon
Abstract:
Generative adversarial networks (GANs) have shown great success in applications such as image generation and inpainting. However, they typically require large datasets, which are often not available, especially in the context of prediction tasks such as image segmentation that require labels. Therefore, methods such as the CycleGAN use more easily available unlabelled data, but do not offer a way…
▽ More
Generative adversarial networks (GANs) have shown great success in applications such as image generation and inpainting. However, they typically require large datasets, which are often not available, especially in the context of prediction tasks such as image segmentation that require labels. Therefore, methods such as the CycleGAN use more easily available unlabelled data, but do not offer a way to leverage additional labelled data for improved performance. To address this shortcoming, we show how to factorise the joint data distribution into a set of lower-dimensional distributions along with their dependencies. This allows splitting the discriminator in a GAN into multiple "sub-discriminators" that can be independently trained from incomplete observations. Their outputs can be combined to estimate the density ratio between the joint real and the generator distribution, which enables training generators as in the original GAN framework. We apply our method to image generation, image segmentation and audio source separation, and obtain improved performance over a standard GAN when additional incomplete training examples are available. For the Cityscapes segmentation task in particular, our method also improves accuracy by an absolute 14.9% over CycleGAN while using only 25 additional paired examples.
△ Less
Submitted 30 January, 2020; v1 submitted 29 May, 2019;
originally announced May 2019.
-
Algorithm Diversity for Resilient Systems
Authors:
Scott D. Stoller,
Yanhong A. Liu
Abstract:
Diversity can significantly increase the resilience of systems, by reducing the prevalence of shared vulnerabilities and making vulnerabilities harder to exploit. Work on software diversity for security typically creates variants of a program using low-level code transformations. This paper is the first to study algorithm diversity for resilience. We first describe how a method based on high-level…
▽ More
Diversity can significantly increase the resilience of systems, by reducing the prevalence of shared vulnerabilities and making vulnerabilities harder to exploit. Work on software diversity for security typically creates variants of a program using low-level code transformations. This paper is the first to study algorithm diversity for resilience. We first describe how a method based on high-level invariants and systematic incrementalization can be used to create algorithm variants. Executing multiple variants in parallel and comparing their outputs provides greater resilience than executing one variant. To prevent different parallel schedules from causing variants' behaviors to diverge, we present a synchronized execution algorithm for DistAlgo, an extension of Python for high-level, precise, executable specifications of distributed algorithms. We propose static and dynamic metrics for measuring diversity. An experimental evaluation of algorithm diversity combined with implementation-level diversity for several sequential algorithms and distributed algorithms shows the benefits of algorithm diversity.
△ Less
Submitted 28 April, 2019;
originally announced April 2019.
-
GAN-based Generation and Automatic Selection of Explanations for Neural Networks
Authors:
Saumitra Mishra,
Daniel Stoller,
Emmanouil Benetos,
Bob L. Sturm,
Simon Dixon
Abstract:
One way to interpret trained deep neural networks (DNNs) is by inspecting characteristics that neurons in the model respond to, such as by iteratively optimising the model input (e.g., an image) to maximally activate specific neurons. However, this requires a careful selection of hyper-parameters to generate interpretable examples for each neuron of interest, and current methods rely on a manual,…
▽ More
One way to interpret trained deep neural networks (DNNs) is by inspecting characteristics that neurons in the model respond to, such as by iteratively optimising the model input (e.g., an image) to maximally activate specific neurons. However, this requires a careful selection of hyper-parameters to generate interpretable examples for each neuron of interest, and current methods rely on a manual, qualitative evaluation of each setting, which is prohibitively slow. We introduce a new metric that uses Fréchet Inception Distance (FID) to encourage similarity between model activations for real and generated data. This provides an efficient way to evaluate a set of generated examples for each setting of hyper-parameters. We also propose a novel GAN-based method for generating explanations that enables an efficient search through the input space and imposes a strong prior favouring realistic outputs. We apply our approach to a classification model trained to predict whether a music audio recording contains singing voice. Our results suggest that this proposed metric successfully selects hyper-parameters leading to interpretable examples, avoiding the need for manual evaluation. Moreover, we see that examples synthesised to maximise or minimise the predicted probability of singing voice presence exhibit vocal or non-vocal characteristics, respectively, suggesting that our approach is able to generate suitable explanations for understanding concepts learned by a neural network.
△ Less
Submitted 27 April, 2019; v1 submitted 20 April, 2019;
originally announced April 2019.
-
Ensemble Models for Spoofing Detection in Automatic Speaker Verification
Authors:
Bhusan Chettri,
Daniel Stoller,
Veronica Morfi,
Marco A. Martínez Ramírez,
Emmanouil Benetos,
Bob L. Sturm
Abstract:
Detecting spoofing attempts of automatic speaker verification (ASV) systems is challenging, especially when using only one modeling approach. For robustness, we use both deep neural networks and traditional machine learning models and combine them as ensemble models through logistic regression. They are trained to detect logical access (LA) and physical access (PA) attacks on the dataset released…
▽ More
Detecting spoofing attempts of automatic speaker verification (ASV) systems is challenging, especially when using only one modeling approach. For robustness, we use both deep neural networks and traditional machine learning models and combine them as ensemble models through logistic regression. They are trained to detect logical access (LA) and physical access (PA) attacks on the dataset released as part of the ASV Spoofing and Countermeasures Challenge 2019. We propose dataset partitions that ensure different attack types are present during training and validation to improve system robustness. Our ensemble model outperforms all our single models and the baselines from the challenge for both attack types. We investigate why some models on the PA dataset strongly outperform others and find that spoofed recordings in the dataset tend to have longer silences at the end than genuine ones. By removing them, the PA task becomes much more challenging, with the tandem detection cost function (t-DCF) of our best single model rising from 0.1672 to 0.5018 and equal error rate (EER) increasing from 5.98% to 19.8% on the development set.
△ Less
Submitted 4 July, 2019; v1 submitted 9 April, 2019;
originally announced April 2019.
-
Efficient and Extensible Policy Mining for Relationship-Based Access Control
Authors:
Thang Bui,
Scott D. Stoller,
Hieu Le
Abstract:
Relationship-based access control (ReBAC) is a flexible and expressive framework that allows policies to be expressed in terms of chains of relationship between entities as well as attributes of entities. ReBAC policy mining algorithms have a potential to significantly reduce the cost of migration from legacy access control systems to ReBAC, by partially automating the development of a ReBAC polic…
▽ More
Relationship-based access control (ReBAC) is a flexible and expressive framework that allows policies to be expressed in terms of chains of relationship between entities as well as attributes of entities. ReBAC policy mining algorithms have a potential to significantly reduce the cost of migration from legacy access control systems to ReBAC, by partially automating the development of a ReBAC policy. Existing ReBAC policy mining algorithms support a policy language with a limited set of operators; this limits their applicability. This paper presents a ReBAC policy mining algorithm designed to be both (1) easily extensible (to support additional policy language features) and (2) scalable. The algorithm is based on Bui et al.'s evolutionary algorithm for ReBAC policy mining algorithm. First, we simplify their algorithm, in order to make it easier to extend and provide a methodology that extends it to handle new policy language features. However, extending the policy language increases the search space of candidate policies explored by the evolutionary algorithm, thus causes longer running time and/or worse results. To address the problem, we enhance the algorithm with a feature selection phase. The enhancement utilizes a neural network to identify useful features. We use the result of feature selection to reduce the evolutionary algorithm's search space. The new algorithm is easy to extend and, as shown by our experiments, is more efficient and produces better policies.
△ Less
Submitted 8 August, 2019; v1 submitted 18 March, 2019;
originally announced March 2019.
-
End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-Character Recognition Model
Authors:
Daniel Stoller,
Simon Durand,
Sebastian Ewert
Abstract:
Time-aligned lyrics can enrich the music listening experience by enabling karaoke, text-based song retrieval and intra-song navigation, and other applications. Compared to text-to-speech alignment, lyrics alignment remains highly challenging, despite many attempts to combine numerous sub-modules including vocal separation and detection in an effort to break down the problem. Furthermore, training…
▽ More
Time-aligned lyrics can enrich the music listening experience by enabling karaoke, text-based song retrieval and intra-song navigation, and other applications. Compared to text-to-speech alignment, lyrics alignment remains highly challenging, despite many attempts to combine numerous sub-modules including vocal separation and detection in an effort to break down the problem. Furthermore, training required fine-grained annotations to be available in some form. Here, we present a novel system based on a modified Wave-U-Net architecture, which predicts character probabilities directly from raw audio using learnt multi-scale representations of the various signal components. There are no sub-modules whose interdependencies need to be optimized. Our training procedure is designed to work with weak, line-level annotations available in the real world. With a mean alignment error of 0.35s on a standard dataset our system outperforms the state-of-the-art by an order of magnitude.
△ Less
Submitted 18 February, 2019;
originally announced February 2019.
-
High-level Cryptographic Abstractions
Authors:
Christopher Kane,
Bo Lin,
Saksham Chand,
Scott D. Stoller,
Yanhong A. Liu
Abstract:
The interfaces exposed by commonly used cryptographic libraries are clumsy, complicated, and assume an understanding of cryptographic algorithms. The challenge is to design high-level abstractions that require minimum knowledge and effort to use while also allowing maximum control when needed.
This paper proposes such high-level abstractions consisting of simple cryptographic primitives and full…
▽ More
The interfaces exposed by commonly used cryptographic libraries are clumsy, complicated, and assume an understanding of cryptographic algorithms. The challenge is to design high-level abstractions that require minimum knowledge and effort to use while also allowing maximum control when needed.
This paper proposes such high-level abstractions consisting of simple cryptographic primitives and full declarative configuration. These abstractions can be implemented on top of any cryptographic library in any language. We have implemented these abstractions in Python, and used them to write a wide variety of well-known security protocols, including Signal, Kerberos, and TLS.
We show that programs using our abstractions are much smaller and easier to write than using low-level libraries, where size of security protocols implemented is reduced by about a third on average. We show our implementation incurs a small overhead, less than 5 microseconds for shared key operations and less than 341 microseconds (< 1%) for public key operations. We also show our abstractions are safe against main types of cryptographic misuse reported in the literature.
△ Less
Submitted 23 August, 2019; v1 submitted 21 October, 2018;
originally announced October 2018.
-
Neural State Classification for Hybrid Systems
Authors:
Dung Phan,
Nicola Paoletti,
Timothy Zhang,
Radu Grosu,
Scott A. Smolka,
Scott D. Stoller
Abstract:
We introduce the State Classification Problem (SCP) for hybrid systems, and present Neural State Classification (NSC) as an efficient solution technique. SCP generalizes the model checking problem as it entails classifying each state $s$ of a hybrid automaton as either positive or negative, depending on whether or not $s$ satisfies a given time-bounded reachability specification. This is an intere…
▽ More
We introduce the State Classification Problem (SCP) for hybrid systems, and present Neural State Classification (NSC) as an efficient solution technique. SCP generalizes the model checking problem as it entails classifying each state $s$ of a hybrid automaton as either positive or negative, depending on whether or not $s$ satisfies a given time-bounded reachability specification. This is an interesting problem in its own right, which NSC solves using machine-learning techniques, Deep Neural Networks in particular. State classifiers produced by NSC tend to be very efficient (run in constant time and space), but may be subject to classification errors. To quantify and mitigate such errors, our approach comprises: i) techniques for certifying, with statistical guarantees, that an NSC classifier meets given accuracy levels; ii) tuning techniques, including a novel technique based on adversarial sampling, that can virtually eliminate false negatives (positive states classified as negative), thereby making the classifier more conservative. We have applied NSC to six nonlinear hybrid system benchmarks, achieving an accuracy of 99.25% to 99.98%, and a false-negative rate of 0.0033 to 0, which we further reduced to 0.0015 to 0 after tuning the classifier. We believe that this level of accuracy is acceptable in many practical applications, and that these results demonstrate the promise of the NSC approach.
△ Less
Submitted 25 July, 2018;
originally announced July 2018.
-
Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation
Authors:
Daniel Stoller,
Sebastian Ewert,
Simon Dixon
Abstract:
Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependant on hyper-parameters for the spectral front-end. Therefore, we investigate end-to-end source separation in the time-domain, which allows modelling phase information and avoids fixed spectral transformations. Due to high sampling rates for audio, em…
▽ More
Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependant on hyper-parameters for the spectral front-end. Therefore, we investigate end-to-end source separation in the time-domain, which allows modelling phase information and avoids fixed spectral transformations. Due to high sampling rates for audio, employing a long temporal input context on the sample level is difficult, but required for high quality separation results because of long-range temporal correlations. In this context, we propose the Wave-U-Net, an adaptation of the U-Net to the one-dimensional time domain, which repeatedly resamples feature maps to compute and combine features at different time scales. We introduce further architectural improvements, including an output layer that enforces source additivity, an upsampling technique and a context-aware prediction framework to reduce output artifacts. Experiments for singing voice separation indicate that our architecture yields a performance comparable to a state-of-the-art spectrogram-based U-Net architecture, given the same data. Finally, we reveal a problem with outliers in the currently used SDR evaluation metrics and suggest reporting rank-based statistics to alleviate this problem.
△ Less
Submitted 8 June, 2018;
originally announced June 2018.
-
Jointly Detecting and Separating Singing Voice: A Multi-Task Approach
Authors:
Daniel Stoller,
Sebastian Ewert,
Simon Dixon
Abstract:
A main challenge in applying deep learning to music processing is the availability of training data. One potential solution is Multi-task Learning, in which the model also learns to solve related auxiliary tasks on additional datasets to exploit their correlation. While intuitive in principle, it can be challenging to identify related tasks and construct the model to optimally share information be…
▽ More
A main challenge in applying deep learning to music processing is the availability of training data. One potential solution is Multi-task Learning, in which the model also learns to solve related auxiliary tasks on additional datasets to exploit their correlation. While intuitive in principle, it can be challenging to identify related tasks and construct the model to optimally share information between tasks. In this paper, we explore vocal activity detection as an additional task to stabilise and improve the performance of vocal separation. Further, we identify problematic biases specific to each dataset that could limit the generalisation capability of separation and detection models, to which our proposed approach is robust. Experiments show improved performance in separation as well as vocal detection compared to single-task baselines. However, we find that the commonly used Signal-to-Distortion Ratio (SDR) metrics did not capture the improvement on non-vocal sections, indicating the need for improved evaluation methodologies.
△ Less
Submitted 4 April, 2018;
originally announced April 2018.
-
How to Learn a Model Checker
Authors:
Dung Phan,
Radu Grosu,
Nicola Paoletti,
Scott A. Smolka,
Scott D. Stoller
Abstract:
We show how machine-learning techniques, particularly neural networks, offer a very effective and highly efficient solution to the approximate model-checking problem for continuous and hybrid systems, a solution where the general-purpose model checker is replaced by a model-specific classifier trained by sampling model trajectories. To the best of our knowledge, we are the first to establish this…
▽ More
We show how machine-learning techniques, particularly neural networks, offer a very effective and highly efficient solution to the approximate model-checking problem for continuous and hybrid systems, a solution where the general-purpose model checker is replaced by a model-specific classifier trained by sampling model trajectories. To the best of our knowledge, we are the first to establish this link from machine learning to model checking. Our method comprises a pipeline of analysis techniques for estimating and obtaining statistical guarantees on the classifier's prediction performance, as well as tuning techniques to improve such performance. Our experimental evaluation considers the time-bounded reachability problem for three well-established benchmarks in the hybrid systems community. On these examples, we achieve an accuracy of 99.82% to 100% and a false-negative rate (incorrectly predicting that unsafe states are not reachable from a given state) of 0.0007 to 0. We believe that this level of accuracy is acceptable in many practical applications and we show how the approximate model checker can be made more conservative by tuning the classifier through further training and selection of the classification threshold.
△ Less
Submitted 5 December, 2017;
originally announced December 2017.
-
Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction
Authors:
Daniel Stoller,
Sebastian Ewert,
Simon Dixon
Abstract:
The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only few datasets available, often extensive data augmentation is used to combat overfitting. Mixing random tracks, however, can even reduce separation performance as instruments in real music are strongly correlated. Th…
▽ More
The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only few datasets available, often extensive data augmentation is used to combat overfitting. Mixing random tracks, however, can even reduce separation performance as instruments in real music are strongly correlated. The key concept in our approach is that source estimates of an optimal separator should be indistinguishable from real source signals. Based on this idea, we drive the separator towards outputs deemed as realistic by discriminator networks that are trained to tell apart real from separator samples. This way, we can also use unpaired source and mixture recordings without the drawbacks of creating unrealistic music mixtures. Our framework is widely applicable as it does not assume a specific network architecture or number of sources. To our knowledge, this is the first adoption of adversarial training for music source separation. In a prototype experiment for singing voice separation, separation performance increases with our approach compared to purely supervised training.
△ Less
Submitted 6 April, 2018; v1 submitted 31 October, 2017;
originally announced November 2017.
-
Declarative vs Rule-based Control for Flocking Dynamics
Authors:
Usama Mehmood,
Nicola Paoletti,
Dung Phan,
Radu Grosu,
Shan Lin,
Scott D. Stoller,
Ashish Tiwari,
Junxing Yang,
Scott A. Smolka
Abstract:
The popularity of rule-based flocking models, such as Reynolds' classic flocking model, raises the question of whether more declarative flocking models are possible. This question is motivated by the observation that declarative models are generally simpler and easier to design, understand, and analyze than operational models. We introduce a very simple control law for flocking based on a cost fun…
▽ More
The popularity of rule-based flocking models, such as Reynolds' classic flocking model, raises the question of whether more declarative flocking models are possible. This question is motivated by the observation that declarative models are generally simpler and easier to design, understand, and analyze than operational models. We introduce a very simple control law for flocking based on a cost function capturing cohesion (agents want to stay together) and separation (agents do not want to get too close). We refer to it as {\textit declarative flocking} (DF). We use model-predictive control (MPC) to define controllers for DF in centralized and distributed settings. A thorough performance comparison of our declarative flocking with Reynolds' model, and with more recent flocking models that use MPC with a cost function based on lattice structures, demonstrate that DF-MPC yields the best cohesion and least fragmentation, and maintains a surprisingly good level of geometric regularity while still producing natural flock shapes similar to those produced by Reynolds' model. We also show that DF-MPC has high resilience to sensor noise.
△ Less
Submitted 27 October, 2017;
originally announced October 2017.
-
Greedy and Evolutionary Algorithms for Mining Relationship-Based Access Control Policies
Authors:
Thang Bui,
Scott D. Stoller,
Jiajie Li
Abstract:
Relationship-based access control (ReBAC) provides a high level of expressiveness and flexibility that promotes security and information sharing. We formulate ReBAC as an object-oriented extension of attribute-based access control (ABAC) in which relationships are expressed using fields that refer to other objects, and path expressions are used to follow chains of relationships between objects.…
▽ More
Relationship-based access control (ReBAC) provides a high level of expressiveness and flexibility that promotes security and information sharing. We formulate ReBAC as an object-oriented extension of attribute-based access control (ABAC) in which relationships are expressed using fields that refer to other objects, and path expressions are used to follow chains of relationships between objects.
ReBAC policy mining algorithms have potential to significantly reduce the cost of migration from legacy access control systems to ReBAC, by partially automating the development of a ReBAC policy from an existing access control policy and attribute data. This paper presents two algorithms for mining ReBAC policies from access control lists (ACLs) and attribute data represented as an object model: a greedy algorithm guided by heuristics, and a grammar-based evolutionary algorithm. An evaluation of the algorithms on four sample policies and two large case studies demonstrates their effectiveness.
△ Less
Submitted 21 August, 2018; v1 submitted 15 August, 2017;
originally announced August 2017.
-
A Component-Based Simplex Architecture for High-Assurance Cyber-Physical Systems
Authors:
Dung Phan,
Junxing Yang,
Matthew Clark,
Radu Grosu,
John D. Schierman,
Scott A. Smolka,
Scott D. Stoller
Abstract:
We present Component-Based Simplex Architecture (CBSA), a new framework for assuring the runtime safety of component-based cyber-physical systems (CPSs). CBSA integrates Assume-Guarantee (A-G) reasoning with the core principles of the Simplex control architecture to allow component-based CPSs to run advanced, uncertified controllers while still providing runtime assurance that A-G contracts and gl…
▽ More
We present Component-Based Simplex Architecture (CBSA), a new framework for assuring the runtime safety of component-based cyber-physical systems (CPSs). CBSA integrates Assume-Guarantee (A-G) reasoning with the core principles of the Simplex control architecture to allow component-based CPSs to run advanced, uncertified controllers while still providing runtime assurance that A-G contracts and global properties are satisfied. In CBSA, multiple Simplex instances, which can be composed in a nested, serial or parallel manner, coordinate to assure system-wide properties. Combining A-G reasoning and the Simplex architecture is a challenging problem that yields significant benefits. By utilizing A-G contracts, we are able to compositionally determine the switching logic for CBSAs, thereby alleviating the state explosion encountered by other approaches. Another benefit is that we can use A-G proof rules to decompose the proof of system-wide safety assurance into sub-proofs corresponding to the component-based structure of the system architecture. We also introduce the notion of coordinated switching between Simplex instances, a key component of our compositional approach to reasoning about CBSA switching logic. We illustrate our framework with a component-based control system for a ground rover. We formally prove that the CBSA for this system guarantees energy safety (the rover never runs out of power), and collision freedom (the rover never collides with a stationary obstacle). We also consider a CBSA for the rover that guarantees mission completion: all target destinations visited within a prescribed amount of time.
△ Less
Submitted 16 April, 2017;
originally announced April 2017.
-
Moderately Complex Paxos Made Simple: High-Level Executable Specification of Distributed Algorithms
Authors:
Yanhong A. Liu,
Saksham Chand,
Scott D. Stoller
Abstract:
This paper describes the application of a high-level language and method in develo** simpler specifications of more complex variants of the Paxos algorithm for distributed consensus. The specifications are for Multi-Paxos with preemption, replicated state machine, and reconfiguration and optimized with state reduction and failure detection. The language is DistAlgo. The key is to express complex…
▽ More
This paper describes the application of a high-level language and method in develo** simpler specifications of more complex variants of the Paxos algorithm for distributed consensus. The specifications are for Multi-Paxos with preemption, replicated state machine, and reconfiguration and optimized with state reduction and failure detection. The language is DistAlgo. The key is to express complex control flows and synchronization conditions precisely at a high level, using nondeterministic waits and message-history queries. We obtain complete executable specifications that are almost completely declarative---updating only a number for the protocol round besides the sets of messages sent and received.
We show the following results: 1.English and pseudocode descriptions of distributed algorithms can be captured completely and precisely at a high level, without adding, removing, or reformulating algorithm details to fit lower-level, more abstract, or less direct languages. 2.We created higher-level control flows and synchronization conditions than all previous specifications, and obtained specifications that are much simpler and smaller, even matching or smaller than abstract specifications that omit many algorithm details. 3.The simpler specifications led us to easily discover useless replies, unnecessary delays, and liveness violations (if messages can be lost) in previous published specifications, by just following the simplified algorithm flows. 4.The resulting specifications can be executed directly, and we can express optimizations cleanly, yielding drastic performance improvement over naive execution and facilitating a general method for merging processes. 5.We systematically translated the resulting specifications into TLA+ and developed machine-checked safety proofs, which also allowed us to detect and fix a subtle safety violation in an earlier unpublished specification.
△ Less
Submitted 12 August, 2019; v1 submitted 31 March, 2017;
originally announced April 2017.
-
Model Checking Cyber-Physical Systems using Particle Swarm Optimization
Authors:
Dung Phan,
Scott A. Smolka,
Radu Grosu,
Usama Mehmood,
Scott D. Stoller,
Junxing Yang
Abstract:
We present a novel approach to the problem of model checking cyber-physical systems. We transform the model checking problem to an optimization one by designing an objective function that measures how close a state is to a violation of a property. We use particle swarm optimization (PSO) to effectively search for a state that minimizes the objective function. Such states, if found, are counter-exa…
▽ More
We present a novel approach to the problem of model checking cyber-physical systems. We transform the model checking problem to an optimization one by designing an objective function that measures how close a state is to a violation of a property. We use particle swarm optimization (PSO) to effectively search for a state that minimizes the objective function. Such states, if found, are counter-examples describing safe states from which the system can reach an unsafe state in one time step. We illustrate our approach with a controller for the Quickbot ground rover. Our PSO model checker quickly found a bug in the controller that could cause the rover to collide with an obstacle.
△ Less
Submitted 3 March, 2017;
originally announced March 2017.
-
Founded Semantics and Constraint Semantics of Logic Rules
Authors:
Yanhong A. Liu,
Scott D. Stoller
Abstract:
Logic rules and inference are fundamental in computer science and have been studied extensively. However, prior semantics of logic languages can have subtle implications and can disagree significantly, on even very simple programs, including in attempting to solve the well-known Russell's paradox. These semantics are often non-intuitive and hard-to-understand when unrestricted negation is used in…
▽ More
Logic rules and inference are fundamental in computer science and have been studied extensively. However, prior semantics of logic languages can have subtle implications and can disagree significantly, on even very simple programs, including in attempting to solve the well-known Russell's paradox. These semantics are often non-intuitive and hard-to-understand when unrestricted negation is used in recursion.
This paper describes a simple new semantics for logic rules, founded semantics, and its straightforward extension to another simple new semantics, constraint semantics, that unify the core of different prior semantics. The new semantics support unrestricted negation, as well as unrestricted existential and universal quantifications. They are uniquely expressive and intuitive by allowing assumptions about the predicates, rules, and reasoning to be specified explicitly, as simple and precise binary choices. They are completely declarative and relate cleanly to prior semantics. In addition, founded semantics can be computed in linear time in the size of the ground program.
△ Less
Submitted 26 March, 2020; v1 submitted 20 June, 2016;
originally announced June 2016.
-
Formal Verification of Multi-Paxos for Distributed Consensus
Authors:
Saksham Chand,
Yanhong A. Liu,
Scott D. Stoller
Abstract:
Paxos is an important algorithm for a set of distributed processes to agree on a single value or a sequence of values, for which it is called Basic Paxos or Multi-Paxos, respectively. Consensus is critical when distributed services are replicated for fault-tolerance, because non-faulty replicas must agree on the state of the system or the sequence of operations that have been performed. Unfortunat…
▽ More
Paxos is an important algorithm for a set of distributed processes to agree on a single value or a sequence of values, for which it is called Basic Paxos or Multi-Paxos, respectively. Consensus is critical when distributed services are replicated for fault-tolerance, because non-faulty replicas must agree on the state of the system or the sequence of operations that have been performed. Unfortunately, consensus algorithms including Multi-Paxos in particular are well-known to be difficult to understand, and their accurate specifications and correctness proofs remain challenging, despite extensive studies ever since Lamport introduced Paxos.
This article describes formal specification and verification of Lamport's Multi-Paxos algorithm for distributed consensus. The specification is written in TLA+, Lamport's Temporal Logic of Actions. The proof is written and automatically checked using TLAPS, the TLA+ Proof System. The proof is for the safety property of the algorithm. Building on Lamport, Merz, and Doligez's specification and proof for Basic Paxos, we aim to facilitate the understanding of Multi-Paxos and its proof by minimizing the difference from those for Basic Paxos, and to demonstrate a general way of proving other variants of Paxos and other sophisticated distributed algorithms. We also discuss our general strategies and results for proving complex invariants using invariance lemmas and increments, for proving properties about sets and tuples to help the proof check succeed in significantly reduced time, and for overall proof improvement leading to considerably reduced proof size.
△ Less
Submitted 11 November, 2019; v1 submitted 4 June, 2016;
originally announced June 2016.
-
Mining Hierarchical Temporal Roles with Multiple Metrics
Authors:
Scott D. Stoller,
Thang Bui
Abstract:
Temporal role-based access control (TRBAC) extends role-based access control to limit the times at which roles are enabled. This paper presents a new algorithm for mining high-quality TRBAC policies from timed ACLs (i.e., ACLs with time limits in the entries) and optionally user attribute information. Such algorithms have potential to significantly reduce the cost of migration from timed ACLs to T…
▽ More
Temporal role-based access control (TRBAC) extends role-based access control to limit the times at which roles are enabled. This paper presents a new algorithm for mining high-quality TRBAC policies from timed ACLs (i.e., ACLs with time limits in the entries) and optionally user attribute information. Such algorithms have potential to significantly reduce the cost of migration from timed ACLs to TRBAC. The algorithm is parameterized by the policy quality metric. We consider multiple quality metrics, including number of roles, weighted structural complexity (a generalization of policy size), and (when user attribute information is available) interpretability, i.e., how well role membership can be characterized in terms of user attributes. Ours is the first TRBAC policy mining algorithm that produces hierarchical policies, and the first that optimizes weighted structural complexity or interpretability. In experiments with datasets based on real-world ACL policies, our algorithm is more effective than previous algorithms at optimizing policy quality.
△ Less
Submitted 15 October, 2017; v1 submitted 8 March, 2016;
originally announced March 2016.
-
Demand-Driven Incremental Object Queries
Authors:
Yanhong A. Liu,
Jon Brandvein,
Scott D. Stoller,
Bo Lin
Abstract:
Object queries are essential in information seeking and decision making in vast areas of applications. However, a query may involve complex conditions on objects and sets, which can be arbitrarily nested and aliased. The objects and sets involved as well as the demand---the given parameter values of interest---can change arbitrarily. How to implement object queries efficiently under all possible u…
▽ More
Object queries are essential in information seeking and decision making in vast areas of applications. However, a query may involve complex conditions on objects and sets, which can be arbitrarily nested and aliased. The objects and sets involved as well as the demand---the given parameter values of interest---can change arbitrarily. How to implement object queries efficiently under all possible updates, and furthermore to provide complexity guarantees?
This paper describes an automatic method. The method allows powerful queries to be written completely declaratively. It transforms demand as well as all objects and sets into relations. Most importantly, it defines invariants for not only the query results, but also all auxiliary values about the objects and sets involved, including those for propagating demand, and incrementally maintains all of them. Implementation and experiments with problems from a variety of application areas, including distributed algorithms and probabilistic queries, confirm the analyzed complexities, trade-offs, and significant improvements over prior work.
△ Less
Submitted 15 July, 2016; v1 submitted 14 November, 2015;
originally announced November 2015.
-
A survey on unmanned aerial vehicle collision avoidance systems
Authors:
Hung Pham,
Scott A. Smolka,
Scott D. Stoller,
Dung Phan,
Junxing Yang
Abstract:
Collision avoidance is a key factor in enabling the integration of unmanned aerial vehicle into real life use, whether it is in military or civil application. For a long time there have been a large number of works to address this problem; therefore a comparative summary of them would be desirable. This paper presents a survey on the major collision avoidance systems developed in up to date public…
▽ More
Collision avoidance is a key factor in enabling the integration of unmanned aerial vehicle into real life use, whether it is in military or civil application. For a long time there have been a large number of works to address this problem; therefore a comparative summary of them would be desirable. This paper presents a survey on the major collision avoidance systems developed in up to date publications. Each collision avoidance system contains two main parts: sensing and detection, and collision avoidance. Based on their characteristics each part is divided into different categories; and those categories are explained, compared and discussed about advantages and disadvantages in this paper.
△ Less
Submitted 31 August, 2015;
originally announced August 2015.
-
From Clarity to Efficiency for Distributed Algorithms
Authors:
Yanhong A. Liu,
Scott D. Stoller,
Bo Lin
Abstract:
This article describes a very high-level language for clear description of distributed algorithms and optimizations necessary for generating efficient implementations. The language supports high-level control flows where complex synchronization conditions can be expressed using high-level queries, especially logic quantifications, over message history sequences. Unfortunately, the programs would b…
▽ More
This article describes a very high-level language for clear description of distributed algorithms and optimizations necessary for generating efficient implementations. The language supports high-level control flows where complex synchronization conditions can be expressed using high-level queries, especially logic quantifications, over message history sequences. Unfortunately, the programs would be extremely inefficient, including consuming unbounded memory, if executed straightforwardly.
We present new optimizations that automatically transform complex synchronization conditions into incremental updates of necessary auxiliary values as messages are sent and received. The core of the optimizations is the first general method for efficient implementation of logic quantifications. We have developed an operational semantics of the language, implemented a prototype of the compiler and the optimizations, and successfully used the language and implementation on a variety of important distributed algorithms.
△ Less
Submitted 11 March, 2017; v1 submitted 29 December, 2014;
originally announced December 2014.
-
Mining Attribute-Based Access Control Policies from Logs
Authors:
Zhongyuan Xu,
Scott D. Stoller
Abstract:
Attribute-based access control (ABAC) provides a high level of flexibility that promotes security and information sharing. ABAC policy mining algorithms have potential to significantly reduce the cost of migration to ABAC, by partially automating the development of an ABAC policy from information about the existing access-control policy and attribute data. This paper presents an algorithm for mini…
▽ More
Attribute-based access control (ABAC) provides a high level of flexibility that promotes security and information sharing. ABAC policy mining algorithms have potential to significantly reduce the cost of migration to ABAC, by partially automating the development of an ABAC policy from information about the existing access-control policy and attribute data. This paper presents an algorithm for mining ABAC policies from operation logs and attribute data. To the best of our knowledge, it is the first algorithm for this problem.
△ Less
Submitted 12 February, 2018; v1 submitted 22 March, 2014;
originally announced March 2014.
-
Mining Attribute-based Access Control Policies
Authors:
Zhongyuan Xu,
Scott D. Stoller
Abstract:
Attribute-based access control (ABAC) provides a high level of flexibility that promotes security and information sharing. ABAC policy mining algorithms have potential to significantly reduce the cost of migration to ABAC, by partially automating the development of an ABAC policy from an access control list (ACL) policy or role-based access control (RBAC) policy with accompanying attribute data. T…
▽ More
Attribute-based access control (ABAC) provides a high level of flexibility that promotes security and information sharing. ABAC policy mining algorithms have potential to significantly reduce the cost of migration to ABAC, by partially automating the development of an ABAC policy from an access control list (ACL) policy or role-based access control (RBAC) policy with accompanying attribute data. This paper presents an ABAC policy mining algorithm. To the best of our knowledge, it is the first ABAC policy mining algorithm. Our algorithm iterates over tuples in the given user-permission relation, uses selected tuples as seeds for constructing candidate rules, and attempts to generalize each candidate rule to cover additional tuples in the user-permission relation by replacing conjuncts in attribute expressions with constraints. Our algorithm attempts to improve the policy by merging and simplifying candidate rules, and then it selects the highest-quality candidate rules for inclusion in the generated policy.
△ Less
Submitted 7 August, 2014; v1 submitted 10 June, 2013;
originally announced June 2013.
-
Review of Best Practice Methods for Determining an Electrode Material's Performance for Ultracapacitors
Authors:
Meryl D. Stoller,
Rodney S. Ruoff
Abstract:
Ultracapacitors are rapidly being adopted for use for a wide range of electrical energy storage applications. While ultracapacitors are able to deliver high rates of charge and discharge, they are limited in the amount of energy stored. The capacity of ultracapacitors is largely determined by the electrode material and as a result, research to improve the performance of electrode materials has d…
▽ More
Ultracapacitors are rapidly being adopted for use for a wide range of electrical energy storage applications. While ultracapacitors are able to deliver high rates of charge and discharge, they are limited in the amount of energy stored. The capacity of ultracapacitors is largely determined by the electrode material and as a result, research to improve the performance of electrode materials has dramatically increased. While test methods for packaged ultracapacitors are well developed, it is often not feasible for the materials scientist to assemble full sized, packaged cells to test electrode materials. Methodology to reliably measure a material's performance for ultracapacitor electrode use is not well standardized with the different techniques currently being used yielding widely varying results. In this manuscript, we review the best practice test methods that accurately predict a materials performance, yet are flexible and quick enough to accommodate a wide range of material sample types and amounts.
△ Less
Submitted 13 May, 2010; v1 submitted 5 May, 2010;
originally announced May 2010.