Showing 1–29 of 29 results for author: Verwer, S

  1. arXiv:2406.18328

    cs.FL cs.LG

    PDFA Distillation via String Probability Queries

    Authors: Robert Baumgartner, Sicco Verwer

    Abstract: Probabilistic deterministic finite automata (PDFA) are discrete event systems modeling conditional probabilities over languages: Given an already seen sequence of tokens they return the probability of tokens of interest to appear next. These types of models have gained interest in the domain of explainable machine learning, where they are used as surrogate models for neural networks trained as lan…

    Submitted 28 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: LearnAut 2024

  2. arXiv:2406.07208

    cs.FL

    Database-assisted automata learning

    Authors: Hielke Walinga, Robert Baumgartner, Sicco Verwer

    Abstract: This paper presents DAALder (Database-Assisted Automata Learning, with Dutch suffix from leerder), a new algorithm for learning state machines, or automata, specifically deterministic finite-state automata (DFA). When learning state machines from log data originating from software systems, the large amount of log data can pose a challenge. Conventional state merging algorithms cannot efficiently d…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 8 pages body, 12 pages total, LearnAut 2024 Keywords: Active/Passive state machine learning, Incomplete Minimally Adequate Teacher

  3. CATMA: Conformance Analysis Tool For Microservice Applications

    Authors: Clinton Cao, Simon Schneider, Nicolás E. Díaz Ferreyra, Sicco Verwer, Annibale Panichella, Riccardo Scandariato

    Abstract: The microservice architecture allows developers to divide the core functionality of their software system into multiple smaller services. However, this architectural style also makes it harder for them to debug and assess whether the system's deployment conforms to its implementation. We present CATMA, an automated tool that detects non-conformances between the system's deployment and implementati…

    Submitted 23 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 5 pages, 5 figures, ICSE '24 Demonstration Track

  4. arXiv:2305.15394

    cs.LG cs.CR

    Differentially-Private Decision Trees and Provable Robustness to Data Poisoning

    Authors: Daniël Vos, Jelle Vos, Tianyu Li, Zekeriya Erkin, Sicco Verwer

    Abstract: Decision trees are interpretable models that are well-suited to non-linear learning problems. Much work has been done on extending decision tree learning algorithms with differential privacy, a system that guarantees the privacy of samples within the training data. However, current state-of-the-art algorithms for this purpose sacrifice much utility for a small privacy benefit. These solutions crea…

    Submitted 12 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  5. Optimal Decision Tree Policies for Markov Decision Processes

    Authors: Daniël Vos, Sicco Verwer

    Abstract: Interpretability of reinforcement learning policies is essential for many real-world tasks, but learning such interpretable policies is a hard problem. In particular, rule-based policies such as decision trees and rule lists are difficult to optimize due to their non-differentiability. While existing techniques can learn verifiable decision tree policies there is no guarantee that the learners gener…

    Submitted 13 February, 2024; v1 submitted 30 January, 2023; originally announced January 2023.

    Journal ref: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence Main Track, 5457-5465 (2023)

  6. arXiv:2208.10605

    cs.CR cs.CY cs.LG

    SoK: Explainable Machine Learning for Computer Security Applications

    Authors: Azqa Nadeem, Daniël Vos, Clinton Cao, Luca Pajola, Simon Dieck, Robert Baumgartner, Sicco Verwer

    Abstract: Explainable Artificial Intelligence (XAI) aims to improve the transparency of machine learning (ML) pipelines. We systematize the increasingly growing (but fragmented) microcosm of studies that develop and utilize XAI methods for defensive and offensive cybersecurity tasks. We identify 3 cybersecurity stakeholders, i.e., model users, designers, and adversaries, who utilize XAI for 4 distinct objec…

    Submitted 3 March, 2023; v1 submitted 22 August, 2022; originally announced August 2022.

    Comments: 13 pages. Accepted at Euro S&P

  7. Learning State Machines to Monitor and Detect Anomalies on a Kubernetes Cluster

    Authors: Clinton Cao, Agathe Blaise, Sicco Verwer, Filippo Rebecchi

    Abstract: These days more companies are shifting towards using cloud environments to provide their services to their clients. While it is easy to set up a cloud environment, it is equally important to monitor the system's runtime behaviour and identify anomalous behaviours that occur during its operation. In recent years, the utilisation of RNNs and DNNs to detect anomalies that might occur during ru…

    Submitted 28 June, 2022; originally announced July 2022.

    Comments: 9 pages, 12 figures, workshop paper

  8. arXiv:2207.03890

    cs.LG

    ENCODE: Encoding NetFlows for Network Anomaly Detection

    Authors: Clinton Cao, Annibale Panichella, Sicco Verwer, Agathe Blaise, Filippo Rebecchi

    Abstract: NetFlow data is a popular network log format used by many network analysts and researchers. The advantages of using NetFlow over deep packet inspection are that it is easier to collect and process, and it is less privacy intrusive. Many works have used machine learning to detect network attacks using NetFlow data. The first step for these machine learning pipelines is to pre-process the data befor…

    Submitted 4 August, 2023; v1 submitted 8 July, 2022; originally announced July 2022.

    Comments: 11 pages, 17 figures

  9. arXiv:2207.01516

    cs.FL cs.LG

    Learning state machines via efficient hashing of future traces

    Authors: Robert Baumgartner, Sicco Verwer

    Abstract: State machines are popular models to model and visualize discrete systems such as software systems, and to represent regular grammars. Most algorithms that passively learn state machines from data assume all the data to be available from the beginning and they load this data into memory. This makes it hard to apply them to continuously streaming data and results in large memory requirements when d…

    Submitted 4 July, 2022; originally announced July 2022.

  10. arXiv:2206.12190

    cs.LG

    SECLEDS: Sequence Clustering in Evolving Data Streams via Multiple Medoids and Medoid Voting

    Authors: Azqa Nadeem, Sicco Verwer

    Abstract: Sequence clustering in a streaming environment is challenging because it is computationally expensive, and the sequences may evolve over time. K-medoids or Partitioning Around Medoids (PAM) is commonly used to cluster sequences since it supports alignment-based distances, and the k-centers being actual data items helps with cluster interpretability. However, offline k-medoids has no support for co…

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: Accepted to appear in ECML/PKDD 2022

  11. arXiv:2206.07776

    cs.LG cs.CR

    Robust Attack Graph Generation

    Authors: Dennis Mouwen, Sicco Verwer, Azqa Nadeem

    Abstract: We present a method to learn automaton models that are more robust to input modifications. It iteratively aligns sequences to a learned model, modifies the sequences to their aligned versions, and re-learns the model. Automaton learning algorithms are typically very good at modeling the frequent behavior of a software system. Our solution can be used to also learn the behavior present in infrequen…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: Appeared at LearnAut '22

  12. arXiv:2203.16331

    cs.LG cs.LO cs.SE

    FlexFringe: Modeling Software Behavior by Learning Probabilistic Automata

    Authors: Sicco Verwer, Christian Hammerschmidt

    Abstract: We present the efficient implementations of probabilistic deterministic finite automaton learning methods available in FlexFringe. These implement well-known strategies for state-merging including several modifications to improve their performance in practice. We show experimentally that these algorithms obtain competitive results and significant improvements over a default implementation. We also…

    Submitted 24 August, 2023; v1 submitted 28 March, 2022; originally announced March 2022.

  13. arXiv:2201.10453

    cs.AI

    The First AI4TSP Competition: Learning to Solve Stochastic Routing Problems

    Authors: Laurens Bliek, Paulo da Costa, Reza Refaei Afshar, Yingqian Zhang, Tom Catshoek, Daniël Vos, Sicco Verwer, Fynn Schmitt-Ulms, André Hottung, Tapan Shah, Meinolf Sellmann, Kevin Tierney, Carl Perreault-Lafleur, Caroline Leboeuf, Federico Bobbio, Justine Pepin, Warley Almeida Silva, Ricardo Gama, Hugo L. Fernandes, Martin Zaefferer, Manuel López-Ibáñez, Ekhine Irurozki

    Abstract: This paper reports on the first international competition on AI for the traveling salesman problem (TSP) at the International Joint Conference on Artificial Intelligence 2021 (IJCAI-21). The TSP is one of the classical combinatorial optimization problems, with many variants inspired by real-world applications. This first competition asked the participants to develop algorithms to solve a time-depe…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: 21 pages

    MSC Class: 68T05

  14. arXiv:2109.03857

    cs.LG cs.AI

    Robust Optimal Classification Trees Against Adversarial Examples

    Authors: Daniël Vos, Sicco Verwer

    Abstract: Decision trees are a popular choice of explainable model, but just like neural networks, they suffer from adversarial examples. Existing algorithms for fitting decision trees robust against adversarial examples are greedy heuristics and lack approximation guarantees. In this paper we propose ROCT, a collection of methods to train decision trees that are optimally robust against user-specified atta…

    Submitted 8 September, 2021; originally announced September 2021.

  15. arXiv:2107.02783

    cs.CR cs.LG

    SAGE: Intrusion Alert-driven Attack Graph Extractor

    Authors: Azqa Nadeem, Sicco Verwer, Shanchieh Jay Yang

    Abstract: Attack graphs (AG) are used to assess pathways availed by cyber adversaries to penetrate a network. State-of-the-art approaches for AG generation focus mostly on deriving dependencies between system vulnerabilities based on network scans and expert knowledge. In real-world operations, however, it is costly and ineffective to rely on constant vulnerability scanning and expert-crafted AGs. We propose…

    Submitted 14 October, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

    Comments: Appeared at VizSec '21 (proceedings) and KDD AI4Cyber '21 (without proceedings)

  16. arXiv:2106.04618

    cs.LG cs.NE math.OC

    EXPObench: Benchmarking Surrogate-based Optimisation Algorithms on Expensive Black-box Functions

    Authors: Laurens Bliek, Arthur Guijt, Rickard Karlsson, Sicco Verwer, Mathijs de Weerdt

    Abstract: Surrogate algorithms such as Bayesian optimisation are especially designed for black-box optimisation problems with expensive objectives, such as hyperparameter tuning or simulation-based optimisation. In the literature, these algorithms are usually evaluated with synthetic benchmarks which are well established but have no expensive objective, and only on one or two real-life applications which va…

    Submitted 1 December, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: 33 pages

  17. arXiv:2012.10438

    cs.LG

    Efficient Training of Robust Decision Trees Against Adversarial Examples

    Authors: Daniël Vos, Sicco Verwer

    Abstract: In the present day we use machine learning for sensitive tasks that require models to be both understandable and robust. Although traditional models such as decision trees are understandable, they suffer from adversarial attacks. When a decision tree is used to differentiate between a user's benign and malicious behavior, an adversarial attack allows the user to effectively evade the model by pert…

    Submitted 18 December, 2020; originally announced December 2020.

  18. Continuous surrogate-based optimization algorithms are well-suited for expensive discrete problems

    Authors: Rickard Karlsson, Laurens Bliek, Sicco Verwer, Mathijs de Weerdt

    Abstract: One method to solve expensive black-box optimization problems is to use a surrogate model that approximates the objective based on previously observed evaluations. The surrogate, which is cheaper to evaluate, is optimized instead to find an approximate solution to the original problem. In the case of discrete problems, recent research has revolved around surrogate models that are specifically constr…

    Submitted 30 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: 16 pages, 3 figures; added keywords, typos corrected, additional information in Figure 3 but results unchanged

    Journal ref: Proceedings of BNAIC/BeneLearn (2020), 88-102

  19. arXiv:2006.04508

    cs.LG math.OC stat.ML

    Black-box Mixed-Variable Optimisation using a Surrogate Model that Satisfies Integer Constraints

    Authors: Laurens Bliek, Arthur Guijt, Sicco Verwer, Mathijs de Weerdt

    Abstract: A challenging problem in both engineering and computer science is that of minimising a function for which we have no mathematical formulation available, that is expensive to evaluate, and that contains continuous and integer variables, for example in automatic algorithm configuration. Surrogate-based algorithms are very suitable for this type of problem, but most existing techniques are designed w…

    Submitted 15 September, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: Ann Math Artif Intell (2020)

    Journal ref: Proceedings of the Genetic and Evolutionary Computation Conference Companion 2021

  20. Black-box Combinatorial Optimization using Models with Integer-valued Minima

    Authors: Laurens Bliek, Sicco Verwer, Mathijs de Weerdt

    Abstract: When a black-box optimization objective can only be evaluated with costly or noisy measurements, most standard optimization algorithms are unsuited to find the optimal solution. Specialized algorithms that deal with exactly this situation make use of surrogate models. These models are usually continuous and smooth, which is beneficial for continuous optimization problems, but not necessarily for c…

    Submitted 20 November, 2019; originally announced November 2019.

    Journal ref: Annals of Mathematics and Artificial Intelligence 89, 639-653 (2021)

  21. arXiv:1910.13526

    cs.AI cs.FL cs.LG

    Learning a Safety Verifiable Adaptive Cruise Controller from Human Driving Data

    Authors: Qin Lin, Sicco Verwer, John Dolan

    Abstract: Imitation learning provides a way to automatically construct a controller by mimicking human behavior from data. For safety-critical systems such as autonomous vehicles, it can be problematic to use controllers learned from data because they cannot be guaranteed to be collision-free. Recently, a method has been proposed for learning a multi-mode hybrid automaton cruise controller (MOHA). Besides b…

    Submitted 29 October, 2019; originally announced October 2019.

  22. Beyond Labeling: Using Clustering to Build Network Behavioral Profiles of Malware Families

    Authors: Azqa Nadeem, Christian Hammerschmidt, Carlos H. Gañán, Sicco Verwer

    Abstract: Malware family labels are known to be inconsistent. They are also black-box since they do not represent the capabilities of malware. The current state-of-the-art in malware capability assessment includes mostly manual approaches, which are infeasible due to the ever-increasing volume of discovered malware samples. We propose a novel unsupervised machine learning-based method called MalPaCA, which a…

    Submitted 13 November, 2020; v1 submitted 2 April, 2019; originally announced April 2019.

    Comments: Accepted as a chapter in Springer MAAIDL 2020

  23. arXiv:1707.09430

    stat.ML cs.LG

    Human in the Loop: Interactive Passive Automata Learning via Evidence-Driven State-Merging Algorithms

    Authors: Christian A. Hammerschmidt, Radu State, Sicco Verwer

    Abstract: We present an interactive version of an evidence-driven state-merging (EDSM) algorithm for learning variants of finite state automata. Learning these automata often amounts to recovering or reverse engineering the model generating the data despite noisy, incomplete, or imperfectly sampled data sources rather than optimizing a purely numeric target function. Domain expertise and human knowledge abo…

    Submitted 28 July, 2017; originally announced July 2017.

    Comments: 4 pages, presented at the Human in the Loop workshop at ICML 2017

  24. arXiv:1706.01663

    cs.LG cs.FL

    Learning Pairwise Disjoint Simple Languages from Positive Examples

    Authors: Alexis Linard, Rick Smetsers, Frits Vaandrager, Umar Waqas, Joost van Pinxten, Sicco Verwer

    Abstract: A classical problem in grammatical inference is to identify a deterministic finite automaton (DFA) from a set of positive and negative examples. In this paper, we address the related - yet seemingly novel - problem of identifying a set of DFAs from examples that belong to different unknown simple regular languages. We propose two methods based on compression for clustering the observed positive ex…

    Submitted 6 June, 2017; originally announced June 2017.

    Comments: This paper has been accepted at the Learning and Automata (LearnAut) Workshop, LICS 2017 (Reykjavik, Iceland)

  25. arXiv:1705.09650

    cs.LG cs.AI cs.FL cs.LO

    Anomaly Detection in a Digital Video Broadcasting System Using Timed Automata

    Authors: Xiaoran Liu, Qin Lin, Sicco Verwer, Dmitri Jarnikov

    Abstract: This paper focuses on detecting anomalies in a digital video broadcasting (DVB) system from the providers' perspective. We learn a probabilistic deterministic real timed automaton profiling benign behavior of encryption control in the DVB control access system. This profile is used as a one-class classifier. Anomalous items in a testing sequence are detected when the sequence is not accepted by the le…

    Submitted 24 May, 2017; originally announced May 2017.

    Comments: This paper has been accepted by the Thirty-Second Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) Workshop on Learning and Automata (LearnAut)

  26. arXiv:1611.07100

    stat.ML cs.AI

    Interpreting Finite Automata for Sequential Data

    Authors: Christian Albert Hammerschmidt, Sicco Verwer, Qin Lin, Radu State

    Abstract: Automaton models are often seen as interpretable models. Interpretability itself is not well defined: it remains unclear what interpretability means without first explicitly specifying objectives or desired attributes. In this paper, we identify the key properties used to interpret automata and propose a modification of a state-merging approach to learn variants of finite state automata. We apply…

    Submitted 24 November, 2016; v1 submitted 21 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

    ACM Class: I.2.6

  27. arXiv:1611.02429

    cs.SE

    Complementing Model Learning with Mutation-Based Fuzzing

    Authors: Rick Smetsers, Joshua Moerman, Mark Janssen, Sicco Verwer

    Abstract: An ongoing challenge for learning algorithms formulated in the Minimally Adequate Teacher framework is to efficiently obtain counterexamples. In this paper we compare and combine conformance testing and mutation-based fuzzing methods for obtaining counterexamples when learning finite state machine models for the reactive software systems of the Rigorous Examination of Reactive Systems (RERS) chal…

    Submitted 8 November, 2016; originally announced November 2016.

    Comments: Submitted to the RERS challenge 2016

  28. Learning optimization models in the presence of unknown relations

    Authors: Sicco Verwer, Yingqian Zhang, Qing Chuan Ye

    Abstract: In a sequential auction with multiple bidding agents, it is highly challenging to determine the ordering of the items to sell in order to maximize the revenue due to the fact that the autonomy and private information of the agents heavily influence the outcome of the auction. The main contribution of this paper is two-fold. First, we demonstrate how to apply machine learning techniques to solve…

    Submitted 15 April, 2014; v1 submitted 6 January, 2014; originally announced January 2014.

    Comments: 37 pages. Working paper

    ACM Class: F.5.3; K.3; K.4

  29. Predicate Logic as a Modeling Language: Modeling and Solving some Machine Learning and Data Mining Problems with IDP3

    Authors: Maurice Bruynooghe, Hendrik Blockeel, Bart Bogaerts, Broes De Cat, Stef De Pooter, Joachim Jansen, Anthony Labarre, Jan Ramon, Marc Denecker, Sicco Verwer

    Abstract: This paper provides a gentle introduction to problem solving with the IDP3 system. The core of IDP3 is a finite model generator that supports first order logic enriched with types, inductive definitions, aggregates and partial functions. It offers its users a modeling language that is a slight extension of predicate logic and allows them to solve a wide range of search problems. Apart from a small…

    Submitted 28 March, 2014; v1 submitted 26 September, 2013; originally announced September 2013.

    Comments: To appear in Theory and Practice of Logic Programming (TPLP)

    Journal ref: Theory and Practice of Logic Programming 15 (2014) 783-817