-
A source separation approach to temporal graph modelling for computer networks
Authors:
Corentin Larroche
Abstract:
Detecting malicious activity within an enterprise computer network can be framed as a temporal link prediction task: given a sequence of graphs representing communications between hosts over time, the goal is to predict which edges should--or should not--occur in the future. However, standard temporal link prediction algorithms are ill-suited for computer network monitoring as they do not take account of the peculiar short-term dynamics of computer network activity, which exhibits sharp seasonal variations. In order to build a better model, we propose a source separation-inspired description of computer network activity: at each time step, the observed graph is a mixture of subgraphs representing various sources of activity, and short-term dynamics result from changes in the mixing coefficients. Both qualitative and quantitative experiments demonstrate the validity of our approach.
Submitted 28 March, 2023;
originally announced March 2023.
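The mixture description in this abstract can be illustrated with a small NumPy sketch. It is not the paper's model: it assumes the source subgraphs are already known, and it recovers the mixing coefficients by ordinary least squares clipped at zero. The names `mix_sources`, `estimate_weights`, `office` and `backup`, and the toy four-host network, are all hypothetical.

```python
import numpy as np

def mix_sources(sources, weights):
    """Combine K source subgraphs into one expected graph per time step.

    sources: array of shape (K, N, N), one adjacency matrix per activity source.
    weights: array of shape (T, K), mixing coefficients at each time step.
    Returns an array of shape (T, N, N).
    """
    return np.tensordot(weights, sources, axes=([1], [0]))

def estimate_weights(graphs, sources):
    """Estimate mixing coefficients by least squares, clipped to be non-negative.

    Assumption: the sources are known in advance, which a real model
    would have to learn from the data.
    """
    T, K = graphs.shape[0], sources.shape[0]
    X = graphs.reshape(T, -1)    # (T, N*N) flattened observations
    S = sources.reshape(K, -1)   # (K, N*N) flattened sources
    W, *_ = np.linalg.lstsq(S.T, X.T, rcond=None)  # solution has shape (K, T)
    return np.clip(W.T, 0.0, None)

# Two hypothetical activity sources on a four-host network.
office = np.zeros((4, 4))
office[0, 1] = office[0, 2] = 1.0                  # workstation 0 contacts hosts 1 and 2
backup = np.zeros((4, 4))
backup[3, 0] = backup[3, 1] = backup[3, 2] = 1.0   # host 3 contacts every other host

sources = np.stack([office, backup])

# Short-term dynamics: office activity fades out, then the backup source takes over.
weights = np.array([[1.0, 0.0],
                    [0.5, 0.0],
                    [0.0, 1.0]])

graphs = mix_sources(sources, weights)
estimated = estimate_weights(graphs, sources)
```

In this toy setting the two sources have disjoint edge sets, so `estimated` matches `weights` exactly; with overlapping or unknown sources, a proper factorization method would be needed instead.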
-
Multilayer Block Models for Exploratory Analysis of Computer Event Logs
Authors:
Corentin Larroche
Abstract:
We investigate a graph-based approach to exploratory data analysis in the context of network security monitoring. Given a possibly large batch of event logs describing ongoing activity, we first represent these events as a bipartite multiplex graph. We then apply a model-based biclustering algorithm to extract relevant clusters of entities and interactions between these clusters, thereby providing a simplified situational picture. We illustrate this methodology through two case studies addressing network flow records and authentication logs, respectively. In both cases, the inferred clusters reveal the functional roles of entities as well as relevant behavioral patterns. Displaying interactions between these clusters also helps uncover malicious activity. Our code is available at https://github.com/cl-anssi/MultilayerBlockModels.
Submitted 21 June, 2022;
originally announced June 2022.
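The first step described in this abstract, representing event logs as a bipartite multiplex graph, can be sketched as below. This is an illustration under assumptions, not the code from the linked repository: events are taken to be `(source, destination, layer)` tuples, each layer is stored as a count matrix, and the function name `build_multiplex` and the sample events are hypothetical. The biclustering step is not shown.

```python
from collections import defaultdict

import numpy as np

def build_multiplex(events):
    """Build one biadjacency count matrix per layer from a list of events.

    events: list of (source, destination, layer) tuples (assumed format).
    Returns (rows, cols, layers), where rows and cols are the sorted entity
    names on each side of the bipartite graph and layers maps each layer
    name to a matrix of shape (len(rows), len(cols)).
    """
    rows = sorted({src for src, _, _ in events})
    cols = sorted({dst for _, dst, _ in events})
    row_idx = {name: i for i, name in enumerate(rows)}
    col_idx = {name: j for j, name in enumerate(cols)}

    # One zero matrix is created lazily for each layer encountered.
    layers = defaultdict(lambda: np.zeros((len(rows), len(cols)), dtype=int))
    for src, dst, layer in events:
        layers[layer][row_idx[src], col_idx[dst]] += 1
    return rows, cols, dict(layers)

# Hypothetical authentication events: (user, host, event type).
events = [
    ("alice", "srv1", "logon"),
    ("alice", "srv1", "logon"),
    ("bob", "srv2", "logon"),
    ("alice", "srv2", "ticket"),
]

rows, cols, layers = build_multiplex(events)
```

Each layer shares the same row and column ordering, so a biclustering algorithm can treat the set of matrices as a single multiplex graph over the same two sets of entities.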
-
Dynamically Modelling Heterogeneous Higher-Order Interactions for Malicious Behavior Detection in Event Logs
Authors:
Corentin Larroche,
Johan Mazel,
Stephan Clémençon
Abstract:
Anomaly detection in event logs is a promising approach for intrusion detection in enterprise networks. By building a statistical model of usual activity, it aims to detect multiple kinds of malicious behavior, including stealthy tactics, techniques and procedures (TTPs) designed to evade signature-based detection systems. However, finding suitable anomaly detection methods for event logs remains an important challenge. This results from the very complex, multi-faceted nature of the data: event logs are not only combinatorial, but also temporal and heterogeneous data, thus they fit poorly in most theoretical frameworks for anomaly detection. Most previous research focuses on either one of these three aspects, building a simplified representation of the data that can be fed to standard anomaly detection algorithms. In contrast, we propose to simultaneously address all three of these characteristics through a specifically tailored statistical model. We introduce DECADES, a dynamic, heterogeneous and combinatorial model for anomaly detection in event streams, and we demonstrate its effectiveness at detecting malicious behavior through experiments on a real dataset containing labelled red team activity. In particular, we empirically highlight the importance of handling the multiple characteristics of the data by comparing our model with state-of-the-art baselines relying on various data representations.
Submitted 28 June, 2022; v1 submitted 29 March, 2021;
originally announced March 2021.