COPR -- Efficient, large-scale log storage and retrieval
Authors:
Julian Reichinger,
Thomas Krismayer,
Jan Rellermeyer
Abstract:
Modern, large scale monitoring systems have to process and store vast amounts of log data in near real-time. At query time the systems have to find relevant logs based on the content of the log message using support structures that can scale to these amounts of data while still being efficient to use. We present our novel Compressed Probabilistic Retrieval algorithm (COPR), capable of answering Mu…
▽ More
Modern, large scale monitoring systems have to process and store vast amounts of log data in near real-time. At query time the systems have to find relevant logs based on the content of the log message using support structures that can scale to these amounts of data while still being efficient to use. We present our novel Compressed Probabilistic Retrieval algorithm (COPR), capable of answering Multi-Set Multi-Membership-Queries, that can be used as an alternative to existing indexing structures for streamed log data. In our experiments, COPR required up to 93% less storage space than the tested state-of-the-art inverted index and had up to four orders of magnitude less false-positives than the tested state-of-the-art membership sketch. Additionally, COPR achieved up to 250 times higher query throughput than the tested inverted index and up to 240 times higher query throughput than the tested membership sketch.
△ Less
Submitted 27 March, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
Comparing Constraints Mined From Execution Logs to Understand Software Evolution
Authors:
Thomas Krismayer,
Michael Vierhauser,
Rick Rabiser,
Paul Grünbacher
Abstract:
Complex software systems evolve frequently, e.g., when introducing new features or fixing bugs during maintenance. However, understanding the impact of such changes on system behavior is often difficult. Many approaches have thus been proposed that analyze systems before and after changes, e.g., by comparing source code, model-based representations, or system execution logs. In this paper, we prop…
▽ More
Complex software systems evolve frequently, e.g., when introducing new features or fixing bugs during maintenance. However, understanding the impact of such changes on system behavior is often difficult. Many approaches have thus been proposed that analyze systems before and after changes, e.g., by comparing source code, model-based representations, or system execution logs. In this paper, we propose an approach for comparing run-time constraints, synthesized by a constraint mining algorithm, based on execution logs recorded before and after changes. Specifically, automatically mined constraints define the expected timing and order of recurring events and the values of data elements attached to events. Our approach presents the differences of the mined constraints to users, thereby providing a higher-level view on software evolution and supporting the analysis of the impact of changes on system behavior. We present a motivating example and a preliminary evaluation based on a cyber-physical system controlling unmanned aerial vehicles. The results of our preliminary evaluation show that our approach can help to analyze changed behavior and thus contributes to understanding software evolution.
△ Less
Submitted 8 January, 2020;
originally announced January 2020.