-
Discovering High-Quality Process Models Despite Data Scarcity
Authors:
Jan Niklas Adams,
Jari Peeperkorn,
Tobias Brockhoff,
Isabelle Terrier,
Heiko Göhner,
Merih Seran Uysal,
Seppe vanden Broucke,
Jochen De Weerdt,
Wil M. P. van der Aalst
Abstract:
Process discovery algorithms learn process models from executed activity sequences, describing concurrency, causality, and conflict. Concurrent activities require observing multiple permutations, increasing data requirements, especially for processes with concurrent subprocesses such as hierarchical, composite, or distributed processes. While process discovery algorithms traditionally use sequences of activities as input, recently introduced object-centric process discovery algorithms can use graphs of activities as input, encoding partial orders between activities. As such, they contain the concurrency information of many sequences in a single graph. In this paper, we address the research question of reducing process discovery data requirements when using object-centric event logs for process discovery. We classify different real-life processes according to the control-flow complexity within and between subprocesses and introduce an evaluation framework to assess process discovery algorithm quality of traditional and object-centric process discovery based on the sample size. We complement this with a large-scale production process case study. Our results show reduced data requirements, enabling the discovery of large, concurrent processes such as manufacturing with little data, previously infeasible with traditional process discovery. Our findings suggest that object-centric process mining could revolutionize process discovery in various sectors, including manufacturing and supply chains.
Submitted 17 October, 2023;
originally announced October 2023.
-
Resolving Uncertain Case Identifiers in Interaction Logs: A User Study
Authors:
Marco Pegoraro,
Merih Seran Uysal,
Tom-Hendrik Hülsmann,
Wil M. P. van der Aalst
Abstract:
Modern software systems are able to record vast amounts of user actions, stored for later analysis. One of the main types of such user interaction data is click data: the digital trace of the actions of a user through the graphical elements of an application, website or software. While readily available, click data is often missing a case notion: an attribute linking events from user interactions to a specific process instance in the software. In this paper, we propose a neural network-based technique to determine a case notion for click data, thus enabling process mining and other process analysis techniques on user interaction data. We describe our method, show its scalability to datasets of large dimensions, and we validate its efficacy through a user study based on the segmented event log resulting from interaction data of a mobility sharing company. Interviews with domain experts in the company demonstrate that the case notion obtained by our method can lead to actionable process insights.
Submitted 21 November, 2022;
originally announced December 2022.
-
Uncertain Case Identifiers in Process Mining: A User Study of the Event-Case Correlation Problem on Click Data
Authors:
Marco Pegoraro,
Merih Seran Uysal,
Tom-Hendrik Hülsmann,
Wil M. P. van der Aalst
Abstract:
Among the many sources of event data available today, a prominent one is user interaction data. User activity may be recorded during the use of an application or website, resulting in a type of user interaction data often called click data. An obstacle to the analysis of click data using process mining is the lack of a case identifier in the data. In this paper, we show a case and user study for event-case correlation on click data, in the context of user interaction events from a mobility sharing company. To reconstruct the case notion of the process, we apply a novel neural network-based method that aggregates user interaction data into separate user sessions, which are interpreted as cases. To validate our findings, we qualitatively discuss the impact of process mining analyses on the resulting well-formed event log through interviews with process experts.
Submitted 8 April, 2022;
originally announced April 2022.
-
An XES Extension for Uncertain Event Data
Authors:
Marco Pegoraro,
Merih Seran Uysal,
Wil M. P. van der Aalst
Abstract:
Event data, often stored in the form of event logs, serve as the starting point for process mining and other evidence-based process improvements. However, event data in logs are often tainted by noise, errors, and missing data. Recently, a novel body of research has emerged, with the aim to address and analyze a class of anomalies known as uncertainty: imprecisions quantified with meta-information in the event log. This paper illustrates an extension of the XES data standard capable of representing uncertain event data. Such an extension enables input, output, and manipulation of uncertain data, as well as analysis through the process discovery and conformance checking approaches available in literature.
Submitted 8 April, 2022;
originally announced April 2022.
-
Probability Estimation of Uncertain Process Trace Realizations
Authors:
Marco Pegoraro,
Bianka Bakullari,
Merih Seran Uysal,
Wil M. P. van der Aalst
Abstract:
Process mining is a scientific discipline that analyzes event data, often collected in databases called event logs. Recently, uncertain event logs have become of interest, which contain non-deterministic and stochastic event attributes that may represent many possible real-life scenarios. In this paper, we present a method to reliably estimate the probability of each of such scenarios, allowing their analysis. Experiments show that the probabilities calculated with our method closely match the true chances of occurrence of specific outcomes, enabling more trustworthy analyses on uncertain data.
Submitted 24 September, 2021; v1 submitted 19 August, 2021;
originally announced August 2021.
-
Removing Operational Friction Using Process Mining: Challenges Provided by the Internet of Production (IoP)
Authors:
Wil van der Aalst,
Tobias Brockhoff,
Anahita Farhang Ghahfarokhi,
Mahsa Pourbafrani,
Merih Seran Uysal,
Sebastiaan van Zelst
Abstract:
Operational processes in production, logistics, material handling, maintenance, etc., are supported by cyber-physical systems combining hardware and software components. As a result, the digital and the physical world are closely aligned, and it is possible to track operational processes in detail (e.g., using sensors). The abundance of event data generated by today's operational processes provides opportunities and challenges for process mining techniques supporting process discovery, performance analysis, and conformance checking. Using existing process mining tools, it is already possible to automatically discover process models and uncover performance and compliance problems. In the DFG-funded Cluster of Excellence "Internet of Production" (IoP), process mining is used to create "digital shadows" to improve a wide variety of operational processes. However, operational processes are dynamic, distributed, and complex. Driven by the challenges identified in the IoP cluster, we work on novel techniques for comparative process mining (comparing process variants for different products at different locations at different times), object-centric process mining (to handle processes involving different types of objects that interact), and forward-looking process mining (to explore "What if?" questions). By addressing these challenges, we aim to develop valuable "digital shadows" that can be used to remove operational friction.
Submitted 27 July, 2021;
originally announced July 2021.
-
Text-Aware Predictive Monitoring of Business Processes
Authors:
Marco Pegoraro,
Merih Seran Uysal,
David Benedikt Georgi,
Wil M. P. van der Aalst
Abstract:
The real-time prediction of business processes using historical event data is an important capability of modern business process monitoring systems. Existing process prediction methods are able to also exploit the data perspective of recorded events, in addition to the control-flow perspective. However, while well-structured numerical or categorical attributes are considered in many prediction techniques, almost no technique is able to utilize text documents written in natural language, which can hold information critical to the prediction task. In this paper, we illustrate the design, implementation, and evaluation of a novel text-aware process prediction model based on Long Short-Term Memory (LSTM) neural networks and natural language models. The proposed model can take categorical, numerical and textual attributes in event data into account to predict the activity and timestamp of the next event, the outcome, and the cycle time of a running process instance. Experiments show that the text-aware model is able to outperform state-of-the-art process prediction methods on simulated and real-world event logs containing textual data.
Submitted 10 May, 2022; v1 submitted 20 April, 2021;
originally announced April 2021.
-
PROVED: A Tool for Graph Representation and Analysis of Uncertain Event Data
Authors:
Marco Pegoraro,
Merih Seran Uysal,
Wil M. P. van der Aalst
Abstract:
The discipline of process mining aims to study processes in a data-driven manner by analyzing historical process executions, often employing Petri nets. Event data, extracted from information systems (e.g., SAP), serve as the starting point for process mining. Recently, novel types of event data have gathered interest among the process mining community, including uncertain event data. Uncertain events, process traces and logs contain attributes that are characterized by quantified imprecisions, e.g., a set of possible attribute values. The PROVED tool helps to explore, navigate and analyze such uncertain event data by abstracting the uncertain information using behavior graphs and nets, which have Petri net semantics. Based on these constructs, the tool enables discovery and conformance checking.
Submitted 8 April, 2022; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Efficient Time and Space Representation of Uncertain Event Data
Authors:
Marco Pegoraro,
Merih Seran Uysal,
Wil M. P. van der Aalst
Abstract:
Process mining is a discipline which concerns the analysis of execution data of operational processes, the extraction of models from event data, the measurement of the conformance between event data and normative models, and the enhancement of all aspects of processes. Most approaches assume that event data accurately captures behavior. However, this is not realistic in many applications: data can contain uncertainty, generated from errors in recording, imprecise measurements, and other factors. Recently, new methods have been developed to analyze event data containing uncertainty; these techniques prominently rely on representing uncertain event data by means of graph-based models explicitly capturing uncertainty. In this paper, we introduce a new approach to efficiently calculate a graph representation of the behavior contained in an uncertain process trace. We present our novel algorithm, prove its asymptotic time complexity, and show experimental results that highlight order-of-magnitude performance improvements for the behavior graph construction.
Submitted 8 November, 2020; v1 submitted 30 September, 2020;
originally announced October 2020.
-
Conformance Checking over Uncertain Event Data
Authors:
Marco Pegoraro,
Merih Seran Uysal,
Wil M. P. van der Aalst
Abstract:
The strong impulse to digitize processes and operations in companies and enterprises has resulted in the creation and automatic recording of an increasingly large amount of process data in information systems. These data are made available in the form of event logs. Process mining techniques enable the process-centric analysis of data, including automatically discovering process models and checking if event data conform to a given model. In this paper, we analyze the previously unexplored setting of uncertain event logs. In such event logs uncertainty is recorded explicitly, i.e., the time, activity and case of an event may be unclear or imprecise. In this work, we define a taxonomy of uncertain event logs and models, and we examine the challenges that uncertainty poses for process discovery and conformance checking. Finally, we show how upper and lower bounds for conformance can be obtained by aligning an uncertain trace onto a regular process model.
Submitted 8 April, 2022; v1 submitted 29 September, 2020;
originally announced September 2020.
-
Efficient Construction of Behavior Graphs for Uncertain Event Data
Authors:
Marco Pegoraro,
Merih Seran Uysal,
Wil M. P. van der Aalst
Abstract:
The discipline of process mining deals with analyzing execution data of operational processes, extracting models from event data, checking the conformance between event data and normative models, and enhancing all aspects of processes. Recently, new techniques have been developed to analyze event data containing uncertainty; these techniques strongly rely on representing uncertain event data through graph-based models capturing uncertainty. In this paper we present a novel approach to efficiently compute a graph representation of the behavior contained in an uncertain process trace. We present our new algorithm, analyze its time complexity, and report experimental results showing order-of-magnitude performance improvements for behavior graph construction.
Submitted 8 April, 2022; v1 submitted 19 February, 2020;
originally announced February 2020.
-
Discovering Process Models from Uncertain Event Data
Authors:
Marco Pegoraro,
Merih Seran Uysal,
Wil M. P. van der Aalst
Abstract:
Modern information systems are able to collect event data in the form of event logs. Process mining techniques make it possible to discover a model from event data, to check the conformance of an event log against a reference model, and to perform further process-centric analyses. In this paper, we consider uncertain event logs, where data is recorded together with explicit uncertainty information. We describe a technique to discover a directly-follows graph from such event data which retains information about the uncertainty in the process. We then present experimental results of performing inductive mining over the directly-follows graph to obtain models representing the certain and uncertain parts of the process.
Submitted 8 April, 2022; v1 submitted 20 September, 2019;
originally announced September 2019.