-
LLMs in HCI Data Work: Bridging the Gap Between Information Retrieval and Responsible Research Practices
Authors:
Neda Taghizadeh Serajeh,
Iman Mohammadi,
Vittorio Fuccella,
Mattia De Rosa
Abstract:
Efficient and accurate information extraction from scientific papers is important to the literature review process in the rapidly developing field of human-computer interaction (HCI) research. Our paper introduces and analyses a new information retrieval system that uses state-of-the-art Large Language Models (LLMs) in combination with structured text analysis techniques to extract experimental data from HCI literature, emphasizing key elements. We then analyze the challenges and risks of using LLMs in research. We performed a comprehensive analysis on our dataset, which contained the specified information of 300 CHI 2020-2022 papers, to evaluate the performance of two large language models, GPT-3.5 (text-davinci-003) and Llama-2-70b, paired with structured text analysis techniques. The GPT-3.5 model achieves an accuracy of 58% and a mean absolute error of 7.00. In contrast, the Llama2 model achieves an accuracy of 56% with a mean absolute error of 7.63. A question-answering capability was also included in the system in order to work with streamlined data. By evaluating the risks and opportunities presented by LLMs, our work contributes to the ongoing dialogue on establishing methodological validity and ethical guidelines for LLM use in HCI data work.
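The two reported metrics can be illustrated with a minimal sketch. It assumes the extracted field is numeric (for example, a participant count) and that accuracy means an exact match with the ground truth; the abstract does not state either, and the function names and sample values below are hypothetical.

```python
from typing import Sequence


def accuracy(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Fraction of extracted values that exactly match the ground truth."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between extracted and true values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical values extracted from four papers (not data from the study).
extracted = [12, 24, 30, 16]
ground_truth = [12, 20, 30, 18]

acc = accuracy(extracted, ground_truth)
mae = mean_absolute_error(extracted, ground_truth)
print(f"accuracy={acc:.2f}, MAE={mae:.2f}")
```

Reporting both metrics is useful because accuracy counts only exact hits, while the mean absolute error shows how far off the misses are.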
Submitted 26 March, 2024;
originally announced March 2024.
-
The Hitchhiker's Guide to Malicious Third-Party Dependencies
Authors:
Piergiorgio Ladisa,
Merve Sahin,
Serena Elisa Ponta,
Marco Rosa,
Matias Martinez,
Olivier Barais
Abstract:
The increasing popularity of certain programming languages has spurred the creation of ecosystem-specific package repositories and package managers. Such repositories (e.g., npm, PyPI) serve as public databases that users can query to retrieve packages for various functionalities, whereas package managers automatically handle dependency resolution and package installation on the client side. These mechanisms enhance software modularization and accelerate implementation. However, they have become a target for malicious actors seeking to propagate malware on a large scale.
In this work, we show how attackers can leverage capabilities of popular package managers and languages to achieve arbitrary code execution on victim machines, thereby realizing open-source software supply chain attacks. Based on the analysis of 7 ecosystems, we identify 3 install-time and 4 runtime techniques, and we provide recommendations describing how to reduce the risk when consuming third-party dependencies. We will provide proof-of-concepts that demonstrate the identified techniques. Furthermore, we describe evasion strategies employed by attackers to circumvent detection mechanisms.
Submitted 6 October, 2023; v1 submitted 18 July, 2023;
originally announced July 2023.
-
Learning When to Treat Business Processes: Prescriptive Process Monitoring with Causal Inference and Reinforcement Learning
Authors:
Zahra Dasht Bozorgi,
Marlon Dumas,
Marcello La Rosa,
Artem Polyvyanyy,
Mahmoud Shoush,
Irene Teinemaa
Abstract:
Increasing the success rate of a process, i.e. the percentage of cases that end in a positive outcome, is a recurrent process improvement goal. At runtime, there are often certain actions (a.k.a. treatments) that workers may execute to lift the probability that a case ends in a positive outcome. For example, in a loan origination process, a possible treatment is to issue multiple loan offers to increase the probability that the customer takes a loan. Each treatment has a cost. Thus, when defining policies for prescribing treatments to cases, managers need to consider the net gain of the treatments. Also, the effect of a treatment varies over time: treating a case earlier may be more effective than later in a case. This paper presents a prescriptive monitoring method that automates this decision-making task. The method combines causal inference and reinforcement learning to learn treatment policies that maximize the net gain. The method leverages a conformal prediction technique to speed up the convergence of the reinforcement learning mechanism by separating cases that are likely to end up in a positive or negative outcome, from uncertain cases. An evaluation on two real-life datasets shows that the proposed method outperforms a state-of-the-art baseline.
Submitted 6 March, 2023;
originally announced March 2023.
-
Web-based Database Courses E-Learning Application
Authors:
Aaron Paul M. Dela Rosa,
Luigi Miguel M. Villanueva,
John Mardy R. San Miguel,
John Emmanuel B. Quinto
Abstract:
This study was focused on the development of a web e-learning application for the database courses taken by Information Technology (IT) students at the College of Information and Communications Technology (CICT) of Bulacan State University (BulSU). The research methodology used in this project was the cross-sectional developmental approach. The Agile Software Development methodology was followed phase by phase, up to the development phase, to develop the system. It was used to produce the desired output rapidly while allowing users to go back through phases without finishing the whole cycle. The goal of this study was to create a web application that teaches Structured Query Language (SQL) in both the MySQL and SQL Server approaches. The application contains quizzes and examinations to allow self-assessment of learning. Additionally, an Entity Relationship Diagram (ERD) simulation was included to provide ERD creations in a drag-and-drop fashion. This study was evaluated using ISO/IEC 25010 software quality evaluation criteria. The study's overall mean was 4.24, 4.41, and 4.33, all with the descriptive meaning of "Very Good," which showed that the system performed its necessary functions as perceived by students, faculty members, and experts, respectively. In summary, the e-learning web application for database courses was fully developed. Moreover, the entity-relationship diagram was integrated well within the system and is accessible to the users. Lastly, respondents evaluated the developed web application using the ISO/IEC 25010 with an overall descriptive interpretation of "Very Good." For future developments of the study, an administrator panel may be developed to manage users and do other administrative tasks. Lastly, higher-order thinking skills questions on assessments and quizzes may be included.
Submitted 23 November, 2022;
originally announced December 2022.
-
Web-based Management Information System of Cases Filed with the National Labor Relations Commission
Authors:
Aaron Paul M. Dela Rosa
Abstract:
This study was developed to describe the daily operations and encountered problems of the National Labor Relations Commission Regional Arbitration Branch No. IV (NLRC RAB IV) through conducted observations and interviews. These problems were analyzed and addressed as features of the developed web-based management information system (MIS) for cases. The research methodology utilized in this project was the descriptive developmental approach. The Agile Software Development methodology was followed to develop the system. It was used to quickly produce the desired output while allowing the user to go back through phases without finishing the whole cycle. The system covered managing filed complaints, Single-Entry Approach (SEnA), labor cases, and report generation. The interviews revealed that the handling of records was inconsistent and inaccurate. This study also focused on ensuring compliance with the Data Privacy Act of 2012, protecting the database's information using the XOR Cipher Algorithm. This study was evaluated using standard web evaluation criteria. Using the criteria, the study's overall means were 4.27 and 4.43, with the descriptive meaning of Very Good, which showed that the system was accepted as perceived by experts and end-users, respectively. Management of filed cases is a vital process for the Commission. With that said, developing a web-based management information system could ease the internal operations of handling and managing filed labor cases. Moreover, respondents and complainants can easily determine their filed cases' status using the case status tracking system. For further improvements to the system, additional printable documents that the Commission may need could be added. Lastly, further research about the effectiveness of the web-based system may be conducted for further enhancements of the system.
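The XOR cipher mentioned above can be sketched generically as follows. This is a textbook repeating-key XOR, not the system's actual implementation; the key, the sample record, and the function name are hypothetical.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical record and key, for illustration only.
record = "Case No. 0001".encode("utf-8")
key = b"demo-key"

encrypted = xor_cipher(record, key)
decrypted = xor_cipher(encrypted, key)  # the same call reverses the operation
print(decrypted.decode("utf-8"))
```

Applying the function twice with the same key returns the original bytes, because (b XOR k) XOR k = b, so one routine serves for both encryption and decryption.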
Submitted 23 November, 2022;
originally announced November 2022.
-
Effectiveness of an Online Course in Programming in a State University in the Philippines
Authors:
Aaron Paul M. Dela Rosa
Abstract:
Online courses, as a pedagogical approach to teaching, boomed during the Coronavirus Disease 2019 pandemic era. Universities shifted from traditional face-to-face classes to online distance learning because of the pandemic. This study aimed to determine how effective an online course is in learning a programming course. The study utilized mixed-methods research applied through a validated survey questionnaire consisting of closed- and open-ended questions. Python programming was the course selected for the study and underwent an evaluation to determine the students' responses. Student respondents are from Bulacan State University, a state university in the Philippines, under the Bachelor of Science in Information Technology program. Based on their responses, the students found that the online Python programming course was Very Effective, with an overall mean of 4.49. This result shows that students found the online course effective: it provided proper course design and content, allowed them to spend enough time finishing tasks, and provided communication and interaction with their instructor and fellow students. Additionally, students gave overwhelmingly positive responses when asked what their instructors had done well in the course delivery and provided insightful and constructive comments for further enhancement and delivery of the course. This study found that most students strongly agreed and believed in the effectiveness of delivering the Python Programming course asynchronously. With such positive results from the students' perspective and evaluation, the course can be enhanced to continue providing quality education at Bulacan State University.
Submitted 23 November, 2022;
originally announced November 2022.
-
Monitoring Fog Computing: a Review, Taxonomy and Open Challenges
Authors:
Breno Costa,
Joao Bachiega Jr,
Leonardo Reboucas de Carvalho,
Michel Rosa,
Aleteia Araujo
Abstract:
Fog computing is a distributed paradigm that provides computational resources in the users' vicinity. Fog orchestration is a set of functionalities that coordinate the dynamic infrastructure and manage the services to guarantee the Service Level Agreements. Monitoring is an orchestration functionality of prime importance. It is the basis for resource management actions, collecting status of resource and service and delivering updated data to the orchestrator. There are several cloud monitoring solutions and tools, but none of them comply with fog characteristics and challenges. Fog monitoring solutions are scarce, and they may not be prepared to compose an orchestration service. This paper updates the knowledge base about fog monitoring, assessing recent subjects in this context like observability, data standardization and instrumentation domains. We propose a novel taxonomy of fog monitoring solutions, supported by a systematic review of the literature. Fog monitoring proposals are analyzed and categorized by this new taxonomy, offering researchers a comprehensive overview. This work also highlights the main challenges and open research questions.
Submitted 16 June, 2022; v1 submitted 13 May, 2022;
originally announced June 2022.
-
No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval
Authors:
Guilherme Moraes Rosa,
Luiz Bonifacio,
Vitor Jeronymo,
Hugo Abonizio,
Marzieh Fadaee,
Roberto Lotufo,
Rodrigo Nogueira
Abstract:
Recent work has shown that small distilled language models are strong competitors to models that are orders of magnitude larger and slower in a wide range of information retrieval tasks. This has made distilled and dense models, due to latency constraints, the go-to choice for deployment in real-world retrieval applications. In this work, we question this practice by showing that the number of parameters and early query-document interaction play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that rerankers largely outperform dense ones of similar size in several tasks. Our largest reranker reaches the state of the art in 12 of the 18 datasets of the Benchmark-IR (BEIR) and surpasses the previous state of the art by 3 average points. Finally, we confirm that in-domain effectiveness is not a good indicator of zero-shot effectiveness. Code is available at https://github.com/guilhermemr04/scaling-zero-shot-retrieval.git
Submitted 12 December, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Billions of Parameters Are Worth More Than In-domain Training Data: A case study in the Legal Case Entailment Task
Authors:
Guilherme Moraes Rosa,
Luiz Bonifacio,
Vitor Jeronymo,
Hugo Abonizio,
Roberto Lotufo,
Rodrigo Nogueira
Abstract:
Recent work has shown that language models scaled to billions of parameters, such as GPT-3, perform remarkably well in zero-shot and few-shot scenarios. In this work, we experiment with zero-shot models in the legal case entailment task of the COLIEE 2022 competition. Our experiments show that scaling the number of parameters in a language model improves the F1 score of our previous zero-shot result by more than 6 points, suggesting that stronger zero-shot capability may be a characteristic of larger models, at least for this task. Our 3B-parameter zero-shot model outperforms all models, including ensembles, in the COLIEE 2021 test set and also achieves the best performance of a single model in the COLIEE 2022 competition, second only to the ensemble composed of the 3B model itself and a smaller version of the same model. Despite the challenges posed by large language models, mainly due to latency constraints in real-time applications, we provide a demonstration of our zero-shot monoT5-3b model being used in production as a search engine, including for legal documents. The code for our submission and the demo of our system are available at https://github.com/neuralmind-ai/coliee and https://neuralsearchx.neuralmind.ai, respectively.
Submitted 30 May, 2022;
originally announced May 2022.
-
Generalization in Automated Process Discovery: A Framework based on Event Log Patterns
Authors:
Daniel Reißner,
Abel Armas-Cervantes,
Marcello La Rosa
Abstract:
The importance of quality measures in process mining has increased. One of the key quality aspects, generalization, is concerned with measuring the degree of overfitting of a process model w.r.t. an event log, since the recorded behavior is just an example of the true behavior of the underlying business process. Existing generalization measures exhibit several shortcomings that severely hinder their applicability in practice. For example, they assume the event log fully fits the discovered process model, and cannot deal with large real-life event logs and complex process models. More significantly, current measures neglect generalizations for clear patterns that demand a certain construct in the model. For example, a repeating sequence in an event log should be generalized with a loop structure in the model. We address these shortcomings by proposing a framework of measures that generalize a set of patterns discovered from an event log with representative traces and check the corresponding control-flow structures in the process model via their trace alignment. We instantiate the framework with a generalization measure that uses tandem repeats to identify repetitive patterns that are compared to the loop structures and a concurrency oracle to identify concurrent patterns that are compared to the parallel structures of the process model. In an extensive qualitative and quantitative evaluation using 74 log-model pairs against two baseline generalization measures, we show that the proposed generalization measure consistently ranks process models that fulfil the observed patterns with generalizing control-flow structures higher than those which do not, while the baseline measures disregard those patterns. Further, we show that our measure can be efficiently computed for datasets two orders of magnitude larger than the largest dataset the baseline generalization measures can handle.
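To make the notion of a tandem repeat concrete, the sketch below scans one trace for blocks of activities that repeat back to back. It is a brute-force illustration only, not the detection algorithm used in the paper; the function name and the sample trace are hypothetical.

```python
from typing import List, Sequence, Tuple

TandemRepeat = Tuple[int, Tuple[str, ...], int]  # (start index, block, repeat count)


def find_tandem_repeats(trace: Sequence[str], min_repeats: int = 2) -> List[TandemRepeat]:
    """Report every position where a block of activities repeats consecutively."""
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    results: List[TandemRepeat] = []
    n = len(trace)
    for start in range(n):
        # A block can be at most this long and still repeat min_repeats times.
        max_length = (n - start) // min_repeats
        for length in range(1, max_length + 1):
            block = tuple(trace[start:start + length])
            count = 1
            pos = start + length
            # Count how many times the block follows itself directly.
            while tuple(trace[pos:pos + length]) == block:
                count += 1
                pos += length
            if count >= min_repeats:
                results.append((start, block, count))
    return results


# Hypothetical trace: the block (b, c) occurs three times in a row.
trace = ["a", "b", "c", "b", "c", "b", "c", "d"]
repeats = find_tandem_repeats(trace)
print(repeats)
```

A repeat such as (b, c) occurring three times in a row is the kind of pattern that, per the abstract, a generalizing model should represent with a loop structure.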
Submitted 26 March, 2022;
originally announced March 2022.
-
To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment
Authors:
Guilherme Moraes Rosa,
Ruan Chaves Rodrigues,
Roberto de Alencar Lotufo,
Rodrigo Nogueira
Abstract:
There has been mounting evidence that pretrained language models fine-tuned on large and diverse supervised datasets can transfer well to a variety of out-of-domain tasks. In this work, we investigate this transfer ability to the legal domain. For that, we participated in the legal case entailment task of COLIEE 2021, in which we use such models with no adaptations to the target domain. Our submissions achieved the highest scores, surpassing the second-best team by more than six percentage points. Our experiments confirm a counter-intuitive result in the new paradigm of pretrained language models: given limited labeled data, models with little or no adaptation to the target task can be more robust to changes in the data distribution than models fine-tuned on it. Code is available at https://github.com/neuralmind-ai/coliee.
Submitted 7 February, 2022;
originally announced February 2022.
-
AI-Augmented Business Process Management Systems: A Research Manifesto
Authors:
Marlon Dumas,
Fabiana Fournier,
Lior Limonad,
Andrea Marrella,
Marco Montali,
Jana-Rebecca Rehse,
Rafael Accorsi,
Diego Calvanese,
Giuseppe De Giacomo,
Dirk Fahland,
Avigdor Gal,
Marcello La Rosa,
Hagen Völzer,
Ingo Weber
Abstract:
AI-Augmented Business Process Management Systems (ABPMSs) are an emerging class of process-aware information systems, empowered by trustworthy AI technology. An ABPMS enhances the execution of business processes with the aim of making these processes more adaptable, proactive, explainable, and context-sensitive. This manifesto presents a vision for ABPMSs and discusses research challenges that need to be surmounted to realize this vision. To this end, we define the concept of ABPMS, we outline the lifecycle of processes within an ABPMS, we discuss core characteristics of an ABPMS, and we derive a set of challenges to realize systems with these characteristics.
Submitted 4 November, 2022; v1 submitted 30 January, 2022;
originally announced January 2022.
-
Packaging research artefacts with RO-Crate
Authors:
Stian Soiland-Reyes,
Peter Sefton,
Mercè Crosas,
Leyla Jael Castro,
Frederik Coppens,
José M. Fernández,
Daniel Garijo,
Björn Grüning,
Marco La Rosa,
Simone Leo,
Eoghan Ó Carragáin,
Marc Portier,
Ana Trisovic,
RO-Crate Community,
Paul Groth,
Carole Goble
Abstract:
An increasing number of researchers support reproducibility by including pointers to and descriptions of datasets, software and methods in their publications. However, scientific articles may be ambiguous, incomplete and difficult to process by automated systems. In this paper we introduce RO-Crate, an open, community-driven, and lightweight approach to packaging research artefacts along with their metadata in a machine readable manner. RO-Crate is based on Schema.org annotations in JSON-LD, aiming to establish best practices to formally describe metadata in an accessible and practical way for their use in a wide variety of situations.
An RO-Crate is a structured archive of all the items that contributed to a research outcome, including their identifiers, provenance, relations and annotations. As a general purpose packaging approach for data and their metadata, RO-Crate is used across multiple areas, including bioinformatics, digital humanities and regulatory sciences. By applying "just enough" Linked Data standards, RO-Crate simplifies the process of making research outputs FAIR while also enhancing research reproducibility.
An RO-Crate for this article is available at https://w3id.org/ro/doi/10.5281/zenodo.5146227
Submitted 6 December, 2021; v1 submitted 14 August, 2021;
originally announced August 2021.
-
Automated Repair of Process Models with Non-Local Constraints Using State-Based Region Theory
Authors:
Anna Kalenkova,
Josep Carmona,
Artem Polyvyanyy,
Marcello La Rosa
Abstract:
State-of-the-art process discovery methods construct free-choice process models from event logs. Consequently, the constructed models do not take into account indirect dependencies between events. Whenever the input behaviour is not free-choice, these methods fail to provide a precise model. In this paper, we propose a novel approach for enhancing free-choice process models by adding non-free-choice constructs discovered a-posteriori via region-based techniques. This allows us to benefit from the performance of existing process discovery methods and the accuracy of the employed fundamental synthesis techniques. We prove that the proposed approach preserves fitness with respect to the event log while improving the precision when indirect dependencies exist. The approach has been implemented and tested on both synthetic and real-life datasets. The results show its effectiveness in repairing models discovered from event logs.
Submitted 13 December, 2021; v1 submitted 26 June, 2021;
originally announced June 2021.
-
Discovering executable routine specifications from user interaction logs
Authors:
Volodymyr Leno,
Adriano Augusto,
Marlon Dumas,
Marcello La Rosa,
Fabrizio Maria Maggi,
Artem Polyvyanyy
Abstract:
Robotic Process Automation (RPA) is a technology to automate routine work such as copying data across applications or filling in document templates using data from multiple applications. RPA tools allow organizations to automate a wide range of routines. However, identifying and scoping routines that can be automated using RPA tools is time consuming. Manual identification of candidate routines via interviews, walk-throughs, or job shadowing allows analysts to identify the most visible routines, but these methods are not suitable when it comes to identifying the long tail of routines in an organization. This article proposes an approach to discover automatable routines from logs of user interactions with IT systems and to synthesize executable specifications for such routines. The approach starts by discovering frequent routines at a control-flow level (candidate routines). It then determines which of these candidate routines are automatable and it synthesizes an executable specification for each such routine. Finally, it identifies semantically equivalent routines so as to produce a set of non-redundant automatable routines. The article reports on an evaluation of the approach using a combination of synthetic and real-life logs. The evaluation results show that the approach can discover automatable routines that are known to be present in a UI log, and that it identifies automatable routines that users recognize as such in real-life logs.
Submitted 25 June, 2021;
originally announced June 2021.
-
Prescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
Authors:
Zahra Dasht Bozorgi,
Irene Teinemaa,
Marlon Dumas,
Marcello La Rosa,
Artem Polyvyanyy
Abstract:
Reducing cycle time is a recurrent concern in the field of business process management. Depending on the process, various interventions may be triggered to reduce the cycle time of a case, for example, using a faster shipping service in an order-to-delivery process or giving a phone call to a customer to obtain missing information rather than waiting passively. Each of these interventions comes with a cost. This paper tackles the problem of determining if and when to trigger a time-reducing intervention in a way that maximizes the total net gain. The paper proposes a prescriptive process monitoring method that uses orthogonal random forest models to estimate the causal effect of triggering a time-reducing intervention for each ongoing case of a process. Based on this causal effect estimate, the method triggers interventions according to a user-defined policy. The method is evaluated on two real-life logs.
Submitted 14 September, 2021; v1 submitted 14 May, 2021;
originally announced May 2021.
-
A cost-benefit analysis of cross-lingual transfer methods
Authors:
Guilherme Moraes Rosa,
Luiz Henrique Bonifacio,
Leandro Rodrigues de Souza,
Roberto Lotufo,
Rodrigo Nogueira
Abstract:
An effective method for cross-lingual transfer is to fine-tune a bilingual or multilingual model on a supervised dataset in one language and evaluate it on another language in a zero-shot manner. Translating examples at training time or inference time are also viable alternatives. However, there are costs associated with these methods that are rarely addressed in the literature. In this work, we analyze cross-lingual methods in terms of their effectiveness (e.g., accuracy), development and deployment costs, as well as their latencies at inference time. Our experiments on three tasks indicate that the best cross-lingual method is highly task-dependent. Finally, by combining zero-shot and translation methods, we achieve the state of the art on two of the three datasets used in this work. Based on these results, we question the need for manually labeled training data in a target language. Code and translated datasets are available at https://github.com/unicamp-dl/cross-lingual-analysis
Submitted 14 December, 2021; v1 submitted 14 May, 2021;
originally announced May 2021.
-
Automated Discovery of Process Models with True Concurrency and Inclusive Choices
Authors:
Adriano Augusto,
Marlon Dumas,
Marcello La Rosa
Abstract:
Enterprise information systems allow companies to maintain detailed records of their business process executions. These records can be extracted in the form of event logs, which capture the execution of activities across multiple instances of a business process. Event logs may be used to analyze business processes at a fine level of detail using process mining techniques. Among other things, process mining techniques allow us to discover a process model from an event log -- an operation known as automated process discovery. Despite a rich body of research in the field, existing automated process discovery techniques do not fully capture the concurrency inherent in a business process. Specifically, the bulk of these techniques treat two activities A and B as concurrent if sometimes A completes before B and other times B completes before A. Typically though, activities in a business process are executed in a true concurrency setting, meaning that two or more activity executions overlap temporally. This paper addresses this gap by presenting a refined version of an automated process discovery technique, namely Split Miner, that discovers true concurrency relations from event logs containing start and end timestamps for each activity. The proposed technique is also able to differentiate between exclusive and inclusive choices. We evaluate the proposed technique relative to existing baselines using 11 real-life logs drawn from different industries.
Submitted 12 May, 2021;
originally announced May 2021.
-
Yes, BM25 is a Strong Baseline for Legal Case Retrieval
Authors:
Guilherme Moraes Rosa,
Ruan Chaves Rodrigues,
Roberto Lotufo,
Rodrigo Nogueira
Abstract:
We describe our single submission to task 1 of COLIEE 2021. Our vanilla BM25 got second place, well above the median of submissions. Code is available at https://github.com/neuralmind-ai/coliee.
Submitted 25 October, 2021; v1 submitted 26 April, 2021;
originally announced May 2021.
-
Bootstrapping of memetic from genetic evolution via inter-agent selection pressures
Authors:
Nicholas Guttenberg,
Marek Rosa
Abstract:
We create an artificial system of agents (attention-based neural networks) which selectively exchange messages with each other in order to study the emergence of memetic evolution and how memetic evolutionary pressures interact with genetic evolution of the network weights. We observe that the ability of agents to exert selection pressures on each other is essential for memetic evolution to bootstrap itself into a state which has both high-fidelity replication of memes, as well as continuing production of new memes over time. However, in this system there is very little interaction between this memetic 'ecology' and underlying tasks driving individual fitness - the emergent meme layer appears to be neither helpful nor harmful to agents' ability to learn to solve tasks. Source code for these experiments is available at https://github.com/GoodAI/memes
Submitted 7 April, 2021;
originally announced April 2021.
-
Experimental Body-input Three-stage DC offset Calibration Scheme for Memristive Crossbar
Authors:
Charanraj Mohan,
L. A. Camuñas-Mesa,
Elisa Vianello,
Carlo Reita,
José M. de la Rosa,
Teresa Serrano-Gotarredona,
Bernabé Linares-Barranco
Abstract:
Reading several ReRAMs simultaneously in a neuromorphic circuit increases power consumption and limits scalability. Applying small inference read pulses is futile when the offset voltages of the read-out circuit are considerably larger. This paper presents an experimental validation of a three-stage calibration scheme to calibrate the DC offset voltage across the rows of the memristive crossbar. The proposed method is based on biasing the body terminal of one of the differential pair MOSFETs of the buffer through a series of cascaded resistor banks arranged in three stages: coarse, fine and finer stages. The circuit is designed in a 130 nm CMOS technology, where the OxRAM-based binary memristors are built on top of it. A dedicated PCB and other auxiliary boards have been designed for testing the chip. Experimental results validate the presented approach, which is only limited by mismatch and electrical noise.
Submitted 3 March, 2021;
originally announced March 2021.
-
Implementation of binary stochastic STDP learning using chalcogenide-based memristive devices
Authors:
C. Mohan,
L. A. Camuñas-Mesa,
J. M. de la Rosa,
T. Serrano-Gotarredona,
B. Linares-Barranco
Abstract:
The emergence of nano-scale memristive devices encouraged many different research areas to exploit their use in multiple applications. One of the proposed applications was to implement synaptic connections in bio-inspired neuromorphic systems. Large-scale neuromorphic hardware platforms are being developed with increasing number of neurons and synapses, having a critical bottleneck in the online learning capabilities. Spike-timing-dependent plasticity (STDP) is a widely used learning mechanism inspired by biology which updates the synaptic weight as a function of the temporal correlation between pre- and post-synaptic spikes. In this work, we demonstrate experimentally that binary stochastic STDP learning can be obtained from a memristor when the appropriate pulses are applied at both sides of the device.
Submitted 1 March, 2021;
originally announced March 2021.
-
A Deep Adversarial Model for Suffix and Remaining Time Prediction of Event Sequences
Authors:
Farbod Taymouri,
Marcello La Rosa,
Sarah M. Erfani
Abstract:
Event suffix and remaining time prediction are sequence to sequence learning tasks. They have wide applications in different areas such as economics, digital health, business process management and IT infrastructure monitoring. Timestamped event sequences contain ordered events which carry at least two attributes: the event's label and its timestamp. Suffix and remaining time prediction are about obtaining the most likely continuation of event labels and the remaining time until the sequence finishes, respectively. Recent deep learning-based works for such predictions are prone to potentially large prediction errors because of closed-loop training (i.e., the next event is conditioned on the ground truth of previous events) and open-loop inference (i.e., the next event is conditioned on previously predicted events). In this work, we propose an encoder-decoder architecture for open-loop training to advance the suffix and remaining time prediction of event sequences. To capture the joint temporal dynamics of events, we harness the power of adversarial learning techniques to boost prediction performance. We consider four real-life datasets and three baselines in our experiments. The results show improvements up to four times compared to the state of the art in suffix and remaining time prediction of event sequences, specifically in the realm of business process executions. We also show that the obtained improvements of adversarial training are superior compared to standard training under the same experimental setup.
Submitted 14 February, 2021;
originally announced February 2021.
-
Process Mining Meets Causal Machine Learning: Discovering Causal Rules from Event Logs
Authors:
Zahra Dasht Bozorgi,
Irene Teinemaa,
Marlon Dumas,
Marcello La Rosa,
Artem Polyvyanyy
Abstract:
This paper proposes an approach to analyze an event log of a business process in order to generate case-level recommendations of treatments that maximize the probability of a given outcome. Users classify the attributes in the event log into controllable and non-controllable, where the former correspond to attributes that can be altered during an execution of the process (the possible treatments). We use an action rule mining technique to identify treatments that co-occur with the outcome under some conditions. Since action rules are generated based on correlation rather than causation, we then use a causal machine learning technique, specifically uplift trees, to discover subgroups of cases for which a treatment has a high causal effect on the outcome after adjusting for confounding variables. We test the relevance of this approach using an event log of a loan application process and compare our findings with recommendations manually produced by process mining experts.
Submitted 3 September, 2020;
originally announced September 2020.
-
Identifying candidate routines for Robotic Process Automation from unsegmented UI logs
Authors:
V. Leno,
A. Augusto,
M. Dumas,
M. La Rosa,
F. Maggi,
A. Polyvyanyy
Abstract:
Robotic Process Automation (RPA) is a technology to develop software bots that automate repetitive sequences of interactions between users and software applications (a.k.a. routines). To take full advantage of this technology, organizations need to identify and to scope their routines. This is a challenging endeavor in large organizations, as routines are usually not concentrated in a handful of processes, but rather scattered across the process landscape. Accordingly, the identification of routines from User Interaction (UI) logs has received significant attention. Existing approaches to this problem assume that the UI log is segmented, meaning that it consists of traces of a task that is presupposed to contain one or more routines. However, a UI log usually takes the form of a single unsegmented sequence of events. This paper presents an approach to discover candidate routines from unsegmented UI logs in the presence of noise, i.e. events within or between routine instances that do not belong to any routine. The approach is implemented as an open-source tool and evaluated using synthetic and real-life UI logs.
Submitted 26 August, 2020; v1 submitted 13 August, 2020;
originally announced August 2020.
-
Encoder-Decoder Generative Adversarial Nets for Suffix Generation and Remaining Time Prediction of Business Process Models
Authors:
Farbod Taymouri,
Marcello La Rosa
Abstract:
This paper proposes an encoder-decoder architecture grounded on Generative Adversarial Networks (GANs), that generates a sequence of activities and their timestamps in an end-to-end way. GANs work well with differentiable data such as images. However, a suffix is a sequence of categorical items. To this end, we use the Gumbel-Softmax distribution to get a differentiable continuous approximation. The training works by putting one neural network against the other in a two-player game (hence the "adversarial" nature), which leads to generating suffixes close to the ground truth. From the experimental evaluation it emerges that the approach is superior to the baselines in terms of the accuracy of the predicted suffixes and corresponding remaining times, despite using a naive feature encoding and only engineering features based on control flow and events completion time.
Submitted 19 October, 2020; v1 submitted 29 July, 2020;
originally announced July 2020.
-
Detecting sudden and gradual drifts in business processes from execution traces
Authors:
Abderrahmane Maaradji,
Marlon Dumas,
Marcello La Rosa,
Alireza Ostovar
Abstract:
Business processes are prone to unexpected changes, as process workers may suddenly or gradually start executing a process differently in order to adjust to changes in workload, season, or other external factors. Early detection of business process changes enables managers to identify and act upon changes that may otherwise affect process performance. Business process drift detection refers to a family of methods to detect changes in a business process by analyzing event logs extracted from the systems that support the execution of the process. Existing methods for business process drift detection are based on an explorative analysis of a potentially large feature space and in some cases they require users to manually identify specific features that characterize the drift. Depending on the explored feature space, these methods miss various types of changes. Moreover, they are either designed to detect sudden drifts or gradual drifts but not both. This paper proposes an automated and statistically grounded method for detecting sudden and gradual business process drifts under a unified framework. An empirical evaluation shows that the method detects typical change patterns with significantly higher accuracy and lower detection delay than existing methods, while accurately distinguishing between sudden and gradual drifts.
Submitted 7 May, 2020;
originally announced May 2020.
-
Efficient Conformance Checking using Approximate Alignment Computation with Tandem Repeats
Authors:
Daniel Reißner,
Abel Armas-Cervantes,
Marcello La Rosa
Abstract:
Conformance checking encompasses a body of process mining techniques which aim to find and describe the differences between a process model capturing the expected process behavior and a corresponding event log recording the observed behavior. Alignments are an established technique to compute the distance between a trace in the event log and the closest execution trace of a corresponding process model. Given a cost function, an alignment is optimal when it contains the least number of mismatches between a log trace and a model trace. Determining optimal alignments, however, is computationally expensive, especially in light of the growing size and complexity of event logs from practice, which can easily exceed one million events with traces of several hundred activities. A common limitation of existing alignment techniques is the inability to exploit repetitions in the log. By exploiting a specific form of sequential pattern in traces, namely tandem repeats, we propose a novel approximate technique that uses pre- and post-processing steps to compress the length of a trace and recomputes the alignment cost while guaranteeing that the cost result never under-approximates the optimal cost. In an extensive empirical evaluation with 50 real-life model log pairs and against six state-of-the-art alignment techniques, we show that the proposed compression approach systematically outperforms the baselines by up to an order of magnitude in the presence of traces with repetitions, and that the cost over-approximation, when it occurs, is negligible.
Submitted 26 March, 2022; v1 submitted 1 April, 2020;
originally announced April 2020.
-
Predictive Business Process Monitoring via Generative Adversarial Nets: The Case of Next Event Prediction
Authors:
Farbod Taymouri,
Marcello La Rosa,
Sarah Erfani,
Zahra Dasht Bozorgi,
Ilya Verenich
Abstract:
Predictive process monitoring aims to predict future characteristics of an ongoing process case, such as case outcome or remaining timestamp. Recently, several predictive process monitoring methods based on deep learning such as Long Short-Term Memory or Convolutional Neural Network have been proposed to address the problem of next event prediction. However, due to insufficient training data or sub-optimal network configuration and architecture, these approaches do not generalize well to the problem at hand. This paper proposes a novel adversarial training framework to address this shortcoming, based on an adaptation of Generative Adversarial Networks (GANs) to the realm of sequential temporal data. The training works by putting one neural network against the other in a two-player game (hence the adversarial nature) which leads to predictions that are indistinguishable from the ground truth. We formally show that the worst-case accuracy of the proposed approach is at least equal to the accuracy achieved in non-adversarial settings. From the experimental evaluation it emerges that the approach systematically outperforms all baselines both in terms of accuracy and earliness of the prediction, despite using a simple network architecture and a naive feature encoding. Moreover, the approach is more robust, as its accuracy is not affected by fluctuations over the case length.
Submitted 1 April, 2020; v1 submitted 25 March, 2020;
originally announced March 2020.
-
Automated Discovery of Data Transformations for Robotic Process Automation
Authors:
Volodymyr Leno,
Marlon Dumas,
Marcello La Rosa,
Fabrizio Maria Maggi,
Artem Polyvyanyy
Abstract:
Robotic Process Automation (RPA) is a technology for automating repetitive routines consisting of sequences of user interactions with one or more applications. In order to fully exploit the opportunities opened by RPA, companies need to discover which specific routines may be automated, and how. In this setting, this paper addresses the problem of analyzing User Interaction (UI) logs in order to discover routines where a user transfers data from one spreadsheet or (Web) form to another. The paper maps this problem to that of discovering data transformations by example - a problem for which several techniques are available. The paper shows that a naive application of a state-of-the-art technique for data transformation discovery is computationally inefficient. Accordingly, the paper proposes two optimizations that take advantage of the information in the UI log and the fact that data transfers across applications typically involve copying alphabetic and numeric tokens separately. The proposed approach and its optimizations are evaluated using UI logs that replicate a real-life repetitive data transfer routine.
Submitted 3 January, 2020;
originally announced January 2020.
-
Business Process Variant Analysis based on Mutual Fingerprints of Event Logs
Authors:
Farbod Taymouri,
Marcello La Rosa,
Josep Carmona
Abstract:
Comparing business process variants using event logs is a common use case in process mining. Existing techniques for process variant analysis detect statistically-significant differences between variants at the level of individual entities (such as process activities) and their relationships (e.g. directly-follows relations between activities). This may lead to a proliferation of differences due to the low level of granularity in which such differences are captured. This paper presents a novel approach to detect statistically-significant differences between variants at the level of entire process traces (i.e. sequences of directly-follows relations). The cornerstone of this approach is a technique to learn a directly follows graph called mutual fingerprint from the event logs of the two variants. A mutual fingerprint is a lossless encoding of a set of traces and their duration using discrete wavelet transformation. This structure facilitates the understanding of statistical differences along the control-flow and performance dimensions. The approach has been evaluated using real-life event logs against two baselines. The results show that at a trace level, the baselines cannot always reveal the differences discovered by our approach, or can detect spurious differences.
Submitted 1 April, 2020; v1 submitted 22 December, 2019;
originally announced December 2019.
-
BADGER: Learning to (Learn [Learning Algorithms] through Multi-Agent Communication)
Authors:
Marek Rosa,
Olga Afanasjeva,
Simon Andersson,
Joseph Davidson,
Nicholas Guttenberg,
Petr Hlubuček,
Martin Poliak,
Jaroslav Vítku,
Jan Feyereisl
Abstract:
In this work, we propose a novel memory-based multi-agent meta-learning architecture and learning procedure that allows for learning of a shared communication policy that enables the emergence of rapid adaptation to new and unseen environments by learning to learn learning algorithms through communication. Behavior, adaptation and learning to adapt emerge from the interactions of homogeneous experts inside a single agent. The proposed architecture should allow for generalization beyond the level seen in existing methods, in part due to the use of a single policy shared by all experts within the agent as well as the inherent modularity of 'Badger'.
Submitted 3 December, 2019;
originally announced December 2019.
-
Business Process Variant Analysis: Survey and Classification
Authors:
Farbod Taymouri,
Marcello La Rosa,
Marlon Dumas,
Fabrizio Maria Maggi
Abstract:
Process variant analysis aims at identifying and addressing the differences existing in a set of process executions enacted by the same process model. A process model can be executed differently in different situations for various reasons, e.g., the process could run in different locations or seasons, which gives rise to different behaviors. Having intuitions about the discrepancies in process behaviors, though challenging, is beneficial for managers and process analysts since they can improve their process models efficiently, e.g., via interactive learning or adapting mechanisms. Several methods have been proposed to tackle the problem of uncovering discrepancies in process executions. However, because of the interdisciplinary nature of the challenge, the methods and sorts of analysis in the literature are very heterogeneous. This article not only presents a systematic literature review and taxonomy of methods for variant analysis of business processes but also provides a methodology including the required steps to apply this type of analysis for the identification of variants in business process executions.
Submitted 22 December, 2019; v1 submitted 18 November, 2019;
originally announced November 2019.
-
Scalable Alignment of Process Models and Event Logs: An Approach Based on Automata and S-Components
Authors:
Daniel Reißner,
Abel Armas-Cervantes,
Raffaele Conforti,
Marlon Dumas,
Dirk Fahland,
Marcello La Rosa
Abstract:
Given a model of the expected behavior of a business process and an event log recording its observed behavior, the problem of business process conformance checking is that of identifying and describing the differences between the model and the log. A desirable feature of a conformance checking technique is to identify a minimal yet complete set of differences. Existing conformance checking techniques that fulfil this property exhibit limited scalability when confronted with large and complex models and logs. This paper presents two complementary techniques to address these shortcomings. The first technique transforms the model and log into two automata. These automata are compared using an error-correcting synchronized product, computed via an A* that guarantees the resulting automaton captures all differences with a minimal amount of error corrections. The synchronized product is used to extract minimal-length alignments between each trace of the log and the closest corresponding trace of the model. A limitation of the first technique is that as the level of concurrency in the model increases, the size of the automaton of the model grows exponentially, thus hampering scalability. To address this limitation, the paper proposes a second technique wherein the process model is first decomposed into a set of automata, known as S-components, such that the product of these automata is equal to the automaton of the whole process model. An error-correcting product is computed for each S-component separately and the resulting automata are recomposed into a single product automaton capturing all differences without minimality guarantees. An empirical evaluation shows that the proposed techniques outperform state-of-the-art baselines in terms of computational efficiency. Moreover, the decomposition-based technique is optimal for the vast majority of datasets and quasi-optimal for the remaining ones.
Submitted 4 March, 2020; v1 submitted 22 October, 2019;
originally announced October 2019.
-
Process Query Language: Design, Implementation, and Evaluation
Authors:
Artem Polyvyanyy,
Arthur H. M. ter Hofstede,
Marcello La Rosa,
Chun Ouyang,
Anastasiia Pika
Abstract:
Organizations can benefit from the use of practices, techniques, and tools from the area of business process management. Through the focus on processes, they create process models that require management, including support for versioning, refactoring and querying. Querying thus far has primarily focused on structural properties of models rather than on exploiting behavioral properties capturing aspects of model execution. While the latter is more challenging, it is also more effective, especially when models are used for auditing or process automation. The focus of this paper is to overcome the challenges associated with behavioral querying of process models in order to unlock its benefits. The first challenge concerns determining decidability of the building blocks of the query language, which are the possible behavioral relations between process tasks. The second challenge concerns achieving acceptable performance of query evaluation. The evaluation of a query may require expensive checks in all process models, of which there may be thousands. In light of these challenges, this paper proposes a special-purpose programming language, namely Process Query Language (PQL) for behavioral querying of process model collections. The language relies on a set of behavioral predicates between process tasks, whose usefulness has been empirically evaluated with a pool of process model stakeholders. This study resulted in a selection of the predicates to be implemented in PQL, whose decidability has also been formally proven. The computational performance of the language has been extensively evaluated through a set of experiments against two large process model collections.
Submitted 20 September, 2019;
originally announced September 2019.
-
ToyArchitecture: Unsupervised Learning of Interpretable Models of the World
Authors:
Jaroslav Vítků,
Petr Dluhoš,
Joseph Davidson,
Matěj Nikl,
Simon Andersson,
Přemysl Paška,
Jan Šinkora,
Petr Hlubuček,
Martin Stránský,
Martin Hyben,
Martin Poliak,
Jan Feyereisl,
Marek Rosa
Abstract:
Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks which are usually uncomputable, incompatible with theories of biological intelligence, or lack practical implementations. The goal of this work is to combine the main advantages of the two: to follow a big picture view, while providing a particular theory and its implementation. In contrast with purely theoretical approaches, the resulting architecture should be usable in realistic settings, but also form the core of a framework containing all the basic mechanisms, into which it should be easier to integrate additional required functionality.
In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture which combines multiple different mechanisms into one system: unsupervised learning of a model of the world, learning the influence of one's own actions on the world, model-based reinforcement learning, hierarchical planning and plan execution, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations with the following properties: 1) they are increasingly more abstract, but can retain details when needed, and 2) they are easy to manipulate in their local and symbolic-like form, thus also allowing one to observe the learning process at each level of abstraction. On all levels of the system, the representation of the data can be interpreted in both a symbolic and a sub-symbolic manner. This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference.
Submitted 9 September, 2020; v1 submitted 20 March, 2019;
originally announced March 2019.
-
Survey and cross-benchmark comparison of remaining time prediction methods in business process monitoring
Authors:
Ilya Verenich,
Marlon Dumas,
Marcello La Rosa,
Fabrizio Maggi,
Irene Teinemaa
Abstract:
Predictive business process monitoring methods exploit historical process execution logs to generate predictions about running instances (called cases) of a business process, such as the prediction of the outcome, next activity or remaining cycle time of a given process case. These insights could be used to support operational managers in taking remedial actions as business processes unfold, e.g. shifting resources from one case onto another to ensure this latter is completed on time. A number of methods to tackle the remaining cycle time prediction problem have been proposed in the literature. However, due to differences in their experimental setup, choice of datasets, evaluation measures and baselines, the relative merits of each method remain unclear. This article presents a systematic literature review and taxonomy of methods for remaining time prediction in the context of business processes, as well as a cross-benchmark comparison of 16 such methods based on 16 real-life datasets originating from different industry domains.
Submitted 10 May, 2018; v1 submitted 8 May, 2018;
originally announced May 2018.
-
Discovering Process Maps from Event Streams
Authors:
Volodymyr Leno,
Abel Armas-Cervantes,
Marlon Dumas,
Marcello La Rosa,
Fabrizio M. Maggi
Abstract:
Automated process discovery is a class of process mining methods that allow analysts to extract business process models from event logs. Traditional process discovery methods extract process models from a snapshot of an event log stored in its entirety. In some scenarios, however, events keep coming with a high arrival rate to the extent that it is impractical to store the entire event log and to continuously re-discover a process model from scratch. Such scenarios require online process discovery approaches. Given an event stream produced by the execution of a business process, the goal of an online process discovery method is to maintain a continuously updated model of the process with a bounded amount of memory while at the same time achieving similar accuracy as offline methods. However, existing online discovery approaches require relatively large amounts of memory to achieve levels of accuracy comparable to that of offline methods. Therefore, this paper proposes an approach that addresses this limitation by mapping the problem of online process discovery to that of cache memory management, and applying well-known cache replacement policies to the problem of online process discovery. The approach has been implemented in .NET, experimentally integrated with the Minit process mining tool and comparatively evaluated against an existing baseline using real-life datasets.
Submitted 8 April, 2018;
originally announced April 2018.
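The cache analogy in the abstract above can be illustrated with a small sketch. It is not the authors' .NET implementation: it assumes, purely for illustration, that the model is a set of directly-follows relations between activities, that events arrive as `(case_id, activity)` pairs, and that least-recently-used (LRU) eviction is the replacement policy. The names `LruStore` and `discover` and both capacity parameters are hypothetical.

```python
from collections import OrderedDict


class LruStore:
    """Bounded key -> value store that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key, default=None):
        if key in self.items:
            self.items.move_to_end(key)  # mark the key as recently used
            return self.items[key]
        return default

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used entry


def discover(stream, max_cases, max_relations):
    """Count directly-follows relations from an event stream in bounded memory.

    Assumption: two bounded stores are kept, one for the last activity seen
    per case and one for the relation counts, both with LRU eviction.
    """
    last_activity = LruStore(max_cases)
    relations = LruStore(max_relations)
    for case_id, activity in stream:
        previous = last_activity.get(case_id)
        if previous is not None:
            edge = (previous, activity)
            relations.put(edge, relations.get(edge, 0) + 1)
        last_activity.put(case_id, activity)
    return dict(relations.items)


# Toy stream with two interleaved cases (hypothetical data).
stream = [("c1", "A"), ("c2", "A"), ("c1", "B"),
          ("c2", "B"), ("c1", "C"), ("c2", "D")]
result = discover(stream, max_cases=2, max_relations=2)
```

With room for only two relations, the arrival of `("B", "D")` evicts `("A", "B")`, the relation touched least recently, even though it had the highest count. Which relations survive depends on the policy chosen, which is the kind of trade-off a comparison of replacement policies would explore.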
-
Outcome-Oriented Predictive Process Monitoring: Review and Benchmark
Authors:
Irene Teinemaa,
Marlon Dumas,
Marcello La Rosa,
Fabrizio Maria Maggi
Abstract:
Predictive business process monitoring refers to the act of making predictions about the future state of ongoing cases of a business process, based on their incomplete execution traces and logs of historical (completed) traces. Motivated by the increasingly pervasive availability of fine-grained event data about business process executions, the problem of predictive process monitoring has received substantial attention in the past years. In particular, a considerable number of methods have been put forward to address the problem of outcome-oriented predictive process monitoring, which refers to classifying each ongoing case of a process according to a given set of possible categorical outcomes - e.g., Will the customer complain or not? Will an order be delivered, canceled or withdrawn? Unfortunately, different authors have used different datasets, experimental settings, evaluation measures and baselines to assess their proposals, resulting in poor comparability and an unclear picture of the relative merits and applicability of different methods. To address this gap, this article presents a systematic review and taxonomy of outcome-oriented predictive process monitoring methods, and a comparative experimental evaluation of eleven representative methods using a benchmark covering 24 predictive process monitoring tasks based on nine real-life event logs.
Submitted 23 October, 2018; v1 submitted 21 July, 2017;
originally announced July 2017.
-
Automated Discovery of Process Models from Event Logs: Review and Benchmark
Authors:
Adriano Augusto,
Raffaele Conforti,
Marlon Dumas,
Marcello La Rosa,
Fabrizio Maria Maggi,
Andrea Marrella,
Massimo Mecella,
Allar Soo
Abstract:
Process mining allows analysts to exploit logs of historical executions of business processes to extract insights regarding the actual performance of these processes. One of the most widely studied process mining operations is automated process discovery. An automated process discovery method takes as input an event log, and produces as output a business process model that captures the control-flow relations between tasks that are observed in or implied by the event log. Various automated process discovery methods have been proposed in the past two decades, striking different tradeoffs between scalability, accuracy and complexity of the resulting models. However, these methods have been evaluated in an ad-hoc manner, employing different datasets, experimental setups, evaluation measures and baselines, often leading to incomparable conclusions and sometimes unreproducible results due to the use of closed datasets. This article provides a systematic review and comparative evaluation of automated process discovery methods, using an open-source benchmark and covering twelve publicly-available real-life event logs, twelve proprietary real-life event logs, and nine quality metrics. The results highlight gaps and unexplored tradeoffs in the field, including the lack of scalability of some methods and a strong divergence in their performance with respect to the different quality metrics used.
Submitted 29 January, 2018; v1 submitted 5 May, 2017;
originally announced May 2017.
-
Blockchains for Business Process Management - Challenges and Opportunities
Authors:
Jan Mendling,
Ingo Weber,
Wil van der Aalst,
Jan vom Brocke,
Cristina Cabanillas,
Florian Daniel,
Soren Debois,
Claudio Di Ciccio,
Marlon Dumas,
Schahram Dustdar,
Avigdor Gal,
Luciano Garcia-Banuelos,
Guido Governatori,
Richard Hull,
Marcello La Rosa,
Henrik Leopold,
Frank Leymann,
Jan Recker,
Manfred Reichert,
Hajo A. Reijers,
Stefanie Rinderle-Ma,
Andreas Rogge-Solti,
Michael Rosemann,
Stefan Schulte,
Munindar P. Singh
, et al. (7 additional authors not shown)
Abstract:
Blockchain technology promises a sizable potential for executing inter-organizational business processes without requiring a central party serving as a single point of trust (and failure). This paper analyzes its impact on business process management (BPM). We structure the discussion using two BPM frameworks, namely the six BPM core capabilities and the BPM lifecycle. This paper provides research directions for investigating the application of blockchain technology to BPM.
Submitted 31 January, 2018; v1 submitted 11 April, 2017;
originally announced April 2017.
-
Predictive Business Process Monitoring with LSTM Neural Networks
Authors:
Niek Tax,
Ilya Verenich,
Marcello La Rosa,
Marlon Dumas
Abstract:
Predictive business process monitoring methods exploit logs of completed cases of a process in order to make predictions about running cases thereof. Existing methods in this space are tailor-made for specific prediction tasks. Moreover, their relative accuracy is highly sensitive to the dataset at hand, thus requiring users to engage in trial-and-error and tuning when applying them in a specific setting. This paper investigates Long Short-Term Memory (LSTM) neural networks as an approach to build consistently accurate models for a wide range of predictive process monitoring tasks. First, we show that LSTMs outperform existing techniques to predict the next event of a running case and its timestamp. Next, we show how to use models for predicting the next task in order to predict the full continuation of a running case. Finally, we apply the same approach to predict the remaining time, and show that this approach outperforms existing tailor-made methods.
Submitted 16 May, 2017; v1 submitted 7 December, 2016;
originally announced December 2016.
-
A Framework for Searching for General Artificial Intelligence
Authors:
Marek Rosa,
Jan Feyereisl,
The GoodAI Collective
Abstract:
There is a significant lack of unified approaches to building generally intelligent machines. The majority of current artificial intelligence research operates within a very narrow field of focus, frequently without considering the importance of the 'big picture'. In this document, we seek to describe and unify principles that guide the basis of our development of general artificial intelligence. These principles revolve around the idea that intelligence is a tool for searching for general solutions to problems. We define intelligence as the ability to acquire skills that narrow this search, diversify it and help steer it to more promising areas. We also provide suggestions for studying, measuring, and testing the various skills and abilities that a human-level intelligent machine needs to acquire. The document aims to be both implementation agnostic, and to provide an analytic, systematic, and scalable way to generate hypotheses that we believe are needed to meet the necessary conditions in the search for general artificial intelligence. We believe that such a framework is an important stepping stone for bringing together definitions, highlighting open problems, connecting researchers willing to collaborate, and for unifying the arguably most significant search of this century.
Submitted 2 November, 2016;
originally announced November 2016.
-
Business Process Deviance Mining: Review and Evaluation
Authors:
Hoang Nguyen,
Marlon Dumas,
Marcello La Rosa,
Fabrizio Maria Maggi,
Suriadi Suriadi
Abstract:
Business process deviance refers to the phenomenon whereby a subset of the executions of a business process deviate, in a negative or positive way, with respect to its expected or desirable outcomes. Deviant executions of a business process include those that violate compliance rules, or executions that undershoot or exceed performance targets. Deviance mining is concerned with uncovering the reasons for deviant executions by analyzing business process event logs. This article provides a systematic review and comparative evaluation of deviance mining approaches based on a family of data mining techniques known as sequence classification. Using real-life logs from multiple domains, we evaluate a range of feature types and classification methods in terms of their ability to accurately discriminate between normal and deviant executions of a process. We also analyze the interestingness of the rule sets extracted using different methods. We observe that feature sets extracted using pattern mining techniques only slightly outperform simpler feature sets based on counts of individual activity occurrences in a trace.
Submitted 29 August, 2016;
originally announced August 2016.
-
Four Degrees of Separation
Authors:
Lars Backstrom,
Paolo Boldi,
Marco Rosa,
Johan Ugander,
Sebastiano Vigna
Abstract:
Frigyes Karinthy, in his 1929 short story "Láncszemek" ("Chains") suggested that any two persons are distanced by at most six friendship links. (The exact wording of the story is slightly ambiguous: "He bet us that, using no more than five individuals, one of whom is a personal acquaintance, he could contact the selected individual [...]". It is not completely clear whether the selected individual is part of the five, so this could actually allude to distance five or six in the language of graph theory, but the "six degrees of separation" phrase stuck after John Guare's 1990 eponymous play. Following Milgram's definition and Guare's interpretation, we will assume that "degrees of separation" is the same as "distance minus one", where "distance" is the usual path length, that is, the number of arcs in the path.) Stanley Milgram in his famous experiment challenged people to route postcards to a fixed recipient by passing them only through direct acquaintances. The average number of intermediaries on the path of the postcards lay between 4.4 and 5.7, depending on the sample of people chosen.
We report the results of the first world-scale social-network graph-distance computations, using the entire Facebook network of active users (≈721 million users, ≈69 billion friendship links). The average distance we observe is 4.74, corresponding to 3.74 intermediaries or "degrees of separation", showing that the world is even smaller than we expected, and prompting the title of this paper. More generally, we study the distance distribution of Facebook and of some interesting geographic subgraphs, looking also at their evolution over time.
The networks we are able to explore are almost two orders of magnitude larger than those analysed in the previous literature. We report detailed statistical metadata showing that our measurements (which rely on probabilistic algorithms) are very accurate.
Submitted 5 January, 2012; v1 submitted 19 November, 2011;
originally announced November 2011.
-
Robustness of Social Networks: Comparative Results Based on Distance Distributions
Authors:
Paolo Boldi,
Marco Rosa,
Sebastiano Vigna
Abstract:
Given a social network, which of its nodes have a stronger impact in determining its structure? More formally: which node-removal order has the greatest impact on the network structure? We approach this well-known problem for the first time in a setting that combines both web graphs and social networks, using datasets that are orders of magnitude larger than those appearing in the previous literature, thanks to some recently developed algorithms and software tools that make it possible to approximate accurately the number of reachable pairs and the distribution of distances in a graph. Our experiments highlight deep differences in the structure of social networks and web graphs, show significant limitations of previous experimental results, and at the same time reveal clustering by label propagation as a new and very effective way of locating nodes that are important from a structural viewpoint.
Submitted 20 October, 2011;
originally announced October 2011.
-
HyperANF: Approximating the Neighbourhood Function of Very Large Graphs on a Budget
Authors:
Paolo Boldi,
Marco Rosa,
Sebastiano Vigna
Abstract:
The neighbourhood function N_G(t) of a graph G gives, for each t, the number of pairs of nodes <x, y> such that y is reachable from x in less than t hops. The neighbourhood function provides a wealth of information about the graph (e.g., it easily allows one to compute its diameter), but it is very expensive to compute it exactly. Recently, the ANF algorithm (approximate neighbourhood function) has been proposed with the purpose of approximating N_G(t) on large graphs. We describe a breakthrough improvement over ANF in terms of speed and scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters and combines them efficiently through broadword programming; our implementation uses overdecomposition to exploit multi-core parallelism. With HyperANF, for the first time we can compute in a few hours the neighbourhood function of graphs with billions of nodes with a small error and good confidence using a standard workstation. Then, we turn to the study of the distribution of the shortest paths between reachable nodes (that can be efficiently approximated by means of HyperANF), and discover the surprising fact that its index of dispersion provides a clear-cut characterisation of proper social networks vs. web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a graph as a new, informative statistic that is able to discriminate between the above two types of graphs. We believe this is the first proposal of a significant new non-local structural index for complex networks whose computation is highly scalable.
Submitted 26 January, 2011; v1 submitted 25 November, 2010;
originally announced November 2010.
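To make the spid concrete, the sketch below computes the shortest-path distance distribution of a toy graph exactly, using breadth-first search from every node, and then takes its index of dispersion (the variance-to-mean ratio). This is the brute-force computation that HyperANF is designed to approximate, not HyperANF itself: no HyperLogLog counters are involved. The function names and the toy graph are hypothetical, and counting only ordered pairs of distinct reachable nodes is an assumption of this example.

```python
from collections import Counter, deque


def distance_histogram(adj):
    """Count ordered pairs (x, y), x != y, by shortest-path length.

    Runs one breadth-first search per node, so it is only practical
    for small graphs.
    """
    hist = Counter()
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    hist[dist[nxt]] += 1
                    queue.append(nxt)
    return hist


def mean_and_spid(hist):
    """Return the mean distance and the variance-to-mean ratio (spid)."""
    total = sum(hist.values())
    mean = sum(d * c for d, c in hist.items()) / total
    variance = sum(c * (d - mean) ** 2 for d, c in hist.items()) / total
    return mean, variance / mean


# Toy undirected path graph 0 - 1 - 2 - 3, stored as adjacency lists.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
hist = distance_histogram(path)
mean, spid = mean_and_spid(hist)
```

On this four-node path the histogram has 6 pairs at distance 1, 4 at distance 2 and 2 at distance 3, giving a mean distance of 5/3 and a spid of 1/3.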
-
Layered Label Propagation: A MultiResolution Coordinate-Free Ordering for Compressing Social Networks
Authors:
Paolo Boldi,
Marco Rosa,
Massimo Santini,
Sebastiano Vigna
Abstract:
We continue the line of research on graph compression started with WebGraph, but we move our focus to the compression of social networks in a proper sense (e.g., LiveJournal): the approaches that have been used for a long time to compress web graphs rely on a specific ordering of the nodes (lexicographical URL ordering) whose extension to general social networks is not trivial. In this paper, we propose a solution that mixes clusterings and orders, and devise a new algorithm, called Layered Label Propagation, that builds on previous work on scalable clustering and can be used to reorder very large graphs (billions of nodes). Our implementation uses overdecomposition to perform aggressively on multi-core architectures, making it possible to reorder graphs of more than 600 million nodes in a few hours. Experiments performed on a wide array of web graphs and social networks show that combining the order produced by the proposed algorithm with the WebGraph compression framework provides a major increase in compression with respect to all currently known techniques, both on web graphs and on social networks. These improvements make it possible to analyse in main memory significantly larger graphs.
Submitted 14 October, 2011; v1 submitted 24 November, 2010;
originally announced November 2010.