-
Literal-Aware Knowledge Graph Embedding for Welding Quality Monitoring: A Bosch Case
Authors:
Zhipeng Tan,
Baifan Zhou,
Zhuoxun Zheng,
Ognjen Savkovic,
Ziqi Huang,
Irlan-Grangel Gonzalez,
Ahmet Soylu,
Evgeny Kharlamov
Abstract:
Recently there has been a series of studies in knowledge graph embedding (KGE), which attempts to learn the embeddings of the entities and relations as numerical vectors and mathematical mappings via machine learning (ML). However, there has been limited research that applies KGE to industrial problems in manufacturing. This paper investigates whether and to what extent KGE can be used for an important problem: quality monitoring for welding in the manufacturing industry, an impactful process that accounts for the production of millions of cars annually. The work is in line with Bosch research on data-driven solutions that aim to replace the traditional approach of destroying cars, which is extremely costly and produces waste. The paper tackles two very challenging questions simultaneously: how large the welding spot diameter is, and to which car body the welded spot belongs. The problem setting is difficult for traditional ML because there is a large number of car bodies that must be assigned as class labels. We formulate the problem as link prediction and experiment with popular KGE methods on real industry data, with consideration of literals. Our results reveal both limitations and promising aspects of adapted KGE methods.
Submitted 2 August, 2023;
originally announced August 2023.
-
Scaling Data Science Solutions with Semantics and Machine Learning: Bosch Case
Authors:
Baifan Zhou,
Nikolay Nikolov,
Zhuoxun Zheng,
Xianghui Luo,
Ognjen Savkovic,
Dumitru Roman,
Ahmet Soylu,
Evgeny Kharlamov
Abstract:
Industry 4.0 and Internet of Things (IoT) technologies unlock an unprecedented amount of data from factory production, posing big data challenges in volume and variety. In that context, distributed computing solutions such as cloud systems are leveraged to parallelise the data processing and reduce computation time. As cloud systems become increasingly popular, there is increased demand for more users who were not originally cloud experts (such as data scientists and domain experts) to deploy their solutions on cloud systems. However, it is non-trivial to address both the high demand for cloud system users and the excessive time required to train them. To this end, we propose SemCloud, a semantics-enhanced cloud system that couples cloud systems with semantic technologies and machine learning. SemCloud relies on domain ontologies and mappings for data integration, and parallelises the semantic data integration and data analysis on distributed computing nodes. Furthermore, SemCloud adopts adaptive Datalog rules and machine learning for automated resource configuration, allowing non-cloud experts to use the cloud system. The system has been evaluated in an industrial use case with millions of data, thousands of repeated runs, and domain users, showing promising results.
Submitted 2 August, 2023;
originally announced August 2023.
-
PG-Schema: Schemas for Property Graphs
Authors:
Renzo Angles,
Angela Bonifati,
Stefania Dumbrava,
George Fletcher,
Alastair Green,
Jan Hidders,
Bei Li,
Leonid Libkin,
Victor Marsault,
Wim Martens,
Filip Murlak,
Stefan Plantikow,
Ognjen Savković,
Michael Schmidt,
Juan Sequeda,
Sławek Staworko,
Dominik Tomaszuk,
Hannes Voigt,
Domagoj Vrgoč,
Mingxi Wu,
Dušan Živković
Abstract:
Property graphs have reached a high level of maturity, witnessed by multiple robust graph database systems as well as the ongoing ISO standardization effort aiming at creating a new standard Graph Query Language (GQL). Yet, despite documented demand, schema support is limited both in existing systems and in the first version of the GQL Standard. It is anticipated that the second version of the GQL Standard will include a rich DDL. Aiming to inspire the development of GQL and enhance the capabilities of graph database systems, we propose PG-Schema, a simple yet powerful formalism for specifying property graph schemas. It features PG-Types with flexible type definitions supporting multi-inheritance, as well as expressive constraints based on the recently proposed PG-Keys formalism. We provide the formal syntax and semantics of PG-Schema, which meet principled design requirements grounded in contemporary property graph management scenarios, and offer a detailed comparison of its features with those of existing schema languages and graph database systems.
Submitted 8 July, 2023; v1 submitted 20 November, 2022;
originally announced November 2022.
-
Query Stability in Monotonic Data-Aware Business Processes [Extended Version]
Authors:
Ognjen Savkovic,
Elisa Marengo,
Werner Nutt
Abstract:
Organizations continuously accumulate data, often according to some business processes. If one poses a query over such data for decision support, it is important to know whether the query is stable, that is, whether the answers will stay the same or may change in the future because business processes may add further data. We investigate query stability for conjunctive queries. To this end, we define a formalism that combines an explicit representation of the control flow of a process with a specification of how data is read and inserted into the database. We consider different restrictions of the process model and the state of the system, such as negation in conditions, cyclic executions, read access to written data, presence of pending process instances, and the possibility to start fresh process instances. We identify for which facet combinations stability of conjunctive queries is decidable and provide encodings into variants of Datalog that are optimal with respect to the worst-case complexity of the problem.
Submitted 21 December, 2015;
originally announced December 2015.