-
From Data Complexity to User Simplicity: A Framework for Linked Open Data Reconciliation and Serendipitous Discovery
Authors:
Marco Grasso,
Giulia Renda,
Marilena Daquino
Abstract:
This article introduces a novel software solution to create a Web portal to align Linked Open Data sources and provide user-friendly interfaces for serendipitous discovery. We present the Polifonia Web portal as a motivating scenario and case study to address research problems such as data reconciliation and serving generous interfaces in the music heritage domain.
Submitted 24 May, 2024;
originally announced May 2024.
-
A System Development Kit for Big Data Applications on FPGA-based Clusters: The EVEREST Approach
Authors:
Christian Pilato,
Subhadeep Banik,
Jakub Beranek,
Fabien Brocheton,
Jeronimo Castrillon,
Riccardo Cevasco,
Radim Cmar,
Serena Curzel,
Fabrizio Ferrandi,
Karl F. A. Friebel,
Antonella Galizia,
Matteo Grasso,
Paulo Silva,
Jan Martinovic,
Gianluca Palermo,
Michele Paolino,
Andrea Parodi,
Antonio Parodi,
Fabio Pintus,
Raphael Polig,
David Poulet,
Francesco Regazzoni,
Burkhard Ringlein,
Roberto Rocco,
Katerina Slaninova,
et al. (6 additional authors not shown)
Abstract:
Modern big data workflows are characterized by computationally intensive kernels. The simulated results are often combined with knowledge extracted from AI models to ultimately support decision-making. These energy-hungry workflows are increasingly executed in data centers with energy-efficient hardware accelerators since FPGAs are well-suited for this task due to their inherent parallelism. We present the H2020 project EVEREST, which has developed a system development kit (SDK) to simplify the creation of FPGA-accelerated kernels and manage the execution at runtime through a virtualization environment. This paper describes the main components of the EVEREST SDK and the benefits that can be achieved in our use cases.
Submitted 19 February, 2024;
originally announced February 2024.
-
From ontology design to user-centred interfaces for music heritage
Authors:
Giulia Renda,
Marco Grasso,
Marilena Daquino
Abstract:
In this article we investigate the bridge between ontology design and UI/UX design methodologies to assist designers in prototyping web applications for information seeking purposes. We briefly review the state of the art in ontology design and UI/UX methodologies, then we illustrate our approach applied to a case study in the music heritage domain.
Submitted 22 June, 2023;
originally announced June 2023.
-
Disambiguation of Company names via Deep Recurrent Networks
Authors:
Alessandro Basile,
Riccardo Crupi,
Michele Grasso,
Alessandro Mercanti,
Daniele Regoli,
Simone Scarsi,
Shuyi Yang,
Andrea Cosentini
Abstract:
Named Entity Disambiguation is the Natural Language Processing task of identifying textual records corresponding to the same Named Entity, i.e. real-world entities represented as a list of attributes (names, places, organisations, etc.). In this work, we address the task of disambiguating companies on the basis of their written names. We propose a Siamese LSTM Network approach to extract -- via supervised learning -- an embedding of company name strings in a (relatively) low-dimensional vector space and use this representation to identify pairs of company names that actually represent the same company (i.e. the same Entity).
Given that the manual labelling of string pairs is a rather onerous task, we analyse how an Active Learning approach to prioritise the samples to be labelled leads to a more efficient overall learning pipeline.
With empirical investigations, we show that our proposed Siamese Network outperforms several benchmark approaches based on standard string matching algorithms when enough labelled data are available. Moreover, we show that Active Learning prioritisation is indeed helpful when labelling resources are limited, and lets the learning models reach out-of-sample performance saturation with less labelled data than standard (random) data labelling approaches.
Submitted 15 April, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Real-time Detection of Clustered Events in Video-imaging data with Applications to Additive Manufacturing
Authors:
Hao Yan,
Marco Grasso,
Kamran Paynabar,
Bianca Maria Colosimo
Abstract:
The use of video-imaging data for in-line process monitoring applications has become increasingly popular in industry. In this framework, spatio-temporal statistical process monitoring methods are needed to capture the relevant information content and signal possible out-of-control states. Video-imaging data are characterized by a spatio-temporal variability structure that depends on the underlying phenomenon, and typical out-of-control patterns are related to events that are localized both in time and space. In this paper, we propose an integrated spatio-temporal decomposition and regression approach for anomaly detection in video-imaging data. Out-of-control events are typically sparse, spatially clustered, and temporally consistent. Therefore, the goal is not only to detect the anomaly as quickly as possible ("when") but also to locate it ("where"). The proposed approach works by decomposing the original spatio-temporal data into random natural events, sparse, spatially clustered, and temporally consistent anomalous events, and random noise. Recursive estimation procedures for spatio-temporal regression are presented to enable the real-time implementation of the proposed methodology. Finally, a likelihood ratio test procedure is proposed to detect when and where the hotspot happens. The proposed approach was applied to the analysis of video-imaging data to detect and locate local over-heating phenomena ("hotspots") during the layer-wise process in metal additive manufacturing.
Submitted 23 April, 2020;
originally announced April 2020.
-
Hybrid Mortality Prediction using Multiple Source Systems
Authors:
Isaac Mativo,
Yelena Yesha,
Michael Grasso,
Tim Oates,
Qian Zhu
Abstract:
The use of artificial intelligence in clinical care to improve decision support systems is increasing. This is not surprising since, by its very nature, the practice of medicine consists of making decisions based on observations from different systems both inside and outside the human body. In this paper, we combine three general systems (ICU, diabetes, and comorbidities) and use them to make patient clinical predictions. We use an artificial intelligence approach to show that we can improve mortality prediction of hospitalized diabetic patients. We do this by utilizing a machine learning approach to select clinical input features that are more likely to predict mortality. We then use these features to create a hybrid mortality prediction model and compare our results to non-artificial intelligence models. For simplicity, we limit our input features to patient comorbidities and features derived from a well-known mortality measure, the Sequential Organ Failure Assessment (SOFA).
Submitted 17 April, 2019;
originally announced May 2019.
-
LTLf and LDLf Monitoring: A Technical Report
Authors:
Giuseppe De Giacomo,
Riccardo De Masellis,
Marco Grasso,
Fabrizio Maggi,
Marco Montali
Abstract:
Runtime monitoring is one of the central tasks to provide operational decision support to running business processes, and check on-the-fly whether they comply with constraints and rules. We study runtime monitoring of properties expressed in LTL on finite traces (LTLf) and in its extension LDLf. LDLf is a powerful logic that captures all monadic second order logic on finite traces, which is obtained by combining regular expressions and LTLf, adopting the syntax of propositional dynamic logic (PDL). Interestingly, in spite of its greater expressivity, LDLf has exactly the same computational complexity as LTLf. We show that LDLf is able to capture, in the logic itself, not only the constraints to be monitored, but also the de-facto standard RV-LTL monitors. This makes it possible to declaratively capture monitoring metaconstraints, and check them by relying on usual logical services instead of ad-hoc algorithms. This, in turn, makes it possible to flexibly monitor constraints depending on the monitoring state of other constraints, e.g., "compensation" constraints that are only checked when others are detected to be violated. In addition, we devise a direct translation of LDLf formulas into nondeterministic automata, avoiding a detour through Büchi automata or alternating automata, and we use it to implement a monitoring plug-in for the PROM suite.
Submitted 30 April, 2014;
originally announced May 2014.