-
Situated Ground Truths: Enhancing Bias-Aware AI by Situating Data Labels with SituAnnotate
Authors:
Delfina Sol Martinez Pandiani,
Valentina Presutti
Abstract:
In the contemporary world of AI and data-driven applications, supervised machines often derive their understanding, which they mimic and reproduce, through annotations--typically conveyed in the form of words or labels. However, such annotations are often divorced from or lack contextual information, and as such hold the potential to inadvertently introduce biases when subsequently used for training. This paper introduces SituAnnotate, a novel ontology explicitly crafted for 'situated grounding,' aiming to anchor the ground truth data employed in training AI systems within the contextual and culturally-bound situations from which those ground truths emerge. SituAnnotate offers an ontology-based approach to structured and context-aware data annotation, addressing potential bias issues associated with isolated annotations. Its representational power encompasses situational context, including annotator details, timing, location, remuneration schemes, annotation roles, and more, ensuring semantic richness. Aligned with the foundational Dolce Ultralight ontology, it provides a robust and consistent framework for knowledge representation. As a method to create, query, and compare label-based datasets, SituAnnotate empowers downstream AI systems to undergo training with explicit consideration of context and cultural bias, laying the groundwork for enhanced system interpretability and adaptability, and enabling AI models to align with a multitude of cultural contexts and viewpoints.
Submitted 10 June, 2024;
originally announced June 2024.
-
Stitching Gaps: Fusing Situated Perceptual Knowledge with Vision Transformers for High-Level Image Classification
Authors:
Delfina Sol Martinez Pandiani,
Nicolas Lazzari,
Valentina Presutti
Abstract:
The increasing demand for automatic high-level image understanding, particularly in detecting abstract concepts (AC) within images, underscores the necessity for innovative and more interpretable approaches. These approaches need to harmonize traditional deep vision methods with the nuanced, context-dependent knowledge humans employ to interpret images at intricate semantic levels. In this work, we leverage situated perceptual knowledge of cultural images to enhance performance and interpretability in AC image classification. We automatically extract perceptual semantic units from images, which we then model and integrate into the ARTstract Knowledge Graph (AKG). This resource captures situated perceptual semantics gleaned from over 14,000 cultural images labeled with ACs. Additionally, we enhance the AKG with high-level linguistic frames. We compute KG embeddings and experiment with relative representations and hybrid approaches that fuse these embeddings with visual transformer embeddings. Finally, for interpretability, we conduct posthoc qualitative analyses by examining model similarities with training instances. Our results show that our hybrid KGE-ViT methods outperform existing techniques in AC image classification. The posthoc interpretability analyses reveal the visual transformer's proficiency in capturing pixel-level visual attributes, contrasting with our method's efficacy in representing more abstract and semantic scene elements. We demonstrate the synergy and complementarity between KGE embeddings' situated perceptual knowledge and deep visual model's sensory-perceptual understanding for AC image classification. This work suggests a strong potential of neuro-symbolic methods for knowledge integration and robust image representation for use in downstream intricate visual comprehension tasks. All the materials and code are available online.
Submitted 29 February, 2024;
originally announced February 2024.
-
Sandra -- A Neuro-Symbolic Reasoner Based On Descriptions And Situations
Authors:
Nicolas Lazzari,
Stefano De Giorgis,
Aldo Gangemi,
Valentina Presutti
Abstract:
This paper presents sandra, a neuro-symbolic reasoner combining vectorial representations with deductive reasoning. Sandra builds a vector space constrained by an ontology and performs reasoning over it. The geometric nature of the reasoner allows its combination with neural networks, bridging the gap with symbolic knowledge representations. Sandra is based on the Description and Situation (DnS) ontology design pattern, a formalization of frame semantics. Given a set of facts (a situation) it allows to infer all possible perspectives (descriptions) that can provide a plausible interpretation for it, even in presence of incomplete information. We prove that our method is correct with respect to the DnS model. We experiment with two different tasks and their standard benchmarks, demonstrating that, without increasing complexity, sandra (i) outperforms all the baselines (ii) provides interpretability in the classification process, and (iii) allows control over the vector space, which is designed a priori.
Submitted 25 March, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
The Music Meta Ontology: a flexible semantic model for the interoperability of music metadata
Authors:
Jacopo de Berardinis,
Valentina Anita Carriero,
Albert Meroño-Peñuela,
Andrea Poltronieri,
Valentina Presutti
Abstract:
The semantic description of music metadata is a key requirement for the creation of music datasets that can be aligned, integrated, and accessed for information retrieval and knowledge discovery. It is nonetheless an open challenge due to the complexity of musical concepts arising from different genres, styles, and periods -- standing to benefit from a lingua franca to accommodate various stakeholders (musicologists, librarians, data engineers, etc.). To initiate this transition, we introduce the Music Meta ontology, a rich and flexible semantic model to describe music metadata related to artists, compositions, performances, recordings, and links. We follow eXtreme Design methodologies and best practices for data engineering, to reflect the perspectives and the requirements of various stakeholders into the design of the model, while leveraging ontology design patterns and accounting for provenance at different levels (claims, links). After presenting the main features of Music Meta, we provide a first evaluation of the model, alignments to other schema (Music Ontology, DOREMUS, Wikidata), and support for data transformation.
Submitted 7 November, 2023;
originally announced November 2023.
-
Seeing the Intangible: Survey of Image Classification into High-Level and Abstract Categories
Authors:
Delfina Sol Martinez Pandiani,
Valentina Presutti
Abstract:
The field of Computer Vision (CV) is increasingly shifting towards "high-level" visual sensemaking tasks, yet the exact nature of these tasks remains unclear and tacit. This survey paper addresses this ambiguity by systematically reviewing research on high-level visual understanding, focusing particularly on Abstract Concepts (ACs) in automatic image classification. Our survey contributes in three main ways: Firstly, it clarifies the tacit understanding of high-level semantics in CV through a multidisciplinary analysis, and categorization into distinct clusters, including commonsense, emotional, aesthetic, and inductive interpretative semantics. Secondly, it identifies and categorizes computer vision tasks associated with high-level visual sensemaking, offering insights into the diverse research areas within this domain. Lastly, it examines how abstract concepts such as values and ideologies are handled in CV, revealing challenges and opportunities in AC-based image classification. Notably, our survey of AC image classification tasks highlights persistent challenges, such as the limited efficacy of massive datasets and the importance of integrating supplementary information and mid-level features. We emphasize the growing relevance of hybrid AI systems in addressing the multifaceted nature of AC image classification tasks. Overall, this survey enhances our understanding of high-level visual reasoning in CV and lays the groundwork for future research endeavors.
Submitted 29 February, 2024; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Melody: A Platform for Linked Open Data Visualisation and Curated Storytelling
Authors:
Giulia Renda,
Marilena Daquino,
Valentina Presutti
Abstract:
Data visualisation and storytelling techniques help experts highlight relations between data and share complex information with a broad audience. However, existing solutions targeted to Linked Open Data visualisation have several restrictions and lack the narrative element. In this article we present MELODY, a web interface for authoring data stories based on Linked Open Data. MELODY has been designed using a novel methodology that harmonises existing Ontology Design and User Experience methodologies (eXtreme Design and Design Thinking), and provides reusable User Interface components to create and publish web-ready article-alike documents based on data retrievable from any SPARQL endpoint. We evaluate the software by comparing it with existing solutions, and we show its potential impact in projects where data dissemination is crucial.
Submitted 26 June, 2023;
originally announced June 2023.
-
Classifying sequences by combining context-free grammars and OWL ontologies
Authors:
Nicolas Lazzari,
Andrea Poltronieri,
Valentina Presutti
Abstract:
This paper describes a pattern to formalise context-free grammars in OWL and its use for sequence classification. The proposed approach is compared to existing methods in terms of computational complexity as well as pragmatic applicability, with examples in the music domain.
Submitted 6 April, 2023;
originally announced April 2023.
-
The Music Annotation Pattern
Authors:
Jacopo de Berardinis,
Albert Meroño-Peñuela,
Andrea Poltronieri,
Valentina Presutti
Abstract:
The annotation of music content is a complex process to represent due to its inherently multifaceted, subjective, and interdisciplinary nature. Numerous systems and conventions for annotating music have been developed as independent standards over the past decades. Little has been done to make them interoperable, which jeopardises cross-corpora studies as it requires users to familiarise themselves with a multitude of conventions. Most of these systems lack the semantic expressiveness needed to represent the complexity of the musical language and cannot model multi-modal annotations originating from audio and symbolic sources. In this article, we introduce the Music Annotation Pattern, an Ontology Design Pattern (ODP) to homogenise different annotation systems and to represent several types of musical objects (e.g. chords, patterns, structures). This ODP preserves the semantics of the object's content at different levels and temporal granularity. Moreover, our ODP accounts for multi-modality upfront, to describe annotations derived from different sources, and it is the first to enable the integration of music datasets at a large scale.
Submitted 30 March, 2023;
originally announced April 2023.
-
Pitchclass2vec: Symbolic Music Structure Segmentation with Chord Embeddings
Authors:
Nicolas Lazzari,
Andrea Poltronieri,
Valentina Presutti
Abstract:
Structure perception is a fundamental aspect of music cognition in humans. Historically, the hierarchical organization of music into structures served as a narrative device for conveying meaning, creating expectancy, and evoking emotions in the listener. Thereby, musical structures play an essential role in music composition, as they shape the musical discourse through which the composer organises his ideas. In this paper, we present a novel music segmentation method, pitchclass2vec, based on symbolic chord annotations, which are embedded into continuous vector representations using both natural language processing techniques and custom-made encodings. Our algorithm is based on long-short term memory (LSTM) neural network and outperforms the state-of-the-art techniques based on symbolic chord annotations in the field.
Submitted 24 March, 2023;
originally announced March 2023.
-
Automatic Modeling of Social Concepts Evoked by Art Images as Multimodal Frames
Authors:
Delfina Sol Martinez Pandiani,
Valentina Presutti
Abstract:
Social concepts referring to non-physical objects--such as revolution, violence, or friendship--are powerful tools to describe, index, and query the content of visual data, including ever-growing collections of art images from the Cultural Heritage (CH) field. While much progress has been made towards complete image understanding in computer vision, automatic detection of social concepts evoked by images is still a challenge. This is partly due to the well-known semantic gap problem, worsened for social concepts given their lack of unique physical features, and reliance on more unspecific features than concrete concepts. In this paper, we propose the translation of recent cognitive theories about social concept representation into a software approach to represent them as multimodal frames, by integrating multisensory data. Our method focuses on the extraction, analysis, and integration of multimodal features from visual art material tagged with the concepts of interest. We define a conceptual model and present a novel ontology for formally representing social concepts as multimodal frames. Taking the Tate Gallery's collection as an empirical basis, we experiment our method on a corpus of art images to provide a proof of concept of its potential. We discuss further directions of research, and provide all software, data sources, and results.
Submitted 14 October, 2021;
originally announced October 2021.
-
Pattern-based Visualization of Knowledge Graphs
Authors:
Luigi Asprino,
Christian Colonna,
Misael Mongiovì,
Margherita Porena,
Valentina Presutti
Abstract:
We present a novel approach to knowledge graph visualization based on ontology design patterns. This approach relies on OPLa (Ontology Pattern Language) annotations and on a catalogue of visual frames, which are associated with foundational ontology design patterns. We demonstrate that this approach significantly reduces the cognitive load required to users for visualizing and interpreting a knowledge graph and guides the user in exploring it through meaningful thematic paths provided by ontology patterns.
Submitted 24 June, 2021;
originally announced June 2021.
-
Extraction of common conceptual components from multiple ontologies
Authors:
Luigi Asprino,
Valentina Anita Carriero,
Valentina Presutti
Abstract:
Understanding large ontologies is still an issue, and has an impact on many ontology engineering tasks. We describe a novel method for identifying and extracting conceptual components from domain ontologies, which are used to understand and compare them. The method is applied to two corpora of ontologies in the Cultural Heritage and Conference domain, respectively. The results, which show good quality, are evaluated by manual inspection and by correlation with datasets and tool performance from the ontology alignment evaluation initiative.
Submitted 4 November, 2021; v1 submitted 24 June, 2021;
originally announced June 2021.
-
An Ontology Design Pattern for representing Recurrent Situations
Authors:
Valentina Anita Carriero,
Aldo Gangemi,
Andrea Giovanni Nuzzolese,
Valentina Presutti
Abstract:
In this paper, we present an Ontology Design Pattern for representing situations that recur at regular periods and share some invariant factors, which unify them conceptually: we refer to this set of recurring situations as recurrent situation series. The proposed pattern appears to be foundational, since it can be generalised for modelling the top-level domain-independent concept of recurrence, which is strictly associated with invariance. The pattern reuses other foundational patterns such as Collection, Description and Situation, Classification, Sequence. Indeed, a recurrent situation series is formalised as both a collection of situations occurring regularly over time and unified according to some properties that are common to all the members, and a situation itself, which provides a relational context to its members that satisfy a reference description. Besides including some exemplifying instances of this pattern, we show how it has been implemented and specialised to model recurrent cultural events and ceremonies in ArCo, the Knowledge Graph of Italian cultural heritage.
Submitted 1 January, 2021;
originally announced January 2021.
-
Knowledge Graphs Evolution and Preservation -- A Technical Report from ISWS 2019
Authors:
Nacira Abbas,
Kholoud Alghamdi,
Mortaza Alinam,
Francesca Alloatti,
Glenda Amaral,
Claudia d'Amato,
Luigi Asprino,
Martin Beno,
Felix Bensmann,
Russa Biswas,
Ling Cai,
Riley Capshaw,
Valentina Anita Carriero,
Irene Celino,
Amine Dadoun,
Stefano De Giorgis,
Harm Delva,
John Domingue,
Michel Dumontier,
Vincent Emonet,
Marieke van Erp,
Paola Espinoza Arias,
Omaima Fallatah,
Sebastián Ferrada,
Marc Gallofré Ocaña
, et al. (49 additional authors not shown)
Abstract:
One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a: "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this further by asking if we can create a knowledge graph of "everything" ranging from common sense concepts to location based entities. This knowledge graph should be "open to the public" in a FAIR manner democratizing this mass amount of knowledge." Although linked open data (LOD) is one knowledge graph, it is the closest realisation (and probably the only one) to a public FAIR Knowledge Graph (KG) of everything. Surely, LOD provides a unique testbed for experimenting and evaluating research hypotheses on open and FAIR KG. One of the most neglected FAIR issues about KGs is their ongoing evolution and long term preservation. We want to investigate this problem, that is to understand what preserving and supporting the evolution of KGs means and how these problems can be addressed. Clearly, the problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports a collaborative effort performed by 9 teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective to the problem of knowledge graph evolution substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definition for KG preservation and evolution.
Submitted 22 December, 2020;
originally announced December 2020.
-
The Landscape of Ontology Reuse Approaches
Authors:
Valentina Anita Carriero,
Marilena Daquino,
Aldo Gangemi,
Andrea Giovanni Nuzzolese,
Silvio Peroni,
Valentina Presutti,
Francesca Tomasi
Abstract:
Ontology reuse aims to foster interoperability and facilitate knowledge reuse. Several approaches are typically evaluated by ontology engineers when bootstrapping a new project. However, current practices are often motivated by subjective, case-by-case decisions, which hamper the definition of a recommended behaviour. In this chapter we argue that to date there are no effective solutions for supporting developers' decision-making process when deciding on an ontology reuse strategy. The objective is twofold: (i) to survey current approaches to ontology reuse, presenting motivations, strategies, benefits and limits, and (ii) to analyse two representative approaches and discuss their merits.
Submitted 25 November, 2020;
originally announced November 2020.
-
A Reference Software Architecture for Social Robots
Authors:
Luigi Asprino,
Paolo Ciancarini,
Andrea Giovanni Nuzzolese,
Valentina Presutti,
Alessandro Russo
Abstract:
Social Robotics poses tough challenges to software designers who are required to take care of difficult architectural drivers like acceptability, trust of robots as well as to guarantee that robots establish a personalised interaction with their users. Moreover, in this context recurrent software design issues such as ensuring interoperability, improving reusability and customizability of software components also arise.
Designing and implementing social robotic software architectures is a time-intensive activity requiring multi-disciplinary expertise: this makes difficult to rapidly develop, customise, and personalise robotic solutions.
These challenges may be mitigated at design time by choosing certain architectural styles, implementing specific architectural patterns and using particular technologies.
Leveraging on our experience in the MARIO project, in this paper we propose a series of principles that social robots may benefit from. These principles lay also the foundations for the design of a reference software architecture for Social Robots. The ultimate goal of this work is to establish a common ground based on a reference software architecture to allow to easily reuse robotic software components in order to rapidly develop, implement, and personalise Social Robots.
Submitted 9 July, 2020;
originally announced July 2020.
-
Using altmetrics for detecting impactful research in quasi-zero-day time-windows: the case of COVID-19
Authors:
Erik Boetto,
Maria Pia Fantini,
Aldo Gangemi,
Davide Golinelli,
Manfredi Greco,
Andrea Giovanni Nuzzolese,
Valentina Presutti,
Flavia Rallo
Abstract:
On December 31st 2019, the World Health Organization (WHO) China Country Office was informed of cases of pneumonia of unknown etiology detected in Wuhan City. The cause of the syndrome was a new type of coronavirus isolated on January 7th 2020 and named Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2). SARS-CoV-2 is the cause of the coronavirus disease 2019 (COVID-19). Since January 2020 an ever increasing number of scientific works have appeared in literature. Identifying relevant research outcomes at very early stages is challenging. In this work we use COVID-19 as a use-case for investigating: (i) which tools and frameworks are mostly used for early scholarly communication; (ii) to what extent altmetrics can be used to identify potential impactful research in tight (i.e. quasi-zero-day) time-windows. A literature review with rigorous eligibility criteria is performed for gathering a sample composed of scientific papers about SARS-CoV-2/COVID-19 appeared in literature in the tight time-window ranging from January 15th 2020 to February 24th 2020. This sample is used for building a knowledge graph that represents the knowledge about papers and indicators formally. This knowledge graph feeds a data analysis process which is applied for experimenting with altmetrics as impact indicators. We find moderate correlation among traditional citation count, citations on social media, and mentions on news and blogs. This suggests there is a common intended meaning of the citational acts associated with aforementioned indicators. Additionally, we define a method that harmonises different indicators for providing a multi-dimensional impact indicator.
Submitted 16 November, 2020; v1 submitted 13 April, 2020;
originally announced April 2020.
-
Pattern-based design applied to cultural heritage knowledge graphs
Authors:
Valentina Anita Carriero,
Aldo Gangemi,
Maria Letizia Mancinelli,
Andrea Giovanni Nuzzolese,
Valentina Presutti,
Chiara Veninata
Abstract:
Ontology Design Patterns (ODPs) have become an established and recognised practice for guaranteeing good quality ontology engineering. There are several ODP repositories where ODPs are shared as well as ontology design methodologies recommending their reuse. Performing rigorous testing is recommended as well for supporting ontology maintenance and validating the resulting resource against its motivating requirements. Nevertheless, it is less than straightforward to find guidelines on how to apply such methodologies for developing domain-specific knowledge graphs. ArCo is the knowledge graph of Italian Cultural Heritage and has been developed by using eXtreme Design (XD), an ODP- and test-driven methodology. During its development, XD has been adapted to the needs of the CH domain, e.g. gathering requirements from an open, diverse community of consumers; a new ODP has been defined and many have been specialised to address specific CH requirements. This paper presents ArCo and describes how to apply XD to the development and validation of a CH knowledge graph, also detailing the (intellectual) process implemented for matching the encountered modelling problems to ODPs. Relevant contributions also include a novel web tool for supporting unit-testing of knowledge graphs, a rigorous evaluation of ArCo, and a discussion of methodological lessons learned during ArCo development.
Submitted 20 June, 2020; v1 submitted 18 November, 2019;
originally announced November 2019.
-
SQuAP-Ont: an Ontology of Software Quality Relational Factors from Financial Systems
Authors:
Paolo Ciancarini,
Andrea Giovanni Nuzzolese,
Valentina Presutti,
Daniel Russo
Abstract:
Quality, architecture, and process are considered the keystones of software engineering. ISO defines them in three separate standards. However, their interaction has scarcely been studied so far. The SQuAP model (Software Quality, Architecture, Process) describes twenty-eight main factors that impact software quality in banking systems, and each factor is described as a relation among some characteristics from the three ISO standards. Hence, SQuAP makes such relations emerge rigorously, although informally. In this paper, we present SQuAP-Ont, an OWL ontology designed by following a well-established methodology based on the reuse of Ontology Design Patterns (ODPs). SQuAP-Ont formalises the relations emerging from SQuAP to represent and reason via Linked Data about software engineering in a three-dimensional model consisting of quality, architecture, and process ISO characteristics.
Submitted 4 September, 2019;
originally announced September 2019.
-
Observing LOD using Equivalent Set Graphs: it is mostly flat and sparsely linked
Authors:
Luigi Asprino,
Wouter Beek,
Paolo Ciancarini,
Frank van Harmelen,
Valentina Presutti
Abstract:
This paper presents an empirical study aiming at understanding the modeling style and the overall semantic structure of Linked Open Data. We observe how classes, properties and individuals are used in practice. We also investigate how hierarchies of concepts are structured, and how much they are linked. In addition to discussing the results, this paper contributes (i) a conceptual framework, including a set of metrics, which generalises over the observable constructs; (ii) an open source implementation that facilitates its application to other Linked Data knowledge graphs.
Submitted 18 July, 2019; v1 submitted 19 June, 2019;
originally announced June 2019.
-
ArCo: the Italian Cultural Heritage Knowledge Graph
Authors:
Valentina Anita Carriero,
Aldo Gangemi,
Maria Letizia Mancinelli,
Ludovica Marinucci,
Andrea Giovanni Nuzzolese,
Valentina Presutti,
Chiara Veninata
Abstract:
ArCo is the Italian Cultural Heritage (CH) knowledge graph, consisting of a network of seven vocabularies and 169 million triples about 820 thousand cultural entities. It is distributed jointly with a SPARQL endpoint, software for converting catalogue records to RDF, and a rich suite of documentation material (testing, evaluation, how-to, examples, etc.). ArCo is based on the official General Catalogue of the Italian Ministry of Cultural Heritage and Activities (MiBAC) and its associated encoding regulations. The Catalogue collects and validates the catalogue records of (ideally) all Italian Cultural Heritage properties (excluding libraries and archives), contributed by CH administrators from all over Italy. We present its structure, design methods and tools, its growing community, and delineate its importance, quality, and impact.
Submitted 7 May, 2019;
originally announced May 2019.
-
Linked Open Data Validity -- A Technical Report from ISWS 2018
Authors:
Tayeb Abderrahmani Ghor,
Esha Agrawal,
Mehwish Alam,
Omar Alqawasmeh,
Claudia D'amato,
Amina Annane,
Amr Azzam,
Andrew Berezovskyi,
Russa Biswas,
Mathias Bonduel,
Quentin Brabant,
Cristina-iulia Bucur,
Elena Camossi,
Valentina Anita Carriero,
Shruthi Chari,
David Chaves Fraga,
Fiorela Ciroku,
Michael Cochez,
Hubert Curien,
Vincenzo Cutrona,
Rahma Dandan,
Danilo Dess,
Valerio Di Carlo,
Ahmed El Amine Djebri,
Marieke Van Erp,
et al. (46 additional authors not shown)
Abstract:
Linked Open Data (LOD) is the publicly available RDF data on the Web. Each LOD entity is identified by a URI and accessible via HTTP. LOD encodes global-scale knowledge potentially available to any human as well as any artificial intelligence that may want to benefit from it as background knowledge for supporting their tasks. LOD has emerged as the backbone of applications in diverse fields such as Natural Language Processing, Information Retrieval, Computer Vision, Speech Recognition, and many more. Nevertheless, regardless of the specific tasks that LOD-based tools aim to address, the reuse of such knowledge may be challenging for diverse reasons, e.g. semantic heterogeneity, provenance, and data quality. As aptly stated by Heath et al., "Linked Data might be outdated, imprecise, or simply wrong": there arises a need to investigate the problem of linked data validity. This work reports a collaborative effort performed by nine teams of students, guided by an equal number of senior researchers, attending the International Semantic Web Research School (ISWS 2018), towards addressing such investigation from different perspectives coupled with different approaches to tackle the issue.
Submitted 26 March, 2019;
originally announced March 2019.
-
The practice of self-citations: a longitudinal study
Authors:
Silvio Peroni,
Paolo Ciancarini,
Aldo Gangemi,
Andrea Giovanni Nuzzolese,
Francesco Poggi,
Valentina Presutti
Abstract:
In this article, we discuss the outcomes of an experiment where we analysed whether and to what extent the introduction, in 2012, of the new research assessment exercise in Italy (a.k.a. Italian Scientific Habilitation) affected self-citation behaviours in the Italian research community. The Italian Scientific Habilitation attests to the scientific maturity of researchers and in Italy, as in many other countries, is a requirement for access to a professorship. To this end, we obtained from ScienceDirect 35,673 articles published between 1957 and 2016 by the participants in the 2012 Italian Scientific Habilitation, which resulted in the extraction of 1,379,050 citations retrieved through Semantic Publishing technologies. Our analysis showed an overall increment in author self-citations (i.e. where the citing article and the cited article share at least one author) in several of the 24 academic disciplines considered. However, we found a stronger causal relation between such increment and the rules introduced by the 2012 Italian Scientific Habilitation in 10 out of the 24 disciplines analysed.
Submitted 19 February, 2020; v1 submitted 14 March, 2019;
originally announced March 2019.
-
Do altmetrics work for assessing research quality?
Authors:
Andrea Giovanni Nuzzolese,
Paolo Ciancarini,
Aldo Gangemi,
Silvio Peroni,
Francesco Poggi,
Valentina Presutti
Abstract:
Alternative metrics (aka altmetrics) are gaining increasing interest in the scientometrics community as they can capture both the volume and quality of attention that a research work receives online. Nevertheless, there is limited knowledge about their effectiveness as a means of measuring the impact of research compared to traditional citation-based indicators. This work aims at rigorously investigating whether any correlation exists among indicators, either traditional (i.e. citation count and h-index) or alternative (i.e. altmetrics), and which of them may be effective for evaluating scholars. The study is based on the analysis of real data coming from the National Scientific Qualification procedure held in Italy by committees of peers on behalf of the Italian Ministry of Education, Universities and Research.
Submitted 31 December, 2018;
originally announced December 2018.
-
Semantic Role Labeling for Knowledge Graph Extraction from Text
Authors:
Mehwish Alam,
Aldo Gangemi,
Valentina Presutti,
Diego Reforgiato Recupero
Abstract:
This paper introduces TakeFive, a new semantic role labeling method that transforms a text into a frame-oriented knowledge graph. It performs dependency parsing, identifies the words that evoke lexical frames, locates the roles and fillers for each frame, runs coercion techniques, and formalises the results as a knowledge graph. This formal representation complies with the frame semantics used in Framester, a factual-linguistic linked data resource. The obtained precision, recall and F1 values indicate that TakeFive is competitive with other existing methods such as SEMAFOR, Pikes, PathLSTM and FRED. We finally discuss how to combine TakeFive and FRED, obtaining higher values of precision, recall and F1.
Submitted 4 November, 2018;
originally announced November 2018.
-
Amnestic Forgery: an Ontology of Conceptual Metaphors
Authors:
Aldo Gangemi,
Mehwish Alam,
Valentina Presutti
Abstract:
This paper presents Amnestic Forgery, an ontology for metaphor semantics, based on MetaNet, which is inspired by the theory of Conceptual Metaphor. Amnestic Forgery reuses and extends the Framester schema, as an ideal ontology design framework to deal with both semiotic and referential aspects of frames, roles, mappings, and eventually blending. The description of the resource is supplied by a discussion of its applications, with examples taken from metaphor generation, and the referential problems of metaphoric mappings. Both schema and data are available from the Framester SPARQL endpoint.
Submitted 30 May, 2018;
originally announced May 2018.
-
Empirical Analysis of Foundational Distinctions in Linked Open Data
Authors:
Luigi Asprino,
Valerio Basile,
Paolo Ciancarini,
Valentina Presutti
Abstract:
The Web and its Semantic extension (i.e. Linked Open Data) contain open global-scale knowledge and make it available to potentially intelligent machines that want to benefit from it. Nevertheless, most of Linked Open Data lack ontological distinctions and have sparse axiomatisation. For example, distinctions such as whether an entity is inherently a class or an individual, or whether it is a physical object or not, are hardly expressed in the data, although they have been largely studied and formalised by foundational ontologies (e.g. DOLCE, SUMO). These distinctions belong to common sense too, which is relevant for many artificial intelligence tasks such as natural language understanding, scene recognition, and the like. There is a gap between foundational ontologies, that often formalise or are inspired by pre-existing philosophical theories and are developed with a top-down approach, and Linked Open Data that mostly derive from existing databases or crowd-based effort (e.g. DBpedia, Wikidata). We investigate whether machines can learn foundational distinctions over Linked Open Data entities, and if they match common sense. We want to answer questions such as "does the DBpedia entity for dog refer to a class or to an instance?". We report on a set of experiments based on machine learning and crowdsourcing that show promising results.
Submitted 23 May, 2018; v1 submitted 26 March, 2018;
originally announced March 2018.
-
An Innovative, Open, Interoperable Citizen Engagement Cloud Platform for Smart Government and Users' Interaction
Authors:
Diego Reforgiato Recupero,
Mario Castronovo,
Sergio Consoli,
Tarcisio Costanzo,
Aldo Gangemi,
Luigi Grasso,
Giorgia Lodi,
Gianluca Merendino,
Misael Mongiovì,
Valentina Presutti,
Salvatore Davide Rapisarda,
Salvo Rosa,
Emanuele Spampinato
Abstract:
This paper introduces an open, interoperable, and cloud-computing-based citizen engagement platform for the management of administrative processes of public administrations, which also increases the engagement of citizens. The citizen engagement platform is the outcome of a 3-year Italian national project called PRISMA (Interoperable cloud platforms for smart government). The aim of the project is to constitute a new model of digital ecosystem that can support and enable new methods of interaction among public administrations, citizens, companies, and other stakeholders surrounding cities. The platform has been defined by the media as a flexible (enabling the addition of any kind of application or service) and open (enabling access to open services) Italian "cloud" that allows public administrations to access a vast knowledge base represented as linked open data to be reused by a stakeholder community with the aim of developing new applications ("Cloud Apps") tailored to the specific needs of citizens. The platform has been used by the Catania and Syracuse municipalities, two of the main cities of southern Italy, located in the Sicilian region. Full adoption of the platform is rapidly spreading around the whole region (local developers have already used available application programming interfaces (APIs) to create additional services for citizens and administrations) to such an extent that other provinces of Sicily and Italy in general have expressed their interest in using it. The platform is available online and, as mentioned above, is open source and provides APIs for full exploitation.
Submitted 24 May, 2016;
originally announced May 2016.