-
MuLaN: a MultiLayer Networks Alignment Algorithm
Authors:
Marianna Milano,
Pietro Cinaglia,
Pietro Hiram Guzzi,
Mario Cannataro
Abstract:
A Multilayer Network (MN) is a system consisting of several topological levels (i.e., layers) representing the interactions between the system's objects and their interdependencies. It may therefore be represented as a set of layers, each of which can be regarded as a network of its own objects, connected by means of inter-layer edges (or inter-edges) linking the nodes of different layers; for instance, a biological MN may allow modeling of inter- and intra-layer interactions among diseases, genes, and drugs, using only its own structure. The analysis of MNs may reveal hidden knowledge, as demonstrated by several analysis algorithms. Recently, there has been growing interest in comparing two MNs by revealing local regions of similarity, as a counterpart of Network Alignment (NA) algorithms for simple networks. However, classical NA algorithms such as Local NA (LNA) cannot be applied to multilayer networks, since they are not able to deal with inter-layer edges. Therefore, novel algorithms are needed. In this paper, we present MuLaN, an algorithm for the local alignment of multilayer networks. We first show, as a proof of concept, the performance of MuLaN on a set of synthetic multilayer networks. Then, we use a real multilayer network in the biomedical domain as a case study. Our results show that MuLaN is able to build high-quality alignments and can extract knowledge about the aligned multilayer networks. MuLaN is available at https://github.com/pietrocinaglia/mulan.
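To make the distinction between intra-layer and inter-layer edges concrete, the following is a minimal sketch of one possible in-memory representation of a multilayer network. It is not taken from MuLaN: the class name `MultilayerNetwork`, its methods, and the placeholder node identifiers are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MultilayerNetwork:
    """Hypothetical container: one edge set per layer plus inter-layer edges."""

    # layer name -> set of undirected intra-layer edges (frozenset of node ids)
    intra: dict = field(default_factory=dict)
    # undirected inter-layer edges, each a frozenset of two (layer, node) pairs
    inter: set = field(default_factory=set)

    def add_intra_edge(self, layer, u, v):
        self.intra.setdefault(layer, set()).add(frozenset((u, v)))

    def add_inter_edge(self, layer_u, u, layer_v, v):
        if layer_u == layer_v:
            raise ValueError("inter-layer edges must join different layers")
        self.inter.add(frozenset(((layer_u, u), (layer_v, v))))

    def neighbors(self, layer, node):
        """Return the (layer, node) pairs adjacent to the given node."""
        result = set()
        for edge in self.intra.get(layer, ()):
            if node in edge:
                # A self-loop is stored as a one-element set, so fall back to node.
                other = next(iter(edge - {node}), node)
                result.add((layer, other))
        for edge in self.inter:
            if (layer, node) in edge:
                (other,) = edge - {(layer, node)}
                result.add(other)
        return result


# Placeholder identifiers, not real biological entities.
mn = MultilayerNetwork()
mn.add_intra_edge("genes", "g1", "g2")
mn.add_inter_edge("genes", "g1", "diseases", "d1")
mn.add_inter_edge("drugs", "r1", "genes", "g1")
print(sorted(mn.neighbors("genes", "g1")))
```

An aligner for simple networks would see only the `intra` edges of a single layer; the `inter` set holds the information that, according to the abstract, classical local alignment cannot handle.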
Submitted 14 September, 2023;
originally announced September 2023.
-
HetNetAligner: Design and Implementation of an algorithm for heterogeneous network alignment on Apache Spark
Authors:
Pietro H Guzzi,
Marianna Milano,
Pierangelo Veltri,
Mario Cannataro
Abstract:
The importance of using networks to model and analyse biological data and the interplay of bio-molecules is widely recognised. Consequently, many algorithms for the analysis and comparison of networks (such as alignment algorithms) have been developed in the past. Recently, many different approaches have tried to integrate into a single model the interplay of different molecules, such as genes, transcription factors and microRNAs. A possible formalism to model such a scenario comes from node-coloured networks (or heterogeneous networks) implemented as node/edge-coloured graphs. Consequently, the need arises for alignment algorithms able to analyse heterogeneous networks. To the best of our knowledge, none of the existing algorithms is able to mine heterogeneous networks. We propose a two-step alignment strategy that receives as input two heterogeneous networks (node-coloured graphs) and a similarity function among the nodes of the two networks, extending the previous formulations. We first build a single alignment graph. Then we mine this graph, extracting relevant subgraphs. Despite the simplicity of this approach, the analysis of such networks relies on graph and subgraph isomorphism, and the size of the data is still growing. Therefore, the use of a high-performance data analytics framework is needed. We here present HetNetAligner, a framework built on top of Apache Spark. We also implemented our algorithm and tested it on some selected heterogeneous biological networks. Preliminary results confirm that our method may extract relevant knowledge from biological data while reducing the computational time.
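The two-step strategy (build an alignment graph, then mine it) can be illustrated with a small single-machine sketch. The abstract does not specify how the alignment graph is constructed or mined, so everything here is an assumption: candidate pairs must share a colour and reach a similarity threshold, two pairs are linked when their nodes are adjacent in both inputs, and connected components stand in for the subgraph-mining step. It does not use Apache Spark, and the function names are hypothetical.

```python
from itertools import combinations


def build_alignment_graph(g1, g2, colour1, colour2, similarity, min_sim=0.5):
    """g1, g2: dict node -> set of neighbours; colourN: dict node -> colour."""
    # Step 1a: candidate node pairs share a colour and are similar enough.
    pairs = [(u, v) for u in g1 for v in g2
             if colour1[u] == colour2[v] and similarity(u, v) >= min_sim]
    # Step 1b: link two pairs when their nodes are adjacent in both inputs.
    adj = {p: set() for p in pairs}
    for (u1, v1), (u2, v2) in combinations(pairs, 2):
        if u2 in g1[u1] and v2 in g2[v1]:
            adj[(u1, v1)].add((u2, v2))
            adj[(u2, v2)].add((u1, v1))
    return adj


def connected_components(adj):
    """Step 2 stand-in: report connected subgraphs of the alignment graph."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components


# Toy inputs with placeholder node names and colours.
g1 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
g2 = {"x": {"y"}, "y": {"x"}, "z": set()}
colour1 = {"a": "gene", "b": "gene", "c": "mirna"}
colour2 = {"x": "gene", "y": "gene", "z": "mirna"}
matches = {("a", "x"), ("b", "y"), ("c", "z")}
sim = lambda u, v: 1.0 if (u, v) in matches else 0.0

graph = build_alignment_graph(g1, g2, colour1, colour2, sim)
components = connected_components(graph)
print(components)
```

The pairwise loop over candidate pairs is quadratic, which gives a sense of why a distributed framework becomes attractive as the input networks grow.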
Submitted 11 June, 2018;
originally announced June 2018.
-
Learning Weighted Association Rules in Human Phenotype Ontology
Authors:
Pietro Hiram Guzzi,
Giuseppe Agapito,
Marianna Milano,
Mario Cannataro
Abstract:
The Human Phenotype Ontology (HPO) is a structured repository of concepts (HPO Terms) that are associated with one or more diseases. The process of association is referred to as annotation. The relevance and the specificity of both HPO terms and annotations are evaluated by a measure defined as Information Content (IC). The analysis of annotated data is thus an important challenge for bioinformatics. Among the different approaches to this analysis, the use of Association Rules (AR) may provide useful knowledge, and it has been used in some applications, e.g. improving the quality of annotations. Nevertheless, classical association rule algorithms take into account neither the source of an annotation nor its importance, leading to the generation of candidate rules with low IC. This paper presents HPO-Miner (Human Phenotype Ontology-based Weighted Association Rules), a methodology for extracting Weighted Association Rules. HPO-Miner can extract rules that are relevant from a biological point of view. A case study applying HPO-Miner to publicly available HPO annotation datasets is used to demonstrate the effectiveness of our methodology.
Submitted 31 December, 2016;
originally announced January 2017.
-
The impact of Gene Ontology evolution on GO-Term Information Content
Authors:
Pietro Hiram Guzzi,
Giuseppe Agapito,
Marianna Milano,
Mario Cannataro
Abstract:
The Gene Ontology (GO) is a major bioinformatics ontology that provides structured controlled vocabularies to classify the function and role of genes and proteins. The GO and its annotations to gene products are now an integral part of functional analysis. Recently, the evaluation of similarity among gene products starting from their annotations (also referred to as semantic similarity) has become a growing area in bioinformatics. While much research has been devoted to updates to the structure of GO and to the annotation corpora, the impact of GO evolution on semantic similarities has gone largely unobserved. Here we extensively analyze how GO changes, a matter that should be carefully considered by all users of semantic similarities. In particular, GO changes have a big impact on the information content (IC) of GO terms. Since many semantic similarities rely on the calculation of IC, these changes deserve thorough investigation. Here we consider GO versions from 2005 to 2014 and calculate the IC of all GO Terms using five different formulations. Then we compare these results. The analysis confirms that there is a statistically significant difference among different calculations on the same version of the ontology (which is to be expected) and a statistically significant difference among the results obtained with different GO versions using the same IC formula. The results show that there is a remarkable bias due to GO evolution that has not been considered so far. Future work should take this into account.
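The sensitivity of IC to changes in the annotation corpus can be seen in a toy computation. The abstract does not name its five IC formulations, so the sketch below assumes one widely used annotation-based definition, IC(t) = -log p(t), where p(t) is the fraction of annotated entities attached to t or to any of its descendants. The ontology, term names, and annotations are invented for illustration.

```python
import math


def information_content(parents, annotations):
    """IC(t) = -log p(t); p(t) = share of entities annotated to t or below it.

    parents: dict term -> list of parent terms (a DAG)
    annotations: dict entity -> list of directly annotated terms
    Terms without any annotated entity are omitted from the result.
    """
    def ancestors(term):
        seen, stack = set(), [term]
        while stack:
            t = stack.pop()
            if t not in seen:
                seen.add(t)
                stack.extend(parents.get(t, ()))
        return seen  # includes the term itself

    # Propagate every entity to all ancestors of its annotated terms.
    covered = {}
    for entity, terms in annotations.items():
        for term in terms:
            for anc in ancestors(term):
                covered.setdefault(anc, set()).add(entity)

    total = len(annotations)
    return {t: math.log(total / len(ents)) for t, ents in covered.items()}


# Invented toy ontology and two annotation snapshots ("old" and "new").
parents = {"root": [], "a": ["root"], "b": ["root"], "a1": ["a"]}
old_annotations = {"e1": ["a1"], "e2": ["a"], "e3": ["b"], "e4": ["b"]}
new_annotations = dict(old_annotations, e5=["a1"])  # one extra annotation

ic_old = information_content(parents, old_annotations)
ic_new = information_content(parents, new_annotations)
for term in sorted(ic_old):
    print(term, round(ic_old[term], 3), round(ic_new[term], 3))
```

Adding a single annotation to `a1` also changes the IC of `b`, whose own annotations are untouched, because the total number of annotated entities grows. Any similarity measure built on these values would shift accordingly.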
Submitted 30 December, 2016;
originally announced December 2016.
-
A web-based tool to Analyze Semantic Similarity Networks
Authors:
Mario Cannataro,
Pietro Hiram Guzzi,
Marianna Milano,
Pierangelo Veltri
Abstract:
In computational biology, biological entities such as genes or proteins are usually annotated with terms extracted from the Gene Ontology (GO). The functional similarity among terms of an ontology is evaluated by using Semantic Similarity Measures (SSMs). More recently, the extensive application of SSMs has led to Semantic Similarity Networks (SSNs). SSNs are edge-weighted graphs where the nodes are concepts (e.g. proteins) and each edge has an associated weight that represents the semantic similarity between the pair of nodes it connects. The analysis of SSNs may reveal biologically meaningful knowledge. For this purpose, tools able to manage and analyze SSNs are needed. Consequently, we developed SSN-Analyzer, a web-based tool able to build and preprocess SSNs. As a proof of concept, we demonstrate that community detection algorithms applied to filtered (thresholded) networks perform better, in terms of the biological relevance of the results, than when applied to raw unfiltered networks.
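The filtering step mentioned above can be pictured as follows. This is a minimal sketch of the simplest option, a single global weight cut-off; the abstract does not say which thresholding SSN-Analyzer applies, and the function name, node labels, weights, and cut-off value are all illustrative assumptions.

```python
def threshold_edges(weighted_edges, min_weight):
    """Keep only the edges whose similarity weight reaches min_weight."""
    return {edge: w for edge, w in weighted_edges.items() if w >= min_weight}


# Invented SSN: keys are node pairs, values are semantic similarity scores.
ssn = {
    ("p1", "p2"): 0.90,
    ("p1", "p3"): 0.20,
    ("p2", "p3"): 0.75,
    ("p3", "p4"): 0.10,
}

filtered = threshold_edges(ssn, min_weight=0.5)
remaining_nodes = {node for edge in filtered for node in edge}
print(len(ssn), "edges before,", len(filtered), "after")
print(sorted(remaining_nodes))
```

In this toy case `p4` loses its only edge and drops out of the filtered network, so the choice of cut-off also determines which nodes remain available to a downstream community detection step.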
Submitted 21 December, 2014;
originally announced December 2014.
-
Thresholding of Semantic Similarity Networks using a Spectral Graph Based Technique
Authors:
Pietro Hiram Guzzi,
Simone Truglia,
Pierangelo Veltri,
Mario Cannataro
Abstract:
Semantic similarity measures (SSMs) refer to a set of algorithms used to quantify the similarity of two or more terms belonging to the same ontology. Ontology terms may be associated with concepts; for instance, in computational biology genes and proteins are associated with terms of biological ontologies. Thus, SSMs may be used to quantify the similarity of genes and proteins starting from the comparison of the associated annotations. SSMs have recently been used to compare genes and proteins even on a system-level scale. More recently, some works have focused on the building and analysis of Semantic Similarity Networks (SSNs), i.e. weighted networks in which nodes represent genes or proteins while weighted edges represent the semantic similarity score among them. SSNs are quasi-complete networks, thus their analysis presents different challenges that should be addressed. For instance, reliable thresholds are needed for the elimination of meaningless edges. Nevertheless, the use of global thresholding methods may eliminate meaningful nodes, while the use of local thresholds may introduce biases. To address this, we introduce a novel technique based on spectral graph considerations and on a mixed global-local focus. The effectiveness of our technique is demonstrated by using Markov clustering for the extraction of biological modules. We applied clustering to the simplified networks, demonstrating a considerable improvement with respect to the original ones.
Submitted 21 May, 2013;
originally announced May 2013.