-
Human and social capital strategies for Mafia network disruption
Authors:
Annamaria Ficara,
Francesco Curreri,
Giacomo Fiumara,
Pasquale De Meo
Abstract:
Social Network Analysis (SNA) is an interdisciplinary science that focuses on discovering the patterns of individuals' interactions. In particular, practitioners have used SNA to describe and analyze criminal networks to highlight subgroups, key actors, strengths and weaknesses in order to generate disruption interventions and crime prevention systems. In this paper, the effectiveness of a total of seven disruption strategies for two real Mafia networks is investigated adopting SNA tools. Three interventions targeting actors with a high level of social capital and three interventions targeting those with a high human capital are put to the test and compared with each other and with random node removal. Human and social capital approaches were also applied to Barabási-Albert models, which are the ones that best represent criminal networks. Simulations showed that actor removal based on social capital proved to be the most effective strategy, leading to the total disruption of the criminal network in the least number of steps. The removal of a specific figure of a Mafia family such as the Caporegime also seemed promising for network disruption.
Submitted 9 January, 2023; v1 submitted 5 September, 2022;
originally announced September 2022.
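The comparison between targeted and random node removal described in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: node degree is used here as a simple stand-in for social capital, the size of the largest connected component is assumed as the disruption measure, and the toy graph and the names `disrupt`, `highest_degree` and `make_random_picker` are hypothetical.

```python
import random
from collections import deque

def largest_component_size(adj):
    """Size of the largest connected component of an undirected graph."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search from `start`
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best

def remove_node(adj, node):
    """Return a copy of the graph without `node` and its edges."""
    return {u: nbrs - {node} for u, nbrs in adj.items() if u != node}

def disrupt(adj, pick):
    """Remove one node per step; record the largest component size after each step."""
    sizes = [largest_component_size(adj)]
    while adj:
        adj = remove_node(adj, pick(adj))
        sizes.append(largest_component_size(adj))
    return sizes

def highest_degree(adj):
    """Targeted strategy (assumed proxy): the node with most neighbours, ties broken by name."""
    return max(sorted(adj), key=lambda u: len(adj[u]))

def make_random_picker(seed):
    """Baseline strategy: a uniformly random node, seeded for reproducibility."""
    rng = random.Random(seed)
    return lambda adj: rng.choice(sorted(adj))

# Hypothetical toy hierarchy: A linked to B, C, D; B to E, F; C to G, H.
toy = {
    "A": {"B", "C", "D"}, "B": {"A", "E", "F"}, "C": {"A", "G", "H"},
    "D": {"A"}, "E": {"B"}, "F": {"B"}, "G": {"C"}, "H": {"C"},
}

targeted = disrupt(toy, highest_degree)
baseline = disrupt(toy, make_random_picker(seed=1))
```

On this toy graph the targeted trace starts `[8, 3, 3, 1]`, so three removals are enough to leave only isolated nodes. The random baseline depends on the seed and is included only for comparison.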
-
Classical and Quantum Random Walks to Identify Leaders in Criminal Networks
Authors:
Annamaria Ficara,
Giacomo Fiumara,
Pasquale De Meo,
Salvatore Catanese
Abstract:
Random walks simulate the randomness of objects, and are key instruments in various fields such as computer science, biology and physics. The counterpart of the classical random walk in quantum mechanics is the quantum walk. Quantum walk algorithms can provide an exponential speedup over classical algorithms for certain problems. Classical and quantum random walks can be applied in social network analysis, and can be used to define specific centrality metrics in terms of node occupation on single-layer and multilayer networks. In this paper, we applied these new centrality measures to three real criminal networks derived from an anti-mafia operation named Montagna and a multilayer network derived from them. Our aim is to (i) identify leaders in our criminal networks, (ii) study the dependence between these centralities and the degree, (iii) compare the results obtained for the real multilayer criminal network with those of a synthetic multilayer network which replicates its structure.
Submitted 5 September, 2022;
originally announced September 2022.
-
Multilayer Network Analysis: The Identification of Key Actors in a Sicilian Mafia Operation
Authors:
Annamaria Ficara,
Giacomo Fiumara,
Pasquale De Meo,
Salvatore Catanese
Abstract:
Recently, Social Network Analysis studies have led to an improvement and to a generalization of existing tools to networks with multiple subsystems and layers of connectivity. These kinds of networks are usually called multilayer networks. Multilayer networks in which each layer shares at least one node with some other layer in the network are called multiplex networks. Being a multiplex network does not require all nodes to exist on every layer. In this paper, we built a criminal multiplex network which concerns an anti-mafia operation called "Montagna" and is based on the examination of a pre-trial detention order issued on March 14, 2007 by the judge for preliminary investigations of the Court of Messina (Sicily). "Montagna" focuses on two Mafia families called "Mistretta" and "Batanesi" who infiltrated several economic activities, including public works in the north-eastern part of Sicily, through a cartel of entrepreneurs close to the Sicilian Mafia. Originally we derived two single-layer networks, the former capturing meetings between suspected individuals and the latter recording phone calls. However, some networked systems can be better modeled by multilayer structures where the individual nodes develop relationships in multiple layers. For this reason we built a two-layer network from the single-layer ones. These two layers share 47 nodes. We followed three different approaches to measure the importance of nodes in multilayer networks using degree as descriptor. Our analysis can aid in the identification of key players in criminal networks.
Submitted 19 May, 2021;
originally announced May 2021.
-
Correlation analysis of node and edge centrality measures in artificial complex networks
Authors:
Annamaria Ficara,
Giacomo Fiumara,
Pasquale De Meo,
Antonio Liotta
Abstract:
The importance of a node in a social network is identified through a set of measures called centrality. Degree centrality, closeness centrality, betweenness centrality and clustering coefficient are the most frequently used metrics to compute node centrality. In some cases their computational complexity makes their computation infeasible, if not practically impossible. For this reason we focused on two alternative measures, WERW-Kpath and Game of Thieves, which are at the same time highly descriptive and computationally affordable. Our experiments show that a strong correlation exists between WERW-Kpath and Game of Thieves and the classical centrality measures. This may suggest the possibility of using them as useful and more economical replacements for the classical centrality measures.
Submitted 5 August, 2021; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Graph and Network Theory for the analysis of Criminal Networks
Authors:
Lucia Cavallaro,
Ovidiu Bagdasar,
Pasquale De Meo,
Giacomo Fiumara,
Antonio Liotta
Abstract:
Social Network Analysis is the use of Network and Graph Theory to study social phenomena, and has been found to be highly relevant in areas like Criminology. This chapter provides an overview of key methods and tools that may be used for the analysis of criminal networks, which are presented in a real-world case study. Starting from available juridical acts, we have extracted data on the interactions among suspects within two Sicilian Mafia clans, obtaining two weighted undirected graphs. Then, we have investigated the roles of these weights on the criminal network's properties, focusing on two key features: weight distribution and shortest path length. We also present an experiment that aims to construct an artificial network that mirrors criminal behaviours. To this end, we have conducted a comparative degree distribution analysis between the real criminal networks and some of the most popular artificial network models: Watts-Strogatz, Erdős-Rényi, and Barabási-Albert, with some topology variations. This chapter will be a valuable tool for researchers who wish to employ social network analysis within their own area of interest.
Submitted 4 March, 2021; v1 submitted 3 March, 2021;
originally announced March 2021.
-
Criminal Networks Analysis in Missing Data scenarios through Graph Distances
Authors:
Annamaria Ficara,
Lucia Cavallaro,
Francesco Curreri,
Giacomo Fiumara,
Pasquale De Meo,
Ovidiu Bagdasar,
Wei Song,
Antonio Liotta
Abstract:
Data collected in criminal investigations may suffer from: (i) incompleteness, due to the covert nature of criminal organisations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyse nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data and to determine which network type is most affected by it. The networks are first pruned following two specific methods: (i) random edge removal, simulating the scenario in which the Law Enforcement Agencies (LEAs) fail to intercept some calls, or to spot sporadic meetings among suspects; (ii) node removal, which captures the hypothesis that some suspects cannot be intercepted or investigated. Finally we compute spectral (i.e., Adjacency, Laplacian and Normalised Laplacian Spectral Distances) and matrix (i.e., Root Euclidean Distance) distances between the complete and pruned networks, which we compare using statistical analysis. Our investigation identified two main features: first, the overall understanding of the criminal networks remains high even with incomplete data on criminal interactions (i.e., 10% removed edges); second, removing even a small fraction of suspects not investigated (i.e., 2% removed nodes) may lead to significant misinterpretation of the overall network.
Submitted 28 February, 2021;
originally announced March 2021.
-
Correlations among Game of Thieves and other centrality measures in complex networks
Authors:
Annamaria Ficara,
Giacomo Fiumara,
Pasquale De Meo,
Antonio Liotta
Abstract:
Social Network Analysis (SNA) is used to study the exchange of resources among individuals, groups, or organizations. The role of individuals or connections in a network is described by a set of centrality metrics which represent one of the most important results of SNA. Degree, closeness, betweenness and clustering coefficient are the most used centrality measures. Their use is, however, severely hampered by their computation cost. This issue can be overcome by an algorithm called Game of Thieves (GoT). Thanks to this new algorithm, we can compute the importance of all elements in a network (i.e. vertices and edges), compared to the total number of vertices. This calculation is done not in quadratic time, as when we use the classical methods, but in polylogarithmic time. Starting from this we present our results on the correlation existing between GoT and the most widely used centrality measures. Our experiments show that a strong correlation exists, which makes GoT eligible as a centrality measure for large scale complex networks.
Submitted 23 December, 2020;
originally announced December 2020.
-
Disrupting Resilient Criminal Networks through Data Analysis: The case of Sicilian Mafia
Authors:
Lucia Cavallaro,
Annamaria Ficara,
Pasquale De Meo,
Giacomo Fiumara,
Salvatore Catanese,
Ovidiu Bagdasar,
Antonio Liotta
Abstract:
Compared to other types of social networks, criminal networks present hard challenges, due to their strong resilience to disruption, which poses severe hurdles to law-enforcement agencies. Herein, we borrow methods and tools from Social Network Analysis to (i) unveil the structure of Sicilian Mafia gangs, based on two real-world datasets, and (ii) gain insights as to how to efficiently disrupt them. Mafia networks have peculiar features, due to the distribution and strength of their links, which make them very different from other social networks, and extremely robust to exogenous perturbations. Analysts are also faced with the difficulty of collecting reliable datasets that accurately describe the gangs' internal structure and their relationships with the external world, which is why earlier studies are largely qualitative, elusive and incomplete. An added value of our work is the generation of two real-world datasets, based on raw data derived from juridical acts, relating to a Mafia organization that operated in Sicily during the first decade of the 2000s. We created two different networks, capturing phone calls and physical meetings, respectively. Our network disruption analysis simulated different intervention procedures: (i) arresting one criminal at a time (sequential node removal); and (ii) police raids (node block removal). We measured the effectiveness of each approach through a number of network centrality metrics. We found Betweenness Centrality to be the most effective metric, showing how, by neutralizing only 5% of the affiliates, network connectivity dropped by 70%. We also identified that, due to the peculiar type of interactions in criminal networks (namely, the distribution of the interaction frequency), no significant differences exist between weighted and unweighted network analysis. Our work has significant practical applications for tackling criminal and terrorist networks.
Submitted 10 March, 2020;
originally announced March 2020.
-
Network Structure and Resilience of Mafia Syndicates
Authors:
Santa Agreste,
Salvatore Catanese,
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara
Abstract:
In this paper we present the results of the study of Sicilian Mafia organization by using Social Network Analysis. The study investigates the network structure of a Mafia organization, describing its evolution and highlighting its plasticity to interventions targeting membership and its resilience to disruption caused by police operations. We analyze two different datasets about Mafia gangs built by examining different digital trails and judicial documents spanning a period of ten years: the former dataset includes the phone contacts among suspected individuals, the latter is constituted by the relationships among individuals actively involved in various criminal offenses. Our report illustrates the limits of traditional investigation methods like tapping: criminals high up in the organization hierarchy do not occupy the most central positions in the criminal network, and oftentimes do not appear in the reconstructed criminal network at all. However, we also suggest possible strategies of intervention, as we show that although criminal networks (i.e., the network encoding mobsters and crime relationships) are extremely resilient to different kinds of attacks, contact networks (i.e., the network reporting suspects and reciprocated phone calls) are much more vulnerable and their analysis can yield extremely valuable insights.
Submitted 4 September, 2015;
originally announced September 2015.
-
RDF annotation of Second Life objects: Knowledge Representation meets Social Virtual reality
Authors:
Carlo Bernava,
Giacomo Fiumara,
Dario Maggiorini,
Alessandro Provetti,
Laura Ripamonti
Abstract:
We have designed and implemented an application running inside Second Life that supports user annotation of graphical objects and graphical visualization of concept ontologies, thus providing a formal, machine-accessible description of objects. As a result, we offer a platform that combines the graphical knowledge representation that is expected from a MUVE artifact with the semantic structure given by the Resource Description Framework (RDF) representation of information.
Submitted 9 April, 2015;
originally announced April 2015.
-
Adaptive Search over Sorted Sets
Authors:
Biagio Bonasera,
Emilio Ferrara,
Giacomo Fiumara,
Francesco Pagano,
Alessandro Provetti
Abstract:
We revisit the classical algorithms for searching over sorted sets to introduce an algorithm refinement, called Adaptive Search, that combines the good features of Interpolation search and those of Binary search. W.r.t. Interpolation search, only a constant number of extra comparisons is introduced. Yet, under diverse input data distributions our algorithm shows costs comparable to those of Interpolation search, i.e., O(log log n), while the worst-case cost is always in O(log n), as with Binary search. On benchmarks drawn from large datasets, both synthetic and real-life, Adaptive Search scores better times and fewer memory accesses even than Santoro and Sidney's Interpolation-Binary search.
Submitted 12 February, 2015;
originally announced February 2015.
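The idea of combining Interpolation search with Binary search, as described in the abstract above, can be sketched as follows. This is not the authors' Adaptive Search, whose details are not given here. It is a generic hybrid, assumed for illustration, that alternates one interpolation probe with one bisection probe. Because every second probe halves the interval, the number of probes stays in O(log n) in the worst case. The name `hybrid_search` is hypothetical.

```python
def hybrid_search(a, target):
    """Return an index i with a[i] == target, or -1 if absent.

    `a` must be a list of numbers sorted in ascending order.
    """
    lo, hi = 0, len(a) - 1
    use_interpolation = True
    while lo <= hi:
        # The target can only lie inside the current value range.
        if target < a[lo] or target > a[hi]:
            return -1
        if use_interpolation and a[hi] != a[lo]:
            # Interpolation probe: estimate the position from the value range.
            offset = (target - a[lo]) * (hi - lo) // (a[hi] - a[lo])
            mid = min(hi, lo + int(offset))
        else:
            # Bisection probe: guarantees the interval is at least halved.
            mid = (lo + hi) // 2
        use_interpolation = not use_interpolation
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# A skewed sample: the last value would mislead pure interpolation search.
data = [1, 3, 4, 7, 9, 12, 15, 20, 100000]
found = hybrid_search(data, 12)
missing = hybrid_search(data, 5)
```

On uniformly distributed keys the interpolation probes do most of the work. On skewed inputs such as `data`, the bisection probes bound the damage of poor position estimates.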
-
Visualizing criminal networks reconstructed from mobile phone records
Authors:
Emilio Ferrara,
Pasquale De Meo,
Salvatore Catanese,
Giacomo Fiumara
Abstract:
In the fight against racketeering and terrorism, knowledge about the structure and the organization of criminal networks is of fundamental importance for both the investigations and the development of efficient strategies to prevent and restrain crimes. Intelligence agencies exploit information obtained from the analysis of large amounts of heterogeneous data deriving from various informative sources including the records of phone traffic, the social networks, surveillance data, interview data, experiential police data, and police intelligence files, to acquire knowledge about criminal networks and initiate accurate and destabilizing actions. In this context, visual representation techniques coordinate the exploration of the structure of the network together with the metrics of social network analysis. Nevertheless, the utility of visualization tools may become limited when the dimension and the complexity of the system under analysis grow beyond certain terms. In this paper we show how we employ some interactive visualization techniques to represent criminal and terrorist networks reconstructed from phone traffic data, namely foci, fisheye and geo-mapping network layouts. These methods allow the exploration of the network through animated transitions among visualization models and local enlargement techniques in order to improve the comprehension of interesting areas. By combining the features of the various visualization models it is possible to gain substantial enhancements with respect to classic visualization models, often unreadable in those cases of great complexity of the network.
Submitted 10 July, 2014;
originally announced July 2014.
-
Detecting criminal organizations in mobile phone networks
Authors:
Emilio Ferrara,
Pasquale De Meo,
Salvatore Catanese,
Giacomo Fiumara
Abstract:
The study of criminal networks using traces from heterogeneous communication media is acquiring increasing importance in today's society. The usage of communication media such as phone calls and online social networks leaves digital traces in the form of metadata that can be used for this type of analysis. The goal of this work is twofold: first we provide a theoretical framework for the problem of detecting and characterizing criminal organizations in networks reconstructed from phone call records. Then, we introduce an expert system to support law enforcement agencies in the task of unveiling the underlying structure of criminal networks hidden in communication data. This platform allows for statistical network analysis, community detection and visual exploration of mobile phone network data. It allows forensic investigators to deeply understand hierarchies within criminal organizations, discovering members who play a central role and provide connections among sub-groups. Our work concludes by illustrating the adoption of our computational framework for a real-world criminal investigation.
Submitted 3 April, 2014;
originally announced April 2014.
-
Forensic Analysis of Phone Call Networks
Authors:
Salvatore Catanese,
Emilio Ferrara,
Giacomo Fiumara
Abstract:
In the context of preventing and fighting crime, the analysis of mobile phone traffic among actors of a criminal network is helpful in order to reconstruct illegal activities on the basis of the relationships connecting those specific individuals. Thus, forensic analysts and investigators require new advanced tools and techniques which allow them to manage these data in a meaningful and efficient way. In this paper we present LogAnalysis, a tool we developed to provide visual data representation and filtering, statistical analysis features and the possibility of a temporal analysis of mobile phone activities. Its adoption may help in unveiling the structure of a criminal network and the roles and dynamics of communications among its components. By using LogAnalysis, forensic investigators could deeply understand hierarchies within criminal organizations, for example by discovering central members that provide connections among different sub-groups. Moreover, by analyzing the temporal evolution of the contacts among individuals, or by focusing on specific time windows, they could acquire additional insights on the data they are analyzing. Finally, we highlight how the adoption of LogAnalysis may be crucial to solve real cases, providing as examples a number of case studies inspired by real forensic investigations led by one of the authors.
Submitted 7 March, 2013;
originally announced March 2013.
-
A Novel Measure of Edge Centrality in Social Networks
Authors:
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara,
Angela Ricciardello
Abstract:
The problem of assigning centrality values to nodes and edges in graphs has been widely investigated in recent years. Recently, a novel measure of node centrality has been proposed, called k-path centrality index, which is based on the propagation of messages inside a network along paths consisting of at most k edges. On the other hand, the importance of computing the centrality of edges has been highlighted since the 1970s by Anthonisse and, subsequently, by Girvan and Newman. In this work we propose the generalization of the concept of k-path centrality by defining the k-path edge centrality, a measure of centrality introduced to compute the importance of edges. We provide an efficient algorithm, running in O(k m), where m is the number of edges in the graph. Thus, our technique is feasible for large scale network analysis. Finally, the performance of our algorithm is analyzed, discussing the results obtained against large online social network datasets.
Submitted 7 March, 2013;
originally announced March 2013.
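The notion of scoring edges by message propagation along paths of at most k edges, as described in the abstract above, can be approximated with a small Monte Carlo sketch. It is not the authors' algorithm: the uniform choice of start node and next edge, the rule that a walk never reuses an edge, and the names `k_path_edge_scores` and `walks` are assumptions made for illustration.

```python
import random

def k_path_edge_scores(edges, k, walks, seed=0):
    """Estimate how often each edge is used by random walks of at most k edges.

    `edges` is a list of (u, v) pairs of an undirected graph without self-loops.
    Returns a dict mapping each sorted edge tuple to the fraction of walks using it.
    """
    rng = random.Random(seed)
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    nodes = sorted(adj)
    counts = {frozenset(e): 0 for e in edges}
    for _ in range(walks):
        node = rng.choice(nodes)  # assumed: uniform start node
        used = set()              # edges already traversed by this walk
        for _ in range(k):
            options = [nb for nb in adj[node]
                       if frozenset((node, nb)) not in used]
            if not options:
                break             # dead end before reaching k edges
            nxt = rng.choice(options)
            edge = frozenset((node, nxt))
            used.add(edge)
            counts[edge] += 1
            node = nxt
    return {tuple(sorted(e)): c / walks for e, c in counts.items()}

# Hypothetical toy graph: two triangles joined by the bridge (c, d).
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
             ("d", "e"), ("e", "f"), ("d", "f")]
scores = k_path_edge_scores(toy_edges, k=3, walks=2000)
```

Each walk takes at most k steps, so the sketch performs at most `walks * k` edge traversals. Choosing the number of walks proportional to the number of edges m is one way to keep the work roughly in line with the O(k m) bound quoted above, although this sketch also rebuilds the list of candidate neighbours at every step.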
-
Enhancing community detection using a network weighting strategy
Authors:
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara,
Alessandro Provetti
Abstract:
A community within a network is a group of vertices densely connected to each other but less connected to the vertices outside. The problem of detecting communities in large networks plays a key role in a wide range of research areas, e.g. Computer Science, Biology and Sociology. Most of the existing algorithms to find communities count on the topological features of the network and often do not scale well on large, real-life instances.
In this article we propose a strategy to enhance existing community detection algorithms by adding a pre-processing step in which edges are weighted according to their centrality w.r.t. the network topology. In our approach, the centrality of an edge reflects its contribution to making arbitrary graph traversals, i.e., spreading messages over the network, as short as possible. Our strategy effectively complements information about network topology and can be used as an additional tool to enhance community detection. The computation of edge centralities is carried out by performing multiple random walks of bounded length on the network. Our method makes the computation of edge centralities feasible also on large-scale networks. It has been tested in conjunction with three state-of-the-art community detection algorithms, namely the Louvain method, COPRA and OSLOM. Experimental results show that our method raises the accuracy of existing algorithms both on synthetic and real-life datasets.
Submitted 7 March, 2013;
originally announced March 2013.
-
Mixing local and global information for community detection in large networks
Authors:
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara,
Alessandro Provetti
Abstract:
The problem of clustering large complex networks plays a key role in several scientific fields ranging from Biology to Sociology and Computer Science. Many approaches to clustering complex networks are based on the idea of maximizing a network modularity function. Some of these approaches can be classified as global because they exploit knowledge about the whole network topology to find clusters. Other approaches, instead, can be interpreted as local because they require only a partial knowledge of the network topology, e.g., the neighbors of a vertex. Global approaches are able to achieve high values of modularity but they do not scale well on large networks and, therefore, they cannot be applied to analyze on-line social networks like Facebook or YouTube. In contrast, local approaches are fast and scale up to large, real-life networks, at the cost of poorer results than those achieved by global methods. In this article we propose a glocal method for maximizing modularity, i.e., our method uses information at the global level, yet its scalability on large networks is comparable to that of local methods. The proposed method is called COmplex Network CLUster DEtection (CONCLUDE for short). It works in two stages: in the first stage it uses an information-propagation model, based on random and non-backtracking walks of finite length, to compute the importance of each edge in keeping the network connected (called edge centrality). Then, edge centrality is used to map network vertices onto points of a Euclidean space and to compute distances between all pairs of connected vertices. In the second stage, CONCLUDE uses the distances computed in the first stage to partition the network into clusters. CONCLUDE is computationally efficient since in the average case its cost is roughly linear in the number of edges of the network.
Submitted 16 October, 2013; v1 submitted 7 March, 2013;
originally announced March 2013.
-
Web Data Extraction, Applications and Techniques: A Survey
Authors:
Emilio Ferrara,
Pasquale De Meo,
Giacomo Fiumara,
Robert Baumgartner
Abstract:
Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction.
This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, and this offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain in other domains.
Submitted 9 June, 2014; v1 submitted 1 July, 2012;
originally announced July 2012.
-
On Facebook, most ties are weak
Authors:
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara,
Alessandro Provetti
Abstract:
Pervasive socio-technical networks bring new conceptual and technological challenges to developers and users alike. A central research theme is evaluation of the intensity of relations linking users and how they facilitate communication and the spread of information. These aspects of human relationships have been studied extensively in the social sciences under the framework of the "strength of weak ties" theory proposed by Mark Granovetter. Some research has considered whether that theory can be extended to online social networks like Facebook, suggesting interaction data can be used to predict the strength of ties. The approaches being used require handling user-generated data that is often not publicly available due to privacy concerns. Here, we propose an alternative definition of weak and strong ties that requires knowledge of only the topology of the social network (such as who is a friend of whom on Facebook), relying on the fact that online social networks, or OSNs, tend to fragment into communities. We thus suggest classifying as weak ties those edges linking individuals belonging to different communities and strong ties as those connecting users in the same community. We tested this definition on a large network representing part of the Facebook social graph and studied how weak and strong ties affect the information-diffusion process. Our findings suggest individuals in OSNs self-organize to create well-connected communities, while weak ties yield cohesion and optimize the coverage of information spread.
Submitted 1 November, 2014; v1 submitted 2 March, 2012;
originally announced March 2012.
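The topological definition of weak and strong ties given in the abstract above can be illustrated with a minimal sketch. It assumes a community assignment is already available (the abstract does not say which detection algorithm produced it); the function name `classify_ties`, the toy graph and the community labels are all hypothetical and chosen only for illustration.

```python
def classify_ties(edges, community):
    """Label each undirected edge as 'strong' or 'weak'.

    An edge is 'strong' when both endpoints belong to the same community
    and 'weak' when it links two different communities. `community` maps
    each node to a community identifier and is assumed to be given.
    """
    labels = {}
    for u, v in edges:
        same = community[u] == community[v]
        labels[(u, v)] = "strong" if same else "weak"
    return labels


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),  # first triangle
    ("c", "d"),                          # bridge between the triangles
    ("d", "e"), ("e", "f"), ("d", "f"),  # second triangle
]
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

ties = classify_ties(edges, community)
weak = [edge for edge, label in ties.items() if label == "weak"]
strong = [edge for edge, label in ties.items() if label == "strong"]
print("weak ties:", weak)
print("strong ties:", len(strong))
```

In this toy graph only the bridge between the two triangles is classified as weak, which matches the intuition that weak ties connect otherwise separate communities.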
-
Topological Features of Online Social Networks
Authors:
Emilio Ferrara,
Giacomo Fiumara
Abstract:
The importance of modeling and analyzing Social Networks is a consequence of the success of Online Social Networks in recent years. Several models of networks have been proposed, reflecting the different characteristics of Social Networks. Some of them are better suited to modeling specific phenomena, such as the growth and the evolution of Social Networks; others are more appropriate for capturing the topological characteristics of the networks. Because these networks show unique and different properties and features, in this work we describe and exploit several models in order to capture the structure of popular Online Social Networks, such as Arxiv, Facebook, Wikipedia and YouTube. Our experimentation aims at verifying the structural characteristics of these networks, in order to understand which model best depicts their structure, and at analyzing the inner community structure, to illustrate how members of these Online Social Networks interact and group together into smaller communities.
Submitted 1 February, 2012;
originally announced February 2012.
-
Improving Recommendation Quality by Merging Collaborative Filtering and Social Relationships
Authors:
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara,
Alessandro Provetti
Abstract:
Matrix Factorization techniques have been successfully applied to raise the quality of suggestions generated by Collaborative Filtering Systems (CFSs). Traditional CFSs based on Matrix Factorization operate on the ratings provided by users and have been recently extended to incorporate demographic aspects such as age and gender. In this paper we propose to merge CFSs based on Matrix Factorization with information regarding social friendships in order to provide users with more accurate suggestions and rankings on items of their interest. The proposed approach has been evaluated on a real-life online social network; the experimental results show an improvement over existing CFSs. A detailed comparison with related literature is also presented.
Submitted 29 September, 2011;
originally announced September 2011.
-
Generalized Louvain Method for Community Detection in Large Networks
Authors:
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara,
Alessandro Provetti
Abstract:
In this paper we present a novel strategy to discover the community structure of (possibly large) networks. This approach is based on the well-known concept of network modularity optimization. To do so, our algorithm exploits a novel measure of edge centrality, based on k-paths. This technique makes it possible to efficiently compute an edge ranking in large networks in near linear time. Once the centrality ranking is calculated, the algorithm computes the pairwise proximity between nodes of the network. Finally, it discovers the community structure adopting a strategy inspired by the well-known state-of-the-art Louvain method (henceforth, LM), efficiently maximizing the network modularity. The experiments we carried out show that our algorithm outperforms other techniques and slightly improves on the results of the original LM, providing reliable results. Another advantage is that, unlike the LM, its adoption is naturally extended even to unweighted networks.
Submitted 10 February, 2012; v1 submitted 6 August, 2011;
originally announced August 2011.
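The abstract above describes community detection as modularity maximization. As a reminder of what is being maximized, the following is a minimal sketch of the standard Newman-Girvan modularity of a given partition of an unweighted, undirected graph. It is not the authors' algorithm (the k-path edge centrality and the Louvain-style optimization are not reproduced here); the function name `modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of a partition.

    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2
    where m is the number of edges, L_c the number of edges with both
    endpoints in c, and d_c the sum of the degrees of the nodes in c.
    Assumes an unweighted, undirected graph without self-loops.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c
    degree_sum = defaultdict(int)  # d_c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy graph: two triangles joined by one bridge edge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("c", "d"),
    ("d", "e"), ("e", "f"), ("d", "f"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(q_two, q_one)
```

Splitting the toy graph into its two triangles scores higher than placing every node in a single community, which is the kind of comparison a modularity-maximizing method performs repeatedly while searching for a partition.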
-
A Framework for Designing 3D Virtual Environments
Authors:
Salvatore Catanese,
Emilio Ferrara,
Giacomo Fiumara,
Francesco Pagano
Abstract:
The process of design and development of virtual environments can be supported by tools and frameworks, saving time on technical aspects and allowing a focus on the content. In this paper we present an academic framework which provides several levels of abstraction to ease this work. It includes state-of-the-art components we devised or integrated adopting open-source solutions in order to face specific problems. Its architecture is modular and customizable; the code is open-source.
Submitted 4 July, 2011;
originally announced July 2011.
-
Crawling Facebook for Social Network Analysis Purposes
Authors:
Salvatore A. Catanese,
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara,
Alessandro Provetti
Abstract:
We describe our work in the collection and analysis of massive data describing the connections between participants in online social networks. Alternative approaches to social network data collection are defined and evaluated in practice, against the popular Facebook Web site. Thanks to our ad-hoc, privacy-compliant crawlers, two large samples, comprising millions of connections, have been collected; the data is anonymous and organized as an undirected graph. We describe a set of tools that we developed to analyze specific properties of such social-network graphs, including, among others, degree distribution, centrality measures, scaling laws and distribution of friendship.
Submitted 31 May, 2011;
originally announced May 2011.
-
Rendering of 3D Dynamic Virtual Environments
Authors:
Salvatore Catanese,
Emilio Ferrara,
Giacomo Fiumara,
Francesco Pagano
Abstract:
In this paper we present a framework for the rendering of dynamic 3D virtual environments which can be integrated in the development of videogames. It includes methods to manage sounds and particle effects, paged static geometries, the support of a physics engine and various input systems. It has been designed with a modular structure to allow future expansions. We exploited some open-source state-of-the-art components such as OGRE, PhysX, ParticleUniverse, etc.; all of them have been properly integrated to obtain peculiar physical and environmental effects. The stand-alone version of the application is fully compatible with Direct3D and OpenGL APIs and adopts OpenAL APIs to manage audio cards. Finally, we devised a showcase demo which reproduces a dynamic 3D environment, including some particular effects: the alternation of day and night influencing the lighting of the scene, the rendering of terrain, water and vegetation, the reproduction of sounds and atmospheric agents.
Submitted 2 June, 2011; v1 submitted 22 March, 2011;
originally announced March 2011.
-
Analyzing the Facebook Friendship Graph
Authors:
Salvatore Catanese,
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara
Abstract:
Online Social Networks (OSNs) have in recent years acquired a huge and increasing popularity as one of the most important emerging Web phenomena, deeply modifying the behavior of users and contributing to build a solid substrate of connections and relationships among people using the Web. In this preliminary work, our purpose is to analyze Facebook, considering a significant sample of data reflecting relationships among subscribed users. Our goal is to extract, from this platform, relevant information about the distribution of these relations and exploit tools and algorithms provided by Social Network Analysis (SNA) to discover and, possibly, understand underlying similarities between the development of OSNs and real-life social networks.
Submitted 2 June, 2011; v1 submitted 23 November, 2010;
originally announced November 2010.
-
Living City, a Collaborative Browser-based Massively Multiplayer Online Game
Authors:
Emilio Ferrara,
Giacomo Fiumara,
Francesco Pagano
Abstract:
This work presents the design and implementation of our Browser-based Massively Multiplayer Online Game, Living City, a simulation game fully developed at the University of Messina. Living City is a persistent and real-time digital world, running in the Web browser environment and accessible to users without any client-side installation. Today Massively Multiplayer Online Games attract the attention of Computer Scientists both for their architectural peculiarity and the close interconnection with the social network phenomenon. We will cover these two aspects, paying particular attention to some aspects of the project: game balancing (e.g., algorithms behind time and money balancing); business logic (e.g., handling concurrency, cheating avoidance and availability); and, finally, social and psychological aspects involved in the collaboration of players, analyzing their activities and interconnections.
Submitted 23 November, 2010;
originally announced November 2010.