-
Spectra of correlators in the relaxation time approximation of kinetic theory
Authors:
Matej Bajec,
Sašo Grozdanov,
Alexander Soloviev
Abstract:
The relaxation time approximation (RTA) of the kinetic Boltzmann equation is likely the simplest window into the microscopic properties of collective real-time transport. Within this framework, we analytically compute all retarded two-point Green's functions of the energy-momentum tensor and a conserved $U(1)$ current in thermal states with classical massless particles (a `CFT') at non-zero density, and in the absence and presence of broken translational symmetry. This is done in $2+1$ and $3+1$ dimensions. RTA allows a full explicit analysis of the analytic structure of different correlators (poles versus branch cuts) and the transport properties that they imply (the thermoelectric conductivities, and the hydrodynamic, quasihydrodynamic and gapped mode dispersion relations). Our inherently weakly coupled analysis thereby also enables a direct comparison with previously known strongly coupled results in holographic CFTs dual to the Einstein-Maxwell-axion theories.
Submitted 26 April, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Non-forward radiative corrections to electron-carbon scattering
Authors:
M. Mihovilovič,
A. B. Weber,
P. Achenbach,
M. Bajec,
T. Beranek,
J. Beričič,
J. C. Bernauer,
D. Bosnar,
R. Böhm,
M. Cardinali,
L. Correa,
L. Debenjak,
A. Denig,
M. O. Distler,
A. Esser,
M. I. Feretti Bondy,
H. Fonvieille,
J. M. Friedrich,
I. Friščić,
K. Griffioen,
M. Hoek,
S. Kegel,
D. G. Middleton,
H. Merkel,
U. Müller
, et al. (15 additional authors not shown)
Abstract:
Radiative corrections to elastic scattering represent an important part of the interpretation of electron-induced nuclear reactions at small energy transfers, where they make for a dominant part of background. Here we present and validate a new event generator for mimicking QED radiative processes in electron-carbon scattering that exactly calculates the coherent sum of the Bethe-Heitler amplitudes for the leading diagrams and can be reliably employed for a more robust determination of inelastic cross-sections.
Submitted 21 March, 2023;
originally announced March 2023.
-
ANGLEr: A Next-Generation Natural Language Exploratory Framework
Authors:
Timotej Knez,
Marko Bajec,
Slavko Žitnik
Abstract:
Natural language processing is used for solving a wide variety of problems. Some scholars and interest groups working with language resources are not well versed in programming, so there is a need for a good graphical framework that allows users to quickly design and test natural language processing pipelines without the need for programming. The existing frameworks do not satisfy all the requirements for such a tool. We, therefore, propose a new framework that provides a simple way for its users to build language processing pipelines. It also allows a simple programming language agnostic way for adding new modules, which will help the adoption by natural language processing developers and researchers. The main parts of the proposed framework consist of (a) a pluggable Docker-based architecture, (b) a general data model, and (c) APIs description along with the graphical user interface. The proposed design is being used for implementation of a new natural language processing framework, called ANGLEr.
Submitted 10 May, 2022;
originally announced June 2022.
-
Smart contracts for container based video conferencing services: Architecture and implementation
Authors:
Sandi Gec,
Dejan Lavbič,
Marko Bajec,
Vlado Stankovski
Abstract:
Today, container-based virtualization is very popular due to the lightweight nature of containers and the ability to use them flexibly in various heterogeneously composed systems. This makes it possible to collaboratively develop services by sharing various types of resources, such as infrastructures, software and digitalized content. In this work, our home-made video-conferencing (VC) system is used to study resource usage optimisation in a business context. An application like this does not provide monetization possibilities to all involved stakeholders, including end users, cloud providers, software engineers and similar. Blockchain-related technologies, such as Smart Contracts (SC), offer a possibility to address some of these needs. We introduce a novel architecture for monetization of added value according to the preferences of the stakeholders that participate in joint software service offers. The developed architecture facilitates use case scenarios of service and resource offers according to fixed and dynamic pricing schemes, fixed usage period, prepaid quota for flexible usage, division of income, consensual decisions among collaborative service providers, and constraint-based usage of resources or services. Our container-based VC service, which is based on the Jitsi Meet Open Source software, is used to demonstrate the proposed architecture and the benefits of the investigated use cases.
Submitted 18 February, 2019; v1 submitted 11 August, 2018;
originally announced August 2018.
-
General Context-Aware Data Matching and Merging Framework
Authors:
Slavko Žitnik,
Lovro Šubelj,
Dejan Lavbič,
Olegas Vasilecas,
Marko Bajec
Abstract:
Due to numerous public information sources and services, many methods to combine heterogeneous data were proposed recently. However, general end-to-end solutions are still rare, especially systems taking into account different context dimensions. Therefore, the techniques often prove insufficient or are limited to a certain domain. In this paper we briefly review and rigorously evaluate a general framework for data matching and merging. The framework employs collective entity resolution and redundancy elimination using three dimensions of context types. In order to achieve domain-independent results, data is enriched with semantics and trust. However, the main contribution of the paper is the evaluation on five public domain-incompatible datasets. Furthermore, we introduce additional attribute, relationship, semantic and trust metrics, which allow complete framework management. Besides improving the overall results within the framework, the metrics could be of independent interest.
Submitted 26 July, 2018;
originally announced July 2018.
-
Empirical comparison of network sampling techniques
Authors:
Neli Blagus,
Lovro Šubelj,
Marko Bajec
Abstract:
In the past few years, the storage and analysis of large-scale and fast evolving networks have presented a great challenge. Therefore, a number of different techniques have been proposed for sampling large networks. In general, network exploration techniques approximate the original networks more accurately than random node and link selection. Yet, link selection with an additional subgraph induction step outperforms most other techniques. In this paper, we apply subgraph induction also to random walk and forest-fire sampling. We analyze different real-world networks and the changes of their properties introduced by sampling. We compare several sampling techniques based on the match between the original networks and their sampled variants. The results reveal that the techniques with subgraph induction underestimate the degree and clustering distribution, while overestimating the average degree and density of the original networks. Techniques without a subgraph induction step exhibit exactly the opposite behavior. Hence, the performance of the sampling techniques from the random selection category compared to network exploration sampling does not differ significantly, while clear differences exist between the techniques with a subgraph induction step and the ones without it.
Submitted 9 June, 2015; v1 submitted 8 June, 2015;
originally announced June 2015.
-
Quantifying the consistency of scientific databases
Authors:
Lovro Šubelj,
Marko Bajec,
Biljana Mileva Boshkoska,
Andrej Kastrin,
Zoran Levnajić
Abstract:
Science is a social process with far-reaching impact on our modern society. In recent years, for the first time, we are able to scientifically study science itself. This is enabled by massive amounts of data on scientific publications that are increasingly becoming available. The data is contained in several databases such as Web of Science or PubMed, maintained by various public and private entities. Unfortunately, these databases are not always consistent, which considerably hinders this study. Relying on the powerful framework of complex networks, we conduct a systematic analysis of the consistency among six major scientific databases. We found that identifying a single "best" database is far from easy. Nevertheless, our results indicate appreciable differences in mutual consistency of different databases, which we interpret as recipes for future bibliometric studies.
Submitted 13 May, 2015;
originally announced May 2015.
-
Do PageRank-based author rankings outperform simple citation counts?
Authors:
Dalibor Fiala,
Lovro Šubelj,
Slavko Žitnik,
Marko Bajec
Abstract:
The basic indicators of a researcher's productivity and impact are still the number of publications and their citation counts. These metrics are clear, straightforward, and easy to obtain. When a ranking of scholars is needed, for instance in grant, award, or promotion procedures, their use is the fastest and cheapest way of prioritizing some scientists over others. However, due to their nature, there is a danger of oversimplifying scientific achievements. Therefore, many other indicators have been proposed including the usage of the PageRank algorithm known for the ranking of webpages and its modifications suited to citation networks. Nevertheless, this recursive method is computationally expensive and even if it has the advantage of favouring prestige over popularity, its application should be well justified, particularly when compared to the standard citation counts. In this study, we analyze three large datasets of computer science papers in the categories of artificial intelligence, software engineering, and theory and methods and apply 12 different ranking methods to the citation networks of authors. We compare the resulting rankings with self-compiled lists of outstanding researchers selected as frequent editorial board members of prestigious journals in the field and conclude that there is no evidence of PageRank-based methods outperforming simple citation counts.
Submitted 12 May, 2015;
originally announced May 2015.
-
Sampling promotes community structure in social and information networks
Authors:
Neli Blagus,
Lovro Šubelj,
Gregor Weiss,
Marko Bajec
Abstract:
Any network studied in the literature is inevitably just a sampled representative of its real-world analogue. Additionally, network sampling is lately often applied to large networks to allow for their faster and more efficient analysis. Nevertheless, the changes in network structure introduced by sampling are still far from understood. In this paper, we study the presence of characteristic groups of nodes in sampled social and information networks. We consider different network sampling techniques including random node and link selection, network exploration and expansion. We first observe that the structure of social networks reveals densely linked groups like communities, while the structure of information networks is better described by modules of structurally equivalent nodes. However, despite these notable differences, the structure of sampled networks exhibits stronger characterization by community-like groups than the original networks, irrespective of their type and consistently across various sampling techniques. Hence, rich community structure commonly observed in social and information networks is to some extent merely an artifact of sampling.
Submitted 13 April, 2015;
originally announced April 2015.
-
Assessing the effectiveness of real-world network simplification
Authors:
Neli Blagus,
Lovro Šubelj,
Marko Bajec
Abstract:
Many real-world networks are large, complex and thus hard to understand, analyze or visualize. The data about networks is not always complete, their structure may be hidden or they change quickly over time. Therefore, understanding how an incomplete system differs from a complete one is crucial. In this paper, we study the changes in networks under simplification (i.e., reduction in size). We simplify 30 real-world networks with six simplification methods and analyze the similarity between original and simplified networks based on preservation of several properties, for example degree distribution, clustering coefficient, betweenness centrality, density and degree mixing. We propose an approach for assessing the effectiveness of the simplification process to define the most appropriate size of simplified networks and to determine the method that preserves the most properties of original networks. The results reveal that the type and size of original networks do not influence the changes of networks under the simplification process, while the size of simplified networks does. Moreover, we investigate the performance of simplification methods when the size of simplified networks is 10% of the original networks. The findings show that sampling methods outperform merging ones; in particular, random node selection based on degree and breadth-first sampling perform the best.
Submitted 18 February, 2015;
originally announced February 2015.
-
Node mixing and group structure of complex software networks
Authors:
Lovro Šubelj,
Slavko Žitnik,
Neli Blagus,
Marko Bajec
Abstract:
Large software projects are among the most sophisticated human-made systems, consisting of a network of interdependent parts. Past studies of software systems from the perspective of complex networks have already led to notable discoveries with different applications. Nevertheless, our comprehension of the structure of software networks remains only partial. We here investigate correlations or mixing between linked nodes and show that software networks reveal dichotomous node degree mixing similar to that recently observed in biological networks. We further show that software networks also reveal characteristic clustering profiles and mixing. Hence, node mixing in software networks significantly differs from that in, e.g., the Internet or social networks. We explain the observed mixing through the presence of groups of nodes with a common linking pattern. More precisely, besides densely linked groups known as communities, software networks also consist of disconnected groups denoted modules, core/periphery structures and others. Moreover, groups coincide with the intrinsic properties of the underlying software projects, which promotes practical applications in software engineering.
Submitted 17 February, 2015;
originally announced February 2015.
-
Network-based statistical comparison of citation topology of bibliographic databases
Authors:
Lovro Šubelj,
Dalibor Fiala,
Marko Bajec
Abstract:
Modern bibliographic databases provide the basis for scientific research and its evaluation. While their content and structure differ substantially, there exist only informal notions on their reliability. Here we compare the topological consistency of citation networks extracted from six popular bibliographic databases including Web of Science, CiteSeer and arXiv.org. The networks are assessed through a rich set of local and global graph statistics. We first reveal statistically significant inconsistencies between some of the databases with respect to individual statistics. For example, the introduced field bow-tie decomposition of DBLP Computer Science Bibliography substantially differs from the rest due to the coverage of the database, while the citation information within arXiv.org is the most exhaustive. Finally, we compare the databases over multiple graph statistics using the critical difference diagram. The citation topology of DBLP Computer Science Bibliography is the least consistent with the rest, while, not surprisingly, Web of Science is significantly more reliable from the perspective of consistency. This work can serve either as a reference for scholars in bibliometrics and scientometrics or a scientific evaluation guideline for governments and research agencies.
Submitted 17 February, 2015;
originally announced February 2015.
-
Group detection in complex networks: An algorithm and comparison of the state of the art
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Complex real-world networks commonly reveal characteristic groups of nodes like communities and modules. These are of value in various applications, especially in the case of large social and information networks. However, while numerous community detection techniques have been presented in the literature, approaches for other groups of nodes are relatively rare and often limited in some way. We present a simple propagation-based algorithm for general group detection that requires no a priori knowledge and has near ideal complexity. The main novelty here is that different types of groups are revealed through an adequate hierarchical group refinement procedure. The proposed algorithm is validated on various synthetic and real-world networks, and rigorously compared against twelve other state-of-the-art approaches on group detection, hierarchy discovery and link prediction tasks. The algorithm is comparable to the state of the art in community detection, while superior in general group detection and link prediction. Based on the comparison, we also discuss some prominent directions for future work on group detection in complex networks.
Submitted 27 December, 2013; v1 submitted 22 May, 2013;
originally announced May 2013.
-
Model of complex networks based on citation dynamics
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Complex networks of real-world systems are believed to be controlled by common phenomena, producing structures far from regular or random. These include scale-free degree distributions, small-world structure and assortative mixing by degree, which are also the properties captured by different random graph models proposed in the literature. However, many (non-social) real-world networks are in fact disassortative by degree. Thus, we here propose a simple evolving model that generates networks with the most common properties of real-world networks, including degree disassortativity. Furthermore, the model has a natural interpretation for citation networks with different practical applications.
Submitted 23 March, 2013;
originally announced March 2013.
-
Software systems through complex networks science: Review, analysis and applications
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Complex software systems are among the most sophisticated human-made systems, yet only little is known about the actual structure of 'good' software. We here study different software systems developed in Java from the perspective of network science. The study reveals that network theory can provide a prominent set of techniques for the exploratory analysis of large complex software systems. We further identify several applications in software engineering, and propose different network-based quality indicators that address software design, efficiency, reusability, vulnerability, controllability and others. We also highlight various interesting findings, e.g., software systems are highly vulnerable to processes like bug propagation; however, they are not easily controllable.
Submitted 13 August, 2012;
originally announced August 2012.
-
Clustering assortativity, communities and functional modules in real-world networks
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Complex networks of real-world systems are believed to be controlled by common phenomena, producing structures far from regular or random. Clustering, community structure and assortative mixing by degree are perhaps among the most prominent examples of the latter. Although generally accepted for social networks, these properties only partially explain the structure of other networks. We first show that degree-corrected clustering is, in contrast to the standard definition, highly assortative. Although this is interesting on its own, we further note that non-social networks contain connected regions with very low clustering. Hence, the structure of real-world networks is beyond communities. We here investigate the concept of functional modules (groups of regularly equivalent nodes) and show that such structures could explain the properties observed in non-social networks. Real-world networks might be composed of functional modules that are overlaid by communities. We support the latter by proposing a simple network model that generates scale-free small-world networks with tunable clustering and degree mixing. The model has a natural interpretation in many real-world networks, while it also gives insights into an adequate community extraction framework. We also present an algorithm for detection of arbitrary structural modules without any prior knowledge. The algorithm is shown to be superior to the state of the art, while application to real-world networks reveals well supported composites of different structural modules that are consistent with the underlying systems. Clear functional modules are identified in all types of networks, including social ones. Our findings thus expose functional modules as another key ingredient of complex real-world networks.
Submitted 14 February, 2012;
originally announced February 2012.
-
Self-similar scaling of density in complex real-world networks
Authors:
Neli Blagus,
Lovro Šubelj,
Marko Bajec
Abstract:
Despite their diverse origin, networks of large real-world systems reveal a number of common properties including small-world phenomena, scale-free degree distributions and modularity. Recently, network self-similarity as a natural outcome of the evolution of real-world systems has also attracted much attention within the physics literature. Here we investigate the scaling of density in complex networks under two classical box-covering renormalizations (network coarse-graining) and also different community-based renormalizations. The analysis on over 50 real-world networks reveals a power-law scaling of network density and size under adequate renormalization technique, yet irrespective of network type and origin. The results thus advance a recent discovery of a universal scaling of density among different real-world networks [Laurienti et al., Physica A 390 (20) (2011) 3608-3613.] and imply the existence of a scale-free density also within (among different self-similar scales of) complex real-world networks. The latter further improves the comprehension of self-similar structure in large real-world networks with several possible applications.
Submitted 4 December, 2011; v1 submitted 25 October, 2011;
originally announced October 2011.
-
Generalized network community detection
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Community structure is largely regarded as an intrinsic property of complex real-world networks. However, recent studies reveal that networks comprise even more sophisticated modules than classical cohesive communities. More precisely, real-world networks can also be naturally partitioned according to common patterns of connections between the nodes. Recently, a propagation-based algorithm has been proposed for the detection of arbitrary network modules. We here advance the latter with a more adequate community modeling based on network clustering. The resulting algorithm is evaluated on various synthetic benchmark networks and random graphs. It is shown to be comparable to current state-of-the-art algorithms; however, in contrast to other approaches, it does not require any prior knowledge of the true community structure. To demonstrate its generality, we further employ the proposed algorithm for community detection in different unipartite and bipartite real-world networks, for generalized community detection and also predictive data clustering.
Submitted 12 October, 2011;
originally announced October 2011.
-
Robust network community detection using balanced propagation
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Label propagation has proven to be an extremely fast method for detecting communities in large complex networks. Furthermore, due to its simplicity, it is also currently one of the most commonly adopted algorithms in the literature. Despite various subsequent advances, an important issue of the algorithm has not yet been properly addressed. Random (node) update orders within the algorithm severely hamper its robustness, and consequently also the stability of the identified community structure. We note that an update order can be seen as increasing propagation preferences from certain nodes, and propose a balanced propagation that counteracts the introduced randomness by utilizing node balancers. We have evaluated the proposed approach on synthetic networks with planted partition, and on several real-world networks with community structure. The results confirm that balanced propagation is significantly more robust than label propagation, while the performance of community detection is even improved. Thus, balanced propagation retains the high scalability and algorithmic simplicity of label propagation, but improves on its stability and performance.
Submitted 27 June, 2011;
originally announced June 2011.
-
Community structure of complex software systems: Analysis and applications
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Due to notable discoveries in the fast-evolving field of complex networks, recent research in software engineering has also focused on representing software systems with networks. Previous work has observed that these networks follow scale-free degree distributions and reveal small-world phenomena, while we here explore another property commonly found in different complex networks, i.e. community structure. We adopt class dependency networks, where nodes represent software classes and edges represent dependencies among them, and show that these networks reveal a significant community structure, characterized by properties similar to those observed in other complex networks. However, although intuitive and anticipated by different phenomena, identified communities do not exactly correspond to software packages. We empirically confirm our observations on several networks constructed from Java and various third-party libraries, and propose different applications of community detection to software engineering.
Submitted 21 May, 2011;
originally announced May 2011.
-
An expert system for detecting automobile insurance fraud using social network analysis
Authors:
Lovro Šubelj,
Štefan Furlan,
Marko Bajec
Abstract:
The article proposes an expert system for detection, and subsequent investigation, of groups of collaborating automobile insurance fraudsters. The system is described and examined in great detail, and several technical difficulties in detecting fraud are also considered, so that it is applicable in practice. In contrast to many other approaches, the system uses networks for representation of data. Networks are the most natural representation of such a relational domain, allowing formulation and analysis of complex relations between entities. Fraudulent entities are found by employing a novel assessment algorithm, \textit{Iterative Assessment Algorithm} (\textit{IAA}), also presented in the article. Besides intrinsic attributes of entities, the algorithm also explores the relations between entities. The prototype was evaluated and rigorously analyzed on real-world data. Results show that automobile insurance fraud can be efficiently detected with the proposed system and that appropriate data representation is vital.
Submitted 19 April, 2011;
originally announced April 2011.
-
Ubiquitousness of link-density and link-pattern communities in real-world networks
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Community structure appears to be an intrinsic property of many complex real-world networks. However, recent work shows that real-world networks reveal even more sophisticated modules than classical cohesive (link-density) communities. In particular, networks can also be naturally partitioned according to similar patterns of connectedness among the nodes, revealing link-pattern communities. We here propose a propagation-based algorithm that can extract both link-density and link-pattern communities, without any prior knowledge of the true structure. The algorithm was first validated on different classes of synthetic benchmark networks with community structure, and also on random networks. We have further applied the algorithm to different social, information, technological and biological networks, where it indeed reveals meaningful (composites of) link-density and link-pattern communities. The results thus seem to imply that, similarly to their link-density counterparts, link-pattern communities appear ubiquitous in nature and design.
Submitted 24 October, 2011; v1 submitted 15 April, 2011;
originally announced April 2011.
-
Unfolding network communities by combining defensive and offensive label propagation
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Label propagation has proven to be a fast method for detecting communities in complex networks. Recent work has also improved the accuracy and stability of the basic algorithm; however, a general approach is still an open issue. We propose different label propagation algorithms that convey two unique strategies of community formation, namely, defensive preservation and offensive expansion of communities. Furthermore, the strategies are combined in an advanced label propagation algorithm that retains the advantages of both approaches, and are enhanced with hierarchical community extraction, prominent for use on larger networks. The proposed algorithms were empirically evaluated on different benchmark networks with planted partition and on over 30 real-world networks of various types and sizes. The results confirm the adequacy of the propositions and give promising grounds for future analysis of (large) complex networks. Nevertheless, the main contribution of this work is in showing that different types of networks (with different topological properties) favor different strategies of community formation.
Submitted 14 March, 2011;
originally announced March 2011.
-
Unfolding communities in large complex networks: Combining defensive and offensive label propagation for core extraction
Authors:
Lovro Šubelj,
Marko Bajec
Abstract:
Label propagation has proven to be a fast method for detecting communities in large complex networks. Recent developments have also improved the accuracy of the approach; however, a general algorithm is still an open issue. We present an advanced label propagation algorithm that combines two unique strategies of community formation, namely, defensive preservation and offensive expansion of communities. The two strategies are combined in a hierarchical manner, to recursively extract the core of the network, and to identify whisker communities. The algorithm was evaluated on two classes of benchmark networks with planted partition and on almost 25 real-world networks ranging from networks with tens of nodes to networks with several tens of millions of edges. It is shown to be comparable to the current state-of-the-art community detection algorithms and superior to all previous label propagation algorithms, with comparable time complexity. In particular, analysis on real-world networks has proven that the algorithm has almost linear complexity, $\mathcal{O}(m^{1.19})$, and scales even better than the basic label propagation algorithm ($m$ is the number of edges in the network).
Submitted 14 March, 2011;
originally announced March 2011.