-
Community detection in bipartite signed networks is highly dependent on parameter choice
Authors:
Elena Candellone,
Erik-Jan van Kesteren,
Sofia Chelmi,
Javier Garcia-Bernardo
Abstract:
Decision-making processes often involve voting. Human interactions with exogenous entities such as legislation or products can be effectively modeled as two-mode (bipartite) signed networks, where people can vote positively, vote negatively, or abstain from voting on the entities. Detecting communities in such networks could help us understand underlying properties: for example, ideological camps or consumer preferences. While community detection is an established practice separately for bipartite and signed networks, it remains largely unexplored in the case of bipartite signed networks. In this paper, we systematically evaluate the efficacy of community detection methods on bipartite signed networks using a synthetic benchmark and real-world datasets. Our findings reveal that when no communities are present in the data, these methods often recover spurious communities. When communities are present, the algorithms exhibit promising performance, although their performance is highly susceptible to parameter choice. This indicates that researchers using community detection methods in the context of bipartite signed networks should not take the communities found at face value: it is essential to assess the robustness of parameter choices or perform domain-specific external validation.
Submitted 13 May, 2024;
originally announced May 2024.
-
The Impact of School and Family Networks on COVID-19 Infections Among Dutch Students: A Study Using Population-Level Registry Data
Authors:
Javier Garcia-Bernardo,
Christine Hedde-von Westernhagen,
Tom Emery,
Albert Jan van Hoek
Abstract:
Understanding the impact of different social interactions is key to improving epidemic models. Here, we use extensive registry data -- including PCR test results and population-level networks -- to investigate the impact of school, family, and other social contacts on SARS-CoV-2 transmission in the Netherlands (June 2020--October 2021). We isolate and compare different contexts of potential SARS-CoV-2 transmission by matching pairs of students based on their attendance at the same or different primary school (in 2020) and secondary school (in 2021) and their geographic proximity. We then calculated the probability of temporally associated infections -- i.e. the probability of both students testing positive within a 14-day period.
Our results highlight the relative importance of household and family transmission in the spread of SARS-CoV-2 compared to school settings. The probability of temporally associated infections for siblings and parent-child pairs living in the same household was 22.6--23.2%, and 4.7--7.9% for family members living in different households. In contrast, the probability of temporally associated infections was 0.52% for pairs of students living nearby but not attending the same primary or secondary school, 0.66% for pairs attending different secondary schools but having attended the same primary school, and 1.65% for pairs attending the same secondary school. Finally, we used multilevel regression analyses to examine how individual, school, and geographic factors contribute to transmission risk. We found that the largest differences in transmission probabilities were due to unobserved individual (60%) and school-level (34%) factors. Only a small proportion (3%) could be attributed to geographic proximity of students or to school size, denomination, or the median income of the school area.
Submitted 11 April, 2024;
originally announced April 2024.
-
Predicting COVID-19 Infections Using Multi-layer Centrality Measures in Population-scale Networks
Authors:
Christine Hedde-von Westernhagen,
Javier Garcia-Bernardo,
Ayoub Bagheri
Abstract:
Understanding the spread of SARS-CoV-2 has been one of the most pressing problems of the recent past. Network models present a potent approach to studying such spreading phenomena because of their ability to represent complex social interactions. While previous studies have shown that network centrality measures are generally able to identify influential spreaders in a susceptible population, it is not yet known if they can also be used to predict infection risks. However, information about infection risks at the individual level is vital for the design of targeted interventions. Here, we use large-scale administrative data from the Netherlands to study whether centrality measures can predict the risk and timing of infections with COVID-19-like diseases. We investigate this issue leveraging the framework of multi-layer networks, which accounts for interactions taking place in different contexts, such as workplaces, households and schools. In epidemic models simulated on real-world network data from over one million individuals, we find that existing centrality measures offer good predictions of relative infection risks, and are correlated with the timing of individual infections. However, we find no association between centrality measures and real SARS-CoV-2 test data, which indicates that population-scale network data alone cannot aid predictions of virus transmission.
Submitted 31 October, 2023; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Uncovering Offshore Financial Centers: Conduits and Sinks in the Global Corporate Ownership Network
Authors:
Javier Garcia-Bernardo,
Jan Fichtner,
Eelke M. Heemskerk,
Frank W. Takes
Abstract:
Multinational corporations use highly complex structures of parents and subsidiaries to organize their operations and ownership. Offshore Financial Centers (OFCs) facilitate these structures through low taxation and lenient regulation, but are increasingly under scrutiny, for instance for enabling tax avoidance. Therefore, the identification of OFC jurisdictions has become a politicized and contested issue. We introduce a novel data-driven approach for identifying OFCs based on the global corporate ownership network, in which over 98 million firms (nodes) are connected through 71 million ownership relations. This granular firm-level network data uniquely allows identifying both sink-OFCs and conduit-OFCs. Sink-OFCs attract and retain foreign capital while conduit-OFCs are attractive intermediate destinations in the routing of international investments and enable the transfer of capital without taxation. We identify 24 sink-OFCs. In addition, a small set of five countries -- the Netherlands, the United Kingdom, Ireland, Singapore and Switzerland -- canalize the majority of corporate offshore investment as conduit-OFCs. Each conduit jurisdiction is specialized in a geographical area and there is significant specialization based on industrial sectors. Against the idea of OFCs as exotic small islands that cannot be regulated, we show that many sink and conduit-OFCs are highly developed countries.
Submitted 29 May, 2017; v1 submitted 8 March, 2017;
originally announced March 2017.
-
The Effects of Data Quality on the Analysis of Corporate Board Interlock Networks
Authors:
Javier Garcia-Bernardo,
Frank W. Takes
Abstract:
Nowadays, social networks of ever-increasing size are studied by researchers from a range of disciplines. The data underlying these networks is often automatically gathered from APIs, websites or existing databases. As a result, the quality of this data is typically not manually validated, and the resulting networks may be based on false, biased or incomplete data. In this paper, we investigate the effect of data quality issues on the analysis of large networks. We focus on the global board interlock network, in which nodes represent firms across the globe, and edges model social ties between firms -- shared board members holding a position at both firms. First, we demonstrate how we can automatically assess the completeness of a large dataset of 160 million firms, in which data is missing not at random. Second, we present a novel method to increase the accuracy of the entries in our data. By comparing the expected and empirical characteristics of the resulting network topology, we develop a technique that automatically prunes and merges duplicate nodes and edges. Third, we use a case study of the board interlock network of Sweden to show how poor quality data results in incorrect network topologies, biased centrality values and abnormal influence spread under a well-known diffusion model. Finally, we demonstrate how our data quality assessment methods help restore the correct network structure, ultimately allowing us to derive meaningful and correct results from analyzing the network.
Submitted 21 December, 2016; v1 submitted 5 December, 2016;
originally announced December 2016.
-
Where is the global corporate elite? A large-scale network study of local and nonlocal interlocking directorates
Authors:
Eelke M. Heemskerk,
Frank W. Takes,
Javier Garcia-Bernardo,
M. Jouke Huijzer
Abstract:
Business elites reconfigure their locus of organization over time, from the city level, to the national level, and beyond. We ask what the current level of elite organization is and propose a novel theoretical and empirical approach to answer this question. Building on the universal distinction between local and nonlocal ties, we use network analysis and community detection to dissect the global network of interlocking directorates among over five million firms. We find that elite orientation is indeed changing from the national to the transnational plane, but we register considerable heterogeneity across different regions of the world. In some regions the business communities are organized along national borders, whereas in other areas the locus of organization is at the city level or international level. London dominates the global corporate elite network. Our findings underscore that the study of corporate elites requires an approach that is sensitive to levels of organization that go beyond the confines of nation states.
Submitted 25 July, 2016; v1 submitted 16 April, 2016;
originally announced April 2016.
-
Social media affects the timing, location, and severity of school shootings
Authors:
J. Garcia-Bernardo,
H. Qi,
J. M. Shultz,
A. M. Cohen,
N. F. Johnson,
P. S. Dodds
Abstract:
Over the past two decades, school shootings within the United States have repeatedly devastated communities and shaken public opinion. Many of these attacks appear to be 'lone wolf' ones driven by specific individual motivations, and the identification of precursor signals and hence actionable policy measures would thus seem highly unlikely. Here, we take a system-wide view and investigate the timing of school attacks and the dynamical feedback with social media. We identify a trend divergence in which college attacks have continued to accelerate over the last 25 years while those carried out on K-12 schools have slowed down. We establish the copycat effect in school shootings and uncover a statistical association between social media chatter and the probability of an attack in the following days. While hinting at causality, this relationship may also help mitigate the frequency and intensity of future attacks.
Submitted 5 November, 2018; v1 submitted 20 June, 2015;
originally announced June 2015.
-
Quantitative patterns in drone wars
Authors:
Javier Garcia-Bernardo,
Peter Sheridan Dodds,
Neil F. Johnson
Abstract:
Attacks by drones (i.e., unmanned combat air vehicles) continue to generate heated political and ethical debates. Here we examine the quantitative nature of drone attacks, focusing on how their intensity and frequency compare with those of other forms of human conflict. Instead of the power-law distribution found recently for insurgent and terrorist attacks, the severity of attacks is more akin to lognormal and exponential distributions, suggesting that the dynamics underlying drone attacks lie beyond these other forms of human conflict. We find that the pattern in the timing of attacks is consistent with one side having almost complete control, an important if expected result. We show that these novel features can be reproduced and understood using a generative mathematical model in which resource allocation to the dominant side is regulated through a feedback loop.
Submitted 10 October, 2015; v1 submitted 15 July, 2014;
originally announced July 2014.