-
An embedding-based distance for temporal graphs
Authors:
Lorenzo Dall'Amico,
Alain Barrat,
Ciro Cattuto
Abstract:
We define a distance between temporal graphs based on graph embeddings built using time-respecting random walks. We study both the case of matched graphs, when there exists a known relation between the nodes, and the unmatched case, when such a relation is unavailable and the graphs may be of different sizes. We illustrate the interest of our distance definition, using both real and synthetic temporal network data, by showing its ability to discriminate between graphs with different structural and temporal properties. Leveraging state-of-the-art machine learning techniques, we propose an efficient implementation of distance computation that is viable for large-scale temporal graphs.
Submitted 23 January, 2024;
originally announced January 2024.
-
Estimating household contact matrices structure from easily collectable metadata
Authors:
Lorenzo Dall'Amico,
Jackie Kleynhans,
Laetitia Gauvin,
Michele Tizzoni,
Laura Ozella,
Mvuyo Makhasi,
Nicole Wolter,
Brigitte Language,
Ryan G. Wagner,
Cheryl Cohen,
Stefano Tempia,
Ciro Cattuto
Abstract:
Contact matrices are a commonly adopted data representation, used to develop compartmental models for epidemic spreading, accounting for the contact heterogeneities across age groups. Their estimation, however, is generally time- and effort-consuming, and model-driven strategies to quantify the contacts are often needed. In this article we focus on household contact matrices, which describe the contacts among the members of a family, and develop a parametric model to describe them. This model combines demographic and easily quantifiable survey-based data and is tested on high-resolution proximity data collected in two sites in South Africa. Given its simplicity and interpretability, we expect our method to be easily applied to other contexts as well, and we identify relevant questions that need to be addressed during the data collection procedure.
Submitted 24 July, 2023; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Epidemiological and public health requirements for COVID-19 contact tracing apps and their evaluation
Authors:
Vittoria Colizza,
Eva Grill,
Rafael Mikolajczyk,
Ciro Cattuto,
Adam Kucharski,
Steven Riley,
Michelle Kendall,
Katrina Lythgoe,
Lucie Abeler-Dörner,
Chris Wymant,
David Bonsall,
Luca Ferretti,
Christophe Fraser
Abstract:
Digital contact tracing is a public health intervention. It should be integrated with local health policy, provide rapid and accurate notifications to exposed individuals, and encourage high app uptake and adherence to quarantine. Real-time monitoring and evaluation of effectiveness of app-based contact tracing is key for improvement and public trust.
Submitted 10 February, 2021;
originally announced February 2021.
-
Using wearable proximity sensors to characterize social contact patterns in a village of rural Malawi
Authors:
Laura Ozella,
Daniela Paolotti,
Guilherme Lichand,
Jorge P. Rodriguez,
Simon Haenni,
John Phuka,
Onicio B. Leal-Neto,
Ciro Cattuto
Abstract:
Measuring close proximity interactions between individuals can provide key information on social contacts in human communities. With the present study, we report the quantitative assessment of contact patterns in a village in rural Malawi, based on proximity sensors technology that allows for high-resolution measurements of social contacts. The system provided information on community structure of the village, on social relationships and social assortment between individuals, and on daily contacts activity within the village. Our findings revealed that the social network presented communities that were highly correlated with household membership, thus confirming the importance of family ties within the village. Contacts within households occur mainly between adults and children, and adults and adolescents. This result suggests that the principal role of adults within the family is the care for the youngest. Most of the inter-household interactions occurred among caregivers and among adolescents. We studied the tendency of participants to interact with individuals with whom they shared similar attributes (i.e., assortativity). Age and gender assortativity were observed in inter-household network, showing that individuals not belonging to the same family group prefer to interact with people with whom they share similar age and gender. Age disassortativity is observed in intra-household networks. Family members congregate in the early morning, during lunch time and dinner time. In contrast, individuals not belonging to the same household displayed a growing contact activity from the morning, reaching a maximum in the afternoon. The data collection infrastructure used in this study seems to be very effective to capture the dynamics of contacts by collecting high resolution temporal data and to give access to the level of information needed to understand the social context of the village.
Submitted 20 December, 2020;
originally announced December 2020.
-
An individual-level ground truth dataset for home location detection
Authors:
Luca Pappalardo,
Leo Ferres,
Manuel Sacasa,
Ciro Cattuto,
Loreto Bravo
Abstract:
Home detection, assigning a phone device to its home antenna, is a ubiquitous part of most studies in the literature on mobile phone data. Despite its widespread use, home detection relies on a few assumptions that are difficult to check without ground truth, i.e., where the individual that owns the device resides. In this paper, we provide an unprecedented evaluation of the accuracy of home detection algorithms on a group of sixty-five participants for whom we know their exact home address and the antennas that might serve them. Besides, we analyze not only Call Detail Records (CDRs) but also two other mobile phone streams: eXtended Detail Records (XDRs, the "data" channel) and Control Plane Records (CPRs, the network stream). These data streams differ not only in their temporal granularity but also in their data generation mechanism, e.g., CDRs are purely human-triggered while CPRs are purely machine-triggered events. Finally, we quantify the amount of data that is needed for each stream to carry out successful home detection. We find that the choice of stream and algorithm heavily influences home detection, with an hour-of-day algorithm for the XDRs performing the best, and with CPRs performing best for the amount of data needed to perform home detection. Our work is useful for researchers and practitioners in order to minimize data requests and to maximize the accuracy of home antenna location.
Submitted 17 October, 2020;
originally announced October 2020.
-
Relevance of temporal cores for epidemic spread in temporal networks
Authors:
Martino Ciaperoni,
Edoardo Galimberti,
Francesco Bonchi,
Ciro Cattuto,
Francesco Gullo,
Alain Barrat
Abstract:
Temporal networks are widely used to represent a vast diversity of systems, including in particular social interactions, and the spreading processes unfolding on top of them. The identification of structures playing important roles in such processes remains largely an open question, despite recent progress in the case of static networks. Here, we consider as candidate structures the recently introduced concept of span-cores: the span-cores decompose a temporal network into subgraphs of controlled duration and increasing connectivity, generalizing the core-decomposition of static graphs. To assess the relevance of such structures, we explore the effectiveness of strategies aimed either at containing or maximizing the impact of a spread, based respectively on removing span-cores of high cohesiveness or duration to decrease the epidemic risk, or on seeding the process from such structures. The effectiveness of such strategies is assessed in a variety of empirical data sets and compared to baselines that use only static information on the centrality of nodes and static concepts of coreness, as well as to a baseline based on a temporal centrality measure. Our results show that the most stable and cohesive temporal cores indeed play an important role in epidemic processes on temporal networks, and that their nodes are likely to represent influential spreaders.
Submitted 9 July, 2020; v1 submitted 20 March, 2020;
originally announced March 2020.
-
Span-core Decomposition for Temporal Networks: Algorithms and Applications
Authors:
Edoardo Galimberti,
Martino Ciaperoni,
Alain Barrat,
Francesco Bonchi,
Ciro Cattuto,
Francesco Gullo
Abstract:
When analyzing temporal networks, a fundamental task is the identification of dense structures (i.e., groups of vertices that exhibit a large number of links), together with their temporal span (i.e., the period of time for which the high density holds). In this paper we tackle this task by introducing a notion of temporal core decomposition where each core is associated with two quantities, its coreness, which quantifies how densely it is connected, and its span, which is a temporal interval: we call such cores \emph{span-cores}.
For a temporal network defined on a discrete temporal domain $T$, the total number of time intervals included in $T$ is quadratic in $|T|$, so that the total number of span-cores is potentially quadratic in $|T|$ as well. Our first main contribution is an algorithm that, by exploiting containment properties among span-cores, computes all the span-cores efficiently. Then, we focus on the problem of finding only the \emph{maximal span-cores}, i.e., span-cores that are not dominated by any other span-core by both their coreness property and their span. We devise a very efficient algorithm that exploits theoretical findings on the maximality condition to directly extract the maximal ones without computing all span-cores.
Finally, as a third contribution, we introduce the problem of \emph{temporal community search}, where a set of query vertices is given as input, and the goal is to find a set of densely-connected subgraphs containing the query vertices and covering the whole underlying temporal domain $T$. We derive a connection between this problem and the problem of finding (maximal) span-cores. Based on this connection, we show how temporal community search can be solved in polynomial-time via dynamic programming, and how the maximal span-cores can be profitably exploited to significantly speed-up the basic algorithm.
Submitted 31 July, 2020; v1 submitted 6 October, 2019;
originally announced October 2019.
-
DyANE: Dynamics-aware node embedding for temporal networks
Authors:
Koya Sato,
Mizuki Oka,
Alain Barrat,
Ciro Cattuto
Abstract:
Low-dimensional vector representations of network nodes have proven successful to feed graph data to machine learning algorithms and to improve performance across diverse tasks. Most of the embedding techniques, however, have been developed with the goal of achieving dense, low-dimensional encoding of network structure and patterns. Here, we present a node embedding technique aimed at providing low-dimensional feature vectors that are informative of dynamical processes occurring over temporal networks -- rather than of the network structure itself -- with the goal of enabling prediction tasks related to the evolution and outcome of these processes. We achieve this by using a modified supra-adjacency representation of temporal networks and building on standard embedding techniques for static graphs based on random-walks. We show that the resulting embedding vectors are useful for prediction tasks related to paradigmatic dynamical processes, namely epidemic spreading over empirical temporal networks. In particular, we illustrate the performance of our approach for the prediction of nodes' epidemic states in a single instance of a spreading process. We show how framing this task as a supervised multi-label classification task on the embedding vectors allows us to estimate the temporal evolution of the entire system from a partial sampling of nodes at random times, with potential impact for nowcasting infectious disease dynamics.
Submitted 29 August, 2020; v1 submitted 12 September, 2019;
originally announced September 2019.
-
Gender gaps in urban mobility
Authors:
Laetitia Gauvin,
Michele Tizzoni,
Simone Piaggesi,
Andrew Young,
Natalia Adler,
Stefaan Verhulst,
Leo Ferres,
Ciro Cattuto
Abstract:
The use of public transportation or simply moving about in streets are gendered issues. Women and girls often engage in multi-purpose, multi-stop trips in order to do household chores, work, and study ('trip chaining'). Women-headed households are often more prominent in urban settings and they tend to work more in low-paid/informal jobs than men, with limited access to transportation subsidies. Here we present recent results on urban mobility from a gendered perspective by uniquely combining a wide range of datasets, including commercial sources of telecom and open data. We explored urban mobility of women and men in the greater metropolitan area of Santiago, Chile, by analyzing the mobility traces extracted from the Call Detail Records (CDRs) of a large cohort of anonymized mobile phone users over a period of 3 months. We find that, taking into account the differences in users' calling behaviors, women move less than men, visiting fewer unique locations and distributing their time less equally among such locations. By mapping gender differences in mobility over the 52 comunas of Santiago, we find a higher mobility gap to be correlated with socio-economic indicators, such as a lower average income, and with the lack of public and private transportation options. Such results provide new insights for policymakers to design more gender-inclusive transportation plans in the city of Santiago.
Submitted 21 June, 2019;
originally announced June 2019.
-
Wearable proximity sensors for monitoring a mass casualty incident exercise: a feasibility study
Authors:
Laura Ozella,
Laetitia Gauvin,
Luca Carenzo,
Marco Quaggiotto,
Pier Luigi Ingrassia,
Michele Tizzoni,
André Panisson,
Davide Colombo,
Anna Sapienza,
Kyriaki Kalimeri,
Francesco Della Corte,
Ciro Cattuto
Abstract:
Over the past several decades, naturally occurring and man-made mass casualty incidents (MCI) have increased in frequency and number, worldwide. To test the impact of such events on medical resources, simulations can provide a safe, controlled setting while replicating the chaotic environment typical of an actual disaster. A standardised method to collect and analyse data from mass casualty exercises is needed, in order to assess preparedness and performance of the healthcare staff involved. We report on the use of wearable proximity sensors to measure proximity events during an MCI simulation. We investigated the interactions between medical staff and patients, to evaluate the time dedicated by the medical staff with respect to the severity of the injury of the victims depending on the roles. We estimated the presence of the patients in the different spaces of the field hospital, in order to study the patients' flow. Data were obtained and collected through the deployment of wearable proximity sensors during a mass casualty incident functional exercise. The scenario included two areas: the accident site and the Advanced Medical Post (AMP), and the exercise lasted 3 hours. A total of 238 participants simulating medical staff and victims were involved. Each participant wore a proximity sensor and 30 fixed devices were placed in the field hospital. The contact networks show a heterogeneous distribution of the cumulative time spent in proximity by participants. We obtained contact matrices based on cumulative time spent in proximity between victims and the rescuers. Our results showed that the time spent in proximity by the healthcare teams with the victims is related to the severity of the patient's injury. The analysis of patients' flow showed that the presence of patients in the rooms of the hospital is consistent with triage code and diagnosis, and no obvious bottlenecks were found.
Submitted 18 September, 2018;
originally announced September 2018.
-
Mining (maximal) span-cores from temporal networks
Authors:
Edoardo Galimberti,
Alain Barrat,
Francesco Bonchi,
Ciro Cattuto,
Francesco Gullo
Abstract:
When analyzing temporal networks, a fundamental task is the identification of dense structures (i.e., groups of vertices that exhibit a large number of links), together with their temporal span (i.e., the period of time for which the high density holds). We tackle this task by introducing a notion of temporal core decomposition where each core is associated with its span: we call such cores span-cores.
As the total number of time intervals is quadratic in the size of the temporal domain $T$ under analysis, the total number of span-cores is quadratic in $|T|$ as well. Our first contribution is an algorithm that, by exploiting containment properties among span-cores, computes all the span-cores efficiently. Then, we focus on the problem of finding only the maximal span-cores, i.e., span-cores that are not dominated by any other span-core by both the coreness property and the span. We devise a very efficient algorithm that exploits theoretical findings on the maximality condition to directly compute the maximal ones without computing all span-cores.
Experimentation on several real-world temporal networks confirms the efficiency and scalability of our methods. Applications on temporal networks, gathered by a proximity-sensing infrastructure recording face-to-face interactions in schools, highlight the relevance of the notion of (maximal) span-core in analyzing social dynamics and detecting/correcting anomalies in the data.
Submitted 28 August, 2018;
originally announced August 2018.
-
Shopping Mall Attraction and Social Mixing at a City Scale
Authors:
Mariano G. Beiró,
Loreto Bravo,
Diego Caro,
Ciro Cattuto,
Leo Ferres,
Eduardo Graells-Garrido
Abstract:
The social inclusion aspects of shopping malls and their effects on our understanding of urban spaces have been a controversial argument largely discussed in the literature. Shopping malls offer an open, safe and democratic version of the public space. Many of their detractors suggest that malls target their customers in subtle ways, promoting social exclusion. In this work, we analyze whether malls offer opportunities for social mixing by analyzing the patterns of shopping mall visits in a large Latin-American city: Santiago de Chile.
We use a large XDR (Data Detail Records) dataset from a telecommunication company to analyze the mobility of $387,152$ cell phones around $16$ large malls in Santiago de Chile during one month. We model the influx of people to malls in terms of a gravity model of mobility, and we are able to predict the customer profile distribution of each mall, explaining it in terms of mall location, the population distribution, and mall size.
Then, we analyze the concept of social attraction, expressed as people from low and middle classes being attracted by malls that target high-income customers. We include a social attraction factor in our model and find that it is negligible in the process of choosing a mall. We observe that social mixing arises only in peripheral malls located farthest from the city center, which both low and middle class people visit. Using a co-visitation model we show that people tend to choose a restricted profile of malls according to their socio-economic status and their distance from the mall. We conclude that the potential for social mixing in malls could be capitalized by designing public policies regarding transportation and mobility.
Submitted 9 February, 2018; v1 submitted 31 January, 2018;
originally announced February 2018.
-
Estimating the outcome of spreading processes on networks with incomplete information: a mesoscale approach
Authors:
Anna Sapienza,
Alain Barrat,
Ciro Cattuto,
Laetitia Gauvin
Abstract:
Recent advances in data collection have facilitated the access to time-resolved human proximity data that can conveniently be represented as temporal networks of contacts between individuals. While this type of data is fundamental to investigate how information or diseases propagate in a population, it often suffers from incompleteness, which possibly leads to biased conclusions. A major challenge is thus to estimate the outcome of spreading processes occurring on temporal networks built from partial information. To cope with this problem, we devise an approach based on Non-negative Tensor Factorization (NTF) -- a dimensionality reduction technique from multi-linear algebra. The key idea is to learn a low-dimensional representation of the temporal network built from partial information, to adapt it to take into account temporal and structural heterogeneity properties known to be crucial for spreading processes occurring on networks, and to construct in this way a surrogate network similar to the complete original network. To test our method, we consider several human-proximity networks, on which we simulate a loss of data. Using our approach on the resulting partial networks, we build a surrogate version of the complete network for each. We then compare the outcome of a spreading process on the complete networks (non altered by a loss of data) and on the surrogate networks. We observe that the epidemic sizes obtained using the surrogate networks are in good agreement with those measured on the complete networks. Finally, we propose an extension of our framework when additional data sources are available to cope with the missing data problem.
Submitted 6 September, 2017;
originally announced September 2017.
-
Robust modeling of human contact networks across different scales and proximity-sensing techniques
Authors:
Michele Starnini,
Bruno Lepri,
Andrea Baronchelli,
Alain Barrat,
Ciro Cattuto,
Romualdo Pastor-Satorras
Abstract:
The problem of mapping human close-range proximity networks has been tackled using a variety of technical approaches. Wearable electronic devices, in particular, have proven to be especially successful in a variety of settings relevant for research in social science, complex networks and infectious diseases dynamics. Each device and technology used for proximity sensing (e.g., RFIDs, Bluetooth, low-power radio or infrared communication, etc.) comes with specific biases on the close-range relations it records. Hence it is important to assess which statistical features of the empirical proximity networks are robust across different measurement techniques, and which modeling frameworks generalize well across empirical data. Here we compare time-resolved proximity networks recorded in different experimental settings and show that some important statistical features are robust across all settings considered. The observed universality calls for a simplified modeling approach. We show that one such simple model is indeed able to reproduce the main statistical distributions characterizing the empirical temporal networks.
Submitted 20 July, 2017;
originally announced July 2017.
-
Predicting human mobility through the assimilation of social media traces into mobility models
Authors:
M. G. Beiró,
A. Panisson,
M. Tizzoni,
C. Cattuto
Abstract:
Predicting human mobility flows at different spatial scales is challenged by the heterogeneity of individual trajectories and the multi-scale nature of transportation networks. As vast amounts of digital traces of human behaviour become available, an opportunity arises to improve mobility models by integrating into them proxy data on mobility collected by a variety of digital platforms and location-aware services. Here we propose a hybrid model of human mobility that integrates a large-scale publicly available dataset from a popular photo-sharing system with the classical gravity model, under a stacked regression procedure. We validate the performance and generalizability of our approach using two ground-truth datasets on air travel and daily commuting in the United States: using two different cross-validation schemes we show that the hybrid model affords enhanced mobility prediction at both spatial scales.
Submitted 5 February, 2016; v1 submitted 18 January, 2016;
originally announced January 2016.
-
Compensating for population sampling in simulations of epidemic spread on temporal contact networks
Authors:
Mathieu Génois,
Christian L. Vestergaard,
Ciro Cattuto,
Alain Barrat
Abstract:
Data describing human interactions often suffer from incomplete sampling of the underlying population. As a consequence, the study of contagion processes using data-driven models can lead to a severe underestimation of the epidemic risk. Here we present a systematic method to alleviate this issue and obtain a better estimation of the risk in the context of epidemic models informed by high-resolution time-resolved contact data. We consider several such data sets collected in various contexts and perform controlled resampling experiments. We show how the statistical information contained in the resampled data can be used to build a series of surrogate versions of the unknown contacts. We simulate epidemic processes on the resulting reconstructed data sets and show that it is possible to obtain good estimates of the outcome of simulations performed using the complete data set. We discuss limitations and potential improvements of our method.
Submitted 18 November, 2015; v1 submitted 13 March, 2015;
originally announced March 2015.
-
Revealing latent factors of temporal networks for mesoscale intervention in epidemic spread
Authors:
Laetitia Gauvin,
André Panisson,
Alain Barrat,
Ciro Cattuto
Abstract:
The customary perspective to reason about epidemic mitigation in temporal networks hinges on the identification of nodes with specific features or network roles. The ensuing individual-based control strategies, however, are difficult to carry out in practice and ignore important correlations between topological and temporal patterns. Here we adopt a mesoscopic perspective and present a principled framework to identify collective features at multiple scales and rank their importance for epidemic spread. We use tensor decomposition techniques to build an additive representation of a temporal network in terms of mesostructures, such as cohesive clusters and temporally-localized mixing patterns. This representation allows us to determine the impact of individual mesostructures on epidemic spread and to assess the effect of targeted interventions that remove chosen structures. We illustrate this approach using high-resolution social network data on face-to-face interactions in a school and show that our method affords the design of effective mesoscale interventions.
Submitted 12 January, 2015;
originally announced January 2015.
-
Mitigation of infectious disease at school: targeted class closure vs school closure
Authors:
Valerio Gemmetto,
Alain Barrat,
Ciro Cattuto
Abstract:
School environments are thought to play an important role in the community spread of airborne infections (e.g., influenza) because of the high mixing rates of school children. The closure of schools has therefore been proposed as an efficient mitigation strategy, albeit with high social and economic costs: alternative, less disruptive interventions are highly desirable. The recent availability of high-resolution contact networks in school environments provides an opportunity to design micro-interventions and compare the outcomes of alternative mitigation measures. We consider mitigation measures that involve the targeted closure of school classes or grades based on readily available information such as the number of symptomatic infectious children in a class. We focus on the case of a primary school for which we have high-resolution data on the close-range interactions of children and teachers. We simulate the spread of an influenza-like illness in this population by using an SEIR model with asymptomatics and compare the outcomes of different mitigation strategies. We find that targeted class closure affords strong mitigation effects: closing a class for a fixed period of time (equal to the sum of the average infectious and latent durations) whenever two infectious individuals are detected in that class decreases the attack rate by almost 70% and strongly decreases the probability of a severe outbreak. The closure of all classes of the same grade mitigates the spread almost as much as closing the whole school. Targeted class closure strategies based on readily available information on symptomatic subjects and on limited information on mixing patterns, such as the grade structure of the school, can be almost as effective as whole-school closure, at a much lower cost. This may inform public health policies for the management and mitigation of influenza-like outbreaks in the community.
Submitted 29 August, 2014;
originally announced August 2014.
-
Mining Concurrent Topical Activity in Microblog Streams
Authors:
A. Panisson,
L. Gauvin,
M. Quaggiotto,
C. Cattuto
Abstract:
Streams of user-generated content in social media exhibit patterns of collective attention across diverse topics, with temporal structures determined both by exogenous factors and endogenous factors. Teasing apart different topics and resolving their individual, concurrent, activity timelines is a key challenge in extracting knowledge from microblog streams. Facing this challenge requires the use of methods that expose latent signals by using term correlations across posts and over time. Here we focus on content posted to Twitter during the London 2012 Olympics, for which a detailed schedule of events is independently available and can be used for reference. We mine the temporal structure of topical activity by using two methods based on non-negative matrix factorization. We show that for events in the Olympics schedule that can be semantically matched to Twitter topics, the extracted Twitter activity timeline closely matches the known timeline from the schedule. Our results show that, given appropriate techniques to detect latent signals, Twitter can be used as a social sensor to extract topical-temporal information on real-world events at high temporal resolution.
Submitted 6 March, 2014;
originally announced March 2014.
-
Estimating Potential Infection Transmission Routes in Hospital Wards Using Wearable Proximity Sensors
Authors:
Philippe Vanhems,
Alain Barrat,
Ciro Cattuto,
Jean-François Pinton,
Nagham Khanafer,
Corinne Régis,
Byeul-a Kim,
Brigitte Comte,
Nicolas Voirin
Abstract:
Contacts between patients, patients and health care workers (HCWs) and among HCWs represent one of the important routes of transmission of hospital-acquired infections (HAIs). A detailed description and quantification of contacts in hospitals provides key information for HAI epidemiology and for the design and validation of control measures. We used wearable sensors to detect close-range interactions ("contacts") between individuals in the geriatric unit of a university hospital. Contact events were measured with a spatial resolution of about 1.5 meters and a temporal resolution of 20 seconds. The study included 46 HCWs and 29 patients and lasted for 4 days and 4 nights. 14037 contacts were recorded. The number and duration of contacts varied between mornings, afternoons and nights, and contact matrices describing the mixing patterns between HCWs and patients were built for each time period. Contact patterns were qualitatively similar from one day to the next. 38% of the contacts occurred between pairs of HCWs and 6 HCWs accounted for 42% of all the contacts including at least one patient, suggesting a population of individuals who could potentially act as super-spreaders. Wearable sensors represent a novel tool for the measurement of contact patterns in hospitals. The collected data provides information on important aspects that impact the spreading patterns of infectious diseases, such as the strong heterogeneity of contact numbers and durations across individuals, the variability in the number of contacts during a day, and the fraction of repeated contacts across days. This variability is associated with a marked statistical stability of contact and mixing patterns across days. Our results highlight the need for such measurement efforts in order to correctly inform mathematical models of HAIs and use them to inform the design and evaluation of prevention strategies.
Submitted 14 September, 2013;
originally announced September 2013.
-
Detecting the community structure and activity patterns of temporal networks: a non-negative tensor factorization approach
Authors:
Laetitia Gauvin,
André Panisson,
Ciro Cattuto
Abstract:
The increasing availability of temporal network data is calling for more research on extracting and characterizing mesoscopic structures in temporal networks and on relating such structure to specific functions or properties of the system. An outstanding challenge is the extension of the results achieved for static networks to time-varying networks, where the topological structure of the system and the temporal activity patterns of its components are intertwined. Here we investigate the use of a latent factor decomposition technique, non-negative tensor factorization, to extract the community-activity structure of temporal networks. The method is intrinsically temporal and allows us to simultaneously identify communities and track their activity over time. We represent the time-varying adjacency matrix of a temporal network as a three-way tensor and approximate this tensor as a sum of terms that can be interpreted as communities of nodes with an associated activity time series. We summarize known computational techniques for tensor decomposition and discuss some quality metrics that can be used to tune the complexity of the factorized representation. We subsequently apply tensor factorization to a temporal network for which a ground truth is available for both the community structure and the temporal activity patterns. The data we use describe the social interactions of students in a school, the associations between students and school classes, and the spatio-temporal trajectories of students over time. We show that non-negative tensor factorization is capable of recovering the class structure with high accuracy. In particular, the extracted tensor components can be validated either as known school classes, or in terms of correlated activity patterns, i.e., of spatial and temporal coincidences that are determined by the known school activity schedule.
Submitted 5 November, 2013; v1 submitted 3 August, 2013;
originally announced August 2013.
-
Gender homophily from spatial behavior in a primary school: a sociometric study
Authors:
J. Stehlé,
F. Charbonnier,
T. Picard,
C. Cattuto,
A. Barrat
Abstract:
We investigate gender homophily in the spatial proximity of children (6 to 12 years old) in a French primary school, using time-resolved data on face-to-face proximity recorded by means of wearable sensors. For strong ties, i.e., for pairs of children who interact more than a defined threshold, we find statistical evidence of gender preference that increases with grade. For weak ties, conversely, gender homophily is negatively correlated with grade for girls, and positively correlated with grade for boys. This different evolution with grade of weak and strong ties exposes a contrasted picture of gender homophily.
Submitted 26 June, 2013; v1 submitted 25 June, 2013;
originally announced June 2013.
-
Activity clocks: spreading dynamics on temporal networks of human contact
Authors:
Laetitia Gauvin,
André Panisson,
Ciro Cattuto,
Alain Barrat
Abstract:
Dynamical processes on time-varying complex networks are key to understanding and modeling a broad variety of processes in socio-technical systems. Here we focus on empirical temporal networks of human proximity and we aim at understanding the factors that, in simulation, shape the arrival time distribution of simple spreading processes. Abandoning the notion of wall-clock time in favour of node-specific clocks based on activity exposes robust statistical patterns in the arrival times across different social contexts. Using randomization strategies and generative models constrained by data, we show that these patterns can be understood in terms of heterogeneous inter-event time distributions coupled with heterogeneous numbers of events per edge. We also show, both empirically and by using a synthetic dataset, that significant deviations from the above behavior can be caused by the presence of edge classes with strong activity correlations.
Submitted 31 October, 2013; v1 submitted 19 June, 2013;
originally announced June 2013.
-
Temporal networks of face-to-face human interactions
Authors:
Alain Barrat,
Ciro Cattuto
Abstract:
The ever-increasing adoption of mobile technologies and ubiquitous services makes it possible to sense human behavior at unprecedented levels of detail and scale. Wearable sensors are opening up a new window on human mobility and proximity at the finest resolution of face-to-face proximity. As a consequence, empirical data describing social and behavioral networks are acquiring a longitudinal dimension that brings forth new challenges for analysis and modeling. Here we review recent work on the representation and analysis of temporal networks of face-to-face human proximity, based on large-scale datasets collected in the context of the SocioPatterns collaboration. We show that the raw behavioral data can be studied at various levels of coarse-graining, which turn out to be complementary to one another, with each level exposing different features of the underlying system. We briefly review a generative model of temporal contact networks that reproduces some statistical observables. Then, we shift our focus from surface statistical features to dynamical processes on empirical temporal networks. We discuss how simple dynamical processes can be used as probes to expose important features of the interaction patterns, such as burstiness and causal constraints. We show that simulating dynamical processes on empirical temporal networks can unveil differences between datasets that would otherwise look statistically similar. Moreover, we argue that, due to the temporal heterogeneity of human dynamics, in order to investigate the temporal properties of spreading processes it may be necessary to abandon the notion of wall-clock time in favour of an intrinsic notion of time for each individual node, defined in terms of its activity level. We conclude by highlighting several open research questions raised by the nature of the data at hand.
Submitted 15 May, 2013;
originally announced May 2013.
-
Immunization strategies for epidemic processes in time-varying contact networks
Authors:
Michele Starnini,
Anna Machens,
Ciro Cattuto,
Alain Barrat,
Romualdo Pastor Satorras
Abstract:
Spreading processes represent a very efficient tool to investigate the structural properties of networks and the relative importance of their constituents, and have been widely used to this aim in static networks. Here we consider simple disease spreading processes on empirical time-varying networks of contacts between individuals, and compare the effect of several immunization strategies on these processes. An immunization strategy is defined as the choice of a set of nodes (individuals) who can neither catch nor transmit the disease. This choice is performed according to a certain ranking of the nodes of the contact network. We consider various ranking strategies, focusing in particular on the role of the training window during which the nodes' properties are measured in the time-varying network: longer training windows correspond to a larger amount of information collected and could be expected to result in better performance of the immunization strategies. We find instead an unexpected saturation in the efficiency of strategies based on nodes' characteristics when the length of the training window is increased, showing that a limited amount of information on the contact patterns is sufficient to design efficient immunization strategies. This finding is balanced by the large variations of the contact patterns, which strongly alter the importance of nodes from one period to the next and therefore significantly limit the efficiency of any strategy based on an importance ranking of nodes. We also observe that strategies that include an element of randomness and are based on temporally local information do not perform as well, but their efficiency is largely independent of the amount of information available.
Submitted 10 May, 2013;
originally announced May 2013.
-
An infectious disease model on empirical networks of human contact: bridging the gap between dynamic network data and contact matrices
Authors:
Anna Machens,
Francesco Gesualdo,
Caterina Rizzo,
Alberto E Tozzi,
Alain Barrat,
Ciro Cattuto
Abstract:
The integration of empirical data in computational frameworks to model the spread of infectious diseases poses challenges that are becoming pressing with the increasing availability of high-resolution information on human mobility and contacts. This deluge of data has the potential to revolutionize the computational efforts aimed at simulating scenarios and designing containment strategies. However, the integration of detailed data sources yields models that are less transparent and general. Hence, given a specific disease model, it is crucial to assess which representations of the raw data strike the best balance between simplicity and detail. We consider high-resolution data on the face-to-face interactions of individuals in a hospital ward, obtained by using wearable proximity sensors. We simulate the spread of a disease in this community by using an SEIR model on top of different mathematical representations of the contact patterns. We show that a contact matrix that only contains average contact durations fails to reproduce the size of the epidemic obtained with the high-resolution contact data and also to identify the most at-risk classes. We introduce a contact matrix of probability distributions that takes into account the heterogeneity of contact durations between (and within) classes of individuals, and we show that this representation yields a good approximation of the epidemic spreading properties obtained by using the high-resolution data. Our results mark a step towards the definition of synopses of high-resolution dynamic contact networks, providing a compact representation of contact patterns that can correctly inform computational models designed to discover risk groups and evaluate containment policies. We show that this novel kind of representation can preserve in simulation quantitative features of the epidemics that are crucial for their study and management.
Submitted 23 April, 2013;
originally announced April 2013.
-
Dynamical Classes of Collective Attention in Twitter
Authors:
Janette Lehmann,
Bruno Gonçalves,
José J. Ramasco,
Ciro Cattuto
Abstract:
Micro-blogging systems such as Twitter expose digital traces of social discourse with an unprecedented degree of resolution of individual behaviors. They offer an opportunity to investigate how a large-scale social system responds to exogenous or endogenous stimuli, and to disentangle the temporal, spatial and topical aspects of users' activity. Here we focus on spikes of collective attention in Twitter, and specifically on peaks in the popularity of hashtags. Users employ hashtags as a form of social annotation, to define a shared context for a specific event, topic, or meme. We analyze a large-scale record of Twitter activity and find that the evolution of hashtag popularity over time defines discrete classes of hashtags. We link these dynamical classes to the events the hashtags represent and use text mining techniques to provide a semantic characterization of the hashtag classes. Moreover, we track the propagation of hashtags in the Twitter social network and find that epidemic spreading plays a minor role in hashtag popularity, which is mostly driven by exogenous factors.
Submitted 1 March, 2012; v1 submitted 8 November, 2011;
originally announced November 2011.
-
High-resolution measurements of face-to-face contact patterns in a primary school
Authors:
J. Stehlé,
N. Voirin,
A. Barrat,
C. Cattuto,
L. Isella,
J. -F. Pinton,
M. Quaggiotto,
W. Van den Broeck,
C. Régis,
B. Lina,
P. Vanhems
Abstract:
Little quantitative information is available on the mixing patterns of children in school environments. Describing and understanding contacts between children at school would help quantify the transmission opportunities of respiratory infections and identify situations within schools where the risk of transmission is higher. We report on measurements carried out in a French school (children aged 6 to 12 years), where we collected data on the time-resolved face-to-face proximity of children and teachers using a proximity-sensing infrastructure based on radio frequency identification devices.
Data on face-to-face interactions were collected on October 1st and 2nd, 2009. We recorded 77,602 contact events between 242 individuals. Each child has on average 323 contacts per day with 47 other children, leading to an average daily interaction time of 176 minutes. Most contacts are brief, but long contacts are also observed. Contacts occur mostly within each class, and each child spends on average three times more time in contact with classmates than with children of other classes. We describe the temporal evolution of the contact network and the trajectories followed by the children in the school, which constrain the contact patterns. We determine an exposure matrix aimed at informing mathematical models. This matrix exhibits a class and age structure which is very different from the homogeneous mixing hypothesis.
The observed properties of the contact patterns between school children are relevant for modeling the propagation of diseases and for evaluating control measures. We discuss public health implications related to the management of schools in case of epidemics and pandemics. Our results can help define a prioritization of control measures based on preventive measures, case isolation, classes and school closures, that could reduce the disruption to education during epidemics.
Submitted 5 September, 2011;
originally announced September 2011.
-
Simulation of an SEIR infectious disease model on the dynamic contact network of conference attendees
Authors:
Juliette Stehlé,
Nicolas Voirin,
Alain Barrat,
Ciro Cattuto,
Vittoria Colizza,
Lorenzo Isella,
Corinne Régis,
Jean-François Pinton,
Nagham Khanafer,
Wouter Van den Broeck,
Philippe Vanhems
Abstract:
The spread of infectious diseases crucially depends on the pattern of contacts among individuals. Knowledge of these patterns is thus essential to inform models and computational efforts. Few empirical studies are however available that provide estimates of the number and duration of contacts among social groups. Moreover, their space and time resolution are limited, so that data is not explicit at the person-to-person level, and the dynamical aspect of the contacts is disregarded. Here, we want to assess the role of data-driven dynamic contact patterns among individuals, and in particular of their temporal aspects, in shaping the spread of a simulated epidemic in the population.
We consider high resolution data of face-to-face interactions between the attendees of a conference, obtained from the deployment of an infrastructure based on Radio Frequency Identification (RFID) devices that assess mutual face-to-face proximity. The spread of epidemics along these interactions is simulated through an SEIR model, using both the dynamical network of contacts defined by the collected data, and two aggregated versions of such network, in order to assess the role of the data temporal aspects.
We show that, on the timescales considered, an aggregated network taking into account the daily duration of contacts is a good approximation to the full resolution network, whereas a homogeneous representation which retains only the topology of the contact network fails to reproduce the size of the epidemic.
These results have important implications in understanding the level of detail needed to correctly inform computational models for the study and management of real epidemics.
Submitted 24 August, 2011;
originally announced August 2011.
-
On the Dynamics of Human Proximity for Data Diffusion in Ad-Hoc Networks
Authors:
André Panisson,
Alain Barrat,
Ciro Cattuto,
Wouter Van den Broeck,
Giancarlo Ruffo,
Rossano Schifanella
Abstract:
We report on a data-driven investigation aimed at understanding the dynamics of message spreading in a real-world dynamical network of human proximity. We use data collected by means of a proximity-sensing network of wearable sensors that we deployed at three different social gatherings, simultaneously involving several hundred individuals. We simulate a message spreading process over the recorded proximity network, focusing on both the topological and the temporal properties. We show that by using an appropriate technique to deal with the temporal heterogeneity of proximity events, a universal statistical pattern emerges for the delivery times of messages, robust across all the data sets. Our results are useful to set constraints for generic processes of data dissemination, as well as to validate established models of human mobility and proximity that are frequently used to simulate realistic behaviors.
Submitted 29 June, 2011;
originally announced June 2011.
-
Dynamics of person-to-person interactions from distributed RFID sensor networks
Authors:
Ciro Cattuto,
Wouter Van den Broeck,
Alain Barrat,
Vittoria Colizza,
Jean-François Pinton,
Alessandro Vespignani
Abstract:
Digital networks, mobile devices, and the possibility of mining the ever-increasing amount of digital traces that we leave behind in our daily activities are changing the way we can approach the study of human and social interactions. Large-scale datasets, however, are mostly available for collective and statistical behaviors, at coarse granularities, while high-resolution data on person-to-person interactions are generally limited to relatively small groups of individuals. Here we present a scalable experimental framework for gathering real-time data resolving face-to-face social interactions with tunable spatial and temporal granularities. We use active Radio Frequency Identification (RFID) devices that assess mutual proximity in a distributed fashion by exchanging low-power radio packets. We analyze the dynamics of person-to-person interaction networks obtained in three high-resolution experiments carried out at different orders of magnitude in community size. The data sets exhibit common statistical properties and lack of a characteristic time scale from 20 seconds to several hours. The association between the number of connections and their duration shows an interesting super-linear behavior, which indicates the possibility of defining super-connectors both in the number and intensity of connections. Taking advantage of scalability and resolution, this experimental framework allows the monitoring of social interactions, uncovering similarities in the way individuals interact in different contexts, and identifying patterns of super-connector behavior in the community. These results could impact our understanding of all phenomena driven by face-to-face interactions, such as the spreading of transmissible infectious diseases and information.
Submitted 21 July, 2010;
originally announced July 2010.
-
Link creation and profile alignment in the aNobii social network
Authors:
Luca Maria Aiello,
Alain Barrat,
Ciro Cattuto,
Giancarlo Ruffo,
Rossano Schifanella
Abstract:
The present work investigates the structural and dynamical properties of aNobii (http://www.anobii.com/), a social bookmarking system designed for readers and book lovers. Users of aNobii provide information about their library, reading interests and geographical location, and they can establish typed social links to other users. Here, we perform an in-depth analysis of the system's social network and its interplay with users' profiles. We describe the relation of geographic and interest-based factors to social linking. Furthermore, we perform a longitudinal analysis to investigate the interplay of profile similarity and link creation in the social network, with a focus on triangle closure. We report a reciprocal causal connection: profile similarity of users drives the subsequent closure in the social network and, reciprocally, closure in the social network induces subsequent profile alignment. Access to the dynamics of the social network also allows us to measure quantitative indicators of preferential linking.
Submitted 25 June, 2010;
originally announced June 2010.
-
What's in a crowd? Analysis of face-to-face behavioral networks
Authors:
Lorenzo Isella,
Juliette Stehlé,
Alain Barrat,
Ciro Cattuto,
Jean-François Pinton,
Wouter Van den Broeck
Abstract:
The availability of new data sources on human mobility is opening new avenues for investigating the interplay of social networks, human mobility and dynamical processes such as epidemic spreading. Here we analyze data on the time-resolved face-to-face proximity of individuals in large-scale real-world scenarios. We compare two settings with very different properties, a scientific conference and a long-running museum exhibition. We track the behavioral networks of face-to-face proximity, and characterize them from both a static and a dynamic point of view, exposing important differences as well as striking similarities. We use our data to investigate the dynamics of a susceptible-infected model for epidemic spreading that unfolds on the dynamical networks of human proximity. The spreading patterns are markedly different for the conference and the museum case, and they are strongly impacted by the causal structure of the network data. A deeper study of the spreading paths shows that the mere knowledge of static aggregated networks would lead to erroneous conclusions about the transmission paths on the dynamical networks.
Submitted 5 January, 2011; v1 submitted 7 June, 2010;
originally announced June 2010.
-
Folks in Folksonomies: Social Link Prediction from Shared Metadata
Authors:
Rossano Schifanella,
Alain Barrat,
Ciro Cattuto,
Benjamin Markines,
Filippo Menczer
Abstract:
Web 2.0 applications have attracted a considerable amount of attention because their open-ended nature allows users to create light-weight semantic scaffolding to organize and share content. To date, the interplay of the social and semantic components of social media has been only partially explored. Here we focus on Flickr and Last.fm, two social media systems in which we can relate the tagging activity of the users with an explicit representation of their social network. We show that a substantial level of local lexical and topical alignment is observable among users who lie close to each other in the social network. We introduce a null model that preserves user activity while removing local correlations, allowing us to disentangle the actual local alignment between users from statistical effects due to the assortative mixing of user activity and centrality in the social network. This analysis suggests that users with similar topical interests are more likely to be friends, and therefore semantic similarity measures among users based solely on their annotation metadata should be predictive of social links. We test this hypothesis on the Last.fm data set, confirming that the social network constructed from semantic similarity captures actual friendship more accurately than Last.fm's suggestions based on listening patterns.
Submitted 11 March, 2010;
originally announced March 2010.
-
Collective dynamics of social annotation
Authors:
Ciro Cattuto,
Alain Barrat,
Andrea Baldassarri,
G. Schehr,
Vittorio Loreto
Abstract:
The enormous increase of popularity and use of the WWW has led in recent years to important changes in the ways people communicate. An interesting example of this fact is provided by the now very popular social annotation systems, through which users annotate resources (such as web pages or digital photographs) with text keywords dubbed tags. Understanding the rich emerging structures resulting from the uncoordinated actions of users calls for an interdisciplinary effort. In particular, concepts borrowed from statistical physics, such as random walks, and the complex networks framework, can effectively contribute to the mathematical modeling of social annotation systems. Here we show that the process of social annotation can be seen as a collective but uncoordinated exploration of an underlying semantic space, pictured as a graph, through a series of random walks. This modeling framework reproduces several aspects, so far unexplained, of social annotation, among which the peculiar growth of the size of the vocabulary used by the community and its complex network structure that represents an externalization of semantic structures grounded in cognition and typically hard to access.
Submitted 30 April, 2009; v1 submitted 17 February, 2009;
originally announced February 2009.
-
High resolution dynamical mapping of social interactions with active RFID
Authors:
Alain Barrat,
Ciro Cattuto,
Vittoria Colizza,
Jean-Francois Pinton,
Wouter Van den Broeck,
Alessandro Vespignani
Abstract:
In this paper we present an experimental framework to gather data on face-to-face social interactions between individuals, with a high spatial and temporal resolution. We use active Radio Frequency Identification (RFID) devices that assess contacts with one another by exchanging low-power radio packets. When individuals wear the beacons as a badge, a persistent radio contact between the RFID devices can be used as a proxy for a social interaction between individuals. We present the results of a pilot study recently performed during a conference, and a subsequent preliminary data analysis, that provides an assessment of our method and highlights its versatility and applicability in many areas concerned with human dynamics.
Submitted 25 November, 2008; v1 submitted 25 November, 2008;
originally announced November 2008.
-
Vocabulary growth in collaborative tagging systems
Authors:
Ciro Cattuto,
Andrea Baldassarri,
Vito D. P. Servedio,
Vittorio Loreto
Abstract:
We analyze a large-scale snapshot of del.icio.us and investigate how the number of different tags in the system grows as a function of a suitably defined notion of time. We study the temporal evolution of the global vocabulary size, i.e. the number of distinct tags in the entire system, as well as the evolution of local vocabularies, that is the growth of the number of distinct tags used in the context of a given resource or user. In both cases, we find power-law behaviors with exponents smaller than one. Surprisingly, the observed growth behaviors are remarkably regular throughout the entire history of the system and across very different resources being bookmarked. Similar sub-linear laws of growth have been observed in written text, and this qualitative universality calls for an explanation and points in the direction of non-trivial cognitive processes in the complex interaction patterns characterizing collaborative tagging.
Submitted 25 April, 2007;
originally announced April 2007.
-
A Yule-Simon process with memory
Authors:
C. Cattuto,
V. Loreto,
V. D. P. Servedio
Abstract:
The Yule-Simon model has been used as a tool to describe the growth of diverse systems, acquiring a paradigmatic character in many fields of research. Here we study a modified Yule-Simon model that takes into account the full history of the system by means of a hyperbolic memory kernel. We show how the memory kernel changes the properties of preferential attachment and provide an approximate analytical solution for the frequency distribution density as well as for the frequency-rank distribution.
Submitted 30 August, 2006;
originally announced August 2006.
-
Collaborative Tagging and Semiotic Dynamics
Authors:
Ciro Cattuto,
Vittorio Loreto,
Luciano Pietronero
Abstract:
Collaborative tagging has been quickly gaining ground because of its ability to recruit the activity of web users into effectively organizing and sharing vast amounts of information. Here we collect data from a popular system and investigate the statistical properties of tag co-occurrence. We introduce a stochastic model of user behavior embodying two main aspects of collaborative tagging: (i) a frequency-bias mechanism related to the idea that users are exposed to each other's tagging activity; (ii) a notion of memory - or aging of resources - in the form of a heavy-tailed access to the past state of the system. Remarkably, our simple modeling is able to account quantitatively for the observed experimental features, with a surprisingly high accuracy. This points in the direction of a universal behavior of users, who - despite the complexity of their own cognitive processes and the uncoordinated and selfish nature of their tagging activity - appear to follow simple activity patterns.
Submitted 4 May, 2006;
originally announced May 2006.