† These authors contributed equally.

Higher-order modeling of face-to-face interactions

Luca Gallo Department of Network and Data Science, Central European University, 1100 Vienna, Austria ANETI Lab, Corvinus Institute for Advanced Studies (CIAS), Corvinus University, 1093 Budapest, Hungary Chiara Zappalà Center for Collective Learning, Corvinus Institute for Advanced Studies (CIAS), Corvinus University, 1093 Budapest, Hungary Fariba Karimi Technical University of Graz, 8010 Graz, Austria Complexity Science Hub Vienna, A-1080 Vienna, Austria Federico Battiston [email protected]; [email protected] Department of Network and Data Science, Central European University, 1100 Vienna, Austria

(June 7, 2024)

Abstract

The most fundamental social interactions among humans occur face to face. Their features have been extensively studied in recent years, owing to the availability of high-resolution data on individuals’ proximity. Mathematical models based on mobile agents have been crucial to understand the spatio-temporal organization of face-to-face interactions. However, these models focus on dyadic relationships only, failing to characterize interactions in larger groups of individuals. Here, we propose a model in which agents interact with each other by forming groups of different sizes. Each group has a degree of social attractiveness, based on which neighboring agents decide whether to join. Our framework reproduces different properties of groups in face-to-face interactions, including their distribution, the correlation in their number, and their persistence in time, which cannot be replicated by dyadic models. Furthermore, it captures homophilic patterns at the level of higher-order interactions, going beyond standard pairwise approaches. Our work sheds light on the higher-order mechanisms at the heart of human face-to-face interactions, paving the way for further investigation of how group dynamics at a microscopic scale affects social phenomena at a macroscopic scale.

Face-to-face contacts lie at the core of an individual’s social world [1]. A street encounter with a stranger, discussing with colleagues over coffee, having dinner with family, or hanging out with a group of friends: all of these represent the most fundamental form of social interaction. The recent availability of high-resolution data on individuals’ proximity has allowed researchers to uncover how those seemingly random interactions display coherent spatio-temporal characteristics. Specifically, across several social contexts face-to-face interactions show universal features, such as the absence of a characteristic scale for the contact duration, the switching between low activity periods and high activity bursts, and a great heterogeneity in interaction behaviors among individuals [2, 3, 4, 5, 6, 7, 8]. The ubiquity of these features has thus posed the crucial challenge of explaining what mechanisms underlie their emergence.

Modeling frameworks based on mobile agents proved to be a valuable tool to understand the organization of human face-to-face interactions. In this scenario, agents move erratically in a spatial environment and interactions occur every time they get close together [9]. In addition, contacts between agents can be modulated by more complex mechanisms, including their attractiveness [10, 11], their activeness and reachability [12], their pairwise similarity [13, 14], or belonging to the same social group [15]. This class of models gave useful insights to understand the bursty [10] and small-world behavior of face-to-face interactions [16], as well as many social phenomena emerging from them, like disease spreading [17, 18], spatial segregation and echo chamber formation [13], and structural inequalities [15].

These models, however, are limited as they describe face-to-face interactions only in terms of dyadic relationships between agents. In fact, they adopt a temporal network representation [19], focusing either on the dynamics of dyadic contacts, i.e., links, [10], or the mesoscopic level of social gatherings, i.e., connected components [9, 14]. However, humans do not only interact in pairs but regularly engage in groups involving more than two individuals at the same time [20]. While a few recent works have investigated the higher-order nature [21] of face-to-face interactions [22, 23, 24], current models based on mobile agents either overlook or fail to capture [11] the spatio-temporal features and the dynamical evolution of groups.

Here, we bridge this gap by introducing a model in which mobile agents interact with each other by forming groups of different sizes. Each group is characterized by an intrinsic degree of social appeal that we call “group attractiveness”. Agents passing in the vicinity of a group choose whether to join it based on its attractiveness, while group members decide whether to stay or walk away. We show how the Group Attractiveness Model (GAM) can reproduce different properties of groups in face-to-face interactions, including their statistics, the correlation in their number, and their temporal duration. Furthermore, differently from low-order approaches, we demonstrate the potential of our model to correctly capture higher-order homophilic patterns not only at the level of pairwise contacts, but also at that of group interactions. Given its predictive power, the Group Attractiveness Model can foster the study of human face-to-face interactions, paving the way for further investigation of how group dynamics at a microscopic scale affects social phenomena at a macroscopic scale.

The Group Attractiveness Model

In the Group Attractiveness Model (Fig. 1), $N$ agents are placed in a square environment of size $L\times L$ with periodic boundary conditions. Each agent $i$ has an attractiveness $a_{i}$, which represents how appealing the agent is to the others and is sampled from a uniform distribution in the interval $[0,1]$. Agents can be isolated or part of a group. An agent can decide to interact with the groups surrounding it or to walk away in a random direction. The attractiveness of a group derives from that of its members. Formally, for each group of agents $g$, we define its attractiveness as

	$\displaystyle a_{g}=\prod\limits_{j\in g}a_{j},$		(1)

where index $j$ runs over all agents forming group $g$. Note that an isolated agent constitutes a group of one member (group $g_{3}$ in Fig. 1). We remark that, since every $a_{j}$ lies in $[0,1]$, the attractiveness of a group cannot exceed the individual attractiveness of any of its members, i.e., $a_{g}\leq a_{j},\,\forall j\in g$. Consequently, the larger the group, the less attractive it will be on average. This modeling choice is motivated by previous results on human face-to-face interactions, which highlight how large groups are more unstable than small ones [25, 23], due to the higher propensity of individuals to leave them [24], a phenomenon known as schisming [26]. At time step $t$, each agent $i$ considers the groups located within a distance $d$ from it and interacts with them all with probability

	$\displaystyle p_{i}(t)=\frac{1}{|\mathcal{N}(i)|}\sum\limits_{g\in\mathcal{N}(i)}a_{g},$		(2)

where $\mathcal{N}(i)$ indicates the set of groups in the vicinity of $i$. Therefore, when an agent chooses to interact with a group of size $s$, a group of size $s+1$ is formed at time $t+1$. Groups in our model thus change gradually, with the addition of one member at a time, a mechanism supported by evidence from real-world human face-to-face interactions [20, 23, 24]. If $i$ is already part of a group, we consider the group formed by the other members to be part of $\mathcal{N}(i)$. Therefore, when $i$ decides to interact, it rejoins the group it was part of, i.e., the group persists in time. Also, when a group lies only partially within the scope of an agent, the latter interacts only with those members that are at a distance smaller than $d$ (e.g., only one member of group $g_{2}$ in Fig. 1 is close to agent $i$, so a group of size 2 is formed between them). When an agent does not interact with its neighboring groups, it makes a step of length $v$ along a direction given by a randomly chosen angle $\xi\in[0,2\pi]$, leaving all the groups it was part of in the previous time step. Hence, a group persists in time only when all its members decide to interact with the others [25]. Finally, since empirical observations show that individuals do not engage in face-to-face interactions constantly [10], we assume that agents can be active or inactive. While an active agent can walk or form groups with other agents, an inactive agent neither moves nor interacts with the others. Inactive agents become active with probability $r_{i}$, which we sample from a uniform distribution in $[0,1]$, while active agents that are isolated become inactive with probability $1-r_{i}$.
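The rules above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout (a dictionary `a` of individual attractiveness values and groups stored as tuples of agent labels), and the choice of returning probability zero when no group is nearby are assumptions made only for illustration.

```python
import math
import random

def group_attractiveness(members, a):
    """Eq. 1: product of the attractiveness values of the group members."""
    return math.prod(a[j] for j in members)

def interaction_probability(neighbor_groups, a):
    """Eq. 2: mean attractiveness of the groups in the vicinity of an agent."""
    if not neighbor_groups:
        return 0.0  # assumption of this sketch: no neighbors, no interaction
    total = sum(group_attractiveness(g, a) for g in neighbor_groups)
    return total / len(neighbor_groups)

def random_step(x, y, v, L, rng=random):
    """Step of length v along a random angle, wrapped by periodic boundaries."""
    xi = rng.uniform(0.0, 2.0 * math.pi)
    return (x + v * math.cos(xi)) % L, (y + v * math.sin(xi)) % L

def update_activity(active, isolated, r, rng=random):
    """Return the new activity state of an agent with activation probability r."""
    if not active:
        return rng.random() < r         # inactive -> active with probability r
    if isolated:
        return rng.random() >= 1.0 - r  # isolated -> inactive with probability 1 - r
    return True                         # agents in a group stay active

# Toy example: an isolated agent (label 0) and a pair (labels 1 and 2) nearby.
a = {0: 0.9, 1: 0.5, 2: 0.4}
groups_near_agent = [(0,), (1, 2)]
p = interaction_probability(groups_near_agent, a)
print(p)  # mean of 0.9 and 0.5 * 0.4
```

In this toy case the pair contributes $0.5\times 0.4=0.2$, less than either of its members, which shows how the product in Eq. 1 penalizes larger groups.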

Figure 1: Schematic illustration of the Group Attractiveness Model. At each time step $t$, each active agent $i$ (blue) considers the groups lying within a radius $d$ from it, and interacts with all of them with a probability $p_{i}(t)$ that depends on the mean attractiveness of the neighboring groups. The agent interacts only with the group members within its scope and ignores inactive nodes (gray). With the complementary probability $1-p_{i}(t)$, the agent moves away in a random direction, with a step of length $v$.

Higher-order statistics of human face-to-face interactions

To test the capability of the Group Attractiveness Model to reproduce higher-order patterns of human face-to-face interactions, we analyze six high-resolution datasets from the SocioPatterns collaboration [2]. These describe the dynamics of contacts between individuals in different social contexts, specifically a primary school (“PS”) [3], a high school (“HS11”, “HS12”, and “HS13”) [6, 8], and two scientific conferences (“C16” and “C17”) [27]. As all datasets store interactions as dyadic contacts, we first reconstruct the group interactions among individuals leveraging the fine-grained temporal information of the data. Specifically, if at a time $t$ we find all possible dyads among $s$ individuals, we assume that they are interacting together in a single group of size $s$ (see Methods for details). The statistics of distinct groups of different sizes for the six datasets are reported in Fig. 2 as black circles. In general, smaller groups are more abundant than larger ones, with the exception of the Conference 2016 dataset, where groups of size three are more numerous than groups of size two (see Table S1 in Supplementary Information).

We now want to verify whether our model is able to reproduce the distribution of groups in those social systems. We initialize the model simulation by randomly placing each agent in the environment and setting agents active with probability $1/2$. We fix $v=d=1$ and set the number of agents $N$ to the number of individuals forming the largest connected component in the hypergraph of contacts (see Supplementary Information for more details). The simulation stops once the number of groups of size two generated reaches the empirical value. The number of groups of size three or more strongly depends on the agent density. For instance, if the size $L$ of the environment is very large, then agents rarely come into contact with each other, making the formation of large groups quite unlikely. On the other hand, when agents are close to each other, i.e., when the density is high, it is likely that agents form groups of various sizes. Hence, we fit the value of $L$ that best reproduces the group statistics in the dataset (see Supplementary Information for the best-fitting values of $L$ in each system).

As a comparison, we consider the (individual) Attractiveness Model (AM) proposed in [10]. Since this model accounts for groups of two agents only, we extract groups of larger size following the same procedure adopted for empirical data. We then fit the environment size $L$ in the same way as for the GAM. For both models, we run 100 simulations and consider the average number of groups of different sizes, as well as the standard deviation as an estimator of the model variability.

Fig. 2 shows the group statistics of the six datasets (black circles), as well as the average number of groups predicted by the GAM (blue squares) and the AM (red diamonds). In general, the Group Attractiveness Model is able to reproduce the distributions of groups of different sizes, while the individual Attractiveness Model significantly overestimates larger groups. In those cases where the GAM is not able to capture the exact group statistics (e.g., “PS” dataset), we still observe a better performance compared to the AM. This result highlights the need to consider group attractiveness to properly model non-dyadic face-to-face interactions. Indeed, since groups are generally less attractive than single individuals, large groups are substantially less frequent than small ones, a feature of the GAM that matches the patterns in real-world face-to-face interactions.

Next, we aim to understand whether individuals participating in groups of a given size also take part in groups of a different size. The presence of those correlations, and in particular the empirical tendency of face-to-face group interactions to be nested (i.e., individuals interacting in a group at a given time also interact in subgroups at other times) [28, 29], can promote contagion dynamics [30, 31, 32]. Motivated by this, we examine the capability of the Group Attractiveness Model to reproduce correlations in the number of groups, focusing on groups of sizes two and three.

We count the number of unique groups of size two, $k^{(2)}_{i}$, and three, $k^{(3)}_{i}$, in which each individual $i$ takes part at any moment within the observation time of the system, and evaluate the Pearson correlation coefficient, $\rho$, between these two quantities. We then simulate the Group Attractiveness Model 100 times, using the parameters fitted from the group size distributions, and evaluate for each run the linear correlation between the number of groups of size two and three. Again, we consider the Attractiveness Model as a reference model.
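As an illustration of this measurement, the following Python sketch counts, for each individual, the distinct groups of size two and three it belongs to and computes the Pearson coefficient between the two counts. The input format (a flat list of groups observed over time) and the toy data are assumptions of this sketch, as is the choice of including individuals with zero groups of a given size.

```python
from collections import Counter
from math import sqrt

def group_counts(groups, size):
    """Number of distinct groups of the given size each individual belongs to."""
    counts = Counter()
    for g in {frozenset(g) for g in groups}:  # repeated observations count once
        if len(g) == size:
            counts.update(g)
    return counts

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy data (hypothetical): groups observed at any time, possibly repeated.
groups = [(1, 2), (2, 1), (1, 3), (2, 3), (1, 2, 3), (1, 2, 4), (4, 5)]
individuals = sorted({i for g in groups for i in g})
k2 = group_counts(groups, 2)
k3 = group_counts(groups, 3)
rho = pearson([k2[i] for i in individuals], [k3[i] for i in individuals])
print(rho)
```

Storing groups as `frozenset` objects makes the pair `(1, 2)` and its repetition `(2, 1)` count as a single unique group.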

The results are reported in Fig. 3. We observe that, in general, face-to-face interactions in pairs and triads in real-world systems tend to be highly correlated (black circles), with $\rho$ varying between 0.66 (“HS11”) and 0.82 (“C17”). This aspect of the empirical datasets is well reproduced by the GAM (blue squares), for which the average correlation coefficient never goes below 0.59 (“C16”). Moreover, the GAM is able to predict the exact value of $\rho$ for half of the systems, while slightly underestimating it for the others. The AM, instead, systematically underestimates the correlations between groups of sizes two and three (red diamonds), with values always below 0.53 (“HS11”), down to 0.33 (“C16”).

The correlation analysis points out the higher-order nature of human face-to-face interactions. In the Attractiveness Model, which is based on a dyadic approach, groups of more than two agents are constructed as a collection of pairs of agents, e.g., if three agents form three pairs at a given time step, we assume them to interact in a group of three agents. Consequently, the correlation between groups of two and three agents decreases, especially in those scenarios with a high density of agents, e.g., the “C16” dataset. The lower correlation in the AM is not simply an effect of how we reconstruct groups from pairwise interactions: in empirical systems, where groups are obtained in the same way, we observe high values of correlation. This means that groups in real-world systems are not simply a collection of dyadic contacts. In fact, the Group Attractiveness Model, which naturally accounts for group interactions, correctly reproduces the high values of correlation observed in the data.

The hierarchical structure of group burstiness

A distinctive characteristic of human face-to-face interactions is their bursty behavior [2, 33]. In particular, the durations of contacts between individuals display broad-tailed distributions, indicating that most contacts are brief and few last for long periods of time, with no characteristic time scale. Recent works have shown that burstiness is not limited to pairwise contacts, as group interactions show similar temporal patterns [34, 20, 24]. Remarkably, the distributions of contact duration are typically organized in a hierarchy, with small groups showing broader distributions compared to larger ones, a feature emerging also in the social systems we investigate (here we focus on the “HS11” dataset, panel a of Fig. 4, while the analysis of the other datasets is reported in the Supplementary Information, Figs. S1-S5). Note that here we define the contact duration as the number of consecutive time steps for which an interaction is present.
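The definition of contact duration given above can be made concrete with a short Python sketch. It assumes, for illustration only, that the data are available as a list of snapshots (one list of groups per time step), that a group is identified by its exact set of members, and that groups still present at the end of the observation window are counted with their truncated duration.

```python
from collections import defaultdict

def contact_durations(snapshots):
    """Map each group size to the list of durations of its groups.

    A duration is the number of consecutive time steps in which a group
    with exactly the same members is present.
    """
    durations = defaultdict(list)
    running = {}  # group -> consecutive steps it has been present so far
    for groups in snapshots:
        present = {frozenset(g) for g in groups}
        for g in list(running):
            if g not in present:  # the group just ended: record its duration
                durations[len(g)].append(running.pop(g))
        for g in present:
            running[g] = running.get(g, 0) + 1
    for g, d in running.items():  # groups still present at the last step
        durations[len(g)].append(d)
    return dict(durations)

# Toy example (hypothetical data): four consecutive time steps.
snapshots = [
    [(1, 2)],
    [(1, 2), (3, 4, 5)],
    [(3, 4, 5)],
    [(1, 2)],
]
result = contact_durations(snapshots)
print(result)
```

Here the pair `(1, 2)` yields two separate contacts, one lasting two steps and one lasting a single step, because the interaction is interrupted at the third time step.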

While the individual Attractiveness Model succeeds in reproducing the broad-tailed distribution of pairwise contacts [10], it fails to recover the hierarchical structure of group interactions. In particular, the model predicts large groups to be more stable than small ones, namely that groups with more individuals remain in contact for longer (panel c of Fig. 4 and Figs. S1-S5 in Supplementary Information) [11]. This discrepancy with the empirical evidence is probably due to the fact that large groups of agents in the AM tend to be more attractive than small ones, as it is more likely that individuals with high attractiveness are members of the group.

In contrast, capitalizing on the higher-order mechanism of group formation based on group attractiveness, the GAM is able to produce broad-tailed distributions for the contact duration as well as their hierarchical organization (panel b of Fig. 4 and Figs. S1-S5 in the Supplementary Information). Yet, we observe that the distributions are often narrower compared to the empirical ones, especially in scenarios where the density of agents is high (see results on the “C16” dataset in the Supplementary Information). This is probably due to how we define the probability that an agent interacts with its neighbors, i.e., Eq. 2. Specifically, in a dense environment each agent will interact with a probability that tends to the average value of the group attractiveness, meaning that individuals with high attractiveness, namely those contributing the most to the persistence of interactions, do not have a strong effect. Further studies should aim to understand the relationship between the broadness of the distributions and their hierarchical organization.

Higher-order homophily in face-to-face interactions

In many social contexts, people prefer to build ties with others whom they perceive as similar to themselves [35]. This pervasive characteristic, known as homophily, shapes the “social world” of individuals, thus profoundly influencing how behaviors spread [36], how biases [37] and social norms [38] form, and how segregation emerges [39, 40]. Homophily characterizes face-to-face interactions as well [3, 41, 42], driving the onset of inequalities even at such a fundamental scale [15].

While homophily is usually measured at the level of pairs of individuals, recent studies have aimed at capturing it at the level of groups of three or more individuals [43, 44]. We can use the Group Attractiveness Model to analyze higher-order homophilic patterns in face-to-face interactions. Specifically, we enrich the model by associating agents with a set of attributes and by tuning the probability that an agent interacts with its neighbors according to their attributes. The group formation is now a two-step process that incorporates attractiveness and homophilic preferences: First, each agent decides whether to stay or to walk away based on the attractiveness of its neighborhood (see Eq. 2); if it stays, the agent chooses the group(s) to which it connects based on its own attributes and those of the group member(s) (panel a of Fig. 5).

To illustrate the second step, let us assume that each agent is associated with a single attribute. An agent with attribute $\alpha$ close to an agent with attribute $\beta$ will form a group of two with probability $h^{(2)}_{\alpha\beta}$. Note that $h^{(2)}_{\alpha\beta}$ represents the probability that the agent with attribute $\alpha$ is the one that starts the interaction, and in general $h^{(2)}_{\alpha\beta}\neq h^{(2)}_{\beta\alpha}$. Similarly, if the agent is close to a group of two agents having attributes $\beta$ and $\gamma$, respectively, it will form a group of three with probability $h^{(3)}_{\alpha\beta\gamma}$. Therefore, the probability of forming groups of various sizes based on the agents’ attributes is determined by a set of homophily matrices, $H^{(2)}$, $H^{(3)}$, and so on. In general, we can consider a set of matrices for each attribute associated with the agents. Alternatively, one could adopt an intersectional approach, defining a single set of matrices that modulates the probability of agents to interact based on combinations of attributes, e.g., black-woman, white-man. Here, we will focus on a single binary attribute, i.e., $\alpha\in\{0,1\}$, using the information on gender contained in the data to test the ability of our model to reproduce higher-order mixing patterns in face-to-face interactions (from now on, attribute 0 will denote women, while attribute 1 will denote men).

To determine the elements of the homophily matrix $H^{(2)}=[[h_{00},h_{01}],[h_{10},h_{11}]]$ (superscripts are dropped for simplicity), we evaluate the fraction of unique groups of size two in the different configurations, i.e., female-female, male-male, and female-male, that are formed from two individuals not previously interacting, namely $i$ and $j$ form a group at time $t$ but they are not part of any common group at time $t-1$ . Such fractions can be written in terms of the elements of the homophily matrix (see Methods for details), as

	$\displaystyle e_{00}=\frac{f_{0}^{2}(1-h_{01}^{2})}{f_{0}^{2}(1-h_{01}^{2})+2f_{0}f_{1}(1-h_{00}h_{11})+f_{1}^{2}(1-h_{10}^{2})},$
	$\displaystyle e_{01}=\frac{2f_{0}f_{1}(1-h_{00}h_{11})}{f_{0}^{2}(1-h_{01}^{2})+2f_{0}f_{1}(1-h_{00}h_{11})+f_{1}^{2}(1-h_{10}^{2})},$		(3)
	$\displaystyle e_{11}=\frac{f_{1}^{2}(1-h_{10}^{2})}{f_{0}^{2}(1-h_{01}^{2})+2f_{0}f_{1}(1-h_{00}h_{11})+f_{1}^{2}(1-h_{10}^{2})},$

where $e_{00}$, $e_{01}$, and $e_{11}$ denote the fractions of groups formed by two women, a woman and a man, and two men, respectively, while $f_{0}$ and $f_{1}=1-f_{0}$ represent the fractions of women and men. To estimate the elements of the homophily matrix $H^{(3)}=[[h_{000},h_{001},h_{011}],[h_{100},h_{101},h_{111}]]$, we count the number of unique groups of size three in the different configurations that are formed by aggregation of an individual to a group of size two, i.e., at time $t-1$ two individuals $i$ and $j$ form a group, and at time $t$ an individual $k$, not previously interacting with them, joins the group. Based on the gender of the individuals joining the group, we have two sets of transitions. A woman can join a group of two other women, two men, or a woman and a man. The fraction of these transitions can be written in terms of the first row of the homophily matrix $H^{(3)}$ (see Methods), namely

	$\displaystyle\tau_{0\rightarrow(0,0)}=\frac{\varepsilon_{00}h_{000}}{\varepsilon_{00}h_{000}+\varepsilon_{01}h_{001}+\varepsilon_{11}h_{011}},$
	$\displaystyle\tau_{0\rightarrow(0,1)}=\frac{\varepsilon_{01}h_{001}}{\varepsilon_{00}h_{000}+\varepsilon_{01}h_{001}+\varepsilon_{11}h_{011}},$		(4)
	$\displaystyle\tau_{0\rightarrow(1,1)}=\frac{\varepsilon_{11}h_{011}}{\varepsilon_{00}h_{000}+\varepsilon_{01}h_{001}+\varepsilon_{11}h_{011}},$

where $\tau_{0\rightarrow(\alpha,\beta)}$ indicate the fractions of transitions, while $\varepsilon_{\alpha\beta}$ denote the fractions of unique groups of size two in the various configurations. Note that while $e_{\alpha\beta}$ indicate the groups emerging from two not previously interacting individuals, $\varepsilon_{\alpha\beta}$ denote groups formed in all possible ways, e.g., a group of three that loses a member. In the same way, men can join groups of two individuals in different configurations, the fractions of which can be expressed in terms of the second row of the homophily matrix $H^{(3)}$, namely

	$\displaystyle\tau_{1\rightarrow(0,0)}=\frac{\varepsilon_{00}h_{100}}{\varepsilon_{00}h_{100}+\varepsilon_{01}h_{101}+\varepsilon_{11}h_{111}},$
	$\displaystyle\tau_{1\rightarrow(0,1)}=\frac{\varepsilon_{01}h_{101}}{\varepsilon_{00}h_{100}+\varepsilon_{01}h_{101}+\varepsilon_{11}h_{111}},$		(5)
	$\displaystyle\tau_{1\rightarrow(1,1)}=\frac{\varepsilon_{11}h_{111}}{\varepsilon_{00}h_{100}+\varepsilon_{01}h_{101}+\varepsilon_{11}h_{111}}.$
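Eqs. 4 and 5 share the same structure: each transition fraction is the product of a pair fraction $\varepsilon$ and the corresponding element of one row of $H^{(3)}$, normalized over the three pair configurations. The Python sketch below evaluates this relation and inverts it. The function names and the dictionary keys are hypothetical, and because the transition fractions only determine a row of $H^{(3)}$ up to a common factor, the sketch fixes that factor by setting the largest element to one. This normalization is an assumption of the example, not necessarily the one adopted in the Methods.

```python
CONFIGS = ("00", "01", "11")  # pair configurations: two women, mixed, two men

def transition_fractions(eps, h_row):
    """Eqs. 4-5: fraction of joining events towards each pair configuration.

    eps maps each configuration to the fraction of pairs of that type,
    h_row maps it to the matching element of one row of H^(3).
    """
    weights = {c: eps[c] * h_row[c] for c in CONFIGS}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def relative_preferences(tau, eps):
    """Invert Eqs. 4-5 up to a common factor: h is proportional to tau / eps."""
    ratios = {c: tau[c] / eps[c] for c in CONFIGS}
    top = max(ratios.values())  # assumption: rescale so the largest entry is 1
    return {c: r / top for c, r in ratios.items()}

# Toy numbers (hypothetical, not taken from the datasets).
eps = {"00": 0.3, "01": 0.5, "11": 0.2}
h_row = {"00": 0.6, "01": 0.3, "11": 0.1}
tau = transition_fractions(eps, h_row)
pref = relative_preferences(tau, eps)
print(tau)
print(pref)
```

Dividing by $\varepsilon$ removes the effect of how abundant each type of pair is, so the recovered preferences reflect the joining behavior rather than the availability of groups to join.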

A similar approach can be adopted to evaluate the matrices modulating the formation of groups of four or more individuals. As larger groups are less abundant than small ones, for simplicity we here limit our analysis to groups of size two and three.

Panels b and c of Fig. 5 display the homophily matrices $H^{(2)}$ and $H^{(3)}$ obtained for the interactions in the “HS11” dataset (see Supplementary Information for the analysis of the other systems). At the level of pairwise interactions, we observe that women do not have a clear homophilic behavior, as they interact with other women and men with almost the same probability. Conversely, men are strongly homophilic, as the model predicts a substantial difference between the interaction probabilities. Remarkably, things change in groups of size three. In this case, women tend to be more homophilic, while men do not have a strong gender preference when joining groups of two individuals. Homophilic preferences depend on the group size in a nontrivial way: here we observe a discordant behavior, i.e., men tend to be homophilic in pairs, whereas women tend to be homophilic in triads, while other social systems can display a consistent pattern (see Panels a and b of Figs. S6-S10 in Supplementary Information).

Finally, we test the capability of the GAM to reproduce mixing patterns in social systems. Panel d of Fig. 5 shows the fraction of unique groups of size three in the different gender configurations present in the data (black bars), together with those predicted by the GAM (blue bars). As a comparison, we consider the Social-Attractiveness Model (SAM) proposed in [15] (yellow bars). Similarly to our model, in the SAM a population of mobile agents performs a random walk, interacting with the others based on the intrinsic attractiveness of individuals and their attributes, i.e., gender. Yet, this model only accounts for pairwise interactions, so mixing patterns at the level of groups of three agents are ultimately determined by homophily at the level of pairs. Our results show that considering higher-order homophily allows us to better reproduce the gender configurations in the data. In particular, we observe that the SAM overestimates the tendency of men to interact with their same gender, generating too many groups with three men or two men and a woman. Conversely, our model provides significantly better predictions of the empirical mixing patterns (the better performance of the GAM is consistent across different datasets; see Panels c of Figs. S6-S10 in Supplementary Information). Still, we observe some mismatch, particularly in the fraction of groups with three women. This might have various explanations, including differences in the frequency of contacts between the two genders [41] or at the level of individual behaviors, as well as other mechanisms of group formation/dissolution that the Group Attractiveness Model does not account for [20, 23, 24]. Overall, our results shed light on how higher-order effects can influence mixing patterns in social systems, underlining the importance of measuring homophily at the level of groups.

Discussion

Humans are social animals that communicate, gather, and live in groups. Even the most fundamental level of social interaction, i.e., face-to-face proximity, is pervasively characterized by groups of various sizes. Although models based on dyadic representations, i.e., complex networks, have proved to be a valuable tool to characterize various properties of face-to-face interactions, they fall short when it comes to capturing structural and temporal features of groups. This is crucial, as group interactions can dramatically change the collective behavior of complex social systems, leading to super-exponential disease spreading [45], triggering critical mass effects in social contagion [46, 47], and boosting the ability of committed minorities [48] to overturn social norms [49].

In this paper, we presented the Group Attractiveness Model, an agent-based model accounting for the dynamics of groups of individuals interacting face-to-face. Our model is able to reproduce many aspects of real-world systems, including the distribution of groups, the correlation in their number, their persistence in time, and the presence of mixing patterns. The superior performance of the GAM compared to pairwise methods marks the need to adopt higher-order models to investigate groups in face-to-face interactions.

Although our model reproduces different features of group interactions, others remain beyond its reach. For instance, our model is Markovian, as agents decide whether to interact with others without memory of the previous time steps. Face-to-face interactions are instead characterized by complex memory effects, with each group having memory of itself and others [23]. The asymmetric nature of memory can determine preferred temporal directions in group formation and dispersal that cannot be captured by our model, as both dynamics are governed by the same mechanism, i.e., group attractiveness. Moreover, while small groups tend to evolve gradually, with one or a few members at a time joining/leaving the group, large groups can have more complex dynamics [20, 23, 24] that the Group Attractiveness Model does not account for.

In empirical systems, the distributions of group duration are broad-tailed, and generally small groups have broader distributions than large ones, i.e., small groups last longer. Although this feature is generally reproduced by our model, we observed that denser environments lead to narrower contact duration distributions, likely due to a larger volume of group aggregations and disaggregations. However, as tuning the density allows us to correctly predict the number of groups of different sizes, a trade-off between the group statistics and their temporal duration remains. In addition, the data do not provide spatial information about the environment in which contacts take place, making it difficult to determine the appropriate value of agent density. Further modeling efforts should thus aim at investigating how the spatial dimension, the number of groups, the profile and the hierarchical organization of the duration distributions relate to one another.

Though a few attempts to give a higher-order definition of homophily have recently been made [43, 44], our understanding of homophily at the group level remains limited. Our results advance this line of research by shifting the perspective on how to measure homophily in group interactions. Instead of quantifying it a posteriori, namely from mixing patterns in the data, we adopted an a priori approach, modeling how microscopic interactions are driven by homophilic preferences. Yet, we assumed that agents differ only in terms of their intention to interact with their close neighbors, while other factors can be at play, both at an individual and at an attribute level (e.g., one can assume that agent attractiveness correlates with their attributes). Therefore, one has to be aware that the quantification of how much a system is characterized by homophily strongly depends on the particular modeling choices. Given the prominence that both group interactions and homophily have (separately) in social systems, a deeper understanding of higher-order homophily is essential.

Overall, our work contributes to the study of human face-to-face interactions through the lens of group dynamics and higher-order mechanisms. Given its ability to reproduce different features of the data, we are confident that our model will prove useful for investigating how groups affect different phenomena, including social contagion, epidemic spreading, and the emergence of mixing patterns and segregation in networked populations.

Methods

Reconstructing groups from face-to-face pairwise data

To assess the features of the Group Attractiveness Model, we use datasets from the SocioPatterns collaboration [2]. These datasets store face-to-face interactions as a list of dyadic contacts with a resolution of 20 seconds. Therefore, they do not provide any information on the group interactions in the social systems. However, given the fine-grained temporal resolution of the data, we are able to reconstruct groups of more than two individuals. Specifically, if at time $t$ in the dataset there are all possible dyadic contacts among $s$ individuals, we can reasonably assume that they are interacting together in a group. For example, if at time $t$ an individual $i$ is in contact with individuals $j$ and $k$ , and these two are also interacting, we can safely say that $i$ , $j$ and $k$ form a group of three individuals.
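The reconstruction step can be sketched in a few lines of Python. The snippet below is only an illustration under stated assumptions: the input format, a list of `(t, i, j)` tuples, and the function names are hypothetical, and identifying groups with the maximal cliques of each temporal snapshot is one possible reading of the rule above, not necessarily the exact procedure used for the results in this paper.

```python
from collections import defaultdict


def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the Bron-Kerbosch algorithm (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        # A clique is maximal when no node can extend it.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & adjacency[node],
                   excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques


def reconstruct_groups(contacts):
    """Map each timestamp to the groups (maximal cliques) active at that time.

    `contacts` is an iterable of (t, i, j) tuples, one per dyadic contact
    (assumed input format).
    """
    snapshots = defaultdict(lambda: defaultdict(set))
    for t, i, j in contacts:
        snapshots[t][i].add(j)
        snapshots[t][j].add(i)
    return {t: maximal_cliques(adj) for t, adj in snapshots.items()}


# Toy example: a triangle and a separate pair at t=0, a single pair at t=20.
contacts = [(0, 1, 2), (0, 2, 3), (0, 1, 3), (0, 4, 5), (20, 1, 2)]
groups = reconstruct_groups(contacts)
```

In the toy example, individuals 1, 2 and 3 are mutually in contact at $t=0$ and are therefore returned as a single group of size three, while the isolated contact between 4 and 5 is returned as a group of size two.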

Fitting the homophily matrices

The Group Attractiveness Model can be extended to assess higher-order mixing patterns in face-to-face interactions. Specifically, we can enrich the model by tuning the probability that an agent interacts with a neighboring group based on their attributes. These probabilities are determined by a set of homophily matrices, $H^{(2)}$, $H^{(3)}$, and so on, one for each group size. Here, we show how to analytically derive the homophily matrices in the case of a single, binary attribute $\alpha\in\{0,1\}$. Since larger groups are less abundant in the data, we focus on groups of size two and three, tuned by the matrices $H^{(2)}=[[h_{00},h_{01}],[h_{10},h_{11}]]$ and $H^{(3)}=[[h_{000},h_{001},h_{011}],[h_{100},h_{101},h_{111}]]$ (the superscripts are omitted for simplicity). Here, $h_{\alpha\beta}$ denotes the probability that an agent with attribute $\alpha$ starts to interact with an agent having attribute $\beta$, while $h_{\alpha\beta\gamma}$ represents the probability that an agent with attribute $\alpha$ starts to interact with a group of two agents having attributes $\beta$ and $\gamma$, respectively.

Groups of two agents can be in three different configurations, namely $(0,0)$ , $(0,1)$ and $(1,1)$ . Let us consider the scenario in which two agents that are not interacting, i.e., they are not part of any common group, get in contact and form a group. We denote the number of pairs in each configuration generated at time $t$ as $E_{00}$ , $E_{01}$ , and $E_{11}$ , respectively. In general, we can write the number of pairs in configuration $(\alpha,\beta)$ as

$$E_{\alpha\beta}=G^{(2)}p_{ij,\alpha\beta}. \tag{6}$$

$G^{(2)}$ denotes the average number of interactions between two agents (previously not interacting) that could be formed without considering homophily. $G^{(2)}$ depends on the number of agents $N$, their radius of action $d$, the size of the environment $L$, and the attractiveness distribution. $p_{ij,\alpha\beta}$ represents the probability that two agents $i$ and $j$ (i) have attributes $\alpha$ and $\beta$, respectively, and (ii) start interacting according to their attributes. A pair is created in three different situations, depending on whether (1) only $i$, (2) only $j$, or (3) both $i$ and $j$ initiate the formation of the group. Therefore, we can write $p_{ij,\alpha\beta}$ as

$$p_{ij,\alpha\beta}=p_{i\rightarrow j,\alpha\beta}+p_{i\leftarrow j,\alpha\beta}+p_{i\leftrightarrow j,\alpha\beta}, \tag{7}$$

where the arrows indicate the three possible scenarios of group formation. In the case of two agents with attributes $\alpha=\beta=0$ , we have

$$\begin{aligned}
p_{ij,00} &= p_{i\rightarrow j,00}+p_{i\leftarrow j,00}+p_{i\leftrightarrow j,00} \\
&= f_{0}^{2}\times h_{00}(1-h_{00})+f_{0}^{2}\times(1-h_{00})h_{00}+f_{0}^{2}\times h_{00}^{2} \\
&= f_{0}^{2}(1-h_{01}^{2}),
\end{aligned} \tag{8}$$

where $f_{0}$ is the fraction of agents with attribute 0, and we assumed $h_{00}+h_{01}=1$ [15]. Hence, the number of pairs in state $(0,0)$ generated at time $t$ is

$$E_{00}=G^{(2)}f_{0}^{2}(1-h_{01}^{2}). \tag{9}$$

Similarly, we can write the number of groups in state $(1,1)$ as

$$E_{11}=G^{(2)}f_{1}^{2}(1-h_{10}^{2}), \tag{10}$$

where $f_{1}$ is the fraction of agents with attribute 1, and we assumed $h_{10}+h_{11}=1$ . Finally, the number of groups in state $(0,1)$ is given by

$$E_{01}=2G^{(2)}f_{0}f_{1}(1-h_{00}h_{11}), \tag{11}$$

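where the factor two accounts for the two ways of assigning attributes 0 and 1 to $i$ and $j$, and the term $1-h_{00}h_{11}=1-(1-h_{01})(1-h_{10})$ is the probability that at least one of the two agents initiates the contact. We can then normalize the number of groups in each configuration by the total number of groups generated, obtaining the fractions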

$$\begin{aligned}
e_{00} &= \frac{f_{0}^{2}(1-h_{01}^{2})}{f_{0}^{2}(1-h_{01}^{2})+2f_{0}f_{1}(1-h_{00}h_{11})+f_{1}^{2}(1-h_{10}^{2})}, \\
e_{01} &= \frac{2f_{0}f_{1}(1-h_{00}h_{11})}{f_{0}^{2}(1-h_{01}^{2})+2f_{0}f_{1}(1-h_{00}h_{11})+f_{1}^{2}(1-h_{10}^{2})}, \\
e_{11} &= \frac{f_{1}^{2}(1-h_{10}^{2})}{f_{0}^{2}(1-h_{01}^{2})+2f_{0}f_{1}(1-h_{00}h_{11})+f_{1}^{2}(1-h_{10}^{2})}.
\end{aligned} \tag{12}$$

In the case of gender homophily, setting $f_{0}$ and $f_{1}$ equal to the fraction of female and male individuals, and $e_{\alpha\beta}$ equal to the average fractions of pairs formed at time $t$ , where the individuals were not interacting at time $t-1$ , we can estimate the entries of $H^{(2)}$ .
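As an illustration of how Eq. (12) can be used in practice, the following Python sketch evaluates the pair fractions for given values of $h_{00}$ and $h_{11}$ and then recovers the two parameters from observed fractions. The function names and the brute-force grid search are assumptions made for the example; the text does not specify the numerical procedure used for the fit, and any root finder or optimizer could be used instead. The sketch assumes $0<f_{0}<1$ and the row normalizations $h_{00}+h_{01}=1$ and $h_{10}+h_{11}=1$.

```python
def pair_fractions(f0, h00, h11):
    """Fractions (e00, e01, e11) of newly formed pairs, following Eq. (12).

    Assumes 0 < f0 < 1 so that the normalization is never zero.
    """
    f1 = 1.0 - f0
    h01 = 1.0 - h00  # row normalization h00 + h01 = 1
    h10 = 1.0 - h11  # row normalization h10 + h11 = 1
    w00 = f0 ** 2 * (1.0 - h01 ** 2)
    w01 = 2.0 * f0 * f1 * (1.0 - h00 * h11)
    w11 = f1 ** 2 * (1.0 - h10 ** 2)
    total = w00 + w01 + w11
    return w00 / total, w01 / total, w11 / total


def fit_pair_homophily(f0, observed, steps=100):
    """Grid search for (h00, h11) minimizing the squared error on (e00, e01, e11).

    Illustrative only: the grid resolution 1/steps bounds the accuracy.
    """
    best, best_err = None, float("inf")
    for a in range(steps + 1):
        for b in range(steps + 1):
            h00, h11 = a / steps, b / steps
            predicted = pair_fractions(f0, h00, h11)
            err = sum((p - o) ** 2 for p, o in zip(predicted, observed))
            if err < best_err:
                best, best_err = (h00, h11), err
    return best


# Synthetic check: generate fractions from known parameters, then fit them back.
target = pair_fractions(0.4, 0.7, 0.6)
h00_fit, h11_fit = fit_pair_homophily(0.4, target)
```

In the neutral case $h_{00}=h_{11}=1/2$, the common factor $3/4$ cancels and the fractions reduce to $f_{0}^{2}$, $2f_{0}f_{1}$ and $f_{1}^{2}$, i.e., the values expected under random mixing.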

If $h_{\alpha\alpha}>1/2$, agents prefer to interact with those having the same attribute, namely the system is in a homophilic regime. Instead, when $h_{\alpha\alpha}<1/2$, agents tend to interact more with those having the other attribute, i.e., the system is in a heterophilic regime. The case $h_{\alpha\alpha}=1/2$ corresponds to the neutral scenario where agents interact without any preference.

We now consider the scenario in which an agent with attribute $\alpha$ joins a pair of interacting agents that are within its scope. We denote the number of groups of size three in configuration $(\alpha,\beta,\gamma)$ generated as $T_{\alpha\rightarrow(\beta,\gamma)}$ . This can be written as

$$T_{\alpha\rightarrow(\beta,\gamma)}=M^{(2)}\varepsilon_{\beta\gamma}h_{\alpha\beta\gamma}, \tag{13}$$

where $M^{(2)}$ is the average number of groups of size two within the scope of an agent, $\varepsilon_{\beta\gamma}$ is the fraction of groups in state $(\beta,\gamma)$, while $h_{\alpha\beta\gamma}$ is the element of the homophily matrix $H^{(3)}$ denoting the probability that an agent with attribute $\alpha$ interacts with a pair of agents with attributes $\beta$ and $\gamma$, respectively. Focusing on $\alpha=0$, we can write the number of groups of size three formed at time $t$ as

$$\begin{aligned}
T_{0\rightarrow(0,0)} &= M^{(2)}\varepsilon_{00}h_{000}, \\
T_{0\rightarrow(0,1)} &= M^{(2)}\varepsilon_{01}h_{001}, \\
T_{0\rightarrow(1,1)} &= M^{(2)}\varepsilon_{11}h_{011}.
\end{aligned} \tag{14}$$

Normalizing by the total number of groups generated, we obtain the fractions

$$\begin{aligned}
\tau_{0\rightarrow(0,0)} &= \frac{\varepsilon_{00}h_{000}}{\varepsilon_{00}h_{000}+\varepsilon_{01}h_{001}+\varepsilon_{11}h_{011}}, \\
\tau_{0\rightarrow(0,1)} &= \frac{\varepsilon_{01}h_{001}}{\varepsilon_{00}h_{000}+\varepsilon_{01}h_{001}+\varepsilon_{11}h_{011}}, \\
\tau_{0\rightarrow(1,1)} &= \frac{\varepsilon_{11}h_{011}}{\varepsilon_{00}h_{000}+\varepsilon_{01}h_{001}+\varepsilon_{11}h_{011}}.
\end{aligned} \tag{15}$$

Similarly, for an agent with attribute 1 we find

$$\begin{aligned}
\tau_{1\rightarrow(0,0)} &= \frac{\varepsilon_{00}h_{100}}{\varepsilon_{00}h_{100}+\varepsilon_{01}h_{101}+\varepsilon_{11}h_{111}}, \\
\tau_{1\rightarrow(0,1)} &= \frac{\varepsilon_{01}h_{101}}{\varepsilon_{00}h_{100}+\varepsilon_{01}h_{101}+\varepsilon_{11}h_{111}}, \\
\tau_{1\rightarrow(1,1)} &= \frac{\varepsilon_{11}h_{111}}{\varepsilon_{00}h_{100}+\varepsilon_{01}h_{101}+\varepsilon_{11}h_{111}}.
\end{aligned} \tag{16}$$

Assuming that $h_{000}+h_{001}+h_{011}=h_{100}+h_{101}+h_{111}=1$ , we can estimate the entries of the homophily matrix $H^{(3)}$ . Specifically, we set $\varepsilon_{\beta\gamma}$ equal to the fractions of unique groups of size two in the different gender configurations, while $\tau_{\alpha\rightarrow(\beta,\gamma)}$ can be evaluated by counting how many times in the data a pair of interacting individuals at time $t-1$ is followed by a group of size three at time $t$ . Note that the neutral scenario with no homophilic preferences in the group formation corresponds to $h_{000}=h_{011}=1/3$ and $h_{100}=h_{111}=1/3$ .
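Under the normalization above, each row of $H^{(3)}$ can be obtained in closed form: since $\tau_{\alpha\rightarrow(\beta,\gamma)}$ is proportional to $\varepsilon_{\beta\gamma}h_{\alpha\beta\gamma}$, the entries $h_{\alpha\beta\gamma}$ are proportional to the ratios $\tau_{\alpha\rightarrow(\beta,\gamma)}/\varepsilon_{\beta\gamma}$. The short Python sketch below illustrates this inversion. The function names are hypothetical, the numerical values are synthetic, and the sketch assumes that every pair configuration is observed, i.e., $\varepsilon_{\beta\gamma}>0$.

```python
def triad_fractions(h_row, eps):
    """Forward relation of Eqs. (15)-(16) for one joining attribute.

    h_row: (h_a00, h_a01, h_a11); eps: (eps_00, eps_01, eps_11).
    """
    weights = [e * h for e, h in zip(eps, h_row)]
    total = sum(weights)
    return tuple(w / total for w in weights)


def triad_homophily_row(tau, eps):
    """Invert Eqs. (15)-(16): recover (h_a00, h_a01, h_a11) from tau and eps.

    Uses h proportional to tau / eps, normalized so that the row sums to one.
    Assumes every entry of eps is strictly positive.
    """
    ratios = [t / e for t, e in zip(tau, eps)]
    total = sum(ratios)
    return tuple(r / total for r in ratios)


# Synthetic round trip: values chosen only for illustration.
eps = (0.3, 0.5, 0.2)
h_true = (0.5, 0.3, 0.2)
tau = triad_fractions(h_true, eps)
h_fit = triad_homophily_row(tau, eps)
```

When the observed triad fractions coincide with the pair fractions, $\tau_{\alpha\rightarrow(\beta,\gamma)}=\varepsilon_{\beta\gamma}$, the inversion returns $1/3$ for every entry, which is the neutral scenario.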

Finally, we can recover the Group Attractiveness Model without homophily by assuming that all agents have the same attribute, say $f_{0}=1$ , and that the corresponding homophilic interaction probabilities are equal to 1, namely $h_{00}=1$ (no matter the value of $h_{11}$ ) and $h_{000}=1$ (no matter the values of $h_{100}$ and $h_{111}$ ).

Data Availability

Datasets storing face-to-face interactions in primary and high schools are freely available at https://www.sociopatterns.org/datasets. Data on contacts in scientific conferences are available upon request at https://doi.org/10.7802/235.

Code Availability

A Python implementation of the Group Attractiveness Model is available as part of the HGX library [50].

References

Duncan and Fiske [2015] S. Duncan and D. W. Fiske, Face-to-face interaction: Research, methods, and theory (Routledge, 2015).
Cattuto et al. [2010] C. Cattuto, W. Van den Broeck, A. Barrat, V. Colizza, J.-F. Pinton, and A. Vespignani, Dynamics of person-to-person interactions from distributed rfid sensor networks, PloS one 5, e11596 (2010).
Stehlé et al. [2011] J. Stehlé, N. Voirin, A. Barrat, C. Cattuto, L. Isella, J.-F. Pinton, M. Quaggiotto, W. Van den Broeck, C. Régis, B. Lina, et al., High-resolution measurements of face-to-face contact patterns in a primary school, PloS one 6, e23176 (2011).
Isella et al. [2011] L. Isella, J. Stehlé, A. Barrat, C. Cattuto, J.-F. Pinton, and W. Van den Broeck, What’s in a crowd? analysis of face-to-face behavioral networks, Journal of theoretical biology 271, 166 (2011).
Takaguchi et al. [2011] T. Takaguchi, M. Nakamura, N. Sato, K. Yano, and N. Masuda, Predictability of conversation partners, Physical Review X 1, 011008 (2011).
Fournet and Barrat [2014] J. Fournet and A. Barrat, Contact patterns among high school students, PloS one 9, e107878 (2014).
Stopczynski et al. [2014] A. Stopczynski, V. Sekara, P. Sapiezynski, A. Cuttone, M. M. Madsen, J. E. Larsen, and S. Lehmann, Measuring large-scale social networks with high resolution, PloS one 9, e95978 (2014).
Mastrandrea et al. [2015] R. Mastrandrea, J. Fournet, and A. Barrat, Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys, PloS one 10, e0136497 (2015).
Sekara et al. [2016] V. Sekara, A. Stopczynski, and S. Lehmann, Fundamental structures of dynamic social networks, Proceedings of the national academy of sciences 113, 9977 (2016).
Starnini et al. [2013] M. Starnini, A. Baronchelli, and R. Pastor-Satorras, Modeling human dynamics of face-to-face interaction networks, Physical review letters 110, 168701 (2013).
Starnini et al. [2016a] M. Starnini, A. Baronchelli, and R. Pastor-Satorras, Model reproduces individual, group and collective dynamics of human contact networks, Social Networks 47, 130 (2016a).
Zhang et al. [2016] Y.-Q. Zhang, J. Cui, S.-M. Zhang, Q. Zhang, and X. Li, Modelling temporal networks of human face-to-face contacts with public activity and individual reachability, The European Physical Journal B 89, 1 (2016).
Starnini et al. [2016b] M. Starnini, M. Frasca, and A. Baronchelli, Emergence of metapopulations and echo chambers in mobile agents, Scientific reports 6, 31834 (2016b).
Flores and Papadopoulos [2018] M. A. R. Flores and F. Papadopoulos, Similarity forces and recurrent components in human face-to-face interaction networks, Physical review letters 121, 258301 (2018).
Oliveira et al. [2022] M. Oliveira, F. Karimi, M. Zens, J. Schaible, M. Génois, and M. Strohmaier, Group mixing drives inequality in face-to-face gatherings, Communications Physics 5, 127 (2022).
Tang et al. [2010] J. Tang, S. Scellato, M. Musolesi, C. Mascolo, and V. Latora, Small-world behavior in time-varying graphs, Physical Review E 81, 055101 (2010).
Frasca et al. [2006] M. Frasca, A. Buscarino, A. Rizzo, L. Fortuna, and S. Boccaletti, Dynamical network model of infective mobile agents, Physical Review E 74, 036110 (2006).
Buscarino et al. [2008] A. Buscarino, L. Fortuna, M. Frasca, and V. Latora, Disease spreading in populations of moving agents, Europhysics Letters 82, 38002 (2008).
Holme and Saramäki [2012] P. Holme and J. Saramäki, Temporal networks, Physics reports 519, 97 (2012).
Cencetti et al. [2021] G. Cencetti, F. Battiston, B. Lepri, and M. Karsai, Temporal properties of higher-order interactions in social networks, Scientific reports 11, 7028 (2021).
Battiston et al. [2020] F. Battiston, G. Cencetti, I. Iacopini, V. Latora, M. Lucas, A. Patania, J.-G. Young, and G. Petri, Networks beyond pairwise interactions: Structure and dynamics, Physics Reports 874, 1 (2020).
Hoffman et al. [2020] M. Hoffman, P. Block, T. Elmer, and C. Stadtfeld, A model for the dynamics of face-to-face interactions in social groups, Network Science 8, S4 (2020).
Gallo et al. [2024] L. Gallo, L. Lacasa, V. Latora, and F. Battiston, Higher-order correlations reveal complex memory in temporal hypergraphs, Nature Communications 15, 4754 (2024).
Iacopini et al. [2023] I. Iacopini, M. Karsai, and A. Barrat, The temporal dynamics of group interactions in higher-order social networks, arXiv preprint arXiv:2306.09967 (2023).
Stehlé et al. [2010] J. Stehlé, A. Barrat, and G. Bianconi, Dynamical and bursty interactions in social networks, Physical review E 81, 035101 (2010).
Egbert [1997] M. M. Egbert, Schisming: The collaborative transformation from a single conversation to multiple conversations, Research on Language and Social Interaction 30, 1 (1997).
Génois et al. [2019] M. Génois, M. Zens, C. Lechner, B. Rammstedt, and M. Strohmaier, Building connections: How scientists meet each other during a conference, arXiv preprint arXiv:1901.01182 (2019).
Lotito et al. [2022] Q. F. Lotito, F. Musciotto, A. Montresor, and F. Battiston, Higher-order motif analysis in hypergraphs, Communications Physics 5, 79 (2022).
Landry et al. [2024] N. W. Landry, J.-G. Young, and N. Eikmeier, The simpliciality of higher-order networks, EPJ Data Science 13, 17 (2024).
Landry and Restrepo [2020] N. W. Landry and J. G. Restrepo, The effect of heterogeneity on hypergraph contagion models, Chaos: An Interdisciplinary Journal of Nonlinear Science 30 (2020).
LaRock and Lambiotte [2023] T. LaRock and R. Lambiotte, Encapsulation structure and dynamics in hypergraphs, Journal of Physics: Complexity 4, 045007 (2023).
Kim et al. [2023] J. Kim, D.-S. Lee, and K.-I. Goh, Contagion dynamics on hypergraphs with nested hyperedges, Physical Review E 108, 034313 (2023).
Karsai et al. [2012] M. Karsai, K. Kaski, A.-L. Barabási, and J. Kertész, Universal features of correlated bursty behaviour, Scientific reports 2, 397 (2012).
Zhao et al. [2011] K. Zhao, J. Stehlé, G. Bianconi, and A. Barrat, Social network dynamics of face-to-face interactions, Physical review E 83, 056109 (2011).
McPherson et al. [2001] M. McPherson, L. Smith-Lovin, and J. M. Cook, Birds of a feather: Homophily in social networks, Annual review of sociology 27, 415 (2001).
Christakis and Fowler [2007] N. A. Christakis and J. H. Fowler, The spread of obesity in a large social network over 32 years, New England journal of medicine 357, 370 (2007).
Lee et al. [2019] E. Lee, F. Karimi, C. Wagner, H.-H. Jo, M. Strohmaier, and M. Galesic, Homophily and minority-group size explain perception biases in social networks, Nature human behaviour 3, 1078 (2019).
Centola et al. [2005] D. Centola, R. Willer, and M. Macy, The emperor’s dilemma: A computational model of self-enforcing norms, American Journal of Sociology 110, 1009 (2005).
Schelling [1971] T. C. Schelling, Dynamic models of segregation, Journal of mathematical sociology 1, 143 (1971).
Currarini et al. [2009] S. Currarini, M. O. Jackson, and P. Pin, An economic model of friendship: Homophily, minorities, and segregation, Econometrica 77, 1003 (2009).
Stehlé et al. [2013] J. Stehlé, F. Charbonnier, T. Picard, C. Cattuto, and A. Barrat, Gender homophily from spatial behavior in a primary school: A sociometric study, Social Networks 35, 604 (2013).
Ozella et al. [2021] L. Ozella, D. Paolotti, G. Lichand, J. P. Rodríguez, S. Haenni, J. Phuka, O. B. Leal-Neto, and C. Cattuto, Using wearable proximity sensors to characterize social contact patterns in a village of rural malawi, EPJ Data Science 10, 46 (2021).
Veldt et al. [2023] N. Veldt, A. R. Benson, and J. Kleinberg, Combinatorial characterizations and impossibilities for higher-order homophily, Science Advances 9, eabq3200 (2023).
Sarker et al. [2024] A. Sarker, N. Northrup, and A. Jadbabaie, Higher-order homophily on simplicial complexes, Proceedings of the National Academy of Sciences 121, e2315931121 (2024).
St-Onge et al. [2021] G. St-Onge, H. Sun, A. Allard, L. Hébert-Dufresne, and G. Bianconi, Universal nonlinear infection kernel from heterogeneous exposure on higher-order networks, Physical review letters 127, 158301 (2021).
Iacopini et al. [2019] I. Iacopini, G. Petri, A. Barrat, and V. Latora, Simplicial models of social contagion, Nature communications 10, 2485 (2019).
de Arruda et al. [2020] G. F. de Arruda, G. Petri, and Y. Moreno, Social contagion models on hypergraphs, Physical Review Research 2, 023032 (2020).
Centola et al. [2018] D. Centola, J. Becker, D. Brackbill, and A. Baronchelli, Experimental evidence for tipping points in social convention, Science 360, 1116 (2018).
Iacopini et al. [2022] I. Iacopini, G. Petri, A. Baronchelli, and A. Barrat, Group interactions modulate critical mass dynamics in social convention, Communications Physics 5, 64 (2022).
Lotito et al. [2023] Q. F. Lotito, M. Contisciani, C. De Bacco, L. Di Gaetano, L. Gallo, A. Montresor, F. Musciotto, N. Ruggeri, and F. Battiston, Hypergraphx: a library for higher-order network analysis, Journal of Complex Networks 11, cnad019 (2023).

Acknowledgments

L.G. and F.B. acknowledge support of the Air Force Office of Scientific Research under award number FA8655-22-1-7025. C.Z. acknowledges the support of 101086712-LearnData-HORIZON-WIDERA-2022-TALENTS-01 financed by EUROPEAN RESEARCH EXECUTIVE AGENCY (REA) (https://cordis.europa.eu/project/id/101086712).