-
Nicer Than Humans: How do Large Language Models Behave in the Prisoner's Dilemma?
Authors:
Nicoló Fontana,
Francesco Pierri,
Luca Maria Aiello
Abstract:
The behavior of Large Language Models (LLMs) as artificial social agents is largely unexplored, and we still lack extensive evidence of how these agents react to simple social stimuli. Testing the behavior of AI agents in classic Game Theory experiments provides a promising theoretical framework for evaluating the norms and values of these agents in archetypal social situations. In this work, we investigate the cooperative behavior of Llama2 when playing the Iterated Prisoner's Dilemma against random adversaries displaying various levels of hostility. We introduce a systematic methodology to evaluate an LLM's comprehension of the game's rules and its capability to parse historical gameplay logs for decision-making. We conducted simulations of games lasting for 100 rounds, and analyzed the LLM's decisions in terms of dimensions defined in behavioral economics literature. We find that Llama2 tends not to initiate defection but it adopts a cautious approach towards cooperation, sharply shifting towards a behavior that is both forgiving and non-retaliatory only when the opponent reduces its rate of defection below 30%. In comparison to prior research on human participants, Llama2 exhibits a greater inclination towards cooperative behavior. Our systematic approach to the study of LLMs in game theoretical scenarios is a step towards using these simulations to inform practices of LLM auditing and alignment.
Submitted 19 June, 2024;
originally announced June 2024.
-
Urban highways are barriers to social ties
Authors:
Luca Maria Aiello,
Anastassia Vybornova,
Sándor Juhász,
Michael Szell,
Eszter Bokányi
Abstract:
Urban highways are common, especially in the US, making cities more car-centric. They promise the annihilation of distance but obstruct pedestrian mobility, thus playing a key role in limiting social interactions locally. Although this limiting role is widely acknowledged in urban studies, the quantitative relationship between urban highways and social ties is barely tested. Here we define a Barrier Score that relates massive, geolocated online social network data to highways in the 50 largest US cities. At the unprecedented granularity of individual social ties, we show that urban highways are associated with decreased social connectivity. This barrier effect is especially strong for short distances and consistent with historical cases of highways that were built to purposefully disrupt or isolate Black neighborhoods. By combining spatial infrastructure with social tie data, our method adds a new dimension to demographic studies of social segregation. Our study can inform reparative planning for an evidence-based reduction of spatial inequality, and more generally, support a better integration of the social fabric in urban planning.
Submitted 18 April, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
The causal role of the Reddit collective action on the GameStop short squeeze
Authors:
Antonio Desiderio,
Luca Maria Aiello,
Giulio Cimini,
Laura Alessandretti
Abstract:
In early 2021, the stock prices of GameStop, AMC, Nokia, and BlackBerry experienced dramatic increases, triggered by short squeeze operations that have been largely attributed to Reddit's retail investors. These events showcased, for the first time, the potential of online social networks to catalyze financial collective action. How, when and to what extent Reddit users played a causal role in driving up these prices, however, remains unclear. To address these questions, we employ causal inference techniques, leveraging data capturing activity on Reddit and Twitter, and trading volume with a high temporal resolution. We find that Reddit discussions foreshadowed trading volume before the GameStop short squeeze, with their predictive power being particularly strong on hourly time scales. This effect emerged abruptly and became prominent a few weeks before the event, but waned once the community of investors gained widespread visibility through Twitter. As the causal link unfolded, the collective investment of the Reddit community, quantified through each user's financial position on GameStop, closely mirrored the market capitalization of the stock. The evidence from our study suggests that Reddit users fueled the GameStop short squeeze, and thereby Reddit served as a coordination hub for a shared financial strategy. Towards the end of January, users talking about GameStop contributed to raise the popularity of BlackBerry, AMC and Nokia, which emerged as the most popular stocks as the community gained global recognition. Overall, our results shed light on the dynamics behind the first large-scale financial collective action driven by social media users.
Submitted 5 February, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Narratives of Collective Action in YouTube's Discourse on Veganism
Authors:
Arianna Pera,
Luca Maria Aiello
Abstract:
Narratives can be powerful tools for inspiring action on pressing societal issues such as climate change. While social science theories offer frameworks for understanding the narratives that arise within collective movements, these are rarely applied to the vast data available from social media platforms, which play a significant role in shaping public opinion and mobilizing collective action. This gap in the empirical evaluation of online narratives limits our understanding of their relationship with public response. In this study, we focus on plant-based diets as a form of pro-environmental action and employ natural language processing to operationalize a theoretical framework of moral narratives specific to the vegan movement. We apply this framework to narratives found in YouTube videos promoting environmental initiatives such as Veganuary, Meatless March, and No Meat May. Our analysis reveals that several narrative types, as defined by the theory, are empirically present in the data. To identify narratives with the potential to elicit positive public engagement, we used text processing to estimate the proportion of comments supporting collective action across narrative types. Video narratives advocating social fight, whether through protest or through efforts to convert others to the cause, are associated with a stronger sense of collective action in the respective comments. These narrative types also demonstrate increased semantic coherence and alignment between the message and public response, markers typically associated with successful collective action. Our work offers new insights into the complex factors that influence the emergence of collective action, thereby informing the development of effective communication strategies within social movements.
Submitted 28 March, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
The Persuasive Power of Large Language Models
Authors:
Simon Martin Breum,
Daniel Vædele Egdal,
Victor Gram Mortensen,
Anders Giovanni Møller,
Luca Maria Aiello
Abstract:
The increasing capability of Large Language Models to act as human-like social agents raises two important questions in the area of opinion dynamics. First, whether these agents can generate effective arguments that could be injected into the online discourse to steer the public opinion. Second, whether artificial agents can interact with each other to reproduce dynamics of persuasion typical of human social systems, opening up opportunities for studying synthetic social systems as faithful proxies for opinion dynamics in human populations. To address these questions, we designed a synthetic persuasion dialogue scenario on the topic of climate change, where a 'convincer' agent generates a persuasive argument for a 'skeptic' agent, who subsequently assesses whether the argument changed its internal opinion state. Different types of arguments were generated to incorporate different linguistic dimensions underpinning psycho-linguistic theories of opinion change. We then asked human judges to evaluate the persuasiveness of machine-generated arguments. Arguments that included factual knowledge, markers of trust, expressions of support, and conveyed status were deemed most effective according to both humans and agents, with humans reporting a marked preference for knowledge-based arguments. Our experimental framework lays the groundwork for future in-silico studies of opinion dynamics, and our findings suggest that artificial agents have the potential of playing an important role in collective processes of opinion formation in online social media.
Submitted 24 December, 2023;
originally announced December 2023.
-
Shifting Climates: Climate Change Communication from YouTube to TikTok
Authors:
Arianna Pera,
Luca Maria Aiello
Abstract:
Public discourse on critical issues such as climate change is progressively shifting to social media platforms that prioritize short-form video content. Content creators acting on those platforms play a pivotal role in shaping the discourse, yet the dynamics of communication and audience reactions across platforms remain underexplored. To improve our understanding of this transition, we studied the video content produced by 21 prominent YouTube creators who have expanded their influence to TikTok as information disseminators. Using dictionary-based tools and BERT-based embeddings, we analyzed the transcripts of nearly 7k climate-related videos across both platforms and the 574k comments they received. We found that, when publishing on TikTok, creators use a more emotionally resonant, self-referential, and action-oriented language compared to YouTube. We also observed a strong semantic alignment between videos and comments, with creators who excel at diversifying their TikTok content from YouTube typically receiving responses that more closely align with their produced content. This suggests that tailored communication strategies hold greater promise in directing public discussion toward desired topics, which bears implications for the design of effective climate communication campaigns.
Submitted 20 February, 2024; v1 submitted 8 December, 2023;
originally announced December 2023.
-
The role of interface design on prompt-mediated creativity in Generative AI
Authors:
Maddalena Torricelli,
Mauro Martino,
Andrea Baronchelli,
Luca Maria Aiello
Abstract:
Generative AI for the creation of images is becoming a staple in the toolkit of digital artists and visual designers. The interaction with these systems is mediated by prompting, a process in which users write a short text to describe the desired image's content and style. The study of prompts offers an unprecedented opportunity to gain insight into the process of human creativity. Yet, our understanding of how people use them remains limited. We analyze more than 145,000 prompts from the logs of two Generative AI platforms (Stable Diffusion and Pick-a-Pic) to shed light on how people explore new concepts over time, and how their exploration might be influenced by different design choices in human-computer interfaces to Generative AI. We find that users exhibit a tendency towards exploration of new topics over exploitation of concepts visited previously. However, a comparative analysis of the two platforms, which differ both in scope and functionalities, reveals some stark differences. Features diverting user focus from prompting and providing instead shortcuts for quickly generating image variants are associated with a considerable reduction in both exploration of novel concepts and detail in the submitted prompts. These results carry direct implications for the design of human interfaces to Generative AI and raise new questions regarding how the process of prompting should be aided in ways that best support creativity.
Submitted 17 February, 2024; v1 submitted 30 November, 2023;
originally announced December 2023.
-
Measuring Behavior Change with Observational Studies: a Review
Authors:
Arianna Pera,
Gianmarco de Francisci Morales,
Luca Maria Aiello
Abstract:
Exploring behavioral change in the digital age is imperative for societal progress in the context of 21st-century challenges. We analyzed 148 articles (2000-2023) and built a map that categorizes behaviors and change detection methodologies, platforms of reference, and theoretical frameworks that characterize online behavior change. Our findings uncover a focus on sentiment shifts, an emphasis on API-restricted platforms, and limited theory integration. We call for methodologies able to capture a wider range of behavioral types, diverse data sources, and stronger theory-practice alignment in the study of online behavioral change.
Submitted 2 November, 2023; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Dream Content Discovery from Reddit with an Unsupervised Mixed-Method Approach
Authors:
Anubhab Das,
Sanja Šćepanović,
Luca Maria Aiello,
Remington Mallett,
Deirdre Barrett,
Daniele Quercia
Abstract:
Dreaming is a fundamental but not fully understood part of human experience that can shed light on our thought patterns. Traditional dream analysis practices, while popular and aided by over 130 unique scales and rating systems, have limitations. Mostly based on retrospective surveys or lab studies, they struggle to be applied on a large scale or to show the importance and connections between different dream themes. To overcome these issues, we developed a new, data-driven mixed-method approach for identifying topics in free-form dream reports through natural language processing. We tested this method on 44,213 dream reports from Reddit's r/Dreams subreddit, where we found 217 topics, grouped into 22 larger themes: the most extensive collection of dream topics to date. We validated our topics by comparing them to the widely-used Hall and van de Castle scale. Going beyond traditional scales, our method can find unique patterns in different dream types (like nightmares or recurring dreams), understand topic importance and connections, and observe changes in collective dream experiences over time and around major events, like the COVID-19 pandemic and the recent Russo-Ukrainian war. We envision that the applications of our method will provide valuable insights into the intricate nature of dreaming.
Submitted 9 July, 2023;
originally announced July 2023.
-
Drivers of social influence in the Twitter migration to Mastodon
Authors:
Lucio La Cava,
Luca Maria Aiello,
Andrea Tagarelli
Abstract:
The migration of Twitter users to Mastodon following Elon Musk's acquisition presents a unique opportunity to study collective behavior and gain insights into the drivers of coordinated behavior in online media. We analyzed the social network and the public conversations of about 75,000 migrated users and observed that the temporal trace of their migrations is compatible with a phenomenon of social influence, as described by a compartmental epidemic model of information diffusion. Drawing from prior research on behavioral change, we delved into the factors that account for variations across different Twitter communities in the effectiveness of the spreading of the influence to migrate. Communities in which the influence process unfolded more rapidly exhibit lower density of social connections, higher levels of signaled commitment to migrating, and more emphasis on shared identity and exchange of factual knowledge in the community discussion. These factors account collectively for 57% of the variance in the observed data. Our results highlight the joint importance of network structure, commitment, and psycho-linguistic aspects of social interactions in describing grassroots collective action, and contribute to deepen our understanding of the mechanisms driving processes of behavior change of online groups.
Submitted 28 November, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
-
The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks
Authors:
Anders Giovanni Møller,
Jacob Aarup Dalsgaard,
Arianna Pera,
Luca Maria Aiello
Abstract:
In the realm of Computational Social Science (CSS), practitioners often navigate complex, low-resource domains and face the costly and time-intensive challenges of acquiring and annotating data. We aim to establish a set of guidelines to address such challenges, comparing the use of human-labeled data with synthetically generated data from GPT-4 and Llama-2 in ten distinct CSS classification tasks of varying complexity. Additionally, we examine the impact of training data sizes on performance. Our findings reveal that models trained on human-labeled data consistently exhibit superior or comparable performance compared to their synthetically augmented counterparts. Nevertheless, synthetic augmentation proves beneficial, particularly in improving performance on rare classes within multi-class tasks. Furthermore, we leverage GPT-4 and Llama-2 for zero-shot classification and find that, while they generally display strong performance, they often fall short when compared to specialized classifiers trained on moderately sized training sets.
Submitted 5 February, 2024; v1 submitted 26 April, 2023;
originally announced April 2023.
-
Multidimensional Tie Strength and Economic Development
Authors:
Luca Maria Aiello,
Sagar Joglekar,
Daniele Quercia
Abstract:
The strength of social relations has been shown to affect an individual's access to opportunities. To date, however, the correspondence between tie strength and population's economic prospects has not been quantified, largely because of the inability to operationalise strength based on Granovetter's classic theory. Our work departed from the premise that tie strength is a unidimensional construct (typically operationalized with frequency or volume of contact), and used instead a validated model of ten fundamental dimensions of social relationships grounded in the literature of social psychology. We built state-of-the-art NLP tools to infer the presence of these dimensions from textual communication, and analyzed a large conversation network of 630K geo-referenced Reddit users across the entire US connected by 12.8M social ties created over the span of 7 years. We found that unidimensional tie strength is only weakly correlated with economic opportunities (R2=0.30), while multidimensional constructs are highly correlated (R2=0.62). In particular, economic opportunities are associated to the combination of: i) knowledge ties, which bridge geographically distant groups, facilitating the knowledge dissemination across communities; and ii) social support ties, which knit geographically close communities together, and represent dependable sources of social and emotional support. These results point to the importance of developing high-quality measures of tie strength in network theory.
Submitted 22 December, 2022;
originally announced December 2022.
-
The language of opinion change on social media under the lens of communicative action
Authors:
Corrado Monti,
Luca Maria Aiello,
Gianmarco De Francisci Morales,
Francesco Bonchi
Abstract:
Which messages are more effective at inducing a change of opinion in the listener? We approach this question within the frame of Habermas' theory of communicative action, which posits that the illocutionary intent of the message (its pragmatic meaning) is the key. Thanks to recent advances in natural language processing, we are able to operationalize this theory by extracting the latent social dimensions of a message, namely archetypes of social intent of language, that come from social exchange theory. We identify key ingredients to opinion change by looking at more than 46k posts and more than 3.5M comments on Reddit's r/ChangeMyView, a debate forum where people try to change each other's opinion and explicitly mark opinion-changing comments with a special flag called "delta". Comments that express no intent are about 77% less likely to change the mind of the recipient, compared to comments that convey at least one social dimension. Among the various social dimensions, the ones that are most likely to produce an opinion change are knowledge, similarity, and trust, which resonates with Habermas' theory of communicative action. We also find other new important dimensions, such as appeals to power or empathetic expressions of support. Finally, in line with theories of constructive conflict, yet contrary to the popular characterization of conflict as the bane of modern social media, our findings show that voicing conflict in the context of a structured public debate can promote integration, especially when it is used to counter another conflictive stance. By leveraging recent advances in natural language processing, our work provides an empirical framework for Habermas' theory, finds concrete examples of its effects in the wild, and suggests its possible extension with a more faceted understanding of intent interpreted as social dimensions of language.
Submitted 31 October, 2022;
originally announced October 2022.
-
Urban form and COVID-19 cases and deaths in Greater London: an urban morphometric approach
Authors:
Alessandro Venerandi,
Luca Maria Aiello,
Sergio Porta
Abstract:
The COVID-19 pandemic generated a considerable debate in relation to urban density. This is an old debate, originated in mid 19th century's England with the emergence of public health and urban planning disciplines. While popularly linked, evidence suggests that such relationship cannot be generally assumed. Furthermore, urban density has been investigated in a spatially coarse manner (predominantly at city level) and never contextualised with other descriptors of urban form. In this work, we explore COVID-19 and urban form in Greater London, relating a comprehensive set of morphometric descriptors (including built-up density) to COVID-19 deaths and cases, while controlling for socioeconomic, ethnicity, age, and co-morbidity. We describe urban form at individual building level and then aggregate information for official neighbourhoods, allowing for a detailed intra-urban representation. Results show that: i) control variables significantly explain more variance of both COVID-19 cases and deaths than the morphometric descriptors; ii) of what the latter can explain, built-up density is indeed the most associated, though inversely. The typical London neighbourhood with high levels of COVID-19 infections and deaths resembles a suburb, featuring a low-density urban fabric dotted by larger free-standing buildings and framed by a poorly inter-connected street network.
Submitted 16 October, 2022;
originally announced October 2022.
-
Heterogeneous rarity patterns drive price dynamics in NFT collections
Authors:
Amin Mekacher,
Alberto Bracci,
Matthieu Nadini,
Mauro Martino,
Laura Alessandretti,
Luca Maria Aiello,
Andrea Baronchelli
Abstract:
We quantify Non Fungible Token (NFT) rarity and investigate how it impacts market behaviour by analysing a dataset of 3.7M transactions collected between January 2018 and June 2022, involving 1.4M NFTs distributed across 410 collections. First, we consider the rarity of an NFT based on the set of human-readable attributes it possesses and show that most collections present heterogeneous rarity patterns, with few rare NFTs and a large number of more common ones. Then, we analyze market performance and show that, on average, rarer NFTs: (i) sell for higher prices, (ii) are traded less frequently, (iii) guarantee higher returns on investment (ROIs), and (iv) are less risky, i.e., less prone to yield negative returns. We anticipate that these findings will be of interest to researchers as well as NFT creators, collectors, and traders.
Submitted 31 August, 2022; v1 submitted 21 April, 2022;
originally announced April 2022.
-
Epidemic Dreams: Dreaming about health during the COVID-19 pandemic
Authors:
Sanja Šćepanović,
Luca Maria Aiello,
Deirdre Barrett,
Daniele Quercia
Abstract:
The continuity hypothesis of dreams suggests that the content of dreams is continuous with the dreamer's waking experiences. Given the unprecedented nature of the experiences during COVID-19, we studied the continuity hypothesis in the context of the pandemic. We implemented a deep-learning algorithm that can extract mentions of medical conditions from text and applied it to two datasets collected during the pandemic: 2,888 dream reports (dreaming life experiences), and 57M tweets mentioning the pandemic (waking life experiences). The health expressions common to both sets were typical COVID-19 symptoms (e.g., cough, fever, and anxiety), suggesting that dreams reflected people's real-world experiences. The health expressions that distinguished the two sets reflected differences in thought processes: expressions in waking life reflected a linear and logical thought process and, as such, described realistic symptoms or related disorders (e.g., nasal pain, SARS, H1N1); those in dreaming life reflected a thought process closer to the visual and emotional spheres and, as such, described either conditions unrelated to the virus (e.g., maggots, deformities, snakebites), or conditions of surreal nature (e.g., teeth falling out, body crumbling into sand). Our results confirm that dream reports represent an understudied yet valuable source of people's health experiences in the real world.
Submitted 2 February, 2022;
originally announced February 2022.
-
From Reddit to Wall Street: The role of committed minorities in financial collective action
Authors:
Lorenzo Lucchini,
Luca Maria Aiello,
Laura Alessandretti,
Gianmarco De Francisci Morales,
Michele Starnini,
Andrea Baronchelli
Abstract:
In January 2021, retail investors coordinated on Reddit to target short selling activity by hedge funds on GameStop shares, causing a surge in the share price and triggering significant losses for the funds involved. Such an effective collective action was unprecedented in finance, and its dynamics remain unclear. Here, we analyse Reddit and financial data and rationalise the events based on recent findings describing how a small fraction of committed individuals may trigger behavioural cascades. First, we operationalise the concept of individual commitment in financial discussions. Second, we show that the increase of commitment within Reddit predated the initial surge in price. Third, we reveal that initial committed users occupied a central position in the network of Reddit conversations. Finally, we show that the social identity of the broader Reddit community grew as the collective action unfolded. These findings shed light on financial collective action, as several observers anticipate it will grow in importance.
Submitted 13 September, 2021; v1 submitted 15 July, 2021;
originally announced July 2021.
-
Cartographic Design of Cultural Maps
Authors:
Edyta Paulina Bogucka,
Marios Constantinides,
Luca Maria Aiello,
Daniele Quercia,
Wonyoung So,
Melanie Bancilhon
Abstract:
Throughout history, maps have been used as a tool to explore cities. They visualize a city's urban fabric through its streets, buildings, and points of interest. Besides purely navigation purposes, street names also reflect a city's culture through its commemorative practices. Therefore, cultural maps that unveil socio-cultural characteristics encoded in street names could potentially raise citizens' historical awareness. But designing effective cultural maps is challenging, not only due to data scarcity but also due to the lack of effective approaches to engage citizens with data exploration. To address these challenges, we collected a dataset of 5,000 streets across the cities of Paris, Vienna, London, and New York, and built their cultural maps grounded on cartographic storytelling techniques. Through data exploration scenarios, we demonstrated how cultural maps engage users and allow them to discover distinct patterns in the ways these cities are gender-biased, celebrate various professions, and embrace foreign cultures.
Submitted 8 June, 2021;
originally announced June 2021.
-
Streetonomics: Quantifying Culture Using Street Names
Authors:
Melanie Bancilhon,
Marios Constantinides,
Edyta Paulina Bogucka,
Luca Maria Aiello,
Daniele Quercia
Abstract:
Quantifying a society's value system is important because it suggests what people deeply care about -- it reflects who they actually are and, more importantly, who they will like to be. This cultural quantification has been typically done by studying literary production. However, a society's value system might well be implicitly quantified based on the decisions that people took in the past and that were mediated by what they care about. It turns out that one class of these decisions is visible in ordinary settings: it is visible in street names. We studied the names of 4,932 honorific streets in the cities of Paris, Vienna, London and New York. We chose these four cities because they were important centers of cultural influence for the Western world in the 20th century. We found that street names greatly reflect the extent to which a society is gender biased, which professions are considered elite ones, and the extent to which a city is influenced by the rest of the world. This way of quantifying a society's value system promises to inform new methodologies in Digital Humanities; makes it possible for municipalities to reflect on their past to inform their future; and informs the design of everyday's educational tools that promote historical awareness in a playful way.
Submitted 18 June, 2021; v1 submitted 8 June, 2021;
originally announced June 2021.
-
Mapping the NFT revolution: market trends, trade networks and visual features
Authors:
Matthieu Nadini,
Laura Alessandretti,
Flavio Di Giacinto,
Mauro Martino,
Luca Maria Aiello,
Andrea Baronchelli
Abstract:
Non Fungible Tokens (NFTs) are digital assets that represent objects like art, collectible, and in-game items. They are traded online, often with cryptocurrency, and are generally encoded within smart contracts on a blockchain. Public attention towards NFTs has exploded in 2021, when their market has experienced record sales, but little is known about the overall structure and evolution of its market. Here, we analyse data concerning 6.1 million trades of 4.7 million NFTs between June 23, 2017 and April 27, 2021, obtained primarily from Ethereum and WAX blockchains. First, we characterize statistical properties of the market. Second, we build the network of interactions, show that traders typically specialize on NFTs associated with similar objects and form tight clusters with other traders that exchange the same kind of objects. Third, we cluster objects associated to NFTs according to their visual features and show that collections contain visually homogeneous objects. Finally, we investigate the predictability of NFT sales using simple machine learning algorithms and find that sale history and, secondarily, visual features are good predictors for price. We anticipate that these findings will stimulate further research on NFT production, adoption, and trading in different contexts.
Submitted 20 September, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
The Healthy States of America: Creating a Health Taxonomy with Social Media
Authors:
Sanja Scepanovic,
Luca Maria Aiello,
Ke Zhou,
Sagar Joglekar,
Daniele Quercia
Abstract:
Since the uptake of social media, researchers have mined online discussions to track the outbreak and evolution of specific diseases or chronic conditions such as influenza or depression. To broaden the set of diseases under study, we developed a Deep Learning tool for Natural Language Processing that extracts mentions of virtually any medical condition or disease from unstructured social media text. With that tool at hand, we processed Reddit and Twitter posts, analyzed the clusters of the two resulting co-occurrence networks of conditions, and discovered that they correspond to well-defined categories of medical conditions. This resulted in the creation of the first comprehensive taxonomy of medical conditions automatically derived from online discussions. We validated the structure of our taxonomy against the official International Statistical Classification of Diseases and Related Health Problems (ICD-11), finding matches of our clusters with 20 official categories, out of 22. Based on the mentions of our taxonomy's sub-categories on Reddit posts geo-referenced in the U.S., we were then able to compute disease-specific health scores. As opposed to counts of disease mentions or counts with no knowledge of our taxonomy's structure, we found that our disease-specific health scores are causally linked with the officially reported prevalence of 18 conditions.
Submitted 1 March, 2021;
originally announced March 2021.
-
HeartBees: Visualizing Crowd Affects
Authors:
Chao Ying Qin,
Marios Constantinides,
Luca Maria Aiello,
Daniele Quercia
Abstract:
Affective sharing within groups strengthens coordination and empathy, leads to better health outcomes, and increases productivity and performance. Existing tools for affective sharing face one main challenge: creating a representation of collective emotional states that is relatable and universally accessible. To overcome this challenge, we propose HeartBees, a bio-feedback system for visualizing collective emotional states, which maps a multi-dimensional emotion model into a metaphorical visualization of flocks of birds. Grounded on Affective Computing literature and physiological sensing, we mapped physiological indicators that could be obtained from wearable devices into a multi-dimensional emotion model, which, in turn, our HeartBees can make use of. We evaluated our nature-inspired interactive system with 353 online participants, whose responses showed good consensus in the way they subjectively perceived the visualizations. Last, we discuss practical applications of HeartBees.
Submitted 14 October, 2020;
originally announced October 2020.
-
How Epidemic Psychology Works on Twitter: Evolution of responses to the COVID-19 pandemic in the U.S
Authors:
Luca Maria Aiello,
Daniele Quercia,
Ke Zhou,
Marios Constantinides,
Sanja Šćepanović,
Sagar Joglekar
Abstract:
Disruptions resulting from an epidemic might often appear to amount to chaos but, in reality, can be understood in a systematic way through the lens of "epidemic psychology". According to Philip Strong, the founder of the sociological study of epidemic infectious diseases, not only is an epidemic biological; there is also the potential for three psycho-social epidemics: of fear, moralization, and action. This work empirically tests Strong's model at scale by studying the use of language of 122M tweets related to the COVID-19 pandemic posted in the U.S. during the whole year of 2020. On Twitter, we identified three distinct phases. Each of them is characterized by different regimes of the three psycho-social epidemics. In the refusal phase, users refused to accept reality despite the increasing number of deaths in other countries. In the anger phase (started after the announcement of the first death in the country), users' fear translated into anger about the looming feeling that things were about to change. Finally, in the acceptance phase, which began after the authorities imposed physical-distancing measures, users settled into a "new normal" for their daily activities. Overall, refusal of accepting reality gradually died off as the year went on, while acceptance increasingly took hold. During 2020, as cases surged in waves, so did anger, re-emerging cyclically at each wave. Our real-time operationalization of Strong's model is designed in a way that makes it possible to embed epidemic psychology into real-time models (e.g., epidemiological and mobility models).
Submitted 20 July, 2021; v1 submitted 26 July, 2020;
originally announced July 2020.
-
Ten Social Dimensions of Conversations and Relationships
Authors:
Minje Choi,
Luca Maria Aiello,
Krisztian Zsolt Varga,
Daniele Quercia
Abstract:
Decades of social science research identified ten fundamental dimensions that provide the conceptual building blocks to describe the nature of human relationships. Yet, it is not clear to what extent these concepts are expressed in everyday language and what role they have in shaping observable dynamics of social interactions. After annotating conversational text through crowdsourcing, we trained NLP tools to detect the presence of these types of interaction from conversations, and applied them to 160M messages written by geo-referenced Reddit users, 290k emails from the Enron corpus and 300k lines of dialogue from movie scripts. We show that social dimensions can be predicted purely from conversations with an AUC up to 0.98, and that the combination of the predicted dimensions suggests both the types of relationships people entertain (conflict vs. support) and the types of real-world communities (wealthy vs. deprived) they shape.
Submitted 27 January, 2020;
originally announced January 2020.
-
FaceLift: A transparent deep learning framework to beautify urban scenes
Authors:
Sagar Joglekar,
Daniele Quercia,
Miriam Redi,
Luca Maria Aiello,
Tobias Kauer,
Nishanth Sastry
Abstract:
In the area of computer vision, deep learning techniques have recently been used to predict whether urban scenes are likely to be considered beautiful: it turns out that these techniques are able to make accurate predictions. Yet they fall short when it comes to generating actionable insights for urban design. To support urban interventions, one needs to go beyond predicting beauty, and tackle the challenge of recreating beauty. Unfortunately, deep learning techniques have not been designed with that challenge in mind. Given their "black-box nature", these models cannot be directly used to explain why a particular urban scene is deemed to be beautiful. To partly fix that, we propose a deep learning framework called FaceLift, that is able to both beautify existing urban scenes (Google Street views) and explain which urban elements make those transformed scenes beautiful. To quantitatively evaluate our framework, we cannot resort to any existing metric (as the research problem at hand has never been tackled before) and need to formulate new ones. These new metrics should ideally capture the presence/absence of elements that make urban spaces great. Upon a review of the urban planning literature, we identify five main metrics: walkability, green spaces, openness, landmarks and visual complexity. We find that, across all the five metrics, the beautified scenes meet the expectations set by the literature on what great spaces tend to be made of. This result is further confirmed by a 20-participant expert survey in which FaceLift has been found to be effective in promoting citizen participation. All this suggests that, in the future, as our framework's components are further researched and become better and more sophisticated, it is not hard to imagine technologies that will be able to accurately and efficiently support architects and planners in the design of spaces we intuitively love.
Submitted 16 January, 2020;
originally announced January 2020.
-
Predicting Urban Innovation from the Workforce Mobility Network in US
Authors:
Moreno Bonaventura,
Luca Maria Aiello,
Daniele Quercia,
Vito Latora
Abstract:
While great emphasis has been placed on the role of social interactions as driver of innovation growth, very few empirical studies have explicitly investigated the impact of social network structures on the innovation performance of cities. Past research has mostly explored scaling laws of socio-economic outputs of cities as determined by, for example, the single predictor of population. Here, by drawing on a publicly available dataset of the startup ecosystem, we build the first Workforce Mobility Network among US metropolitan areas. We found that node centrality computed on this network accounts for most of the variability observed in cities' innovation performance and significantly outperforms other predictors such as population size or density, suggesting that policies and initiatives aiming at sustaining innovation processes might benefit from fostering professional networks alongside other economic or systemic incentives. As opposed to previous approaches powered by census data, our model can be updated in real-time upon open databases, opening up new opportunities both for researchers in a variety of disciplines to study urban economies in new ways, and for practitioners to design tools for monitoring such economies in real-time.
Submitted 1 November, 2019;
originally announced November 2019.
-
The Language of Dialogue Is Complex
Authors:
Alexander Robertson,
Luca Maria Aiello,
Daniele Quercia
Abstract:
Integrative Complexity (IC) is a psychometric that measures the ability of a person to recognize multiple perspectives and connect them, thus identifying paths for conflict resolution. IC has been linked to a wide variety of political, social and personal outcomes but evaluating it is a time-consuming process requiring skilled professionals to manually score texts, a fact which accounts for the limited exploration of IC at scale on social media. We combine natural language processing and machine learning to train an IC classification model that achieves state-of-the-art performance on unseen data and more closely adheres to the established structure of the IC coding process than previous automated approaches. When applied to the content of 400k+ comments from online fora about depression and knowledge exchange, our model was capable of replicating key findings of prior work, thus providing the first example of using IC tools for large-scale social media analytics.
Submitted 5 June, 2019;
originally announced June 2019.
-
Large-scale and high-resolution analysis of food purchases and health outcomes
Authors:
Luca Maria Aiello,
Rossano Schifanella,
Daniele Quercia,
Lucia Del Prete
Abstract:
To complement traditional dietary surveys, which are costly and of limited scale, researchers have resorted to digital data to infer the impact of eating habits on people's health. However, online studies are limited in resolution: they are carried out at regional level and do not capture precisely the composition of the food consumed. We study the association between food consumption (derived from the loyalty cards of the main grocery retailer in London) and health outcomes (derived from publicly-available medical prescription records). The scale and granularity of our analysis is unprecedented: we analyze 1.6B food item purchases and 1.1B medical prescriptions for the entire city of London over the course of one year. By studying food consumption down to the level of nutrients, we show that nutrient diversity and amount of calories are the strongest predictors of the prevalence of three diseases related to what is called the "metabolic syndrome": hypertension, high cholesterol, and diabetes. This syndrome is a cluster of symptoms generally associated with obesity, is common across the rich world, and affects one in four adults in the UK. Our linear regression models achieve an R2 of 0.6 when estimating the prevalence of diabetes in nearly 1000 census areas in London, and a classifier can identify (un)healthy areas with up to 91% accuracy. Interestingly, healthy areas are not necessarily well-off (income matters less than what one would expect) and have distinctive features: they tend to systematically eat less carbohydrates and sugar, diversify nutrients, and avoid large quantities. More generally, our study shows that analytics of digital records of grocery purchases can be used as a cheap and scalable tool for health surveillance and, upon these records, different stakeholders from governments to insurance companies to food companies could implement effective prevention strategies.
Submitted 30 April, 2019;
originally announced May 2019.
-
Coloring in the Links: Capturing Social Ties as They are Perceived
Authors:
Sebastian Deri,
Jeremie Rappaz,
Luca Maria Aiello,
Daniele Quercia
Abstract:
The richness that characterizes relationships is often absent when they are modeled using computational methods in network science. Typically, relationships are represented simply as links, perhaps with weights. The lack of finer granularity is due in part to the fact that, aside from linkage and strength, no fundamental or immediately obvious dimensions exist along which to categorize relationships. Here we propose a set of dimensions that capture major components of many relationships -- derived both from relevant academic literature and people's everyday descriptions of their relationships. We first review prominent findings in sociology and social psychology, highlighting dimensions that have been widely used to categorize social relationships. Next, we examine the validity of these dimensions empirically in two crowd-sourced experiments. Ultimately, we arrive at a set of ten major dimensions that can be used to categorize relationships: similarity, trust, romance, social support, identity, respect, knowledge exchange, power, fun, and conflict. These ten dimensions, while not dispositive, offer higher resolution than existing models. Indeed, we show that one can more accurately predict missing links in a social graph by using these dimensions than by using a state-of-the-art link embeddedness method. We also describe tinghy.org, an online platform we built to collect data about how social media users perceive their online relationships, allowing us to examine these dimensions at scale. Overall, by proposing a new way of modeling social graphs, our work aims to contribute both to theory in network science and practice in the design of social-networking applications.
Submitted 12 February, 2019;
originally announced February 2019.
-
Anticipating cryptocurrency prices using machine learning
Authors:
Laura Alessandretti,
Abeer ElBahrawy,
Luca Maria Aiello,
Andrea Baronchelli
Abstract:
Machine learning and AI-assisted trading have attracted growing interest for the past few years. Here, we use this approach to test the hypothesis that the inefficiency of the cryptocurrency market can be exploited to generate abnormal profits. We analyse daily data for 1,681 cryptocurrencies for the period between Nov. 2015 and Apr. 2018. We show that simple trading strategies assisted by state-of-the-art machine learning algorithms outperform standard benchmarks. Our results show that nontrivial, but ultimately simple, algorithmic mechanisms can help anticipate the short-term evolution of the cryptocurrency market.
Submitted 9 November, 2018; v1 submitted 22 May, 2018;
originally announced May 2018.
-
Hearts and Politics: Metrics for Tracking Biorhythm Changes during Brexit and Trump
Authors:
Luca Maria Aiello,
Daniele Quercia,
Eva Roitmann
Abstract:
Our internal experience of time reflects what is going on in the world around us. Our body's natural rhythms get disrupted by a variety of external factors, including exposure to collective events. We collect readings of steps, sleep, and heart rates from 11K users of health tracking devices in London and San Francisco. We introduce measures to quantify changes in not only volume of these three bio-signals (as previous research has done) but also synchronicity and periodicity, and we empirically assess how strong those variations are, compared to random expectation, during four major events: Christmas, New Year's Eve, Brexit, and the US presidential election of 2016 (Donald Trump's election). While Christmas and New Year's Eve are associated with short-term effects, Brexit and Trump's election are associated with longer-term disruptions. Our results promise to inform the design of new ways of monitoring population health at scale.
Submitted 18 April, 2018;
originally announced April 2018.
-
The New Urban Success: How Culture Pays
Authors:
Desislava Hristova,
Luca Maria Aiello,
Daniele Quercia
Abstract:
Urban economists have put forward the idea that cities that are culturally interesting tend to attract "the creative class" and, as a result, end up being economically successful. Yet it is still unclear how economic and cultural dynamics mutually influence each other. By contrast, that has been extensively studied in the case of individuals. Over decades, the French sociologist Pierre Bourdieu showed that people's success and their positions in society mainly depend on how much they can spend (their economic capital) and what their interests are (their cultural capital). For the first time, we adapt Bourdieu's framework to the city context. We operationalize a neighborhood's cultural capital in terms of the cultural interests that pictures geo-referenced in the neighborhood tend to express. This is made possible by the mining of what users of the photo-sharing site of Flickr have posted in the cities of London and New York over 5 years. In so doing, we are able to show that economic capital alone does not explain urban development. The combination of cultural capital and economic capital, instead, is more indicative of neighborhood growth in terms of house prices and improvements of socio-economic conditions. Culture pays, but only up to a point as it comes with one of the most vexing urban challenges: that of gentrification.
Submitted 10 April, 2018;
originally announced April 2018.
-
Beautiful and damned. Combined effect of content quality and social ties on user engagement
Authors:
Luca M. Aiello,
Rossano Schifanella,
Miriam Redi,
Stacey Svetlichnaya,
Frank Liu,
Simon Osindero
Abstract:
User participation in online communities is driven by the intertwinement of the social network structure with the crowd-generated content that flows along its links. These aspects are rarely explored jointly and at scale. By looking at how users generate and access pictures of varying beauty on Flickr, we investigate how the production of quality impacts the dynamics of online social systems. We develop a deep learning computer vision model to score images according to their aesthetic value and we validate its output through crowdsourcing. By applying it to over 15B Flickr photos, we study for the first time how image beauty is distributed over a large-scale social system. Beautiful images are evenly distributed in the network, although only a small core of people get social recognition for them. To study the impact of exposure to quality on user engagement, we set up matching experiments aimed at detecting causality from observational data. Exposure to beauty is double-edged: following people who produce high-quality content increases one's probability of uploading better photos; however, an excessive imbalance between the quality generated by a user and the user's neighbors leads to a decline in engagement. Our analysis has practical implications for improving link recommender systems.
Submitted 1 November, 2017;
originally announced November 2017.
-
Evolution of Ego-networks in Social Media with Link Recommendations
Authors:
Luca Maria Aiello,
Nicola Barbieri
Abstract:
Ego-networks are fundamental structures in social graphs, yet the process of their evolution is still widely unexplored. In an online context, a key question is how link recommender systems may skew the growth of these networks, possibly restraining diversity. To shed light on this matter, we analyze the complete temporal evolution of 170M ego-networks extracted from Flickr and Tumblr, comparing links that are created spontaneously with those that have been algorithmically recommended. We find that the evolution of ego-networks is bursty, community-driven, and characterized by subsequent phases of explosive diameter increase, slight shrinking, and stabilization. Recommendations favor popular and well-connected nodes, limiting the diameter expansion. With a matching experiment aimed at detecting causal relationships from observational data, we find that the bias introduced by the recommendations fosters global diversity in the process of neighbor selection. Last, with two link prediction experiments, we show how insights from our analysis can be used to improve the effectiveness of social recommender systems.
Submitted 5 February, 2017;
originally announced February 2017.
-
iPhone's Digital Marketplace: Characterizing the Big Spenders
Authors:
Farshad Kooti,
Mihajlo Grbovic,
Luca Maria Aiello,
Eric Bax,
Kristina Lerman
Abstract:
With mobile shopping surging in popularity, people are spending ever more money on digital purchases through their mobile devices and phones. However, few large-scale studies of mobile shopping exist. In this paper we analyze a large data set consisting of more than 776M digital purchases made on Apple mobile devices that include songs, apps, and in-app purchases. We find that 61% of all the spending is on in-app purchases and that the top 1% of users are responsible for 59% of all the spending. These big spenders are more likely to be male and older, and less likely to be from the US. We study how they adopt and abandon individual apps, and find that, after an initial phase of increased daily spending, users gradually lose interest: the delay between their purchases increases and the spending decreases with a sharp drop toward the end. Finally, we model the in-app purchasing behavior in multiple steps: 1) we model the time between purchases; 2) we train a classifier to predict whether the user will make a purchase from a new app or continue purchasing from the existing app; and 3) based on the outcome of the previous step, we attempt to predict the exact app, new or existing, from which the next purchase will come. The results yield new insights into spending habits in the mobile digital marketplace.
Submitted 25 January, 2017;
originally announced January 2017.
-
Pornography consumption in Social Media
Authors:
Mauro Coletto,
Luca Maria Aiello,
Claudio Lucchese,
Fabrizio Silvestri
Abstract:
The structure of a social network is fundamentally related to the interests of its members. People assort spontaneously based on the topics that are relevant to them, forming social groups that revolve around different subjects. Online social media are also favorable ecosystems for the formation of topical communities centered on matters that are not commonly taken up by the general public because of the embarrassment, discomfort, or shock they may cause. Those are communities that depict or discuss what are usually referred to as deviant behaviors, conducts that are commonly considered inappropriate because they are somehow violative of society's norms or moral standards that are shared among the majority of the members of society. Pornography consumption, drug use, excessive drinking, illegal hunting, eating disorders, or any self-harming or addictive practice are all examples of deviant behaviors.
Submitted 20 January, 2017; v1 submitted 24 December, 2016;
originally announced December 2016.
-
On the Behaviour of Deviant Communities in Online Social Networks
Authors:
Mauro Coletto,
Luca Maria Aiello,
Claudio Lucchese,
Fabrizio Silvestri
Abstract:
On-line social networks are complex ensembles of inter-linked communities that interact on different topics. Some communities are characterized by what are usually referred to as deviant behaviors, conducts that are commonly considered inappropriate with respect to the society's norms or moral standards. Eating disorders, drug use, and adult content consumption are just a few examples. We refer to such communities as deviant networks. It is commonly believed that such deviant networks are niche, isolated social groups, whose activity is well separated from the mainstream social-media life. According to this assumption, research studies have mostly considered them in isolation. In this work we focused on adult content consumption networks, which are present in many on-line social media and in the Web in general. We found that a few small and densely connected communities are responsible for most of the content production. Differently from previous work, we studied how such communities interact with the whole social network. We found that the produced content flows to the rest of the network mostly directly or through bridge-communities, reaching at least 450 times more users. We also show that a large fraction of the users can be inadvertently exposed to such content through indirect content resharing. We also discuss a demographic analysis of the producer and consumer networks. Finally, we show that it is easily possible to identify a few core users to radically uproot the diffusion process. We aim at setting the basis to study deviant communities in context.
Submitted 26 October, 2016;
originally announced October 2016.
-
The Emotional and Chromatic Layers of Urban Smells
Authors:
Daniele Quercia,
Luca Maria Aiello,
Rossano Schifanella
Abstract:
People are able to detect up to 1 trillion odors. Yet, city planning is concerned only with a few bad odors, mainly because odors are currently captured only through complaints made by urban dwellers. To capture both good and bad odors, we resort to a methodology that has been recently proposed and relies on tagging information of geo-referenced pictures. In doing so for the cities of London and Barcelona, this work makes three new contributions. We study 1) how the urban smellscape changes in time and space; 2) which emotions people share at places with specific smells; and 3) what is the color of a smell, if it exists. Without social media data, insights about those three aspects have been difficult to produce in the past, further delaying the creation of urban restorative experiences.
Submitted 21 May, 2016;
originally announced May 2016.
-
Chatty Maps: Constructing sound maps of urban areas from social media data
Authors:
Luca Maria Aiello,
Rossano Schifanella,
Daniele Quercia,
Francesco Aletta
Abstract:
Urban sound has a huge influence over how we perceive places. Yet, city planning is concerned mainly with noise, simply because annoying sounds come to the attention of city officials in the form of complaints, while general urban sounds do not come to the attention as they cannot be easily captured at city scale. To capture both unpleasant and pleasant sounds, we applied a new methodology that relies on tagging information of geo-referenced pictures to the cities of London and Barcelona. To begin with, we compiled the first urban sound dictionary and compared it to the one produced by collating insights from the literature: ours was experimentally more valid (if correlated with official noise pollution levels) and offered a wider geographic coverage. From picture tags, we then studied the relationship between soundscapes and emotions. We learned that streets with music sounds were associated with strong emotions of joy or sadness, while those with human sounds were associated with joy or surprise. Finally, we studied the relationship between soundscapes and people's perceptions and, in so doing, we were able to map which areas are chaotic, monotonous, calm, and exciting. Those insights promise to inform the creation of restorative experiences in our increasingly urbanized world.
Submitted 24 March, 2016;
originally announced March 2016.
-
Portrait of an Online Shopper: Understanding and Predicting Consumer Behavior
Authors:
Farshad Kooti,
Kristina Lerman,
Luca Maria Aiello,
Mihajlo Grbovic,
Nemanja Djuric,
Vladan Radosavljevic
Abstract:
Consumer spending accounts for a large fraction of the US economic activity. Increasingly, consumer activity is moving to the web, where digital traces of shopping and purchases provide valuable data about consumer behavior. We analyze these data extracted from emails and combine them with demographic information to characterize, model, and predict consumer behavior. Breaking down purchasing by age and gender, we find that the amount of money spent on online purchases grows sharply with age, peaking in the late 30s. Men are more frequent online purchasers and spend more money when compared to women. Linking online shopping to income, we find that shoppers from more affluent areas purchase more expensive items and buy them more frequently, resulting in significantly more money spent on online purchases. We also look at dynamics of purchasing behavior and observe daily and weekly cycles in purchasing behavior, similarly to other online activities.
More specifically, we observe temporal patterns in purchasing behavior suggesting shoppers have finite budgets: the more expensive an item, the longer the shopper waits since the last purchase to buy it. We also observe that shoppers who email each other purchase more similar items than socially unconnected shoppers, and this effect is particularly evident among women. Finally, we build a model to predict when shoppers will make a purchase and how much they will spend on it. We find that temporal features improve prediction accuracy over competitive baselines. A better understanding of consumer behavior can help improve marketing efforts and make online shopping more pleasant and efficient.
Submitted 15 December, 2015;
originally announced December 2015.
-
Smelly Maps: The Digital Life of Urban Smellscapes
Authors:
Daniele Quercia,
Rossano Schifanella,
Luca Maria Aiello,
Kate McLean
Abstract:
Smell has a huge influence over how we perceive places. Despite its importance, smell has been crucially overlooked by urban planners and scientists alike, not least because it is difficult to record and analyze at scale. One of the authors of this paper has ventured out in the urban world and conducted smellwalks in a variety of cities: participants were exposed to a range of different smellscapes and asked to record their experiences. As a result, smell-related words have been collected and classified, creating the first dictionary for urban smell. Here we explore the possibility of using social media data to reliably map the smells of entire cities. To this end, for both Barcelona and London, we collect geo-referenced picture tags from Flickr and Instagram, and geo-referenced tweets from Twitter. We match those tags and tweets with the words in the smell dictionary. We find that smell-related words are best classified in ten categories. We also find that specific categories (e.g., industry, transport, cleaning) correlate with governmental air quality indicators, adding validity to our study.
Submitted 26 May, 2015;
originally announced May 2015.
-
Local Ranking Problem on the BrowseGraph
Authors:
Michele Trevisiol,
Luca Maria Aiello,
Paolo Boldi,
Roi Blanco
Abstract:
The "Local Ranking Problem" (LRP) is related to the computation of a centrality-like rank on a local graph, where the scores of the nodes could significantly differ from the ones computed on the global graph. Previous work has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a graph where nodes are webpages and edges are browsing transitions. Recently, this graph has received more and more attention in many different tasks such as ranking, prediction and recommendation. However, a web-server has only the browsing traffic performed on its pages (local BrowseGraph) and, as a consequence, the local computation can lead to estimation errors, which hinders the increasing number of applications in the state of the art. Also, although the divergence between the local and global ranks has been measured, the possibility of estimating such divergence using only local knowledge has been mainly overlooked. These aspects are of great interest for online service providers who want to: (i) gauge their ability to correctly assess the importance of their resources only based on their local knowledge, and (ii) take into account real user browsing fluxes that better capture the actual user interest than the static hyperlink network. We study the LRP problem on a BrowseGraph from a large news provider, considering as subgraphs the aggregations of browsing traces of users coming from different domains. We show that the distance between rankings can be accurately predicted based only on structural information of the local graph, being able to achieve an average rank correlation as high as 0.8.
Submitted 23 May, 2015;
originally announced May 2015.
-
Evolution of Conversations in the Age of Email Overload
Authors:
Farshad Kooti,
Luca Maria Aiello,
Mihajlo Grbovic,
Kristina Lerman,
Amin Mantrach
Abstract:
Email is a ubiquitous communications tool in the workplace and plays an important role in social interactions. Previous studies of email were largely based on surveys and limited to relatively small populations of email users within organizations. In this paper, we report results of a large-scale study of more than 2 million users exchanging 16 billion emails over several months. We quantitatively characterize the replying behavior in conversations within pairs of users. In particular, we study the time it takes the user to reply to a received message and the length of the reply sent. We consider a variety of factors that affect the reply time and length, such as the stage of the conversation, user demographics, and use of portable devices. In addition, we study how increasing load affects emailing behavior. We find that as users receive more email messages in a day, they reply to a smaller fraction of them, using shorter replies. However, their responsiveness remains intact, and they may even reply to emails faster. Finally, we predict the time to reply, length of reply, and whether the reply ends a conversation. We demonstrate considerable improvement over the baseline in all three prediction tasks, showing the significant role that the factors we uncover play in determining replying behavior. We rank these factors based on their predictive power. Our findings have important implications for understanding human behavior and designing better email management applications for tasks like ranking unread emails.
Submitted 2 April, 2015;
originally announced April 2015.
-
The Digital Life of Walkable Streets
Authors:
Daniele Quercia,
Luca Maria Aiello,
Rossano Schifanella,
Adam Davies
Abstract:
Walkability has many health, environmental, and economic benefits. That is why web and mobile services have been offering ways of computing walkability scores of individual street segments. Those scores are generally computed from survey data and manual counting (of even trees). However, that is costly, owing to the high time, effort, and financial costs. To partly automate the computation of those scores, we explore the possibility of using the social media data of Flickr and Foursquare to automatically identify safe and walkable streets. We find that unsafe streets tend to be photographed during the day, while walkable streets are tagged with walkability-related keywords. These results open up practical opportunities (for, e.g., room booking services, urban route recommenders, and real-estate sites) and have theoretical implications for researchers who might resort to the use of social media data to tackle previously unanswered questions in the area of walkability.
Submitted 10 March, 2015;
originally announced March 2015.
-
People are Strange when you're a Stranger: Impact and Influence of Bots on Social Networks
Authors:
Luca Maria Aiello,
Martina Deplano,
Rossano Schifanella,
Giancarlo Ruffo
Abstract:
Bots are, for many Web and social media users, the source of many dangerous attacks or the carrier of unwanted messages, such as spam. Nevertheless, crawlers and software agents are a precious tool for analysts, and they are continuously executed to collect data or to test distributed applications. However, no one knows the real potential of a bot whose purpose is to control a community, to manipulate consensus, or to influence user behavior. It is commonly believed that the better an agent simulates human behavior in a social network, the more it can succeed in generating an impact in that community. We help shed light on this issue through an online social experiment aimed at studying to what extent a bot with no trust, no profile, and no aim to reproduce human behavior can become popular and influential on a social media platform. Results show that a basic social probing activity can be used to acquire social relevance on the network and that the so-acquired popularity can be effectively leveraged to drive users in their social connectivity choices. We also register that our bot activity unveiled hidden social polarization patterns in the community and triggered an emotional response of individuals that brings to light subtle privacy hazards perceived by the user base.
Submitted 30 July, 2014;
originally announced July 2014.
-
Reading the Source Code of Social Ties
Authors:
Luca Maria Aiello,
Rossano Schifanella,
Bogdan State
Abstract:
Though online social network research has exploded during the past years, not much thought has been given to the exploration of the nature of social links. Online interactions have been interpreted as indicative of one social process or another (e.g., status exchange or trust), often with little systematic justification regarding the relation between observed data and theoretical concept. Our research aims to bridge this gap in computational social science by proposing an unsupervised, parameter-free method to discover, with high accuracy, the fundamental domains of interaction occurring in social networks. By applying this method on two online datasets different by scope and type of interaction (aNobii and Flickr) we observe the spontaneous emergence of three domains of interaction representing the exchange of status, knowledge and social support. By finding significant relations between the domains of interaction and classic social network analysis issues (e.g., tie strength, dyadic interaction over time) we show how the network of interactions induced by the extracted domains can be used as a starting point for more nuanced analysis of online social data that may one day incorporate the normative grammar of social interaction. Our method finds applications in online social media services ranging from recommendation to visual link summarization.
Submitted 21 July, 2014;
originally announced July 2014.
-
The Shortest Path to Happiness: Recommending Beautiful, Quiet, and Happy Routes in the City
Authors:
Daniele Quercia,
Rossano Schifanella,
Luca Maria Aiello
Abstract:
When providing directions to a place, web and mobile mapping services are all able to suggest the shortest route. The goal of this work is to automatically suggest routes that are not only short but also emotionally pleasant. To quantify the extent to which urban locations are pleasant, we use data from a crowd-sourcing platform that shows two street scenes in London (out of hundreds), and a user votes on which one looks more beautiful, quiet, and happy. We consider votes from more than 3.3K individuals and translate them into quantitative measures of location perceptions. We arrange those locations into a graph upon which we learn pleasant routes. Based on a quantitative validation, we find that, compared to the shortest routes, the recommended ones add just a few extra walking minutes and are indeed perceived to be more beautiful, quiet, and happy. To test the generality of our approach, we consider Flickr metadata of more than 3.7M pictures in London and 1.3M in Boston, compute proxies for the crowdsourced beauty dimension (the one for which we have collected the most votes), and evaluate those proxies with 30 participants in London and 54 in Boston. These participants have not only rated our recommendations but have also carefully motivated their choices, providing insights for future work.
Submitted 3 July, 2014;
originally announced July 2014.
-
Distinguishing Topical and Social Groups Based on Common Identity and Bond Theory
Authors:
Przemyslaw A. Grabowicz,
Luca Maria Aiello,
Víctor M. Eguíluz,
Alejandro Jaimes
Abstract:
Social groups play a crucial role in social media platforms because they form the basis for user participation and engagement. Groups are created explicitly by members of the community, but also form organically as members interact. Due to their importance, they have been studied widely (e.g., community detection, evolution, activity, etc.). One of the key questions for understanding how such groups evolve is whether there are different types of groups and how they differ. In Sociology, theories have been proposed to help explain how such groups form. In particular, the common identity and common bond theory states that people join groups based on identity (i.e., interest in the topics discussed) or bond attachment (i.e., social relationships). The theory has been applied qualitatively to small groups to classify them as either topical or social. We use the identity and bond theory to define a set of features to classify groups into those two categories. Using a dataset from Flickr, we extract user-defined groups and automatically-detected groups, obtained from a community detection algorithm. We discuss the process of manual labeling of groups into social or topical and present results of predicting the group label based on the defined features. We directly validate the predictions of the theory showing that the metrics are able to forecast the group type with high accuracy. In addition, we present a comparison between declared and detected groups along topicality and sociality dimensions.
Submitted 9 September, 2013;
originally announced September 2013.
-
Fast filtering and animation of large dynamic networks
Authors:
Przemyslaw A. Grabowicz,
Luca Maria Aiello,
Filippo Menczer
Abstract:
Detecting and visualizing the most relevant changes in an evolving network is an open challenge in several domains. We present a fast algorithm that filters subsets of the strongest nodes and edges representing an evolving weighted graph and visualizes it by either creating a movie or streaming it to an interactive network visualization tool. The algorithm is an approximation of an exponential sliding time-window that scales linearly with the number of interactions. We compare the algorithm against rectangular and exponential sliding time-window methods. Our network filtering algorithm: i) captures persistent trends in the structure of dynamic weighted networks, ii) smoothens transitions between the snapshots of a dynamic network, and iii) uses limited memory and processor time. The algorithm is publicly available as open-source software.
Submitted 4 November, 2014; v1 submitted 1 August, 2013;
originally announced August 2013.
-
Tagging with DHARMA, a DHT-based Approach for Resource Mapping through Approximation
Authors:
Luca Maria Aiello,
Marco Milanesio,
Giancarlo Ruffo,
Rossano Schifanella
Abstract:
We introduce collaborative tagging and faceted search on structured P2P systems. Since a trivial and brute force mapping of an entire folksonomy over a DHT-based system may reduce scalability, we propose an approximated graph maintenance approach. Evaluations on real data coming from Last.fm prove that such strategies reduce vocabulary noise (i.e., representation's overfitting phenomena) and hotspot issues.
Submitted 19 January, 2011;
originally announced January 2011.