-
Problematic Advertising and its Disparate Exposure on Facebook
Authors:
Muhammad Ali,
Angelica Goetzen,
Alan Mislove,
Elissa M. Redmiles,
Piotr Sapiezynski
Abstract:
Targeted advertising remains an important part of the free web browsing experience, where advertisers' targeting and personalization algorithms together find the most relevant audience for millions of ads every day. However, given the wide use of advertising, this also enables using ads as a vehicle for problematic content, such as scams or clickbait. Recent work that explores people's sentiments toward online ads, and the impacts of these ads on people's online experiences, has found evidence that online ads can indeed be problematic. Further, there is the potential for personalization to aid the delivery of such ads, even when the advertiser targets with low specificity. In this paper, we study Facebook -- one of the internet's largest ad platforms -- and investigate key gaps in our understanding of problematic online advertising: (a) What categories of ads do people find problematic? (b) Are there disparities in the distribution of problematic ads to viewers? and if so, (c) Who is responsible -- advertisers or advertising platforms? To answer these questions, we empirically measure a diverse sample of user experiences with Facebook ads via a 3-month longitudinal panel. We categorize over 32,000 ads collected from this panel (n=132) and survey participants' sentiments toward their own ads to identify four categories of problematic ads. Statistically modeling the distribution of problematic ads across demographics, we find that older people and minority groups are especially likely to be shown such ads. Further, given that 22% of problematic ads had no specific targeting from advertisers, we infer that ad delivery algorithms (advertising platforms themselves) played a significant role in the biased distribution of these ads.
Submitted 9 June, 2023;
originally announced June 2023.
-
Evolution of Digital Advertising Strategies during the 2020 US Presidential Primary
Authors:
NaLette Brodnax,
Piotr Sapiezynski
Abstract:
Political advertising on digital platforms has grown dramatically in recent years as campaigns embrace new ways of targeting supporters and potential voters. Previous scholarship shows that digital advertising has both positive effects on democratic politics through increased voter knowledge and participation, and negative effects through user manipulation, opinion echo-chambers, and diminished privacy. However, research on election campaign strategies has focused primarily on traditional media, such as television. Here, we examine how political campaign dynamics have evolved in response to the growth of digital media by analyzing the advertising strategies of US presidential election campaigns during the 2020 primary cycle. To identify geographic and temporal trends, we employ regression analyses of campaign spending across nearly 600,000 advertisements published on Facebook. We show that campaigns heavily target voters in candidates' home states during the "invisible primary" stage before shifting to states with early primaries.
Submitted 10 December, 2020;
originally announced December 2020.
-
The Fallibility of Contact-Tracing Apps
Authors:
Piotr Sapiezynski,
Johanna Pruessing,
Vedran Sekara
Abstract:
Since the onset of COVID-19's global spread we have been following the debate around contact tracing apps -- the tech-enabled response to the pandemic. As corporations, academics, governments, and civil society discuss the right way to implement these apps, we noticed recurring implicit assumptions. The proposed solutions are designed for a world where Internet access and smartphone ownership are a given, people are willing and able to install these apps, and those who receive notifications about potential exposure to the virus have access to testing and can isolate safely. In this work we challenge these assumptions. We not only show that there are not enough smartphones worldwide to reach required adoption thresholds but also highlight a broad lack of internet access, which affects certain groups more: the elderly, those with lower incomes, and those with limited ability to socially distance. Unfortunately, these are also the groups that are at the highest risk from COVID-19. We also report that the contact tracing apps that are already deployed on an opt-in basis show disappointing adoption levels. We warn about the potential consequences of over-extending the existing state and corporate surveillance powers. Finally, we describe a multitude of scenarios where contact tracing apps will not help regardless of access or policy. In this work we call for a comprehensive and equitable policy response that prioritizes the needs of the most vulnerable, protects human rights, and considers long term impact instead of focusing on technology-first fixes.
Submitted 27 May, 2020; v1 submitted 22 May, 2020;
originally announced May 2020.
-
Algorithms that "Don't See Color": Comparing Biases in Lookalike and Special Ad Audiences
Authors:
Piotr Sapiezynski,
Avijit Ghosh,
Levi Kaplan,
Aaron Rieke,
Alan Mislove
Abstract:
Researchers and journalists have repeatedly shown that algorithms commonly used in domains such as credit, employment, healthcare, or criminal justice can have discriminatory effects. Some organizations have tried to mitigate these effects by simply removing sensitive features from an algorithm's inputs. In this paper, we explore the limits of this approach using a unique opportunity. In 2019, Facebook agreed to settle a lawsuit by removing certain sensitive features from inputs of an algorithm that identifies users similar to those provided by an advertiser for ad targeting, making both the modified and unmodified versions of the algorithm available to advertisers. We develop methodologies to measure biases along the lines of gender, age, and race in the audiences created by this modified algorithm, relative to the unmodified one. Our results provide experimental proof that merely removing demographic features from a real-world algorithmic system's inputs can fail to prevent biased outputs. As a result, organizations using algorithms to help mediate access to important life opportunities should consider other approaches to mitigating discriminatory effects.
Submitted 31 May, 2022; v1 submitted 16 December, 2019;
originally announced December 2019.
-
Ad Delivery Algorithms: The Hidden Arbiters of Political Messaging
Authors:
Muhammad Ali,
Piotr Sapiezynski,
Aleksandra Korolova,
Alan Mislove,
Aaron Rieke
Abstract:
Political campaigns are increasingly turning to digital advertising to reach voters. These platforms empower advertisers to target messages to platform users with great precision, including through inferences about those users' political affiliations. However, prior work has shown that platforms' ad delivery algorithms can selectively deliver ads within these target audiences in ways that can lead to demographic skews along race and gender lines, often without an advertiser's knowledge.
In this study, we investigate the impact of Facebook's ad delivery algorithms on political ads. We run a series of political ads on Facebook and measure how Facebook delivers those ads to different groups, depending on an ad's content (e.g., the political viewpoint featured) and targeting criteria. We find that Facebook's ad delivery algorithms effectively differentiate the price of reaching a user based on their inferred political alignment with the advertised content, inhibiting political campaigns' ability to reach voters with diverse political views. This effect is most acute when advertisers use small budgets, as Facebook's delivery algorithm tends to preferentially deliver to the users who are, according to Facebook's estimation, most relevant.
Our findings point to advertising platforms' potential role in political polarization and creating informational filter bubbles. Furthermore, some large ad platforms have recently changed their policies to restrict the targeting tools they offer to political campaigns; our findings show that such reforms will be insufficient if the goal is to ensure that political ads are shown to users of diverse political views. Our findings add urgency to calls for more meaningful public transparency into the political advertising ecosystem.
Submitted 17 December, 2019; v1 submitted 9 December, 2019;
originally announced December 2019.
-
Discrimination through optimization: How Facebook's ad delivery can lead to skewed outcomes
Authors:
Muhammad Ali,
Piotr Sapiezynski,
Miranda Bogen,
Aleksandra Korolova,
Alan Mislove,
Aaron Rieke
Abstract:
The enormous financial success of online advertising platforms is partially due to the precise targeting features they offer. Although researchers and journalists have found many ways that advertisers can target---or exclude---particular groups of users seeing their ads, comparatively little attention has been paid to the implications of the platform's ad delivery process, comprised of the platform's choices about which users see which ads.
It has been hypothesized that this process can "skew" ad delivery in ways that the advertisers do not intend, making some users less likely than others to see particular ads based on their demographic characteristics. In this paper, we demonstrate that such skewed delivery occurs on Facebook, due to market and financial optimization effects as well as the platform's own predictions about the "relevance" of ads to different groups of users. We find that the advertiser's budget and the content of the ad each contribute significantly to the skew of Facebook's ad delivery. Critically, we observe significant skew in delivery along gender and racial lines for "real" ads for employment and housing opportunities despite neutral targeting parameters.
Our results demonstrate previously unknown mechanisms that can lead to potentially discriminatory ad delivery, even when advertisers set their targeting parameters to be highly inclusive. This underscores the need for policymakers and platforms to carefully consider the role of the ad delivery optimization run by ad platforms themselves---and not just the targeting choices of advertisers---in preventing discrimination in digital advertising.
Submitted 12 September, 2019; v1 submitted 3 April, 2019;
originally announced April 2019.
-
Quantifying the Impact of User Attention on Fair Group Representation in Ranked Lists
Authors:
Piotr Sapiezynski,
Wesley Zeng,
Ronald E. Robertson,
Alan Mislove,
Christo Wilson
Abstract:
In this work, we introduce a novel metric for auditing group fairness in ranked lists. Our approach offers two benefits compared to the state of the art. First, we offer a blueprint for modeling of user attention. Rather than assuming a logarithmic loss in importance as a function of the rank, we can account for varying user behaviors through parametrization. For example, we expect a user to see more items during a viewing of a social media feed than when they inspect the results list of a single web search query. Second, we allow non-binary protected attributes to enable investigating inherently continuous attributes (e.g., political alignment on the liberal-to-conservative spectrum) as well as to facilitate measurements across aggregated sets of search results, rather than separately for each result list. By combining these two elements into our metric, we are able to better address the human factors inherent in this problem. We measure the whole sociotechnical system, consisting of a ranking algorithm and individuals using it, instead of exclusively focusing on the ranking algorithm. Finally, we use our metric to perform three simulated fairness audits. We show that determining fairness of a ranked output necessitates knowledge (or a model) of the end-users of the particular service. Depending on their attention distribution function, a fixed ranking of results can appear biased both in favor and against a protected group.
Submitted 13 May, 2019; v1 submitted 29 January, 2019;
originally announced January 2019.
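The idea in the abstract above, that the same fixed ranking can look biased for or against a group depending on how user attention decays with rank, can be illustrated with a minimal sketch. This is not the metric from the paper: the geometric attention model, the function names, and the example ranking are all assumptions chosen for illustration. Each item carries a protected-attribute score in [0, 1], so non-binary attributes fit the same code.

```python
def geometric_attention(n_items, stop_prob):
    """Normalized attention per rank, assuming (hypothetically) that the
    user stops reading after each item with probability stop_prob."""
    weights = [(1.0 - stop_prob) ** rank for rank in range(n_items)]
    total = sum(weights)
    return [w / total for w in weights]


def attention_weighted_score(scores, stop_prob):
    """Attention-weighted mean of per-item protected-attribute scores."""
    weights = geometric_attention(len(scores), stop_prob)
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical ranking: 1.0 marks an item from the protected group.
ranking = [1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
population_mean = sum(ranking) / len(ranking)

# An impatient user (web search) versus a patient user (feed scrolling).
impatient = attention_weighted_score(ranking, stop_prob=0.7)
patient = attention_weighted_score(ranking, stop_prob=0.05)

print(f"share of list: {population_mean:.3f}")
print(f"impatient user: {impatient:.3f}")
print(f"patient user: {patient:.3f}")
```

Under these assumptions the impatient user sees the protected group over-represented relative to its share of the list, while the patient user sees it under-represented, although the ranking itself never changes.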
-
Offline Behaviors of Online Friends
Authors:
Piotr Sapiezynski,
Arkadiusz Stopczynski,
David Kofoed Wind,
Jure Leskovec,
Sune Lehmann
Abstract:
In this work we analyze traces of mobility and co-location among a group of nearly 1000 closely interacting individuals. We attempt to reconstruct the Facebook friendship graph, Facebook interaction network, as well as call and SMS networks from longitudinal records of person-to-person offline proximity. We find subtle, yet observable behavioral differences between pairs of people who communicate using each of the different channels and we show that the signal of friendship is strong enough to stand out from the noise of random and schedule-driven offline interactions between familiar strangers. Our study also provides an overview of methods for link inference based on offline behavior and proposes new features to improve the performance of the prediction task.
Submitted 8 November, 2018; v1 submitted 7 November, 2018;
originally announced November 2018.
-
Detrimental Network Effects in Privacy: A Graph-theoretic Model for Node-based Intrusions
Authors:
Florimond Houssiau,
Piotr Sapiezynski,
Laura Radaelli,
Erez Shmueli,
Yves-Alexandre de Montjoye
Abstract:
Despite proportionality being one of the tenets of data protection laws, we currently lack a robust analytical framework to evaluate the reach of modern data collections and the network effects at play. We here propose a graph-theoretic model and notions of node- and edge-observability to quantify the reach of networked data collections. We first prove closed-form expressions for our metrics and quantify the impact of the graph's structure on observability. Second, using our model, we quantify how (1) from 270,000 compromised accounts, Cambridge Analytica collected 68.0M Facebook profiles; (2) from surveilling 0.01% of the nodes in a mobile phone network, a law-enforcement agency could observe 18.6% of all communications; and (3) an app installed on 1% of smartphones could monitor the location of half of the London population through close proximity tracing. Better quantifying the reach of data collection mechanisms is essential to evaluate their proportionality.
Submitted 15 March, 2023; v1 submitted 23 March, 2018;
originally announced March 2018.
-
Academic Performance and Behavioral Patterns
Authors:
Valentin Kassarnig,
Enys Mones,
Andreas Bjerre-Nielsen,
Piotr Sapiezynski,
David Dreyer Lassen,
Sune Lehmann
Abstract:
Identifying the factors that influence academic performance is an essential part of educational research. Previous studies have documented the importance of personality traits, class attendance, and social network structure. Because most of these analyses were based on a single behavioral aspect and/or small sample sizes, there is currently no quantification of the interplay of these factors. Here, we study the academic performance among a cohort of 538 undergraduate students forming a single, densely connected social network. Our work is based on data collected using smartphones, which the students used as their primary phones for two years. The availability of multi-channel data from a single population allows us to directly compare the explanatory power of individual and social characteristics. We find that the most informative indicators of performance are based on social ties and that network indicators result in better model performance than individual characteristics (including both personality and class attendance). We confirm earlier findings that class attendance is the most important predictor among individual characteristics. Finally, our results suggest the presence of strong homophily and/or peer effects among university students.
Submitted 9 April, 2018; v1 submitted 21 June, 2017;
originally announced June 2017.
-
The Role of Gender in Social Network Organization
Authors:
Ioanna Psylla,
Piotr Sapiezynski,
Enys Mones,
Sune Lehmann
Abstract:
The digital traces we leave behind when engaging with the modern world offer an interesting lens through which we study behavioral patterns as expression of gender. Although gender differentiation has been observed in a number of settings, the majority of studies focus on a single data stream in isolation. Here we use a dataset of high resolution data collected using mobile phones, as well as detailed questionnaires, to study gender differences in a large cohort.
We consider mobility behavior and individual personality traits among a group of more than 800 university students. We also investigate interactions among them expressed via person-to-person contacts, interactions on online social networks, and telecommunication. Thus, we are able to study the differences between male and female behavior captured through a multitude of channels for a single cohort. We find that while the two genders are similar in a number of aspects, there are robust deviations that include multiple facets of social interactions, suggesting the existence of inherent behavioral differences. Finally, we quantify how aspects of an individual's characteristics and social behavior reveal their gender by posing it as a classification problem. We ask: How well can we distinguish between male and female study participants based on behavior alone? Which behavioral features are most predictive?
Submitted 15 June, 2017;
originally announced June 2017.
-
Measuring Personalization of Web Search
Authors:
Anikó Hannák,
Piotr Sapieżyński,
Arash Molavi Khaki,
David Lazer,
Alan Mislove,
Christo Wilson
Abstract:
Web search is an integral part of our daily lives. Recently, there has been a trend of personalization in Web search, where different users receive different results for the same search query. The increasing level of personalization is leading to concerns about Filter Bubble effects, where certain users are simply unable to access information that the search engines' algorithm decides is irrelevant. Despite these concerns, there has been little quantification of the extent of personalization in Web search today, or the user attributes that cause it.
In light of this situation, we make three contributions. First, we develop a methodology for measuring personalization in Web search results. While conceptually simple, there are numerous details that our methodology must handle in order to accurately attribute differences in search results to personalization. Second, we apply our methodology to 200 users on Google Web Search and 100 users on Bing. We find that, on average, 11.7% of results show differences due to personalization on Google, while 15.8% of results are personalized on Bing, but that this varies widely by search query and by result ranking. Third, we investigate the user features used to personalize on Google Web Search and Bing. Surprisingly, we only find measurable personalization as a result of searching with a logged in account and the IP address of the searching user. Our results are a first step towards understanding the extent and effects of personalization on Web search engines today.
Submitted 15 June, 2017;
originally announced June 2017.
-
Evidence of Complex Contagion of Information in Social Media: An Experiment Using Twitter Bots
Authors:
Bjarke Mønsted,
Piotr Sapieżyński,
Emilio Ferrara,
Sune Lehmann
Abstract:
It has recently become possible to study the dynamics of information diffusion in techno-social systems at scale, due to the emergence of online platforms, such as Twitter, with millions of users. One question that systematically recurs is whether information spreads according to simple or complex dynamics: does each exposure to a piece of information have an independent probability of a user adopting it (simple contagion), or does this probability depend instead on the number of sources of exposure, increasing above some threshold (complex contagion)? Most studies to date are observational and, therefore, unable to disentangle the effects of confounding factors such as social reinforcement, homophily, limited attention, or network community structure. Here we describe a novel controlled experiment that we performed on Twitter using 'social bots' deployed to carry out coordinated attempts at spreading information. We propose two Bayesian statistical models describing simple and complex contagion dynamics, and test the competing hypotheses. We provide experimental evidence that the complex contagion model describes the observed information diffusion behavior more accurately than simple contagion. Future applications of our results include more effective defenses against malicious propaganda campaigns on social media, improved marketing and advertisement strategies, and design of effective network intervention techniques.
Submitted 17 March, 2017;
originally announced March 2017.
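The distinction drawn in the abstract above between simple and complex contagion can be made concrete with a small sketch. It is not the pair of Bayesian models proposed in the paper: the function names, the step-shaped threshold rule, and all parameter values are assumptions for illustration only.

```python
def simple_contagion(p, exposures):
    """Adoption probability when every exposure is an independent
    trial with the same success probability p."""
    return 1.0 - (1.0 - p) ** exposures


def complex_contagion(p_below, p_above, threshold, sources):
    """Per-exposure adoption probability that depends on the number of
    distinct sources and jumps once it reaches a (hypothetical) threshold."""
    return p_above if sources >= threshold else p_below


# Under simple contagion the number of distinct sources is irrelevant;
# only the count of exposures matters.
for k in (1, 2, 3):
    print(k, "exposures ->", round(simple_contagion(0.1, k), 3))

# Under complex contagion a second or third distinct source changes the
# per-exposure probability itself.
for s in (1, 2, 3):
    print(s, "sources ->", complex_contagion(0.05, 0.4, threshold=2, sources=s))
```

A step function is only the crudest form of threshold behavior; a smoother dependence on the number of sources would fit the same description.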
-
Inferring Person-to-person Proximity Using WiFi Signals
Authors:
Piotr Sapiezynski,
Arkadiusz Stopczynski,
David Kofoed Wind,
Jure Leskovec,
Sune Lehmann
Abstract:
Today's societies are enveloped in an ever-growing telecommunication infrastructure. This infrastructure offers important opportunities for sensing and recording a multitude of human behaviors. Human mobility patterns are a prominent example of such a behavior which has been studied based on cell phone towers, Bluetooth beacons, and WiFi networks as proxies for location. However, while mobility is an important aspect of human behavior, understanding complex social systems requires studying not only the movement of individuals, but also their interactions. Sensing social interactions on a large scale is a technical challenge and many commonly used approaches---including RFID badges or Bluetooth scanning---offer only limited scalability. Here we show that it is possible, in a scalable and robust way, to accurately infer person-to-person physical proximity from the lists of WiFi access points measured by smartphones carried by the two individuals. Based on a longitudinal dataset of approximately 800 participants with ground-truth interactions collected over a year, we show that our model performs better than the current state-of-the-art. Our results demonstrate the value of WiFi signals in social sensing as well as potential threats to privacy that they imply.
Submitted 15 October, 2016;
originally announced October 2016.
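One simple way to see how lists of WiFi access points can carry a proximity signal, as the abstract above describes, is to compare the sets of access points two phones observe at the same moment. The sketch below uses plain Jaccard overlap as the comparison; this is an assumed stand-in and not the model from the paper, and the function name and access point identifiers are hypothetical.

```python
def ap_overlap(scan_a, scan_b):
    """Jaccard overlap between two sets of access point identifiers
    seen in simultaneous scans (0.0 = nothing shared, 1.0 = identical)."""
    a, b = set(scan_a), set(scan_b)
    if not a and not b:
        return 0.0  # two empty scans carry no evidence of proximity
    return len(a & b) / len(a | b)


# Hypothetical scans from two phones, keyed by made-up identifiers.
phone_1 = {"ap-01", "ap-02", "ap-03"}
phone_2 = {"ap-02", "ap-03", "ap-04"}
phone_3 = {"ap-09"}

print(ap_overlap(phone_1, phone_2))  # shared surroundings
print(ap_overlap(phone_1, phone_3))  # no access points in common
```

A real system would likely also use signal strengths and would be validated against ground-truth interactions, which this sketch does not attempt.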
-
Evidence for a Conserved Quantity in Human Mobility
Authors:
Laura Alessandretti,
Piotr Sapiezynski,
Vedran Sekara,
Sune Lehmann,
Andrea Baronchelli
Abstract:
Recent seminal works on human mobility have shown that individuals constantly exploit a small set of repeatedly visited locations. A concurrent literature has emphasized the explorative nature of human behavior, showing that the number of visited places grows steadily over time. How to reconcile these seemingly contradicting facts remains an open question. Here, we analyze high-resolution multi-year traces of ~40,000 individuals from 4 datasets and show that this tension vanishes when the long-term evolution of mobility patterns is considered. We reveal that mobility patterns evolve significantly yet smoothly, and that the number of familiar locations an individual visits at any point is a conserved quantity with a typical size of ~25 locations. We use this finding to improve state-of-the-art modeling of human mobility. Furthermore, shifting the attention from aggregated quantities to individual behavior, we show that the size of an individual's set of preferred locations correlates with the number of her social interactions. This result suggests a connection between the conserved quantity we identify, which as we show cannot be understood purely on the basis of time constraints, and the 'Dunbar number' describing a cognitive upper limit to an individual's number of social relations. We anticipate that our work will spark further research linking the study of Human Mobility and the Cognitive and Behavioral Sciences.
Submitted 19 June, 2018; v1 submitted 12 September, 2016;
originally announced September 2016.
-
Temporal Fidelity in Dynamic Social Networks
Authors:
Arkadiusz Stopczynski,
Piotr Sapiezynski,
Alex 'Sandy' Pentland,
Sune Lehmann
Abstract:
It has recently become possible to record detailed social interactions in large social systems with high resolution. As we study these datasets, human social interactions display patterns that emerge at multiple time scales, from minutes to months. On a fundamental level, understanding of the network dynamics can be used to inform the process of measuring social networks. The details of measurement are of particular importance when considering dynamic processes where minute-to-minute details are important, because collection of physical proximity interactions with high temporal resolution is difficult and expensive. Here, we consider the dynamic network of proximity-interactions between approximately 500 individuals participating in the Copenhagen Networks Study. We show that in order to accurately model spreading processes in the network, the dynamic processes that occur on the order of minutes are essential and must be included in the analysis.
Submitted 21 August, 2015; v1 submitted 6 July, 2015;
originally announced July 2015.
-
Tracking Human Mobility using WiFi signals
Authors:
Piotr Sapiezynski,
Arkadiusz Stopczynski,
Radu Gatej,
Sune Lehmann
Abstract:
We study six months of human mobility data, including WiFi and GPS traces recorded with high temporal resolution, and find that time series of WiFi scans contain a strong latent location signal. In fact, due to inherent stability and low entropy of human mobility, it is possible to assign location to WiFi access points based on a very small number of GPS samples and then use these access points as location beacons. Using just one GPS observation per day per person allows us to estimate the location of, and subsequently use, WiFi access points to account for 80\% of mobility across a population. These results reveal a great opportunity for using ubiquitous WiFi routers for high-resolution outdoor positioning, but also significant privacy implications of such side-channel location tracking.
Submitted 23 May, 2015;
originally announced May 2015.
-
Measuring large-scale social networks with high resolution
Authors:
Arkadiusz Stopczynski,
Vedran Sekara,
Piotr Sapiezynski,
Andrea Cuttone,
Mette My Madsen,
Jakob Eg Larsen,
Sune Lehmann
Abstract:
This paper describes the deployment of a large-scale study designed to measure human interactions across a variety of communication channels, with high temporal resolution and spanning multiple years - the Copenhagen Networks Study. Specifically, we collect data on face-to-face interactions, telecommunication, social networks, location, and background information (personality, demographic, health, politics) for a densely connected population of 1,000 individuals, using state-of-the-art smartphones as social sensors. Here we provide an overview of the related work and describe the motivation and research agenda driving the study. Additionally, the paper details the data types measured and the technical infrastructure in terms of both backend and phone software, and outlines the deployment procedures. We document the participant privacy procedures and their underlying principles. The paper concludes with early results from data analysis, illustrating the importance of a multi-channel, high-resolution approach to data collection.
Submitted 13 February, 2014; v1 submitted 28 January, 2014;
originally announced January 2014.
-
Crowds, Bluetooth, and Rock-n-Roll. Understanding Music Festival Participant Behavior
Authors:
Jakob Eg Larsen,
Piotr Sapiezynski,
Arkadiusz Stopczynski,
Morten Moerup,
Rasmus Theodorsen
Abstract:
In this paper we present a study of sensing and analyzing an offline social network of participants at a large-scale music festival (8 days, 130,000+ participants). We place 33 fixed-location Bluetooth scanners in strategic spots around the festival area to discover Bluetooth-enabled mobile phones carried by the participants, and thus collect spatio-temporal traces of their mobility and interactions. We subsequently analyze the data on two levels. On the micro level, we run a community detection algorithm to reveal a variety of groups the festival participants form. On the macro level, we employ an Infinite Relational Model (IRM) in order to recover the structure of the social network related to participants' music preferences. The obtained structure, in the form of clusters of concerts and participants, is then interpreted using meta-information about music genres, band origins, stages, and dates of performances. We show that most of the concert clusters can be described by one or more of the meta-features, effectively revealing preferences of participants (e.g. a cluster of US bands), and discuss the significance of the findings and the potential and limitations of the method used. Finally, we discuss the possibility of employing the described method and techniques for creating user-oriented applications and extending the sensing capabilities during large-scale events by introducing user involvement.
Submitted 14 June, 2013; v1 submitted 13 June, 2013;
originally announced June 2013.