-
Information Security and Privacy in the Digital World: Some Selected Topics
Authors:
Jaydip Sen,
Joceli Mayer,
Subhasis Dasgupta,
Subrata Nandi,
Srinivasan Krishnaswamy,
Pinaki Mitra,
Mahendra Pratap Singh,
Naga Prasanthi Kundeti,
Chandra Sekhara Rao MVP,
Sudha Sree Chekuri,
Seshu Babu Pallapothu,
Preethi Nanjundan,
Jossy P. George,
Abdelhadi El Allahi,
Ilham Morino,
Salma AIT Oussous,
Siham Beloualid,
Ahmed Tamtaoui,
Abderrahim Bajit
Abstract:
In the era of generative artificial intelligence and the Internet of Things, while there is explosive growth in the volume of data and the associated need for processing, analysis, and storage, several new challenges are faced in identifying spurious and fake information and protecting the privacy of sensitive data. This has led to an increasing demand for more robust and resilient schemes for authentication, integrity protection, encryption, non-repudiation, and privacy-preservation of data. The chapters in this book present some of the state-of-the-art research works in the field of cryptography and security in computing and communications.
Submitted 29 March, 2024;
originally announced April 2024.
-
Practical End-to-End Optical Music Recognition for Pianoform Music
Authors:
Jiří Mayer,
Milan Straka,
Jan Hajič jr.,
Pavel Pecina
Abstract:
The majority of recent progress in Optical Music Recognition (OMR) has been achieved with Deep Learning methods, especially models following the end-to-end paradigm, reading input images and producing a linear sequence of tokens. Unfortunately, many music scores, especially piano music, cannot be easily converted to a linear sequence. This has led OMR researchers to use custom linearized encodings, instead of broadly accepted structured formats for music notation. Their diversity makes it difficult to compare the performance of OMR systems directly. To bring recent OMR model progress closer to useful results: (a) We define a sequential format called Linearized MusicXML, which allows an end-to-end model to be trained directly while maintaining close cohesion and compatibility with the industry-standard MusicXML format. (b) We create a dev and test set for benchmarking typeset OMR with MusicXML ground truth based on the OpenScore Lieder corpus. They contain 1,438 and 1,493 pianoform systems, each with an image from IMSLP. (c) We train and fine-tune an end-to-end model to serve as a baseline on the dataset and employ the TEDn metric to evaluate the model. We also test our model against the recently published synthetic pianoform dataset GrandStaff and surpass the state-of-the-art results.
Submitted 20 March, 2024;
originally announced March 2024.
-
The Challenges of Machine Learning for Trust and Safety: A Case Study on Misinformation Detection
Authors:
Madelyne Xiao,
Jonathan Mayer
Abstract:
We examine the disconnect between scholarship and practice in applying machine learning to trust and safety problems, using misinformation detection as a case study. We survey literature on automated detection of misinformation across a corpus of 248 well-cited papers in the field. We then examine subsets of papers for data and code availability, design missteps, reproducibility, and generalizability. Our paper corpus includes published work in security, natural language processing, and computational social science. Across these disparate disciplines, we identify common errors in dataset and method design. In general, detection tasks are often meaningfully distinct from the challenges that online services actually face. Datasets and model evaluation are often non-representative of real-world contexts, and evaluation frequently is not independent of model training. We demonstrate the limitations of current detection methods in a series of three representative replication studies. Based on the results of these analyses and our literature survey, we conclude that the current state-of-the-art in fully-automated misinformation detection has limited efficacy in detecting human-generated misinformation. We offer recommendations for evaluating applications of machine learning to trust and safety problems and recommend future directions for research.
Submitted 19 June, 2024; v1 submitted 23 August, 2023;
originally announced August 2023.
-
Account Verification on Social Media: User Perceptions and Paid Enrollment
Authors:
Madelyne Xiao,
Mona Wang,
Anunay Kulshrestha,
Jonathan Mayer
Abstract:
We investigate how users perceive social media account verification, how those perceptions compare to platform practices, and what happens when a gap emerges. We use recent changes in Twitter's verification process as a natural experiment, where the meaning and types of verification indicators rapidly and significantly shift. The project consists of two components: a user survey and a measurement of verified Twitter accounts.
In the survey study, we ask a demographically representative sample of U.S. respondents (n = 299) about social media account verification requirements both in general and for particular platforms. We also ask about experiences with online information sources and digital literacy. More than half of respondents misunderstand Twitter's criteria for blue check account verification, and over 80% of respondents misunderstand Twitter's new gold and gray check verification indicators. Our analysis of survey responses suggests that people who are older or have lower digital literacy may be modestly more likely to misunderstand Twitter verification.
In the measurement study, we randomly sample 15 million English language tweets from October 2022. We obtain account verification status for the associated accounts in November 2022, just before Twitter's verification changes, and we collect verification status again in January 2023. The resulting longitudinal dataset of 2.85 million accounts enables us to characterize the accounts that gained and lost verification following Twitter's changes. We find that accounts posting conservative political content, exhibiting positive views about Elon Musk, and promoting cryptocurrencies disproportionately obtain blue check verification after Twitter's changes.
We close by offering recommendations for improving account verification indicators and processes.
Submitted 24 June, 2023; v1 submitted 28 April, 2023;
originally announced April 2023.
-
SoK: Content Moderation for End-to-End Encryption
Authors:
Sarah Scheffler,
Jonathan Mayer
Abstract:
Popular messaging applications now enable end-to-end-encryption (E2EE) by default, and E2EE data storage is becoming common. These important advances for security and privacy create new content moderation challenges for online services, because services can no longer directly access plaintext content. While ongoing public policy debates about E2EE and content moderation in the United States and European Union emphasize child sexual abuse material and misinformation in messaging and storage, we identify and synthesize a wealth of scholarship that goes far beyond those topics. We bridge literature that is diverse in both content moderation subject matter, such as malware, spam, hate speech, terrorist content, and enterprise policy compliance, as well as intended deployments, including not only privacy-preserving content moderation for messaging, email, and cloud storage, but also private introspection of encrypted web traffic by middleboxes. In this work, we systematize the study of content moderation in E2EE settings. We set out a process pipeline for content moderation, drawing on a broad interdisciplinary literature that is not specific to E2EE. We examine cryptography and policy design choices at all stages of this pipeline, and we suggest areas of future research to fill gaps in literature and better understand possible paths forward.
Submitted 7 March, 2023;
originally announced March 2023.
-
Rally and WebScience: A Platform and Toolkit for Browser-Based Research on Technology and Society Problems
Authors:
Anne Kohlbrenner,
Ben Kaiser,
Kartikeya Kandula,
Rebecca Weiss,
Jonathan Mayer,
Ted Han,
Robert Helmer
Abstract:
Empirical technology and society research is in a methodological crisis. Problems increasingly involve closed platforms, targeted content, and context-specific behavior. Prevailing research methods, such as surveys, tasks, and web crawls, pose design and ecological validity limitations.
Deploying studies in participant browsers and devices is a promising direction. These vantage points can observe individualized experiences and implement UI interventions in real settings.
We survey scholarship that uses these methods, annotating 284 sampled papers. Our analysis demonstrates their potential, but also recurring implementation barriers and shortcomings.
We then present Rally and WebScience, a platform and toolkit for browser-based research. These systems lower implementation barriers and advance the science of measuring online behavior.
Finally, we evaluate Rally and WebScience against our design goals. We report results from a one-month pilot study on news engagement, analyzing 4,466,200 webpage visits from 1,817 participants. We also present observations from interviews with researchers using these systems.
Submitted 30 November, 2022; v1 submitted 4 November, 2022;
originally announced November 2022.
-
Self-Censorship Under Law: A Case Study of the Hong Kong National Security Law
Authors:
Mona Wang,
Jonathan Mayer
Abstract:
We study how legislation that restricts speech can induce online self-censorship and alter online discourse, using the recent Hong Kong national security law as a case study. We collect a dataset of 7 million historical Tweets from Hong Kong users, supplemented with historical snapshots of Tweet streams collected by other researchers. We compare online activity before and after enactment of the national security law, and we find that Hong Kong users demonstrate two types of self-censorship. First, Hong Kong users are more likely than a control group, sampled randomly from historical snapshots of Tweet streams, to remove past online activity. Specifically, Hong Kong users are over a third more likely than the control group to delete or restrict their account and over twice as likely to delete past posts. Second, we find that Hong Kong users post less often about politically sensitive topics that have been censored on social media in mainland China. This trend continues to increase.
Submitted 17 February, 2023; v1 submitted 20 October, 2022;
originally announced October 2022.
-
Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information
Authors:
Jana Mayer,
Johannes Westermann,
Juan Pedro Gutiérrez H. Muriedas,
Uwe Mettin,
Alexander Lampe
Abstract:
In recent years, reinforcement learning (RL) has gained increasing attention in control engineering. Policy gradient methods in particular are widely used. In this work, we improve the tracking performance of proximal policy optimization (PPO) for arbitrary reference signals by incorporating information about future reference values. We present two variants that extend the argument of the actor and the critic to take future reference values into account. In the first variant, global future reference values are added to the argument. For the second variant, a novel kind of residual space with future reference values applicable to model-free reinforcement learning is introduced. Our approach is evaluated against a PI controller on a simple drive train model. We expect our method to generalize to arbitrary references better than previous approaches, pointing towards the applicability of RL to control real systems.
Submitted 20 July, 2021;
originally announced July 2021.
-
What Makes a Dark Pattern... Dark? Design Attributes, Normative Considerations, and Measurement Methods
Authors:
Arunesh Mathur,
Jonathan Mayer,
Mihir Kshirsagar
Abstract:
There is a rapidly growing literature on dark patterns, user interface designs -- typically related to shopping or privacy -- that researchers deem problematic. Recent work has been predominantly descriptive, documenting and categorizing objectionable user interfaces. These contributions have been invaluable in highlighting specific designs for researchers and policymakers. But the current literature lacks a conceptual foundation: What makes a user interface a dark pattern? Why are certain designs problematic for users or society?
We review recent work on dark patterns and demonstrate that the literature does not reflect a singular concern or consistent definition, but rather, a set of thematically related considerations. Drawing from scholarship in psychology, economics, ethics, philosophy, and law, we articulate a set of normative perspectives for analyzing dark patterns and their effects on individuals and society. We then show how future research on dark patterns can go beyond subjective criticism of user interface designs and apply empirical methods grounded in normative perspectives.
Submitted 12 January, 2021;
originally announced January 2021.
-
Adapting Security Warnings to Counter Online Disinformation
Authors:
Ben Kaiser,
Jerry Wei,
Eli Lucherini,
Kevin Lee,
J. Nathan Matias,
Jonathan Mayer
Abstract:
Disinformation is proliferating on the internet, and platforms are responding by attaching warnings to content. There is little evidence, however, that these warnings help users identify or avoid disinformation. In this work, we adapt methods and results from the information security warning literature in order to design and evaluate effective disinformation warnings. In an initial laboratory study, we used a simulated search task to examine contextual and interstitial disinformation warning designs. We found that users routinely ignore contextual warnings, but users notice interstitial warnings -- and respond by seeking information from alternative sources. We then conducted a follow-on crowdworker study with eight interstitial warning designs. We confirmed a significant impact on user information-seeking behavior, and we found that a warning's design could effectively inform users or convey a risk of harm. We also found, however, that neither user comprehension nor fear of harm moderated behavioral effects. Our work provides evidence that disinformation warnings can -- when designed well -- help users identify and avoid disinformation. We show a path forward for designing effective warnings, and we contribute repeatable methods for evaluating behavioral effects. We also surface a possible dilemma: disinformation warnings might be able to inform users and guide behavior, but the behavioral effects might result from user experience friction, not informed decision making.
Submitted 16 August, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Privacy Policies over Time: Curation and Analysis of a Million-Document Dataset
Authors:
Ryan Amos,
Gunes Acar,
Eli Lucherini,
Mihir Kshirsagar,
Arvind Narayanan,
Jonathan Mayer
Abstract:
Automated analysis of privacy policies has proved a fruitful research direction, with developments such as automated policy summarization, question answering systems, and compliance detection. Prior research has been limited to analysis of privacy policies from a single point in time or from short spans of time, as researchers did not have access to a large-scale, longitudinal, curated dataset. To address this gap, we developed a crawler that discovers, downloads, and extracts archived privacy policies from the Internet Archive's Wayback Machine. Using the crawler and following a series of validation and quality control steps, we curated a dataset of 1,071,488 English language privacy policies, spanning over two decades and over 130,000 distinct websites.
Our analyses of the data paint a troubling picture of the transparency and accessibility of privacy policies. By comparing the occurrence of tracking-related terminology in our dataset to prior web privacy measurements, we find that privacy policies have consistently failed to disclose the presence of common tracking technologies and third parties. We also find that over the last twenty years privacy policies have become even more difficult to read, doubling in length and increasing a full grade in the median reading level. Our data indicate that self-regulation for first-party websites has stagnated, while self-regulation for third parties has increased but is dominated by online advertising trade associations. Finally, we contribute to the literature on privacy regulation by demonstrating the historic impact of the GDPR on privacy policies.
Submitted 20 July, 2021; v1 submitted 20 August, 2020;
originally announced August 2020.
-
Classifying Network Vendors at Internet Scale
Authors:
Jordan Holland,
Ross Teixeira,
Paul Schmitt,
Kevin Borgolte,
Jennifer Rexford,
Nick Feamster,
Jonathan Mayer
Abstract:
In this paper, we develop a method to create a large, labeled dataset of visible network device vendors across the Internet by mapping network-visible IP addresses to device vendors. We use Internet-wide scanning, banner grabs of network-visible devices across the IPv4 address space, and clustering techniques to assign labels to more than 160,000 devices. We subsequently probe these devices and use features extracted from the responses to train a classifier that can accurately classify device vendors. Finally, we demonstrate how this method can be used to understand broader trends across the Internet by predicting device vendors in traceroutes from CAIDA's Archipelago measurement system and subsequently examining vendor distributions across these traceroutes.
△ Less
Submitted 24 June, 2020; v1 submitted 23 June, 2020;
originally announced June 2020.
-
Identifying Disinformation Websites Using Infrastructure Features
Authors:
Austin Hounsel,
Jordan Holland,
Ben Kaiser,
Kevin Borgolte,
Nick Feamster,
Jonathan Mayer
Abstract:
Platforms have struggled to keep pace with the spread of disinformation. Current responses like user reports, manual analysis, and third-party fact checking are slow and difficult to scale, and as a result, disinformation can spread unchecked for some time after being created. Automation is essential for enabling platforms to respond rapidly to disinformation. In this work, we explore a new direction for automated detection of disinformation websites: infrastructure features. Our hypothesis is that while disinformation websites may be perceptually similar to authentic news websites, there may also be significant non-perceptual differences in the domain registrations, TLS/SSL certificates, and web hosting configurations. Infrastructure features are particularly valuable for detecting disinformation websites because they are available before content goes live and reaches readers, enabling early detection. We demonstrate the feasibility of our approach on a large corpus of labeled website snapshots. We also present results from a preliminary real-time deployment, successfully discovering disinformation websites while highlighting unexplored challenges for automated disinformation detection.
Submitted 28 September, 2020; v1 submitted 28 February, 2020;
originally announced March 2020.
-
Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites
Authors:
Arunesh Mathur,
Gunes Acar,
Michael J. Friedman,
Elena Lucherini,
Jonathan Mayer,
Marshini Chetty,
Arvind Narayanan
Abstract:
Dark patterns are user interface design choices that benefit an online service by coercing, steering, or deceiving users into making unintended and potentially harmful decisions. We present automated techniques that enable experts to identify dark patterns on a large set of websites. Using these techniques, we study shopping websites, which often use dark patterns to influence users into making more purchases or disclosing more information than they would otherwise. Analyzing ~53K product pages from ~11K shopping websites, we discover 1,818 dark pattern instances, together representing 15 types and 7 broader categories. We examine these dark patterns for deceptive practices, and find 183 websites that engage in such practices. We also uncover 22 third-party entities that offer dark patterns as a turnkey solution. Finally, we develop a taxonomy of dark pattern characteristics that describes the underlying influence of the dark patterns and their potential harm on user decision-making. Based on our findings, we make recommendations for stakeholders including researchers and regulators to study, mitigate, and minimize the use of these patterns.
Submitted 20 September, 2019; v1 submitted 16 July, 2019;
originally announced July 2019.
-
Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation
Authors:
Mengyu Chu,
You Xie,
Jonas Mayer,
Laura Leal-Taixé,
Nils Thuerey
Abstract:
Our work explores temporal self-supervision for GAN-based video generation tasks. While adversarial training successfully yields generative models for a variety of areas, temporal relationships in the generated data are much less explored. Natural temporal changes are crucial for sequential generation tasks, e.g. video super-resolution and unpaired video translation. For the former, state-of-the-art methods often favor simpler norm losses such as $L^2$ over adversarial training. However, their averaging nature easily leads to temporally smooth results with an undesirable lack of spatial detail. For unpaired video translation, existing approaches modify the generator networks to form spatio-temporal cycle consistencies. In contrast, we focus on improving learning objectives and propose a temporally self-supervised algorithm. For both tasks, we show that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. We also propose a novel Ping-Pong loss to improve the long-term temporal consistency. It effectively prevents recurrent networks from accumulating artifacts temporally without depressing detailed features. Additionally, we propose a first set of metrics to quantitatively evaluate the accuracy as well as the perceptual quality of the temporal evolution. A series of user studies confirm the rankings computed with these metrics. Code, data, models, and results are provided at https://github.com/thunil/TecoGAN. The project page https://ge.in.tum.de/publications/2019-tecogan-chu/ contains supplemental materials.
Submitted 21 May, 2020; v1 submitted 23 November, 2018;
originally announced November 2018.
-
SLAMBench2: Multi-Objective Head-to-Head Benchmarking for Visual SLAM
Authors:
Bruno Bodin,
Harry Wagstaff,
Sajad Saeedi,
Luigi Nardi,
Emanuele Vespa,
John H Mayer,
Andy Nisbet,
Mikel Luján,
Steve Furber,
Andrew J Davison,
Paul H. J. Kelly,
Michael O'Boyle
Abstract:
SLAM is becoming a key component of robotics and augmented reality (AR) systems. While a large number of SLAM algorithms have been presented, there has been little effort to unify the interface of such algorithms, or to perform a holistic comparison of their capabilities. This is a problem since different SLAM applications can have different functional and non-functional requirements. For example, a mobile phone-based AR application has a tight energy budget, while a UAV navigation system usually requires high accuracy. SLAMBench2 is a benchmarking framework to evaluate existing and future SLAM systems, both open and closed source, over an extensible list of datasets, while using a comparable and clearly specified list of performance metrics. A wide variety of existing SLAM algorithms and datasets is supported, e.g. ElasticFusion, InfiniTAM, ORB-SLAM2, OKVIS, and integrating new ones is straightforward and clearly specified by the framework. SLAMBench2 is a publicly-available software framework which represents a starting point for quantitative, comparable and validatable experimental research to investigate trade-offs across SLAM systems.
Submitted 21 August, 2018;
originally announced August 2018.
-
The Future of Ad Blocking: An Analytical Framework and New Techniques
Authors:
Grant Storey,
Dillon Reisman,
Jonathan Mayer,
Arvind Narayanan
Abstract:
We present a systematic study of ad blocking - and the associated "arms race" - as a security problem. We model ad blocking as a state space with four states and six state transitions, which correspond to techniques that can be deployed by either publishers or ad blockers. We argue that this is a complete model of the system. We propose several new ad blocking techniques, including ones that borrow ideas from rootkits to prevent detection by anti-ad blocking scripts. Another technique uses the insight that ads must be recognizable by humans to comply with laws and industry self-regulation. We have built prototype implementations of three of these techniques, successfully blocking ads and evading detection. We systematically evaluate our proposed techniques, along with existing ones, in terms of security, practicality, and legality. We characterize the order of growth of the development effort required to create/maintain ad blockers as a function of the growth of the web. Based on our state-space model, our new techniques, and this systematization, we offer insights into the likely "end game" of the arms race. We challenge the widespread assumption that the arms race will escalate indefinitely, and instead identify a combination of evolving technical and legal factors that will determine the outcome.
Submitted 23 May, 2017;
originally announced May 2017.