Search | arXiv e-print repository

Did I Vet You Before? Assessing the Chrome Web Store Vetting Process through Browser Extension Similarity

Authors: José Miguel Moreno, Narseo Vallina-Rodriguez, Juan Tapiador

Abstract: Web browsers, particularly Google Chrome and other Chromium-based browsers, have grown in popularity over the past decade, with browser extensions becoming an integral part of their ecosystem. These extensions can customize and enhance the user experience, providing functionality that ranges from ad blockers to, more recently, AI assistants. Given the ever-increasing importance of web browsers, di… ▽ More Web browsers, particularly Google Chrome and other Chromium-based browsers, have grown in popularity over the past decade, with browser extensions becoming an integral part of their ecosystem. These extensions can customize and enhance the user experience, providing functionality that ranges from ad blockers to, more recently, AI assistants. Given the ever-increasing importance of web browsers, distribution marketplaces for extensions play a key role in kee** users safe by vetting submissions that display abusive or malicious behavior. In this paper, we characterize the prevalence of malware and other infringing extensions in the Chrome Web Store (CWS), the largest distribution platform for this type of software. To do so, we introduce SimExt, a novel methodology for detecting similarly behaving extensions that leverages static and dynamic analysis, Natural Language Processing (NLP) and vector embeddings. Our study reveals significant gaps in the CWS vetting process, as 86% of infringing extensions are extremely similar to previously vetted items, and these extensions take months or even years to be removed. By characterizing the top kinds of infringing extension, we find that 83% are New Tab Extensions (NTEs) and raise some concerns about the consistency of the vetting labels assigned by CWS analysts. Our study also reveals that only 1% of malware extensions flagged by the CWS are detected as malicious by anti-malware engines, indicating a concerning gap between the threat landscape seen by CWS moderators and the detection capabilities of the threat intelligence community. △ Less

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2306.14497 [pdf, other]

Your Code is 0000: An Analysis of the Disposable Phone Numbers Ecosystem

Authors: José Miguel Moreno, Srdjan Matic, Narseo Vallina-Rodriguez, Juan Tapiador

Abstract: Short Message Service (SMS) is a popular channel for online service providers to verify accounts and authenticate users registered to a particular service. Specialized applications, called Public SMS Gateways (PSGs), offer free Disposable Phone Numbers (DPNs) that can be used to receive SMS messages. DPNs allow users to protect their privacy when creating online accounts. However, they can also be… ▽ More Short Message Service (SMS) is a popular channel for online service providers to verify accounts and authenticate users registered to a particular service. Specialized applications, called Public SMS Gateways (PSGs), offer free Disposable Phone Numbers (DPNs) that can be used to receive SMS messages. DPNs allow users to protect their privacy when creating online accounts. However, they can also be abused for fraudulent activities and to bypass security mechanisms like Two-Factor Authentication (2FA). In this paper, we perform a large-scale and longitudinal study of the DPN ecosystem by monitoring 17,141 unique DPNs in 29 PSGs over the course of 12 months. Using a dataset of over 70M messages, we provide an overview of the ecosystem and study the different services that offer DPNs and their relationships. Next, we build a framework that (i) identifies and classifies the purpose of an SMS; and (ii) accurately attributes every message to more than 200 popular Internet services that require SMS for creating registered accounts. Our results indicate that the DPN ecosystem is globally used to support fraudulent account creation and access, and that this issue is ubiquitous and affects all major Internet platforms and specialized online services. △ Less

Submitted 26 June, 2023; originally announced June 2023.

arXiv:2305.11506 [pdf, other]

Chrowned by an Extension: Abusing the Chrome DevTools Protocol through the Debugger API

Authors: José Miguel Moreno, Narseo Vallina-Rodriguez, Juan Tapiador

Abstract: The Chromium open-source project has become a fundamental piece of the Web as we know it today, with multiple vendors offering browsers based on its codebase. One of its most popular features is the possibility of altering or enhancing the browser functionality through third-party programs known as browser extensions. Extensions have access to a wide range of capabilities through the use of APIs e… ▽ More The Chromium open-source project has become a fundamental piece of the Web as we know it today, with multiple vendors offering browsers based on its codebase. One of its most popular features is the possibility of altering or enhancing the browser functionality through third-party programs known as browser extensions. Extensions have access to a wide range of capabilities through the use of APIs exposed by Chromium. The Debugger API -- arguably the most powerful of such APIs -- allows extensions to use the Chrome DevTools Protocol (CDP), a capability-rich tool for debugging and instrumenting the browser. In this paper, we describe several vulnerabilities present in the Debugger API and in the granting of capabilities to extensions that can be used by an attacker to take control of the browser, escalate privileges, and break context isolation. We demonstrate their impact by introducing six attacks that allow an attacker to steal user information, monitor network traffic, modify site permissions (\eg access to camera or microphone), bypass security interstitials without user intervention, and change the browser settings. Our attacks work in all major Chromium-based browsers as they are rooted at the core of the Chromium project. We reported our findings to the Chromium Development Team, who already fixed some of them and are currently working on fixing the remaining ones. We conclude by discussing how questionable design decisions, lack of public specifications, and an overpowered Debugger API have contributed to enabling these attacks, and propose mitigations. △ Less

Submitted 31 May, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

arXiv:2302.00598 [pdf, other]

Reviewing War: Unconventional User Reviews as a Side Channel to Circumvent Information Controls

Authors: José Miguel Moreno, Sergio Pastrana, Jens Helge Reelfs, Pelayo Vallina, Andriy Panchenko, Georgios Smaragdakis, Oliver Hohlfeld, Narseo Vallina-Rodriguez, Juan Tapiador

Abstract: During the first days of the 2022 Russian invasion of Ukraine, Russia's media regulator blocked access to many global social media platforms and news sites, including Twitter, Facebook, and the BBC. To bypass the information controls set by Russian authorities, pro-Ukrainian groups explored unconventional ways to reach out to the Russian population, such as posting war-related content in the user… ▽ More During the first days of the 2022 Russian invasion of Ukraine, Russia's media regulator blocked access to many global social media platforms and news sites, including Twitter, Facebook, and the BBC. To bypass the information controls set by Russian authorities, pro-Ukrainian groups explored unconventional ways to reach out to the Russian population, such as posting war-related content in the user reviews of Russian business available on Google Maps or Tripadvisor. This paper provides a first analysis of this new phenomenon by analyzing the creative strategies to avoid state censorship. Specifically, we analyze reviews posted on these platforms from the beginning of the conflict to September 2022. We measure the channeling of war messages through user reviews in Tripadvisor and Google Maps, as well as in VK, a popular Russian social network. Our analysis of the content posted on these services reveals that users leveraged these platforms to seek and exchange humanitarian and travel advice, but also to disseminate disinformation and polarized messages. Finally, we analyze the response of platforms in terms of content moderation and their impact. △ Less

Submitted 1 February, 2023; originally announced February 2023.

arXiv:2212.03615 [pdf, other]

Not Your Average App: A Large-scale Privacy Analysis of Android Browsers

Authors: Amogh Pradeep, Álvaro Feal, Julien Gamba, Ashwin Rao, Martina Lindorfer, Narseo Vallina-Rodriguez, David Choffnes

Abstract: The transparency and privacy behavior of mobile browsers has remained widely unexplored by the research community. In fact, as opposed to regular Android apps, mobile browsers may present contradicting privacy behaviors. On the one end, they can have access to (and can expose) a unique combination of sensitive user data, from users' browsing history to permission-protected personally identifiable… ▽ More The transparency and privacy behavior of mobile browsers has remained widely unexplored by the research community. In fact, as opposed to regular Android apps, mobile browsers may present contradicting privacy behaviors. On the one end, they can have access to (and can expose) a unique combination of sensitive user data, from users' browsing history to permission-protected personally identifiable information (PII) such as unique identifiers and geolocation. However, on the other end, they also are in a unique position to protect users' privacy by limiting data sharing with other parties by implementing ad-blocking features. In this paper, we perform a comparative and empirical analysis on how hundreds of Android web browsers protect or expose user data during browsing sessions. To this end, we collect the largest dataset of Android browsers to date, from the Google Play Store and four Chinese app stores. Then, we developed a novel analysis pipeline that combines static and dynamic analysis methods to find a wide range of privacy-enhancing (e.g., ad-blocking) and privacy-harming behaviors (e.g., sending browsing histories to third parties, not validating TLS certificates, and exposing PII -- including non-resettable identifiers -- to third parties) across browsers. We find that various popular apps on both Google Play and Chinese stores have these privacy-harming behaviors, including apps that claim to be privacy-enhancing in their descriptions. Overall, our study not only provides new insights into important yet overlooked considerations for browsers' adoption and transparency, but also that automatic app analysis systems (e.g., sandboxes) need context-specific analysis to reveal such privacy behaviors. △ Less

Submitted 7 December, 2022; originally announced December 2022.

Comments: Privacy Enhancing Technologies Symposium, 2023

arXiv:2211.13104 [pdf, other]

Mixed Signals: Analyzing Software Attribution Challenges in the Android Ecosystem

Authors: Kaspar Hageman, Álvaro Feal, Julien Gamba, Aniketh Girish, Jakob Bleier, Martina Lindorfer, Juan Tapiador, Narseo Vallina-Rodriguez

Abstract: The ability to identify the author responsible for a given software object is critical for many research studies and for enhancing software transparency and accountability. However, as opposed to other application markets like iOS, attribution in the Android ecosystem is known to be hard. Prior research has leveraged market metadata and signing certificates to identify software authors without que… ▽ More The ability to identify the author responsible for a given software object is critical for many research studies and for enhancing software transparency and accountability. However, as opposed to other application markets like iOS, attribution in the Android ecosystem is known to be hard. Prior research has leveraged market metadata and signing certificates to identify software authors without questioning the validity and accuracy of these attribution signals. However, Android app authors can, either intentionally or by mistake, hide their true identity due to: (1) the lack of policy enforcement by markets to ensure the accuracy and correctness of the information disclosed by developers in their market profiles during the app release process, and (2) the use of self-signed certificates for signing apps instead of certificates issued by trusted CAs. In this paper, we perform the first empirical analysis of the availability, volatility and overall aptness of publicly available metadata for author attribution in Android app markets. To that end, we analyze a dataset of over 2.5 million market entries and apps extracted from five Android markets for over two years. Our results show that widely used attribution signals are often missing from market profiles and that they change over time. We also invalidate the general belief about the validity of signing certificates for author attribution. For instance, we find that apps from different authors share signing certificates due to the proliferation of app building frameworks and software factories. Finally, we introduce the concept of attribution graph and we apply it to evaluate the validity of existing attribution signals on the Google Play Store. Our results confirm that the lack of control over publicly available signals can confuse the attribution process. △ Less

Submitted 23 November, 2022; originally announced November 2022.

arXiv:2012.07695 [pdf, other]

Back in control -- An extensible middle-box on your phone

Authors: James Newman, Abbas Razaghpanah, Narseo Vallina-Rodriguez, Fabian E. Bustamante, Mark Allman, Diego Perino, Alessandro Finamore

Abstract: The closed design of mobile devices -- with the increased security and consistent user interfaces -- is in large part responsible for their becoming the dominant platform for accessing the Internet. These benefits, however, are not without a cost. Their operation of mobile devices and their apps is not easy to understand by either users or operators. We argue for recovering transparency and contro… ▽ More The closed design of mobile devices -- with the increased security and consistent user interfaces -- is in large part responsible for their becoming the dominant platform for accessing the Internet. These benefits, however, are not without a cost. Their operation of mobile devices and their apps is not easy to understand by either users or operators. We argue for recovering transparency and control on mobile devices through an extensible platform that can intercept and modify traffic before leaving the device or, on arrival, before it reaches the operating system. Conceptually, this is the same view of the traffic that a traditional middlebox would have at the far end of the first link in the network path. We call this platform ``middlebox zero'' or MBZ. By being on-board, MBZ also leverages local context as it processes the traffic and complements the network-wide view of standard middleboxes. We discuss the challenges of the MBZ approach, sketch a working design, and illustrate its potential with some concrete examples. △ Less

Submitted 14 December, 2020; originally announced December 2020.

Comments: The paper is a position piece under review

arXiv:2010.01497 [pdf, other]

doi 10.1145/3419394.3423662

Understanding Incentivized Mobile App Installs on Google Play Store

Authors: Shehroze Farooqi, Álvaro Feal, Tobias Lauinger, Damon McCoy, Zubair Shafiq, Narseo Vallina-Rodriguez

Abstract: "Incentivized" advertising platforms allow mobile app developers to acquire new users by directly paying users to install and engage with mobile apps (e.g., create an account, make in-app purchases). Incentivized installs are banned by the Apple App Store and discouraged by the Google Play Store because they can manipulate app store metrics (e.g., install counts, appearance in top charts). Yet, ma… ▽ More "Incentivized" advertising platforms allow mobile app developers to acquire new users by directly paying users to install and engage with mobile apps (e.g., create an account, make in-app purchases). Incentivized installs are banned by the Apple App Store and discouraged by the Google Play Store because they can manipulate app store metrics (e.g., install counts, appearance in top charts). Yet, many organizations still offer incentivized install services for Android apps. In this paper, we present the first study to understand the ecosystem of incentivized mobile app install campaigns in Android and its broader ramifications through a series of measurements. We identify incentivized install campaigns that require users to install an app and perform in-app tasks targeting manipulation of a wide variety of user engagement metrics (e.g., daily active users, user session lengths) and revenue. Our results suggest that these artificially inflated metrics can be effective in improving app store metrics as well as hel** mobile app developers to attract funding from venture capitalists. Our study also indicates lax enforcement of the Google Play Store's existing policies to prevent these behaviors. It further motivates the need for stricter policing of incentivized install campaigns. Our proposed measurements can also be leveraged by the Google Play Store to identify potential policy violations. △ Less

Submitted 4 October, 2020; originally announced October 2020.

Journal ref: ACM Internet Measurement Conference (2020)

arXiv:2008.10959 [pdf, other]

doi 10.1145/3419394.3423658

The Lockdown Effect: Implications of the COVID-19 Pandemic on Internet Traffic

Authors: Anja Feldmann, Oliver Gasser, Franziska Lichtblau, Enric Pujol, Ingmar Poese, Christoph Dietzel, Daniel Wagner, Matthias Wichtlhuber, Juan Tapiador, Narseo Vallina-Rodriguez, Oliver Hohlfeld, Georgios Smaragdakis

Abstract: Due to the COVID-19 pandemic, many governments imposed lock downs that forced hundreds of millions of citizens to stay at home. The implementation of confinement measures increased Internet traffic demands of residential users, in particular, for remote working, entertainment, commerce, and education, which, as a result, caused traffic shifts in the Internet core. In this paper, using data from a… ▽ More Due to the COVID-19 pandemic, many governments imposed lock downs that forced hundreds of millions of citizens to stay at home. The implementation of confinement measures increased Internet traffic demands of residential users, in particular, for remote working, entertainment, commerce, and education, which, as a result, caused traffic shifts in the Internet core. In this paper, using data from a diverse set of vantage points (one ISP, three IXPs, and one metropolitan educational network), we examine the effect of these lockdowns on traffic shifts. We find that the traffic volume increased by 15-20% almost within a week--while overall still modest, this constitutes a large increase within this short time period. However, despite this surge, we observe that the Internet infrastructure is able to handle the new volume, as most traffic shifts occur outside of traditional peak hours. When looking directly at the traffic sources, it turns out that, while hypergiants still contribute a significant fraction of traffic, we see (1) a higher increase in traffic of non-hypergiants, and (2) traffic increases in applications that people use when at home, such as Web conferencing, VPN, and gaming. While many networks see increased traffic demands, in particular, those providing services to residential users, academic networks experience major overall decreases. Yet, in these networks, we can observe substantial increases when considering applications associated to remote working and lecturing. △ Less

Submitted 5 October, 2020; v1 submitted 25 August, 2020; originally announced August 2020.

Journal ref: Proceedings of the 2020 Internet Measurement Conference (IMC '20)

arXiv:1907.12762 [pdf, other]

The Era of TLS 1.3: Measuring Deployment and Use with Active and Passive Methods

Authors: Ralph Holz, Johanna Amann, Abbas Razaghpanah, Narseo Vallina-Rodriguez

Abstract: TLS 1.3 marks a significant departure from previous versions of the Transport Layer Security protocol (TLS). The new version offers a simplified protocol flow, more secure cryptographic primitives, and new features to improve performance, among other things. In this paper, we conduct the first study of TLS 1.3 deployment and use since its standardization by the IETF. We use active scans to measure… ▽ More TLS 1.3 marks a significant departure from previous versions of the Transport Layer Security protocol (TLS). The new version offers a simplified protocol flow, more secure cryptographic primitives, and new features to improve performance, among other things. In this paper, we conduct the first study of TLS 1.3 deployment and use since its standardization by the IETF. We use active scans to measure deployment across more than 275M domains, including nearly 90M country-code top-level domains. We establish and investigate the critical contribution that hosting services and CDNs make to the fast, initial uptake of the protocol. We use passive monitoring at two positions on the globe to determine the degree to which users profit from the new protocol and establish the usage of its new features. Finally, we exploit data from a widely deployed measurement app in the Android ecosystem to analyze the use of TLS 1.3 in mobile networks and in mobile browsers. Our study shows that TLS 1.3 enjoys enormous support even in its early days, unprecedented for any TLS version. However, this is strongly related to very few global players pushing it into the market and sustaining its growth. △ Less

Submitted 6 August, 2019; v1 submitted 30 July, 2019; originally announced July 2019.

arXiv:1906.09682 [pdf, other]

Encrypted DNS --> Privacy? A Traffic Analysis Perspective

Authors: Sandra Siby, Marc Juarez, Claudia Diaz, Narseo Vallina-Rodriguez, Carmela Troncoso

Abstract: Virtually every connection to an Internet service is preceded by a DNS lookup which is performed without any traffic-level protection, thus enabling manipulation, redirection, surveillance, and censorship. To address these issues, large organizations such as Google and Cloudflare are deploying recently standardized protocols that encrypt DNS traffic between end users and recursive resolvers such a… ▽ More Virtually every connection to an Internet service is preceded by a DNS lookup which is performed without any traffic-level protection, thus enabling manipulation, redirection, surveillance, and censorship. To address these issues, large organizations such as Google and Cloudflare are deploying recently standardized protocols that encrypt DNS traffic between end users and recursive resolvers such as DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH). In this paper, we examine whether encrypting DNS traffic can protect users from traffic analysis-based monitoring and censoring. We propose a novel feature set to perform the attacks, as those used to attack HTTPS or Tor traffic are not suitable for DNS' characteristics. We show that traffic analysis enables the identification of domains with high accuracy in closed and open world settings, using 124 times less data than attacks on HTTPS flows. We find that factors such as location, resolver, platform, or client do mitigate the attacks performance but they are far from completely stop** them. Our results indicate that DNS-based censorship is still possible on encrypted DNS traffic. In fact, we demonstrate that the standardized padding schemes are not effective. Yet, Tor -- which does not effectively mitigate traffic analysis attacks on web traffic -- is a good defense against DoH traffic analysis. △ Less

Submitted 6 October, 2019; v1 submitted 23 June, 2019; originally announced June 2019.

arXiv:1905.02713 [pdf, other]

An Analysis of Pre-installed Android Software

Authors: Julien Gamba, Mohammed Rashed, Abbas Razaghpanah, Juan Tapiador, Narseo Vallina-Rodriguez

Abstract: The open-source nature of the Android OS makes it possible for manufacturers to ship custom versions of the OS along with a set of pre-installed apps, often for product differentiation. Some device vendors have recently come under scrutiny for potentially invasive private data collection practices and other potentially harmful or unwanted behavior of the pre-installed apps on their devices. Yet, t… ▽ More The open-source nature of the Android OS makes it possible for manufacturers to ship custom versions of the OS along with a set of pre-installed apps, often for product differentiation. Some device vendors have recently come under scrutiny for potentially invasive private data collection practices and other potentially harmful or unwanted behavior of the pre-installed apps on their devices. Yet, the landscape of pre-installed software in Android has largely remained unexplored, particularly in terms of the security and privacy implications of such customizations. In this paper, we present the first large-scale study of pre-installed software on Android devices from more than 200 vendors. Our work relies on a large dataset of real-world Android firmware acquired worldwide using crowd-sourcing methods. This allows us to answer questions related to the stakeholders involved in the supply chain, from device manufacturers and mobile network operators to third-party organizations like advertising and tracking services, and social network platforms. Our study allows us to also uncover relationships between these actors, which seem to revolve primarily around advertising and data-driven services. Overall, the supply chain around Android's open source model lacks transparency and has facilitated potentially harmful behaviors and backdoored access to sensitive data and services without user consent or awareness. We conclude the paper with recommendations to improve transparency, attribution, and accountability in the Android ecosystem. △ Less

Submitted 7 May, 2019; originally announced May 2019.

Journal ref: 41st IEEE Symposium on Security and Privacy, 18-20 May 2020, San Fransisco, CA, USA

arXiv:1810.07780 [pdf, other]

Beyond Google Play: A Large-Scale Comparative Study of Chinese Android App Markets

Authors: Haoyu Wang, Zhe Liu, **gyue Liang, Narseo Vallina-Rodriguez, Yao Guo, Li Li, Juan Tapiador, **gcun Cao, Guoai Xu

Abstract: China is one of the largest Android markets in the world. As Chinese users cannot access Google Play to buy and install Android apps, a number of independent app stores have emerged and compete in the Chinese app market. Some of the Chinese app stores are pre-installed vendor-specific app markets (e.g., Huawei, Xiaomi and OPPO), whereas others are maintained by large tech companies (e.g., Baidu, Q… ▽ More China is one of the largest Android markets in the world. As Chinese users cannot access Google Play to buy and install Android apps, a number of independent app stores have emerged and compete in the Chinese app market. Some of the Chinese app stores are pre-installed vendor-specific app markets (e.g., Huawei, Xiaomi and OPPO), whereas others are maintained by large tech companies (e.g., Baidu, Qihoo 360 and Tencent). The nature of these app stores and the content available through them vary greatly, including their trustworthiness and security guarantees. As of today, the research community has not studied the Chinese Android ecosystem in depth. To fill this gap, we present the first large-scale comparative study that covers more than 6 million Android apps downloaded from 16 Chinese app markets and Google Play. We focus our study on catalog similarity across app stores, their features, publishing dynamics, and the prevalence of various forms of misbehavior (including the presence of fake, cloned and malicious apps). Our findings also suggest heterogeneous developer behavior across app stores, in terms of code maintenance, use of third-party services, and so forth. Overall, Chinese app markets perform substantially worse when taking active measures to protect mobile users and legit developers from deceptive and abusive actors, showing a significantly higher prevalence of malware, fake, and cloned apps than Google Play. △ Less

Submitted 26 September, 2018; originally announced October 2018.

Comments: To appear in the Proceedings of the 2018 Internet Measurement Conference (IMC)

arXiv:1805.11506 [pdf, other]

doi 10.1145/3278532.3278574

A Long Way to the Top: Significance, Structure, and Stability of Internet Top Lists

Authors: Quirin Scheitle, Oliver Hohlfeld, Julien Gamba, Jonas Jelten, Torsten Zimmermann, Stephen D. Strowes, Narseo Vallina-Rodriguez

Abstract: A broad range of research areas including Internet measurement, privacy, and network security rely on lists of target domains to be analysed; researchers make use of target lists for reasons of necessity or efficiency. The popular Alexa list of one million domains is a widely used example. Despite their prevalence in research papers, the soundness of top lists has seldom been questioned by the com… ▽ More A broad range of research areas including Internet measurement, privacy, and network security rely on lists of target domains to be analysed; researchers make use of target lists for reasons of necessity or efficiency. The popular Alexa list of one million domains is a widely used example. Despite their prevalence in research papers, the soundness of top lists has seldom been questioned by the community: little is known about the lists' creation, representativity, potential biases, stability, or overlap between lists. In this study we survey the extent, nature, and evolution of top lists used by research communities. We assess the structure and stability of these lists, and show that rank manipulation is possible for some lists. We also reproduce the results of several scientific studies to assess the impact of using a top list at all, which list specifically, and the date of list creation. We find that (i) top lists generally overestimate results compared to the general population by a significant margin, often even an order of magnitude, and (ii) some top lists have surprising change characteristics, causing high day-to-day fluctuation and leading to result instability. We conclude our paper with specific recommendations on the use of top lists, and how to interpret results based on top lists with caution. △ Less

Submitted 23 September, 2018; v1 submitted 29 May, 2018; originally announced May 2018.

Comments: To be published at ACM IMC 2018. Web site with live data under: https://toplists.github.io

arXiv:1609.07190 [pdf, other]

Tracking the Trackers: Towards Understanding the Mobile Advertising and Tracking Ecosystem

Authors: Narseo Vallina-Rodriguez, Srikanth Sundaresan, Abbas Razaghpanah, Rishab Nithyanand, Mark Allman, Christian Kreibich, Phillipa Gill

Abstract: Third-party services form an integral part of the mobile ecosystem: they allow app developers to add features such as performance analytics and social network integration, and to monetize their apps by enabling user tracking and targeted ad delivery. At present users, researchers, and regulators all have at best limited understanding of this third-party ecosystem. In this paper we seek to shrink t… ▽ More Third-party services form an integral part of the mobile ecosystem: they allow app developers to add features such as performance analytics and social network integration, and to monetize their apps by enabling user tracking and targeted ad delivery. At present users, researchers, and regulators all have at best limited understanding of this third-party ecosystem. In this paper we seek to shrink this gap. Using data from users of our ICSI Haystack app we gain a rich view of the mobile ecosystem: we identify and characterize domains associated with mobile advertising and user tracking, thereby taking an important step towards greater transparency. We furthermore outline our steps towards a public catalog and census of analytics services, their behavior, their personal data collection processes, and their use across mobile apps. △ Less

Submitted 26 October, 2016; v1 submitted 22 September, 2016; originally announced September 2016.

arXiv:1605.05606 [pdf, other]

doi 10.1145/2987443.2987474

A Multi-perspective Analysis of Carrier-Grade NAT Deployment

Authors: Philipp Richter, Florian Wohlfart, Narseo Vallina-Rodriguez, Mark Allman, Randy Bush, Anja Feldmann, Christian Kreibich, Nicholas Weaver, Vern Paxson

Abstract: As ISPs face IPv4 address scarcity they increasingly turn to network address translation (NAT) to accommodate the address needs of their customers. Recently, ISPs have moved beyond employing NATs only directly at individual customers and instead begun deploying Carrier-Grade NATs (CGNs) to apply address translation to many independent and disparate endpoints spanning physical locations, a phenomen… ▽ More As ISPs face IPv4 address scarcity they increasingly turn to network address translation (NAT) to accommodate the address needs of their customers. Recently, ISPs have moved beyond employing NATs only directly at individual customers and instead begun deploying Carrier-Grade NATs (CGNs) to apply address translation to many independent and disparate endpoints spanning physical locations, a phenomenon that so far has received little in the way of empirical assessment. In this work we present a broad and systematic study of the deployment and behavior of these middleboxes. We develop a methodology to detect the existence of hosts behind CGNs by extracting non-routable IP addresses from peer lists we obtain by crawling the BitTorrent DHT. We complement this approach with improvements to our Netalyzr troubleshooting service, enabling us to determine a range of indicators of CGN presence as well as detailed insights into key properties of CGNs. Combining the two data sources we illustrate the scope of CGN deployment on today's Internet, and report on characteristics of commonly deployed CGNs and their effect on end users. △ Less

Submitted 13 September, 2016; v1 submitted 18 May, 2016; originally announced May 2016.

Journal ref: Proceedings of ACM IMC 2016

arXiv:1605.05077 [pdf, other]

Ad-Blocking and Counter Blocking: A Slice of the Arms Race

Authors: Rishab Nithyanand, Sheharbano Khattak, Mobin Javed, Narseo Vallina-Rodriguez, Marjan Falahrastegar, Julia E. Powles, Emiliano De Cristofaro, Hamed Haddadi, Steven J. Murdoch

Abstract: Adblocking tools like Adblock Plus continue to rise in popularity, potentially threatening the dynamics of advertising revenue streams. In response, a number of publishers have ramped up efforts to develop and deploy mechanisms for detecting and/or counter-blocking adblockers (which we refer to as anti-adblockers), effectively escalating the online advertising arms race. In this paper, we develop… ▽ More Adblocking tools like Adblock Plus continue to rise in popularity, potentially threatening the dynamics of advertising revenue streams. In response, a number of publishers have ramped up efforts to develop and deploy mechanisms for detecting and/or counter-blocking adblockers (which we refer to as anti-adblockers), effectively escalating the online advertising arms race. In this paper, we develop a scalable approach for identifying third-party services shared across multiple web-sites and use it to provide a first characterization of anti-adblocking across the Alexa Top-5K websites. We map websites that perform anti-adblocking as well as the entities that provide anti-adblocking scripts. We study the modus operandi of these scripts and their impact on popular adblockers. We find that at least 6.7% of websites in the Alexa Top-5K use anti-adblocking scripts, acquired from 12 distinct entities -- some of which have a direct interest in nourishing the online advertising industry. △ Less

Submitted 20 July, 2016; v1 submitted 17 May, 2016; originally announced May 2016.

Comments: To appear in the Proceedings of the 6th USENIX Workshop on Free and Open Communications on the Internet (FOCI 2016)

arXiv:1510.01419 [pdf, other]

Haystack: A Multi-Purpose Mobile Vantage Point in User Space

Authors: Abbas Razaghpanah, Narseo Vallina-Rodriguez, Srikanth Sundaresan, Christian Kreibich, Phillipa Gill, Mark Allman, Vern Paxson

Abstract: Despite our growing reliance on mobile phones for a wide range of daily tasks, their operation remains largely opaque. A number of previous studies have addressed elements of this problem in a partial fashion, trading off analytic comprehensiveness and deployment scale. We overcome the barriers to large-scale deployment (e.g., requiring rooted devices) and comprehensiveness of previous efforts by… ▽ More Despite our growing reliance on mobile phones for a wide range of daily tasks, their operation remains largely opaque. A number of previous studies have addressed elements of this problem in a partial fashion, trading off analytic comprehensiveness and deployment scale. We overcome the barriers to large-scale deployment (e.g., requiring rooted devices) and comprehensiveness of previous efforts by taking a novel approach that leverages the VPN API on mobile devices to design Haystack, an in-situ mobile measurement platform that operates exclusively on the device, providing full access to the device's network traffic and local context without requiring root access. We present the design of Haystack and its implementation in an Android app that we deploy via standard distribution channels. Using data collected from 450 users of the app, we exemplify the advantages of Haystack over the state of the art and demonstrate its seamless experience even under demanding conditions. We also demonstrate its utility to users and researchers in characterizing mobile traffic and privacy risks. △ Less

Submitted 29 October, 2016; v1 submitted 5 October, 2015; originally announced October 2015.

Comments: 13 pages incl. figures

Showing 1–18 of 18 results for author: Vallina-Rodriguez, N