Search | arXiv e-print repository

Scalable Private Search with Wally

Authors: Hilal Asi, Fabian Boemer, Nicholas Genise, Muhammad Haris Mughees, Tabitha Ogilvie, Rehan Rishi, Guy N. Rothblum, Kunal Talwar, Karl Tarbe, Ruiyu Zhu, Marco Zuliani

Abstract: This paper presents Wally, a private search system that supports efficient semantic and keyword search queries against large databases. When sufficiently many clients are making queries, Wally's performance is significantly better than that of previous systems. In previous private search systems, for each client query, the server must perform at least one expensive cryptographic operation per database entry. As a result, performance degrades proportionally with the number of entries in the database. Wally removes this limitation. Specifically, for each query the server performs cryptographic operations against only a few database entries. We achieve these results by requiring each client to add a few fake queries and to send each query via an anonymous network to the server at independently chosen random instants. Additionally, each client uses somewhat homomorphic encryption (SHE) to hide whether a query is real or fake. Wally provides an $(ε, δ)$-differential privacy guarantee, which is an accepted standard for strong privacy. The number of fake queries each client makes depends inversely on the number of clients making queries. Therefore, the overhead of the fake queries vanishes as the number of clients increases, enabling scalability to millions of queries and large databases. Concretely, Wally can serve 8M requests at a rate of 3,000 queries per second, which is around 60x higher than the state-of-the-art scheme.

Submitted 12 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

arXiv:2109.08189

PrivateFetch: Scalable Catalog Delivery in Privacy-Preserving Advertising

Authors: Muhammad Haris Mughees, Gonçalo Pestana, Alex Davidson, Benjamin Livshits

Abstract: In order to preserve the possibility of an Internet that is free at the point of use, attention is turning to new solutions that would allow targeted advertisement delivery based on behavioral information such as user preferences, without compromising user privacy. Recent explorations in devising such systems either rely on semantic guarantees like $k$-anonymity, which can be easily subverted when combined with alternative information and do not take into account the possibility that even knowledge of such clusters is privacy-invasive in itself, or provide full privacy by moving all data and processing logic to clients, which is prohibitively expensive for both clients and servers. In this work, we devise a new framework called PrivateFetch for building practical ad-delivery pipelines that rely on cryptographic hardness and best-case privacy, rather than syntactic privacy guarantees or reliance on real-world anonymization tools. PrivateFetch utilizes local computation of preferences followed by high-performance single-server private information retrieval (PIR) to ensure that clients can pre-fetch ad content from servers without revealing any of their inherent characteristics to the content provider. For a database of more than 1,000,000 ads, we show that we can deliver 30 ads to a client in 40 seconds, with a total communication cost of 192KB. We also demonstrate the feasibility of PrivateFetch by showing that the monetary cost of running it is less than 1% of average ad revenue.
As such, our system is capable of pre-fetching ads for clients based on behavioral and contextual user information, before displaying them during a typical browsing session. In addition, while we test PrivateFetch as a private ad-delivery system, the generality of our approach means that it could also be used for other content types.

Submitted 16 September, 2021; originally announced September 2021.

arXiv:1811.10296

Distributed and Secure ML with Self-tallying Multi-party Aggregation

Authors: Yunhui Long, Tanmay Gangwani, Haris Mughees, Carl Gunter

Abstract: Privacy-preserving multi-party computation has many applications in areas such as medicine and online advertising. In this work, we propose a framework for distributed, secure machine learning among untrusted individuals. The framework consists of two parts: a two-step training protocol based on homomorphic addition and a zero-knowledge proof for data validity. By combining these two techniques, our framework provides privacy of per-user data, protects against a malicious user contributing corrupted data to the shared pool, enables each user to self-compute the results of the algorithm without relying on external trusted third parties, and requires no private channels between groups of users. We show how different ML algorithms, such as Latent Dirichlet Allocation, Naive Bayes, and Decision Trees, fit our framework for distributed, secure computing.

Submitted 26 November, 2018; originally announced November 2018.

Comments: NeurIPS 2018 Workshop on PPML

arXiv:1605.08763

Smartphone Fingerprinting Via Motion Sensors: Analyzing Feasibility at Large-Scale and Studying Real Usage Patterns

Authors: Anupam Das, Nikita Borisov, Edward Chou, Muhammad Haris Mughees

Abstract: Advertisers are increasingly turning to fingerprinting techniques to track users across the web. As web browsing activity shifts to mobile platforms, traditional browser fingerprinting techniques become less effective; however, device fingerprinting using built-in sensors offers a new avenue for attack. We study the feasibility of using motion sensors to perform device fingerprinting at scale, and explore countermeasures that can be used to protect privacy. We perform a large-scale user study to demonstrate that motion sensor fingerprinting is effective with even 500 users. We also develop a model to estimate prediction accuracy for larger user populations; our model provides a conservative estimate of at least 12% classification accuracy with 100,000 users. We then investigate the use of motion sensors on the web and find, distressingly, that many sites send motion sensor data to servers for storage and analysis, paving the way to potential fingerprinting. Finally, we consider the problem of developing fingerprinting countermeasures; we evaluate a previously proposed obfuscation technique and a newly developed quantization technique via a user study. We find that both techniques are able to drastically reduce fingerprinting accuracy without significantly impacting the utility of the sensors in web applications.

Submitted 13 June, 2016; v1 submitted 27 May, 2016; originally announced May 2016.

arXiv:1605.05841

doi 10.1145/1235

A First Look at Ad-block Detection: A New Arms Race on the Web

Authors: Muhammad Haris Mughees, Zhiyun Qian, Zubair Shafiq, Karishma Dash, Pan Hui

Abstract: The rise of ad-blockers is viewed as an economic threat by online publishers, especially those who primarily rely on advertising to support their services. To address this threat, publishers have started retaliating by employing ad-block detectors, which scout for ad-blocker users and react to them by restricting their content access and pushing them to whitelist the website or disable ad-blockers altogether. The clash between ad-blockers and ad-block detectors has resulted in a new arms race on the web. In this paper, we present the first systematic measurement and analysis of ad-block detection on the web. We have designed and implemented a machine learning based technique to automatically detect ad-block detection, and use it to study the deployment of ad-block detectors on Alexa top-100K websites. The approach is promising, with a precision of 94.8% and a recall of 93.1%. We characterize the spectrum of different strategies used by websites for ad-block detection. We find that most publishers use fairly simple passive approaches for ad-block detection. However, we also note that a few websites use third-party services, e.g. PageFair, for ad-block detection and response. The third-party services use active deception and other sophisticated tactics to detect ad-blockers. We also find that the third-party services can successfully circumvent ad-blockers and display ads on publisher websites.

Submitted 19 May, 2016; originally announced May 2016.

Comments: 12 pages, 12 figures

Showing 1–5 of 5 results for author: Mughees, H