Federated Evaluation and Tuning for On-Device Personalization: System Design & Applications

Authors: Matthias Paulik, Matt Seigel, Henry Mason, Dominic Telaar, Joris Kluivers, Rogier van Dalen, Chi Wai Lau, Luke Carlson, Filip Granqvist, Chris Vandevelde, Sudeep Agarwal, Julien Freudiger, Andrew Byde, Abhishek Bhowmick, Gaurav Kapoor, Si Beaumont, Áine Cahill, Dominic Hughes, Omid Javidbakht, Fei Dong, Rehan Rishi, Stanley Hung

Abstract: We describe the design of our federated task processing system. Originally, the system was created to support two specific federated tasks: evaluation and tuning of on-device ML systems, primarily for the purpose of personalizing these systems. In recent years, support for an additional federated task has been added: federated learning (FL) of deep neural networks. To our knowledge, only one other system has been described in the literature that supports FL at scale. We include comparisons to that system to help discuss design decisions and attached trade-offs. Finally, we describe two specific large-scale personalization use cases in detail to showcase the applicability of federated tuning to on-device personalization and to highlight application-specific solutions.

Submitted 16 February, 2021; originally announced February 2021.

Comments: 11 pages, 1 figure

arXiv:1812.00984

Protection Against Reconstruction and Its Applications in Private Federated Learning

Authors: Abhishek Bhowmick, John Duchi, Julien Freudiger, Gaurav Kapoor, Ryan Rogers

Abstract: In large-scale statistical learning, data collection and model fitting are moving increasingly toward peripheral devices (phones, watches, fitness trackers) and away from centralized data collection. Concomitant with this rise in decentralized data are increasing challenges of maintaining privacy while allowing enough information to fit accurate, useful statistical models. This motivates local notions of privacy, most significantly local differential privacy, where data is obfuscated before a statistician or learner can even observe it, providing strong protections against sensitive data disclosures. Yet local privacy as traditionally employed may prove too stringent for practical use, especially in modern high-dimensional statistical and machine learning problems. Consequently, we revisit the types of disclosures and adversaries against which we provide protections, considering adversaries with limited prior information and ensuring that, with high probability, they cannot reconstruct an individual's data within useful tolerances. By reconceptualizing these protections, we allow more useful data release (large privacy parameters in local differential privacy), and we design new (minimax) optimal locally differentially private mechanisms for statistical learning problems for all privacy levels. We thus present practicable approaches to large-scale locally private model training that were previously impossible, showing theoretically and empirically that we can fit large-scale image classification and language models with little degradation in utility.

Submitted 3 June, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

arXiv:1502.05337

Controlled Data Sharing for Collaborative Predictive Blacklisting

Authors: Julien Freudiger, Emiliano De Cristofaro, Alex Brito

Abstract: Although sharing data across organizations is often advocated as a promising way to enhance cybersecurity, collaborative initiatives are rarely put into practice owing to confidentiality, trust, and liability challenges. In this paper, we investigate whether collaborative threat mitigation can be realized via a controlled data sharing approach, whereby organizations make informed decisions as to whether or not, and how much, to share. Using appropriate cryptographic tools, entities can estimate the benefits of collaboration and agree on what to share in a privacy-preserving way, without having to disclose their datasets. We focus on collaborative predictive blacklisting, i.e., forecasting attack sources based on one's logs and those contributed by other organizations. We study the impact of different sharing strategies by experimenting on a real-world dataset of two billion suspicious IP addresses collected from Dshield over two months. We find that controlled data sharing yields up to 105% accuracy improvement on average, while also reducing the false positive rate.

Submitted 16 April, 2015; v1 submitted 18 February, 2015; originally announced February 2015.

Comments: A preliminary version of this paper appears in DIMVA 2015. This is the full version. arXiv admin note: substantial text overlap with arXiv:1403.2123

arXiv:1405.1328

What's the Gist? Privacy-Preserving Aggregation of User Profiles

Authors: Igor Bilogrevic, Julien Freudiger, Emiliano De Cristofaro, Ersin Uzun

Abstract: Over the past few years, online service providers have started gathering increasing amounts of personal information to build user profiles and monetize them with advertisers and data brokers. Users have little control of what information is processed and are often left with an all-or-nothing decision between receiving free services or refusing to be profiled. This paper explores an alternative approach where users only disclose an aggregate model -- the "gist" -- of their data. We aim to preserve data utility and simultaneously provide user privacy. We show that this approach can be efficiently supported by letting users contribute encrypted and differentially-private data to an aggregator. The aggregator combines encrypted contributions and can only extract an aggregate model of the underlying data. We evaluate our framework on a dataset of 100,000 U.S. users obtained from the U.S. Census Bureau and show that (i) it provides accurate aggregates with as few as 100 users, (ii) it generates revenue for both users and data brokers, and (iii) its overhead is appreciably low.

Submitted 25 June, 2014; v1 submitted 6 May, 2014; originally announced May 2014.

Comments: To appear in the Proceedings of ESORICS 2014

arXiv:1403.2123

Privacy-Friendly Collaboration for Cyber Threat Mitigation

Authors: Julien Freudiger, Emiliano De Cristofaro, Alex Brito

Abstract: Sharing of security data across organizational boundaries has often been advocated as a promising way to enhance cyber threat mitigation. However, collaborative security faces a number of important challenges, including privacy, trust, and liability concerns with the potential disclosure of sensitive data. In this paper, we focus on data sharing for predictive blacklisting, i.e., forecasting attack sources based on past attack information. We propose a novel privacy-enhanced data sharing approach in which organizations estimate collaboration benefits without disclosing their datasets, organize into coalitions of allied organizations, and securely share data within these coalitions. We study how different partner selection strategies affect prediction accuracy by experimenting on a real-world dataset of 2 billion IP addresses and observe up to a 105% prediction improvement.

Submitted 1 March, 2017; v1 submitted 9 March, 2014; originally announced March 2014.

Comments: This paper has been withdrawn as it has been superseded by arXiv:1502.05337

arXiv:1309.5344

A Comparative Usability Study of Two-Factor Authentication

Authors: Emiliano De Cristofaro, Honglu Du, Julien Freudiger, Greg Norcie

Abstract: Two-factor authentication (2F) aims to enhance resilience of password-based authentication by requiring users to provide an additional authentication factor, e.g., a code generated by a security token. However, it also introduces non-negligible costs for service providers and requires users to carry out additional actions during the authentication process. In this paper, we present an exploratory comparative study of the usability of 2F technologies. First, we conduct a pre-study interview to identify popular technologies as well as contexts and motivations in which they are used. We then present the results of a quantitative study based on a survey completed by 219 Mechanical Turk users, aiming to measure the usability of three popular 2F solutions: codes generated by security tokens, one-time PINs received via email or SMS, and dedicated smartphone apps (e.g., Google Authenticator). We record contexts and motivations, and study their impact on perceived usability. We find that 2F technologies are overall perceived as usable, regardless of motivation and/or context of use. We also present an exploratory factor analysis, highlighting that three metrics -- ease-of-use, required cognitive efforts, and trustworthiness -- are enough to capture key factors affecting 2F usability.

Submitted 31 January, 2014; v1 submitted 20 September, 2013; originally announced September 2013.

Comments: A preliminary version of this paper appears in USEC 2014

arXiv:0912.5391

doi 10.1109/MCOM.2008.4689252

Secure Vehicular Communication Systems: Design and Architecture

Authors: P. Papadimitratos, L. Buttyan, T. Holczer, E. Schoch, J. Freudiger, M. Raya, Z. Ma, F. Kargl, A. Kung, J. -P. Hubaux

Abstract: Significant developments have taken place over the past few years in the area of vehicular communication (VC) systems. Now, it is well understood in the community that security and protection of private user information are a prerequisite for the deployment of the technology. This is so, precisely because the benefits of VC systems, with the mission to enhance transportation safety and efficiency, are at stake. Without the integration of strong and practical security and privacy enhancing mechanisms, VC systems could be disrupted or disabled, even by relatively unsophisticated attackers. We address this problem within the SeVeCom project, having developed a security architecture that provides a comprehensive and practical solution. We present our results in a set of two papers in this issue. In this first one, we analyze threats and types of adversaries, we identify security and privacy requirements, and we present a spectrum of mechanisms to secure VC systems. We provide a solution that can be quickly adopted and deployed. In the second paper, we present our progress towards the implementation of our architecture and results on the performance of the secure VC system, along with a discussion of upcoming research challenges and our related current results.

Submitted 30 December, 2009; originally announced December 2009.

Journal ref: IEEE Communications Magazine, vol. 46, no. 11, pp. 100--109, November 2008

Showing 1–7 of 7 results for author: Freudiger, J