Skip to main content

Showing 1–50 of 57 results for author: McMahan, B

.
  1. arXiv:2407.07737  [pdf, other

    cs.LG cs.CL cs.CR cs.DC

    Fine-Tuning Large Language Models with User-Level Differential Privacy

    Authors: Zachary Charles, Arun Ganesh, Ryan McKenna, H. Brendan McMahan, Nicole Mitchell, Krishna Pillutla, Keith Rush

    Abstract: We investigate practical and scalable algorithms for training large language models (LLMs) with user-level differential privacy (DP) in order to provably safeguard all the examples contributed by each user. We study two variants of DP-SGD with: (1) example-level sampling (ELS) and per-example gradient clip**, and (2) user-level sampling (ULS) and per-user gradient clip**. We derive a novel use… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  2. arXiv:2404.16706  [pdf, other

    cs.DS cs.CC cs.CR cs.LG

    Efficient and Near-Optimal Noise Generation for Streaming Differential Privacy

    Authors: Krishnamurthy Dvijotham, H. Brendan McMahan, Krishna Pillutla, Thomas Steinke, Abhradeep Thakurta

    Abstract: In the task of differentially private (DP) continual counting, we receive a stream of increments and our goal is to output an approximate running total of these increments, without revealing too much about any specific increment. Despite its simplicity, differentially private continual counting has attracted significant attention both in theory and in practice. Existing algorithms for differential… ▽ More

    Submitted 6 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  3. arXiv:2404.10764  [pdf, other

    cs.CR cs.LG

    Confidential Federated Computations

    Authors: Hubert Eichner, Daniel Ramage, Kallista Bonawitz, Dzmitry Huba, Tiziano Santoro, Brett McLarnon, Timon Van Overveldt, Nova Fallen, Peter Kairouz, Albert Cheu, Katharine Daly, Adria Gascon, Marco Gruteser, Brendan McMahan

    Abstract: Federated Learning and Analytics (FLA) have seen widespread adoption by technology platforms for processing sensitive on-device data. However, basic FLA systems have privacy limitations: they do not necessarily require anonymization mechanisms like differential privacy (DP), and provide limited protections against a potentially malicious service provider. Adding DP to a basic FLA system currently… ▽ More

    Submitted 16 April, 2024; originally announced April 2024.

  4. arXiv:2306.08153  [pdf, other

    cs.LG cs.CR

    (Amplified) Banded Matrix Factorization: A unified approach to private training

    Authors: Christopher A. Choquette-Choo, Arun Ganesh, Ryan McKenna, H. Brendan McMahan, Keith Rush, Abhradeep Thakurta, Zheng Xu

    Abstract: Matrix factorization (MF) mechanisms for differential privacy (DP) have substantially improved the state-of-the-art in privacy-utility-computation tradeoffs for ML applications in a variety of scenarios, but in both the centralized and federated settings there remain instances where either MF cannot be easily applied, or other algorithms provide better tradeoffs (typically, as $ε$ becomes small).… ▽ More

    Submitted 1 November, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: 34 pages, 13 figures

  5. arXiv:2305.18465  [pdf, other

    cs.LG cs.CR

    Federated Learning of Gboard Language Models with Differential Privacy

    Authors: Zheng Xu, Yanxiang Zhang, Galen Andrew, Christopher A. Choquette-Choo, Peter Kairouz, H. Brendan McMahan, Jesse Rosenstock, Yuanbo Zhang

    Abstract: We train language models (LMs) with federated learning (FL) and differential privacy (DP) in the Google Keyboard (Gboard). We apply the DP-Follow-the-Regularized-Leader (DP-FTRL)~\citep{kairouz21b} algorithm to achieve meaningfully formal DP guarantees without requiring uniform sampling of client devices. To provide favorable privacy-utility trade-offs, we introduce a new client participation crit… ▽ More

    Submitted 17 July, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: ACL industry track; v2 updating SecAgg details

  6. arXiv:2305.18447  [pdf, other

    cs.LG cs.CR cs.IT math.ST

    Unleashing the Power of Randomization in Auditing Differentially Private ML

    Authors: Krishna Pillutla, Galen Andrew, Peter Kairouz, H. Brendan McMahan, Alina Oprea, Sewoong Oh

    Abstract: We present a rigorous methodology for auditing differentially private machine learning algorithms by adding multiple carefully designed examples called canaries. We take a first principles approach based on three key components. First, we introduce Lifted Differential Privacy (LiDP) that expands the definition of differential privacy to handle randomized datasets. This gives us the freedom to desi… ▽ More

    Submitted 28 May, 2023; originally announced May 2023.

  7. arXiv:2305.12132  [pdf, other

    cs.LG

    Can Public Large Language Models Help Private Cross-device Federated Learning?

    Authors: Boxin Wang, Yibo Jacky Zhang, Yuan Cao, Bo Li, H. Brendan McMahan, Sewoong Oh, Zheng Xu, Manzil Zaheer

    Abstract: We study (differentially) private federated learning (FL) of language models. The language models in cross-device FL are relatively small, which can be trained with meaningful formal user-level differential privacy (DP) guarantees when massive parallelism in training is enabled by the participation of a moderate size of users. Recently, public data has been used to improve privacy-utility trade-of… ▽ More

    Submitted 12 April, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: Published at Findings of NAACL 2024

  8. arXiv:2303.10218  [pdf, other

    cs.LG cs.AI

    An Empirical Evaluation of Federated Contextual Bandit Algorithms

    Authors: Alekh Agarwal, H. Brendan McMahan, Zheng Xu

    Abstract: As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be difficult to acquire in many tasks. We approach such problems with the framework of federated contex… ▽ More

    Submitted 17 March, 2023; originally announced March 2023.

  9. arXiv:2303.00654  [pdf, other

    cs.LG cs.CR stat.ML

    How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy

    Authors: Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, Abhradeep Thakurta

    Abstract: ML models are ubiquitous in real world applications and are a constant focus of research. At the same time, the community has started to realize the importance of protecting the privacy of ML training data. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP t… ▽ More

    Submitted 31 July, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Journal ref: Journal of Artificial Intelligence Research 77 (2023) 1113-1201

  10. arXiv:2302.03098  [pdf, other

    cs.LG cs.CR

    One-shot Empirical Privacy Estimation for Federated Learning

    Authors: Galen Andrew, Peter Kairouz, Sewoong Oh, Alina Oprea, H. Brendan McMahan, Vinith M. Suriyakumar

    Abstract: Privacy estimation techniques for differentially private (DP) algorithms are useful for comparing against analytical bounds, or to empirically measure privacy loss in settings where known analytical bounds are not tight. However, existing privacy auditing techniques usually make strong assumptions on the adversary (e.g., knowledge of intermediate model iterates or the training data distribution),… ▽ More

    Submitted 18 April, 2024; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Final revision, oral presentation at ICLR 2024

  11. arXiv:2302.01463  [pdf, other

    cs.LG

    Gradient Descent with Linearly Correlated Noise: Theory and Applications to Differential Privacy

    Authors: Anastasia Koloskova, Ryan McKenna, Zachary Charles, Keith Rush, Brendan McMahan

    Abstract: We study gradient descent under linearly correlated noise. Our work is motivated by recent practical methods for optimization with differential privacy (DP), such as DP-FTRL, which achieve strong performance in settings where privacy amplification techniques are infeasible (such as in federated learning). These methods inject privacy noise through a matrix factorization mechanism, making the noise… ▽ More

    Submitted 15 January, 2024; v1 submitted 2 February, 2023; originally announced February 2023.

  12. arXiv:2212.00309  [pdf, other

    cs.LG cs.CR

    Differentially Private Adaptive Optimization with Delayed Preconditioners

    Authors: Tian Li, Manzil Zaheer, Ken Ziyu Liu, Sashank J. Reddi, H. Brendan McMahan, Virginia Smith

    Abstract: Privacy noise may negate the benefits of using adaptive optimizers in differentially private model training. Prior works typically address this issue by using auxiliary information (e.g., public data) to boost the effectiveness of adaptive optimization. In this work, we explore techniques to estimate and efficiently adapt to gradient geometry in private adaptive optimization without auxiliary data… ▽ More

    Submitted 7 June, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted by ICLR 2023

  13. arXiv:2211.10844  [pdf, other

    cs.LG cs.CR cs.CV

    Learning to Generate Image Embeddings with User-level Differential Privacy

    Authors: Zheng Xu, Maxwell Collins, Yuxiao Wang, Liviu Panait, Sewoong Oh, Sean Augenstein, Ting Liu, Florian Schroff, H. Brendan McMahan

    Abstract: Small on-device models have been successfully trained with user-level differential privacy (DP) for next word prediction and image classification tasks in the past. However, existing methods can fail when directly applied to learn embedding models using supervised training data with a large class space. To achieve user-level DP for large image-to-embedding feature extractors, we propose DP-FedEmb,… ▽ More

    Submitted 31 March, 2023; v1 submitted 19 November, 2022; originally announced November 2022.

    Comments: CVPR camera ready. Addressed reviewer comments. Switched from add-or-remove-one DP to substitute-one DP

  14. arXiv:2211.06530  [pdf, other

    cs.LG cs.CR cs.DS stat.ML

    Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

    Authors: Christopher A. Choquette-Choo, H. Brendan McMahan, Keith Rush, Abhradeep Thakurta

    Abstract: We introduce new differentially private (DP) mechanisms for gradient-based machine learning (ML) with multiple passes (epochs) over a dataset, substantially improving the achievable privacy-utility-computation tradeoffs. We formalize the problem of DP mechanisms for adaptive streams with multiple participations and introduce a non-trivial extension of online matrix factorization DP mechanisms to o… ▽ More

    Submitted 8 June, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: 9 pages main-text, 3 figures. 40 pages with 13 figures total

  15. arXiv:2208.09432  [pdf, other

    cs.LG cs.DC

    Federated Select: A Primitive for Communication- and Memory-Efficient Federated Learning

    Authors: Zachary Charles, Kallista Bonawitz, Stanislav Chiknavaryan, Brendan McMahan, Blaise Agüera y Arcas

    Abstract: Federated learning (FL) is a framework for machine learning across heterogeneous client devices in a privacy-preserving fashion. To date, most FL algorithms learn a "global" server model across multiple rounds. At each round, the same server model is broadcast to all participating clients, updated locally, and then aggregated across clients. In this work, we propose a more general procedure in whi… ▽ More

    Submitted 19 August, 2022; originally announced August 2022.

  16. arXiv:2202.08312  [pdf, other

    cs.LG math.OC

    Improved Differential Privacy for SGD via Optimal Private Linear Operators on Adaptive Streams

    Authors: Sergey Denisov, Brendan McMahan, Keith Rush, Adam Smith, Abhradeep Guha Thakurta

    Abstract: Motivated by recent applications requiring differential privacy over adaptive streams, we investigate the question of optimal instantiations of the matrix mechanism in this setting. We prove fundamental theoretical results on the applicability of matrix factorizations to adaptive streams, and provide a parameter-free fixed-point algorithm for computing optimal factorizations. We instantiate this f… ▽ More

    Submitted 17 January, 2023; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: 33 pages, 6 figures. Associated code at https://github.com/google-research/federated/tree/master/dp_matrix_factorization

  17. arXiv:2107.06917  [pdf, other

    cs.LG

    A Field Guide to Federated Optimization

    Authors: Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz , et al. (28 additional authors not shown)

    Abstract: Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and… ▽ More

    Submitted 14 July, 2021; originally announced July 2021.

  18. arXiv:2103.00039  [pdf, other

    cs.CR cs.LG

    Practical and Private (Deep) Learning without Sampling or Shuffling

    Authors: Peter Kairouz, Brendan McMahan, Shuang Song, Om Thakkar, Abhradeep Thakurta, Zheng Xu

    Abstract: We consider training models with differential privacy (DP) using mini-batch gradients. The existing state-of-the-art, Differentially Private Stochastic Gradient Descent (DP-SGD), requires privacy amplification by sampling or shuffling to obtain the best privacy/accuracy/computation trade-offs. Unfortunately, the precise requirements on exact sampling and shuffling can be hard to obtain in importan… ▽ More

    Submitted 10 December, 2021; v1 submitted 26 February, 2021; originally announced March 2021.

  19. arXiv:2009.10031  [pdf, other

    cs.LG cs.CR stat.ML

    Training Production Language Models without Memorizing User Data

    Authors: Swaroop Ramaswamy, Om Thakkar, Rajiv Mathews, Galen Andrew, H. Brendan McMahan, Françoise Beaufays

    Abstract: This paper presents the first consumer-scale next-word prediction (NWP) model trained with Federated Learning (FL) while leveraging the Differentially Private Federated Averaging (DP-FedAvg) technique. There has been prior work on building practical FL infrastructure, including work demonstrating the feasibility of training language models on mobile devices using such infrastructure. It has also b… ▽ More

    Submitted 21 September, 2020; originally announced September 2020.

  20. arXiv:2007.06605  [pdf, other

    cs.LG cs.CR stat.ML

    Privacy Amplification via Random Check-Ins

    Authors: Borja Balle, Peter Kairouz, H. Brendan McMahan, Om Thakkar, Abhradeep Thakurta

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) forms a fundamental building block in many applications for learning over sensitive data. Two standard approaches, privacy amplification by subsampling, and privacy amplification by shuffling, permit adding lower noise in DP-SGD than via naïve schemes. A key assumption in both these approaches is that the elements in the data set can be u… ▽ More

    Submitted 30 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: Updated proof for $(ε_0, δ_0)$-DP local randomizers

  21. arXiv:2007.04428  [pdf, other

    cs.CL

    Discourse Coherence, Reference Grounding and Goal Oriented Dialogue

    Authors: Baber Khalid, Malihe Alikhani, Michael Fellner, Brian McMahan, Matthew Stone

    Abstract: Prior approaches to realizing mixed-initiative human--computer referential communication have adopted information-state or collaborative problem-solving approaches. In this paper, we argue for a new approach, inspired by coherence-based models of discourse such as SDRT \cite{asher-lascarides:2003a}, in which utterances attach to an evolving discourse structure and the associated knowledge graph of… ▽ More

    Submitted 8 July, 2020; originally announced July 2020.

    Comments: Accepted for Publishing at SemDial 2020

  22. arXiv:2003.00295  [pdf, other

    cs.LG cs.DC math.OC stat.ML

    Adaptive Federated Optimization

    Authors: Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan

    Abstract: Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. Standard federated optimization methods such as Federated Averaging (FedAvg) are often difficult to tune and exhibit unfavorable convergence behavior. In non-federated settings, adaptive optimization methods have… ▽ More

    Submitted 8 September, 2021; v1 submitted 29 February, 2020; originally announced March 2020.

    Comments: Published as a conference paper at ICLR 2021

  23. arXiv:2002.07839  [pdf, other

    cs.LG math.OC stat.ML

    Is Local SGD Better than Minibatch SGD?

    Authors: Blake Woodworth, Kumar Kshitij Patel, Sebastian U. Stich, Zhen Dai, Brian Bullins, H. Brendan McMahan, Ohad Shamir, Nathan Srebro

    Abstract: We study local SGD (also known as parallel SGD and federated averaging), a natural and frequently used stochastic distributed optimization method. Its theoretical foundations are currently lacking and we highlight how all existing error guarantees in the convex setting are dominated by a simple baseline, minibatch SGD. (1) For quadratic objectives we prove that local SGD strictly dominates minibat… ▽ More

    Submitted 20 July, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: 29 pages

  24. arXiv:1912.04977  [pdf, other

    cs.LG cs.CR stat.ML

    Advances and Open Problems in Federated Learning

    Authors: Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson , et al. (34 additional authors not shown)

    Abstract: Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while kee** the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs re… ▽ More

    Submitted 8 March, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Published in Foundations and Trends in Machine Learning Vol 4 Issue 1. See: https://www.nowpublishers.com/article/Details/MAL-083

  25. arXiv:1912.00131  [pdf, other

    cs.DC cs.CR cs.LG stat.ML

    Federated Learning with Autotuned Communication-Efficient Secure Aggregation

    Authors: Keith Bonawitz, Fariborz Salehi, Jakub Konečný, Brendan McMahan, Marco Gruteser

    Abstract: Federated Learning enables mobile devices to collaboratively learn a shared inference model while kee** all the training data on a user's device, decoupling the ability to do machine learning from the need to store the data in the cloud. Existing work on federated learning with limited communication demonstrates how random rotation can enable users' model updates to be quantized much more effici… ▽ More

    Submitted 29 November, 2019; originally announced December 2019.

    Comments: 5 pages, 3 figures. To appear at the IEEE Asilomar Conference on Signals, Systems, and Computers 2019

  26. arXiv:1911.07963  [pdf, other

    cs.LG cs.CR stat.ML

    Can You Really Backdoor Federated Learning?

    Authors: Ziteng Sun, Peter Kairouz, Ananda Theertha Suresh, H. Brendan McMahan

    Abstract: The decentralized nature of federated learning makes detecting and defending against adversarial attacks a challenging task. This paper focuses on backdoor attacks in the federated learning setting, where the goal of the adversary is to reduce the performance of the model on targeted tasks while maintaining good performance on the main task. Unlike existing works, we allow non-malicious clients to… ▽ More

    Submitted 2 December, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

    Comments: To appear at the 2nd International Workshop on Federated Learning for Data Privacy and Confidentiality at NeurIPS 2019

  27. arXiv:1911.06679  [pdf, other

    cs.LG stat.ML

    Generative Models for Effective ML on Private, Decentralized Datasets

    Authors: Sean Augenstein, H. Brendan McMahan, Daniel Ramage, Swaroop Ramaswamy, Peter Kairouz, Mingqing Chen, Rajiv Mathews, Blaise Aguera y Arcas

    Abstract: To improve real-world applications of machine learning, experienced modelers develop intuition about their datasets, their models, and how the two interact. Manual inspection of raw data - of representative samples, of outliers, of misclassifications - is an essential tool in a) identifying and fixing problems in the data, b) generating new modeling hypotheses, and c) assigning or refining human-p… ▽ More

    Submitted 4 February, 2020; v1 submitted 15 November, 2019; originally announced November 2019.

    Comments: 26 pages, 8 figures. Camera-ready ICLR 2020 version

  28. arXiv:1905.03871  [pdf, other

    cs.LG stat.ML

    Differentially Private Learning with Adaptive Clip**

    Authors: Galen Andrew, Om Thakkar, H. Brendan McMahan, Swaroop Ramaswamy

    Abstract: Existing approaches for training neural networks with user-level differential privacy (e.g., DP Federated Averaging) in federated learning (FL) settings involve bounding the contribution of each user's model update by clip** it to some constant value. However there is no good a priori setting of the clip** norm across tasks and learning settings: the update norm distribution depends on the mod… ▽ More

    Submitted 9 May, 2022; v1 submitted 9 May, 2019; originally announced May 2019.

    Comments: Accepted to NeurIPS, 2021

  29. arXiv:1904.10120  [pdf, other

    cs.LG stat.ML

    Semi-Cyclic Stochastic Gradient Descent

    Authors: Hubert Eichner, Tomer Koren, H. Brendan McMahan, Nathan Srebro, Kunal Talwar

    Abstract: We consider convex SGD updates with a block-cyclic structure, i.e. where each cycle consists of a small number of blocks, each with many samples from a possibly different, block-specific, distribution. This situation arises, e.g., in Federated Learning where the mobile devices available for updates at different times during the day have different characteristics. We show that such block-cyclic str… ▽ More

    Submitted 22 April, 2019; originally announced April 2019.

  30. arXiv:1904.03257  [pdf, ps, other

    cs.LG cs.DB cs.DC cs.SE stat.ML

    MLSys: The New Frontier of Machine Learning Systems

    Authors: Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood , et al. (44 additional authors not shown)

    Abstract: Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a ne… ▽ More

    Submitted 1 December, 2019; v1 submitted 29 March, 2019; originally announced April 2019.

  31. arXiv:1902.08534  [pdf, other

    cs.CR

    Federated Heavy Hitters Discovery with Differential Privacy

    Authors: Wennan Zhu, Peter Kairouz, Brendan McMahan, Haicheng Sun, Wei Li

    Abstract: The discovery of heavy hitters (most frequent items) in user-generated data streams drives improvements in the app and web ecosystems, but can incur substantial privacy risks if not done with care. To address these risks, we propose a distributed and privacy-preserving algorithm for discovering the heavy hitters in a population of user-generated data streams. We leverage the sampling and threshold… ▽ More

    Submitted 28 February, 2020; v1 submitted 22 February, 2019; originally announced February 2019.

  32. arXiv:1902.01046  [pdf, other

    cs.LG cs.DC stat.ML

    Towards Federated Learning at Scale: System Design

    Authors: Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Konečný, Stefano Mazzocchi, H. Brendan McMahan, Timon Van Overveldt, David Petrou, Daniel Ramage, Jason Roselander

    Abstract: Federated Learning is a distributed machine learning approach which enables model training on a large corpus of decentralized data. We have built a scalable production system for Federated Learning in the domain of mobile devices, based on TensorFlow. In this paper, we describe the resulting high-level design, sketch some of the challenges and their solutions, and touch upon the open problems and… ▽ More

    Submitted 22 March, 2019; v1 submitted 4 February, 2019; originally announced February 2019.

  33. arXiv:1812.07210  [pdf, other

    cs.LG cs.DC stat.ML

    Expanding the Reach of Federated Learning by Reducing Client Resource Requirements

    Authors: Sebastian Caldas, Jakub Konečny, H. Brendan McMahan, Ameet Talwalkar

    Abstract: Communication on heterogeneous edge networks is a fundamental bottleneck in Federated Learning (FL), restricting both model capacity and user participation. To address this issue, we introduce two novel strategies to reduce communication costs: (1) the use of lossy compression on the global model sent server-to-client; and (2) Federated Dropout, which allows users to efficiently train locally on s… ▽ More

    Submitted 8 January, 2019; v1 submitted 18 December, 2018; originally announced December 2018.

  34. arXiv:1812.06210  [pdf, ps, other

    cs.LG stat.ML

    A General Approach to Adding Differential Privacy to Iterative Training Procedures

    Authors: H. Brendan McMahan, Galen Andrew, Ulfar Erlingsson, Steve Chien, Ilya Mironov, Nicolas Papernot, Peter Kairouz

    Abstract: In this work we address the practical challenges of training machine learning models on privacy-sensitive datasets by introducing a modular approach that minimizes changes to training algorithms, provides a variety of configuration strategies for the privacy mechanism, and then isolates and simplifies the critical logic that computes the final privacy guarantees. A key challenge is that training a… ▽ More

    Submitted 4 March, 2019; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: Presented at NeurIPS 2018 workshop on Privacy Preserving Machine Learning; Companion paper to TensorFlow Privacy OSS Library

  35. arXiv:1812.01097  [pdf, other

    cs.LG stat.ML

    LEAF: A Benchmark for Federated Settings

    Authors: Sebastian Caldas, Sai Meher Karthik Duddu, Peter Wu, Tian Li, Jakub Konečný, H. Brendan McMahan, Virginia Smith, Ameet Talwalkar

    Abstract: Modern federated networks, such as those comprised of wearable devices, mobile phones, or autonomous vehicles, generate massive amounts of data each day. This wealth of data can help to learn models that can improve the user experience on each device. However, the scale and heterogeneity of federated data presents new challenges in research areas such as federated learning, meta-learning, and mult… ▽ More

    Submitted 9 December, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

  36. arXiv:1805.10559  [pdf, other

    stat.ML cs.CR cs.LG

    cpSGD: Communication-efficient and differentially-private distributed SGD

    Authors: Naman Agarwal, Ananda Theertha Suresh, Felix Yu, Sanjiv Kumar, H. Brendan Mcmahan

    Abstract: Distributed stochastic gradient descent is an important subroutine in distributed learning. A setting of particular interest is when the clients are mobile devices, where two important concerns are communication efficiency and the privacy of the clients. Several recent works have focused on reducing the communication cost or introducing privacy guarantees, but none of the proposed communication ef… ▽ More

    Submitted 26 May, 2018; originally announced May 2018.

  37. arXiv:1805.10222  [pdf, other

    math.OC cs.LG stat.ML

    Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization

    Authors: Blake Woodworth, Jialei Wang, Adam Smith, Brendan McMahan, Nathan Srebro

    Abstract: We suggest a general oracle-based framework that captures different parallel stochastic optimization settings described by a dependency graph, and derive generic lower bounds in terms of this graph. We then use the framework and derive lower bounds for several specific parallel optimization settings, including delayed updates and parallel processing with intermittent communication. We highlight ga… ▽ More

    Submitted 11 February, 2019; v1 submitted 25 May, 2018; originally announced May 2018.

  38. arXiv:1710.08377  [pdf, other

    cs.SD cs.AI eess.AS

    Listening to the World Improves Speech Command Recognition

    Authors: Brian McMahan, Delip Rao

    Abstract: We study transfer learning in convolutional network architectures applied to the task of recognizing audio, such as environmental sound events and speech commands. Our key finding is that not only is it possible to transfer representations from an unrelated task like environmental sound classification to a voice-focused task like speech command recognition, but also that doing so improves accuraci… ▽ More

    Submitted 23 October, 2017; originally announced October 2017.

    Comments: 8 pages

  39. arXiv:1710.06963  [pdf, other

    cs.LG

    Learning Differentially Private Recurrent Language Models

    Authors: H. Brendan McMahan, Daniel Ramage, Kunal Talwar, Li Zhang

    Abstract: We demonstrate that it is possible to train large recurrent language models with user-level differential privacy guarantees with only a negligible cost in predictive accuracy. Our work builds on recent advances in the training of deep networks on user-partitioned data and privacy accounting for stochastic gradient descent. In particular, we add user-level privacy protection to the federated averag… ▽ More

    Submitted 23 February, 2018; v1 submitted 18 October, 2017; originally announced October 2017.

    Comments: Camera-ready ICLR 2018 version, minor edits from previous

  40. arXiv:1708.08022  [pdf, ps, other

    stat.ML cs.CR cs.LG

    On the Protection of Private Information in Machine Learning Systems: Two Recent Approaches

    Authors: Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Nicolas Papernot, Kunal Talwar, Li Zhang

    Abstract: The recent, remarkable growth of machine learning has led to intense interest in the privacy of the data on which machine learning relies, and to new techniques for preserving privacy. However, older ideas about privacy may well remain valid and useful. This note reviews two recent works on privacy in the light of the wisdom of some of the early literature, in particular the principles distilled b… ▽ More

    Submitted 26 August, 2017; originally announced August 2017.

    Journal ref: IEEE 30th Computer Security Foundations Symposium (CSF), pages 1--6, 2017

  41. arXiv:1611.04482  [pdf, other

    cs.CR stat.ML

    Practical Secure Aggregation for Federated Learning on User-Held Data

    Authors: Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, Karn Seth

    Abstract: Secure Aggregation protocols allow a collection of mutually distrust parties, each holding a private value, to collaboratively compute the sum of those values without revealing the values themselves. We consider training a deep neural network in the Federated Learning model, using distributed stochastic gradient descent across user-held training data on mobile devices, wherein Secure Aggregation p… ▽ More

    Submitted 14 November, 2016; originally announced November 2016.

    Comments: 5 pages, 1 figure. To appear at the NIPS 2016 workshop on Private Multi-Party Machine Learning

  42. arXiv:1611.00429  [pdf, ps, other

    cs.LG

    Distributed Mean Estimation with Limited Communication

    Authors: Ananda Theertha Suresh, Felix X. Yu, Sanjiv Kumar, H. Brendan McMahan

    Abstract: Motivated by the need for distributed learning and optimization algorithms with low communication cost, we study communication efficient algorithms for distributed mean estimation. Unlike previous works, we make no probabilistic assumptions on the data. We first show that for $d$ dimensional data with $n$ clients, a naive stochastic binary rounding approach yields a mean squared error (MSE) of… ▽ More

    Submitted 25 September, 2017; v1 submitted 1 November, 2016; originally announced November 2016.

  43. arXiv:1610.05492  [pdf, other

    cs.LG

    Federated Learning: Strategies for Improving Communication Efficiency

    Authors: Jakub Konečný, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, Dave Bacon

    Abstract: Federated Learning is a machine learning setting where the goal is to train a high-quality centralized model while training data remains distributed over a large number of clients each with unreliable and relatively slow network connections. We consider learning algorithms for this setting where on each round, each client independently computes an update to the current model based on its local dat… ▽ More

    Submitted 30 October, 2017; v1 submitted 18 October, 2016; originally announced October 2016.

  44. arXiv:1610.02527  [pdf, other

    cs.LG

    Federated Optimization: Distributed Machine Learning for On-Device Intelligence

    Authors: Jakub Konečný, H. Brendan McMahan, Daniel Ramage, Peter Richtárik

    Abstract: We introduce a new and increasingly relevant setting for distributed optimization in machine learning, where the data defining the optimization are unevenly distributed over an extremely large number of nodes. The goal is to train a high-quality centralized model. We refer to this setting as Federated Optimization. In this setting, communication efficiency is of the utmost importance and minimizin… ▽ More

    Submitted 8 October, 2016; originally announced October 2016.

    Comments: 38 pages

  45. arXiv:1607.00133  [pdf, other

    stat.ML cs.CR cs.LG

    Deep Learning with Differential Privacy

    Authors: Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang

    Abstract: Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refin… ▽ More

    Submitted 24 October, 2016; v1 submitted 1 July, 2016; originally announced July 2016.

    Journal ref: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (ACM CCS), pp. 308-318, 2016

  46. arXiv:1602.05629  [pdf, other

    cs.LG

    Communication-Efficient Learning of Deep Networks from Decentralized Data

    Authors: H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas

    Abstract: Modern mobile devices have access to a wealth of data suitable for learning models, which in turn can greatly improve the user experience on the device. For example, language models can improve speech recognition and text entry, and image models can automatically select good photos. However, this rich data is often privacy sensitive, large in quantity, or both, which may preclude logging to the da… ▽ More

    Submitted 26 January, 2023; v1 submitted 17 February, 2016; originally announced February 2016.

    Comments: [v4] Fixes a typo in the FedAvg pseudocode. [v3] Updates the large-scale LSTM experiments, along with other minor changes

    Journal ref: Proceedings of the 20 th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017. JMLR: W&CP volume 54

  47. arXiv:1511.03575  [pdf, ps, other

    cs.LG math.OC

    Federated Optimization:Distributed Optimization Beyond the Datacenter

    Authors: Jakub Konečný, Brendan McMahan, Daniel Ramage

    Abstract: We introduce a new and increasingly relevant setting for distributed optimization in machine learning, where the data defining the optimization are distributed (unevenly) over an extremely large number of \nodes, but the goal remains to train a high-quality centralized model. We refer to this setting as Federated Optimization. In this setting, communication efficiency is of utmost importance. A… ▽ More

    Submitted 11 November, 2015; originally announced November 2015.

    Comments: NIPS workshop version

  48. arXiv:1403.3465  [pdf, other

    cs.LG

    A Survey of Algorithms and Analysis for Adaptive Online Learning

    Authors: H. Brendan McMahan

    Abstract: We present tools for the analysis of Follow-The-Regularized-Leader (FTRL), Dual Averaging, and Mirror Descent algorithms when the regularizer (equivalently, prox-function or learning rate schedule) is chosen adaptively based on the data. Adaptivity can be used to prove regret bounds that hold on every round, and also allows for data-dependent regret bounds as in AdaGrad-style algorithms (e.g., Onl… ▽ More

    Submitted 9 November, 2015; v1 submitted 13 March, 2014; originally announced March 2014.

  49. arXiv:1403.0628  [pdf, ps, other

    cs.LG

    Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations

    Authors: H. Brendan McMahan, Francesco Orabona

    Abstract: We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. We develop a novel characterization of a large class of minimax algorithms, recovering, and even improving, several previous results as immediate corollaries. Moreover, using our tools, we develop an algorithm that provides a regret bound of… ▽ More

    Submitted 21 May, 2014; v1 submitted 3 March, 2014; originally announced March 2014.

    Comments: Proceedings of the 27th Annual Conference on Learning Theory (COLT 2014)

  50. arXiv:1303.4664  [pdf, other

    cs.LG

    Large-Scale Learning with Less RAM via Randomization

    Authors: Daniel Golovin, D. Sculley, H. Brendan McMahan, Michael Young

    Abstract: We reduce the memory footprint of popular large-scale online learning methods by projecting our weight vector onto a coarse discrete set using randomized rounding. Compared to standard 32-bit float encodings, this reduces RAM usage by more than 50% during training and by up to 95% when making predictions from a fixed model, with almost no loss in accuracy. We also show that randomized counting can… ▽ More

    Submitted 19 March, 2013; originally announced March 2013.

    Comments: Extended version of ICML 2013 paper