-
Must the Communication Graph of MPC Protocols be an Expander?
Authors:
Elette Boyle,
Ran Cohen,
Deepesh Data,
Pavel Hubáček
Abstract:
Secure multiparty computation (MPC) on incomplete communication networks has been studied within two primary models: (1) Where a partial network is fixed a priori, and thus corruptions can occur dependent on its structure, and (2) Where edges in the communication graph are determined dynamically as part of the protocol. Whereas a rich literature has succeeded in map** out the feasibility and lim…
▽ More
Secure multiparty computation (MPC) on incomplete communication networks has been studied within two primary models: (1) Where a partial network is fixed a priori, and thus corruptions can occur dependent on its structure, and (2) Where edges in the communication graph are determined dynamically as part of the protocol. Whereas a rich literature has succeeded in map** out the feasibility and limitations of graph structures supporting secure computation in the fixed-graph model (including strong classical lower bounds), these bounds do not apply in the latter dynamic-graph setting, which has recently seen exciting new results, but remains relatively unexplored.
In this work, we initiate a similar foundational study of MPC within the dynamic-graph model. As a first step, we investigate the property of graph expansion. All existing protocols (implicitly or explicitly) yield communication graphs which are expanders, but it is not clear whether this is inherent. Our results consist of two types (for constant fraction of corruptions):
* Upper bounds: We demonstrate secure protocols whose induced communication graphs are not expander graphs, within a wide range of settings (computational, information theoretic, with low locality, even with low locality and adaptive security), each assuming some form of input-independent setup.
* Lower bounds: In the plain model (no setup) with adaptive corruptions, we demonstrate that for certain functionalities, no protocol can maintain a non-expanding communication graph against all adversarial strategies. Our lower bound relies only on protocol correctness (not privacy), and requires a surprisingly delicate argument.
More generally, we provide a formal framework for analyzing the evolving communication graph of MPC protocols, giving a starting point for studying the relation between secure computation and further, more general graph properties.
△ Less
Submitted 21 June, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
A Generative Framework for Personalized Learning and Estimation: Theory, Algorithms, and Privacy
Authors:
Kaan Ozkara,
Antonious M. Girgis,
Deepesh Data,
Suhas Diggavi
Abstract:
A distinguishing characteristic of federated learning is that the (local) client data could have statistical heterogeneity. This heterogeneity has motivated the design of personalized learning, where individual (personalized) models are trained, through collaboration. There have been various personalization methods proposed in literature, with seemingly very different forms and methods ranging fro…
▽ More
A distinguishing characteristic of federated learning is that the (local) client data could have statistical heterogeneity. This heterogeneity has motivated the design of personalized learning, where individual (personalized) models are trained, through collaboration. There have been various personalization methods proposed in literature, with seemingly very different forms and methods ranging from use of a single global model for local regularization and model interpolation, to use of multiple global models for personalized clustering, etc. In this work, we begin with a generative framework that could potentially unify several different algorithms as well as suggest new algorithms. We apply our generative framework to personalized estimation, and connect it to the classical empirical Bayes' methodology. We develop private personalized estimation under this framework. We then use our generative framework for learning, which unifies several known personalized FL algorithms and also suggests new ones; we propose and study a new algorithm AdaPeD based on a Knowledge Distillation, which numerically outperforms several known algorithms. We also develop privacy for personalized learning methods with guarantees for user-level privacy and composition. We numerically evaluate the performance as well as the privacy for both the estimation and learning problems, demonstrating the advantages of our proposed methods.
△ Less
Submitted 4 July, 2022;
originally announced July 2022.
-
Flexible Accuracy for Differential Privacy
Authors:
Aman Bansal,
Rahul Chunduru,
Deepesh Data,
Manoj Prabhakaran
Abstract:
Differential Privacy (DP) has become a gold standard in privacy-preserving data analysis. While it provides one of the most rigorous notions of privacy, there are many settings where its applicability is limited.
Our main contribution is in augmenting differential privacy with {\em Flexible Accuracy}, which allows small distortions in the input (e.g., drop** outliers) before measuring accuracy…
▽ More
Differential Privacy (DP) has become a gold standard in privacy-preserving data analysis. While it provides one of the most rigorous notions of privacy, there are many settings where its applicability is limited.
Our main contribution is in augmenting differential privacy with {\em Flexible Accuracy}, which allows small distortions in the input (e.g., drop** outliers) before measuring accuracy of the output, allowing one to extend DP mechanisms to high-sensitivity functions. We present mechanisms that can help in achieving this notion for functions that had no meaningful differentially private mechanisms previously. In particular, we illustrate an application to differentially private histograms, which in turn yields mechanisms for revealing the support of a dataset or the extremal values in the data. Analyses of our constructions exploit new versatile composition theorems that facilitate modular design.
All the above extensions use our new definitional framework, which is in terms of "lossy Wasserstein distance" -- a 2-parameter error measure for distributions. This may be of independent interest.
△ Less
Submitted 18 October, 2021;
originally announced October 2021.
-
QuPeD: Quantized Personalization via Distillation with Applications to Federated Learning
Authors:
Kaan Ozkara,
Navjot Singh,
Deepesh Data,
Suhas Diggavi
Abstract:
Traditionally, federated learning (FL) aims to train a single global model while collaboratively using multiple clients and a server. Two natural challenges that FL algorithms face are heterogeneity in data across clients and collaboration of clients with {\em diverse resources}. In this work, we introduce a \textit{quantized} and \textit{personalized} FL algorithm QuPeD that facilitates collectiv…
▽ More
Traditionally, federated learning (FL) aims to train a single global model while collaboratively using multiple clients and a server. Two natural challenges that FL algorithms face are heterogeneity in data across clients and collaboration of clients with {\em diverse resources}. In this work, we introduce a \textit{quantized} and \textit{personalized} FL algorithm QuPeD that facilitates collective (personalized model compression) training via \textit{knowledge distillation} (KD) among clients who have access to heterogeneous data and resources. For personalization, we allow clients to learn \textit{compressed personalized models} with different quantization parameters and model dimensions/structures. Towards this, first we propose an algorithm for learning quantized models through a relaxed optimization problem, where quantization values are also optimized over. When each client participating in the (federated) learning process has different requirements for the compressed model (both in model dimension and precision), we formulate a compressed personalization framework by introducing knowledge distillation loss for local client objectives collaborating through a global model. We develop an alternating proximal gradient update for solving this compressed personalization problem, and analyze its convergence properties. Numerically, we validate that QuPeD outperforms competing personalized FL methods, FedAvg, and local training of clients in various heterogeneous settings.
△ Less
Submitted 5 July, 2022; v1 submitted 29 July, 2021;
originally announced July 2021.
-
Renyi Differential Privacy of the Subsampled Shuffle Model in Distributed Learning
Authors:
Antonious M. Girgis,
Deepesh Data,
Suhas Diggavi
Abstract:
We study privacy in a distributed learning framework, where clients collaboratively build a learning model iteratively through interactions with a server from whom we need privacy. Motivated by stochastic optimization and the federated learning (FL) paradigm, we focus on the case where a small fraction of data samples are randomly sub-sampled in each round to participate in the learning process, w…
▽ More
We study privacy in a distributed learning framework, where clients collaboratively build a learning model iteratively through interactions with a server from whom we need privacy. Motivated by stochastic optimization and the federated learning (FL) paradigm, we focus on the case where a small fraction of data samples are randomly sub-sampled in each round to participate in the learning process, which also enables privacy amplification. To obtain even stronger local privacy guarantees, we study this in the shuffle privacy model, where each client randomizes its response using a local differentially private (LDP) mechanism and the server only receives a random permutation (shuffle) of the clients' responses without their association to each client. The principal result of this paper is a privacy-optimization performance trade-off for discrete randomization mechanisms in this sub-sampled shuffle privacy model. This is enabled through a new theoretical technique to analyze the Renyi Differential Privacy (RDP) of the sub-sampled shuffle model. We numerically demonstrate that, for important regimes, with composition our bound yields significant improvement in privacy guarantee over the state-of-the-art approximate Differential Privacy (DP) guarantee (with strong composition) for sub-sampled shuffled models. We also demonstrate numerically significant improvement in privacy-learning performance operating point using real data sets.
△ Less
Submitted 19 July, 2021;
originally announced July 2021.
-
A Field Guide to Federated Optimization
Authors:
Jianyu Wang,
Zachary Charles,
Zheng Xu,
Gauri Joshi,
H. Brendan McMahan,
Blaise Aguera y Arcas,
Maruan Al-Shedivat,
Galen Andrew,
Salman Avestimehr,
Katharine Daly,
Deepesh Data,
Suhas Diggavi,
Hubert Eichner,
Advait Gadhikar,
Zachary Garrett,
Antonious M. Girgis,
Filip Hanzely,
Andrew Hard,
Chaoyang He,
Samuel Horvath,
Zhouyuan Huo,
Alex Ingerman,
Martin Jaggi,
Tara Javidi,
Peter Kairouz
, et al. (28 additional authors not shown)
Abstract:
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and…
▽ More
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and other constraints that are not primary considerations in other problem settings. This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms through concrete examples and practical implementation, with a focus on conducting effective simulations to infer real-world performance. The goal of this work is not to survey the current literature, but to inspire researchers and practitioners to design federated learning algorithms that can be used in various practical applications.
△ Less
Submitted 14 July, 2021;
originally announced July 2021.
-
On the Renyi Differential Privacy of the Shuffle Model
Authors:
Antonious M. Girgis,
Deepesh Data,
Suhas Diggavi,
Ananda Theertha Suresh,
Peter Kairouz
Abstract:
The central question studied in this paper is Renyi Differential Privacy (RDP) guarantees for general discrete local mechanisms in the shuffle privacy model. In the shuffle model, each of the $n$ clients randomizes its response using a local differentially private (LDP) mechanism and the untrusted server only receives a random permutation (shuffle) of the client responses without association to ea…
▽ More
The central question studied in this paper is Renyi Differential Privacy (RDP) guarantees for general discrete local mechanisms in the shuffle privacy model. In the shuffle model, each of the $n$ clients randomizes its response using a local differentially private (LDP) mechanism and the untrusted server only receives a random permutation (shuffle) of the client responses without association to each client. The principal result in this paper is the first non-trivial RDP guarantee for general discrete local randomization mechanisms in the shuffled privacy model, and we develop new analysis techniques for deriving our results which could be of independent interest. In applications, such an RDP guarantee is most useful when we use it for composing several private interactions. We numerically demonstrate that, for important regimes, with composition our bound yields an improvement in privacy guarantee by a factor of $8\times$ over the state-of-the-art approximate Differential Privacy (DP) guarantee (with standard composition) for shuffled models. Moreover, combining with Poisson subsampling, our result leads to at least $10\times$ improvement over subsampled approximate DP with standard composition.
△ Less
Submitted 11 May, 2021;
originally announced May 2021.
-
QuPeL: Quantized Personalization with Applications to Federated Learning
Authors:
Kaan Ozkara,
Navjot Singh,
Deepesh Data,
Suhas Diggavi
Abstract:
Traditionally, federated learning (FL) aims to train a single global model while collaboratively using multiple clients and a server. Two natural challenges that FL algorithms face are heterogeneity in data across clients and collaboration of clients with {\em diverse resources}. In this work, we introduce a \textit{quantized} and \textit{personalized} FL algorithm QuPeL that facilitates collectiv…
▽ More
Traditionally, federated learning (FL) aims to train a single global model while collaboratively using multiple clients and a server. Two natural challenges that FL algorithms face are heterogeneity in data across clients and collaboration of clients with {\em diverse resources}. In this work, we introduce a \textit{quantized} and \textit{personalized} FL algorithm QuPeL that facilitates collective training with heterogeneous clients while respecting resource diversity. For personalization, we allow clients to learn \textit{compressed personalized models} with different quantization parameters depending on their resources. Towards this, first we propose an algorithm for learning quantized models through a relaxed optimization problem, where quantization values are also optimized over. When each client participating in the (federated) learning process has different requirements of the quantized model (both in value and precision), we formulate a quantized personalization framework by introducing a penalty term for local client objectives against a globally trained model to encourage collaboration. We develop an alternating proximal gradient update for solving this quantized personalization problem, and we analyze its convergence properties. Numerically, we show that optimizing over the quantization levels increases the performance and we validate that QuPeL outperforms both FedAvg and local training of clients in a heterogeneous setting.
△ Less
Submitted 23 February, 2021;
originally announced February 2021.
-
Shuffled Model of Federated Learning: Privacy, Communication and Accuracy Trade-offs
Authors:
Antonious M. Girgis,
Deepesh Data,
Suhas Diggavi,
Peter Kairouz,
Ananda Theertha Suresh
Abstract:
We consider a distributed empirical risk minimization (ERM) optimization problem with communication efficiency and privacy requirements, motivated by the federated learning (FL) framework. Unique challenges to the traditional ERM problem in the context of FL include (i) need to provide privacy guarantees on clients' data, (ii) compress the communication between clients and the server, since client…
▽ More
We consider a distributed empirical risk minimization (ERM) optimization problem with communication efficiency and privacy requirements, motivated by the federated learning (FL) framework. Unique challenges to the traditional ERM problem in the context of FL include (i) need to provide privacy guarantees on clients' data, (ii) compress the communication between clients and the server, since clients might have low-bandwidth links, (iii) work with a dynamic client population at each round of communication between the server and the clients, as a small fraction of clients are sampled at each round. To address these challenges we develop (optimal) communication-efficient schemes for private mean estimation for several $\ell_p$ spaces, enabling efficient gradient aggregation for each iteration of the optimization solution of the ERM. We also provide lower and upper bounds for mean estimation with privacy and communication constraints for arbitrary $\ell_p$ spaces. To get the overall communication, privacy, and optimization performance operation point, we combine this with privacy amplification opportunities inherent to this setup. Our solution takes advantage of the inherent privacy amplification provided by client sampling and data sampling at each client (through Stochastic Gradient Descent) as well as the recently developed privacy framework using anonymization, which effectively presents to the server responses that are randomly shuffled with respect to the clients. Putting these together, we demonstrate that one can get the same privacy, optimization-performance operating point developed in recent methods that use full-precision communication, but at a much lower communication cost, i.e., effectively getting communication efficiency for "free".
△ Less
Submitted 23 September, 2020; v1 submitted 17 August, 2020;
originally announced August 2020.
-
Byzantine-Resilient High-Dimensional Federated Learning
Authors:
Deepesh Data,
Suhas Diggavi
Abstract:
We study stochastic gradient descent (SGD) with local iterations in the presence of malicious/Byzantine clients, motivated by the federated learning. The clients, instead of communicating with the central server in every iteration, maintain their local models, which they update by taking several SGD iterations based on their own datasets and then communicate the net update with the server, thereby…
▽ More
We study stochastic gradient descent (SGD) with local iterations in the presence of malicious/Byzantine clients, motivated by the federated learning. The clients, instead of communicating with the central server in every iteration, maintain their local models, which they update by taking several SGD iterations based on their own datasets and then communicate the net update with the server, thereby achieving communication-efficiency. Furthermore, only a subset of clients communicate with the server, and this subset may be different at different synchronization times. The Byzantine clients may collaborate and send arbitrary vectors to the server to disrupt the learning process. To combat the adversary, we employ an efficient high-dimensional robust mean estimation algorithm from Steinhardt et al.~\cite[ITCS 2018]{Resilience_SCV18} at the server to filter-out corrupt vectors; and to analyze the outlier-filtering procedure, we develop a novel matrix concentration result that may be of independent interest.
We provide convergence analyses for strongly-convex and non-convex smooth objectives in the heterogeneous data setting, where different clients may have different local datasets, and we do not make any probabilistic assumptions on data generation. We believe that ours is the first Byzantine-resilient algorithm and analysis with local iterations. We derive our convergence results under minimal assumptions of bounded variance for SGD and bounded gradient dissimilarity (which captures heterogeneity among local datasets). We also extend our results to the case when clients compute full-batch gradients.
△ Less
Submitted 16 August, 2020; v1 submitted 21 June, 2020;
originally announced June 2020.
-
Successive Refinement of Privacy
Authors:
Antonious M. Girgis,
Deepesh Data,
Kamalika Chaudhuri,
Christina Fragouli,
Suhas Diggavi
Abstract:
This work examines a novel question: how much randomness is needed to achieve local differential privacy (LDP)? A motivating scenario is providing {\em multiple levels of privacy} to multiple analysts, either for distribution or for heavy-hitter estimation, using the \emph{same} (randomized) output. We call this setting \emph{successive refinement of privacy}, as it provides hierarchical access to…
▽ More
This work examines a novel question: how much randomness is needed to achieve local differential privacy (LDP)? A motivating scenario is providing {\em multiple levels of privacy} to multiple analysts, either for distribution or for heavy-hitter estimation, using the \emph{same} (randomized) output. We call this setting \emph{successive refinement of privacy}, as it provides hierarchical access to the raw data with different privacy levels. For example, the same randomized output could enable one analyst to reconstruct the input, while another can only estimate the distribution subject to LDP requirements. This extends the classical Shannon (wiretap) security setting to local differential privacy. We provide (order-wise) tight characterizations of privacy-utility-randomness trade-offs in several cases for distribution estimation, including the standard LDP setting under a randomness constraint. We also provide a non-trivial privacy mechanism for multi-level privacy. Furthermore, we show that we cannot reuse random keys over time while preserving privacy of each user.
△ Less
Submitted 24 May, 2020;
originally announced May 2020.
-
Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data
Authors:
Deepesh Data,
Suhas Diggavi
Abstract:
We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. We consider the heterogeneous data model, where different workers may have different local datasets, and we do not make any probabilistic assumptions on data generation. At the core of our algorithm, we use the polynomial-time outlier-filtering procedure for robust mean estimation prop…
▽ More
We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. We consider the heterogeneous data model, where different workers may have different local datasets, and we do not make any probabilistic assumptions on data generation. At the core of our algorithm, we use the polynomial-time outlier-filtering procedure for robust mean estimation proposed by Steinhardt et al. (ITCS 2018) to filter-out corrupt gradients. In order to be able to apply their filtering procedure in our {\em heterogeneous} data setting where workers compute {\em stochastic} gradients, we derive a new matrix concentration result, which may be of independent interest.
We provide convergence analyses for smooth strongly-convex and non-convex objectives. We derive our results under the bounded variance assumption on local stochastic gradients and a {\em deterministic} condition on datasets, namely, gradient dissimilarity; and for both these quantities, we provide concrete bounds in the statistical heterogeneous data model. We give a trade-off between the mini-batch size for stochastic gradients and the approximation error. Our algorithm can tolerate up to $\frac{1}{4}$ fraction Byzantine workers. It can find approximate optimal parameters in the strongly-convex setting exponentially fast and reach to an approximate stationary point in the non-convex setting with a linear speed, thus, matching the convergence rates of vanilla SGD in the Byzantine-free setting.
We also propose and analyze a Byzantine-resilient SGD algorithm with gradient compression, where workers send $k$ random coordinates of their gradients. Under mild conditions, we show a $\frac{d}{k}$-factor saving in communication bits as well as decoding complexity over our compression-free algorithm without affecting its convergence rate (order-wise) and the approximation error.
△ Less
Submitted 16 May, 2020;
originally announced May 2020.
-
SQuARM-SGD: Communication-Efficient Momentum SGD for Decentralized Optimization
Authors:
Navjot Singh,
Deepesh Data,
Jemin George,
Suhas Diggavi
Abstract:
In this paper, we propose and analyze SQuARM-SGD, a communication-efficient algorithm for decentralized training of large-scale machine learning models over a network. In SQuARM-SGD, each node performs a fixed number of local SGD steps using Nesterov's momentum and then sends sparsified and quantized updates to its neighbors regulated by a locally computable triggering criterion. We provide conver…
▽ More
In this paper, we propose and analyze SQuARM-SGD, a communication-efficient algorithm for decentralized training of large-scale machine learning models over a network. In SQuARM-SGD, each node performs a fixed number of local SGD steps using Nesterov's momentum and then sends sparsified and quantized updates to its neighbors regulated by a locally computable triggering criterion. We provide convergence guarantees of our algorithm for general (non-convex) and convex smooth objectives, which, to the best of our knowledge, is the first theoretical analysis for compressed decentralized SGD with momentum updates. We show that the convergence rate of SQuARM-SGD matches that of vanilla SGD. We empirically show that including momentum updates in SQuARM-SGD can lead to better test performance than the current state-of-the-art which does not consider momentum updates.
△ Less
Submitted 11 October, 2021; v1 submitted 12 May, 2020;
originally announced May 2020.
-
SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Stochastic Optimization
Authors:
Navjot Singh,
Deepesh Data,
Jemin George,
Suhas Diggavi
Abstract:
In this paper, we propose and analyze SPARQ-SGD, which is an event-triggered and compressed algorithm for decentralized training of large-scale machine learning models. Each node can locally compute a condition (event) which triggers a communication where quantized and sparsified local model parameters are sent. In SPARQ-SGD each node takes at least a fixed number ($H$) of local gradient steps and…
▽ More
In this paper, we propose and analyze SPARQ-SGD, which is an event-triggered and compressed algorithm for decentralized training of large-scale machine learning models. Each node can locally compute a condition (event) which triggers a communication where quantized and sparsified local model parameters are sent. In SPARQ-SGD each node takes at least a fixed number ($H$) of local gradient steps and then checks if the model parameters have significantly changed compared to its last update; it communicates further compressed model parameters only when there is a significant change, as specified by a (design) criterion. We prove that the SPARQ-SGD converges as $O(\frac{1}{nT})$ and $O(\frac{1}{\sqrt{nT}})$ in the strongly-convex and non-convex settings, respectively, demonstrating that such aggressive compression, including event-triggered communication, model sparsification and quantization does not affect the overall convergence rate as compared to uncompressed decentralized training; thereby theoretically yielding communication efficiency for "free". We evaluate SPARQ-SGD over real datasets to demonstrate significant amount of savings in communication over the state-of-the-art.
△ Less
Submitted 24 February, 2020; v1 submitted 31 October, 2019;
originally announced October 2019.
-
Data Encoding for Byzantine-Resilient Distributed Optimization
Authors:
Deepesh Data,
Linqi Song,
Suhas Diggavi
Abstract:
We study distributed optimization in the presence of Byzantine adversaries, where both data and computation are distributed among $m$ worker machines, $t$ of which may be corrupt. The compromised nodes may collaboratively and arbitrarily deviate from their pre-specified programs, and a designated (master) node iteratively computes the model/parameter vector for generalized linear models. In this w…
▽ More
We study distributed optimization in the presence of Byzantine adversaries, where both data and computation are distributed among $m$ worker machines, $t$ of which may be corrupt. The compromised nodes may collaboratively and arbitrarily deviate from their pre-specified programs, and a designated (master) node iteratively computes the model/parameter vector for generalized linear models. In this work, we primarily focus on two iterative algorithms: Proximal Gradient Descent (PGD) and Coordinate Descent (CD). Gradient descent (GD) is a special case of these algorithms. PGD is typically used in the data-parallel setting, where data is partitioned across different samples, whereas, CD is used in the model-parallelism setting, where data is partitioned across the parameter space.
In this paper, we propose a method based on data encoding and error correction over real numbers to combat adversarial attacks. We can tolerate up to $t\leq \lfloor\frac{m-1}{2}\rfloor$ corrupt worker nodes, which is information-theoretically optimal. We give deterministic guarantees, and our method does not assume any probability distribution on the data. We develop a {\em sparse} encoding scheme which enables computationally efficient data encoding and decoding. We demonstrate a trade-off between the corruption threshold and the resource requirements (storage, computational, and communication complexity). As an example, for $t\leq\frac{m}{3}$, our scheme incurs only a {\em constant} overhead on these resources, over that required by the plain distributed PGD/CD algorithms which provide no adversarial protection. To the best of our knowledge, ours is the first paper that makes CD secure against adversarial attacks.
Our encoding scheme extends efficiently to the data streaming model and for stochastic gradient descent (SGD). We also give experimental results to show the efficacy of our proposed schemes.
△ Less
Submitted 4 November, 2020; v1 submitted 4 July, 2019;
originally announced July 2019.
-
Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Authors:
Debraj Basu,
Deepesh Data,
Can Karakus,
Suhas Diggavi
Abstract:
Communication bottleneck has been identified as a significant issue in distributed optimization of large-scale learning models. Recently, several approaches to mitigate this problem have been proposed, including different forms of gradient compression or computing local models and mixing them iteratively. In this paper, we propose \emph{Qsparse-local-SGD} algorithm, which combines aggressive spars…
▽ More
Communication bottleneck has been identified as a significant issue in distributed optimization of large-scale learning models. Recently, several approaches to mitigate this problem have been proposed, including different forms of gradient compression or computing local models and mixing them iteratively. In this paper, we propose \emph{Qsparse-local-SGD} algorithm, which combines aggressive sparsification with quantization and local computation along with error compensation, by kee** track of the difference between the true and compressed gradients. We propose both synchronous and asynchronous implementations of \emph{Qsparse-local-SGD}. We analyze convergence for \emph{Qsparse-local-SGD} in the \emph{distributed} setting for smooth non-convex and convex objective functions. We demonstrate that \emph{Qsparse-local-SGD} converges at the same rate as vanilla distributed SGD for many important classes of sparsifiers and quantizers. We use \emph{Qsparse-local-SGD} to train ResNet-50 on ImageNet and show that it results in significant savings over the state-of-the-art, in the number of bits transmitted to reach target accuracy.
△ Less
Submitted 2 November, 2019; v1 submitted 5 June, 2019;
originally announced June 2019.
-
Estimating the sample mean and standard deviation from commonly reported quantiles in meta-analysis
Authors:
Sean McGrath,
XiaoFei Zhao,
Russell Steele,
Brett D. Thombs,
Andrea Benedetti,
the DEPRESsion Screening Data,
Collaboration
Abstract:
Researchers increasingly use meta-analysis to synthesize the results of several studies in order to estimate a common effect. When the outcome variable is continuous, standard meta-analytic approaches assume that the primary studies report the sample mean and standard deviation of the outcome. However, when the outcome is skewed, authors sometimes summarize the data by reporting the sample median…
▽ More
Researchers increasingly use meta-analysis to synthesize the results of several studies in order to estimate a common effect. When the outcome variable is continuous, standard meta-analytic approaches assume that the primary studies report the sample mean and standard deviation of the outcome. However, when the outcome is skewed, authors sometimes summarize the data by reporting the sample median and one or both of (i) the minimum and maximum values and (ii) the first and third quartiles, but do not report the mean or standard deviation. To include these studies in meta-analysis, several methods have been developed to estimate the sample mean and standard deviation from the reported summary data. A major limitation of these widely used methods is that they assume that the outcome distribution is normal, which is unlikely to be tenable for studies reporting medians. We propose two novel approaches to estimate the sample mean and standard deviation when data are suspected to be non-normal. Our simulation results and empirical assessments show that the proposed methods often perform better than the existing methods when applied to non-normal data.
△ Less
Submitted 25 March, 2019;
originally announced March 2019.
-
Interactive Secure Function Computation
Authors:
Deepesh Data,
Gowtham R. Kurri,
Jithin Ravi,
Vinod M. Prabhakaran
Abstract:
We consider interactive computation of randomized functions between two users with the following privacy requirement: the interaction should not reveal to either user any extra information about the other user's input and output other than what can be inferred from the user's own input and output. We also consider the case where privacy is required against only one of the users. For both cases, we…
▽ More
We consider interactive computation of randomized functions between two users with the following privacy requirement: the interaction should not reveal to either user any extra information about the other user's input and output other than what can be inferred from the user's own input and output. We also consider the case where privacy is required against only one of the users. For both cases, we give single-letter expressions for feasibility and optimal rates of communication. Then we discuss the role of common randomness and interaction in both privacy settings. We also study perfectly secure non-interactive computation when only one of the users computes a randomized function based on a single transmission from the other user. We characterize randomized functions which can be perfectly securely computed in this model and obtain tight bounds on the optimal message lengths in all the privacy settings.
△ Less
Submitted 9 March, 2020; v1 submitted 10 December, 2018;
originally announced December 2018.
-
Secure Computation of Randomized Functions: Further Results
Authors:
Deepesh Data,
Vinod M. Prabhakaran
Abstract:
We consider secure computation of randomized functions between two users, where both the users (Alice and Bob) have inputs, Alice sends a message to Bob over a rate-limited, noise-free link, and then Bob produces the output. We study two cases: (i) when privacy condition is required only against Bob, who tries to learn more about Alice's input from the message than what can be inferred by his own…
▽ More
We consider secure computation of randomized functions between two users, where both the users (Alice and Bob) have inputs, Alice sends a message to Bob over a rate-limited, noise-free link, and then Bob produces the output. We study two cases: (i) when privacy condition is required only against Bob, who tries to learn more about Alice's input from the message than what can be inferred by his own input and output, and (ii) when there is no privacy requirement. For both the problems, we give single-letter expressions for the optimal rates. For the first problem, we also explicitly characterize securely computable randomized functions when input has full support, which leads to a much simpler expression for the optimal rate. Recently, Data (ISIT 2016) studied the remaining two cases (first, when privacy conditions are against both the users; and second, when privacy condition is only against Alice) and obtained single-letter expressions for optimal rates in both the scenarios.
△ Less
Submitted 19 May, 2017;
originally announced May 2017.
-
Secure Computation of Randomized Functions
Authors:
Deepesh Data
Abstract:
Two user secure computation of randomized functions is considered, where only one user computes the output. Both the users are semi-honest; and computation is such that no user learns any additional information about the other user's input and output other than what cannot be inferred from its own input and output. First we consider a scenario, where privacy conditions are against both the users.…
▽ More
Two user secure computation of randomized functions is considered, where only one user computes the output. Both the users are semi-honest; and computation is such that no user learns any additional information about the other user's input and output other than what cannot be inferred from its own input and output. First we consider a scenario, where privacy conditions are against both the users. In perfect security setting Kilian [STOC 2000] gave a characterization of securely computable randomized functions, and we provide rate-optimal protocols for such functions. We prove that the same characterization holds in asymptotic security setting as well and give a rate-optimal protocol. In another scenario, where privacy condition is only against the user who is not computing the function, we provide rate-optimal protocols. For perfect security in both the scenarios, our results are in terms of chromatic entropies of different graphs. In asymptotic security setting, we get single-letter expressions of rates in both the scenarios.
△ Less
Submitted 10 May, 2016; v1 submitted 25 January, 2016;
originally announced January 2016.
-
Communication and Randomness Lower Bounds for Secure Computation
Authors:
Deepesh Data,
Vinod M. Prabhakaran,
Manoj M. Prabhakaran
Abstract:
In secure multiparty computation (MPC), mutually distrusting users collaborate to compute a function of their private data without revealing any additional information about their data to other users. While it is known that information theoretically secure MPC is possible among $n$ users (connected by secure and noiseless links and have access to private randomness) against the collusion of less t…
▽ More
In secure multiparty computation (MPC), mutually distrusting users collaborate to compute a function of their private data without revealing any additional information about their data to other users. While it is known that information theoretically secure MPC is possible among $n$ users (connected by secure and noiseless links and have access to private randomness) against the collusion of less than $n/2$ users in the honest-but-curious model, relatively less is known about the communication and randomness complexity of secure computation.
In this work, we employ information theoretic techniques to obtain lower bounds on the amount of communication and randomness required for secure MPC. We restrict ourselves to a concrete interactive setting involving 3 users under which all functions are securely computable against corruption of a single user in the honest-but-curious model. We derive lower bounds for both the perfect security case (i.e., zero-error and no leakage of information) and asymptotic security (where the probability of error and information leakage vanish as block-length goes to $\infty$).
Our techniques include the use of a data processing inequality for residual information (i.e., the gap between mutual information and Gács-Körner common information), a new information inequality for 3-user protocols, and the idea of distribution switching. Our lower bounds are shown to be tight for various functions of interest. In particular, we show concrete functions which have "communication-ideal" protocols, i.e., which achieve the minimum communication simultaneously on all links in the network, and also use minimum amount of randomness. Also, we obtain the first explicit example of a function that incurs a higher communication cost than the input length in the secure computation model of "Feige, Kilian, and Naor [STOC, 1994]", who had shown that such functions exist.
△ Less
Submitted 10 May, 2016; v1 submitted 24 December, 2015;
originally announced December 2015.
-
How to Securely Compute the Modulo-Two Sum of Binary Sources
Authors:
Deepesh Data,
Bikash Kumar Dey,
Manoj Mishra,
Vinod M. Prabhakaran
Abstract:
In secure multiparty computation, mutually distrusting users in a network want to collaborate to compute functions of data which is distributed among the users. The users should not learn any additional information about the data of others than what they may infer from their own data and the functions they are computing. Previous works have mostly considered the worst case context (i.e., without a…
▽ More
In secure multiparty computation, mutually distrusting users in a network want to collaborate to compute functions of data which is distributed among the users. The users should not learn any additional information about the data of others than what they may infer from their own data and the functions they are computing. Previous works have mostly considered the worst case context (i.e., without assuming any distribution for the data); Lee and Abbe (2014) is a notable exception. Here, we study the average case (i.e., we work with a distribution on the data) where correctness and privacy is only desired asymptotically.
For concreteness and simplicity, we consider a secure version of the function computation problem of Körner and Marton (1979) where two users observe a doubly symmetric binary source with parameter p and the third user wants to compute the XOR. We show that the amount of communication and randomness resources required depends on the level of correctness desired. When zero-error and perfect privacy are required, the results of Data et al. (2014) show that it can be achieved if and only if a total rate of 1 bit is communicated between every pair of users and private randomness at the rate of 1 is used up. In contrast, we show here that, if we only want the probability of error to vanish asymptotically in block length, it can be achieved by a lower rate (binary entropy of p) for all the links and for private randomness; this also guarantees perfect privacy. We also show that no smaller rates are possible even if privacy is only required asymptotically.
△ Less
Submitted 26 May, 2014; v1 submitted 11 May, 2014;
originally announced May 2014.
-
On the Communication Complexity of Secure Computation
Authors:
Deepesh Data,
Vinod M. Prabhakaran,
Manoj M. Prabhakaran
Abstract:
Information theoretically secure multi-party computation (MPC) is a central primitive of modern cryptography. However, relatively little is known about the communication complexity of this primitive.
In this work, we develop powerful information theoretic tools to prove lower bounds on the communication complexity of MPC. We restrict ourselves to a 3-party setting in order to bring out the power…
▽ More
Information theoretically secure multi-party computation (MPC) is a central primitive of modern cryptography. However, relatively little is known about the communication complexity of this primitive.
In this work, we develop powerful information theoretic tools to prove lower bounds on the communication complexity of MPC. We restrict ourselves to a 3-party setting in order to bring out the power of these tools without introducing too many complications. Our techniques include the use of a data processing inequality for residual information - i.e., the gap between mutual information and Gács-Körner common information, a new information inequality for 3-party protocols, and the idea of distribution switching by which lower bounds computed under certain worst-case scenarios can be shown to apply for the general case.
Using these techniques we obtain tight bounds on communication complexity by MPC protocols for various interesting functions. In particular, we show concrete functions that have "communication-ideal" protocols, which achieve the minimum communication simultaneously on all links in the network. Also, we obtain the first explicit example of a function that incurs a higher communication cost than the input length in the secure computation model of Feige, Kilian and Naor (1994), who had shown that such functions exist. We also show that our communication bounds imply tight lower bounds on the amount of randomness required by MPC protocols for many interesting functions.
△ Less
Submitted 13 April, 2014; v1 submitted 29 November, 2013;
originally announced November 2013.