-
Sparsity and Privacy in Secret Sharing: A Fundamental Trade-Off
Authors:
Rawad Bitar,
Maximilian Egger,
Antonia Wachter-Zeh,
Marvin Xhemrishi
Abstract:
This work investigates the design of sparse secret sharing schemes that encode a sparse private matrix into sparse shares. This investigation is motivated by distributed computing, where the multiplication of sparse and private matrices is moved from a computationally weak main node to untrusted worker machines. Classical secret-sharing schemes produce dense shares. However, sparsity can help speed up the computation. We show that, for matrices with i.i.d. entries, sparsity in the shares comes at a fundamental cost of weaker privacy. We derive a fundamental tradeoff between sparsity and privacy and construct optimal sparse secret sharing schemes that produce shares that leak the minimum amount of information for a desired sparsity of the shares. We apply our schemes to distributed sparse and private matrix multiplication schemes with no colluding workers while tolerating stragglers. For the setting of two non-communicating clusters of workers, we design a sparse one-time pad so that no private information is leaked to a cluster of untrusted and colluding workers, and the shares with bounded but non-zero leakage are assigned to a cluster of partially trusted workers. We conclude by discussing the necessity of using permutations for matrices with correlated entries.
Submitted 11 August, 2023;
originally announced August 2023.
-
Sparse and Private Distributed Matrix Multiplication with Straggler Tolerance
Authors:
Maximilian Egger,
Marvin Xhemrishi,
Antonia Wachter-Zeh,
Rawad Bitar
Abstract:
This paper considers the problem of outsourcing the multiplication of two private and sparse matrices to untrusted workers. Secret sharing schemes can be used to tolerate stragglers and guarantee information-theoretic privacy of the matrices. However, traditional secret sharing schemes destroy all sparsity in the offloaded computational tasks. Since exploiting the sparse nature of matrices was shown to speed up the multiplication process, preserving the sparsity of the input matrices in the computational tasks sent to the workers is desirable. It was recently shown that sparsity can be guaranteed at the expense of a weaker privacy guarantee. Sparse secret sharing schemes with only two output shares were constructed. In this work, we construct sparse secret sharing schemes that generalize Shamir's secret sharing schemes for a fixed threshold $t=2$ and an arbitrarily large number of shares. We design our schemes to provide the strongest privacy guarantee given a desired sparsity of the shares under some mild assumptions. We show that increasing the number of shares, i.e., increasing straggler tolerance, incurs a degradation of the privacy guarantee. However, this degradation is negligible when the number of shares is small compared to the cardinality of the input alphabet.
Submitted 26 June, 2023;
originally announced June 2023.
-
FedGT: Identification of Malicious Clients in Federated Learning with Secure Aggregation
Authors:
Marvin Xhemrishi,
Johan Östman,
Antonia Wachter-Zeh,
Alexandre Graell i Amat
Abstract:
We propose FedGT, a novel framework for identifying malicious clients in federated learning with secure aggregation. Inspired by group testing, the framework leverages overlapping groups of clients to identify the presence of malicious clients in the groups via a decoding operation. The clients identified as malicious are then removed from the training of the model, which is performed over the remaining clients. By choosing the size, number, and overlap between groups, FedGT strikes a balance between privacy and security. Specifically, the server learns the aggregated model of the clients in each group; vanilla federated learning and secure aggregation correspond to the extreme cases of FedGT with group size equal to one and the total number of clients, respectively. The effectiveness of FedGT is demonstrated through extensive experiments on the MNIST, CIFAR-10, and ISIC2019 datasets in a cross-silo setting under different data-poisoning attacks. These experiments showcase FedGT's ability to identify malicious clients, resulting in high model utility. We further show that FedGT significantly outperforms the private robust aggregation approach based on the geometric median recently proposed by Pillutla et al. on heterogeneous client data (ISIC2019) and in the presence of targeted attacks (CIFAR-10 and ISIC2019).
Submitted 10 October, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Efficient Private Storage of Sparse Machine Learning Data
Authors:
Marvin Xhemrishi,
Maximilian Egger,
Rawad Bitar
Abstract:
We consider the problem of maintaining sparsity in private distributed storage of confidential machine learning data. In many applications, e.g., face recognition, the data used in machine learning algorithms is represented by sparse matrices which can be stored and processed efficiently. However, mechanisms maintaining perfect information-theoretic privacy require encoding the sparse matrices into randomized dense matrices. It has been shown that, under some restrictions on the storage nodes, sparsity can be maintained at the expense of relaxing the perfect information-theoretic privacy requirement, i.e., allowing some information leakage. In this work, we lift the restrictions imposed on the storage nodes and show that there exists a trade-off between sparsity and the achievable privacy guarantees. We focus on the setting of non-colluding nodes and construct a coding scheme that encodes the sparse input matrices into matrices with the desired sparsity level while limiting the information leakage.
Submitted 14 June, 2022;
originally announced June 2022.
-
Distributed Matrix-Vector Multiplication with Sparsity and Privacy Guarantees
Authors:
Marvin Xhemrishi,
Rawad Bitar,
Antonia Wachter-Zeh
Abstract:
We consider the problem of designing a coding scheme that allows both sparsity and privacy for distributed matrix-vector multiplication. Perfect information-theoretic privacy requires encoding the input sparse matrices into matrices distributed uniformly at random from the considered alphabet, thus destroying the sparsity. Computing matrix-vector multiplication for sparse matrices is known to be fast. Distributing the computation over the non-sparse encoded matrices maintains privacy, but introduces artificial computing delays. In this work, we relax the privacy constraint and show that a certain level of sparsity can be maintained in the encoded matrices. We consider the chief/worker setting while assuming the presence of two clusters of workers: one is completely untrusted, i.e., all of its workers collude to eavesdrop on the input matrix and perfect privacy must be satisfied; the other is partly trusted, i.e., only up to $z$ workers may collude and revealing a small amount of information about the input matrix is allowed. We design a scheme that trades sparsity for privacy while achieving the desired constraints. We use cyclic task assignments of the encoded matrices to tolerate partial and full stragglers.
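The claim that perfect privacy destroys sparsity can be illustrated with a minimal sketch, assuming a classical one-time pad over the integers modulo a prime `p`. This is not the scheme proposed in the paper; the function names, the prime `p = 101`, and the 20 x 20 test matrix are hypothetical choices made only for illustration.

```python
import random


def one_time_pad_shares(secret, p, rng):
    """Split `secret` (a matrix over Z_p) into two additive shares.

    The pad is uniform over Z_p, so each share on its own is uniformly
    distributed and reveals nothing about the secret.
    """
    pad = [[rng.randrange(p) for _ in row] for row in secret]
    masked = [[(s + r) % p for s, r in zip(s_row, r_row)]
              for s_row, r_row in zip(secret, pad)]
    return pad, masked


def reconstruct(pad, masked, p):
    """Recover the secret by subtracting the pad from the masked share."""
    return [[(m - r) % p for m, r in zip(m_row, r_row)]
            for m_row, r_row in zip(masked, pad)]


def zero_fraction(matrix):
    """Fraction of entries equal to zero (a simple sparsity measure)."""
    entries = [x for row in matrix for x in row]
    return sum(1 for x in entries if x == 0) / len(entries)


if __name__ == "__main__":
    p = 101                      # hypothetical field size
    rng = random.Random(1)       # seeded only to make the demo repeatable
    # Sparse 20 x 20 secret: non-zero entries on the diagonal only.
    secret = [[7 if i == j else 0 for j in range(20)] for i in range(20)]
    pad, masked = one_time_pad_shares(secret, p, rng)
    print("zero fraction of secret:", zero_fraction(secret))
    print("zero fraction of pad:   ", zero_fraction(pad))
    print("zero fraction of masked:", zero_fraction(masked))
    assert reconstruct(pad, masked, p) == secret
```

Each entry of a uniform share is zero with probability 1/p, so both shares are dense regardless of how sparse the secret is; a worker multiplying with such a share cannot exploit sparsity.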
Submitted 3 March, 2022;
originally announced March 2022.
-
Computational Code-Based Privacy in Coded Federated Learning
Authors:
Marvin Xhemrishi,
Alexandre Graell i Amat,
Eirik Rosnes,
Antonia Wachter-Zeh
Abstract:
We propose a privacy-preserving federated learning (FL) scheme that is resilient against straggling devices. An adaptive scenario is suggested where the slower devices share their data with the faster ones and do not participate in the learning process. The proposed scheme employs code-based cryptography to ensure computational privacy of the private data, i.e., no device with bounded computational power can obtain information about the other devices' data in feasible time. For a scenario with 25 devices, the proposed scheme achieves a speed-up of 4.7 and 4 for 92-bit and 128-bit security, respectively, for an accuracy of 95% on the MNIST dataset compared with conventional mini-batch FL.
Submitted 28 February, 2022;
originally announced February 2022.
-
The Wiretap Channel for Capacitive PUF-Based Security Enclosures
Authors:
Kathrin Garb,
Marvin Xhemrishi,
Ludwig Kürzinger,
Christoph Frisch
Abstract:
In order to protect devices from physical manipulations, protective security enclosures were developed. However, these battery-backed solutions come with a reduced lifetime, and have to be actively and continuously monitored. In order to overcome these drawbacks, batteryless capacitive enclosures based on Physical Unclonable Functions (PUFs) have been developed that generate a key-encryption-key (KEK) for decryption of the key chain. In order to reproduce the PUF-key reliably and to compensate for the effect of noise and environmental influences, the key generation includes error correction codes. However, drilling attacks that aim at partially destroying the enclosure also alter the PUF-response and are subject to the same error correction procedures. Correcting attack effects, however, is highly undesirable as it would destroy the security concept of the enclosure. In general, designing error correction codes such that they provide tamper-sensitivity to attacks, while still correcting noise and environmental effects, is a challenging task. We tackle this problem by first analyzing the behavior of the PUF-response under external influences and different post-processing parameters. From this, we derive a system model of the PUF-based enclosure, and construct a wiretap channel implementation from q-ary polar codes. We verify the obtained error correction scheme in a Monte Carlo simulation and demonstrate that our wiretap channel implementation achieves a physical layer security of 100 bits for 306 bits of entropy for the PUF-secret. Through this, we further develop capacitive PUF-based security enclosures and bring them one step closer to their commercial deployment.
Submitted 17 November, 2022; v1 submitted 3 February, 2022;
originally announced February 2022.
-
Analysis of Communication Channels Related to Physical Unclonable Functions
Authors:
Georg Maringer,
Marvin Xhemrishi,
Sven Puchinger,
Kathrin Garb,
Hedongliang Liu,
Thomas Jerkovits,
Ludwig Kürzinger,
Matthias Hiller,
Antonia Wachter-Zeh
Abstract:
Cryptographic algorithms rely on the secrecy of their corresponding keys. On embedded systems with standard CMOS chips, where secure permanent memory such as flash is not available as a key storage, the secret key can be derived from Physical Unclonable Functions (PUFs) that make use of minuscule manufacturing variations of, for instance, SRAM cells. Since PUFs are affected by environmental changes, the reliable reproduction of the PUF key requires error correction. For silicon PUFs with binary output, errors occur in the form of bitflips within the PUF's response. Modelling the channel as a Binary Symmetric Channel (BSC) with fixed crossover probability $p$ is only a first-order approximation of the real behavior of the PUF response. We propose a more realistic channel model, referred to as the Varying Binary Symmetric Channel (VBSC), which takes into account that the reliability of different PUF response bits may not be equal. We investigate its channel capacity for various scenarios which differ in the channel state information (CSI) present at encoder and decoder. We compare the capacity results for the VBSC for the different CSI cases with reference to the distribution of the bitflip probability according to a work by Maes et al.
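For reference, the first-order model mentioned above has a well-known closed-form capacity: a BSC with fixed crossover probability $p$ has capacity $1 - h_2(p)$ bits per channel use, where $h_2$ is the binary entropy function. The sketch below evaluates only this textbook formula; it does not implement the VBSC model or any of the CSI cases studied in the paper, and the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Binary entropy h2(p) in bits, with the convention 0 * log2(0) = 0."""
    if p < 0.0 or p > 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)


if __name__ == "__main__":
    # A few illustrative crossover probabilities (hypothetical values).
    for p in (0.0, 0.05, 0.11, 0.5):
        print(f"p = {p:.2f}  capacity = {bsc_capacity(p):.4f} bits/use")
```

A noiseless channel (p = 0) carries one bit per use, while p = 0.5 makes the output independent of the input and the capacity drops to zero.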
Submitted 3 December, 2021;
originally announced December 2021.
-
Secure Private and Adaptive Matrix Multiplication Beyond the Singleton Bound
Authors:
Christoph Hofmeister,
Rawad Bitar,
Marvin Xhemrishi,
Antonia Wachter-Zeh
Abstract:
We consider the problem of designing secure and private codes for distributed matrix-matrix multiplication. A master server owns two private matrices and hires worker nodes to help compute their product. The matrices should remain information-theoretically private from the workers. Some of the workers are malicious and return corrupted results to the master. We design a framework for security against malicious workers in private matrix-matrix multiplication. The main idea is a careful use of Freivalds' algorithm to detect erroneous matrix multiplications. Our main goal is to apply this security framework to schemes with adaptive rates. Adaptive schemes divide the workers into clusters and thus provide flexibility in trading decoding complexity for efficiency. Our new scheme, SRPM3, provides a computationally efficient security check per cluster that detects the presence of one or more malicious workers with high probability. An additional per-worker check is used to identify the malicious nodes. SRPM3 can tolerate the presence of an arbitrary number of malicious workers. We provide theoretical guarantees on the complexity of the security checks and simulation results on both the missed detection rate and the time needed for the integrity check.
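As background for the security check, here is a minimal sketch of the textbook Freivalds' algorithm over the integers modulo a prime `p`. It shows only the generic probabilistic test of a claimed product; it is not the per-cluster or per-worker check of SRPM3, and the function names, the prime `p = 101`, and the number of rounds are hypothetical.

```python
import random


def matvec(matrix, vector, p):
    """Matrix-vector product over Z_p."""
    return [sum(a * x for a, x in zip(row, vector)) % p for row in matrix]


def freivalds_check(A, B, C, p, rounds=20, rng=None):
    """Probabilistically test whether A * B == C over Z_p.

    Each round draws a uniform vector r and compares A(Br) with Cr, which
    costs three matrix-vector products instead of a matrix-matrix product.
    A correct C always passes; an incorrect C passes a single round with
    probability at most 1/p.
    """
    rng = rng or random.Random()
    n_cols = len(C[0])
    for _ in range(rounds):
        r = [rng.randrange(p) for _ in range(n_cols)]
        if matvec(A, matvec(B, r, p), p) != matvec(C, r, p):
            return False  # a mismatch proves that C is not A * B
    return True


if __name__ == "__main__":
    p = 101
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    good = [[19, 22], [43, 50]]   # the true product A * B
    bad = [[19, 22], [43, 51]]    # one corrupted entry
    print(freivalds_check(A, B, good, p, rng=random.Random(0)))
    print(freivalds_check(A, B, bad, p, rng=random.Random(0)))
```

The test never rejects a correct result, and the probability of accepting a corrupted one shrinks geometrically with the number of rounds, which is why the check can be much cheaper than recomputing the product.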
Submitted 14 February, 2022; v1 submitted 12 August, 2021;
originally announced August 2021.
-
Adaptive Private Distributed Matrix Multiplication
Authors:
Rawad Bitar,
Marvin Xhemrishi,
Antonia Wachter-Zeh
Abstract:
We consider the problem of designing codes with flexible rate (referred to as rateless codes) for private distributed matrix-matrix multiplication. A master server owns two private matrices $\mathbf{A}$ and $\mathbf{B}$ and hires worker nodes to help compute their product. The matrices should remain information-theoretically private from the workers. Codes with fixed rate require the master to assign tasks to the workers and then wait for a predetermined number of workers to finish their assigned tasks. The size of the tasks, hence the rate of the scheme, depends on the number of workers that the master waits for. We design a rateless private matrix-matrix multiplication scheme, called RPM3. In contrast to fixed-rate schemes, our scheme fixes the size of the tasks and allows the master to send multiple tasks to the workers. The master keeps sending tasks and receiving results until it can decode the multiplication, rendering the scheme flexible and adaptive to heterogeneous environments. Despite resulting in a smaller rate than known straggler-tolerant schemes, RPM3 provides a smaller mean waiting time of the master by leveraging the heterogeneity of the workers. The waiting time is studied under two different models for the workers' service time. We provide upper bounds for the mean waiting time under both models. In addition, we provide lower bounds on the mean waiting time under the worker-dependent fixed service time model.
Submitted 14 January, 2021;
originally announced January 2021.
-
Rateless Codes for Private Distributed Matrix-Matrix Multiplication
Authors:
Rawad Bitar,
Marvin Xhemrishi,
Antonia Wachter-Zeh
Abstract:
We consider the problem of designing rateless coded private distributed matrix-matrix multiplication. A master server owns two private matrices $\mathbf{A}$ and $\mathbf{B}$ and wants to hire worker nodes to help compute the multiplication. The matrices should remain private from the workers, in an information-theoretic sense. This problem has been considered in the literature and codes with a predesigned threshold are constructed. More precisely, the master assigns tasks to the workers and waits for a predetermined number of workers to finish their assigned tasks. The size of the tasks assigned to the workers depends on the designed threshold. We are interested in settings where the size of the task must be small and independent of the designed threshold. We design a rateless private matrix-matrix multiplication scheme, called RPM3. Our scheme fixes the size of the tasks and allows the master to send multiple tasks to the workers. The master keeps receiving results until it can decode the multiplication. Two main applications require this property: i) leveraging the possible heterogeneity in the system by assigning more tasks to workers that are faster; and ii) assigning tasks adaptively to account for a possibly time-varying system.
Submitted 27 April, 2020;
originally announced April 2020.