-
Federated Orthogonal Training: Mitigating Global Catastrophic Forgetting in Continual Federated Learning
Authors:
Yavuz Faruk Bakman,
Duygu Nur Yaldiz,
Yahya H. Ezzeldin,
Salman Avestimehr
Abstract:
Federated Learning (FL) has gained significant traction due to its ability to enable privacy-preserving training over decentralized data. Current literature in FL mostly focuses on single-task learning. However, over time, new tasks may appear in the clients and the global model should learn these tasks without forgetting previous tasks. This real-world scenario is known as Continual Federated Learning (CFL). The main challenge of CFL is Global Catastrophic Forgetting, which corresponds to the fact that when the global model is trained on new tasks, its performance on old tasks decreases. A few recent works on CFL propose methods to address the global catastrophic forgetting problem. However, these works either make unrealistic assumptions about the availability of past data samples or violate the privacy principles of FL. We propose a novel method, Federated Orthogonal Training (FOT), to overcome these drawbacks and address global catastrophic forgetting in CFL. Our algorithm extracts the global input subspace of each layer for old tasks and modifies the aggregated updates of new tasks such that they are orthogonal to the global principal subspace of old tasks for each layer. This decreases the interference between tasks, which is the main cause of forgetting. We empirically show that FOT outperforms state-of-the-art continual learning methods in the CFL setting, achieving an average accuracy gain of up to 15% with 27% lower forgetting while only incurring a minimal computation and communication cost.
Submitted 16 October, 2023; v1 submitted 3 September, 2023;
originally announced September 2023.
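The projection step mentioned in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `project_orthogonal`, the matrix shapes, and the randomly generated orthonormal basis are assumptions made for illustration, and the abstract does not say how the old-task subspace is extracted. The sketch only shows that removing the component of a layer update lying in a given input subspace leaves the layer's response to inputs from that subspace unchanged.

```python
import numpy as np

def project_orthogonal(update, basis):
    """Remove the part of `update` that acts on span(basis).

    update: (out_dim, in_dim) aggregated weight update for one layer.
    basis:  (in_dim, k) matrix with orthonormal columns (assumed given).
    """
    return update - update @ basis @ basis.T

rng = np.random.default_rng(0)
in_dim, out_dim, k = 8, 4, 3

# Hypothetical old-task input subspace: orthonormal basis from a QR factorization.
basis, _ = np.linalg.qr(rng.standard_normal((in_dim, k)))
update = rng.standard_normal((out_dim, in_dim))
projected = project_orthogonal(update, basis)

# An input inside the old-task subspace produces (numerically) zero change.
old_input = basis @ rng.standard_normal(k)
print(np.allclose(projected @ old_input, 0.0))
```

The projection is idempotent, so applying it again to an already projected update changes nothing.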
-
SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models
Authors:
Sara Babakniya,
Ahmed Roushdy Elkordy,
Yahya H. Ezzeldin,
Qingfeng Liu,
Kee-Bong Song,
Mostafa El-Khamy,
Salman Avestimehr
Abstract:
Transfer learning via fine-tuning pre-trained transformer models has gained significant success in delivering state-of-the-art results across various NLP tasks. In the absence of centralized data, Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning. However, due to the limited communication, computation, and storage capabilities of edge devices and the huge sizes of popular transformer models, efficient fine-tuning is crucial to make federated training feasible. This work explores the opportunities and challenges associated with applying parameter efficient fine-tuning (PEFT) methods in different FL settings for language tasks. Specifically, our investigation reveals that as the data across users becomes more diverse, the gap between fully fine-tuning the model and employing PEFT methods widens. To bridge this performance gap, we propose a method called SLoRA, which overcomes the key limitations of LoRA in highly heterogeneous data scenarios through a novel data-driven initialization technique. Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning, with significantly sparse updates of approximately $1\%$ density, while reducing training time by up to $90\%$.
Submitted 12 August, 2023;
originally announced August 2023.
-
The Resource Problem of Using Linear Layer Leakage Attack in Federated Learning
Authors:
Joshua C. Zhao,
Ahmed Roushdy Elkordy,
Atul Sharma,
Yahya H. Ezzeldin,
Salman Avestimehr,
Saurabh Bagchi
Abstract:
Secure aggregation promises a heightened level of privacy in federated learning, maintaining that a server only has access to a decrypted aggregate update. Within this setting, linear layer leakage methods are the only data reconstruction attacks able to scale and achieve a high leakage rate regardless of the number of clients or batch size. This is done through increasing the size of an injected fully-connected (FC) layer. However, this results in a resource overhead which grows larger with an increasing number of clients. We show that this resource overhead is caused by an incorrect perspective in all prior work that treats an attack on an aggregate update in the same way as an individual update with a larger batch size. Instead, attacking the update from the perspective that aggregation combines multiple individual updates allows the application of sparsity to alleviate the resource overhead. We show that the use of sparsity can decrease the model size overhead by over 327$\times$ and the computation time by 3.34$\times$ compared to SOTA while maintaining an equivalent total leakage rate of 77%, even with $1000$ clients in aggregation.
Submitted 26 March, 2023;
originally announced March 2023.
-
LOKI: Large-scale Data Reconstruction Attack against Federated Learning through Model Manipulation
Authors:
Joshua C. Zhao,
Atul Sharma,
Ahmed Roushdy Elkordy,
Yahya H. Ezzeldin,
Salman Avestimehr,
Saurabh Bagchi
Abstract:
Federated learning was introduced to enable machine learning over large decentralized datasets while promising privacy by eliminating the need for data sharing. Despite this, prior work has shown that shared gradients often contain private information and attackers can gain knowledge either through malicious modification of the architecture and parameters or by using optimization to approximate user data from the shared gradients. However, prior data reconstruction attacks have been limited in setting and scale, as most works target FedSGD and limit the attack to single-client gradients. Many of these attacks fail in the more practical setting of FedAVG or if updates are aggregated together using secure aggregation. Data reconstruction becomes significantly more difficult, resulting in limited attack scale and/or decreased reconstruction quality. When both FedAVG and secure aggregation are used, there is no current method that is able to attack multiple clients concurrently in a federated learning setting. In this work we introduce LOKI, an attack that overcomes previous limitations and also breaks the anonymity of aggregation as the leaked data is identifiable and directly tied back to the clients they come from. Our design sends clients customized convolutional parameters, and the weight gradients of data points between clients remain separate even through aggregation. With FedAVG and aggregation across 100 clients, prior work can leak less than 1% of images on MNIST, CIFAR-100, and Tiny ImageNet. Using only a single training round, LOKI is able to leak 76-86% of all data samples.
Submitted 25 September, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Federated Analytics: A survey
Authors:
Ahmed Roushdy Elkordy,
Yahya H. Ezzeldin,
Shanshan Han,
Shantanu Sharma,
Chaoyang He,
Sharad Mehrotra,
Salman Avestimehr
Abstract:
Federated analytics (FA) is a privacy-preserving framework for computing data analytics over multiple remote parties (e.g., mobile devices) or silo-ed institutional entities (e.g., hospitals, banks) without sharing the data among parties. Motivated by the practical use cases of federated analytics, we present a systematic discussion of federated analytics in this article. In particular, we discuss the unique characteristics of federated analytics and how it differs from federated learning. We also explore a wide range of FA queries and discuss various existing solutions and potential use case applications for different FA queries.
Submitted 2 February, 2023;
originally announced February 2023.
-
How Much Privacy Does Federated Learning with Secure Aggregation Guarantee?
Authors:
Ahmed Roushdy Elkordy,
Jiang Zhang,
Yahya H. Ezzeldin,
Konstantinos Psounis,
Salman Avestimehr
Abstract:
Federated learning (FL) has attracted growing interest for enabling privacy-preserving machine learning on data stored at multiple users while avoiding moving the data off-device. However, while data never leaves users' devices, privacy still cannot be guaranteed since significant computations on users' training data are shared in the form of trained local models. These local models have recently been shown to pose a substantial privacy threat through different privacy attacks such as model inversion attacks. As a remedy, Secure Aggregation (SA) has been developed as a framework to preserve privacy in FL, by guaranteeing the server can only learn the global aggregated model update but not the individual model updates. While SA ensures no additional information is leaked about the individual model update beyond the aggregated model update, there are no formal guarantees on how much privacy FL with SA can actually offer; as information about the individual dataset can still potentially leak through the aggregated model computed at the server. In this work, we perform a first analysis of the formal privacy guarantees for FL with SA. Specifically, we use Mutual Information (MI) as a quantification metric and derive upper bounds on how much information about each user's dataset can leak through the aggregated model update. When using the FedSGD aggregation algorithm, our theoretical bounds show that the amount of privacy leakage reduces linearly with the number of users participating in FL with SA. To validate our theoretical bounds, we use an MI Neural Estimator to empirically evaluate the privacy leakage under different FL setups on both the MNIST and CIFAR10 datasets. Our experiments verify our theoretical bounds for FedSGD, which show a reduction in privacy leakage as the number of users and local batch size grow, and an increase in privacy leakage with the number of training rounds.
Submitted 3 August, 2022;
originally announced August 2022.
-
FairFed: Enabling Group Fairness in Federated Learning
Authors:
Yahya H. Ezzeldin,
Shen Yan,
Chaoyang He,
Emilio Ferrara,
Salman Avestimehr
Abstract:
Training ML models which are fair across different demographic groups is of critical importance due to the increased integration of ML in crucial decision-making scenarios such as healthcare and recruitment. Federated learning has been viewed as a promising solution for collaboratively training machine learning models among multiple parties while maintaining the privacy of their local data. However, federated learning also poses new challenges in mitigating the potential bias against certain populations (e.g., demographic groups), as this typically requires centralized access to the sensitive information (e.g., race, gender) of each datapoint. Motivated by the importance and challenges of group fairness in federated learning, in this work, we propose FairFed, a novel algorithm for fairness-aware aggregation to enhance group fairness in federated learning. Our proposed approach is server-side and agnostic to the applied local debiasing thus allowing for flexible use of different local debiasing methods across clients. We evaluate FairFed empirically versus common baselines for fair ML and federated learning, and demonstrate that it provides fairer models particularly under highly heterogeneous data distributions across clients. We also demonstrate the benefits of FairFed in scenarios involving naturally distributed real-life data collected from different geographical locations or departments within an organization.
Submitted 23 November, 2022; v1 submitted 2 October, 2021;
originally announced October 2021.
-
A Reinforcement Learning Approach for Scheduling in mmWave Networks
Authors:
Mine Gokce Dogan,
Yahya H. Ezzeldin,
Christina Fragouli,
Addison W. Bohannon
Abstract:
We consider a source that wishes to communicate with a destination at a desired rate, over a mmWave network where links are subject to blockage and nodes to failure (e.g., in a hostile military environment). To achieve resilience to link and node failures, we here explore a state-of-the-art Soft Actor-Critic (SAC) deep reinforcement learning algorithm, that adapts the information flow through the network, without using knowledge of the link capacities or network topology. Numerical evaluations show that our algorithm can achieve the desired rate even in dynamic environments and it is robust against blockage.
Submitted 1 August, 2021;
originally announced August 2021.
-
On optimal relay placement in directional networks
Authors:
Mine Gokce Dogan,
Yahya H. Ezzeldin,
Christina Fragouli
Abstract:
In this paper, we study the problem of optimal topology design in wireless networks equipped with highly-directional transmission antennas. We use the 1-2-1 network model to characterize the optimal placement of two relays that assist the communication between a source-destination pair. We analytically show that under some conditions on the distance between the source-destination pair, the optimal topology in terms of maximizing the network throughput is to place the relays as close as possible to the source and the destination.
Submitted 6 February, 2021; v1 submitted 1 February, 2021;
originally announced February 2021.
-
Quantizing data for distributed learning
Authors:
Osama A. Hanna,
Yahya H. Ezzeldin,
Christina Fragouli,
Suhas Diggavi
Abstract:
We consider machine learning applications that train a model by leveraging data distributed over a trusted network, where communication constraints can create a performance bottleneck. A number of recent approaches propose to overcome this bottleneck through compression of gradient updates. However, as models become larger, so does the size of the gradient updates. In this paper, we propose an alternate approach to learn from distributed data that quantizes data instead of gradients, and can support learning over applications where the size of gradient updates is prohibitive. Our approach leverages the dependency of the computed gradient on data samples, which lie in a much smaller space in order to perform the quantization in the smaller dimension data space. At the cost of an extra gradient computation, the gradient estimate can be refined by conveying the difference between the gradient at the quantized data point and the original gradient using a small number of bits. Lastly, in order to save communication, our approach adds a layer that decides whether to transmit a quantized data sample or not based on its importance for learning. We analyze the convergence of the proposed approach for smooth convex and non-convex objective functions and show that we can achieve order optimal convergence rates with communication that mostly depends on the data rather than the model (gradient) dimension. We use our proposed algorithm to train ResNet models on the CIFAR-10 and ImageNet datasets, and show that we can achieve an order of magnitude savings over gradient compression methods. These communication savings come at the cost of increasing computation at the learning agent, and thus our approach is beneficial in scenarios where communication load is the main problem.
Submitted 8 September, 2021; v1 submitted 14 December, 2020;
originally announced December 2020.
-
Gaussian 1-2-1 Networks with Imperfect Beamforming
Authors:
Yahya H. Ezzeldin,
Martina Cardone,
Christina Fragouli,
Giuseppe Caire
Abstract:
In this work, we study bounds on the capacity of full-duplex Gaussian 1-2-1 networks with imperfect beamforming. In particular, different from the ideal 1-2-1 network model introduced in [1], in this model beamforming patterns result in side-lobe leakage that cannot be perfectly suppressed. The 1-2-1 network model captures the directivity of mmWave network communications, where nodes communicate by pointing main-lobe "beams" at each other. We characterize the gap between the approximate capacities of the imperfect and ideal 1-2-1 models for the same channel coefficients and transmit power. We show that, under some conditions, this gap only depends on the number of nodes. Moreover, we evaluate the achievable rate of schemes that treat the resulting side-lobe leakage as noise, and show that they offer suitable solutions for implementation.
Submitted 13 May, 2020;
originally announced May 2020.
-
On Distributed Quantization for Classification
Authors:
Osama A. Hanna,
Yahya H. Ezzeldin,
Tara Sadjadpour,
Christina Fragouli,
Suhas Diggavi
Abstract:
We consider the problem of distributed feature quantization, where the goal is to enable a pretrained classifier at a central node to carry out its classification on features that are gathered from distributed nodes through communication constrained channels. We propose the design of distributed quantization schemes specifically tailored to the classification task: unlike quantization schemes that help the central node reconstruct the original signal as accurately as possible, our focus is not reconstruction accuracy, but instead correct classification. Our work does not make any a priori distributional assumptions on the data, but instead uses training data for the quantizer design. Our main contributions include: we prove NP-hardness of finding optimal quantizers in the general case; we design an optimal scheme for a special case; we propose quantization algorithms that leverage discrete neural representations and training data and can be designed in polynomial time for any number of features, any number of classes, and arbitrary division of features across the distributed nodes. We find that tailoring the quantizers to the classification task can offer significant savings: as compared to alternatives, we can achieve more than a factor of two reduction in terms of the number of bits communicated, for the same classification accuracy.
Submitted 1 November, 2019;
originally announced November 2019.
-
Polynomial-time Capacity Calculation and Scheduling for Half-Duplex 1-2-1 Networks
Authors:
Yahya H. Ezzeldin,
Martina Cardone,
Christina Fragouli,
Giuseppe Caire
Abstract:
This paper studies the 1-2-1 half-duplex network model, where two half-duplex nodes can communicate only if they point "beams" at each other; otherwise, no signal can be exchanged or interference can be generated. The main result of this paper is the design of two polynomial-time algorithms that: (i) compute the approximate capacity of the 1-2-1 half-duplex network and (ii) find the network schedule optimal for the approximate capacity. The paper starts by expressing the approximate capacity as a linear program with an exponential number of constraints. A core technical component consists of building a polynomial-time separation oracle for this linear program, by using algorithmic tools such as perfect matching polytopes and Gomory-Hu trees.
Submitted 9 January, 2019;
originally announced January 2019.
-
Secure Communication over 1-2-1 Networks
Authors:
Gaurav Kumar Agarwal,
Yahya H. Ezzeldin,
Martina Cardone,
Christina Fragouli
Abstract:
This paper starts by assuming a 1-2-1 network, the abstracted noiseless model of mmWave networks that was shown to closely approximate the Gaussian capacity in [1], and studies secure communication. First, the secure capacity is derived for 1-2-1 networks where a source is connected to a destination through a network of unit capacity links. Then, lower and upper bounds on the secure capacity are derived for the case when source and destination have more than one beam, which allow them to transmit and receive in multiple directions at a time. Finally, secure capacity results are presented for diamond 1-2-1 networks when edges have different capacities.
Submitted 12 January, 2018; v1 submitted 9 January, 2018;
originally announced January 2018.
-
Gaussian 1-2-1 Networks: Capacity Results for mmWave Communications
Authors:
Yahya H. Ezzeldin,
Martina Cardone,
Christina Fragouli,
Giuseppe Caire
Abstract:
This paper proposes a new model for wireless relay networks referred to as "1-2-1 network", where two nodes can communicate only if they point "beams" at each other, while if they do not point beams at each other, no signal can be exchanged or interference can be generated. This model is motivated by millimeter wave communications where, due to the high path loss, a link between two nodes can exist only if beamforming gain at both sides is established, while in the absence of beamforming gain the signal is received well below the thermal noise floor. The main result in this paper is that the 1-2-1 network capacity can be approximated by routing information along at most $2N+2$ paths, where $N$ is the number of relays connecting a source and a destination through an arbitrary topology.
Submitted 17 June, 2018; v1 submitted 8 January, 2018;
originally announced January 2018.
-
Wireless Network Simplification: The Performance of Routing
Authors:
Yahya H. Ezzeldin,
Ayan Sengupta,
Christina Fragouli
Abstract:
Consider a wireless Gaussian network where a source wishes to communicate with a destination with the help of $N$ full-duplex relay nodes. Most practical systems today route information from the source to the destination using the best path that connects them. In this paper, we show that routing can in the worst case result in an unbounded gap from the network capacity; conversely, physical layer cooperation can offer unbounded gains over routing. More specifically, we show that for $N$-relay Gaussian networks with an arbitrary topology, routing can in the worst case guarantee an approximate fraction $\frac{1}{\left\lfloor N/2 \right\rfloor + 1}$ of the capacity of the full network, independently of the SNR regime. We prove that this guarantee is fundamental, i.e., it is the highest worst-case guarantee that we can provide for routing in relay networks. Next, we consider how these guarantees are refined for Gaussian layered relay networks with $L$ layers and $N_L$ relays per layer. We prove that for arbitrary $L$ and $N_L$, there always exists a route in the network that approximately achieves at least $\frac{2}{(L-1)N_L + 4}$ $\left(\mbox{resp. }\frac{2}{LN_L+2}\right)$ of the network capacity for odd $L$ (resp. even $L$), and there exist networks where the best routes exactly achieve these fractions. These results are formulated within the network simplification framework, which asks what fraction of the capacity we can achieve by using a subnetwork (in our case, a single path). A fundamental step in our proof is a simplification result for MIMO antenna selection that may also be of independent interest. To the best of our knowledge, this is the first result that characterizes, for general wireless network topologies, the performance of routing with respect to physical layer cooperation techniques that approximately achieve the network capacity.
Submitted 2 November, 2017;
originally announced November 2017.
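To make the worst-case routing guarantees quoted in the abstract above concrete, the following Python sketch evaluates the three fractions exactly. The function names are hypothetical, and the values are the approximate fractions stated in the abstract (which hold up to the gaps discussed in the paper), not exact capacity ratios for any particular network.

```python
from fractions import Fraction

def routing_fraction_general(n_relays):
    """Worst-case fraction 1 / (floor(N/2) + 1) for an N-relay network."""
    return Fraction(1, n_relays // 2 + 1)

def routing_fraction_layered(num_layers, relays_per_layer):
    """Fraction for a layered network with L layers and N_L relays per layer."""
    if num_layers % 2 == 1:
        # Odd L: 2 / ((L - 1) * N_L + 4)
        return Fraction(2, (num_layers - 1) * relays_per_layer + 4)
    # Even L: 2 / (L * N_L + 2)
    return Fraction(2, num_layers * relays_per_layer + 2)

for n in (2, 4, 10):
    print(n, routing_fraction_general(n))
print(routing_fraction_layered(3, 2), routing_fraction_layered(2, 2))
```

The general guarantee shrinks roughly as $2/N$, so the single best path can capture only a vanishing share of the capacity as relays are added.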
-
Half-Duplex Routing is NP-hard
Authors:
Yahya H. Ezzeldin,
Martina Cardone,
Christina Fragouli,
Daniela Tuninetti
Abstract:
Routing is a widespread approach to transfer information from a source node to a destination node in many deployed wireless ad-hoc networks. Today's implemented routing algorithms seek to efficiently find the path/route with the largest Full-Duplex (FD) capacity, which is given by the minimum among the point-to-point link capacities in the path. Such an approach may be suboptimal if the nodes in the selected path are then operated in Half-Duplex (HD) mode. Recently, the capacity (up to a constant gap that only depends on the number of nodes in the path) of an HD line network (i.e., a path) has been shown to be equal to half of the minimum of the harmonic means of the capacities of two consecutive links in the path. This paper asks whether it is possible to design a polynomial-time algorithm that efficiently finds the path with the largest HD capacity in a relay network. The problem of finding that path is shown to be NP-hard in general. However, if the number of cycles in the network is polynomial in the number of nodes, then a polynomial-time algorithm can indeed be designed.
Submitted 10 August, 2017;
originally announced August 2017.
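The two path metrics described in the abstract above can be compared with a short Python sketch. The function names and the example link capacities are made up for illustration, and the HD value is only the approximation quoted in the abstract (accurate up to a constant gap), assuming a path with at least two links.

```python
def fd_path_capacity(links):
    """Full-duplex capacity of a path: the minimum link capacity."""
    return min(links)

def hd_path_capacity_approx(links):
    """Half of the minimum harmonic mean over pairs of consecutive links."""
    if len(links) < 2:
        raise ValueError("need at least two links")
    harmonic_means = [2 * a * b / (a + b) for a, b in zip(links, links[1:])]
    return 0.5 * min(harmonic_means)

# Hypothetical capacities for two candidate paths.
path_a = [10.0, 3.0, 10.0]
path_b = [2.9, 100.0, 2.9]

for name, path in (("A", path_a), ("B", path_b)):
    print(name, fd_path_capacity(path), round(hd_path_capacity_approx(path), 3))
```

In this example path A has the larger FD capacity while path B has the larger approximate HD capacity, which is the mismatch that motivates searching for the best HD path directly.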
-
Communication vs Distributed Computation: an alternative trade-off curve
Authors:
Yahya H. Ezzeldin,
Mohammed Karmoose,
Christina Fragouli
Abstract:
In this paper, we revisit the communication vs. distributed computing trade-off, studied within the framework of MapReduce in [1]. An implicit assumption in the aforementioned work is that each server performs all possible computations on all the files stored in its memory. Our starting observation is that, if servers can compute only the intermediate values they need, then storage constraints do not directly imply computation constraints. We examine how this affects the communication-computation trade-off and suggest that the trade-off be studied with a predetermined storage constraint. We then proceed to examine the case where servers need to perform computationally intensive tasks, and may not have sufficient time to perform all computations required by the scheme in [1]. Given a threshold that limits the computational load, we derive a lower bound on the associated communication load, and propose a heuristic scheme that achieves in some cases the lower bound.
Submitted 24 May, 2017;
originally announced May 2017.
-
Efficiently Finding Simple Schedules in Gaussian Half-Duplex Relay Line Networks
Authors:
Yahya H. Ezzeldin,
Martina Cardone,
Christina Fragouli,
Daniela Tuninetti
Abstract:
The problem of operating a Gaussian Half-Duplex (HD) relay network optimally is challenging due to the exponential number of listen/transmit network states that need to be considered. Recent results have shown that, for the class of Gaussian HD networks with N relays, there always exists a simple schedule, i.e., with at most N+1 active states, that is sufficient for approximate (i.e., up to a constant gap) capacity characterization. This paper investigates how to efficiently find such a simple schedule over line networks. Towards this end, a polynomial-time algorithm is designed and proved to output a simple schedule that achieves the approximate capacity. The key ingredient of the algorithm is to leverage similarities between network states in HD and edge coloring in a graph. It is also shown that the algorithm allows one to derive a closed-form expression for the approximate capacity of the Gaussian line network that can be evaluated distributively and in linear time. Additionally, it is shown using this closed-form expression that the problem of Half-Duplex routing is NP-hard.
Submitted 21 June, 2017; v1 submitted 16 January, 2017;
originally announced January 2017.
-
Consistency in the face of change: an adaptive approach to physical layer cooperation
Authors:
Ayan Sengupta,
Yahya H. Ezzeldin,
Siddhartha Brahma,
Christina Fragouli,
Suhas Diggavi
Abstract:
Most existing works on physical-layer (PHY) cooperation (beyond routing) focus on how to best use a given, static relay network, while wireless networks are anything but static. In this paper, we pose a different set of questions: given that we have multiple devices within range, which relay(s) do we use for PHY cooperation to maintain a consistent target performance? How can we efficiently adapt as network conditions change? And how important is it, in terms of performance, to adapt? Although adapting to the best path when routing is a well-understood problem, how to do so over PHY cooperation networks is an open question. Our contributions are: (1) We demonstrate, via theoretical evaluation, a diminishing-returns trend as the number of deployed relays increases. (2) Using a simple algorithm based on network metrics, we efficiently select the sub-network to use at any given time to maintain a target reliability. (3) When streaming video from Netflix, we experimentally show (using measurements from a WARP radio testbed employing DIQIF relaying) that our adaptive PHY cooperation scheme provides a throughput gain of 2x over nonadaptive PHY schemes, and a gain of 6x over genie-aided IP-level adaptive routing.
Submitted 6 December, 2016;
originally announced December 2016.
-
Network Simplification in Half-Duplex: Building on Submodularity
Authors:
Martina Cardone,
Yahya H. Ezzeldin,
Christina Fragouli,
Daniela Tuninetti
Abstract:
This paper explores the {\it network simplification} problem in the context of Gaussian Half-Duplex (HD) diamond networks. Specifically, given an $N$-relay diamond network, this problem seeks to derive fundamental guarantees on the capacity of the best $k$-relay subnetwork, as a function of the full network capacity. The main focus of this work is on the case when $k=N-1$ relays are selected out of the $N$ possible ones. First, a simple algorithm, which removes the relay with the minimum capacity (i.e., the worst relay), is analyzed and it is shown that the remaining $(N-1)$-relay subnetwork has an approximate (i.e., optimal up to a constant gap) HD capacity that is at least half of the approximate HD capacity of the full network. This fraction guarantee is shown to be tight if only the single relay capacities are known, i.e., there exists a class of Gaussian HD diamond networks with $N$ relays where, by removing the worst relay, the subnetwork of the remaining $k=N-1$ relays has an approximate capacity equal to half of the approximate capacity of the full network. Next, this work proves a fundamental guarantee, which improves over the previous fraction: there always exists a subnetwork of $k=N-1$ relays that achieves at least a fraction $\frac{N-1}{N}$ of the approximate capacity of the full network. This fraction is proved to be tight and it is shown that any optimal schedule of the full network can be used by at least one of the $N$ subnetworks of $N-1$ relays to achieve a worst-case performance guarantee of $\frac{N-1}{N}$. Additionally, these results are extended to derive lower bounds on the fraction guarantee for general $k \in [1:N]$. The key steps in the proofs lie in the derivation of properties of submodular functions, which provide a combinatorial handle on the network simplification problem in Gaussian HD diamond networks.
Submitted 7 July, 2017; v1 submitted 5 July, 2016;
originally announced July 2016.
-
A Note on Antenna Selection in Gaussian MIMO Channels: Capacity Guarantees and Bounds
Authors:
Yahya H. Ezzeldin,
Ayan Sengupta,
Christina Fragouli
Abstract:
We consider the problem of selecting $k_t \times k_r$ antennas from a Gaussian MIMO channel with $n_t \times n_r$ antennas, where $k_t \leq n_t$ and $k_r \leq n_r$. We prove the following two results that hold universally, in the sense that they do not depend on the channel coefficients: (i) The capacity of the best $k_t \times k_r$ subchannel is always lower bounded by a fraction $\frac{k_t k_r}{n_t n_r}$ of the full capacity (with $n_t \times n_r$ antennas). This bound is tight as the channel coefficients diminish in magnitude. (ii) There always exists a selection of $k_t \times k_r$ antennas (including the best) that achieves a fraction greater than $\frac{\min(k_t ,k_r)}{\min(n_t,n_r)}$ of the full capacity within an additive constant that is independent of the coefficients in the channel matrix. The key mathematical idea that allows us to derive these universal bounds is to directly relate the determinants of principal sub-matrices of a Hermitian matrix to the determinant of the entire matrix.
Submitted 16 August, 2016; v1 submitted 21 January, 2016;
originally announced January 2016.
-
Wireless Network Simplification : Beyond Diamond Networks
Authors:
Yahya H. Ezzeldin,
Ayan Sengupta,
Christina Fragouli
Abstract:
We consider an arbitrary layered Gaussian relay network with $L$ layers of $N$ relays each, from which we select subnetworks with $K$ relays per layer. We prove that: (i) For arbitrary $L, N$ and $K = 1$, there always exists a subnetwork that approximately achieves $\frac{2}{(L-1)N + 4}$ $\left(\mbox{resp.}\frac{2}{LN+2}\right)$ of the network capacity for odd $L$ (resp. even $L$), (ii) For $L = 2, N = 3, K = 2$, there always exists a subnetwork that approximately achieves $\frac{1}{2}$ of the network capacity. We also provide example networks where even the best subnetworks achieve exactly these fractions (up to additive gaps). Along the way, we derive some results on MIMO antenna selection and capacity decomposition that may also be of independent interest.
Submitted 26 January, 2016; v1 submitted 21 January, 2016;
originally announced January 2016.
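For a quick sense of scale, the guarantee in result (i) of the abstract above can be evaluated for specific network sizes. The sketch below simply evaluates the two quoted expressions for $K = 1$; the function name is hypothetical, and the additive gaps implied by "approximately achieves" are not modeled.

```python
from fractions import Fraction


def single_relay_fraction(num_layers, relays_per_layer):
    """Guaranteed capacity fraction for K = 1, as quoted in the abstract.

    Odd L:  2 / ((L - 1) * N + 4)
    Even L: 2 / (L * N + 2)
    """
    L, N = num_layers, relays_per_layer
    if L < 1 or N < 1:
        raise ValueError("L and N must be positive integers")
    if L % 2 == 1:
        return Fraction(2, (L - 1) * N + 4)
    return Fraction(2, L * N + 2)


# Example evaluations for a few hypothetical network sizes.
one_layer = single_relay_fraction(1, 5)     # L = 1 (odd)
two_layers = single_relay_fraction(2, 3)    # L = 2 (even)
three_layers = single_relay_fraction(3, 2)  # L = 3 (odd)
```

For $L = 1$ the odd-$L$ expression evaluates to $\frac{1}{2}$ regardless of $N$, while for more layers the guaranteed fraction shrinks as $L$ and $N$ grow.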
-
Pseudo-Lattice Treatment for Subspace Aligned Interference Signals
Authors:
Yahya H. Ezzeldin,
Karim G. Seddik
Abstract:
For multi-input multi-output (MIMO) K-user interference networks, we propose the use of a channel transformation technique for joint detection of the useful and interference signals in an interference alignment scenario. We call our detection technique "pseudo-lattice treatment" and show that, by applying our technique, we can alleviate limitations facing Lattice Interference Alignment (L-IA). We show that for a 3-user interference network, two of the users can have their interference aligned in a lattice structure through precoding. For the remaining user, performance gains in decoding subspace interference aligned signals at the receiver are achieved using our channel transformation technique. Our "pseudo-lattice" technique can also be applied at all users in the case of Subspace Interference Alignment (S-IA). We investigate different solutions for applying channel transformation at the third receiver and evaluate performance for these techniques. Simulations are conducted to show the performance gain in using our pseudo-lattice method over other decoding techniques using different modulation schemes.
Submitted 23 July, 2013;
originally announced July 2013.
-
Sparse Reconstruction-based Detection of Spatial Dimension Holes in Cognitive Radio Networks
Authors:
Yahya H. Ezzeldin,
Radwa A. Sultan,
Karim G. Seddik
Abstract:
In this paper, we investigate a spectrum sensing algorithm for detecting spatial dimension holes in Multiple-Input Multiple-Output (MIMO) transmissions for OFDM systems using Compressive Sensing (CS) tools. This extends the energy detector to allow for detecting transmission opportunities even if the band is already energy filled. We show that the task described above is not performed efficiently by regular MIMO decoders (such as the MMSE decoder) due to possible sparsity in the transmit signal. Since CS reconstruction tools take into account the sparsity order of the signal, they are more efficient in detecting the activity of the users. Building on successful activity detection by the CS detector, we show that the use of CS-aided MMSE decoders yields better performance than using either CS-based or MMSE decoders separately. Simulations are conducted to verify the gains from using the CS detector for Primary User (PU) activity detection and the performance gain in using CS-aided MMSE decoders for decoding the PU information for future relaying.
Submitted 23 July, 2013;
originally announced July 2013.