Search | arXiv e-print repository

Minimum Description Feature Selection for Complexity Reduction in Machine Learning-based Wireless Positioning

Authors: Myeung Suk Oh, Anindya Bijoy Das, Taejoon Kim, David J. Love, Christopher G. Brinton

Abstract: Recently, deep learning approaches have provided solutions to difficult problems in wireless positioning (WP). Although these WP algorithms have attained excellent and consistent performance against complex channel environments, the computational complexity coming from processing high-dimensional features can be prohibitive for mobile applications. In this work, we design a novel positioning neural network (P-NN) that utilizes the minimum description features to substantially reduce the complexity of deep learning-based WP. P-NN's feature selection strategy is based on maximum power measurements and their temporal locations to convey information needed to conduct WP. We improve P-NN's learning ability by intelligently processing two different types of inputs: sparse image and measurement matrices. Specifically, we implement a self-attention layer to reinforce the training ability of our network. We also develop a technique to adapt feature space size, optimizing over the expected information gain and the classification capability quantified with information-theoretic measures on signal bin selection. Numerical results show that P-NN achieves a significant advantage in performance-complexity tradeoff over deep learning baselines that leverage the full power delay profile (PDP). In particular, we find that P-NN achieves a large improvement in performance for low SNR, as unnecessary measurements are discarded in our minimum description features.

Submitted 21 April, 2024; originally announced April 2024.

Comments: This paper has been accepted for publication in IEEE Journal on Selected Areas in Communications. arXiv admin note: text overlap with arXiv:2402.09580
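
The abstract above describes features built from maximum power measurements and their temporal locations rather than the full power delay profile (PDP). The following is only a minimal sketch of that general idea, not the paper's P-NN pipeline: the function name `top_k_power_features`, the fixed `k`, and the toy PDP values are all assumptions made for illustration, and the paper's adaptive, information-theoretic choice of feature space size is not modeled.

```python
def top_k_power_features(pdp, k):
    """Keep the k strongest PDP bins as (delay_index, power) pairs.

    Hypothetical helper: a plain top-k selection, ordered by delay index.
    """
    # Indices of the k largest power values.
    strongest = sorted(range(len(pdp)), key=lambda i: pdp[i], reverse=True)[:k]
    # Report them in temporal (delay) order, with their power values.
    return [(i, pdp[i]) for i in sorted(strongest)]


# Toy power delay profile with 8 delay bins (made-up values).
pdp = [0.1, 0.9, 0.2, 0.05, 0.7, 0.3, 0.02, 0.01]
features = top_k_power_features(pdp, k=3)
print(features)
```

With `k=3`, the eight-bin profile is reduced to three (location, power) pairs, which is the kind of dimensionality reduction the abstract motivates.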

arXiv:2404.14319 [pdf, other]

Multi-Agent Hybrid SAC for Joint SS-DSA in CRNs

Authors: David R. Nickel, Anindya Bijoy Das, David J. Love, Christopher G. Brinton

Abstract: Opportunistic spectrum access has the potential to increase the efficiency of spectrum utilization in cognitive radio networks (CRNs). In CRNs, both spectrum sensing and resource allocation (SSRA) are critical to maximizing system throughput while minimizing collisions of secondary users with the primary network. However, many works in dynamic spectrum access do not consider the impact of imperfect sensing information such as mis-detected channels, which the additional information available in joint SSRA can help remediate. In this work, we examine joint SSRA as an optimization which seeks to maximize a CRN's net communication rate subject to constraints on channel sensing, channel access, and transmit power. Given the non-trivial nature of the problem, we leverage multi-agent reinforcement learning to enable a network of secondary users to dynamically access unoccupied spectrum via only local test statistics, formulated under the energy detection paradigm of spectrum sensing. In doing so, we develop a novel multi-agent implementation of hybrid soft actor critic, MHSAC, based on the QMIX mixing scheme. Through experiments, we find that our SSRA algorithm, HySSRA, is successful in maximizing the CRN's utilization of spectrum resources while also limiting its interference with the primary network, and outperforms the current state-of-the-art by a wide margin. We also explore the impact of wireless variations such as coherence time on the efficacy of the system.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 10 pages. Currently under review for ACM MobiHoc 2024

arXiv:2404.09861 [pdf, other]

Unsupervised Federated Optimization at the Edge: D2D-Enabled Learning without Labels

Authors: Satyavrat Wagle, Seyyedali Hosseinalipour, Naji Khosravan, Christopher G. Brinton

Abstract: Federated learning (FL) is a popular solution for distributed machine learning (ML). While FL has traditionally been studied for supervised ML tasks, in many applications, it is impractical to assume availability of labeled data across devices. To this end, we develop Cooperative Federated unsupervised Contrastive Learning (CF-CL) to facilitate FL across edge devices with unlabeled datasets. CF-CL employs local device cooperation where either explicit (i.e., raw data) or implicit (i.e., embeddings) information is exchanged through device-to-device (D2D) communications to improve local diversity. Specifically, we introduce a smart information push-pull methodology for data/embedding exchange tailored to FL settings with either soft or strict data privacy restrictions. Information sharing is conducted through a probabilistic importance sampling technique at receivers leveraging a carefully crafted reserve dataset provided by transmitters. In the implicit case, embedding exchange is further integrated into the local ML training at the devices via a regularization term incorporated into the contrastive loss, augmented with a dynamic contrastive margin to adjust the volume of latent space explored. Numerical evaluations demonstrate that CF-CL leads to alignment of latent spaces learned across devices, results in faster and more efficient global model training, and is effective in extreme non-i.i.d. data distribution settings across devices.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 16 pages, 11 figures

arXiv:2404.08003 [pdf, other]

Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis

Authors: Guangchen Lan, Dong-Jun Han, Abolfazl Hashemi, Vaneet Aggarwal, Christopher G. Brinton

Abstract: To improve the efficiency of reinforcement learning, we propose a novel asynchronous federated reinforcement learning framework termed AFedPG, which constructs a global model through collaboration among $N$ agents using policy gradient (PG) updates. To handle the challenge of lagged policies in asynchronous settings, we design delay-adaptive lookahead and normalized update techniques that can effectively handle the heterogeneous arrival times of policy gradients. We analyze the theoretical global convergence bound of AFedPG, and characterize the advantage of the proposed algorithm in terms of both the sample complexity and time complexity. Specifically, our AFedPG method achieves $\mathcal{O}(\frac{\epsilon^{-2.5}}{N})$ sample complexity at each agent on average. Compared to the single agent setting with $\mathcal{O}(\epsilon^{-2.5})$ sample complexity, it enjoys a linear speedup with respect to the number of agents. Moreover, compared to synchronous FedPG, AFedPG improves the time complexity from $\mathcal{O}(\frac{t_{\max}}{N})$ to $\mathcal{O}(\frac{1}{\sum_{i=1}^{N} \frac{1}{t_{i}}})$, where $t_{i}$ denotes the time consumption in each iteration at the agent $i$, and $t_{\max}$ is the largest one. The latter complexity $\mathcal{O}(\frac{1}{\sum_{i=1}^{N} \frac{1}{t_{i}}})$ is always smaller than the former one, and this improvement becomes significant in large-scale federated settings with heterogeneous computing powers ($t_{\max}\gg t_{\min}$). Finally, we empirically verify the improved performances of AFedPG in three MuJoCo environments with varying numbers of agents. We also demonstrate the improvements with different computing heterogeneity.

Submitted 14 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

ACM Class: I.2.6; I.2.11
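
The time-complexity comparison in the abstract above can be checked numerically. Since every $t_i \le t_{\max}$, we have $\sum_{i=1}^{N} \frac{1}{t_i} \ge \frac{N}{t_{\max}}$, so $\frac{1}{\sum_{i} 1/t_i} \le \frac{t_{\max}}{N}$, with equality when all agents are equally fast. The sketch below only evaluates the two scaling terms quoted in the abstract; the function names and the per-iteration times are made up for illustration and are not taken from the paper.

```python
def sync_time_scaling(times):
    """Synchronous scaling term from the abstract: t_max / N."""
    return max(times) / len(times)


def async_time_scaling(times):
    """Asynchronous scaling term from the abstract: 1 / sum_i (1 / t_i)."""
    return 1.0 / sum(1.0 / t for t in times)


# Hypothetical per-iteration times for N = 4 agents.
homogeneous = [2.0, 2.0, 2.0, 2.0]      # all agents equally fast
heterogeneous = [1.0, 1.0, 1.0, 10.0]   # one slow agent (t_max >> t_min)

for name, times in [("homogeneous", homogeneous), ("heterogeneous", heterogeneous)]:
    print(name, sync_time_scaling(times), async_time_scaling(times))
```

In the homogeneous case both terms equal 0.5, while in the heterogeneous case the synchronous term is 2.5 and the asynchronous term is about 0.32, which illustrates why the gain grows with computing heterogeneity.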

arXiv:2402.09629 [pdf, other]

Smart Information Exchange for Unsupervised Federated Learning via Reinforcement Learning

Authors: Seohyun Lee, Anindya Bijoy Das, Satyavrat Wagle, Christopher G. Brinton

Abstract: One of the main challenges of decentralized machine learning paradigms such as Federated Learning (FL) is the presence of local non-i.i.d. datasets. Device-to-device (D2D) transfers between distributed devices have been shown to be an effective tool for dealing with this problem and to be robust to stragglers. In an unsupervised case, however, it is not obvious how data exchanges should take place due to the absence of labels. In this paper, we propose an approach to create an optimal graph for data transfer using Reinforcement Learning. The goal is to form links that will provide the most benefit considering the environment's constraints and improve convergence speed in an unsupervised FL environment. Numerical analysis shows the advantages of the proposed method, in terms of convergence speed and straggler resilience, over different available FL schemes on benchmark datasets.

Submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.09580 [pdf, other]

Complexity Reduction in Machine Learning-Based Wireless Positioning: Minimum Description Features

Authors: Myeung Suk Oh, Anindya Bijoy Das, Taejoon Kim, David J. Love, Christopher G. Brinton

Abstract: A recent line of research has been investigating deep learning approaches to wireless positioning (WP). Although these WP algorithms have demonstrated high accuracy and robust performance against diverse channel conditions, they also have a major drawback: they require processing high-dimensional features, which can be prohibitive for mobile applications. In this work, we design a positioning neural network (P-NN) that substantially reduces the complexity of deep learning-based WP through carefully crafted minimum description features. Our feature selection is based on maximum power measurements and their temporal locations to convey information needed to conduct WP. We also develop a novel methodology for adaptively selecting the size of feature space, which optimizes over balancing the expected amount of useful information and classification capability, quantified using information-theoretic measures on the signal bin selection. Numerical results show that P-NN achieves a significant advantage in performance-complexity tradeoff over deep learning baselines that leverage the full power delay profile (PDP).

Submitted 14 February, 2024; originally announced February 2024.

Comments: This paper has been accepted in IEEE International Conference on Communications (ICC) 2024

arXiv:2402.03448 [pdf, other]

Decentralized Sporadic Federated Learning: A Unified Algorithmic Framework with Convergence Guarantees

Authors: Shahryar Zehtabi, Dong-Jun Han, Rohit Parasnis, Seyyedali Hosseinalipour, Christopher G. Brinton

Abstract: Decentralized federated learning (DFL) captures FL settings where both (i) model updates and (ii) model aggregations are exclusively carried out by the clients without a central server. Existing DFL works have mostly focused on settings where clients conduct a fixed number of local updates between local model exchanges, overlooking heterogeneity and dynamics in communication and computation capabilities. In this work, we propose Decentralized Sporadic Federated Learning (DSpodFL), a DFL methodology built on a generalized notion of sporadicity in both local gradient and aggregation processes. DSpodFL subsumes many existing decentralized optimization methods under a unified algorithmic framework by modeling the per-iteration (i) occurrence of gradient descent at each client and (ii) exchange of models between client pairs as arbitrary indicator random variables, thus capturing heterogeneous and time-varying computation/communication scenarios. We analytically characterize the convergence behavior of DSpodFL for both convex and non-convex models, for both constant and diminishing learning rates, under mild assumptions on the communication graph connectivity, data heterogeneity across clients, and gradient noises, and show how our bounds recover existing results as special cases. Experiments demonstrate that DSpodFL consistently achieves improved training speeds compared with baselines under various system settings.

Submitted 31 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.02225 [pdf, other]

Rethinking the Starting Point: Collaborative Pre-Training for Federated Downstream Tasks

Authors: Yun-Wei Chu, Dong-Jun Han, Seyyedali Hosseinalipour, Christopher G. Brinton

Abstract: A few recent studies have demonstrated that leveraging centrally pre-trained models can offer advantageous initializations for federated learning (FL). However, existing pre-training methods do not generalize well when faced with an arbitrary set of downstream FL tasks. Specifically, they often (i) achieve limited average accuracy, particularly when there are unseen downstream labels, and (ii) result in significant accuracy variance, failing to provide a balanced performance across clients. To address these challenges, we propose CoPreFL, a collaborative/distributed pre-training approach which provides a robust initialization for downstream FL tasks. The key idea of CoPreFL is a model-agnostic meta-learning (MAML) procedure that tailors the global model to closely mimic heterogeneous and unseen FL scenarios, resulting in a pre-trained model that is rapidly adaptable to arbitrary FL tasks. Our MAML procedure incorporates performance variance into the meta-objective function, balancing performance across clients rather than solely optimizing for accuracy. Through extensive experiments, we demonstrate that CoPreFL obtains significant improvements in both average accuracy and variance across arbitrary downstream FL tasks with unseen/seen labels, compared with various pre-training baselines. We also show how CoPreFL is compatible with different well-known FL algorithms applied by the downstream tasks, enhancing performance in each case.

Submitted 6 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

arXiv:2401.16685 [pdf, other]

Communication-Efficient Multimodal Federated Learning: Joint Modality and Client Selection

Authors: Liangqi Yuan, Dong-Jun Han, Su Wang, Devesh Upadhyay, Christopher G. Brinton

Abstract: Multimodal federated learning (FL) aims to enrich model training in FL settings where clients are collecting measurements across multiple modalities. However, key challenges to multimodal FL remain unaddressed, particularly in heterogeneous network settings where: (i) the set of modalities collected by each client will be diverse, and (ii) communication limitations prevent clients from uploading all their locally trained modality models to the server. In this paper, we propose multimodal Federated learning with joint Modality and Client selection (mmFedMC), a new FL methodology that can tackle the above-mentioned challenges in multimodal settings. The joint selection algorithm incorporates two main components: (a) A modality selection methodology for each client, which weighs (i) the impact of the modality, gauged by Shapley value analysis, (ii) the modality model size as a gauge of communication overhead, against (iii) the frequency of modality model updates, denoted recency, to enhance generalizability. (b) A client selection strategy for the server based on the local loss of modality model at each client. Experiments on five real-world datasets demonstrate the ability of mmFedMC to achieve comparable accuracy to several baselines while reducing the communication overhead by over 20x. A demo video of our methodology is available at https://liangqiy.com/mmfedmc/.

Submitted 29 January, 2024; originally announced January 2024.

Comments: arXiv admin note: text overlap with arXiv:2310.07048

arXiv:2401.07456 [pdf, other]

Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation

Authors: Yun-Wei Chu, Dong-Jun Han, Christopher G. Brinton

Abstract: Federated learning (FL) is a promising approach for solving multilingual tasks, potentially enabling clients with their own language-specific data to collaboratively construct a high-quality neural machine translation (NMT) model. However, communication constraints in practical network systems present challenges for exchanging large-scale NMT engines between FL parties. In this paper, we propose a meta-learning-based adaptive parameter selection methodology, MetaSend, that improves the communication efficiency of model transmissions from clients during FL-based multilingual NMT training. Our approach learns a dynamic threshold for filtering parameters prior to transmission without compromising the NMT model quality, based on the tensor deviations of clients between different FL rounds. Through experiments on two NMT datasets with different language distributions, we demonstrate that MetaSend obtains substantial improvements over baselines in translation quality in the presence of a limited communication budget.

Submitted 14 January, 2024; originally announced January 2024.

arXiv:2401.00477 [pdf, other]

Coding for Gaussian Two-Way Channels: Linear and Learning-Based Approaches

Authors: Junghoon Kim, Taejoon Kim, Anindya Bijoy Das, Seyyedali Hosseinalipour, David J. Love, Christopher G. Brinton

Abstract: Although user cooperation cannot improve the capacity of Gaussian two-way channels (GTWCs) with independent noises, it can improve communication reliability. In this work, we aim to enhance and balance the communication reliability in GTWCs by minimizing the sum of error probabilities via joint design of encoders and decoders at the users. We first formulate general encoding/decoding functions, where the user cooperation is captured by the coupling of user encoding processes. The coupling effect renders the encoder/decoder design non-trivial, requiring effective decoding to capture this effect, as well as efficient power management at the encoders within power constraints. To address these challenges, we propose two different two-way coding strategies: linear coding and learning-based coding. For linear coding, we propose optimal linear decoding and discuss new insights on encoding regarding user cooperation to balance reliability. We then propose an efficient algorithm for joint encoder/decoder design. For learning-based coding, we introduce a novel recurrent neural network (RNN)-based coding architecture, where we propose interactive RNNs and a power control layer for encoding, and we incorporate bi-directional RNNs with an attention mechanism for decoding. Through simulations, we show that our two-way coding methodologies outperform conventional channel coding schemes (that do not utilize user cooperation) significantly in sum-error performance. We also demonstrate that our linear coding excels at high signal-to-noise ratios (SNRs), while our RNN-based coding performs best at low SNRs. We further investigate our two-way coding strategies in terms of power distribution, two-way coding benefit, different coding rates, and block-length gain.

Submitted 31 December, 2023; originally announced January 2024.

Comments: This work has been submitted to the IEEE Transactions on Information Theory

arXiv:2312.16638 [pdf, other]

Fault-Tolerant Vertical Federated Learning on Dynamic Networks

Authors: Surojit Ganguli, Zeyu Zhou, Christopher G. Brinton, David I. Inouye

Abstract: Vertical federated learning (VFL) is a class of FL where each client shares the same sample space but only holds a subset of the features. While VFL tackles key privacy challenges of distributed learning, it often assumes perfect hardware and communication capabilities. This assumption hinders the broad deployment of VFL, particularly on edge devices, which are heterogeneous in their in-situ capabilities and will connect/disconnect from the network over time. To address this gap, we define Internet Learning (IL), including its data splitting and network context, which sets good performance under extremely dynamic client conditions as the primary goal. We propose VFL as a naive baseline and develop several extensions to handle the IL paradigm of learning. Furthermore, we implement new methods, propose metrics, and extensively analyze results based on simulating a sensor network. The results show that the developed methods are more robust to changes in the network than the VFL baseline.

Submitted 27 December, 2023; originally announced December 2023.

arXiv:2312.15361 [pdf, other]

Cooperative Federated Learning over Ground-to-Satellite Integrated Networks: Joint Local Computation and Data Offloading

Authors: Dong-Jun Han, Seyyedali Hosseinalipour, David J. Love, Mung Chiang, Christopher G. Brinton

Abstract: While network coverage maps continue to expand, many devices located in remote areas remain unconnected to terrestrial communication infrastructures, preventing them from getting access to the associated data-driven services. In this paper, we propose a ground-to-satellite cooperative federated learning (FL) methodology to facilitate machine learning service management over remote regions. Our methodology orchestrates satellite constellations to provide the following key functions during FL: (i) processing data offloaded from ground devices, (ii) aggregating models within device clusters, and (iii) relaying models/data to other satellites via inter-satellite links (ISLs). Due to the limited coverage time of each satellite over a particular remote area, we facilitate satellite transmission of trained models and acquired data to neighboring satellites via ISL, so that the incoming satellite can continue conducting FL for the region. We theoretically analyze the convergence behavior of our algorithm, and develop a training latency minimizer which optimizes over satellite-specific network resources, including the amount of data to be offloaded from ground devices to satellites and satellites' computation speeds. Through experiments on three datasets, we show that our methodology can significantly speed up the convergence of FL compared with terrestrial-only and other satellite baseline approaches.

Submitted 23 December, 2023; originally announced December 2023.

Comments: This paper is accepted for publication in IEEE Journal on Selected Areas in Communications (JSAC)

arXiv:2312.04728 [pdf, other]

Taming Subnet-Drift in D2D-Enabled Fog Learning: A Hierarchical Gradient Tracking Approach

Authors: Evan Chen, Shiqiang Wang, Christopher G. Brinton

Abstract: Federated learning (FL) encounters scalability challenges when implemented over fog networks. Semi-decentralized FL (SD-FL) proposes a solution that divides model cooperation into two stages: at the lower stage, device-to-device (D2D) communications are employed for local model aggregations within subnetworks (subnets), while the upper stage handles device-server (DS) communications for global model aggregations. However, existing SD-FL schemes are based on gradient diversity assumptions that become performance bottlenecks as data distributions become more heterogeneous. In this work, we develop semi-decentralized gradient tracking (SD-GT), the first SD-FL methodology that removes the need for such assumptions by incorporating tracking terms into device updates for each communication layer. Analytical characterization of SD-GT reveals convergence upper bounds for both non-convex and strongly-convex problems, for a suitable choice of step size. We employ the resulting bounds in the development of a co-optimization algorithm for optimizing subnet sampling rates and D2D rounds according to a performance-efficiency trade-off. Our subsequent numerical evaluations demonstrate that SD-GT obtains substantial improvements in trained model quality and communication cost relative to baselines in SD-FL and gradient tracking on several datasets.

Submitted 9 January, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

Comments: This paper is accepted for publication in the proceedings of 2024 IEEE International Conference on Computer Communications (INFOCOM)

arXiv:2311.07946 [pdf, other]

The Impact of Adversarial Node Placement in Decentralized Federated Learning Networks

Authors: Adam Piaseczny, Eric Ruzomberka, Rohit Parasnis, Christopher G. Brinton

Abstract: As Federated Learning (FL) grows in popularity, new decentralized frameworks are becoming widespread. These frameworks leverage the benefits of decentralized environments to enable fast and energy-efficient inter-device communication. However, this growing popularity also intensifies the need for robust security measures. While existing research has explored various aspects of FL security, the role of adversarial node placement in decentralized networks remains largely unexplored. This paper addresses this gap by analyzing the performance of decentralized FL for various adversarial placement strategies when adversaries can jointly coordinate their placement within a network. We establish two baseline strategies for placing adversarial nodes: random placement and network centrality-based placement. Building on this foundation, we propose a novel attack algorithm that prioritizes adversarial spread over adversarial centrality by maximizing the average network distance between adversaries. We show that the new attack algorithm significantly impacts key performance metrics such as testing accuracy, outperforming the baseline frameworks by between $9\%$ and $66.5\%$ for the considered setups. Our findings provide valuable insights into the vulnerabilities of decentralized FL systems, setting the stage for future research aimed at developing more secure and robust decentralized FL frameworks.

Submitted 19 March, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: Accepted to ICC 2024 conference

arXiv:2311.04350 [pdf, other]

Device Sampling and Resource Optimization for Federated Learning in Cooperative Edge Networks

Authors: Su Wang, Roberto Morabito, Seyyedali Hosseinalipour, Mung Chiang, Christopher G. Brinton

Abstract: The conventional federated learning (FedL) architecture distributes machine learning (ML) across worker devices by having them train local models that are periodically aggregated by a server. FedL ignores two important characteristics of contemporary wireless networks, however: (i) the network may contain heterogeneous communication/computation resources, and (ii) there may be significant overlaps in devices' local data distributions. In this work, we develop a novel optimization methodology that jointly accounts for these factors via intelligent device sampling complemented by device-to-device (D2D) offloading. Our optimization methodology aims to select the best combination of sampled nodes and data offloading configuration to maximize FedL training accuracy while minimizing data processing and D2D communication resource consumption subject to realistic constraints on the network topology and device capabilities. Theoretical analysis of the D2D offloading subproblem leads to new FedL convergence bounds and an efficient sequential convex optimizer. Using these results, we develop a sampling methodology based on graph convolutional networks (GCNs) which learns the relationship between network attributes, sampled nodes, and D2D data offloading to maximize FedL accuracy. Through evaluation on popular datasets and real-world network measurements from our edge testbed, we find that our methodology outperforms popular device sampling methodologies from literature in terms of ML model performance, data processing overhead, and energy consumption.

Submitted 7 November, 2023; originally announced November 2023.

Comments: Submitted to IEEE/ACM Transactions on Networking. arXiv admin note: substantial text overlap with arXiv:2101.00787

arXiv:2311.00227 [pdf, other]

StableFDG: Style and Attention Based Learning for Federated Domain Generalization

Authors: Jungwuk Park, Dong-Jun Han, Jinho Kim, Shiqiang Wang, Christopher G. Brinton, Jaekyun Moon

Abstract: Traditional federated learning (FL) algorithms operate under the assumption that the data distributions at training (source domains) and testing (target domain) are the same. The fact that domain shifts often occur in practice necessitates equipping FL methods with a domain generalization (DG) capability. However, existing DG algorithms face fundamental challenges in FL setups due to the lack of samples/domains in each client's local dataset. In this paper, we propose StableFDG, a style and attention based learning strategy for accomplishing federated domain generalization, introducing two key contributions. The first is style-based learning, which enables each client to explore novel styles beyond the original source domains in its local dataset, improving domain diversity based on the proposed style sharing, shifting, and exploration strategies. Our second contribution is an attention-based feature highlighter, which captures the similarities between the features of data samples in the same class, and emphasizes the important/common characteristics to better learn the domain-invariant characteristics of each class in data-poor FL scenarios. Experimental results show that StableFDG outperforms existing baselines on various DG benchmark datasets, demonstrating its efficacy.

Submitted 31 October, 2023; originally announced November 2023.

Comments: Accepted at NeurIPS 2023, 19 pages

arXiv:2310.17890 [pdf, other]

Submodel Partitioning in Hierarchical Federated Learning: Algorithm Design and Convergence Analysis

Authors: Wenzhi Fang, Dong-Jun Han, Christopher G. Brinton

Abstract: Hierarchical federated learning (HFL) has demonstrated promising scalability advantages over the traditional "star-topology" architecture-based federated learning (FL). However, HFL still imposes significant computation, communication, and storage burdens on the edge, especially when training a large-scale model over resource-constrained Internet of Things (IoT) devices. In this paper, we propose hierarchical independent submodel training (HIST), a new FL methodology that aims to address these issues in hierarchical settings. The key idea behind HIST is a hierarchical version of model partitioning, where we partition the global model into disjoint submodels in each round, and distribute them across different cells, so that each cell is responsible for training only one partition of the full model. This enables each client to save computation/storage costs while alleviating the communication loads throughout the hierarchy. We characterize the convergence behavior of HIST for non-convex loss functions under mild assumptions, showing the impact of several attributes (e.g., number of cells, local and global aggregation frequency) on the performance-efficiency tradeoff. Finally, through numerical experiments, we verify that HIST is able to save communication costs by a wide margin while achieving the same target testing accuracy.

Submitted 27 October, 2023; originally announced October 2023.

Comments: 14 pages, 4 figures

arXiv:2310.10804 [pdf, other]

Constant Modulus Waveform Design with Block-Level Interference Exploitation for DFRC Systems

Authors: Byunghyun Lee, Anindya Bijoy Das, David J. Love, Christopher G. Brinton, James V. Krogmeier

Abstract: Dual-functional radar-communication (DFRC) is a promising technology where radar and communication functions operate on the same spectrum and hardware. In this paper, we propose an algorithm for designing constant modulus waveforms for DFRC systems. Particularly, we jointly optimize the correlation properties and the spatial beam pattern. For communication, we employ constructive interference-based block-level precoding (CI-BLP) to exploit distortion due to multi-user and radar transmission. We propose a majorization-minimization (MM)-based solution to the formulated problem. To accelerate convergence, we propose an improved majorizing function that leverages a novel diagonal matrix structure. We then evaluate the performance of the proposed algorithm through rigorous simulations. Simulation results demonstrate the effectiveness of the proposed approach and the proposed majorizer.

Submitted 6 April, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted to IEEE International Conference on Communications (ICC) 2024

arXiv:2310.07048 [pdf, other]

FedMFS: Federated Multimodal Fusion Learning with Selective Modality Communication

Authors: Liangqi Yuan, Dong-Jun Han, Vishnu Pandi Chellapandi, Stanislaw H. Żak, Christopher G. Brinton

Abstract: Multimodal federated learning (FL) aims to enrich model training in FL settings where devices are collecting measurements across multiple modalities (e.g., sensors measuring pressure, motion, and other types of data). However, key challenges to multimodal FL remain unaddressed, particularly in heterogeneous network settings: (i) the set of modalities collected by each device will be diverse, and (ii) communication limitations prevent devices from uploading all their locally trained modality models to the server. In this paper, we propose Federated Multimodal Fusion learning with Selective modality communication (FedMFS), a new multimodal fusion FL methodology that can tackle the above-mentioned challenges. The key idea is the introduction of a modality selection criterion for each device, which weighs (i) the impact of the modality, gauged by Shapley value analysis, against (ii) the modality model size as a gauge for communication overhead. This enables FedMFS to flexibly balance performance against communication costs, depending on resource constraints and application requirements. Experiments on the real-world ActionSense dataset demonstrate the ability of FedMFS to achieve comparable accuracy to several baselines while reducing the communication overhead by over 4x.

Submitted 12 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: ICC 2024

arXiv:2310.03178 [pdf, other]

Digital Ethics in Federated Learning

Authors: Liangqi Yuan, Ziran Wang, Christopher G. Brinton

Abstract: The Internet of Things (IoT) consistently generates vast amounts of data, sparking increasing concern over the protection of data privacy and the limitation of data misuse. Federated learning (FL) facilitates collaborative capabilities among multiple parties by sharing machine learning (ML) model parameters instead of raw user data, and it has recently gained significant attention for its potential in privacy preservation and learning efficiency enhancement. In this paper, we highlight the digital ethics concerns that arise when human-centric devices serve as clients in FL. More specifically, challenges of game dynamics, fairness, incentive, and continuity arise in FL due to differences in perspectives and objectives between clients and the server. We analyze these challenges and their solutions from the perspectives of both the client and the server, and through the viewpoints of centralized and decentralized FL. Finally, we explore the opportunities in FL for human-centric IoT as directions for future development.

Submitted 18 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

arXiv:2308.10407 [pdf, other]

Federated Learning for Connected and Automated Vehicles: A Survey of Existing Approaches and Challenges

Authors: Vishnu Pandi Chellapandi, Liangqi Yuan, Christopher G. Brinton, Stanislaw H Zak, Ziran Wang

Abstract: Machine learning (ML) is widely used for key tasks in Connected and Automated Vehicles (CAV), including perception, planning, and control. However, its reliance on vehicular data for model training presents significant challenges related to in-vehicle user privacy and communication overhead generated by massive data volumes. Federated learning (FL) is a decentralized ML approach that enables multiple vehicles to collaboratively develop models, broadening learning from various driving environments, enhancing overall performance, and simultaneously securing local vehicle data privacy and security. This survey paper presents a review of the advancements made in the application of FL for CAV (FL4CAV). First, centralized and decentralized frameworks of FL are analyzed, highlighting their key characteristics and methodologies. Second, diverse data sources, models, and data security techniques relevant to FL in CAVs are reviewed, emphasizing their significance in ensuring privacy and confidentiality. Third, specific applications of FL are explored, providing insight into the base models and datasets employed for each application. Finally, existing challenges for FL4CAV are listed and potential directions for future investigation to further enhance the effectiveness and efficiency of FL in the context of CAV are discussed.

Submitted 11 November, 2023; v1 submitted 20 August, 2023; originally announced August 2023.

Comments: IEEE Transactions on Intelligent Vehicles

arXiv:2308.04331 [pdf, ps, other]

Preserving Sparsity and Privacy in Straggler-Resilient Distributed Matrix Computations

Authors: Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton

Abstract: Existing approaches to distributed matrix computations involve allocating coded combinations of submatrices to worker nodes, to build resilience to stragglers and/or enhance privacy. In this study, we consider the challenge of preserving input sparsity in such approaches to retain the associated computational efficiency enhancements. First, we find a lower bound on the weight of coding, i.e., the number of submatrices to be combined to obtain coded submatrices, to provide resilience to the maximum possible number of stragglers (for a given number of nodes and their storage constraints). Next, we propose a distributed matrix computation scheme which meets this exact lower bound on the weight of the coding. Further, we develop a controllable trade-off between worker computation time and the privacy constraint for sparse input matrices in settings where the worker nodes are honest but curious. Numerical experiments conducted in Amazon Web Services (AWS) validate our assertions regarding straggler mitigation and computation speed for sparse matrices.

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2308.03933 [pdf, other]

A Reinforcement Learning-Based Approach to Graph Discovery in D2D-Enabled Federated Learning

Authors: Satyavrat Wagle, Anindya Bijoy Das, David J. Love, Christopher G. Brinton

Abstract: Augmenting federated learning (FL) with direct device-to-device (D2D) communications can help improve convergence speed and reduce model bias through rapid local information exchange. However, data privacy concerns, device trust issues, and unreliable wireless channels each pose challenges to determining an effective yet resource efficient D2D structure. In this paper, we develop a decentralized reinforcement learning (RL) methodology for D2D graph discovery that promotes communication of non-sensitive yet impactful data-points over trusted yet reliable links. Each device functions as an RL agent, training a policy to predict the impact of incoming links. Local (device-level) and global rewards are coupled through message passing within and between device clusters. Numerical experiments confirm the advantages offered by our method in terms of convergence speed and straggler resilience across several datasets and FL schemes.

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2307.10805 [pdf, ps, other]

Communication-Efficient Split Learning via Adaptive Feature-Wise Compression

Authors: Yongjeong Oh, Jaeho Lee, Christopher G. Brinton, Yo-Seb Jeon

Abstract: This paper proposes a novel communication-efficient split learning (SL) framework, named SplitFC, which reduces the communication overhead required for transmitting intermediate feature and gradient vectors during the SL training process. The key idea of SplitFC is to leverage different dispersion degrees exhibited in the columns of the matrices. SplitFC incorporates two compression strategies: (i) adaptive feature-wise dropout and (ii) adaptive feature-wise quantization. In the first strategy, the intermediate feature vectors are dropped with adaptive dropout probabilities determined based on the standard deviation of these vectors. Then, by the chain rule, the intermediate gradient vectors associated with the dropped feature vectors are also dropped. In the second strategy, the non-dropped intermediate feature and gradient vectors are quantized using adaptive quantization levels determined based on the ranges of the vectors. To minimize the quantization error, the optimal quantization levels of this strategy are derived in a closed-form expression. Simulation results on the MNIST, CIFAR-10, and CelebA datasets demonstrate that SplitFC provides more than a 5.6% increase in classification accuracy compared to state-of-the-art SL frameworks, while requiring 320 times less communication overhead compared to the vanilla SL framework without compression.

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2307.05528 [pdf, ps, other]

On Pseudolinear Codes for Correcting Adversarial Errors

Authors: Eric Ruzomberka, Homa Nikbakht, Christopher G. Brinton, H. Vincent Poor

Abstract: We consider error-correction coding schemes for adversarial wiretap channels (AWTCs) in which the channel can a) read a fraction of the codeword bits up to a bound $r$ and b) flip a fraction of the bits up to a bound $p$. The channel can freely choose the locations of the bit reads and bit flips via a process with unbounded computational power. Codes for the AWTC are of broad interest in the area of information security, as they can provide data resiliency in settings where an attacker has limited access to a storage or transmission medium. We investigate a family of non-linear codes known as pseudolinear codes, which were first proposed by Guruswami and Indyk (FOCS 2001) for constructing list-decodable codes independent of the AWTC setting. Unlike general non-linear codes, pseudolinear codes admit efficient encoders and have succinct representations. We focus on unique decoding and show that random pseudolinear codes can achieve rates up to the binary symmetric channel (BSC) capacity $1-H_2(p)$ for any $p,r$ in the less noisy region: $p<1/2$ and $r<1-H_2(p)$ where $H_2(\cdot)$ is the binary entropy function. Thus, pseudolinear codes are the first known optimal-rate binary code family for the less noisy AWTC that admit efficient encoders. The above result can be viewed as a derandomization result of random general codes in the AWTC setting, which in turn opens new avenues for applying derandomization techniques to randomized constructions of AWTC codes. Our proof applies a novel concentration inequality for sums of random variables with limited independence which may be of interest as an analysis tool more generally.

Submitted 7 July, 2023; originally announced July 2023.

arXiv:2307.04222 [pdf, other]

Derandomizing Codes for the Binary Adversarial Wiretap Channel of Type II

Authors: Eric Ruzomberka, Homa Nikbakht, Christopher G. Brinton, David J. Love, H. Vincent Poor

Abstract: We revisit the binary adversarial wiretap channel (AWTC) of type II in which an active adversary can read a fraction $r$ and flip a fraction $p$ of codeword bits. The semantic-secrecy capacity of the AWTC II is partially known, where the best-known lower bound is non-constructive, proven via a random coding argument that uses a large number (that is exponential in blocklength $n$) of random bits to seed the random code. In this paper, we establish a new derandomization result in which we match the best-known lower bound of $1-H_2(p)-r$ where $H_2(\cdot)$ is the binary entropy function via a random code that uses a small seed of only $O(n^2)$ bits. Our random code construction is a novel application of pseudolinear codes -- a class of non-linear codes that have $k$-wise independent codewords when picked at random where $k$ is a design parameter. As the key technical tool in our analysis, we provide a soft-covering lemma in the flavor of Goldfeld, Cuff and Permuter (Trans. Inf. Theory 2016) that holds for random codes with $k$-wise independent codewords.

Submitted 9 July, 2023; originally announced July 2023.

arXiv:2306.04872 [pdf, other]

Mitigating Evasion Attacks in Federated Learning-Based Signal Classifiers

Authors: Su Wang, Rajeev Sahay, Adam Piaseczny, Christopher G. Brinton

Abstract: There has been recent interest in leveraging federated learning (FL) for radio signal classification tasks. In FL, model parameters are periodically communicated from participating devices, which train on local datasets, to a central server which aggregates them into a global model. While FL has privacy/security advantages due to raw data not leaving the devices, it is still susceptible to adversarial attacks. In this work, we first reveal the susceptibility of FL-based signal classifiers to model poisoning attacks, which compromise the training process despite not observing data transmissions. In this capacity, we develop an attack framework that significantly degrades the training process of the global model. Our attack framework induces a more potent model poisoning attack to the global classifier than existing baselines while also being able to compromise existing server-driven defenses. In response to this gap, we develop Underlying Server Defense of Federated Learning (USD-FL), a novel defense methodology for FL-based signal classifiers. We subsequently compare the defensive efficacy, runtimes, and false positive detection rates of USD-FL relative to existing server-driven defenses, showing that USD-FL has notable advantages over the baseline defenses in all three areas.

Submitted 7 June, 2023; originally announced June 2023.

Comments: Submitted to IEEE Transactions on Cognitive Communications and Networking. arXiv admin note: substantial text overlap with arXiv:2301.08866

arXiv:2306.01603 [pdf, other]

Decentralized Federated Learning: A Survey and Perspective

Authors: Liangqi Yuan, Ziran Wang, Lichao Sun, Philip S. Yu, Christopher G. Brinton

Abstract: Federated learning (FL) has been gaining attention for its ability to share knowledge while maintaining user data, protecting privacy, increasing learning efficiency, and reducing communication overhead. Decentralized FL (DFL) is a decentralized network architecture that eliminates the need for a central server in contrast to centralized FL (CFL). DFL enables direct communication between clients, resulting in significant savings in communication resources. In this paper, a comprehensive survey and profound perspective are provided for DFL. First, a review of the methodology, challenges, and variants of CFL is conducted, laying the background of DFL. Then, a systematic and detailed perspective on DFL is introduced, including iteration order, communication protocols, network topologies, paradigm proposals, and temporal variability. Next, based on the definition of DFL, several extended variants and categorizations are proposed with state-of-the-art (SOTA) technologies. Lastly, in addition to summarizing the current challenges in DFL, some possible solutions and future research directions are also discussed.

Submitted 4 May, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

arXiv:2305.13503 [pdf, other]

Asynchronous Multi-Model Dynamic Federated Learning over Wireless Networks: Theory, Modeling, and Optimization

Authors: Zhan-Lun Chang, Seyyedali Hosseinalipour, Mung Chiang, Christopher G. Brinton

Abstract: Federated learning (FL) has emerged as a key technique for distributed machine learning (ML). Most literature on FL has focused on ML model training for (i) a single task/model, with (ii) a synchronous scheme for updating model parameters, and (iii) a static data distribution setting across devices, which is often not realistic in practical wireless environments. To address this, we develop DMA-FL considering dynamic FL with multiple downstream tasks/models over an asynchronous model update architecture. We first characterize convergence via introducing scheduling tensors and rectangular functions to capture the impact of system parameters on learning performance. Our analysis sheds light on the joint impact of device training variables (e.g., number of local gradient descent steps), asynchronous scheduling decisions (i.e., when a device trains a task), and dynamic data drifts on the performance of ML training for different tasks. Leveraging these results, we formulate an optimization for jointly configuring resource allocation and device scheduling to strike an efficient trade-off between energy consumption and ML performance. Our solver for the resulting non-convex mixed integer program employs constraint relaxations and successive convex approximations with convergence guarantees. Through numerical experiments, we reveal that DMA-FL substantially improves the performance-efficiency tradeoff.

Submitted 15 February, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: Completed the major revision for IEEE Transactions on Cognitive Communications and Networking

arXiv:2305.00384 [pdf, other]

doi 10.1109/TVT.2023.3279833

Dynamic and Robust Sensor Selection Strategies for Wireless Positioning with TOA/RSS Measurement

Authors: Myeung Suk Oh, Seyyedali Hosseinalipour, Taejoon Kim, David J. Love, James V. Krogmeier, Christopher G. Brinton

Abstract: Emerging wireless applications are requiring ever more accurate location-positioning from sensor measurements. In this paper, we develop sensor selection strategies for 3D wireless positioning based on time of arrival (TOA) and received signal strength (RSS) measurements to handle two distinct scenarios: (i) known approximated target location, for which we conduct dynamic sensor selection to minimize the positioning error; and (ii) unknown approximated target location, in which the worst-case positioning error is minimized via robust sensor selection. We derive expressions for the Cramér-Rao lower bound (CRLB) as a performance metric to quantify the positioning accuracy resulting from the selected sensors. For dynamic sensor selection, two greedy selection strategies are proposed, each of which exploits properties revealed in the derived CRLB expressions. These selection strategies are shown to strike an efficient balance between computational complexity and performance suboptimality. For robust sensor selection, we show that the conventional convex relaxation approach leads to instability, and then develop three algorithms based on (i) iterative convex optimization (ICO), (ii) difference of convex functions programming (DCP), and (iii) discrete monotonic optimization (DMO). Each of these strategies exhibits a different tradeoff between computational complexity and optimality guarantee. Simulation results show that the proposed sensor selection strategies provide significant improvements in terms of accuracy and/or complexity compared to existing sensor selection methods.

Submitted 30 April, 2023; originally announced May 2023.

Comments: This paper has been accepted to IEEE Transactions on Vehicular Technology for future publication

arXiv:2304.12422 [pdf, other]

Multi-Source to Multi-Target Decentralized Federated Domain Adaptation

Authors: Su Wang, Seyyedali Hosseinalipour, Christopher G. Brinton

Abstract: Heterogeneity across devices in federated learning (FL) typically refers to statistical (e.g., non-i.i.d. data distributions) and resource (e.g., communication bandwidth) dimensions. In this paper, we focus on another important dimension that has received less attention: varying quantities/distributions of labeled and unlabeled data across devices. In order to leverage all data, we develop a decentralized federated domain adaptation methodology which considers the transfer of ML models from devices with high quality labeled data (called sources) to devices with low quality or unlabeled data (called targets). Our methodology, Source-Target Determination and Link Formation (ST-LF), optimizes both (i) classification of devices into sources and targets and (ii) source-target link formation, in a manner that considers the trade-off between ML model accuracy and communication energy efficiency. To obtain a concrete objective function, we derive a measurable generalization error bound that accounts for estimates of source-target hypothesis deviations and divergences between data distributions. The resulting optimization problem is a mixed-integer signomial program, a class of NP-hard problems, for which we develop an algorithm based on successive convex approximations to solve it tractably. Subsequent numerical evaluations of ST-LF demonstrate that it improves classification accuracy and energy efficiency over state-of-the-art baselines.

Submitted 8 January, 2024; v1 submitted 24 April, 2023; originally announced April 2023.

Comments: Accepted in IEEE Transactions on Cognitive Communications and Networking

arXiv:2304.10640 [pdf, other]

On the Effects of Data Heterogeneity on the Convergence Rates of Distributed Linear System Solvers

Authors: Boris Velasevic, Rohit Parasnis, Christopher G. Brinton, Navid Azizan

Abstract: We consider the fundamental problem of solving a large-scale system of linear equations. In particular, we consider the setting where a taskmaster intends to solve the system in a distributed/federated fashion with the help of a set of machines, who each have a subset of the equations. Although there exist several approaches for solving this problem, missing is a rigorous comparison between the convergence rates of the projection-based methods and those of the optimization-based ones. In this paper, we analyze and compare these two classes of algorithms with a particular focus on the most efficient method from each class, namely, the recently proposed Accelerated Projection-Based Consensus (APC) and the Distributed Heavy-Ball Method (D-HBM). To this end, we first propose a geometric notion of data heterogeneity called angular heterogeneity and discuss its generality. Using this notion, we bound and compare the convergence rates of the studied algorithms and capture the effects of both cross-machine and local data heterogeneity on these quantities. Our analysis results in a number of novel insights besides showing that APC is the most efficient method in realistic scenarios where there is a large data heterogeneity. Our numerical analyses validate our theoretical results.

Submitted 15 February, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

Comments: 11 pages, 5 figures

ACM Class: G.1.3; I.2.11; I.2.6

arXiv:2303.08988 [pdf, other]

Connectivity-Aware Semi-Decentralized Federated Learning over Time-Varying D2D Networks

Authors: Rohit Parasnis, Seyyedali Hosseinalipour, Yun-Wei Chu, Mung Chiang, Christopher G. Brinton

Abstract: Semi-decentralized federated learning blends the conventional device-to-server (D2S) interaction structure of federated model training with localized device-to-device (D2D) communications. We study this architecture over practical edge networks with multiple D2D clusters modeled as time-varying and directed communication graphs. Our investigation results in an algorithm that controls the fundamental trade-off between (a) the rate of convergence of the model training process towards the global optimizer, and (b) the number of D2S transmissions required for global aggregation. Specifically, in our semi-decentralized methodology, D2D consensus updates are injected into the federated averaging framework based on column-stochastic weight matrices that encapsulate the connectivity within the clusters. To arrive at our algorithm, we show how the expected optimality gap in the current global model depends on the greatest two singular values of the weighted adjacency matrices (and hence on the densities) of the D2D clusters. We then derive tight bounds on these singular values in terms of the node degrees of the D2D clusters, and we use the resulting expressions to design a threshold on the number of clients required to participate in any given global aggregation round so as to ensure a desired convergence rate. Simulations performed on real-world datasets reveal that our connectivity-aware algorithm significantly reduces the total communication cost required to reach a target accuracy compared with baselines, depending on the connectivity structure and the learning task.

Submitted 20 July, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 10 pages, 5 figures. This paper has been accepted to ACM-MobiHoc 2023

arXiv:2303.08361 [pdf, other]

Towards Cooperative Federated Learning over Heterogeneous Edge/Fog Networks

Authors: Su Wang, Seyyedali Hosseinalipour, Vaneet Aggarwal, Christopher G. Brinton, David J. Love, Weifeng Su, Mung Chiang

Abstract: Federated learning (FL) has been promoted as a popular technique for training machine learning (ML) models over edge/fog networks. Traditional implementations of FL have largely neglected the potential for inter-network cooperation, treating edge/fog devices and other infrastructure participating in ML as separate processing elements. Consequently, FL has been vulnerable to several dimensions of network heterogeneity, such as varying computation capabilities, communication resources, data qualities, and privacy demands. We advocate for cooperative federated learning (CFL), a cooperative edge/fog ML paradigm built on device-to-device (D2D) and device-to-server (D2S) interactions. Through D2D and D2S cooperation, CFL counteracts network heterogeneity in edge/fog networks through enabling a model/data/resource pooling mechanism, which will yield substantial improvements in ML model training quality and network resource consumption. We propose a set of core methodologies that form the foundation of D2D and D2S cooperation and present preliminary experiments that demonstrate their benefits. We also discuss new FL functionalities enabled by this cooperative framework such as the integration of unlabeled data and heterogeneous device privacy into ML model training. Finally, we describe some open research directions at the intersection of cooperative edge/fog and FL.

Submitted 15 March, 2023; originally announced March 2023.

Comments: This paper has been accepted for publication in IEEE Communications Magazine

arXiv:2303.00727 [pdf, other]

Challenges and Opportunities for Beyond-5G Wireless Security

Authors: Eric Ruzomberka, David J. Love, Christopher G. Brinton, Arpit Gupta, Chih-Chun Wang, H. Vincent Poor

Abstract: The demand for broadband wireless access is driving research and standardization of 5G and beyond-5G wireless systems. In this paper, we aim to identify emerging security challenges for these wireless systems and pose multiple research areas to address these challenges.

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2302.14648 [pdf, other]

Digital Over-the-Air Federated Learning in Multi-Antenna Systems

Authors: Sihua Wang, Mingzhe Chen, Cong Shen, Changchuan Yin, Christopher G. Brinton

Abstract: In this paper, the performance optimization of federated learning (FL), when deployed over a realistic wireless multiple-input multiple-output (MIMO) communication system with digital modulation and over-the-air computation (AirComp), is studied. In particular, a MIMO system is considered in which edge devices transmit their local FL models (trained using their locally collected data) to a parameter server (PS) using beamforming to maximize the number of devices scheduled for transmission. The PS, acting as a central controller, generates a global FL model using the received local FL models and broadcasts it back to all devices. Due to the limited bandwidth in a wireless network, AirComp is adopted to enable efficient wireless data aggregation. However, fading of wireless channels can produce aggregate distortions in an AirComp-based FL scheme. To tackle this challenge, we propose a modified federated averaging (FedAvg) algorithm that combines digital modulation with AirComp to mitigate wireless fading while ensuring communication efficiency. This is achieved by a joint transmit and receive beamforming design, which is formulated as an optimization problem to dynamically adjust the beamforming matrices based on current FL model parameters so as to minimize the transmission error and ensure the FL performance. To achieve this goal, we first analytically characterize how the beamforming matrices affect the performance of FedAvg in different iterations. Based on this relationship, an artificial neural network (ANN) is used to estimate the local FL models of all devices and adjust the beamforming matrices at the PS for future model transmission. The algorithmic advantages and improved performance of the proposed methodologies are demonstrated through extensive numerical experiments.

Submitted 25 April, 2024; v1 submitted 4 February, 2023; originally announced February 2023.

arXiv:2302.12305 [pdf, ps, other]

Coded Matrix Computations for D2D-enabled Linearized Federated Learning

Authors: Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton

Abstract: Federated learning (FL) is a popular technique for training a global model on data distributed across client devices. Like other distributed training techniques, FL is susceptible to straggler (slower or failed) clients. Recent work has proposed to address this through device-to-device (D2D) offloading, which introduces privacy concerns. In this paper, we propose a novel straggler-optimal approach for coded matrix computations which can significantly reduce the communication delay and privacy issues introduced from D2D data transmissions in FL. Moreover, our proposed approach leads to a considerable improvement of the local computation speed when the generated data matrix is sparse. Numerical evaluations confirm the superiority of our proposed method over baseline approaches.

Submitted 23 February, 2023; originally announced February 2023.

Comments: arXiv admin note: text overlap with arXiv:2301.12685

arXiv:2301.12685 [pdf, ps, other]

Distributed Matrix Computations with Low-weight Encodings

Authors: Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton

Abstract: Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum distance separable) codes into the framework; this can achieve resilience against an optimal number of stragglers. However, these codes assign dense linear combinations of submatrices to the worker nodes. When the input matrices are sparse, these approaches increase the number of non-zero entries in the encoded matrices, which in turn adversely affects the worker computation time. In this work, we develop a distributed matrix computation approach where the assigned encoded submatrices are random linear combinations of a small number of submatrices. In addition to being well suited for sparse input matrices, our approach continues to have the optimal straggler resilience in a certain range of problem parameters. Moreover, compared to recent sparse matrix computation approaches, the search for a "good" set of random coefficients to promote numerical stability in our method is much more computationally efficient. We show that our approach can efficiently utilize partial computations done by slower worker nodes in a heterogeneous system which can enhance the overall computation speed. Numerical experiments conducted through Amazon Web Services (AWS) demonstrate up to 30% reduction in per worker node computation time and 100x faster encoding compared to the available methods.

Submitted 22 August, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

arXiv:2301.08866 [pdf, other]

How Potent are Evasion Attacks for Poisoning Federated Learning-Based Signal Classifiers?

Authors: Su Wang, Rajeev Sahay, Christopher G. Brinton

Abstract: There has been recent interest in leveraging federated learning (FL) for radio signal classification tasks. In FL, model parameters are periodically communicated from participating devices, training on their own local datasets, to a central server which aggregates them into a global model. While FL has privacy/security advantages due to raw data not leaving the devices, it is still susceptible to several adversarial attacks. In this work, we reveal the susceptibility of FL-based signal classifiers to model poisoning attacks, which compromise the training process despite not observing data transmissions. In this capacity, we develop an attack framework in which compromised FL devices perturb their local datasets using adversarial evasion attacks. As a result, the training process of the global model significantly degrades on in-distribution signals (i.e., signals received over channels with identical distributions at each edge device). We compare our work to previously proposed FL attacks and reveal that as few as one adversarial device operating with a low-powered perturbation under our attack framework can induce a potent model poisoning attack on the global classifier. Moreover, we find that more devices partaking in adversarial poisoning will proportionally degrade the classification performance.

Submitted 20 January, 2023; originally announced January 2023.

Comments: 6 pages, Accepted to IEEE ICC 2023

arXiv:2301.04774 [pdf, other]

doi 10.1109/JSAC.2023.3336154

A Decentralized Pilot Assignment Algorithm for Scalable O-RAN Cell-Free Massive MIMO

Authors: Myeung Suk Oh, Anindya Bijoy Das, Seyyedali Hosseinalipour, Taejoon Kim, David J. Love, Christopher G. Brinton

Abstract: Radio access networks (RANs) in monolithic architectures have limited adaptability to supporting different network scenarios. Recently, open-RAN (O-RAN) techniques have begun adding enormous flexibility to RAN implementations. O-RAN is a natural architectural fit for cell-free massive multiple-input multiple-output (CFmMIMO) systems, where many geographically-distributed access points (APs) are employed to achieve ubiquitous coverage and enhanced user performance. In this paper, we address the decentralized pilot assignment (PA) problem for scalable O-RAN-based CFmMIMO systems. We propose a low-complexity PA scheme using a multi-agent deep reinforcement learning (MA-DRL) framework in which multiple learning agents perform distributed learning over the O-RAN communication architecture to suppress pilot contamination. Our approach does not require prior channel knowledge but instead relies on real-time interactions made with the environment during the learning procedure. In addition, we design a codebook search (CS) scheme that exploits the decentralization of our O-RAN CFmMIMO architecture, where different codebook sets can be utilized to further improve PA performance without any significant additional complexities. Numerical evaluations verify that our proposed scheme provides substantial computational scalability advantages and improvements in channel estimation performance compared to the state-of-the-art.

Submitted 1 April, 2024; v1 submitted 11 January, 2023; originally announced January 2023.

Comments: The journal version of this paper is published in IEEE Journal on Selected Areas in Communications

arXiv:2301.01606 [pdf, other]

Predicting Learning Interactions in Social Learning Networks: A Deep Learning Enabled Approach

Authors: Rajeev Sahay, Serena Nicoll, Minjun Zhang, Tsung-Yen Yang, Carlee Joe-Wong, Kerrie A. Douglas, Christopher G. Brinton

Abstract: We consider the problem of predicting link formation in Social Learning Networks (SLN), a type of social network that forms when people learn from one another through structured interactions. While link prediction has been studied for general types of social networks, the evolution of SLNs over their lifetimes coupled with their dependence on which topics are being discussed presents new challenges for this type of network. To address these challenges, we develop a series of autonomous link prediction methodologies that utilize spatial and time-evolving network architectures to pass network state between space and time periods, and that model three types of SLN features updated in each period: neighborhood-based (e.g., resource allocation), path-based (e.g., shortest path), and post-based (e.g., topic similarity). Through evaluation on six real-world datasets from Massive Open Online Course (MOOC) discussion forums and from Purdue University, we find that our method obtains substantial improvements over Bayesian models, linear classifiers, and graph neural networks, with AUCs typically above 0.91 and reaching 0.99 depending on the dataset. Our feature importance analysis shows that while neighborhood and path-based features contribute the most to the results, post-based features add additional information that may not always be relevant for link prediction.

Submitted 3 January, 2023; originally announced January 2023.

Comments: This work was published in the IEEE/ACM Transactions on Networking

arXiv:2212.08343 [pdf, other]

SplitGP: Achieving Both Generalization and Personalization in Federated Learning

Authors: Dong-Jun Han, Do-Yeon Kim, Minseok Choi, Christopher G. Brinton, Jaekyun Moon

Abstract: A fundamental challenge to providing edge-AI services is the need for a machine learning (ML) model that achieves personalization (i.e., to individual clients) and generalization (i.e., to unseen data) properties concurrently. Existing techniques in federated learning (FL) have encountered a steep tradeoff between these objectives and impose large computational requirements on edge devices during training and inference. In this paper, we propose SplitGP, a new split learning solution that can simultaneously capture generalization and personalization capabilities for efficient inference across resource-constrained clients (e.g., mobile/IoT devices). Our key idea is to split the full ML model into client-side and server-side components, and assign them different roles: the client-side model is trained to have strong personalization capability optimized to each client's main task, while the server-side model is trained to have strong generalization capability for handling all clients' out-of-distribution tasks. We analytically characterize the convergence behavior of SplitGP, revealing that all client models approach stationary points asymptotically. Further, we analyze the inference time in SplitGP and provide bounds for determining model split ratios. Experimental results show that SplitGP outperforms existing baselines by wide margins in inference time and test accuracy for varying amounts of out-of-distribution samples.

Submitted 11 February, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

Comments: To appear in IEEE INFOCOM 2023

arXiv:2211.15365 [pdf, other]

Defending Adversarial Attacks on Deep Learning Based Power Allocation in Massive MIMO Using Denoising Autoencoders

Authors: Rajeev Sahay, Minjun Zhang, David J. Love, Christopher G. Brinton

Abstract: Recent work has advocated for the use of deep learning to perform power allocation in the downlink of massive MIMO (maMIMO) networks. Yet, such deep learning models are vulnerable to adversarial attacks. In the context of maMIMO power allocation, adversarial attacks refer to the injection of subtle perturbations into the deep learning model's input during inference (i.e., the adversarial perturbation is injected into inputs during deployment after the model has been trained) that are specifically crafted to force the trained regression model to output an infeasible power allocation solution. In this work, we develop an autoencoder-based mitigation technique, which allows deep learning-based power allocation models to operate in the presence of adversaries without requiring retraining. Specifically, we develop a denoising autoencoder (DAE), which learns a mapping between potentially perturbed data and its corresponding unperturbed input. We test our defense across multiple attacks and in multiple threat models and demonstrate its ability to (i) mitigate the effects of adversarial attacks on power allocation networks using two common precoding schemes, (ii) outperform previously proposed benchmarks for mitigating regression-based adversarial attacks on maMIMO networks, (iii) retain accurate performance in the absence of an attack, and (iv) operate with low computational overhead.

Submitted 19 March, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: This work has been published in the IEEE Transactions on Cognitive Communications and Networking

arXiv:2211.12640 [pdf, other]

Event-Triggered Decentralized Federated Learning over Resource-Constrained Edge Devices

Authors: Shahryar Zehtabi, Seyyedali Hosseinalipour, Christopher G. Brinton

Abstract: Federated learning (FL) is a technique for distributed machine learning (ML), in which edge devices carry out local model training on their individual datasets. In traditional FL algorithms, trained models at the edge are periodically sent to a central server for aggregation, utilizing a star topology as the underlying communication graph. However, assuming access to a central coordinator is not always practical, e.g., in ad hoc wireless network settings. In this paper, we develop a novel methodology for fully decentralized FL, where in addition to local training, devices conduct model aggregation via cooperative consensus formation with their one-hop neighbors over the decentralized underlying physical network. We further eliminate the need for a timing coordinator by introducing asynchronous, event-triggered communications among the devices. In doing so, to account for the inherent resource heterogeneity challenges in FL, we define personalized communication triggering conditions at each device that weigh the change in local model parameters against the available local resources. We theoretically demonstrate that our methodology converges to the globally optimal learning model at an $O\left(\frac{\ln k}{\sqrt{k}}\right)$ rate under standard assumptions in distributed learning and consensus literature. Our subsequent numerical evaluations demonstrate that our methodology obtains substantial improvements in convergence speed and/or communication savings compared with existing decentralized FL baselines.

Submitted 22 November, 2022; originally announced November 2022.

Comments: 23 pages. arXiv admin note: text overlap with arXiv:2204.03726

arXiv:2210.16569 [pdf, ps, other]

Linear Coding for Gaussian Two-Way Channels

Authors: Junghoon Kim, Seyyedali Hosseinalipour, Taejoon Kim, David J. Love, Christopher G. Brinton

Abstract: We consider linear coding for Gaussian two-way channels (GTWCs), in which each user generates the transmit symbols by linearly encoding both its message and the past received symbols (i.e., the feedback information) from the other user. In Gaussian one-way channels (GOWCs), Butman has proposed a well-developed model for linear encoding that encapsulates feedback information into transmit signals. However, such a model for GTWCs has not been well studied since the coupling of the encoding processes at the users in GTWCs renders the encoding design non-trivial and challenging. In this paper, we aim to fill this gap in the literature by extending the existing signal models in GOWCs to GTWCs. With our developed signal model for GTWCs, we formulate an optimization problem to jointly design the encoding/decoding schemes for both users, aiming to minimize the weighted sum of their transmit powers under signal-to-noise ratio constraints. First, we derive an optimal form of the linear decoding schemes under arbitrary encoding schemes employed at the users. Further, we provide new insights on the encoding design for GTWCs. In particular, we show that it is optimal that one of the users (i) does not transmit the feedback information to the other user at the last channel use, and (ii) transmits its message only over the last channel use. With these solution behaviors, we further simplify the problem and solve it via an iterative two-way optimization scheme. We numerically demonstrate that our proposed scheme for GTWCs achieves better performance in terms of transmit power than existing counterparts, such as the non-feedback scheme and the one-way optimization scheme.

Submitted 29 October, 2022; originally announced October 2022.

Comments: Accepted for publication in 58th Annual Allerton Conference on Communication, Control, and Computing

arXiv:2209.10200 [pdf, other]

Performance Optimization for Variable Bitwidth Federated Learning in Wireless Networks

Authors: Sihua Wang, Mingzhe Chen, Christopher G. Brinton, Changchuan Yin, Walid Saad, Shuguang Cui

Abstract: This paper considers improving wireless communication and computation efficiency in federated learning (FL) via model quantization. In the proposed bitwidth FL scheme, edge devices train and transmit quantized versions of their local FL model parameters to a coordinating server, which aggregates them into a quantized global model and synchronizes the devices. The goal is to jointly determine the bitwidths employed for local FL model quantization and the set of devices participating in FL training at each iteration. We pose this as an optimization problem that aims to minimize the training loss of quantized FL under a per-iteration device sampling budget and delay requirement. However, the formulated problem is difficult to solve without (i) a concrete understanding of how quantization impacts global ML performance and (ii) the ability of the server to construct estimates of this process efficiently. To address the first challenge, we analytically characterize how limited wireless resources and induced quantization errors affect the performance of the proposed FL method. Our results quantify how the improvement of FL training loss between two consecutive iterations depends on the device selection and quantization scheme as well as on several parameters inherent to the model being learned. Then, we show that the FL training process can be described as a Markov decision process and propose a model-based reinforcement learning (RL) method to optimize action selection over iterations. Compared to model-free RL, this model-based RL approach leverages the derived mathematical characterization of the FL training process to discover an effective device selection and quantization scheme without imposing additional device communication overhead. Simulation results show that the proposed FL algorithm can reduce the convergence time.

Submitted 10 July, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

arXiv:2208.02856 [pdf, other]

Embedding Alignment for Unsupervised Federated Learning via Smart Data Exchange

Authors: Satyavrat Wagle, Seyyedali Hosseinalipour, Naji Khosravan, Mung Chiang, Christopher G. Brinton

Abstract: Federated learning (FL) has been recognized as one of the most promising solutions for distributed machine learning (ML). In most of the current literature, FL has been studied for supervised ML tasks, in which edge devices collect labeled data. Nevertheless, in many applications, it is impractical to assume the existence of labeled data across devices. To this end, we develop a novel methodology, Cooperative Federated unsupervised Contrastive Learning (CF-CL), for FL across edge devices with unlabeled datasets. CF-CL employs local device cooperation where data are exchanged among devices through device-to-device (D2D) communications to avoid local model bias resulting from non-independent and identically distributed (non-i.i.d.) local datasets. CF-CL introduces a push-pull smart data sharing mechanism tailored to unsupervised FL settings, in which each device pushes a subset of its local datapoints to its neighbors as reserved datapoints, and pulls a set of datapoints from its neighbors, sampled through a probabilistic importance sampling technique. We demonstrate that CF-CL leads to (i) alignment of unsupervised learned latent spaces across devices, (ii) faster global convergence, allowing for less frequent global model aggregations, and (iii) effectiveness in extreme non-i.i.d. data settings across the devices.

Submitted 4 August, 2022; originally announced August 2022.

Comments: Accepted for publication in IEEE Global Communications Conference (GLOBECOM), 2022

arXiv:2207.12482 [pdf, ps, other]

doi 10.1109/TDSC.2022.3192852

AGAPECert: An Auditable, Generalized, Automated, Privacy-Enabling Certification Framework with Oblivious Smart Contracts

Authors: Servio Palacios, Aaron Ault, James V. Krogmeier, Bharat Bhargava, Christopher G. Brinton

Abstract: This paper introduces AGAPECert, an Auditable, Generalized, Automated, Privacy-Enabling, Certification framework capable of performing auditable computation on private data and reporting real-time aggregate certification status without disclosing underlying private data. AGAPECert utilizes a novel mix of trusted execution environments, blockchain technologies, and a real-time graph-based API standard to provide automated, oblivious, and auditable certification. Our technique allows a privacy-conscious data owner to run pre-approved Oblivious Smart Contract code in their own environment on their own private data to produce Private Automated Certifications. These certifications are verifiable, purely functional transformations of the available data, enabling a third party to trust that the private data must have the necessary properties to produce the resulting certification. Recently, a multitude of solutions for certification and traceability in supply chains have been proposed. These often suffer from significant privacy issues because they tend to take a "shared, replicated database" approach: every node in the network has access to a copy of all relevant data and contract code to guarantee the integrity and reach consensus, even in the presence of malicious nodes. In these contexts of certifications that require global coordination, AGAPECert can include a blockchain to guarantee ordering of events, while keeping a core privacy model where private data is not shared outside of the data owner's own platform. AGAPECert contributes an open-source certification framework that can be adopted in any regulated environment to keep sensitive data private while enabling a trusted automated workflow.

Submitted 25 July, 2022; originally announced July 2022.

Comments: to be published in IEEE Transactions on Dependable and Secure Computing

ACM Class: C.3; D.m; E.m; J.7

arXiv:2206.07232 [pdf, other]

A Neural Network-Prepended GLRT Framework for Signal Detection Under Nonlinear Distortions

Authors: Rajeev Sahay, Swaroop Appadwedula, David J. Love, Christopher G. Brinton

Abstract: Many communications and sensing applications hinge on the detection of a signal in a noisy, interference-heavy environment. Signal processing theory yields techniques such as the generalized likelihood ratio test (GLRT) to perform detection when the received samples correspond to a linear observation model. Numerous practical applications exist, however, where the received signal has passed through a nonlinearity, causing significant performance degradation of the GLRT. In this work, we propose prepending the GLRT detector with a neural network classifier capable of identifying the particular nonlinear time samples in a received signal. We show that pre-processing received nonlinear signals using our trained classifier to eliminate excessively nonlinear samples (i) improves the detection performance of the GLRT on nonlinear signals and (ii) retains the theoretical guarantees provided by the GLRT on linear observation models for accurate signal detection.

Submitted 14 June, 2022; originally announced June 2022.

Comments: This work was published in the IEEE Communications Letters

Showing 1–50 of 82 results for author: Brinton, C G