Search | arXiv e-print repository

Intelligent Client Selection for Federated Learning using Cellular Automata

Authors: Nikolaos Pavlidis, Vasileios Perifanis, Theodoros Panagiotis Chatzinikolaou, Georgios Ch. Sirakoulis, Pavlos S. Efraimidis

Abstract: Federated Learning (FL) has emerged as a promising solution for privacy enhancement and latency minimization in various real-world applications, such as transportation, communications, and healthcare. FL endeavors to bring Machine Learning (ML) down to the edge by harnessing data from millions of devices and IoT sensors, thus enabling rapid responses to dynamic environments and yielding highly personalized results. However, the increasing number of sensors across diverse applications poses challenges in terms of communication and resource allocation, hindering the participation of all devices in the federated process and prompting the need for effective FL client selection. To address this issue, we propose Cellular Automaton-based Client Selection (CA-CS), a novel client selection algorithm, which leverages Cellular Automata (CA) as models to effectively capture spatio-temporal changes in a fast-evolving environment. CA-CS considers the computational resources and communication capacity of each participating client, while also accounting for inter-client interactions between neighbors during the client selection process, enabling intelligent client selection for online FL processes on data streams that closely resemble real-world scenarios. In this paper, we present a thorough evaluation of the proposed CA-CS algorithm using the MNIST and CIFAR-10 datasets, while making a direct comparison against a uniformly random client selection scheme. Our results demonstrate that CA-CS achieves comparable accuracy to the random selection approach, while effectively avoiding high-latency clients.

Submitted 18 October, 2023; v1 submitted 1 October, 2023; originally announced October 2023.

Comments: 18th IEEE International Workshop on Cellular Nanoscale Networks and their Applications

arXiv:2309.10645

Towards Energy-Aware Federated Traffic Prediction for Cellular Networks

Authors: Vasileios Perifanis, Nikolaos Pavlidis, Selim F. Yilmaz, Francesc Wilhelmi, Elia Guerra, Marco Miozzo, Pavlos S. Efraimidis, Paolo Dini, Remous-Aris Koutsiamanis

Abstract: Cellular traffic prediction is a crucial activity for optimizing fifth-generation (5G) networks and beyond, as accurate forecasting is essential for intelligent network design, resource allocation and anomaly mitigation. Although machine learning (ML) is a promising approach to effectively predict network traffic, the centralization of massive data in a single data center raises issues regarding confidentiality, privacy and data transfer demands. To address these challenges, federated learning (FL) emerges as an appealing ML training framework which offers highly accurate predictions through parallel distributed computations. However, the environmental impact of these methods is often overlooked, which calls into question their sustainability. In this paper, we address the trade-off between accuracy and energy consumption in FL by proposing a novel sustainability indicator that allows assessing the feasibility of ML models. Then, we comprehensively evaluate state-of-the-art deep learning (DL) architectures in a federated scenario using real-world measurements from base station (BS) sites in the area of Barcelona, Spain. Our findings indicate that larger ML models achieve marginally improved performance but have a significant environmental impact in terms of carbon footprint, which makes them impractical for real-world applications.

Submitted 19 September, 2023; originally announced September 2023.

Comments: International Symposium on Federated Learning Technologies and Applications (FLTA), 2023

arXiv:2309.04311

Federated Learning for Early Dropout Prediction on Healthy Ageing Applications

Authors: Christos Chrysanthos Nikolaidis, Vasileios Perifanis, Nikolaos Pavlidis, Pavlos S. Efraimidis

Abstract: The provision of social care applications is crucial for elderly people to improve their quality of life and enables operators to provide early interventions. Accurate predictions of user dropouts in healthy ageing applications are essential since they are directly related to individual health statuses. Machine Learning (ML) algorithms have enabled highly accurate predictions, outperforming traditional statistical methods that struggle to cope with individual patterns. However, ML requires a substantial amount of data for training, which is challenging due to the presence of personally identifiable information (PII) and the fragmentation posed by regulations. In this paper, we present a federated machine learning (FML) approach that minimizes privacy concerns and enables distributed training, without transferring individual data. We employ collaborative training by considering individuals and organizations under FML, which models both cross-device and cross-silo learning scenarios. Our approach is evaluated on a real-world dataset with non-independent and identically distributed (non-iid) data among clients, class imbalance and label ambiguity. Our results show that data selection and class imbalance handling techniques significantly improve the predictive accuracy of models trained under FML, demonstrating predictive performance comparable or superior to that of traditional ML models.

Submitted 8 September, 2023; originally announced September 2023.

arXiv:2308.00539

Predicting Early Dropouts of an Active and Healthy Ageing App

Authors: Vasileios Perifanis, Ioanna Michailidi, Giorgos Stamatelatos, George Drosatos, Pavlos S. Efraimidis

Abstract: In this work, we present a machine learning approach for predicting early dropouts of an active and healthy ageing app. The presented algorithms have been submitted to the IFMBE Scientific Challenge 2022, part of IUPESM WC 2022. We have processed the given database and generated seven datasets. We used pre-processing techniques to construct classification models that predict the adherence of users using dynamic and static features. We submitted 11 official runs and our results show that machine learning algorithms can provide high-quality adherence predictions. Based on the results, the dynamic features positively influence a model's classification performance. Due to the imbalanced nature of the dataset, we employed oversampling methods such as SMOTE and ADASYN to improve the classification performance. The oversampling approaches led to a remarkable improvement of 10%. Our methods won first place in the IFMBE Scientific Challenge 2022.

Submitted 1 August, 2023; originally announced August 2023.

arXiv:2211.15220

doi 10.1016/j.comnet.2023.109950

Federated Learning for 5G Base Station Traffic Forecasting

Authors: Vasileios Perifanis, Nikolaos Pavlidis, Remous-Aris Koutsiamanis, Pavlos S. Efraimidis

Abstract: Cellular traffic prediction is of great importance on the path of enabling 5G mobile networks to perform intelligent and efficient infrastructure planning and management. However, available data are limited to base station logging information. Hence, training methods for generating high-quality predictions that can generalize to new observations across diverse parties are in demand. Traditional approaches require collecting measurements from multiple base stations, transmitting them to a central entity and conducting machine learning operations using the acquired data. The dissemination of local observations raises concerns regarding confidentiality and performance, which impede the applicability of machine learning techniques. Although various distributed learning methods have been proposed to address this issue, their application to traffic prediction remains largely unexplored. In this work, we investigate the efficacy of federated learning applied to raw base station LTE data for time-series forecasting. We evaluate one-step predictions using five different neural network architectures trained with a federated setting on non-identically distributed data. Our results show that the learning architectures adapted to the federated setting yield equivalent prediction error to the centralized setting. In addition, preprocessing techniques on base stations enhance forecasting accuracy, while advanced federated aggregators do not surpass simpler approaches. Simulations considering the environmental impact suggest that federated learning holds the potential for reducing carbon emissions and energy consumption. Finally, we consider a large-scale scenario with synthetic data and demonstrate that federated learning reduces the computational and communication costs compared to centralized settings.

Submitted 26 August, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

Journal ref: Computer Networks, 109950, 2023
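
The abstract above reports that advanced federated aggregators did not surpass simpler approaches. As a minimal sketch of what a simple aggregator can look like, the following computes a sample-weighted average of client parameter vectors in the style of federated averaging. The function name, the flat-list representation of model parameters, and the weighting by local sample count are illustrative assumptions and are not taken from the paper.

```python
def federated_average(client_updates):
    """Sample-weighted average of client parameter vectors.

    client_updates: list of (parameters, num_samples) pairs, where
    parameters is a flat list of floats (an assumed representation).
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for params, n in client_updates:
        if len(params) != dim:
            raise ValueError("all clients must send vectors of equal length")
        # Each client contributes in proportion to its local sample count.
        for i, value in enumerate(params):
            averaged[i] += value * n / total
    return averaged


# Hypothetical round with two base stations holding 1 and 3 samples.
global_params = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

In this toy round the second client holds three quarters of the data, so the result sits three quarters of the way toward its parameters.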

arXiv:2112.11134

FedPOIRec: Privacy Preserving Federated POI Recommendation with Social Influence

Authors: Vasileios Perifanis, George Drosatos, Giorgos Stamatelatos, Pavlos S. Efraimidis

Abstract: With the growing number of Location-Based Social Networks, privacy preserving location prediction has become a primary task for helping users discover new points-of-interest (POIs). Traditional systems consider a centralized approach that requires the transmission and collection of users' private data. In this work, we present FedPOIRec, a privacy preserving federated learning approach enhanced with features from users' social circles for top-$N$ POI recommendations. First, the FedPOIRec framework is built on the principle that local data never leave the owner's device, while the local updates are blindly aggregated by a parameter server. Second, the local recommenders get personalized by allowing users to exchange their learned parameters, enabling knowledge transfer among friends. To this end, we propose a privacy preserving protocol for integrating the preferences of a user's friends after the federated computation, by exploiting the properties of the CKKS fully homomorphic encryption scheme. To evaluate FedPOIRec, we apply our approach to five real-world datasets using two recommendation models. Extensive experiments demonstrate that FedPOIRec achieves comparable recommendation quality to centralized approaches, while the social integration protocol incurs low computation and communication overhead on the user side.

Submitted 21 December, 2021; originally announced December 2021.

arXiv:2110.00287

An Exact, Linear Time Barabási-Albert Algorithm

Authors: Giorgos Stamatelatos, Pavlos S. Efraimidis

Abstract: This paper presents the development of a new class of algorithms that accurately implement the preferential attachment mechanism of the Barabási-Albert (BA) model to generate scale-free graphs. Contrary to existing approximate preferential attachment schemes, our methods are exact in terms of the proportionality of the vertex selection probabilities to their degree and run in linear time with respect to the order of the generated graph. Our algorithms utilize a series of precise, diverse, weighted and unweighted random sampling steps to engineer the desired properties of the graph generator. We analytically show that they obey the definition of the original BA model that generates scale-free graphs and discuss their higher-order properties. The proposed methods additionally include options to manipulate one dimension of control over the joint inclusion of groups of vertices.

Submitted 30 March, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

arXiv:2106.04405

doi 10.1016/j.knosys.2022.108441

Federated Neural Collaborative Filtering

Authors: Vasileios Perifanis, Pavlos S. Efraimidis

Abstract: In this work, we present a federated version of the state-of-the-art Neural Collaborative Filtering (NCF) approach for item recommendations. The system, named FedNCF, enables learning without requiring users to disclose or transmit their raw data. Data localization preserves data privacy and complies with regulations such as the GDPR. Although federated learning enables model training without local data dissemination, the transmission of raw clients' updates raises additional privacy issues. To address this challenge, we incorporate a privacy-preserving aggregation method that satisfies the security requirements against an honest but curious entity. We argue theoretically and experimentally that existing aggregation algorithms are inconsistent with latent factor model updates. We propose an enhancement by decomposing the aggregation step into matrix factorization and neural network-based averaging. Experimental validation shows that FedNCF achieves comparable recommendation quality to the original NCF system, while our proposed aggregation leads to faster convergence compared to existing methods. We investigate the effectiveness of the federated recommender system and evaluate the privacy-preserving mechanism in terms of computational cost.

Submitted 16 February, 2022; v1 submitted 2 June, 2021; originally announced June 2021.

arXiv:2105.10520

Exploring Ethereum's Data Stores: A Cost and Performance Comparison

Authors: P. Kostamis, A. Sendros, P. S. Efraimidis

Abstract: The cost of using a blockchain infrastructure as well as the time required to search and retrieve information from it must be considered when designing a decentralized application. In this work, we examine a comprehensive set of data management approaches for Ethereum applications and assess the associated cost in gas as well as the retrieval performance. More precisely, we analyze the storage and retrieval of various-sized data, utilizing smart contract storage. In addition, we study hybrid approaches by using IPFS and Swarm as storage platforms along with Ethereum as a timestamping proof mechanism. Such schemes are especially effective when large chunks of data have to be managed. Moreover, we present methods for low-cost data handling in Ethereum, namely the event-logs, the transaction payload, and the almost surprising exploitation of unused function arguments. Finally, we evaluate these methods on a comprehensive set of experiments.

Submitted 21 May, 2021; originally announced May 2021.

arXiv:2105.07472

Lexicographic Enumeration of Set Partitions

Authors: Giorgos Stamatelatos, Pavlos S. Efraimidis

Abstract: In this report, we summarize the set partition enumeration problems and thoroughly explain the algorithms used to solve them. These algorithms iterate through the partitions in lexicographic order and are easy to understand and implement in modern high-level programming languages, without recursive structures and jump logic. We show that they require linear space with respect to the set cardinality and advance the enumeration in constant amortized time. The methods discussed in this document are not novel. Our goal is to demonstrate the process of enumerating set partitions and highlight the ideas behind it. This work is an aid for learners approaching this enumeration problem and programmers undertaking the task of implementing it.

Submitted 16 May, 2021; originally announced May 2021.
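
To make the idea of non-recursive, lexicographic enumeration in the abstract above concrete, here is a minimal sketch based on restricted growth strings, a standard encoding in which `a[i]` is the block index of element `i` and each entry may exceed the maximum of the entries before it by at most one. This is a textbook formulation written for illustration; the function names are hypothetical and the code is not claimed to be the report's own algorithm or to match its amortized time bound.

```python
def set_partitions(n):
    """Yield all partitions of {0, ..., n-1} as restricted growth
    strings, in lexicographic order, using O(n) extra space."""
    if n <= 0:
        return
    a = [0] * n  # a[i]: block index of element i
    m = [0] * n  # m[i]: max(a[0..i-1]) for i >= 1, so a[i] <= m[i] + 1
    while True:
        yield tuple(a)
        # Find the rightmost position that can still be incremented.
        i = n - 1
        while i > 0 and a[i] > m[i]:
            i -= 1
        if i == 0:
            return  # a[0] is always 0, so the enumeration is finished
        a[i] += 1
        # Reset the suffix to zeros and refresh its prefix maxima.
        for j in range(i + 1, n):
            a[j] = 0
            m[j] = max(m[j - 1], a[j - 1])


def to_blocks(rgs):
    """Convert a restricted growth string into a list of blocks."""
    blocks = [[] for _ in range(max(rgs) + 1)]
    for element, block in enumerate(rgs):
        blocks[block].append(element)
    return blocks


for rgs in set_partitions(3):
    print(rgs, to_blocks(rgs))
```

The generator keeps only two arrays of length `n`, which is the linear-space property the abstract refers to; the suffix reset in this simple version is written for clarity rather than speed.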

arXiv:2102.08173

About Weighted Random Sampling in Preferential Attachment Models

Authors: Giorgos Stamatelatos, Pavlos S. Efraimidis

Abstract: The Barabási-Albert model is a popular scheme for creating scale-free graphs but has been previously shown to have ambiguities in its definition. In this paper we discuss a new ambiguity in the definition of the BA model by identifying the tight relation between the preferential attachment process and unequal probability random sampling. While the probability that each individual vertex is selected is set to be proportional to its degree, the model does not specify the joint probabilities that any tuple of $m$ vertices is selected together for $m>1$. We demonstrate the consequences using analytical, experimental, and empirical analyses and propose a concise definition of the model that addresses this ambiguity. Using the connection with unequal probability random sampling, we also highlight a confusion about the process via which nodes are selected on each time step, for which - despite being implicitly indicated in the original paper - current literature appears fragmented.

Submitted 13 October, 2021; v1 submitted 16 February, 2021; originally announced February 2021.
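
The ambiguity described in the abstract above is easy to see in code. The sketch below is one common way to implement preferential attachment: each new vertex draws from a list of edge endpoints and rejects repeats until it has $m$ distinct targets. A single draw is proportional to degree, but the rejection step is only one of several possible choices for how the $m$ targets are selected jointly, which is exactly the part the model leaves unspecified. The function name, the seed configuration, and the rejection scheme are assumptions made for illustration; this is not the definition or the algorithm proposed by the authors.

```python
import random


def barabasi_albert_edges(n, m, seed=None):
    """Generate edges of an n-vertex preferential attachment graph in
    which every new vertex attaches to m distinct earlier vertices."""
    if m < 1 or n <= m:
        raise ValueError("require 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    endpoints = []            # each vertex appears once per incident edge
    targets = list(range(m))  # assumed seed: first new vertex joins 0..m-1
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
        # One draw from `endpoints` is proportional to degree. Rejecting
        # duplicates fixes ONE particular joint distribution over the
        # m-tuple, so the inclusion probabilities are not guaranteed to
        # stay exactly proportional to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return edges


edges = barabasi_albert_edges(n=50, m=3, seed=1)
print(len(edges))  # (n - m) * m edges
```

Replacing the rejection loop with a different weighted sampling design without replacement would keep the same overall structure while changing the joint selection probabilities.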

arXiv:1208.3747

On Money as a Means of Coordination between Network Packets

Authors: Pavlos S. Efraimidis, Remous-Aris Koutsiamanis

Abstract: In this work, we apply a common economic tool, namely money, to coordinate network packets. In particular, we present a network economy, called PacketEconomy, where each flow is modeled as a population of rational network packets, and these packets can self-regulate their access to network resources by mutually trading their positions in router queues. Every packet of the economy has its price, and this price determines if and when the packet will agree to buy or sell a better position. We consider a corresponding Markov model of trade and show that there are Nash equilibria (NE) where queue positions and money are exchanged directly between the network packets. This simple approach, interestingly, delivers improvements even when fiat money is used. We present theoretical arguments and experimental results to support our claims.

Submitted 18 August, 2012; originally announced August 2012.

arXiv:1103.0116

An exact and O(1) time heaviest and lightest hitters algorithm for sliding-window data streams

Authors: Remous-Aris Koutsiamanis, Pavlos S. Efraimidis

Abstract: In this work we focus on the problem of finding the heaviest-k and lightest-k hitters in a sliding window data stream. The most recent research endeavours have yielded an epsilon-approximate algorithm with update operations in constant time with high probability and O(1/epsilon) query time for the heaviest hitters case. We propose a novel algorithm which for the first time, to our knowledge, provides exact, not approximate, results while at the same time achieving O(1) time complexity with high probability on both update and query operations. Furthermore, our algorithm is able to provide both the heaviest-k and the lightest-k hitters at the same time without any overhead. In this work, we describe the algorithm and the accompanying data structure that supports it and perform quantitative experiments with synthetic data to verify our theoretical predictions.

Submitted 1 March, 2011; originally announced March 2011.

arXiv:1012.0259

(α, β) Fibonacci Search

Authors: Pavlos S. Efraimidis

Abstract: Knuth [12, Page 417] states that "the (program of the) Fibonaccian search technique looks very mysterious at first glance" and that "it seems to work by magic". In this work, we show that there is even more magic in Fibonaccian (or else Fibonacci) search. We present a generalized Fibonacci procedure that follows perfectly the implicit optimal decision tree for search problems where the cost of each comparison depends on its outcome.

Submitted 1 December, 2010; originally announced December 2010.

Report number: Technical Report LPDP-2010-02

arXiv:1012.0256

Weighted Random Sampling over Data Streams

Authors: Pavlos S. Efraimidis

Abstract: In this work, we present a comprehensive treatment of weighted random sampling (WRS) over data streams. More precisely, we examine two natural interpretations of the item weights, describe an existing algorithm for each case ([2, 4]), discuss sampling with and without replacement and show adaptations of the algorithms for several WRS problems and evolving data streams.

Submitted 28 July, 2015; v1 submitted 1 December, 2010; originally announced December 2010.

Comments: Corrected minor typos. Infeasible items are now additionally called "overweight" items (WRS-N-P). Enriched the Introduction (Section 1) with more text and references to related work. Revised the description of sampling with a bounded number of replacements (Section 4.2)

Report number: Technical Report LPDP-2010-03
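
As a rough illustration of weighted random sampling over a stream, the sketch below uses a well-known key-based reservoir scheme for sampling without replacement: each item with weight `w` receives the key `u ** (1 / w)` for a uniform random `u`, and the `k` items with the largest keys are kept. Whether this corresponds to either of the algorithms cited as [2, 4] in the abstract above is an assumption, and the function name and the `(item, weight)` stream format are hypothetical.

```python
import heapq
import random


def weighted_reservoir_sample(stream, k, rng=None):
    """Single-pass weighted sample of up to k items without replacement.

    stream: iterable of (item, weight) pairs; items whose weight is not
    positive are skipped.
    """
    rng = rng or random.Random()
    heap = []  # min-heap of (key, arrival index, item)
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue
        key = rng.random() ** (1.0 / weight)
        entry = (key, index, item)  # index breaks ties between equal keys
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # The new key beats the smallest key currently in the sample.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


data = [("a", 1.0), ("b", 2.0), ("c", 0.0), ("d", 5.0)]
sample = weighted_reservoir_sample(data, k=2, rng=random.Random(0))
```

The reservoir never holds more than `k` entries, so memory use does not depend on the stream length, and heavier items tend to draw larger keys and are therefore more likely to remain in the sample.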

Showing 1–15 of 15 results for author: Efraimidis, P S