-
Shielding Federated Learning Systems against Inference Attacks with ARM TrustZone
Authors:
Aghiles Ait Messaoud,
Sonia Ben Mokhtar,
Vlad Nitu,
Valerio Schiavoni
Abstract:
Federated Learning (FL) opens new perspectives for training machine learning models while kee** personal data on the users premises. Specifically, in FL, models are trained on the users devices and only model updates (i.e., gradients) are sent to a central server for aggregation purposes. However, the long list of inference attacks that leak private data from gradients, published in the recent y…
▽ More
Federated Learning (FL) opens new perspectives for training machine learning models while kee** personal data on the users premises. Specifically, in FL, models are trained on the users devices and only model updates (i.e., gradients) are sent to a central server for aggregation purposes. However, the long list of inference attacks that leak private data from gradients, published in the recent years, have emphasized the need of devising effective protection mechanisms to incentivize the adoption of FL at scale. While there exist solutions to mitigate these attacks on the server side, little has been done to protect users from attacks performed on the client side. In this context, the use of Trusted Execution Environments (TEEs) on the client side are among the most proposing solutions. However, existing frameworks (e.g., DarkneTZ) require statically putting a large portion of the machine learning model into the TEE to effectively protect against complex attacks or a combination of attacks. We present GradSec, a solution that allows protecting in a TEE only sensitive layers of a machine learning model, either statically or dynamically, hence reducing both the TCB size and the overall training time by up to 30% and 56%, respectively compared to state-of-the-art competitors.
△ Less
Submitted 15 October, 2022; v1 submitted 11 August, 2022;
originally announced August 2022.
-
PEPPER: Empowering User-Centric Recommender Systems over Gossip Learning
Authors:
Yacine Belal,
Aurélien Bellet,
Sonia Ben Mokhtar,
Vlad Nitu
Abstract:
Recommender systems are proving to be an invaluable tool for extracting user-relevant content hel** users in their daily activities (e.g., finding relevant places to visit, content to consume, items to purchase). However, to be effective, these systems need to collect and analyze large volumes of personal data (e.g., location check-ins, movie ratings, click rates .. etc.), which exposes users to…
▽ More
Recommender systems are proving to be an invaluable tool for extracting user-relevant content hel** users in their daily activities (e.g., finding relevant places to visit, content to consume, items to purchase). However, to be effective, these systems need to collect and analyze large volumes of personal data (e.g., location check-ins, movie ratings, click rates .. etc.), which exposes users to numerous privacy threats. In this context, recommender systems based on Federated Learning (FL) appear to be a promising solution for enforcing privacy as they compute accurate recommendations while kee** personal data on the users' devices. However, FL, and therefore FL-based recommender systems, rely on a central server that can experience scalability issues besides being vulnerable to attacks. To remedy this, we propose PEPPER, a decentralized recommender system based on gossip learning principles. In PEPPER, users gossip model updates and aggregate them asynchronously. At the heart of PEPPER reside two key components: a personalized peer-sampling protocol that keeps in the neighborhood of each node, a proportion of nodes that have similar interests to the former and a simple yet effective model aggregation function that builds a model that is better suited to each user. Through experiments on three real datasets implementing two use cases: a location check-in recommendation and a movie recommendation, we demonstrate that our solution converges up to 42% faster than with other decentralized solutions providing up to 9% improvement on average performance metric such as hit ratio and up to 21% improvement on long tail performance compared to decentralized competitors.
△ Less
Submitted 9 August, 2022;
originally announced August 2022.
-
Tournesol: A quest for a large, secure and trustworthy database of reliable human judgments
Authors:
Lê-Nguyên Hoang,
Louis Faucon,
Aidan Jungo,
Sergei Volodin,
Dalia Papuc,
Orfeas Liossatos,
Ben Crulis,
Mariame Tighanimine,
Isabela Constantin,
Anastasiia Kucherenko,
Alexandre Maurer,
Felix Grimberg,
Vlad Nitu,
Chris Vossen,
Sébastien Rouault,
El-Mahdi El-Mhamdi
Abstract:
Today's large-scale algorithms have become immensely influential, as they recommend and moderate the content that billions of humans are exposed to on a daily basis. They are the de-facto regulators of our societies' information diet, from sha** opinions on public health to organizing groups for social movements. This creates serious concerns, but also great opportunities to promote quality info…
▽ More
Today's large-scale algorithms have become immensely influential, as they recommend and moderate the content that billions of humans are exposed to on a daily basis. They are the de-facto regulators of our societies' information diet, from sha** opinions on public health to organizing groups for social movements. This creates serious concerns, but also great opportunities to promote quality information. Addressing the concerns and seizing the opportunities is a challenging, enormous and fabulous endeavor, as intuitively appealing ideas often come with unwanted {\it side effects}, and as it requires us to think about what we deeply prefer.
Understanding how today's large-scale algorithms are built is critical to determine what interventions will be most effective. Given that these algorithms rely heavily on {\it machine learning}, we make the following key observation: \emph{any algorithm trained on uncontrolled data must not be trusted}. Indeed, a malicious entity could take control over the data, poison it with dangerously manipulative fabricated inputs, and thereby make the trained algorithm extremely unsafe. We thus argue that the first step towards safe and ethical large-scale algorithms must be the collection of a large, secure and trustworthy dataset of reliable human judgments.
To achieve this, we introduce \emph{Tournesol}, an open source platform available at \url{https://tournesol.app}. Tournesol aims to collect a large database of human judgments on what algorithms ought to widely recommend (and what they ought to stop widely recommending). We outline the structure of the Tournesol database, the key features of the Tournesol platform and the main hurdles that must be overcome to make it a successful project. Most importantly, we argue that, if successful, Tournesol may then serve as the essential foundation for any safe and ethical large-scale algorithm.
△ Less
Submitted 29 May, 2021;
originally announced July 2021.
-
FLeet: Online Federated Learning via Staleness Awareness and Performance Prediction
Authors:
Georgios Damaskinos,
Rachid Guerraoui,
Anne-Marie Kermarrec,
Vlad Nitu,
Rhicheek Patra,
Francois Taiani
Abstract:
Federated Learning (FL) is very appealing for its privacy benefits: essentially, a global model is trained with updates computed on mobile devices while kee** the data of users local. Standard FL infrastructures are however designed to have no energy or performance impact on mobile devices, and are therefore not suitable for applications that require frequent (online) model updates, such as news…
▽ More
Federated Learning (FL) is very appealing for its privacy benefits: essentially, a global model is trained with updates computed on mobile devices while kee** the data of users local. Standard FL infrastructures are however designed to have no energy or performance impact on mobile devices, and are therefore not suitable for applications that require frequent (online) model updates, such as news recommenders.
This paper presents FLeet, the first Online FL system, acting as a middleware between the Android OS and the machine learning application. FLeet combines the privacy of Standard FL with the precision of online learning thanks to two core components: (i) I-Prof, a new lightweight profiler that predicts and controls the impact of learning tasks on mobile devices, and (ii) AdaSGD, a new adaptive learning algorithm that is resilient to delayed updates.
Our extensive evaluation shows that Online FL, as implemented by FLeet, can deliver a 2.3x quality boost compared to Standard FL, while only consuming 0.036% of the battery per day. I-Prof can accurately control the impact of learning tasks by improving the prediction accuracy up to 3.6x (computation time) and up to 19x (energy). AdaSGD outperforms alternative FL approaches by 18.4% in terms of convergence speed on heterogeneous data.
△ Less
Submitted 3 December, 2020; v1 submitted 12 June, 2020;
originally announced June 2020.
-
Efficient, Dynamic Multi-tenant Edge Computation in EdgeOS
Authors:
Yuxin Ren,
Vlad Nitu,
Guyue Liu,
Gabriel Parmer,
Timothy Wood,
Alain Tchana,
Riley Kennedy
Abstract:
In the future, computing will be immersed in the world around us -- from augmented reality to autonomous vehicles to the Internet of Things. Many of these smart devices will offer services that respond in real time to their physical surroundings, requiring complex processing with strict performance guarantees. Edge clouds promise a pervasive computational infrastructure a short network hop away fr…
▽ More
In the future, computing will be immersed in the world around us -- from augmented reality to autonomous vehicles to the Internet of Things. Many of these smart devices will offer services that respond in real time to their physical surroundings, requiring complex processing with strict performance guarantees. Edge clouds promise a pervasive computational infrastructure a short network hop away from end devices, but today's operating systems are a poor fit to meet the goals of scalable isolation, dense multi-tenancy, and predictable performance required by these emerging applications. In this paper we present EdgeOS, a micro-kernel based operating system that meets these goals by blending recent advances in real-time systems and network function virtualization. EdgeOS introduces a Featherweight Process model that offers lightweight isolation and supports extreme scalability even under high churn. Our architecture provides efficient communication mechanisms, and low-overhead per-client isolation. To achieve high performance networking, EdgeOS employs kernel bypass paired with the isolation properties of Featherweight Processes. We have evaluated our EdgeOS prototype for running high scale network middleboxes using the Click software router and endpoint applications using memcached. EdgeOS reduces startup latency by 170X compared to Linux processes and over five orders of magnitude compared to containers, while providing three orders of magnitude latency improvement when running 300 to 1000 edge-cloud memcached instances on one server.
△ Less
Submitted 4 January, 2019;
originally announced January 2019.