-
Enhancing Privacy against Inversion Attacks in Federated Learning by using Mixing Gradients Strategies
Authors:
Shaltiel Eloul,
Fran Silavong,
Sanket Kamthe,
Antonios Georgiadis,
Sean J. Moran
Abstract:
Federated learning reduces the risk of information leakage, but remains vulnerable to attacks. We investigate how several neural network design decisions can defend against gradient inversion attacks. We show that overlapping gradients provides numerical resistance to gradient inversion on the highly vulnerable dense layer. Specifically, we propose to leverage batching to maximise mixing of gradients by choosing an appropriate loss function and drawing identical labels. We show that otherwise it is possible to directly recover all vectors in a mini-batch without any numerical optimisation, due to the de-mixing nature of the cross-entropy loss. To accurately assess data recovery, we introduce an absolute variation distance (AVD) metric for information leakage in images, derived from total variation. In contrast to standard metrics, e.g. Mean Squared Error or Structural Similarity Index, AVD offers a continuous metric for extracting information in noisy images. Finally, our empirical results on information recovery from various inversion attacks and on training performance support our defense strategies. These strategies are also shown to be useful for deep convolutional neural networks such as LeNet for image recognition. We hope that this study will help guide the development of further strategies that achieve a trustworthy federation policy.
Submitted 26 April, 2022;
originally announced April 2022.
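The abstract above calls the dense layer highly vulnerable to gradient inversion. The following is a minimal sketch of why, for the single-sample case only: with a dense layer z = Wx + b, the chain rule gives dL/dW = (dL/db) x^T, so any row of the weight gradient divided by the matching bias-gradient entry returns the input. The layer sizes, random data, and variable names are assumptions made for illustration; this is not the authors' code and does not reproduce their batch recovery or defenses.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=5)        # private input to the dense layer (illustrative)
W = rng.normal(size=(3, 5))   # dense layer weights, 3 classes
b = rng.normal(size=3)        # dense layer bias
label = 1                     # true class of the private sample

# Client side: forward pass and analytic cross-entropy gradients.
p = softmax(W @ x + b)
grad_b = p.copy()
grad_b[label] -= 1.0          # dL/db = softmax(z) - onehot(label)
grad_W = np.outer(grad_b, x)  # dL/dW = (dL/db) x^T

# Attacker side: only grad_W and grad_b are observed.
# Pick the row with the largest bias gradient to avoid dividing by ~0.
i = int(np.argmax(np.abs(grad_b)))
x_recovered = grad_W[i] / grad_b[i]

print(np.allclose(x_recovered, x))
```

With a mini-batch, the shared gradient is a sum of such outer products, which is the mixing that the abstract proposes to exploit as a defense.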
-
ST-FL: Style Transfer Preprocessing in Federated Learning for COVID-19 Segmentation
Authors:
Antonios Georgiadis,
Varun Babbar,
Fran Silavong,
Sean Moran,
Rob Otter
Abstract:
Chest Computed Tomography (CT) scans offer low cost, speed and objectivity for COVID-19 diagnosis, and deep learning methods have shown great promise in assisting the analysis and interpretation of these images. Most hospitals or countries can train their own models using in-house data; however, empirical evidence shows that those models perform poorly when tested on new unseen cases, surfacing the need for coordinated global collaboration. Due to privacy regulations, medical data sharing between hospitals and nations is extremely difficult. We propose a GAN-augmented federated learning model, dubbed ST-FL (Style Transfer Federated Learning), for COVID-19 image segmentation. Federated learning (FL) permits a centralised model to be learned in a secure manner from heterogeneous datasets located in disparate private data silos. We demonstrate that the widely varying data quality on FL client nodes leads to a sub-optimal centralised FL model for COVID-19 chest CT image segmentation. ST-FL is a novel FL framework that is robust in the face of highly variable data quality at client nodes. The robustness is achieved by a denoising CycleGAN model at each client of the federation that maps images of arbitrary quality into the same target quality, counteracting the severe data variability evident in real-world FL use-cases. Each client is provided with the target style, which is the same for all clients, and trains its own denoiser. Our qualitative and quantitative results suggest that this FL model performs comparably to, and in some cases better than, a model that has centralised access to all the training data.
Submitted 25 March, 2022;
originally announced March 2022.
-
Senatus -- A Fast and Accurate Code-to-Code Recommendation Engine
Authors:
Fran Silavong,
Sean Moran,
Antonios Georgiadis,
Rohan Saphal,
Robert Otter
Abstract:
Machine learning on source code (MLOnCode) is a popular research field that has been driven by the availability of large-scale code repositories and the development of powerful probabilistic and deep learning models for mining source code. Code-to-code recommendation is a task in MLOnCode that aims to recommend relevant, diverse and concise code snippets that usefully extend the code currently being written by a developer in their integrated development environment (IDE). Code-to-code recommendation engines hold the promise of increasing developer productivity by reducing context switching from the IDE and increasing code reuse. Existing code-to-code recommendation engines do not scale gracefully to large codebases, exhibiting a linear growth in query time as the code repository increases in size. In addition, existing code-to-code recommendation engines fail to account for the global statistics of code repositories in the ranking function, such as the distribution of code snippet lengths, leading to sub-optimal retrieval results. We address both of these weaknesses with Senatus, a new code-to-code recommendation engine. At the core of Senatus is De-Skew LSH, a new locality sensitive hashing (LSH) algorithm that indexes the data for fast (sub-linear time) retrieval while also counteracting the skewness in the snippet length distribution using novel abstract syntax tree-based feature scoring and selection algorithms. We evaluate Senatus and find the recommendations to be of higher quality than competing baselines, while achieving faster search. For example, on the CodeSearchNet dataset Senatus improves F1 by 31.21% and achieves 147.9x faster query time compared to Facebook Aroma. Senatus also outperforms standard MinHash LSH by 29.2% F1 with 51.02x faster query time.
Submitted 26 April, 2022; v1 submitted 5 November, 2021;
originally announced November 2021.
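The abstract above compares Senatus against standard MinHash LSH. As a rough sketch of that baseline only (not De-Skew LSH, whose feature scoring and selection steps the abstract does not spell out), the snippet below builds MinHash signatures over token sets and uses banding so that a query touches only matching buckets instead of scanning every snippet. The function names, the toy snippets, and the band and row counts are all hypothetical choices for illustration.

```python
import hashlib
from collections import defaultdict

def _hash(seed, token):
    # Deterministic 64-bit hash of a token, salted by the seed.
    data = f"{seed}:{token}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def minhash_signature(tokens, num_perm):
    # One minimum per salted hash function; tokens must be non-empty.
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_perm)]

def band_keys(tokens, bands, rows):
    # Split the signature into `bands` chunks of `rows` values each.
    sig = minhash_signature(tokens, bands * rows)
    return [(b, tuple(sig[b * rows:(b + 1) * rows])) for b in range(bands)]

def build_index(snippets, bands=8, rows=4):
    # Map each band key to the set of snippet names that share it.
    index = defaultdict(set)
    for name, tokens in snippets.items():
        for key in band_keys(tokens, bands, rows):
            index[key].add(name)
    return index

def query(index, tokens, bands=8, rows=4):
    # Candidates are snippets that collide with the query in any band.
    hits = set()
    for key in band_keys(tokens, bands, rows):
        hits |= index.get(key, set())
    return hits

# Toy token sets standing in for features extracted from code snippets.
snippets = {
    "read_file": {"open", "read", "close", "path", "with"},
    "sum_list": {"for", "in", "total", "return", "range"},
}
index = build_index(snippets)
candidates = query(index, snippets["read_file"])
```

Two sets collide in a band with a probability that rises with their Jaccard similarity, so more bands with fewer rows each favour recall, and fewer bands with more rows favour precision.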
-
Parameterized Synthetic Image Data Set for Fisheye Lens
Authors:
Zhen Chen,
Anthimos Georgiadis
Abstract:
Depending on the projection geometry, a fisheye image can be represented as a parameterized non-rectilinear image. Deep neural networks (DNNs) are one solution for extracting the parameters used in fisheye image feature description. However, training a reasonable DNN prediction model requires a large number of images. In this paper, we propose to extend the scale of the training dataset using parameterized synthetic images. This effectively boosts the diversity of images and avoids the data scale limitation. To simulate different viewing angles and distances, we adopt controllable parameterized projection processes in the transformation. The reliability of the proposed method is demonstrated by testing on images captured by our fisheye camera. The synthetic dataset is the first that can be extended to a large-scale labeled fisheye image dataset. It is accessible via: http://www2.leuphana.de/misl/fisheye-data-set/.
Submitted 12 November, 2018;
originally announced November 2018.
-
The CTTC 5G end-to-end experimental platform: Integrating heterogeneous wireless/optical networks, distributed cloud, and IoT devices
Authors:
Raul Muñóz,
Josep Mangues,
Ricard Vilalta,
Christos Verikoukis,
Jesús Alonso-Zarate,
Nikolaos Bartzoudis,
Apostolos Georgiadis,
Miquel Payaró,
Ana Pérez-Neira,
Ramon Casellas,
Ricardo Martínez,
José Núñez-Martínez,
Manuel Requena-Esteso,
David Pubill,
Oriol Font-Bach,
Pol Henarejos,
Jordi Serra,
Francisco Vazquez-Gallego
Abstract:
The Internet of Things (IoT) will facilitate a wide variety of applications in different domains, such as smart cities, smart grids, industrial automation (Industry 4.0), smart driving, assistance of the elderly, and home automation. Billions of heterogeneous smart devices with different application requirements will be connected to the networks and will generate huge aggregated volumes of data that will be processed in distributed cloud infrastructures. On the other hand, there is also a general trend to deploy functions as software (SW) instances in cloud infrastructures [e.g., network function virtualization (NFV) or mobile edge computing (MEC)]. Thus, the next generation of mobile networks, the fifth-generation (5G), will need not only to develop new radio interfaces or waveforms to cope with the expected traffic growth but also to integrate heterogeneous networks from end to end (E2E) with distributed cloud resources to deliver E2E IoT and mobile services. This article presents the E2E 5G platform that is being developed by the Centre Tecnològic de Telecomunicacions de Catalunya (CTTC), the first known platform capable of reproducing such an ambitious scenario.
Submitted 20 March, 2018;
originally announced March 2018.
-
Towards the 1G of Mobile Power Network: RF, Signal and System Designs to Make Smart Objects Autonomous
Authors:
Bruno Clerckx,
Alessandra Costanzo,
Apostolos Georgiadis,
Nuno Borges Carvalho
Abstract:
This article reviews some recent promising approaches to bring mobile power closer to reality. In contrast with articles commonly published by the microwave community and the communication/signal processing community, which separately emphasize RF, circuit and antenna solutions for wireless power transfer (WPT) on the one hand and communications, signal and system designs for WPT on the other, this review article uniquely bridges RF, signal and system designs in order to bring those communities closer to each other and to give a better understanding of the fundamental building blocks of an efficient WPT network architecture. We start by reviewing the engineering requirements and design challenges of making mobile power a reality. We then review the state of the art in a wide range of areas spanning sensors and devices, and RF design for wireless power and wireless communications. We identify their limitations and make critical observations before offering a fresh look at, and promising avenues for, signal and system designs for WPT.
Submitted 17 December, 2017;
originally announced December 2017.