Improved Modelling of Federated Datasets using Mixtures-of-Dirichlet-Multinomials

Abstract: In practice, training using federated learning can be orders of magnitude slower than standard centralized training. This severely limits the amount of experimentation and tuning that can be done, making it challenging to obtain good performance on a given task. Server-side proxy data can be used to run training simulations, for instance for hyperparameter tuning. This can greatly speed up the training pipeline by reducing the number of tuning runs to be performed overall on the true clients. However, it is challenging to ensure that these simulations accurately reflect the dynamics of the real federated training. In particular, the proxy data used for simulations often comes as a single centralized dataset without a partition into distinct clients, and partitioning this data in a naive way can lead to simulations that poorly reflect real federated training. In this paper we address the challenge of how to partition centralized data in a way that reflects the statistical heterogeneity of the true federated clients. We propose a fully federated, theoretically justified, algorithm that efficiently learns the distribution of the true clients and observe improved server-side simulations when using the inferred distribution to create simulated clients from the centralized data.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2404.06430 [pdf, other]

pfl-research: simulation framework for accelerating research in Private Federated Learning

Authors: Filip Granqvist, Congzheng Song, Áine Cahill, Rogier van Dalen, Martin Pelikan, Yi Sheng Chan, Xiaojun Feng, Natarajan Krishnaswami, Vojta **a, Mona Chitnis

Abstract: Federated learning (FL) is an emerging machine learning (ML) training paradigm where clients own their data and collaborate to train a global model, without revealing any data to the server and other participants. Researchers commonly perform experiments in a simulation environment to quickly iterate on ideas. However, existing open-source tools do not offer the efficiency required to simulate FL on larger and more realistic FL datasets. We introduce pfl-research, a fast, modular, and easy-to-use Python framework for simulating FL. It supports TensorFlow, PyTorch, and non-neural network models, and is tightly integrated with state-of-the-art privacy algorithms. We study the speed of open-source FL frameworks and show that pfl-research is 7-72$\times$ faster than alternative open-source frameworks on common cross-device setups. Such speedup will significantly boost the productivity of the FL research community and enable testing hypotheses on realistic FL datasets that were previously too resource intensive. We release a suite of benchmarks that evaluates an algorithm's overall performance on a diverse set of realistic scenarios. The code is available on GitHub at https://github.com/apple/pfl-research.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2307.15017 [pdf, other]

Samplable Anonymous Aggregation for Private Federated Data Analysis

Authors: Kunal Talwar, Shan Wang, Audra McMillan, Vojta **a, Vitaly Feldman, Bailey Basile, Aine Cahill, Yi Sheng Chan, Mike Chatzidakis, Junye Chen, Oliver Chick, Mona Chitnis, Suman Ganta, Yusuf Goren, Filip Granqvist, Kristine Guo, Frederic Jacobs, Omid Javidbakht, Albert Liu, Richard Low, Dan Mascenik, Steve Myers, David Park, Wonhee Park, Gianni Parsa, et al. (11 additional authors not shown)

Abstract: We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its private data. Our first contribution is to propose a simple primitive that allows for efficient implementation of several commonly used algorithms, and allows for privacy accounting that is close to that in the central setting without requiring the strong trust assumptions it entails. Second, we propose a system architecture that implements this primitive and perform a security analysis of the proposed system.

Submitted 27 July, 2023; originally announced July 2023.

Comments: 24 pages

arXiv:2306.17695 [pdf, other]

A New Task and Dataset on Detecting Attacks on Human Rights Defenders

Authors: Shihao Ran, Di Lu, Joel Tetreault, Aoife Cahill, Alejandro Jaimes

Abstract: The ability to conduct retrospective analyses of attacks on human rights defenders over time and by location is important for humanitarian organizations to better understand historical or ongoing human rights violations and thus better manage the global impact of such events. We hypothesize that NLP can support such efforts by quickly processing large collections of news articles to detect and summarize the characteristics of attacks on human rights defenders. To that end, we propose a new dataset for detecting Attacks on Human Rights Defenders (HRDsAttack) consisting of crowdsourced annotations on 500 online news articles. The annotations include fine-grained information about the type and location of the attacks, as well as information about the victim(s). We demonstrate the usefulness of the dataset by using it to train and evaluate baseline models on several sub-tasks to predict the annotated characteristics.

Submitted 30 June, 2023; originally announced June 2023.

arXiv:2211.05153 [pdf]

doi 10.1109/ICDH55609.2022.00009

Extracting, Visualizing, and Learning from Dynamic Data: Perfusion in Surgical Video for Tissue Characterization

Authors: Jonathan P. Epperlein, Niall P. Hardy, Pol Mac Aonghusa, Ronan A. Cahill

Abstract: Intraoperative assessment of tissue can be guided through fluorescence imaging which involves systemic dosing with a fluorophore and subsequent examination of the tissue region of interest with a near-infrared camera. This typically involves administering indocyanine green (ICG) hours or even days before surgery and intraoperative visualization at the time predicted for steady-state signal-to-background status. Here, we describe our efforts to capture and utilize the information contained in the first few minutes after ICG administration from the perspective of both signal processing and surgical practice. We prove a method for characterization of cancerous versus benign rectal lesions now undergoing further development and validation via multicenter clinical phase studies.

Submitted 9 November, 2022; originally announced November 2022.

Comments: Presented and published at IEEE International Conference on Digital Health (ICDH) 2022

ACM Class: J.3

Journal ref: 2022 IEEE International Conference on Digital Health (ICDH), 2022, pp. 7-12

arXiv:2203.09943 [pdf, other]

Training a Tokenizer for Free with Private Federated Learning

Authors: Eugene Bagdasaryan, Congzheng Song, Rogier van Dalen, Matt Seigel, Áine Cahill

Abstract: Federated learning with differential privacy, i.e. private federated learning (PFL), makes it possible to train models on private data distributed across users' devices without harming privacy. PFL is efficient for models, such as neural networks, that have a fixed number of parameters, and thus a fixed-dimensional gradient vector. Such models include neural-net language models, but not tokenizers, the topic of this work. Training a tokenizer requires frequencies of words from an unlimited vocabulary, and existing methods for finding an unlimited vocabulary need a separate privacy budget. A workaround is to train the tokenizer on publicly available data. However, in this paper we first show that a tokenizer trained on mismatched data results in worse model performance compared to a privacy-violating "oracle" tokenizer that accesses user data, with perplexity increasing by 20%. We also show that sub-word tokenizers are better suited to the federated context than word-level ones, since they can encode new words, though with more tokens per word. Second, we propose a novel method to obtain a tokenizer without using any additional privacy budget. During private federated learning of the language model, we sample from the model, train a new tokenizer on the sampled sequences, and update the model embeddings. We then continue private federated learning, and obtain performance within 1% of the "oracle" tokenizer. Since this process trains the tokenizer only indirectly on private data, we can use the "postprocessing guarantee" of differential privacy and thus use no additional privacy budget.

Submitted 15 March, 2022; originally announced March 2022.

arXiv:2102.08503 [pdf, other]

Federated Evaluation and Tuning for On-Device Personalization: System Design & Applications

Authors: Matthias Paulik, Matt Seigel, Henry Mason, Dominic Telaar, Joris Kluivers, Rogier van Dalen, Chi Wai Lau, Luke Carlson, Filip Granqvist, Chris Vandevelde, Sudeep Agarwal, Julien Freudiger, Andrew Byde, Abhishek Bhowmick, Gaurav Kapoor, Si Beaumont, Áine Cahill, Dominic Hughes, Omid Javidbakht, Fei Dong, Rehan Rishi, Stanley Hung

Abstract: We describe the design of our federated task processing system. Originally, the system was created to support two specific federated tasks: evaluation and tuning of on-device ML systems, primarily for the purpose of personalizing these systems. In recent years, support for an additional federated task has been added: federated learning (FL) of deep neural networks. To our knowledge, only one other system has been described in literature that supports FL at scale. We include comparisons to that system to help discuss design decisions and attached trade-offs. Finally, we describe two specific large scale personalization use cases in detail to showcase the applicability of federated tuning to on-device personalization and to highlight application specific solutions.

Submitted 16 February, 2021; originally announced February 2021.

Comments: 11 pages, 1 figure

arXiv:2008.02651 [pdf, other]

Improving on-device speaker verification using federated learning with privacy

Authors: Filip Granqvist, Matt Seigel, Rogier van Dalen, Áine Cahill, Stephen Shum, Matthias Paulik

Abstract: Information on speaker characteristics can be useful as side information in improving speaker recognition accuracy. However, such information is often private. This paper investigates how privacy-preserving learning can improve a speaker verification system, by enabling the use of privacy-sensitive speaker data to train an auxiliary classification model that predicts vocal characteristics of speakers. In particular, this paper explores the utility achieved by approaches which combine different federated learning and differential privacy mechanisms. These approaches make it possible to train a central model while protecting user privacy, with users' data remaining on their devices. Furthermore, they make learning on a large population of speakers possible, ensuring good coverage of speaker characteristics when training a model. The auxiliary model described here uses features extracted from phrases which trigger a speaker verification system. From these features, the model predicts speaker characteristic labels considered useful as side information. The knowledge of the auxiliary model is distilled into a speaker verification system using multi-task learning, with the side information labels predicted by this auxiliary model being the additional task. This approach results in a 6% relative improvement in equal error rate over a baseline system.

Submitted 6 August, 2020; originally announced August 2020.

Comments: To appear in proceedings of INTERSPEECH 2020

arXiv:1801.06765 [pdf]

doi 10.1016/j.nima.2018.01.061

Ultra-High Brightness Electron Beams from Very-High Field Cryogenic Radio-frequency Photocathode Sources

Authors: J. B. Rosenzweig, A. Cahill, B. Carlsten, G. Castorina, M. Croia, C. Emma, A. Fukusawa, B. Spataro, D. Alesini, V. Dolgashev, M. Ferrario, G. Lawler, R. Li, C. Limborg, J. Maxson, P. Musumeci, R. Pompili, S. Tantawi, O. Williams

Abstract: Recent investigations of RF copper structures operated at cryogenic temperatures performed by a SLAC-UCLA collaboration have shown a dramatic increase in the maximum surface electric field, to 500 MV/m. We examine use of these fields to enable very high field cryogenic photoinjectors that can attain over an order of magnitude increase in peak electron beam brightness. We present beam dynamics studies relevant to X-ray FEL injectors, using start-to-end simulations that show the high brightness and low emittance of this source enables operation of a compact FEL reaching a photon energy of 80 keV. The preservation of beam brightness in compression, exploiting micro-bunching techniques is discussed. While the gain in brightness at high field is due to increase of the emission current density, further increases in brightness due to lowering of the intrinsic cathode emittance in cryogenic operation are also enabled. While the original proposal for this type of cryogenic, ultra-high field photoinjector has emphasized S-band designs, there are numerous potential advantages that may be conferred by operation in C-band. We examine issues related to experimental implementation in C-band, and expected performance of this type of device in a future hard X-ray FEL such as MaRIE.

Submitted 20 January, 2018; originally announced January 2018.

Comments: 10 pages, 5 figures, to appear in Proceedings of 2018 European Advanced Accelerator Conference, Nuclear Instruments and Methods

arXiv:1603.01657 [pdf]

Next Generation High Brightness Electron Beams From Ultra-High Field Cryogenic Radiofrequency Photocathode Sources

Authors: J. B. Rosenzweig, A. Cahill, V. Dolgashev, C. Emma, A. Fukusawa, R. Li, C. Limborg, J. Maxson, P. Musumeci, A. Nause, R. Pakter, R. Pompili, R. Roussel, B. Spataro, S. Tantawi

Abstract: Recent studies of the performance of radio-frequency (RF) copper cavities operated at cryogenic temperatures have shown a dramatic increase in the maximum achievable surface electric field. We propose to exploit this development to enable a new generation of photoinjectors operated at cryogenic temperatures that may attain, through enhancement of the launch field at the photocathode, a significant increase in five-dimensional electron beam brightness. We present detailed studies of the beam dynamics associated with such a system, by examining an S-band photoinjector operated at 250 MV/m peak electric field that reaches normalized emittances in the 40 nm-rad range at charges (100-200 pC) suitable for use in a hard X-ray free-electron laser (XFEL) scenario based on the LCLS. In this case, we show by start-to-end simulations that the properties of this source may give rise to high efficiency operation of an XFEL, and permit extension of the photon energy reach by an order of magnitude, to over 80 keV. The brightness needed for such XFELs is achieved through low source emittances in tandem with high current after compression. In the XFEL examples analyzed, the emittances during final compression are preserved using micro-bunching techniques. Extreme low emittance scenarios obtained at pC charge, appropriate for significantly extending temporal resolution limits of ultrafast electron diffraction and microscopy experiments, are also reviewed. While the increase in brightness in a cryogenic photoinjector is mainly due to the augmentation of the emission current density via field enhancement, further possible increases in performance arising from lowering the intrinsic cathode emittance in cryogenic operation are also analyzed. Issues in experimental implementation, including cavity optimization for lowering cryogenic thermal dissipation, external coupling, and cryo-cooler system are discussed.

Submitted 30 December, 2018; v1 submitted 4 March, 2016; originally announced March 2016.

Comments: 34 pages, 17 figures, 127 references. submitted to Physical Review Accelerators and Beams

arXiv:1403.0801 [pdf, other]

Is getting the right answer just about choosing the right words? The role of syntactically-informed features in short answer scoring

Authors: Derrick Higgins, Chris Brew, Michael Heilman, Ramon Ziai, Lei Chen, Aoife Cahill, Michael Flor, Nitin Madnani, Joel Tetreault, Daniel Blanchard, Diane Napolitano, Chong Min Lee, John Blackmore

Abstract: Developments in the educational landscape have spurred greater interest in the problem of automatically scoring short answer questions. A recent shared task on this topic revealed a fundamental divide in the modeling approaches that have been applied to this problem, with the best-performing systems split between those that employ a knowledge engineering approach and those that almost solely leverage lexical information (as opposed to higher-level syntactic information) in assigning a score to a given response. This paper aims to introduce the NLP community to the largest corpus currently available for short-answer scoring, provide an overview of methods used in the shared task using this data, and explore the extent to which more syntactically-informed features can contribute to the short answer scoring task in a way that avoids the question-specific manual effort of the knowledge engineering approach.

Submitted 5 March, 2014; v1 submitted 4 March, 2014; originally announced March 2014.

arXiv:1202.2628 [pdf, other]

doi 10.1016/j.nima.2012.08.052

Characterization of the Hamamatsu R11410-10 3-Inch Photomultiplier Tube for Liquid Xenon Dark Matter Direct Detection Experiments

Authors: K. Lung, K. Arisaka, A. Bargetzi, P. Beltrame, A. Cahill, T. Genma, C. Ghag, D. Gordon, J. Sainz, A. Teymourian, Y. Yoshizawa

Abstract: To satisfy the requirements of the next generation of dark matter detectors based on the dual phase TPC, Hamamatsu, in close collaboration with UCLA, has developed the R11410-10 photomultiplier tube. In this work, we present the detailed tests performed on this device. High QE (>30%) accompanied by a low dark count rate (50 Hz at 0.3 PE) and high gain (10^7) with good single PE resolution have been observed. A comprehensive screening measurement campaign is ongoing while the manufacturer quotes a radioactivity of 20 mBq/PMT. These characteristics show the R11410-10 to be particularly suitable for the forthcoming zero background liquid xenon detectors.

Submitted 23 August, 2012; v1 submitted 12 February, 2012; originally announced February 2012.

Comments: 19 pages, 18 figures

Showing 1–12 of 12 results for author: Cahill, Á