Search | arXiv e-print repository

The PLATO Mission

Authors: Heike Rauer, Conny Aerts, Juan Cabrera, Magali Deleuil, Anders Erikson, Laurent Gizon, Mariejo Goupil, Ana Heras, Jose Lorenzo-Alvarez, Filippo Marliani, Cesar Martin-Garcia, J. Miguel Mas-Hesse, Laurence O'Rourke, Hugh Osborn, Isabella Pagano, Giampaolo Piotto, Don Pollacco, Roberto Ragazzoni, Gavin Ramsay, Stéphane Udry, Thierry Appourchaux, Willy Benz, Alexis Brandeker, Manuel Güdel, Eduardo Janot-Pacheco , et al. (801 additional authors not shown)

Abstract: PLATO (PLAnetary Transits and Oscillations of stars) is ESA's M3 mission designed to detect and characterise extrasolar planets and perform asteroseismic monitoring of a large number of stars. PLATO will detect small planets (down to <2 R_(Earth)) around bright stars (<11 mag), including terrestrial planets in the habitable zone of solar-like stars. With the complement of radial velocity observations from the ground, planets will be characterised for their radius, mass, and age with high accuracy (5 %, 10 %, 10 % for an Earth-Sun combination respectively). PLATO will provide us with a large-scale catalogue of well-characterised small planets up to intermediate orbital periods, relevant for a meaningful comparison to planet formation theories and to better understand planet evolution. It will make possible comparative exoplanetology to place our Solar System planets in a broader context. In parallel, PLATO will study (host) stars using asteroseismology, allowing us to determine the stellar properties with high accuracy, substantially enhancing our knowledge of stellar structure and evolution. The payload instrument consists of 26 cameras with 12 cm aperture each. For at least four years, the mission will perform high-precision photometric measurements. Here we review the science objectives, present PLATO's target samples and fields, provide an overview of expected core science performance as well as a description of the instrument and the mission profile at the beginning of the serial production of the flight cameras. PLATO is scheduled for launch at the end of 2026. This overview therefore provides a summary of the mission to the community in preparation for the upcoming operational phases.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2405.02437 [pdf, other]

FastLloyd: Federated, Accurate, Secure, and Tunable $k$-Means Clustering with Differential Privacy

Authors: Abdulrahman Diaa, Thomas Humphries, Florian Kerschbaum

Abstract: We study the problem of privacy-preserving $k$-means clustering in the horizontally federated setting. Existing federated approaches using secure computation suffer from substantial overheads and do not offer output privacy. At the same time, differentially private (DP) $k$-means algorithms assume a trusted central curator and do not extend to federated settings. Naively combining the secure and DP solutions results in a protocol with impractical overhead. Instead, our work provides enhancements to both the DP and secure computation components, resulting in a design that is faster, more private, and more accurate than previous work. By utilizing the computational DP model, we design a lightweight, secure aggregation-based approach that achieves four orders of magnitude speed-up over state-of-the-art related work. Furthermore, we not only maintain the utility of the state-of-the-art in the central model of DP, but we improve the utility further by taking advantage of constrained clustering techniques.

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2405.01222 [pdf, ps, other]

doi 10.1051/0004-6361/202449643

Probing the dynamical and kinematical structures of detached shells around AGB stars

Authors: M. Maercker, E. De Beck, T. Khouri, W. H. T. Vlemmings, J. Gustafsson, H. Olofsson, D. Tafoya, F. Kerschbaum, M. Lindqvist

Abstract: Aims. We aim to resolve the spatial and kinematic sub-structures in five detached-shell sources to provide detailed constraints for hydrodynamic models that describe the formation and evolution of the shells. Methods. We use observations of the $^{12}$CO (1-0) emission towards five carbon-AGB stars with ALMA. The data have angular resolutions of 0.3 arcsec to 1 arcsec and a velocity resolution of 0.3 km/s. This enables us to quantify spatial and kinematic structures in the shells. Results. The observed emission is separated into two distinct components: a more coherent, bright outer shell and a more filamentary, fainter inner shell. The kinematic information shows that the inner sub-shells move at a higher velocity relative to the outer sub-shells. The observed sub-structures confirm the predictions from hydrodynamical models. However, the models do not predict a double-shell structure, and the CO emission likely only traces the inner and outer edges of the shell, implying a lack of CO in the middle layers of the detached shell. Previous estimates of the masses and temperatures are consistent with originating mainly from the brighter subshell, but the total shell masses are likely lower limits. Conclusions. The observed spatial and kinematical splittings of the shells appear consistent with results from hydrodynamical models, provided the CO emission does not trace the H2 density distribution in the shell but rather traces the edges of the shells. It is therefore not possible to constrain the total shell mass based on the CO observations alone. Complementary observations of, e.g., CI as a dissociation product of CO would be necessary to understand the distribution of CO compared to H2.

Submitted 2 May, 2024; originally announced May 2024.

Comments: 18 pages (incl. 5 pages Appendix), 13 Figures

Journal ref: A&A 687, A112 (2024)

arXiv:2402.14937 [pdf, other]

SoK: Analyzing Adversarial Examples: A Framework to Study Adversary Knowledge

Authors: Lucas Fenaux, Florian Kerschbaum

Abstract: Adversarial examples are malicious inputs to machine learning models that trigger a misclassification. This type of attack has been studied for close to a decade, and we find that there is a lack of study and formalization of adversary knowledge when mounting attacks. This has yielded a complex space of attack research with hard-to-compare threat models and attacks. We focus on the image classification domain and provide a theoretical framework to study adversary knowledge inspired by work in order theory. We present an adversarial example game, inspired by cryptographic games, to standardize attacks. We survey recent attacks in the image classification domain and classify their adversary's knowledge in our framework. From this systematization, we compile results that both confirm existing beliefs about adversary knowledge, such as the potency of information about the attacked model, and allow us to derive new conclusions on the difficulty associated with the white-box and transferable threat models, for example, that transferable attacks might not be as difficult as previously thought.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2312.11339 [pdf, other]

The EBLM Project XI. Mass, radius and effective temperature measurements for 23 M-dwarf companions to solar-type stars observed with CHEOPS

Authors: M. I. Swayne, P. F. L. Maxted, A. H. M. J. Triaud, S. G. Sousa, A. Deline, D. Ehrenreich, S. Hoyer, G. Olofsson, I. Boisse, A. Duck, S. Gill, D. Martin, J. McCormac, C. M. Persson, A. Santerne, D. Sebastian, M. R. Standing, L. Acuña, Y. Alibert, R. Alonso, G. Anglada, T. Bárczy, D. Barrado Navascues, S. C. C. Barros, W. Baumjohann , et al. (82 additional authors not shown)

Abstract: Observations of low-mass stars have frequently shown a disagreement between observed stellar radii and radii predicted by theoretical stellar structure models. This "radius inflation" problem could have an impact on both stellar and exoplanetary science. We present the final results of our observation programme with the CHEOPS satellite to obtain high-precision light curves of eclipsing binaries with low-mass stellar companions (EBLMs). Combined with the spectroscopic orbits of the solar-type companion, we can derive the masses, radii and effective temperatures of 23 M-dwarf stars. We use the PYCHEOPS data analysis software to analyse their primary and secondary occultations. For all but one target, we also perform analyses with TESS light curves for comparison. We have assessed the impact of starspot-induced variation on our derived parameters and account for this in our radius and effective temperature uncertainties using simulated light curves. We observe trends for inflation with both metallicity and orbital separation. We also observe a strong trend in the difference between theoretical and observational effective temperatures with metallicity. There is no such trend with orbital separation. These results are not consistent with the idea that observed inflation in stellar radius combines with lower effective temperature to preserve the luminosity predicted by low-mass stellar models. Our EBLM systems are high-quality and homogeneous measurements that can be used in further studies into radius inflation.

Submitted 18 December, 2023; originally announced December 2023.

Comments: 21 pages, 10 figures, accepted for publication in MNRAS, Supplementary material provided as ancillary files

arXiv:2312.00157 [pdf, other]

Universal Backdoor Attacks

Authors: Benjamin Schneider, Nils Lukas, Florian Kerschbaum

Abstract: Web-scraped datasets are vulnerable to data poisoning, which can be used for backdooring deep image classifiers during training. Since training on large datasets is expensive, a model is trained once and re-used many times. Unlike adversarial examples, backdoor attacks often target specific classes rather than any class learned by the model. One might expect that targeting many classes through a naive composition of attacks vastly increases the number of poison samples. We show that this is not necessarily true and that more efficient, universal data poisoning attacks exist that allow controlling misclassifications from any source class into any target class with a small increase in poison samples. Our idea is to generate triggers with salient characteristics that the model can learn. The triggers we craft exploit a phenomenon we call inter-class poison transferability, where learning a trigger from one class makes the model more vulnerable to learning triggers for other classes. We demonstrate the effectiveness and robustness of our universal backdoor attacks by controlling models with up to 6,000 classes while poisoning only 0.15% of the training dataset. Our source code is available at https://github.com/Ben-Schneider-code/Universal-Backdoor-Attacks.

Submitted 19 January, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

Comments: Accepted for publication at ICLR 2024

arXiv:2310.14565 [pdf, other]

PEPSI: Practically Efficient Private Set Intersection in the Unbalanced Setting

Authors: Rasoul Akhavan Mahdavi, Nils Lukas, Faezeh Ebrahimianghazani, Thomas Humphries, Bailey Kacsmar, John Premkumar, Xinda Li, Simon Oya, Ehsan Amjadian, Florian Kerschbaum

Abstract: Two parties with private data sets can find shared elements using a Private Set Intersection (PSI) protocol without revealing any information beyond the intersection. Circuit PSI protocols privately compute an arbitrary function of the intersection, such as its cardinality, and are often employed in an unbalanced setting where one party has more data than the other. Existing protocols are either computationally inefficient or require extensive server-client communication on the order of the larger set. We introduce Practically Efficient PSI or PEPSI, a non-interactive solution where only the client sends its encrypted data. PEPSI can process an intersection of 1024 client items with a million server items in under a second, using less than 5 MB of communication. Our work is over 4 orders of magnitude faster than an existing non-interactive circuit PSI protocol and requires only 10% of the communication. It is also up to 20 times faster than the work of Ion et al., which computes a limited set of functions and has communication costs proportional to the larger set. Our work is the first to demonstrate that non-interactive circuit PSI can be practically applied in an unbalanced setting.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.11548 [pdf, other]

doi 10.14778/3659437.3659455

Differentially Private Data Generation with Missing Data

Authors: Shubhankar Mohapatra, Jianqiao Zong, Florian Kerschbaum, Xi He

Abstract: Although several works succeed in generating synthetic data with differential privacy (DP) guarantees, they are inadequate for generating high-quality synthetic data when the input data has missing values. In this work, we formalize the problems of DP synthetic data with missing values and propose three effective adaptive strategies that significantly improve the utility of the synthetic data on four real-world datasets with different types and levels of missing data and privacy requirements. We also identify the relationship between privacy impact for the complete ground truth data and incomplete data for these DP synthetic data generation algorithms. We model the missing mechanisms as a sampling process to obtain tighter upper bounds for the privacy guarantees to the ground truth data. Overall, this study contributes to a better understanding of the challenges and opportunities for using private synthetic data generation algorithms in the presence of missing data.

Submitted 30 May, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: 18 pages, 9 figures, 2 tables

Journal ref: PVLDB Volume 17, 2024

arXiv:2310.09056 [pdf, other]

doi 10.1051/0004-6361/202346120

Extended far-UV emission surrounding asymptotic giant branch stars as seen by GALEX

Authors: V. Răstău, M. Mečina, F. Kerschbaum, H. Olofsson, M. Maercker, M. Drechsler, X. Strottner, L. Mulato

Abstract: Aims. Our goal is to study the long-term mass-loss rate characteristics of asymptotic giant branch (AGB) stars through wind-wind and wind-interstellar medium interaction. Methods. Far-ultraviolet (FUV) images from the GALEX survey are used to investigate extended UV emission associated with AGB stars. Results. FUV emission was found towards eight objects. The emission displays different shapes and sizes; interaction regions were identified, often with infrared counterparts, but no equivalent near-ultraviolet (NUV) emission was found in most cases. Conclusions. The FUV emission is likely attributed to shock-excited molecular hydrogen, considering the lack of NUV emission and the large space velocities of the objects, and makes it possible to trace old structures that are too faint to be observed, for instance, in the infrared.

Submitted 13 October, 2023; originally announced October 2023.

Journal ref: A&A 680, A12 (2023)

arXiv:2309.16952 [pdf, other]

Leveraging Optimization for Adaptive Attacks on Image Watermarks

Authors: Nils Lukas, Abdulrahman Diaa, Lucas Fenaux, Florian Kerschbaum

Abstract: Untrustworthy users can misuse image generators to synthesize high-quality deepfakes and engage in unethical activities. Watermarking deters misuse by marking generated content with a hidden message, enabling its detection using a secret watermarking key. A core security property of watermarking is robustness, which states that an attacker can only evade detection by substantially degrading image quality. Assessing robustness requires designing an adaptive attack for the specific watermarking algorithm. When evaluating watermarking algorithms and their (adaptive) attacks, it is challenging to determine whether an adaptive attack is optimal, i.e., the best possible attack. We solve this problem by defining an objective function and then approach adaptive attacks as an optimization problem. The core idea of our adaptive attacks is to replicate secret watermarking keys locally by creating surrogate keys that are differentiable and can be used to optimize the attack's parameters. We demonstrate for Stable Diffusion models that such an attacker can break all five surveyed watermarking methods at no visible degradation in image quality. Optimizing our attacks is efficient and requires less than 1 GPU hour to reduce the detection accuracy to 6.3% or less. Our findings emphasize the need for more rigorous robustness testing against adaptive, learnable attackers.

Submitted 20 January, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: ICLR'24

arXiv:2309.06496 [pdf, other]

Level Up: Private Non-Interactive Decision Tree Evaluation using Levelled Homomorphic Encryption

Authors: Rasoul Akhavan Mahdavi, Haoyan Ni, Dimitry Linkov, Florian Kerschbaum

Abstract: As machine learning as a service continues gaining popularity, concerns about privacy and intellectual property arise. Users often hesitate to disclose their private information to obtain a service, while service providers aim to protect their proprietary models. Decision trees, a widely used machine learning model, are favoured for their simplicity, interpretability, and ease of training. In this context, Private Decision Tree Evaluation (PDTE) enables a server holding a private decision tree to provide predictions based on a client's private attributes. The protocol is such that the server learns nothing about the client's private attributes. Similarly, the client learns nothing about the server's model besides the prediction and some hyperparameters. In this paper, we propose two novel non-interactive PDTE protocols, XXCMP-PDTE and RCC-PDTE, based on two new non-interactive comparison protocols, XXCMP and RCC. Our evaluation of these comparison operators demonstrates that our proposed constructions can efficiently evaluate high-precision numbers. Specifically, RCC can compare 32-bit numbers in under 10 milliseconds. We assess our proposed PDTE protocols on decision trees trained over UCI datasets and compare our results with existing work in the field. Moreover, we evaluate synthetic decision trees to showcase scalability, revealing that RCC-PDTE can evaluate a decision tree with over 1000 nodes and 16 bits of precision in under 2 seconds. In contrast, the current state-of-the-art requires over 10 seconds to evaluate such a tree with only 11 bits of precision.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2308.14840 [pdf, other]

doi 10.1561/3300000041

Identifying and Mitigating the Security Risks of Generative AI

Authors: Clark Barrett, Brad Boyd, Elie Burzstein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha, Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang

Abstract: Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive, but is rather an attempt to synthesize some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides both a launching point for a discussion on this important topic as well as interesting problems that the research community can work to address.

Submitted 28 December, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

Journal ref: Foundations and Trends in Privacy and Security 6 (2023) 1-52

arXiv:2308.10718 [pdf, other]

Backdooring Textual Inversion for Concept Censorship

Authors: Yutong Wu, Jie Zhang, Florian Kerschbaum, Tianwei Zhang

Abstract: Recent years have witnessed success in AIGC (AI Generated Content). People can make use of a pre-trained diffusion model to generate images of high quality or freely modify existing pictures with only prompts in natural language. More excitingly, the emerging personalization techniques make it feasible to create specific desired images with only a few images as references. However, this induces severe threats if such advanced techniques are misused by malicious users, such as spreading fake news or defaming individual reputations. Thus, it is necessary to regulate personalization models (i.e., concept censorship) for their development and advancement. In this paper, we focus on the personalization technique dubbed Textual Inversion (TI), which is becoming prevalent for its lightweight nature and excellent performance. TI crafts the word embedding that contains detailed information about a specific object. Users can easily download the word embedding from public websites like Civitai and add it to their own stable diffusion model without fine-tuning for personalization. To achieve the concept censorship of a TI model, we propose leveraging the backdoor technique for good by injecting backdoors into the Textual Inversion embeddings. Briefly, we select some sensitive words as triggers during the training of TI, which will be censored for normal use. In the subsequent generation stage, if the triggers are combined with personalized embeddings as final prompts, the model will output a pre-defined target image rather than images including the desired malicious concept. To demonstrate the effectiveness of our approach, we conduct extensive experiments on Stable Diffusion, a prevalent open-source text-to-image model. Our code, data, and results are available at https://concept-censorship.github.io.

Submitted 23 August, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

arXiv:2306.08538 [pdf, other]

Fast and Private Inference of Deep Neural Networks by Co-designing Activation Functions

Authors: Abdulrahman Diaa, Lucas Fenaux, Thomas Humphries, Marian Dietz, Faezeh Ebrahimianghazani, Bailey Kacsmar, Xinda Li, Nils Lukas, Rasoul Akhavan Mahdavi, Simon Oya, Ehsan Amjadian, Florian Kerschbaum

Abstract: Machine Learning as a Service (MLaaS) is an increasingly popular design where a company with abundant computing resources trains a deep neural network and offers query access for tasks like image classification. The challenge with this design is that MLaaS requires the client to reveal their potentially sensitive queries to the company hosting the model. Multi-party computation (MPC) protects the client's data by allowing encrypted inferences. However, current approaches suffer from prohibitively large inference times. The inference time bottleneck in MPC is the evaluation of non-linear layers such as ReLU activation functions. Motivated by the success of previous work co-designing machine learning and MPC, we develop an activation function co-design. We replace all ReLUs with a polynomial approximation and evaluate them with single-round MPC protocols, which give state-of-the-art inference times in wide-area networks. Furthermore, to address the accuracy issues previously encountered with polynomial activations, we propose a novel training algorithm that gives accuracy competitive with plaintext models. Our evaluation shows between $3$ and $110\times$ speedups in inference time on large models with up to $23$ million parameters while maintaining competitive inference accuracy.

Submitted 16 April, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

Comments: To appear at USENIX Security 2024

arXiv:2305.09671 [pdf, other]

Pick your Poison: Undetectability versus Robustness in Data Poisoning Attacks

Authors: Nils Lukas, Florian Kerschbaum

Abstract: Deep image classification models trained on vast amounts of web-scraped data are susceptible to data poisoning, a mechanism for backdooring models. A small number of poisoned samples seen during training can severely undermine a model's integrity during inference. Existing work considers an effective defense as one that either (i) restores a model's integrity through repair or (ii) detects an attack. We argue that this approach overlooks a crucial trade-off: Attackers can increase robustness at the expense of detectability (over-poisoning) or decrease detectability at the cost of robustness (under-poisoning). In practice, attacks should remain both undetectable and robust. Detectable but robust attacks draw human attention and rigorous model evaluation or cause the model to be re-trained or discarded. In contrast, attacks that are undetectable but lack robustness can be repaired with minimal impact on model accuracy. Our research points to intrinsic flaws in current attack evaluation methods and raises the bar for all data poisoning attackers who must delicately balance this trade-off to remain robust and undetectable. To demonstrate the existence of more potent defenders, we propose defenses designed to (i) detect or (ii) repair poisoned models using a limited amount of trusted image-label pairs. Our results show that an attacker who needs to be robust and undetectable is substantially less threatening. Our defenses mitigate all tested attacks with a maximum accuracy decline of 2% using only 1% of clean data on CIFAR-10 and 2.5% on ImageNet. We demonstrate the scalability of our defenses by evaluating large vision-language models, such as CLIP. Attackers who can manipulate the model's parameters pose an elevated risk as they can achieve higher robustness at low detectability compared to data poisoning attackers.

Submitted 29 June, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

Comments: Preprint

arXiv:2304.07361 [pdf, other]

PTW: Pivotal Tuning Watermarking for Pre-Trained Image Generators

Authors: Nils Lukas, Florian Kerschbaum

Abstract: Deepfakes refer to content synthesized using deep generators, which, when misused, have the potential to erode trust in digital media. Synthesizing high-quality deepfakes requires access to large and complex generators only a few entities can train and provide. The threat is malicious users that exploit access to the provided model and generate harmful deepfakes without risking detection. Watermarking makes deepfakes detectable by embedding an identifiable code into the generator that is later extractable from its generated images. We propose Pivotal Tuning Watermarking (PTW), a method for watermarking pre-trained generators (i) three orders of magnitude faster than watermarking from scratch and (ii) without the need for any training data. We improve existing watermarking methods and scale to generators $4 \times$ larger than related work. PTW can embed longer codes than existing methods while better preserving the generator's image quality. We propose rigorous, game-based definitions for robustness and undetectability, and our study reveals that watermarking is not robust against an adaptive white-box attacker who controls the generator's parameters. We propose an adaptive attack that can successfully remove any watermarking with access to only 200 non-watermarked images. Our work challenges the trustworthiness of watermarking for deepfake detection when the parameters of a generator are available. The source code to reproduce our experiments is available at https://github.com/nilslukas/gan-watermark.

Submitted 7 November, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

Comments: USENIX Security 2023

arXiv:2303.09043 [pdf, ps, other]

HE is all you need: Compressing FHE Ciphertexts using Additive HE

Authors: Rasoul Akhavan Mahdavi, Abdulrahman Diaa, Florian Kerschbaum

Abstract: Fully Homomorphic Encryption (FHE) permits the evaluation of an arbitrary function on encrypted data. However, FHE ciphertexts, particularly those based on lattice assumptions such as LWE/RLWE, are very large compared to the underlying plaintext. Large ciphertexts are hard to communicate over the network and this is an obstacle to the adoption of FHE, particularly for clients with limited bandwidth. In this work, we propose the first technique to compress ciphertexts sent from the server to the client using an additive encryption scheme with smaller ciphertexts. Using the additive scheme, the client sends auxiliary information to the server which is used to compress the ciphertext. Our evaluation shows up to 95% and 97% compression for LWE and RLWE ciphertexts, respectively.

Submitted 15 March, 2023; originally announced March 2023.

arXiv:2302.10664 [pdf, other]

doi 10.1051/0004-6361/202245607

TOI-1055 b: Neptunian planet characterised with HARPS, TESS, and CHEOPS

Authors: A. Bonfanti, D. Gandolfi, J. A. Egger, L. Fossati, J. Cabrera, A. Krenn, Y. Alibert, W. Benz, N. Billot, H. -G. Florén, M. Lendl, V. Adibekyan, S. Salmon, N. C. Santos, S. G. Sousa, T. G. Wilson, O. Barragán, A. Collier Cameron, L. Delrez, M. Esposito, E. Goffo, H. Osborne, H. P. Osborn, L. M. Serrano, V. Van Eylen , et al. (67 additional authors not shown)

Abstract: TOI-1055 is a Sun-like star known to host a transiting Neptune-sized planet on a 17.5-day orbit (TOI-1055 b). Radial velocity (RV) analyses carried out by two independent groups using nearly the same set of HARPS spectra have provided measurements of planetary masses that differ by $\sim$ 2$σ$. Our aim in this work is to solve the inconsistency in the published planetary masses by significantly extending the set of HARPS RV measurements and employing a new analysis tool that is able to account and correct for stellar activity. Our further aim was to improve the precision on measurements of the planetary radius by observing two transits of the planet with the CHEOPS space telescope. We fit a skew normal (SN) function to each cross correlation function extracted from the HARPS spectra to obtain RV measurements and hyperparameters to be used for the detrending. We evaluated the correlation changes of the hyperparameters along the RV time series using the breakpoint technique. We performed a joint photometric and RV analysis using a Markov chain Monte Carlo (MCMC) scheme to simultaneously detrend the light curves and the RV time series. We firmly detected the Keplerian signal of TOI-1055 b, deriving a planetary mass of $M_b=20.4_{-2.5}^{+2.6} M_{\oplus}$ ($\sim$12%). This value is in agreement with one of the two estimates in the literature, but it is significantly more precise. Thanks to the TESS transit light curves combined with exquisite CHEOPS photometry, we also derived a planetary radius of $R_b=3.490_{-0.064}^{+0.070} R_{\oplus}$ ($\sim$1.9%). Our mass and radius measurements imply a mean density of $ρ_b=2.65_{-0.35}^{+0.37}$ g cm$^{-3}$ ($\sim$14%). We further inferred the planetary structure and found that TOI-1055 b is very likely to host a substantial gas envelope with a mass of $0.41^{+0.34}_{-0.20}$ M$_\oplus$ and a thickness of $1.05^{+0.30}_{-0.29}$ R$_\oplus$.

Submitted 22 February, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

Comments: 13 pages, 6 figures, 5 tables. Accepted for publication in A&A

Journal ref: A&A 671, L8 (2023)
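
The mean density quoted in the TOI-1055 b abstract follows directly from the reported mass and radius. The sketch below is a quick consistency check rather than the authors' analysis: it uses only the central values (uncertainties are ignored), and the Earth mean density of 5.514 g cm^-3 is a standard reference value that is assumed here, not taken from the abstract. The function name is hypothetical.

```python
# Earth's mean density in g/cm^3 (standard reference value; an assumption,
# not a number stated in the abstract).
EARTH_MEAN_DENSITY = 5.514


def mean_density(mass_earth: float, radius_earth: float) -> float:
    """Mean density in g/cm^3 of a planet given in Earth masses and Earth radii.

    Density scales as M / R^3, so relative to Earth:
    rho = rho_earth * (M / M_earth) / (R / R_earth)^3.
    """
    return EARTH_MEAN_DENSITY * mass_earth / radius_earth ** 3


# Central values from the abstract: M_b = 20.4 M_earth, R_b = 3.490 R_earth.
rho_b = mean_density(20.4, 3.490)
print(f"rho_b = {rho_b:.2f} g/cm^3")
```

With these inputs the result rounds to the 2.65 g cm^-3 given in the abstract.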

arXiv:2211.10752 [pdf, other]

Towards Robust Dataset Learning

Authors: Yihan Wu, Xinda Li, Florian Kerschbaum, Heng Huang, Hongyang Zhang

Abstract: Adversarial training has been actively studied in recent computer vision research to improve the robustness of models. However, due to the huge computational cost of generating adversarial samples, adversarial training methods are often slow. In this paper, we study the problem of learning a robust dataset such that any classifier naturally trained on the dataset is adversarially robust. Such a dataset benefits the downstream tasks as natural training is much faster than adversarial training, and demonstrates that the desired property of robustness is transferable between models and data. In this work, we propose a principled, tri-level optimization to formulate the robust dataset learning problem. We show that, under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset. Extensive experiments on MNIST, CIFAR10, and TinyImageNet demonstrate the effectiveness of our algorithm with different network initializations and architectures.

Submitted 19 November, 2022; originally announced November 2022.

arXiv:2211.07026 [pdf, other]

Comprehension from Chaos: Towards Informed Consent for Private Computation

Authors: Bailey Kacsmar, Vasisht Duddu, Kyle Tilbury, Blase Ur, Florian Kerschbaum

Abstract: Private computation, which includes techniques like multi-party computation and private query execution, holds great promise for enabling organizations to analyze data they and their partners hold while maintaining data subjects' privacy. Despite recent interest in communicating about differential privacy, end users' perspectives on private computation have not previously been studied. To fill this gap, we conducted 22 semi-structured interviews investigating users' understanding of, and expectations for, private computation over data about them. Interviews centered on four concrete data-analysis scenarios (e.g., ad conversion analysis), each with a variant that did not use private computation and another that did (private set intersection, multi-party computation, and privacy preserving query procedures). While participants struggled with abstract definitions of private computation, they found the concrete scenarios enlightening and plausible even though we did not explain the complex cryptographic underpinnings. Private computation increased participants' acceptance of data sharing, but not unconditionally; the purpose of data sharing and analysis was the primary driver of their attitudes. Through collective activities, participants emphasized the importance of detailing the purpose of a computation and clarifying that inputs to private computation are not shared across organizations when describing private computation to end users.

Submitted 23 August, 2023; v1 submitted 13 November, 2022; originally announced November 2022.

arXiv:2209.13913 [pdf, other]

Faster Secure Comparisons with Offline Phase for Efficient Private Set Intersection

Authors: Florian Kerschbaum, Erik-Oliver Blass, Rasoul Akhavan Mahdavi

Abstract: In a private set intersection (PSI) protocol, Alice and Bob compute the intersection of their respective sets without disclosing any element not in the intersection. PSI protocols have been extensively studied in the literature and are deployed in industry. With state-of-the-art protocols achieving optimal asymptotic complexity, performance improvements are rare and can only improve complexity constants. In this paper, we present a new private, extremely efficient comparison protocol that leads to a PSI protocol with low constants. A useful property of our comparison protocol is that it can be divided into an online and an offline phase. All expensive cryptographic operations are performed during the offline phase, and the online phase performs only four fast field operations per comparison. This leads to an incredibly fast online phase, and our evaluation shows that it outperforms related work, including KKRT (CCS 16), VOLE-PSI (EuroCrypt 21), and OKVS (Crypto 21). We also evaluate standard approaches to implement the offline phase using different trust assumptions: cryptographic, hardware, and a third party (dealer model).

Submitted 28 September, 2022; originally announced September 2022.

arXiv:2205.02130 [pdf, other]

The Limits of Word Level Differential Privacy

Authors: Justus Mattern, Benjamin Weggenmann, Florian Kerschbaum

Abstract: As the issues of privacy and trust are receiving increasing attention within the research community, various attempts have been made to anonymize textual data. A significant subset of these approaches incorporate differentially private mechanisms to perturb word embeddings, thus replacing individual words in a sentence. While these methods represent very important contributions, have various advantages over other techniques and do show anonymization capabilities, they have several shortcomings. In this paper, we investigate these weaknesses and demonstrate significant mathematical constraints diminishing the theoretical privacy guarantee as well as major practical shortcomings with regard to the protection against deanonymization attacks, the preservation of content of the original sentences as well as the quality of the language output. Finally, we propose a new method for text anonymization based on transformer based language models fine-tuned for paraphrasing that circumvents most of the identified weaknesses and also offers a formal privacy guarantee. We evaluate the performance of our method via thorough experimentation and demonstrate superior performance over the discussed mechanisms.

Submitted 2 May, 2022; originally announced May 2022.

arXiv:2204.07877 [pdf, other]

Assessing Differentially Private Variational Autoencoders under Membership Inference

Authors: Daniel Bernau, Jonas Robl, Florian Kerschbaum

Abstract: We present an approach to quantify and compare the privacy-accuracy trade-off for differentially private Variational Autoencoders. Our work complements previous work in two aspects. First, we evaluate the strong reconstruction MI attack against Variational Autoencoders under differential privacy. Second, we address the data scientist's challenge of setting privacy parameter epsilon, which steers the differential privacy strength and thus also the privacy-accuracy trade-off. In our experimental study we consider image and time series data, and three local and central differential privacy mechanisms. We find that the privacy-accuracy trade-offs strongly depend on the dataset and model architecture. We rarely observe favorable privacy-accuracy trade-offs for Variational Autoencoders, and identify a case where LDP outperforms CDP.

Submitted 16 April, 2022; originally announced April 2022.

arXiv:2202.07569 [pdf, other]

Constant-weight PIR: Single-round Keyword PIR via Constant-weight Equality Operators

Authors: Rasoul Akhavan Mahdavi, Florian Kerschbaum

Abstract: Equality operators are an essential building block in tasks over secure computation such as private information retrieval. In private information retrieval (PIR), a user queries a database such that the server does not learn which element is queried. In this work, we propose \emph{equality operators for constant-weight codewords}. A constant-weight code is a collection of codewords that share the same Hamming weight. Constant-weight equality operators have a multiplicative depth that depends only on the Hamming weight of the code, not the bit-length of the elements. In our experiments, we show how these equality operators are up to 10 times faster than existing equality operators. Furthermore, we propose PIR using the constant-weight equality operator or \emph{constant-weight PIR}, which is a PIR protocol using an approach previously deemed impractical. We show that for private retrieval of large, streaming data, constant-weight PIR has a smaller communication complexity and lower runtime compared to SEALPIR and MulPIR, respectively, which are two state-of-the-art solutions for PIR. Moreover, we show how constant-weight PIR can be extended to keyword PIR. In keyword PIR, the desired element is retrieved by a unique identifier pertaining to the sought item, e.g., the name of a file. Previous solutions to keyword PIR require one or multiple rounds of communication to reduce the problem to normal PIR. We show that constant-weight PIR is the first practical single-round solution to single-server keyword PIR.

Submitted 16 February, 2022; v1 submitted 15 February, 2022; originally announced February 2022.
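
The constant-weight PIR abstract states that equality over constant-weight codewords needs a multiplicative depth that depends only on the Hamming weight. A minimal plaintext sketch of that idea follows, under stated assumptions: ordinary Python integers stand in for encrypted bits, the second operand is treated as known in the clear, and the function names are hypothetical. It illustrates the principle and is not claimed to be the paper's exact operator or implementation.

```python
from itertools import combinations


def constant_weight_codewords(length, weight):
    """Enumerate all binary codewords of `length` bits with exactly `weight` ones."""
    words = []
    for ones in combinations(range(length), weight):
        word = [0] * length
        for i in ones:
            word[i] = 1
        words.append(tuple(word))
    return words


def cw_equal(x, y, weight):
    """Return 1 if codewords x and y are equal, else 0.

    Only the bits of x at the positions where y has a one are multiplied.
    Both words have exactly `weight` ones, so the product is 1 only when x
    has ones at all of y's one-positions, which forces x == y. The product
    has `weight` factors regardless of the codeword length.
    """
    selected = [xi for xi, yi in zip(x, y) if yi == 1]
    assert len(selected) == weight, "y is not a codeword of the stated weight"
    product = 1
    for bit in selected:
        product *= bit
    return product


code = constant_weight_codewords(length=5, weight=2)
print(len(code), "codewords")
print(cw_equal(code[0], code[0], 2), cw_equal(code[0], code[1], 2))
```

In a homomorphic setting the same product would be evaluated over ciphertexts, which is why the number of factors (the Hamming weight) rather than the bit-length governs the cost.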

arXiv:2110.05524 [pdf, other]

Generalization Techniques Empirically Outperform Differential Privacy against Membership Inference

Authors: Jiaxiang Liu, Simon Oya, Florian Kerschbaum

Abstract: Differentially private training algorithms provide protection against one of the most popular attacks in machine learning: the membership inference attack. However, these privacy algorithms incur a loss of the model's classification accuracy, therefore creating a privacy-utility trade-off. The amount of noise that differential privacy requires to provide strong theoretical protection guarantees in deep learning typically renders the models unusable, but authors have observed that even lower noise levels provide acceptable empirical protection against existing membership inference attacks. In this work, we look for alternatives to differential privacy towards empirically protecting against membership inference attacks. We study the protection that simply following good machine learning practices (not designed with privacy in mind) offers against membership inference. We evaluate the performance of state-of-the-art techniques, such as pre-training and sharpness-aware minimization, alone and with differentially private training algorithms, and find that, when using early stopping, the algorithms without differential privacy can provide both higher utility and higher privacy than their differentially private counterparts. These findings challenge the belief that differential privacy is a good defense to protect against existing membership inference attacks.

Submitted 11 October, 2021; originally announced October 2021.

arXiv:2110.04180 [pdf, other]

IHOP: Improved Statistical Query Recovery against Searchable Symmetric Encryption through Quadratic Optimization

Authors: Simon Oya, Florian Kerschbaum

Abstract: Effective query recovery attacks against Searchable Symmetric Encryption (SSE) schemes typically rely on auxiliary ground-truth information about the queries or dataset. Query recovery is also possible under the weaker statistical auxiliary information assumption, although statistical-based attacks achieve lower accuracy and are not considered a serious threat. In this work we present IHOP, a statistical-based query recovery attack that formulates query recovery as a quadratic optimization problem and reaches a solution by iterating over linear assignment problems. We perform an extensive evaluation with five real datasets, and show that IHOP outperforms all other statistical-based query recovery attacks under different parameter and leakage configurations, including the case where the client uses some access-pattern obfuscation defenses. In some cases, our attack achieves almost perfect query recovery accuracy. Finally, we use IHOP in a frequency-only leakage setting where the client's queries are correlated, and show that our attack can exploit query dependencies even when PANCAKE, a recent frequency-hiding defense by Grubbs et al., is applied. Our findings indicate that statistical query recovery attacks pose a severe threat to privacy-preserving SSE schemes.

Submitted 31 May, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

Comments: 18 pages

arXiv:2108.10742 [pdf, other]

doi 10.1051/0004-6361/202140952

DEATHSTAR: Nearby AGB stars with the Atacama Compact Array II. CO envelope sizes and asymmetries: The S-type stars

Authors: M. Andriantsaralaza, S. Ramstedt, W. H. T. Vlemmings, T. Danilovich, E. De Beck, M. A. T. Groenewegen, S. Höfner, F. Kerschbaum, T. Khouri, M. Lindqvist, M. Maercker, H. Olofsson, G. Quintana-Lacaci, M. Saberi, R. Sahai, A. Zijlstra

Abstract: We aim to constrain the sizes of the CO circumstellar envelopes (CSEs) of 16 S-type stars, along with an additional 7 and 4 CSEs of C-type and M-type AGB stars, respectively. We map the emission from the CO J=2-1 and 3-2 lines observed with the Atacama Compact Array (ACA) and its total power (TP) antennas, and fit with a Gaussian distribution in the uv- and image planes for ACA-only and TP observations, respectively. The major axis of the fitted Gaussian for the CO(2-1) line data gives a first estimate of the size of the CO-line-emitting CSE. We investigate possible signs of deviation from spherical symmetry by analysing the line profiles, the results from visibility fitting, and by investigating the deconvolved images. The sizes of the CO-line-emitting CSEs of low-mass-loss-rate (low-MLR) S-stars fall between the sizes of the CSEs of C-stars, which are larger, and those of M-stars, which are smaller, as expected because of the differences in their respective CO abundances. The sizes of the low-MLR S-type stars show no dependence on circumstellar density, while a steeper density dependence is observed at high MLR. Furthermore, our results show that the CO CSEs of most of the S-stars in our sample are consistent with a spherically symmetric and smooth outflow. The CO envelope sizes obtained in this paper will be used to constrain detailed radiative transfer modelling to directly determine more accurate MLR estimates for the stars in our sample. For several of our sources that present signs of deviation from spherical symmetry, further high-resolution observations would be necessary to investigate the nature of, and the physical processes behind, these asymmetrical structures. This will provide further insight into the mass-loss process and its related chemistry in S-type AGB stars.

Submitted 24 August, 2021; originally announced August 2021.

Comments: 8 pages, 2 figures, 4 appendices, accepted in A&A

Journal ref: A&A 653, A53 (2021)

arXiv:2108.04974 [pdf, other]

SoK: How Robust is Image Classification Deep Neural Network Watermarking? (Extended Version)

Authors: Nils Lukas, Edward Jiang, Xinda Li, Florian Kerschbaum

Abstract: Deep Neural Network (DNN) watermarking is a method for provenance verification of DNN models. Watermarking should be robust against watermark removal attacks that derive a surrogate model that evades provenance verification. Many watermarking schemes that claim robustness have been proposed, but their robustness is only validated in isolation against a relatively small set of attacks. There is no systematic, empirical evaluation of these claims against a common, comprehensive set of removal attacks. This uncertainty about a watermarking scheme's robustness makes it difficult to trust its deployment in practice. In this paper, we evaluate whether recently proposed watermarking schemes that claim robustness are robust against a large set of removal attacks. We survey methods from the literature that (i) are known removal attacks, (ii) derive surrogate models but have not been evaluated as removal attacks, and (iii) are novel removal attacks. Weight shifting and smooth retraining are novel removal attacks adapted to the DNN watermarking schemes surveyed in this paper. We propose taxonomies for watermarking schemes and removal attacks. Our empirical evaluation includes an ablation study over sets of parameters for each attack and watermarking scheme on the CIFAR-10 and ImageNet datasets. Surprisingly, none of the surveyed watermarking schemes is robust in practice. We find that schemes fail to withstand adaptive attacks and known methods for deriving surrogate models that have not been evaluated as removal attacks. This points to intrinsic flaws in how robustness is currently evaluated. We show that watermarking schemes need to be evaluated against a more extensive set of removal attacks with a more realistic adversary model. Our source code and a complete dataset of evaluation results are publicly available, which allows independent verification of our conclusions.

Submitted 10 August, 2021; originally announced August 2021.

arXiv:2107.12407 [pdf, other]

Selective MPC: Distributed Computation of Differentially Private Key-Value Statistics

Authors: Thomas Humphries, Rasoul Akhavan Mahdavi, Shannon Veitch, Florian Kerschbaum

Abstract: Key-value data is a naturally occurring data type that has not been thoroughly investigated in the local trust model. Existing local differentially private (LDP) solutions for computing statistics over key-value data suffer from the inherent accuracy limitations of each user adding their own noise. Multi-party computation (MPC) maintains better accuracy than LDP and similarly does not require a trusted central party. However, naively applying MPC to key-value data results in prohibitively expensive computation costs. In this work, we present selective multi-party computation, a novel approach to distributed computation that leverages DP leakage to efficiently and accurately compute statistics over key-value data. By providing each party with a view of a random subset of the data, we can capture subtractive noise. We prove that our protocol satisfies pure DP and is provably secure in the combined DP/MPC model. Our empirical evaluation demonstrates that we can compute statistics over 10,000 keys in 20 seconds and can scale up to 30 servers while obtaining results for a single key in under a second.

Submitted 30 August, 2022; v1 submitted 26 July, 2021; originally announced July 2021.

arXiv:2103.05792 [pdf, other]

Equi-Joins Over Encrypted Data for Series of Queries

Authors: Masoumeh Shafieinejad, Suraj Gupta, ** Yang Liu, Koray Karabina, Florian Kerschbaum

Abstract: Encryption provides a method to protect data outsourced to a DBMS provider, e.g., in the cloud. However, performing database operations over encrypted data requires specialized encryption schemes that carefully balance security and performance. In this paper, we present a new encryption scheme that can efficiently perform equi-joins over encrypted data with better security than the state-of-the-art. In particular, our encryption scheme reduces the leakage to equality of rows that match a selection criterion and only reveals the transitive closure of the sum of the leakages of each query in a series of queries. Our encryption scheme is provably secure. We implemented our encryption scheme and evaluated it over a dataset from the TPC-H benchmark.

Submitted 9 March, 2021; originally announced March 2021.

Comments: 13 pages, 4 figures, 6 tables

arXiv:2103.05173 [pdf, other]

PCOR: Private Contextual Outlier Release via Differentially Private Search

Authors: Masoumeh Shafieinejad, Florian Kerschbaum, Ihab F. Ilyas

Abstract: Outlier detection plays a significant role in various real world applications such as intrusion, malfunction, and fraud detection. Traditionally, outlier detection techniques are applied to find outliers in the context of the whole dataset. However, this practice neglects contextual outliers, which are not outliers in the whole dataset but in some specific neighborhoods. Contextual outliers are particularly important in data exploration and targeted anomaly explanation and diagnosis. In these scenarios, the data owner computes the following information: i) The attributes that contribute to the abnormality of an outlier (metric), ii) Contextual description of the outlier's neighborhoods (context), and iii) The utility score of the context, e.g. its strength in showing the outlier's significance, or in relation to a particular explanation for the outlier. However, revealing the outlier's context leaks information about the other individuals in the population as well, violating their privacy. We address the issue of population privacy violations in this paper, and propose a solution for the two main challenges. In this setting, the data owner is required to release a valid context for the queried record, i.e. a context in which the record is an outlier. Hence, the first major challenge is that the privacy technique must preserve the validity of the context for each record. We propose techniques to protect the privacy of individuals through a relaxed notion of differential privacy to satisfy this requirement. The second major challenge is applying the proposed techniques efficiently, as they impose intensive computation on the base algorithm. To overcome this challenge, we propose a graph structure to map the contexts to, and introduce differentially private graph search algorithms as efficient solutions for the computation problem caused by differential privacy techniques.

Submitted 8 March, 2021; originally announced March 2021.

arXiv:2103.02913 [pdf, other]

Quantifying identifiability to choose and audit $ε$ in differentially private deep learning

Authors: Daniel Bernau, Günther Eibl, Philip W. Grassal, Hannah Keller, Florian Kerschbaum

Abstract: Differential privacy allows bounding the influence that training data records have on a machine learning model. To use differential privacy in machine learning, data scientists must choose privacy parameters $(ε,δ)$. Choosing meaningful privacy parameters is key, since models trained with weak privacy parameters might result in excessive privacy leakage, while strong privacy parameters might overly degrade model utility. However, privacy parameter values are difficult to choose for two main reasons. First, the theoretical upper bound on privacy loss $(ε,δ)$ might be loose, depending on the chosen sensitivity and data distribution of practical datasets. Second, legal requirements and societal norms for anonymization often refer to individual identifiability, to which $(ε,δ)$ are only indirectly related. We transform $(ε,δ)$ to a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset. The bound holds for multidimensional queries under composition, and we show that it can be tight in practice. Furthermore, we derive an identifiability bound, which relates the adversary assumed in differential privacy to previous work on membership inference adversaries. We formulate an implementation of this differential privacy adversary that allows data scientists to audit model training and compute empirical identifiability scores and empirical $(ε,δ)$.

Submitted 20 July, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

arXiv:2102.09651 [pdf, other]

Obfuscated Access and Search Patterns in Searchable Encryption

Authors: Zhiwei Shang, Simon Oya, Andreas Peter, Florian Kerschbaum

Abstract: Searchable Symmetric Encryption (SSE) allows a data owner to securely outsource its encrypted data to a cloud server while maintaining the ability to search over it and retrieve matched documents. Most existing SSE schemes leak which documents are accessed per query, i.e., the so-called access pattern, and thus are vulnerable to attacks that can recover the database or the queried keywords. Current techniques that fully hide access patterns, such as ORAM or PIR, suffer from heavy communication or computational costs, and are not designed with search capabilities in mind. Recently, Chen et al. (INFOCOM'18) proposed an obfuscation framework for SSE that protects the access pattern in a differentially private way with a reasonable utility cost. However, this scheme leaks the so-called search pattern, i.e., how many times a certain query is performed. This leakage makes the proposal vulnerable to certain database and query recovery attacks. In this paper, we propose OSSE (Obfuscated SSE), an SSE scheme that obfuscates the access pattern independently for each query performed. This in turn hides the search pattern and makes our scheme resistant against attacks that rely on this leakage. Under certain reasonable assumptions, our scheme has smaller communication overhead than ORAM-based SSE. Furthermore, our scheme works in a single communication round and requires very small constant client-side storage. Our empirical evaluation shows that OSSE is highly effective at protecting against different query recovery attacks while keeping a reasonable utility level. Our protocol provides significantly more protection than the proposal by Chen et al. against some state-of-the-art attacks, which demonstrates the importance of hiding search patterns in designing effective privacy-preserving SSE schemes.

Submitted 18 February, 2021; originally announced February 2021.

Comments: To be published at Network and Distributed Systems Security (NDSS) Symposium 2021, 21-24 February 2021, San Diego, CA, USA

arXiv:2010.14963 [pdf, other]

doi 10.1051/0004-6361/202039178

Extended view on the dust shells around two carbon stars

Authors: M. Mečina, B. Aringer, W. Nowotny, M. A. T. Groenewegen, F. Kerschbaum, M. Brunner, H. -P. Gail

Abstract: Stars on the asymptotic giant branch (AGB) lose considerable amounts of matter through their dust-driven stellar winds. A number of such sources have been imaged by Herschel/PACS, revealing a diverse sample of different morphological types. Among them are a few examples which show geometrically thin, spherically symmetric shells which can be used to probe the mass loss history of their host stars. We aim to determine the physical properties of the dust envelope around the two carbon stars U Hya and W Ori. With the much-improved spatial constraints from the new far-infrared maps, our primary goal is to measure the dust masses contained in the shells and see how they fit the proposed scenarios of shell formation. We calculated the radiative transfer of the circumstellar dust envelope using the 1D code More of DUSTY (MoD). Adopting a parametrised density profile, we obtained a best-fit model in terms of the photometric and spectroscopic data, as well as a radial intensity profile based on Herschel/PACS data. For the case of U Hya, we also computed a grid of circumstellar envelopes by means of a stationary wind code and compare the results of the two modelling approaches. The Herschel/PACS maps show U Hya surrounded by a detached shell of $114''\ (0.12\,\mathrm{pc})$ in radius, confirming the observations from previous space missions. The dust masses calculated for the shell by the two approaches are consistent with respect to the adopted dust grain properties. In addition, around W Ori, we detect for the first time a weak spherically symmetric structure with a radius of $92''\ (0.17\,\mathrm{pc})$ and a dust mass of $(3.5\pm0.3)\times10^{-6}\,\mathrm{M_\odot}$.

Submitted 28 October, 2020; originally announced October 2020.

Journal ref: A&A 644, A66 (2020)

arXiv:2010.12112 [pdf, other]

Investigating Membership Inference Attacks under Data Dependencies

Authors: Thomas Humphries, Simon Oya, Lindsey Tulloch, Matthew Rafuse, Ian Goldberg, Urs Hengartner, Florian Kerschbaum

Abstract: Training machine learning models on privacy-sensitive data has become a popular practice, driving innovation in ever-expanding fields. This has opened the door to new attacks that can have serious privacy implications. One such attack, the Membership Inference Attack (MIA), exposes whether or not a particular data point was used to train a model. A growing body of literature uses Differentially Private (DP) training algorithms as a defence against such attacks. However, these works evaluate the defence under the restrictive assumption that all members of the training set, as well as non-members, are independent and identically distributed. This assumption does not hold for many real-world use cases in the literature. Motivated by this, we evaluate membership inference with statistical dependencies among samples and explain why DP does not provide meaningful protection (the privacy parameter $ε$ scales with the training set size $n$) in this more general case. We conduct a series of empirical evaluations with off-the-shelf MIAs using training sets built from real-world data showing different types of dependencies among samples. Our results reveal that training set dependencies can severely increase the performance of MIAs, and therefore assuming that data samples are statistically independent can significantly underestimate the performance of MIAs.

Submitted 14 June, 2023; v1 submitted 22 October, 2020; originally announced October 2020.

Comments: IEEE 36th Computer Security Foundations Symposium (CSF)

arXiv:2010.03465 [pdf, other]

Hiding the Access Pattern is Not Enough: Exploiting Search Pattern Leakage in Searchable Encryption

Authors: Simon Oya, Florian Kerschbaum

Abstract: Recent Searchable Symmetric Encryption (SSE) schemes enable secure searching over an encrypted database stored in a server while limiting the information leaked to the server. These schemes focus on hiding the access pattern, which refers to the set of documents that match the client's queries. This provides protection against current attacks that largely depend on this leakage to succeed. However, most SSE constructions also leak whether or not two queries aim for the same keyword, also called the search pattern. In this work, we show that search pattern leakage can severely undermine current SSE defenses. We propose an attack that leverages both access and search pattern leakage, as well as some background and query distribution information, to recover the keywords of the queries performed by the client. Our attack follows a maximum likelihood estimation approach, and is easy to adapt against SSE defenses that obfuscate the access pattern. We empirically show that our attack is efficient, it outperforms other proposed attacks, and it completely thwarts two out of the three defenses we evaluate it against, even when these defenses are set to high privacy regimes. These findings highlight that hiding the search pattern, a feature that most constructions are lacking, is key towards providing practical privacy guarantees in SSE.

Submitted 7 October, 2020; originally announced October 2020.

Comments: 16 pages. 11 figures. To appear at Proceedings of the 30th USENIX Security Symposium (August 11-13, 2021, Vancouver, B.C., Canada)

arXiv:2008.07885 [pdf, ps, other]

DEATHSTAR: Nearby AGB stars with the Atacama Compact Array I. CO envelope sizes and asymmetries: A new hope for accurate mass-loss-rate estimates

Authors: S. Ramstedt, W. H. T. Vlemmings, L. Doan, T. Danilovich, M. Lindqvist, M. Saberi, H. Olofsson, E. De Beck, M. A. T. Groenewegen, S. Höfner, J. H. Kastner, F. Kerschbaum, T. Khouri, M. Maercker, R. Montez, G. Quintana-Lacaci, R. Sahai, D. Tafoya, A. Zijlstra

Abstract: This is the first publication of the DEATHSTAR project. The goal of the project is to reduce the uncertainties of observational estimates of mass-loss rates from Asymptotic Giant Branch (AGB) stars. Line emission from 12CO J=2-1 and 3-2 were mapped using the ACA. In this initial analysis, the emission distribution was fit to a Gaussian distribution in the uv-plane. Detailed radiative transfer analysis will be presented in the future. The axes of the best-fit Gaussian at the line center of the 12CO J=2-1 emission gives a first indication of the size of the emitting region. Furthermore, the fitting results, such as the major and minor axis, center position, and the goodness of fit across both lines, constrain the symmetry of the emission distribution. We find that the CO envelope sizes are, in general, larger for C-type than for M-type AGB stars, which is expected if the CO/H2 ratio is larger in C-type stars. Furthermore, a relation between the 12CO J=2-1 size and circumstellar density is shown that, while in broad agreement with photodissociation calculations, reveals large scatter and systematic differences between the stellar types. The majority of the sources have CO envelopes that are consistent with a spherically symmetric, smooth outflow. For about a third of the sources, indications of strong asymmetries are found. This is consistent with previous interferometric investigations of northern sources. Smaller scale asymmetries are found in a larger fraction of sources. These results for CO envelope radii and shapes can be used to constrain detailed radiative transfer modeling of the same stars so as to determine mass-loss rates that are independent of photodissociation models. For a large fraction of the sources, observations at higher spatial resolution will be necessary to further investigate the complex circumstellar dynamics revealed by our ACA observations.

Submitted 18 August, 2020; originally announced August 2020.

Comments: 10 pages, 5 figures, 5 appendices, accepted in A&A

arXiv:2003.09481 [pdf, other]

doi 10.14778/3407790.3407814

Efficient Oblivious Database Joins

Authors: Simeon Krastnikov, Florian Kerschbaum, Douglas Stebila

Abstract: A major algorithmic challenge in designing applications intended for secure remote execution is ensuring that they are oblivious to their inputs, in the sense that their memory access patterns do not leak sensitive information to the server. This problem is particularly relevant to cloud databases that wish to allow queries over the client's encrypted data. One of the major obstacles to such a goal is the join operator, which is non-trivial to implement obliviously without resorting to generic but inefficient solutions like Oblivious RAM (ORAM). We present an oblivious algorithm for equi-joins which (up to a logarithmic factor) matches the optimal $O(n\log n)$ complexity of the standard non-secure sort-merge join (on inputs producing $O(n)$ outputs). We do not use expensive primitives like ORAM or rely on unrealistic hardware or security assumptions. Our approach, which is based on sorting networks and novel provably-oblivious constructions, is conceptually simple, easily verifiable, and very efficient in practice. Its data-independent algorithmic structure makes it secure in various different settings for remote computation, even in those that are known to be vulnerable to certain side-channel attacks (such as Intel SGX) or with strict requirements for low circuit complexity (like secure multiparty computation). We confirm that our approach is easily realizable through a compact implementation which matches our expectations for performance and is shown, both formally and empirically, to possess the desired security characteristics.

Submitted 15 December, 2020; v1 submitted 20 March, 2020; originally announced March 2020.

Journal ref: Proceedings of the VLDB Endowment (PVLDB), 13(11): 2132-2145, 2020

arXiv:2002.05097 [pdf, other]

EncDBDB: Searchable Encrypted, Fast, Compressed, In-Memory Database using Enclaves

Authors: Benny Fuhry, Jayanth Jain H A, Florian Kerschbaum

Abstract: Data confidentiality is an important requirement for clients when outsourcing databases to the cloud. Trusted execution environments, such as Intel SGX, offer an efficient, hardware-based solution to this cryptographic problem. Existing solutions are not optimized for column-oriented, in-memory databases and pose impractical memory requirements on the enclave. We present EncDBDB, a novel approach for client-controlled encryption of column-oriented, in-memory databases allowing range searches using an enclave. EncDBDB offers nine encrypted dictionaries, which provide different security, performance and storage efficiency tradeoffs for the data. It is especially suited for complex, read-oriented, analytic queries, e.g., as present in data warehouses. The computational overhead compared to plaintext processing is within a millisecond even for databases with millions of entries and the leakage is limited. Compressed encrypted data requires less space than a corresponding plaintext column. Furthermore, the resulting code - and data - in the enclave is very small reducing the potential for security-relevant implementation errors and side-channel leakages.

Submitted 12 February, 2020; originally announced February 2020.

arXiv:1912.11328 [pdf, other]

Assessing differentially private deep learning with Membership Inference

Authors: Daniel Bernau, Philip-William Grassal, Jonas Robl, Florian Kerschbaum

Abstract: Attacks that aim to identify the training data of public neural networks represent a severe threat to the privacy of individuals participating in the training data set. A possible protection is offered by anonymization of the training data or training function with differential privacy. However, data scientists can choose between local and central differential privacy and need to select meaningful privacy parameters $ε$ which is challenging for non-privacy experts. We empirically compare local and central differential privacy mechanisms under white- and black-box membership inference to evaluate their relative privacy-accuracy trade-offs. We experiment with several datasets and show that this trade-off is similar for both types of mechanisms. This suggests that local differential privacy is a sound alternative to central differential privacy for differentially private deep learning, since small $ε$ in central differential privacy and large $ε$ in local differential privacy result in similar membership inference attack risk.

Submitted 26 May, 2020; v1 submitted 24 December, 2019; originally announced December 2019.

arXiv:1912.00888 [pdf, other]

Deep Neural Network Fingerprinting by Conferrable Adversarial Examples

Authors: Nils Lukas, Yuxuan Zhang, Florian Kerschbaum

Abstract: In Machine Learning as a Service, a provider trains a deep neural network and gives many users access. The hosted (source) model is susceptible to model stealing attacks, where an adversary derives a surrogate model from API access to the source model. For post hoc detection of such attacks, the provider needs a robust method to determine whether a suspect model is a surrogate of their model. We propose a fingerprinting method for deep neural network classifiers that extracts a set of inputs from the source model so that only surrogates agree with the source model on the classification of such inputs. These inputs are a subclass of transferable adversarial examples which we call conferrable adversarial examples that exclusively transfer with a target label from a source model to its surrogates. We propose a new method to generate these conferrable adversarial examples. We present an extensive study on the irremovability of our fingerprint against fine-tuning, weight pruning, retraining, retraining with different architectures, three model extraction attacks from related work, transfer learning, adversarial training, and two new adaptive attacks. Our fingerprint is robust against distillation, related model extraction attacks, and even transfer learning when the attacker has no access to the model provider's dataset. Our fingerprint is the first method that reaches a ROC AUC of 1.0 in verifying surrogates, compared to a ROC AUC of 0.63 by previous fingerprints.

Submitted 20 January, 2021; v1 submitted 2 December, 2019; originally announced December 2019.

arXiv:1911.10756 [pdf, ps, other]

doi 10.1051/0004-6361/201935245

The extended molecular envelope of the asymptotic giant branch star $π^{1}$ Gruis as seen by ALMA II. The spiral-outflow observed at high-angular resolution

Authors: L. Doan, S. Ramstedt, W. H. T. Vlemmings, S. Mohamed, S. Höfner, E. De Beck, F. Kerschbaum, M. Lindqvist, M. Maercker, C. Paladini, M. Wittkowski

Abstract: The AGB star $π^{1}$ Gruis has a known companion (at a separation of ~400 AU) which cannot explain the strong deviations from the spherical symmetry of the CSE. Recently, hydrodynamic simulations of mass transfer in closer binary systems have successfully reproduced the spiral-shaped CSEs found around a handful of sources. There is growing evidence for an even closer, undetected companion complicating the case of $π^{1}$ Gruis further. The improved spatial resolution allows for the investigation of the complex circumstellar morphology and the search for imprints on the CSE of the third component. We have observed the 12CO J=3-2 line emission from $π^{1}$ Gruis using both the compact and extended array of Atacama Large Millimeter/submillimeter Array (ALMA). The interferometric data has furthermore been combined with data from the ALMA total power (TP) array. The imaged brightness distribution has been used to constrain a non-local, non-LTE 3D radiative transfer model of the CSE. The high-angular resolution ALMA data have revealed the first example of a source on the AGB where both a faster bipolar outflow and a spiral pattern along the orbital plane can be seen in the gas envelope. The spiral can be traced in the low- to intermediate velocity, from 13 to 25 km s$^{-1}$, equatorial torus. The largest spiral-arm separation is $\approx$5".5 and consistent with a companion with an orbital period of $\approx$330 yrs and a separation of less than 70 AU. The kinematics of the bipolar outflow is consistent with it being created during a mass-loss eruption where the mass-loss rate from the system increased by at least a factor of 5 during 10-15 yrs. The spiral pattern is the result of an undetected companion. The bipolar outflow is the result of a rather recent mass-loss eruption event.

Submitted 27 November, 2019; v1 submitted 25 November, 2019; originally announced November 2019.

Comments: 12 pages, 11 figures

arXiv:1910.14268 [pdf, other]

RIGA: Covert and Robust White-Box Watermarking of Deep Neural Networks

Authors: Tianhao Wang, Florian Kerschbaum

Abstract: Watermarking of deep neural networks (DNN) can enable their tracing once released by a data owner. In this paper, we generalize white-box watermarking algorithms for DNNs, where the data owner needs white-box access to the model to extract the watermark. White-box watermarking algorithms have the advantage that they do not impact the accuracy of the watermarked model. We propose Robust whIte-box GAn watermarking (RIGA), a novel white-box watermarking algorithm that uses adversarial training. Our extensive experiments demonstrate that the proposed watermarking algorithm not only does not impact accuracy, but also significantly improves the covertness and robustness over the current state-of-art.

Submitted 13 February, 2021; v1 submitted 31 October, 2019; originally announced October 2019.

Comments: WebConf'21 (Full Paper)

arXiv:1909.08362 [pdf, other]

Non-Interactive Private Decision Tree Evaluation

Authors: Anselme Tueno, Yordan Boev, Florian Kerschbaum

Abstract: Decision trees are a powerful prediction model with many applications in statistics, data mining, and machine learning. In some settings, the model and the data to be classified may contain sensitive information belonging to different parties. In this paper, we, therefore, address the problem of privately evaluating a decision tree on private data. This scenario consists of a server holding a private decision tree model and a client interested in classifying its private attribute vector using the server's private model. The goal of the computation is to obtain the classification while preserving the privacy of both - the decision tree and the client input. After the computation, the classification result is revealed only to the client, and nothing else is revealed either to the client or to the server. Existing privacy-preserving protocols that address this problem use or combine different generic secure multiparty computation approaches resulting in several interactions between the client and the server. Our goal is to design and implement a novel client-server protocol that delegates the complete tree evaluation to the server while preserving privacy and reducing the overhead. The idea is to use fully (somewhat) homomorphic encryption and evaluate the tree on ciphertexts encrypted under the client's public key. However, since current somewhat homomorphic encryption schemes have high overhead, we combine efficient data representations with different algorithmic optimizations to keep the computational overhead and the communication cost low. As a result, we are able to provide the first non-interactive protocol, that allows the client to delegate the evaluation to the server by sending an encrypted input and receiving only the encryption of the result. Our scheme has only one round and can evaluate a complete tree of depth 10 within seconds.

Submitted 18 September, 2019; originally announced September 2019.

arXiv:1909.08347 [pdf, other]

Secure Computation of the kth-Ranked Element in a Star Network

Authors: Anselme Tueno, Florian Kerschbaum, Stefan Katzenbeisser, Yordan Boev, Mubashir Qureshi

Abstract: We consider the problem of securely computing the kth-ranked element in a sequence of n private integers distributed among n parties. The kth-ranked element (e.g., minimum, maximum, median) is of particular interest in benchmarking, which allows a company to compare its own key performance indicator to the statistics of its peer group. The individual integers are sensitive data, yet the kth-ranked element is of mutual interest to the parties. Previous secure computation protocols for the kth-ranked element require a communication channel between each pair of parties. They do not scale to a large number of parties as they are highly interactive resulting in longer delays. Moreover, they are difficult to deploy as special arrangements are required between each pair of parties to establish a secure connection. A server model naturally fits with the client-server architecture of Internet applications in which clients are connected to the server and not to other clients. It can simplify secure computation by reducing the number of rounds, and as a result, improve its performance and scalability. In this model, there are communication channels only between each client and the server, while only clients provide inputs to the computation. Hence, it is a centralized communication pattern, i.e., a star network. We propose different approaches for privately computing the kth-ranked element in the server model, using either garbled circuits or threshold homomorphic encryption. Our schemes have a constant number of rounds and can compute the kth-ranked element within seconds for up to 50 clients in a WAN.

Submitted 18 September, 2019; originally announced September 2019.

arXiv:1906.07745 [pdf, other]

On the Robustness of the Backdoor-based Watermarking in Deep Neural Networks

Authors: Masoumeh Shafieinejad, Jiaqi Wang, Nils Lukas, Xinda Li, Florian Kerschbaum

Abstract: Obtaining state-of-the-art performance of deep learning models imposes a high cost on model generators, due to the tedious data preparation and the substantial processing requirements. To protect the model from unauthorized re-distribution, watermarking approaches have been introduced in the past couple of years. We investigate the robustness and reliability of state-of-the-art deep neural network watermarking schemes. We focus on backdoor-based watermarking and propose two attacks, one black-box and one white-box, that remove the watermark. Our black-box attack steals the model and removes the watermark with minimum requirements; it just relies on public unlabeled data and black-box access to the classification label. It does not need classification confidences or access to the model's sensitive information such as the training data set, the trigger set or the model parameters. The white-box attack proposes an efficient watermark removal when the parameters of the marked model are available; our white-box attack does not require access to the labeled data or the trigger set and improves the runtime of the black-box attack up to seventeen times. We also prove the security inadequacy of backdoor-based watermarking in keeping the watermark undetectable by proposing an attack that detects whether a model contains a watermark. Our attacks show that a recipient of a marked model can remove a backdoor-based watermark with significantly less effort than training a new model, and some other techniques are needed to protect against re-distribution by a motivated attacker.

Submitted 25 November, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

arXiv:1903.04585 [pdf, other]

Cool, evolved stars: results, challenges, and promises for the next decade

Authors: Gioia Rau, Rodolfo Montez Jr., Kenneth G. Carpenter, Markus Wittkowski, Sara Bladh, Margarita Karovska, Vladimir Airapetian, Tom Ayres, Martha Boyer, Andrea Chiavassa, Geoffrey Clayton, William Danchi, Orsola De Marco, Andrea K. Dupree, Tomasz Kaminski, Joel H. Kastner, Franz Kerschbaum, Jeffrey Linsky, Bruno Lopez, John Monnier, Miguel Montargès, Krister Nielsen, Keiichi Ohnaka, Sofia Ramstedt, Rachael Roettenbacher , et al. (5 additional authors not shown)

Abstract: Cool, evolved stars are the main source of chemical enrichment of the interstellar medium (ISM), and understanding their mass loss and structure offers a unique opportunity to study the cycle of matter in the Universe. Pulsation, convection, and other dynamic processes in cool evolved stars create an atmosphere where molecules and dust can form, including those necessary to the formation of life (e.g. carbon-bearing molecules). Understanding the structure and composition of these stars is thus vital to several aspects of stellar astrophysics, ranging from ISM studies to modeling young galaxies and exoplanet research. Recent modeling efforts and increasingly precise observations now reveal that our understanding of cool stars' photospheric, chromospheric, and atmospheric structures is limited by inadequate knowledge of the dynamic and chemical processes at work. Here we outline promising scientific opportunities for the next decade. We identify and discuss the following main opportunities: (1) identify and model the physical processes that must be included in current 1D and 3D atmosphere models of cool, evolved stars; (2) refine our understanding of photospheric, chromospheric, and outer atmospheric regions of cool evolved stars, their properties and parameters, through high-resolution spectroscopic observations, and interferometric observations at high angular resolution; (3) include the neglected role of chromospheric activity in the mass loss process of red giant branch and red supergiant stars, and understand the role played by their magnetic fields; (4) identify the important shaping mechanisms for planetary nebulae and their relation with the parent asymptotic giant branch stars.

Submitted 11 March, 2019; originally announced March 2019.

Comments: 10 pages, 2 figures, White Paper submitted to the Astronomy and Astrophysics Decadal Survey (Astro2020)

arXiv:1902.09259 [pdf, other]

doi 10.1038/s41550-019-0703-5

Reduction of the maximum mass-loss rate of OH/IR stars due to unnoticed binary interaction

Authors: L. Decin, W. Homan, T. Danilovich, A. de Koter, D. Engels, L. B. F. M. Waters, S. Muller, C. Gielen, D. A. García-Hernández, R. J. Stancliffe, M. Vande Sande, G. Molenberghs, F. Kerschbaum, A. A. Zijlstra, I. El Mellah

Abstract: In 1981, the idea of a superwind that ends the life of cool giant stars was proposed. Extreme OH/IR-stars develop superwinds with the highest mass-loss rates known so far, up to a few 10^(-4) Msun/yr, informing our understanding of the maximum mass-loss rate achieved during the Asymptotic Giant Branch (AGB) phase. A conundrum arises whereby the observationally determined duration of the superwind phase is too short for these stars to become white dwarfs. Here, we report on the detection of spiral structures around two cornerstone extreme OH/IR-stars, OH26.5+0.6 and OH30.1-0.7, identifying them as wide binary systems. Hydrodynamical simulations show that the companion's gravitational attraction creates an equatorial density enhancement mimicking a short extreme superwind phase, thereby solving the decades-old conundrum. This discovery restricts the maximum mass-loss rate of AGB stars around the single-scattering radiation-pressure limit of a few 10^(-5) Msun/yr. This brings about crucial implications for nucleosynthetic yields, planet survival, and the wind-driving mechanism.

Submitted 25 February, 2019; originally announced February 2019.

Comments: Publication date: 25 February 2019 at 16:00h GMT

Journal ref: Nature Astronomy, 2019

arXiv:1811.07577 [pdf, other]

doi 10.1051/0004-6361/201833652

ALMA observations of the "fresh" carbon-rich AGB star TX Piscium. The discovery of an elliptical detached shell

Authors: Magdalena Brunner, Marko Mecina, Matthias Maercker, Ernst A. Dorfi, Franz Kerschbaum, Hans Olofsson, Gioia Rau

Abstract: Aims. The carbon-rich asymptotic giant branch (AGB) star TX Piscium (TX Psc) has been observed multiple times during multiple epochs and at different wavelengths and resolutions, showing a complex molecular CO line profile and a ring-like structure in thermal dust emission. We investigate the molecular counterpart in high resolution, aiming to resolve the ring-like structure and identify its origin. Methods. Atacama Large Millimeter/submillimeter Array (ALMA) observations have been carried out to map the circumstellar envelope (CSE) of TX Psc in CO(2-1) emission and investigate the counterpart to the ring-like dust structure. Results. We report the detection of a thin, irregular, and elliptical detached molecular shell around TX Psc, which coincides with the dust emission. This is the first discovery of a non-spherically symmetric detached shell, raising questions about the shaping of detached shells. Conclusions. We investigate possible shaping mechanisms for elliptical detached shells and find that in the case of TX Psc, stellar rotation of 2 km/s can lead to a non-uniform mass-loss rate and velocity distribution from stellar pole to equator, recreating the elliptical CSE. We discuss the possible scenarios for increased stellar momentum, enabling the rotation rates needed to reproduce the ellipticity of our observations, and come to the conclusion that momentum transfer of an orbiting object with the mass of a brown dwarf would be sufficient.

Submitted 19 November, 2018; originally announced November 2018.

Journal ref: A&A 621, A50 (2019)

arXiv:1806.01622 [pdf, other]

doi 10.1051/0004-6361/201832724

Molecular line study of the S-type AGB star W Aquilae. ALMA observations of CS, SiS, SiO and HCN

Authors: Magdalena Brunner, Taissa Danilovich, Sofia Ramstedt, Ivan Marti-Vidal, Elvire De Beck, Wouter H. T. Vlemmings, Michael Lindqvist, Franz Kerschbaum

Abstract: Context. With the outstanding spatial resolution and sensitivity of the Atacama Large Millimeter/sub-millimeter Array (ALMA), molecular gas other than the abundant CO can be observed and resolved in circumstellar envelopes (CSEs) around evolved stars, such as the binary S-type Asymptotic Giant Branch (AGB) star W Aquilae. Aims. We aim to constrain the chemical composition of the CSE and determine the radial abundance distribution, the photospheric peak abundance, and isotopic ratios of a selection of chemically important molecular species in the innermost CSE of W Aql. The derived parameters are put into the context of the chemical evolution of AGB stars and are compared with theoretical models. Methods. We employ one-dimensional radiative transfer modeling - with the accelerated lambda iteration (ALI) radiative transfer code - of the radial abundance distribution of a total of five molecular species (CS, SiS, 30SiS, 29SiO and H13CN) and determine the best fitting model parameters based on high-resolution ALMA observations as well as archival single-dish observations. The additional advantage of the spatially resolved ALMA observations is that we can directly constrain the radial profile of the observed line transitions from the observations. Results. We derive abundances and e-folding radii for CS, SiS, 30SiS, 29SiO and H13CN and compare them to previous studies, which are based only on unresolved single-dish spectra. Our results are in line with previous results and are more accurate due to resolution of the emission regions.

Submitted 5 June, 2018; originally announced June 2018.

Journal ref: A&A 617, A23 (2018)

Showing 1–50 of 106 results for author: Kerschbaum, F