Search | arXiv e-print repository

What Are the Chances? Explaining the Epsilon Parameter in Differential Privacy

Authors: Priyanka Nanayakkara, Mary Anne Smart, Rachel Cummings, Gabriel Kaptchuk, Elissa Redmiles

Abstract: Differential privacy (DP) is a mathematical privacy notion increasingly deployed across government and industry. With DP, privacy protections are probabilistic: they are bounded by the privacy budget parameter, $ε$. Prior work in health and computational science finds that people struggle to reason about probabilistic risks. Yet, communicating the implications of $ε$ to people contributing their d… ▽ More Differential privacy (DP) is a mathematical privacy notion increasingly deployed across government and industry. With DP, privacy protections are probabilistic: they are bounded by the privacy budget parameter, $ε$. Prior work in health and computational science finds that people struggle to reason about probabilistic risks. Yet, communicating the implications of $ε$ to people contributing their data is vital to avoiding privacy theater -- presenting meaningless privacy protection as meaningful -- and empowering more informed data-sharing decisions. Drawing on best practices in risk communication and usability, we develop three methods to convey probabilistic DP guarantees to end users: two that communicate odds and one offering concrete examples of DP outputs. We quantitatively evaluate these explanation methods in a vignette survey study ($n=963$) via three metrics: objective risk comprehension, subjective privacy understanding of DP guarantees, and self-efficacy. We find that odds-based explanation methods are more effective than (1) output-based methods and (2) state-of-the-art approaches that gloss over information about $ε$. Further, when offered information about $ε$, respondents are more willing to share their data than when presented with a state-of-the-art DP explanation; this willingness to share is sensitive to $ε$ values: as privacy protections weaken, respondents are less likely to share data. △ Less

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2111.04439 [pdf, other]

Addressing Privacy Threats from Machine Learning

Authors: Mary Anne Smart

Abstract: Every year at NeurIPS, machine learning researchers gather and discuss exciting applications of machine learning in areas such as public health, disaster response, climate change, education, and more. However, many of these same researchers are expressing growing concern about applications of machine learning for surveillance (Nanayakkara et al., 2021). This paper presents a brief overview of stra… ▽ More Every year at NeurIPS, machine learning researchers gather and discuss exciting applications of machine learning in areas such as public health, disaster response, climate change, education, and more. However, many of these same researchers are expressing growing concern about applications of machine learning for surveillance (Nanayakkara et al., 2021). This paper presents a brief overview of strategies for resisting these surveillance technologies and calls for greater collaboration between machine learning and human-computer interaction researchers to address the threats that these technologies pose. △ Less

Submitted 24 October, 2021; originally announced November 2021.

Comments: 3 pages. Human Centered AI Workshop @ NeurIPS 2021 accepted submission

arXiv:2101.11744 [pdf, other]

On the map** between Hopfield networks and Restricted Boltzmann Machines

Authors: Matthew Smart, Anton Zilman

Abstract: Hopfield networks (HNs) and Restricted Boltzmann Machines (RBMs) are two important models at the interface of statistical physics, machine learning, and neuroscience. Recently, there has been interest in the relationship between HNs and RBMs, due to their similarity under the statistical mechanics formalism. An exact map** between HNs and RBMs has been previously noted for the special case of or… ▽ More Hopfield networks (HNs) and Restricted Boltzmann Machines (RBMs) are two important models at the interface of statistical physics, machine learning, and neuroscience. Recently, there has been interest in the relationship between HNs and RBMs, due to their similarity under the statistical mechanics formalism. An exact map** between HNs and RBMs has been previously noted for the special case of orthogonal (uncorrelated) encoded patterns. We present here an exact map** in the case of correlated pattern HNs, which are more broadly applicable to existing datasets. Specifically, we show that any HN with $N$ binary variables and $p<N$ arbitrary binary patterns can be transformed into an RBM with $N$ binary visible variables and $p$ gaussian hidden variables. We outline the conditions under which the reverse map** exists, and conduct experiments on the MNIST dataset which suggest the map** provides a useful initialization to the RBM weights. We discuss extensions, the potential importance of this correspondence for the training of RBMs, and for understanding the performance of deep architectures which utilize RBMs. △ Less

Submitted 5 March, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

Comments: ICLR 2021 oral paper

Journal ref: The 9th International Conference on Learning Representations (ICLR 2021)

arXiv:2002.10077 [pdf, other]

Approximate Data Deletion from Machine Learning Models

Authors: Zachary Izzo, Mary Anne Smart, Kamalika Chaudhuri, James Zou

Abstract: Deleting data from a trained machine learning (ML) model is a critical task in many applications. For example, we may want to remove the influence of training points that might be out of date or outliers. Regulations such as EU's General Data Protection Regulation also stipulate that individuals can request to have their data deleted. The naive approach to data deletion is to retrain the ML model… ▽ More Deleting data from a trained machine learning (ML) model is a critical task in many applications. For example, we may want to remove the influence of training points that might be out of date or outliers. Regulations such as EU's General Data Protection Regulation also stipulate that individuals can request to have their data deleted. The naive approach to data deletion is to retrain the ML model on the remaining data, but this is too time consuming. In this work, we propose a new approximate deletion method for linear and logistic models whose computational cost is linear in the the feature dimension $d$ and independent of the number of training data $n$. This is a significant gain over all existing methods, which all have superlinear time dependence on the dimension. We also develop a new feature-injection test to evaluate the thoroughness of data deletion from ML models. △ Less

Submitted 23 February, 2021; v1 submitted 24 February, 2020; originally announced February 2020.

Comments: 20 pages, 1 figure, accepted for publication at AISTATS 2021

arXiv:2001.10117 [pdf, other]

doi 10.1177/0278364920979368

Canadian Adverse Driving Conditions Dataset

Authors: Matthew Pitropov, Danson Garcia, Jason Rebello, Michael Smart, Carlos Wang, Krzysztof Czarnecki, Steven Waslander

Abstract: The Canadian Adverse Driving Conditions (CADC) dataset was collected with the Autonomoose autonomous vehicle platform, based on a modified Lincoln MKZ. The dataset, collected during winter within the Region of Waterloo, Canada, is the first autonomous vehicle dataset that focuses on adverse driving conditions specifically. It contains 7,000 frames collected through a variety of winter weather cond… ▽ More The Canadian Adverse Driving Conditions (CADC) dataset was collected with the Autonomoose autonomous vehicle platform, based on a modified Lincoln MKZ. The dataset, collected during winter within the Region of Waterloo, Canada, is the first autonomous vehicle dataset that focuses on adverse driving conditions specifically. It contains 7,000 frames collected through a variety of winter weather conditions of annotated data from 8 cameras (Ximea MQ013CG-E2), Lidar (VLP-32C) and a GNSS+INS system (Novatel OEM638). The sensors are time synchronized and calibrated with the intrinsic and extrinsic calibrations included in the dataset. Lidar frame annotations that represent ground truth for 3D object detection and tracking have been provided by Scale AI. △ Less

Submitted 27 February, 2020; v1 submitted 27 January, 2020; originally announced January 2020.

arXiv:1903.03838 [pdf, other]

BayesOD: A Bayesian Approach for Uncertainty Estimation in Deep Object Detectors

Authors: Ali Harakeh, Michael Smart, Steven L. Waslander

Abstract: When incorporating deep neural networks into robotic systems, a major challenge is the lack of uncertainty measures associated with their output predictions. Methods for uncertainty estimation in the output of deep object detectors (DNNs) have been proposed in recent works, but have had limited success due to 1) information loss at the detectors non-maximum suppression (NMS) stage, and 2) failure… ▽ More When incorporating deep neural networks into robotic systems, a major challenge is the lack of uncertainty measures associated with their output predictions. Methods for uncertainty estimation in the output of deep object detectors (DNNs) have been proposed in recent works, but have had limited success due to 1) information loss at the detectors non-maximum suppression (NMS) stage, and 2) failure to take into account the multitask, many-to-one nature of anchor-based object detection. To that end, we introduce BayesOD, an uncertainty estimation approach that reformulates the standard object detector inference and Non-Maximum suppression components from a Bayesian perspective. Experiments performed on four common object detection datasets show that BayesOD provides uncertainty estimates that are better correlated with the accuracy of detections, manifesting as a significant reduction of 9.77\%-13.13\% on the minimum Gaussian uncertainty error metric and a reduction of 1.63\%-5.23\% on the minimum Categorical uncertainty error metric. Code will be released at {\url{https://github.com/asharakeh/bayes-od-rc}}. △ Less

Submitted 16 September, 2019; v1 submitted 9 March, 2019; originally announced March 2019.

Showing 1–6 of 6 results for author: Smart, M