-
Label Noise Robustness for Domain-Agnostic Fair Corrections via Nearest Neighbors Label Spreading
Authors:
Nathan Stromberg,
Rohan Ayyagari,
Sanmi Koyejo,
Richard Nock,
Lalitha Sankar
Abstract:
Last-layer retraining methods have emerged as an efficient framework for correcting existing base models. Within this framework, several methods have been proposed to deal with correcting models for subgroup fairness with and without group membership information. Importantly, prior work has demonstrated that many methods are susceptible to noisy labels. To this end, we propose a drop-in correction for label noise in last-layer retraining, and demonstrate that it achieves state-of-the-art worst-group accuracy for a broad range of symmetric label noise and across a wide variety of datasets exhibiting spurious correlations. Our proposed approach uses label spreading on a latent nearest neighbors graph and has minimal computational overhead compared to existing methods.
Submitted 13 June, 2024;
originally announced June 2024.
-
Theoretical Guarantees of Data Augmented Last Layer Retraining Methods
Authors:
Monica Welfert,
Nathan Stromberg,
Lalitha Sankar
Abstract:
Ensuring fair predictions across many distinct subpopulations in the training data can be prohibitive for large models. Recently, simple linear last layer retraining strategies, in combination with data augmentation methods such as upweighting, downsampling and mixup, have been shown to achieve state-of-the-art performance for worst-group accuracy, which quantifies accuracy for the least prevalent subpopulation. For linear last layer retraining and the abovementioned augmentations, we present the optimal worst-group accuracy when modeling the distribution of the latent representations (input to the last layer) as Gaussian for each subpopulation. We evaluate and verify our results for both synthetic and large publicly available datasets.
Submitted 9 May, 2024;
originally announced May 2024.
-
Model Predictive Control for Joint Ramping and Regulation-Type Service from Distributed Energy Resource Aggregations
Authors:
Joel Mathias,
Rajasekhar Anguluri,
Oliver Kosut,
Lalitha Sankar
Abstract:
Distributed energy resources (DERs) such as grid-responsive loads and batteries can be harnessed to provide ramping and regulation services across the grid. This paper concerns the problem of optimal allocation of different classes of DERs, where each class is an aggregation of similar DERs, to balance net-demand forecasts. The resulting resource allocation problem is solved using model-predictive control (MPC) that utilizes a rolling sequence of finite time-horizon constrained optimizations. This is based on the concept that we have more accurate estimates of the load forecast in the short term, so each optimization in the rolling sequence of optimization problems uses more accurate short term load forecasts while ensuring satisfaction of capacity and dynamical constraints. Simulations demonstrate that the MPC solution can indeed reduce the ramping required from bulk generation, while mitigating near-real time grid disturbances.
Submitted 5 May, 2024;
originally announced May 2024.
-
An Adversarial Approach to Evaluating the Robustness of Event Identification Models
Authors:
Obai Bahwal,
Oliver Kosut,
Lalitha Sankar
Abstract:
Intelligent machine learning approaches are finding active use for event detection and identification that allow real-time situational awareness. Yet, such machine learning algorithms have been shown to be susceptible to adversarial attacks on the incoming telemetry data. This paper considers a physics-based modal decomposition method to extract features for event classification and focuses on interpretable classifiers including logistic regression and gradient boosting to distinguish two types of events: load loss and generation loss. The resulting classifiers are then tested against an adversarial algorithm to evaluate their robustness. The adversarial attack is tested in two settings: the white box setting, wherein the attacker knows exactly the classification model; and the gray box setting, wherein the attacker has access to historical data from the same network as was used to train the classifier, but does not know the classification model. Thorough experiments on the synthetic South Carolina 500-bus system highlight that a relatively simpler model such as logistic regression is more susceptible to adversarial attacks than gradient boosting.
Submitted 22 April, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Robustness to Subpopulation Shift with Domain Label Noise via Regularized Annotation of Domains
Authors:
Nathan Stromberg,
Rohan Ayyagari,
Monica Welfert,
Sanmi Koyejo,
Richard Nock,
Lalitha Sankar
Abstract:
Existing methods for last layer retraining that aim to optimize worst-group accuracy (WGA) rely heavily on well-annotated groups in the training data. We show, both in theory and practice, that annotation-based data augmentations using either downsampling or upweighting for WGA are susceptible to domain annotation noise, and in high-noise regimes approach the WGA of a model trained with vanilla empirical risk minimization. We introduce Regularized Annotation of Domains (RAD) in order to train robust last layer classifiers without the need for explicit domain annotations. Our results show that RAD is competitive with other recently proposed domain annotation-free techniques. Most importantly, RAD outperforms state-of-the-art annotation-reliant methods even with only 5% noise in the training data for several publicly available datasets.
Submitted 26 June, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Adaptive Methods for Variational Inequalities under Relaxed Smoothness Assumption
Authors:
Daniil Vankov,
Angelia Nedich,
Lalitha Sankar
Abstract:
Variational Inequality (VI) problems have attracted great interest in the machine learning (ML) community due to their application in adversarial and multi-agent training. Despite its relevance in ML, the oft-used strong-monotonicity and Lipschitz continuity assumptions on VI problems are restrictive and do not hold in practice. To address this, we relax smoothness and monotonicity assumptions and study structured non-monotone generalized smoothness. The key idea of our results is in adaptive stepsizes. We prove the first-known convergence results for solving generalized smooth VIs for the three popular methods, namely, projection, Korpelevich, and Popov methods. Our convergence rate results for generalized smooth VIs match or improve existing results on smooth VIs. We present numerical experiments that support our theoretical guarantees and highlight the efficiency of proposed adaptive stepsizes.
Submitted 8 February, 2024;
originally announced February 2024.
-
Parameter Optimization with Conscious Allocation (POCA)
Authors:
Joshua Inman,
Tanmay Khandait,
Giulia Pedrielli,
Lalitha Sankar
Abstract:
The performance of modern machine learning algorithms depends upon the selection of a set of hyperparameters. Common examples of hyperparameters are learning rate and the number of layers in a dense neural network. Auto-ML is a branch of optimization that has produced important contributions in this area. Within Auto-ML, hyperband-based approaches, which eliminate poorly-performing configurations after evaluating them at low budgets, are among the most effective. However, the performance of these algorithms strongly depends on how effectively they allocate the computational budget to various hyperparameter configurations. We present the new Parameter Optimization with Conscious Allocation (POCA), a hyperband-based algorithm that adaptively allocates the inputted budget to the hyperparameter configurations it generates following a Bayesian sampling scheme. We compare POCA to its nearest competitor at optimizing the hyperparameters of an artificial toy function and a deep neural network and find that POCA finds strong configurations faster in both settings.
Submitted 28 December, 2023;
originally announced December 2023.
-
Addressing GAN Training Instabilities via Tunable Classification Losses
Authors:
Monica Welfert,
Gowtham R. Kurri,
Kyle Otstot,
Lalitha Sankar
Abstract:
Generative adversarial networks (GANs), modeled as a zero-sum game between a generator (G) and a discriminator (D), allow generating synthetic data with formal guarantees. Noting that D is a classifier, we begin by reformulating the GAN value function using class probability estimation (CPE) losses. We prove a two-way correspondence between CPE loss GANs and $f$-GANs which minimize $f$-divergences. We also show that all symmetric $f$-divergences are equivalent in convergence. In the finite sample and model capacity setting, we define and obtain bounds on estimation and generalization errors. We specialize these results to $α$-GANs, defined using $α$-loss, a tunable CPE loss family parametrized by $α\in(0,\infty]$. We next introduce a class of dual-objective GANs to address training instabilities of GANs by modeling each player's objective using $α$-loss to obtain $(α_D,α_G)$-GANs. We show that the resulting non-zero sum game simplifies to minimizing an $f$-divergence under appropriate conditions on $(α_D,α_G)$. Generalizing this dual-objective formulation using CPE losses, we define and obtain upper bounds on an appropriately defined estimation error. Finally, we highlight the value of tuning $(α_D,α_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring as well as the large publicly available Celeb-A and LSUN Classroom image datasets.
Submitted 27 October, 2023;
originally announced October 2023.
-
Last Iterate Convergence of Popov Method for Non-monotone Stochastic Variational Inequalities
Authors:
Daniil Vankov,
Angelia Nedich,
Lalitha Sankar
Abstract:
This paper focuses on non-monotone stochastic variational inequalities (SVIs) that may not have a unique solution. A commonly used efficient algorithm to solve VIs is the Popov method, which is known to have the optimal convergence rate for VIs with Lipschitz continuous and strongly monotone operators. We introduce a broader class of structured non-monotone operators, namely $p$-quasi sharp operators ($p> 0$), which allows tractably analyzing convergence behavior of algorithms. We show that the stochastic Popov method converges almost surely to a solution for all operators from this class under a linear growth condition. In addition, we obtain the last iterate convergence rate (in expectation) for the method under a linear growth condition for $2$-quasi sharp operators. Based on our analysis, we refine the results for smooth $2$-quasi sharp and $p$-quasi sharp operators (on a compact set), and obtain the optimal convergence rates. We further provide numerical experiments that demonstrate advantages of the stochastic Popov method over the stochastic projection method for solving SVIs.
Submitted 25 October, 2023;
originally announced October 2023.
-
A Semi-Supervised Approach for Power System Event Identification
Authors:
Nima Taghipourbazargani,
Lalitha Sankar,
Oliver Kosut
Abstract:
Event identification is increasingly recognized as crucial for enhancing the reliability, security, and stability of the electric power system. With the growing deployment of Phasor Measurement Units (PMUs) and advancements in data science, there are promising opportunities to explore data-driven event identification via machine learning classification techniques. However, obtaining accurately labeled eventful PMU data samples remains challenging due to its labor-intensive nature and uncertainty about the event type (class) in real-time. Thus, it is natural to use semi-supervised learning techniques, which make use of both labeled and unlabeled samples. We propose a novel semi-supervised framework to assess the effectiveness of incorporating unlabeled eventful samples to enhance existing event identification methodologies. We evaluate three categories of classical semi-supervised approaches: (i) self-training, (ii) transductive support vector machines (TSVM), and (iii) graph-based label spreading (LS) method. Our approach characterizes events using physically interpretable features extracted from modal analysis of synthetic eventful PMU data. In particular, we focus on the identification of four event classes whose identification is crucial for grid operations. We have developed and publicly shared a comprehensive Event Identification package which consists of three aspects: data generation, feature extraction, and event identification with limited labels using semi-supervised methodologies. Using this package, we generate and evaluate eventful PMU data for the South Carolina synthetic network. Our evaluation consistently demonstrates that graph-based LS outperforms the other two semi-supervised methods that we consider, and can noticeably improve event identification performance relative to the setting with only a small number of labeled samples.
Submitted 18 September, 2023;
originally announced September 2023.
-
Unifying Privacy Measures via Maximal $(α,β)$-Leakage (M$α$beL)
Authors:
Atefeh Gilani,
Gowtham R. Kurri,
Oliver Kosut,
Lalitha Sankar
Abstract:
We introduce a family of information leakage measures called maximal $(α,β)$-leakage (M$α$beL), parameterized by real numbers $α$ and $β$ greater than or equal to 1. The measure is formalized via an operational definition involving an adversary guessing an unknown (randomized) function of the data given the released data. We obtain a simplified computable expression for the measure and show that it satisfies several basic properties such as monotonicity in $β$ for a fixed $α$, non-negativity, data processing inequalities, and additivity over independent releases. We highlight the relevance of this family by showing that it bridges several known leakage measures, including maximal $α$-leakage $(β=1)$, maximal leakage $(α=\infty,β=1)$, local differential privacy (LDP) $(α=\infty,β=\infty)$, and local Renyi differential privacy (LRDP) $(α=β)$, thereby giving an operational interpretation to local Renyi differential privacy. We also study a conditional version of M$α$beL on leveraging which we recover differential privacy and Renyi differential privacy. A new variant of LRDP, which we call maximal Renyi leakage, appears as a special case of M$α$beL for $α=\infty$ that smoothly tunes between maximal leakage ($β=1$) and LDP ($β=\infty$). Finally, we show that a vector form of the maximal Renyi leakage relaxes differential privacy under Gaussian and Laplacian mechanisms.
Submitted 4 April, 2024; v1 submitted 14 April, 2023;
originally announced April 2023.
-
$(α_D,α_G)$-GANs: Addressing GAN Training Instabilities via Dual Objectives
Authors:
Monica Welfert,
Kyle Otstot,
Gowtham R. Kurri,
Lalitha Sankar
Abstract:
In an effort to address the training instabilities of GANs, we introduce a class of dual-objective GANs with different value functions (objectives) for the generator (G) and discriminator (D). In particular, we model each objective using $α$-loss, a tunable classification loss, to obtain $(α_D,α_G)$-GANs, parameterized by $(α_D,α_G)\in (0,\infty]^2$. For a sufficiently large number of samples and capacities for G and D, we show that the resulting non-zero sum game simplifies to minimizing an $f$-divergence under appropriate conditions on $(α_D,α_G)$. In the finite sample and capacity setting, we define estimation error to quantify the gap in the generator's performance relative to the optimal setting with infinite samples and obtain upper bounds on this error, showing it to be order optimal under certain conditions. Finally, we highlight the value of tuning $(α_D,α_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring and the Stacked MNIST datasets.
Submitted 3 May, 2023; v1 submitted 28 February, 2023;
originally announced February 2023.
-
Smoothly Giving up: Robustness for Simple Models
Authors:
Tyler Sypherd,
Nathan Stromberg,
Richard Nock,
Visar Berisha,
Lalitha Sankar
Abstract:
There is a growing need for models that are interpretable and have reduced energy and computational cost (e.g., in health care analytics and federated learning). Examples of algorithms to train such models include logistic regression and boosting. However, one challenge facing these algorithms is that they provably suffer from label noise; this has been attributed to the joint interaction between oft-used convex loss functions and simpler hypothesis classes, resulting in too much emphasis being placed on outliers. In this work, we use the margin-based $α$-loss, which continuously tunes between canonical convex and quasi-convex losses, to robustly train simple models. We show that the $α$ hyperparameter smoothly introduces non-convexity and offers the benefit of "giving up" on noisy training examples. We also provide results on the Long-Servedio dataset for boosting and a COVID-19 survey dataset for logistic regression, highlighting the efficacy of our approach across multiple relevant domains.
Submitted 17 February, 2023;
originally announced February 2023.
-
An Alphabet of Leakage Measures
Authors:
Atefeh Gilani,
Gowtham R. Kurri,
Oliver Kosut,
Lalitha Sankar
Abstract:
We introduce a family of information leakage measures called maximal $α,β$-leakage, parameterized by real numbers $α$ and $β$. The measure is formalized via an operational definition involving an adversary guessing an unknown function of the data given the released data. We obtain a simple, computable expression for the measure and show that it satisfies several basic properties such as monotonicity in $β$ for a fixed $α$, non-negativity, data processing inequalities, and additivity over independent releases. Finally, we highlight the relevance of this family by showing that it bridges several known leakage measures, including maximal $α$-leakage $(β=1)$, maximal leakage $(α=\infty,β=1)$, local differential privacy $(α=\infty,β=\infty)$, and local Renyi differential privacy $(α=β)$.
Submitted 28 November, 2022;
originally announced November 2022.
-
Robust Model Selection of Gaussian Graphical Models
Authors:
Abrar Zahin,
Rajasekhar Anguluri,
Lalitha Sankar,
Oliver Kosut,
Gautam Dasarathy
Abstract:
In Gaussian graphical model selection, noise-corrupted samples present significant challenges. It is known that even minimal amounts of noise can obscure the underlying structure, leading to fundamental identifiability issues. A recent line of work addressing this "robust model selection" problem narrows its focus to tree-structured graphical models. Even within this specific class of models, exact structure recovery is shown to be impossible. However, several algorithms have been developed that are known to provably recover the underlying tree-structure up to an (unavoidable) equivalence class.
In this paper, we extend these results beyond tree-structured graphs. We first characterize the equivalence class up to which general graphs can be recovered in the presence of noise. Despite the inherent ambiguity (which we prove is unavoidable), the structure that can be recovered reveals local clustering information and global connectivity patterns in the underlying model. Such information is useful in a range of real-world problems, including power grids, social networks, protein-protein interactions, and neural structures. We then propose an algorithm which provably recovers the underlying graph up to the identified ambiguity. We further provide finite sample guarantees in the high-dimensional regime for our algorithm and validate our results through numerical simulations.
Submitted 7 May, 2024; v1 submitted 10 November, 2022;
originally announced November 2022.
-
An Operational Approach to Information Leakage via Generalized Gain Functions
Authors:
Gowtham R. Kurri,
Lalitha Sankar,
Oliver Kosut
Abstract:
We introduce a gain function viewpoint of information leakage by proposing maximal $g$-leakage, a rich class of operationally meaningful leakage measures that subsumes recently introduced leakage measures: maximal leakage and maximal $α$-leakage. In maximal $g$-leakage, the gain of an adversary in guessing an unknown random variable is measured using a gain function applied to the probability of correctly guessing. In particular, maximal $g$-leakage captures the multiplicative increase, upon observing $Y$, in the expected gain of an adversary in guessing a randomized function of $X$, maximized over all such randomized functions. We also consider the scenario where an adversary can make multiple attempts to guess the randomized function of interest. We show that maximal leakage is an upper bound on maximal $g$-leakage under multiple guesses, for any non-negative gain function $g$. We obtain a closed-form expression for maximal $g$-leakage under multiple guesses for a class of concave gain functions. We also study the maximal $g$-leakage measure for a specific class of gain functions related to the $α$-loss. In particular, we first completely characterize the minimal expected $α$-loss under multiple guesses and analyze how the corresponding leakage measure is affected by the number of guesses. Finally, we study two variants of maximal $g$-leakage depending on the type of adversary and obtain closed-form expressions for them, which do not depend on the particular gain function considered as long as it satisfies some mild regularity conditions. We do this by developing a variational characterization for the Rényi divergence of order infinity which naturally generalizes the definition of pointwise maximal leakage to incorporate arbitrary gain functions.
Submitted 7 December, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
-
PMU Tracker: A Visualization Platform for Epicentric Event Propagation Analysis in the Power Grid
Authors:
Anjana Arunkumar,
Andrea Pinceti,
Lalitha Sankar,
Chris Bryan
Abstract:
The electrical power grid is a critical infrastructure, with disruptions in transmission having severe repercussions on daily activities, across multiple sectors. To identify, prevent, and mitigate such events, power grids are being refurbished as 'smart' systems that include the widespread deployment of GPS-enabled phasor measurement units (PMUs). PMUs provide fast, precise, and time-synchronized measurements of voltage and current, enabling real-time wide-area monitoring and control. However, the potential benefits of PMUs, for analyzing grid events like abnormal power oscillations and load fluctuations, are hindered by the fact that these sensors produce large, concurrent volumes of noisy data. In this paper, we describe working with power grid engineers to investigate how this problem can be addressed from a visual analytics perspective. As a result, we have developed PMU Tracker, an event localization tool that supports power grid operators in visually analyzing and identifying power grid events and tracking their propagation through the power grid's network. As a part of the PMU Tracker interface, we develop a novel visualization technique which we term an epicentric cluster dendrogram, which allows operators to analyze the effects of an event as it propagates outwards from a source location. We robustly validate PMU Tracker with: (1) a usage scenario demonstrating how PMU Tracker can be used to analyze anomalous grid events, and (2) case studies with power grid operators using a real-world interconnection dataset. Our results indicate that PMU Tracker effectively supports the analysis of power grid events; we also demonstrate and discuss how PMU Tracker's visual analytics approach can be generalized to other domains composed of time-varying networks with epicentric event characteristics.
Submitted 7 September, 2022;
originally announced September 2022.
-
The Saddle-Point Accountant for Differential Privacy
Authors:
Wael Alghamdi,
Shahab Asoodeh,
Flavio P. Calmon,
Juan Felipe Gomez,
Oliver Kosut,
Lalitha Sankar,
Fei Wei
Abstract:
We introduce a new differential privacy (DP) accountant called the saddle-point accountant (SPA). SPA approximates privacy guarantees for the composition of DP mechanisms in an accurate and fast manner. Our approach is inspired by the saddle-point method -- a ubiquitous numerical technique in statistics. We prove rigorous performance guarantees by deriving upper and lower bounds for the approximation error offered by SPA. The crux of SPA is a combination of large-deviation methods with central limit theorems, which we derive via exponentially tilting the privacy loss random variables corresponding to the DP mechanisms. One key advantage of SPA is that it runs in constant time for the $n$-fold composition of a privacy mechanism. Numerical experiments demonstrate that SPA achieves comparable accuracy to state-of-the-art accounting methods with a faster runtime.
Submitted 19 August, 2022;
originally announced August 2022.
-
Parameter Estimation in Ill-conditioned Low-inertia Power Systems
Authors:
Rajasekhar Anguluri,
Lalitha Sankar,
Oliver Kosut
Abstract:
This paper examines model parameter estimation in dynamic power systems whose governing electro-mechanical equations are ill-conditioned or singular. This ill-conditioning is because of converter-interfaced power systems generators' zero or small inertia contribution. Consequently, the overall system inertia decreases, resulting in low-inertia power systems. We show that the standard state-space model based on least squares or subspace estimators fails to exist for these models. We overcome this challenge by considering a least-squares estimator directly on the coupled swing-equation model but not on its transformed first-order state-space form. We specifically focus on estimating inertia (mechanical and virtual) and damping constants, although our method is general enough for estimating other parameters. Our theoretical analysis highlights the role of network topology on the parameter estimates of an individual generator. For generators with greater connectivity, estimation of the associated parameters is more susceptible to variations in other generator states. Furthermore, we numerically show that estimating the parameters by ignoring their ill-conditioning aspects yields highly unreliable results.
Submitted 8 August, 2022;
originally announced August 2022.
-
Cactus Mechanisms: Optimal Differential Privacy Mechanisms in the Large-Composition Regime
Authors:
Wael Alghamdi,
Shahab Asoodeh,
Flavio P. Calmon,
Oliver Kosut,
Lalitha Sankar,
Fei Wei
Abstract:
Most differential privacy mechanisms are applied (i.e., composed) numerous times on sensitive data. We study the design of optimal differential privacy mechanisms in the limit of a large number of compositions. As a consequence of the law of large numbers, in this regime the best privacy mechanism is the one that minimizes the Kullback-Leibler divergence between the conditional output distributions of the mechanism given two different inputs. We formulate an optimization problem to minimize this divergence subject to a cost constraint on the noise. We first prove that additive mechanisms are optimal. Since the optimization problem is infinite dimensional, it cannot be solved directly; nevertheless, we quantize the problem to derive near-optimal additive mechanisms that we call "cactus mechanisms" due to their shape. We show that our quantization approach can be arbitrarily close to an optimal mechanism. Surprisingly, for quadratic cost, the Gaussian mechanism is strictly sub-optimal compared to this cactus mechanism. Finally, we provide numerical results which indicate that the cactus mechanism outperforms the Gaussian mechanism for a finite number of compositions.
Submitted 25 June, 2022;
originally announced July 2022.
-
AugLoss: A Robust Augmentation-based Fine Tuning Methodology
Authors:
Kyle Otstot,
Andrew Yang,
John Kevin Cava,
Lalitha Sankar
Abstract:
Deep Learning (DL) models achieve great successes in many domains. However, DL models increasingly face safety and robustness concerns, including noisy labeling in the training stage and feature distribution shifts in the testing stage. Previous works made significant progress in addressing these problems, but the focus has largely been on developing solutions for only one problem at a time. For example, recent work has argued for the use of tunable robust loss functions to mitigate label noise, and data augmentation (e.g., AugMix) to combat distribution shifts. As a step towards addressing both problems simultaneously, we introduce AugLoss, a simple but effective methodology that achieves robustness against both train-time noisy labeling and test-time feature distribution shifts by unifying data augmentation and robust loss functions. We conduct comprehensive experiments in varied settings of real-world dataset corruption to showcase the gains achieved by AugLoss compared to previous state-of-the-art methods. Lastly, we hope this work will open new directions for designing more robust and reliable DL models under real-world corruptions.
Submitted 28 January, 2024; v1 submitted 5 June, 2022;
originally announced June 2022.
-
$α$-GAN: Convergence and Estimation Guarantees
Authors:
Gowtham R. Kurri,
Monica Welfert,
Tyler Sypherd,
Lalitha Sankar
Abstract:
We prove a two-way correspondence between the min-max optimization of general CPE loss function GANs and the minimization of associated $f$-divergences. We then focus on $α$-GAN, defined via the $α$-loss, which interpolates several GANs (Hellinger, vanilla, Total Variation) and corresponds to the minimization of the Arimoto divergence. We show that the Arimoto divergences induced by $α$-GAN equivalently converge, for all $α\in \mathbb{R}_{>0}\cup\{\infty\}$. However, under restricted learning models and finite samples, we provide estimation bounds which indicate diverse GAN behavior as a function of $α$. Finally, we present empirical results on a toy dataset that highlight the practical utility of tuning the $α$ hyperparameter.
Submitted 12 May, 2022;
originally announced May 2022.
-
A Machine Learning Framework for Event Identification via Modal Analysis of PMU Data
Authors:
Nima T. Bazargani,
Gautam Dasarathy,
Lalitha Sankar,
Oliver Kosut
Abstract:
Power systems are prone to a variety of events (e.g. line trips and generation loss) and real-time identification of such events is crucial in terms of situational awareness, reliability, and security. Using measurements from multiple synchrophasors, i.e., phasor measurement units (PMUs), we propose to identify events by extracting features based on modal dynamics. We combine such traditional physics-based feature extraction methods with machine learning to distinguish different event types. Including all measurement channels at each PMU allows exploiting diverse features but also requires learning classification models over a high-dimensional space. To address this issue, various feature selection methods are implemented to choose the best subset of features. Using the obtained subset of features, we investigate the performance of two well-known classification models, namely, logistic regression (LR) and support vector machines (SVM) to identify generation loss and line trip events in two datasets. The first dataset is obtained from simulated generation loss and line trip events in the Texas 2000-bus synthetic grid. The second is a proprietary dataset with labeled events obtained from a large utility in the USA involving measurements from nearly 500 PMUs. Our results indicate that the proposed framework is promising for identifying the two types of events.
Submitted 3 October, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.
-
A Variational Formula for Infinity-Rényi Divergence with Applications to Information Leakage
Authors:
Gowtham R. Kurri,
Oliver Kosut,
Lalitha Sankar
Abstract:
We present a variational characterization for the Rényi divergence of order infinity. Our characterization is related to guessing: the objective functional is a ratio of maximal expected values of a gain function applied to the probability of correctly guessing an unknown random variable. An important aspect of our variational characterization is that it remains agnostic to the particular gain function considered, as long as it satisfies some regularity conditions. Also, we define two variants of a tunable measure of information leakage, the maximal $α$-leakage, and obtain closed-form expressions for these information measures by leveraging our variational characterization.
Submitted 2 May, 2022; v1 submitted 12 February, 2022;
originally announced February 2022.
-
A Complex-LASSO Approach for Localizing Forced Oscillations in Power Systems
Authors:
Rajasekhar Anguluri,
Nima Taghipourbazargani,
Oliver Kosut,
Lalitha Sankar
Abstract:
We study the problem of localizing multiple sources of forced oscillations (FOs) and estimating their characteristics, such as frequency, phase, and amplitude, using noisy PMU measurements. For each source location, we model the input oscillation as a sum of unknown sinusoidal terms. This allows us to obtain a linear relationship between measurements and the inputs at the unknown sinusoids' frequencies in the frequency domain. We determine these frequencies by thresholding the empirical spectrum of the noisy measurements. Assuming sparsity in the number of FOs' locations and the number of sinusoids at each location, we cast the location recovery problem as an $\ell_1$-regularized least squares problem in the complex domain -- i.e., complex-LASSO (least absolute shrinkage and selection operator). We numerically solve this optimization problem using the complex-valued coordinate descent method, and show its efficiency on the IEEE 68-bus, 16-machine and WECC 179-bus, 29-machine systems.
Submitted 19 January, 2022;
originally announced January 2022.
-
Localization and Estimation of Unknown Forced Inputs: A Group LASSO Approach
Authors:
Rajasekhar Anguluri,
Lalitha Sankar,
Oliver Kosut
Abstract:
We model and study the problem of localizing a set of sparse forcing inputs for linear dynamical systems from noisy measurements when the initial state is unknown. This problem is of particular relevance to detecting forced oscillations in electric power networks. We express measurements as an additive model comprising the initial state and inputs grouped over time, both expanded in terms of the basis functions (i.e., impulse response coefficients). Using this model, with probabilistic guarantees, we recover the locations and simultaneously estimate the initial state and forcing inputs using a variant of the group LASSO (least absolute shrinkage and selection operator) method. Specifically, we provide a tight upper bound on: (i) the probability that the group LASSO estimator wrongly identifies the source locations, and (ii) the $\ell_2$-norm of the estimation error. Our bounds explicitly depend upon the length of the measurement horizon, the noise statistics, the number of inputs and sensors, and the singular values of impulse response matrices. Our theoretical analysis is one of the first to provide a complete treatment for the group LASSO estimator for linear dynamical systems under input-to-output delay assumptions. Finally, we validate our results on synthetic models and the IEEE 68-bus, 16-machine system.
Submitted 19 January, 2022;
originally announced January 2022.
-
Unity is Strength: A Formalization of Cross-Domain Maximal Extractable Value
Authors:
Alexandre Obadia,
Alejo Salles,
Lakshman Sankar,
Tarun Chitra,
Vaibhav Chellani,
Philip Daian
Abstract:
The multi-chain future is upon us. Modular architectures are coming to maturity across the ecosystem to scale bandwidth and throughput of cryptocurrency. One example of such is the Ethereum modular architecture, with its beacon chain, its execution chain, its Layer 2s, and soon its shards. These can all be thought of as separate blockchains, heavily inter-connected with one another, and together forming an ecosystem. In this work, we call each of these interconnected blockchains "domains", and study the manifestation of Maximal Extractable Value (MEV, a generalization of "Miner Extractable Value") across them. In other words, we investigate whether there exists extractable value that depends on the ordering of transactions in two or more domains jointly. We first recall the definitions of Extractable and Maximal Extractable Value, before introducing a definition of Cross-Domain Maximal Extractable Value. We find that Cross-Domain MEV can be used to measure the incentive for transaction sequencers in different domains to collude with one another, and study the scenarios in which there exists such an incentive. We end the work with a list of negative externalities that might arise from cross-domain MEV extraction and lay out several open questions. We note that the formalism in this work is a work in progress, and we hope that it can serve as the basis for formal analysis tools in the style of those presented in Clockwork Finance, as well as for discussion on how to mitigate the upcoming negative externalities of substantial cross-domain MEV.
Submitted 5 December, 2021; v1 submitted 2 December, 2021;
originally announced December 2021.
-
Lower Bounds for the MMSE via Neural Network Estimation and Their Applications to Privacy
Authors:
Mario Diaz,
Peter Kairouz,
Lalitha Sankar
Abstract:
The minimum mean-square error (MMSE) achievable by optimal estimation of a random variable $Y\in\mathbb{R}$ given another random variable $X\in\mathbb{R}^{d}$ is of much interest in a variety of statistical settings. In the context of estimation-theoretic privacy, the MMSE has been proposed as an information leakage measure that captures the ability of an adversary in estimating $Y$ upon observing $X$. In this paper we establish provable lower bounds for the MMSE based on a two-layer neural network estimator of the MMSE and the Barron constant of an appropriate function of the conditional expectation of $Y$ given $X$. Furthermore, we derive a general upper bound for the Barron constant that, when $X\in\mathbb{R}$ is post-processed by the additive Gaussian mechanism and $Y$ is binary, produces order optimal estimates in the large noise regime. In order to obtain numerical lower bounds for the MMSE in some concrete applications, we introduce an efficient optimization process that approximates the value of the proposed neural network estimator. Overall, we provide an effective machinery to obtain provable lower bounds for the MMSE.
Submitted 10 July, 2022; v1 submitted 29 August, 2021;
originally announced August 2021.
-
Evaluating Multiple Guesses by an Adversary via a Tunable Loss Function
Authors:
Gowtham R. Kurri,
Oliver Kosut,
Lalitha Sankar
Abstract:
We consider a problem of guessing, wherein an adversary is interested in knowing the value of the realization of a discrete random variable $X$ on observing another correlated random variable $Y$. The adversary can make multiple (say, $k$) guesses. The adversary's guessing strategy is assumed to minimize $α$-loss, a class of tunable loss functions parameterized by $α$. It has been shown before that this loss function captures well known loss functions including the exponential loss ($α=1/2$), the log-loss ($α=1$) and the $0$-$1$ loss ($α=\infty$). We completely characterize the optimal adversarial strategy and the resulting expected $α$-loss, thereby recovering known results for $α=\infty$. We define an information leakage measure from the $k$-guesses setup and derive a condition under which the leakage is unchanged from a single guess.
Submitted 19 August, 2021;
originally announced August 2021.
-
Generation of Synthetic Multi-Resolution Time Series Load Data
Authors:
Andrea Pinceti,
Lalitha Sankar,
Oliver Kosut
Abstract:
The availability of large datasets is crucial for the development of new power system applications and tools; unfortunately, very few are publicly and freely available. We designed an end-to-end generative framework for the creation of synthetic bus-level time-series load data for transmission networks. The model is trained on a real dataset of over 70 Terabytes of synchrophasor measurements spanning multiple years. Leveraging a combination of principal component analysis and conditional generative adversarial network models, the scheme we developed allows for the generation of data at varying sampling rates (up to a maximum of 30 samples per second) and ranging in length from seconds to years. The generative models are tested extensively to verify that they correctly capture the diverse characteristics of real loads. Finally, we develop an open-source tool called LoadGAN which gives researchers access to the fully trained generative models via a graphical interface.
Submitted 24 July, 2022; v1 submitted 7 July, 2021;
originally announced July 2021.
-
Synthetic Time-Series Load Data via Conditional Generative Adversarial Networks
Authors:
Andrea Pinceti,
Lalitha Sankar,
Oliver Kosut
Abstract:
A framework for the generation of synthetic time-series transmission-level load data is presented. Conditional generative adversarial networks are used to learn the patterns of a real dataset of hourly-sampled week-long load profiles and generate unique synthetic profiles on demand, based on the season and type of load required. Extensive testing of the generative model is performed to verify that the synthetic data fully captures the characteristics of real loads and that it can be used for downstream power system and/or machine learning applications.
Submitted 7 July, 2021;
originally announced July 2021.
-
Being Properly Improper
Authors:
Tyler Sypherd,
Richard Nock,
Lalitha Sankar
Abstract:
Properness for supervised losses stipulates that the loss function shapes the learning algorithm towards the true posterior of the data generating distribution. Unfortunately, data in modern machine learning can be corrupted or twisted in many ways. Hence, optimizing a proper loss function on twisted data could perilously lead the learning algorithm towards the twisted posterior, rather than to the desired clean posterior. Many papers cope with specific twists (e.g., label/feature/adversarial noise), but there is a growing need for a unified and actionable understanding atop properness. Our chief theoretical contribution is a generalization of the properness framework with a notion called twist-properness, which delineates loss functions with the ability to "untwist" the twisted posterior into the clean posterior. Notably, we show that a nontrivial extension of a loss function called $α$-loss, which was first introduced in information theory, is twist-proper. We study the twist-proper $α$-loss under a novel boosting algorithm, called PILBoost, and provide formal and experimental results for this algorithm. Our overarching practical conclusion is that the twist-proper $α$-loss outperforms the proper $\log$-loss on several variants of twisted data.
Submitted 31 January, 2022; v1 submitted 18 June, 2021;
originally announced June 2021.
-
Realizing GANs via a Tunable Loss Function
Authors:
Gowtham R. Kurri,
Tyler Sypherd,
Lalitha Sankar
Abstract:
We introduce a tunable GAN, called $α$-GAN, parameterized by $α\in (0,\infty]$, which interpolates between various $f$-GANs and Integral Probability Metric based GANs (under constrained discriminator set). We construct $α$-GAN using a supervised loss function, namely, $α$-loss, which is a tunable loss function capturing several canonical losses. We show that $α$-GAN is intimately related to the Arimoto divergence, which was first proposed by Österreicher (1996), and later studied by Liese and Vajda (2006). We also study the convergence properties of $α$-GAN. We posit that the holistic understanding that $α$-GAN introduces will have practical benefits of addressing both the issues of vanishing gradients and mode collapse.
Submitted 18 October, 2021; v1 submitted 9 June, 2021;
originally announced June 2021.
-
A Verifiable Framework for Cyber-Physical Attacks and Countermeasures in a Resilient Electric Power Grid
Authors:
Zhigang Chu,
Andrea Pinceti,
Ramin Kaviani,
Roozbeh Khodadadeh,
Xingpeng Li,
Jiazi Zhang,
Karthik Saikumar,
Mostafa Sahraei-Ardakani,
Christopher Mosier,
Robin Podmore,
Kory Hedman,
Oliver Kosut,
Lalitha Sankar
Abstract:
In this paper, we investigate the feasibility and physical consequences of cyber attacks against energy management systems (EMS). Within this framework, we have designed a complete simulation platform to emulate realistic EMS operations: it includes state estimation (SE), real-time contingency analysis (RTCA), and security constrained economic dispatch (SCED). This software platform allowed us to achieve two main objectives: 1) to study the cyber vulnerabilities of an EMS and understand their consequences on the system, and 2) to formulate and implement countermeasures against cyber-attacks exploiting these vulnerabilities. Our results show that the false data injection attacks against state estimation described in the literature do not easily cause base-case overflows because of the conservatism introduced by RTCA. For a successful attack, a more sophisticated model that includes all of the EMS blocks is needed; even in this scenario, only post-contingency violations can be achieved. Nonetheless, we propose several countermeasures that can detect changes due to cyber-attacks and limit their impact on the system.
Submitted 28 April, 2021;
originally announced April 2021.
-
Three Variants of Differential Privacy: Lossless Conversion and Applications
Authors:
Shahab Asoodeh,
Jiachun Liao,
Flavio P. Calmon,
Oliver Kosut,
Lalitha Sankar
Abstract:
We consider three different variants of differential privacy (DP), namely approximate DP, Rényi DP (RDP), and hypothesis test DP. In the first part, we develop a machinery for optimally relating approximate DP to RDP based on the joint range of two $f$-divergences that underlie the approximate DP and RDP. In particular, this enables us to derive the optimal approximate DP parameters of a mechanism that satisfies a given level of RDP. As an application, we apply our result to the moments accountant framework for characterizing privacy guarantees of noisy stochastic gradient descent (SGD). When compared to the state-of-the-art, our bounds may lead to about 100 more stochastic gradient descent iterations for training deep learning models for the same privacy budget. In the second part, we establish a relationship between RDP and hypothesis test DP which allows us to translate the RDP constraint into a tradeoff between type I and type II error probabilities of a certain binary hypothesis test. We then demonstrate that for noisy SGD our result leads to tighter privacy guarantees compared to the recently proposed $f$-DP framework for some range of parameters.
Submitted 23 January, 2021; v1 submitted 14 August, 2020;
originally announced August 2020.
-
On the alpha-loss Landscape in the Logistic Model
Authors:
Tyler Sypherd,
Mario Diaz,
Lalitha Sankar,
Gautam Dasarathy
Abstract:
We analyze the optimization landscape of a recently introduced tunable class of loss functions called $α$-loss, $α\in (0,\infty]$, in the logistic model. This family encapsulates the exponential loss ($α= 1/2$), the log-loss ($α= 1$), and the 0-1 loss ($α= \infty$) and contains compelling properties that enable the practitioner to discern among a host of operating conditions relevant to emerging learning methods. Specifically, we study the evolution of the optimization landscape of $α$-loss with respect to $α$ using tools drawn from the study of strictly-locally-quasi-convex functions in addition to geometric techniques. We interpret these results in terms of optimization complexity via normalized gradient descent.
Submitted 22 June, 2020;
originally announced June 2020.
-
$N-1$ Reliability Makes It Difficult for False Data Injection Attacks to Cause Physical Consequences
Authors:
Zhigang Chu,
Jiazi Zhang,
Oliver Kosut,
Lalitha Sankar
Abstract:
This paper demonstrates that false data injection (FDI) attacks are extremely limited in their ability to cause physical consequences on $N-1$ reliable power systems operating with real-time contingency analysis (RTCA) and security constrained economic dispatch (SCED). Prior work has shown that FDI attacks can be designed via an attacker-defender bi-level linear program (ADBLP) to cause physical overflows after re-dispatch using DCOPF. In this paper, it is shown that attacks designed using DCOPF fail to cause overflows on $N-1$ reliable systems because the system response modeled is inaccurate. An ADBLP that accurately models the system response is proposed to find the worst-case physical consequences, thereby modeling a strong attacker with system level knowledge. Simulation results on the synthetic Texas system with 2000 buses show that even with the new enhanced attacks, for systems operated conservatively due to $N-1$ constraints, the designed attacks only lead to post-contingency overflows. Moreover, the attacker must control a large portion of measurements and physically create a contingency in the system to cause consequences. Therefore, it is conceivable but requires an extremely sophisticated attacker to cause physical consequences on $N-1$ reliable power systems operated with RTCA and SCED.
Submitted 13 March, 2020;
originally announced March 2020.
-
Detecting Load Redistribution Attacks via Support Vector Models
Authors:
Zhigang Chu,
Oliver Kosut,
Lalitha Sankar
Abstract:
A machine learning-based detection framework is proposed to detect a class of cyber-attacks that redistribute loads by modifying measurements. The detection framework consists of a multi-output support vector regression (SVR) load predictor that predicts loads by exploiting both spatial and temporal correlations, and a subsequent support vector machine (SVM) attack detector to determine the existence of load redistribution (LR) attacks utilizing loads predicted by the SVR predictor. Historical load data for training the SVR are obtained from the publicly available PJM zonal loads and are mapped to the IEEE 30-bus system. The SVM is trained using normal data and randomly created LR attacks, and is tested against both random and intelligently designed LR attacks. The results show that the proposed detection framework can effectively detect LR attacks. Moreover, attack mitigation can be achieved by using the SVR predicted loads to re-dispatch generations.
Submitted 13 March, 2020;
originally announced March 2020.
-
A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via $f$-Divergences
Authors:
Shahab Asoodeh,
Jiachun Liao,
Flavio P. Calmon,
Oliver Kosut,
Lalitha Sankar
Abstract:
We derive the optimal differential privacy (DP) parameters of a mechanism that satisfies a given level of Rényi differential privacy (RDP). Our result is based on the joint range of two $f$-divergences that underlie the approximate and the Rényi variations of differential privacy. We apply our result to the moments accountant framework for characterizing privacy guarantees of stochastic gradient descent. When compared to the state-of-the-art, our bounds may lead to about 100 more stochastic gradient descent iterations for training deep learning models for the same privacy budget.
Submitted 16 January, 2020;
originally announced January 2020.
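For context on the RDP-to-DP conversion discussed in the abstract above, here is a minimal sketch of the classical conversion (due to Mironov), under which a mechanism satisfying Rényi DP of order $α > 1$ with parameter $ε_{RDP}$ also satisfies $(ε_{RDP} + \log(1/δ)/(α-1), δ)$-DP. This is the well-known baseline, not the optimal conversion derived in the paper, which is not reproduced here. The function names `rdp_to_dp_classical` and `best_dp_epsilon` are hypothetical, chosen only for illustration.

```python
import math


def rdp_to_dp_classical(alpha, rdp_eps, delta):
    """Classical RDP -> (eps, delta)-DP conversion (baseline bound only).

    alpha   : Renyi order, must be > 1
    rdp_eps : RDP parameter at that order
    delta   : target delta in (0, 1)
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)


def best_dp_epsilon(rdp_curve, delta):
    """Smallest epsilon over a list of (alpha, rdp_eps) pairs.

    Accountants that track several Renyi orders typically report the
    minimum converted epsilon across those orders.
    """
    return min(rdp_to_dp_classical(a, e, delta) for a, e in rdp_curve)
```

A tighter conversion, such as the one the abstract describes, yields a smaller $ε$ for the same RDP curve, which is what allows more training iterations under a fixed privacy budget.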
-
Detection and Localization of Load Redistribution Attacks on Large Scale Systems
Authors:
Andrea Pinceti,
Lalitha Sankar,
Oliver Kosut
Abstract:
A nearest neighbor-based detection scheme against load redistribution attacks is presented. The detector is designed to scale from small to very large systems while guaranteeing consistent detection performance. Extensive testing is performed on a realistic, large scale system to evaluate the performance of the proposed detector against a wide range of attacks, from simple random noise attacks to sophisticated load redistribution attacks. The detection capability is analyzed against different attack parameters to evaluate its sensitivity. Finally, a statistical test that leverages the proposed detection algorithm is introduced to identify which loads are likely to have been maliciously modified, thus, localizing the attack subgraph. This test is based on ascribing to each load a risk measure (probability of being attacked) and then computing the best posterior likelihood that minimizes log-loss.
Submitted 15 June, 2020; v1 submitted 19 December, 2019;
originally announced December 2019.
-
Theoretical Guarantees for Model Auditing with Finite Adversaries
Authors:
Mario Diaz,
Peter Kairouz,
Jiachun Liao,
Lalitha Sankar
Abstract:
Privacy concerns have led to the development of privacy-preserving approaches for learning models from sensitive data. Yet, in practice, even models learned with privacy guarantees can inadvertently memorize unique training examples or leak sensitive features. To identify such privacy violations, existing model auditing techniques use finite adversaries defined as machine learning models with (a) access to some finite side information (e.g., a small auditing dataset), and (b) finite capacity (e.g., a fixed neural network architecture). Our work investigates the requirements under which an unsuccessful attempt to identify privacy violations by a finite adversary implies that no stronger adversary can succeed at such a task. We do so via parameters that quantify the capabilities of the finite adversary, including the size of the neural network employed by such an adversary and the amount of side information it has access to as well as the regularity of the (perhaps privacy-guaranteeing) audited model.
Submitted 8 November, 2019;
originally announced November 2019.
-
Generating Fair Universal Representations using Adversarial Models
Authors:
Peter Kairouz,
Jiachun Liao,
Chong Huang,
Maunil Vyas,
Monica Welfert,
Lalitha Sankar
Abstract:
We present a data-driven framework for learning fair universal representations (FUR) that guarantee statistical fairness for any learning task that may not be known a priori. Our framework leverages recent advances in adversarial learning to allow a data holder to learn representations in which a set of sensitive attributes are decoupled from the rest of the dataset. We formulate this as a constrained minimax game between an encoder and an adversary where the constraint ensures a measure of usefulness (utility) of the representation. The resulting problem is that of censoring, i.e., finding a representation that is least informative about the sensitive attributes given a utility constraint. For appropriately chosen adversarial loss functions, our censoring framework precisely clarifies the optimal adversarial strategy against strong information-theoretic adversaries; it also achieves the fairness measure of demographic parity for the resulting constrained representations. We evaluate the performance of our proposed framework on both synthetic and publicly available datasets. For these datasets, we use two tradeoff measures: censoring vs. representation fidelity and fairness vs. utility for downstream tasks, to amply demonstrate that multiple sensitive features can be effectively censored even as the resulting fair representations ensure accuracy for multiple downstream tasks.
Submitted 11 May, 2022; v1 submitted 27 September, 2019;
originally announced October 2019.
-
A Tunable Loss Function for Robust Classification: Calibration, Landscape, and Generalization
Authors:
Tyler Sypherd,
Mario Diaz,
John Kevin Cava,
Gautam Dasarathy,
Peter Kairouz,
Lalitha Sankar
Abstract:
We introduce a tunable loss function called $α$-loss, parameterized by $α\in (0,\infty]$, which interpolates between the exponential loss ($α= 1/2$), the log-loss ($α= 1$), and the 0-1 loss ($α= \infty$), for the machine learning setting of classification. Theoretically, we illustrate a fundamental connection between $α$-loss and Arimoto conditional entropy, verify the classification-calibration of $α$-loss in order to demonstrate asymptotic optimality via Rademacher complexity generalization techniques, and build upon a notion called strictly local quasi-convexity in order to quantitatively characterize the optimization landscape of $α$-loss. Practically, we perform class imbalance, robustness, and classification experiments on benchmark image datasets using convolutional neural networks. Our main practical conclusion is that certain tasks may benefit from tuning $α$-loss away from log-loss ($α= 1$), and to this end we provide simple heuristics for the practitioner. In particular, navigating the $α$ hyperparameter can readily provide superior model robustness to label flips ($α> 1$) and sensitivity to imbalanced classes ($α< 1$).
Submitted 21 December, 2022; v1 submitted 5 June, 2019;
originally announced June 2019.
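The interpolation described in the abstract above can be made concrete with a short sketch. It assumes the commonly stated form of $α$-loss on the probability $p$ assigned to the true label, $\ell_α(p) = \frac{α}{α-1}\left(1 - p^{(α-1)/α}\right)$, with the limits $-\log p$ at $α = 1$ and $1 - p$ at $α = \infty$; the abstract itself does not spell out this formula. The helper names `alpha_loss` and `sigmoid` are hypothetical.

```python
import math


def alpha_loss(p, alpha):
    """alpha-loss of the probability p in (0, 1] given to the true label.

    Assumed form: alpha/(alpha-1) * (1 - p**((alpha-1)/alpha)),
    with log-loss at alpha = 1 and 1 - p (soft 0-1 loss) at alpha = inf.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if math.isinf(alpha):
        return 1.0 - p
    if alpha == 1:
        return -math.log(p)
    return alpha / (alpha - 1.0) * (1.0 - p ** ((alpha - 1.0) / alpha))


def sigmoid(z):
    """Logistic function mapping a margin z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    margin = 0.7
    p = sigmoid(margin)
    # At alpha = 1/2 the loss equals 1/p - 1, i.e. exp(-margin).
    print(alpha_loss(p, 0.5), math.exp(-margin))
    for a in (0.5, 1.0, 2.0, math.inf):
        print(a, alpha_loss(p, a))
```

With a sigmoid output, setting $α = 1/2$ gives $1/p - 1 = e^{-z}$ for margin $z$, which is the exponential loss named in the abstract.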
-
Can Predictive Filters Detect Gradually Ramping False Data Injection Attacks Against PMUs?
Authors:
Zhigang Chu,
Andrea Pinceti,
Reetam Sen Biswas,
Oliver Kosut,
Anamitra Pal,
Lalitha Sankar
Abstract:
Intelligently designed false data injection (FDI) attacks have been shown to be able to bypass the $χ^2$-test based bad data detector (BDD), resulting in physical consequences (such as line overloads) in the power system. In this paper, it is shown that if an attack is suddenly injected into the system, a predictive filter with sufficient accuracy is able to detect it. However, an attacker can gradually increase the magnitude of the attack to avoid detection, and still cause damage to the system.
Submitted 6 May, 2019;
originally announced May 2019.
-
Vulnerability Assessment of N-1 Reliable Power Systems to False Data Injection Attacks
Authors:
Zhigang Chu,
Jiazi Zhang,
Oliver Kosut,
Lalitha Sankar
Abstract:
This paper studies the vulnerability of large-scale power systems to false data injection (FDI) attacks through their physical consequences. Prior work has shown that an attacker-defender bi-level linear program (ADBLP) can be used to determine the worst-case consequences of FDI attacks aiming to maximize the physical power flow on a target line. Understanding the consequences of these attacks requires consideration of power system operations commonly used in practice, specifically real-time contingency analysis (RTCA) and security constrained economic dispatch (SCED). An ADBLP is formulated with detailed assumptions on attacker's knowledge, and a modified Benders' decomposition algorithm is introduced to solve such an ADBLP. The vulnerability analysis results presented for the synthetic Texas system with 2000 buses show that intelligent FDI attacks can cause post-contingency overflows.
Submitted 18 March, 2019;
originally announced March 2019.
-
A Tunable Loss Function for Binary Classification
Authors:
Tyler Sypherd,
Mario Diaz,
Lalitha Sankar,
Peter Kairouz
Abstract:
We present $α$-loss, $α\in [1,\infty]$, a tunable loss function for binary classification that bridges log-loss ($α=1$) and $0$-$1$ loss ($α= \infty$). We prove that $α$-loss has an equivalent margin-based form and is classification-calibrated, two desirable properties for a good surrogate loss function for the ideal yet intractable $0$-$1$ loss. For logistic regression-based classification, we provide an upper bound on the difference between the empirical and expected risk at the empirical risk minimizers for $α$-loss by exploiting its Lipschitzianity along with recent results on the landscape features of empirical risk functions. Finally, we show that $α$-loss with $α= 2$ performs better than log-loss on MNIST for logistic regression.
Submitted 19 March, 2019; v1 submitted 12 February, 2019;
originally announced February 2019.
-
Robustness of Maximal $α$-Leakage to Side Information
Authors:
Jiachun Liao,
Lalitha Sankar,
Oliver Kosut,
Flavio P. Calmon
Abstract:
Maximal $α$-leakage is a tunable measure of information leakage based on the accuracy of guessing an arbitrary function of private data based on public data. The parameter $α$ determines the loss function used to measure the accuracy of a belief, ranging from log-loss at $α=1$ to the probability of error at $α=\infty$. To study the effect of side information on this measure, we introduce and define conditional maximal $α$-leakage. We show that, for a chosen mapping (channel) from the actual (viewed as private) data to the released (public) data and some side information, the conditional maximal $α$-leakage is the supremum (over all side information) of the conditional Arimoto channel capacity where the conditioning is on the side information. We prove that if the side information is conditionally independent of the public data given the private data, the side information cannot increase the information leakage.
Submitted 4 April, 2019; v1 submitted 21 January, 2019;
originally announced January 2019.
-
On the Robustness of Information-Theoretic Privacy Measures and Mechanisms
Authors:
Mario Diaz,
Hao Wang,
Flavio P. Calmon,
Lalitha Sankar
Abstract:
Consider a data publishing setting for a dataset composed of both private and non-private features. The publisher uses an empirical distribution, estimated from $n$ i.i.d. samples, to design a privacy mechanism which is applied to fresh samples afterward. In this paper, we study the discrepancy between the privacy-utility guarantees for the empirical distribution, used to design the privacy mechanism, and those for the true distribution, experienced by the privacy mechanism in practice. We first show that, for any privacy mechanism, these discrepancies vanish at speed $O(1/\sqrt{n})$ with high probability. These bounds follow from our main technical results regarding the Lipschitz continuity of the considered information leakage measures. Then we prove that the optimal privacy mechanisms for the empirical distribution approach the corresponding mechanisms for the true distribution as the sample size $n$ increases, thereby establishing the statistical consistency of the optimal privacy mechanisms. Finally, we introduce and study uniform privacy mechanisms which, by construction, provide privacy to all the distributions within a neighborhood of the estimated distribution and, thereby, guarantee privacy for the true distribution with high probability.
Submitted 19 March, 2020; v1 submitted 14 November, 2018;
originally announced November 2018.
-
Tunable Measures for Information Leakage and Applications to Privacy-Utility Tradeoffs
Authors:
Jiachun Liao,
Oliver Kosut,
Lalitha Sankar,
Flavio du Pin Calmon
Abstract:
We introduce a tunable measure for information leakage called maximal alpha-leakage. This measure quantifies the maximal gain of an adversary in inferring any (potentially random) function of a dataset from a release of the data. The inferential capability of the adversary is, in turn, quantified by a class of adversarial loss functions that we introduce as $α$-loss, $α\in[1,\infty]$. The choice of $α$ determines the specific adversarial action and ranges from refining a belief (about any function of the data) for $α=1$ to guessing the most likely value for $α=\infty$ while refining the $α^{th}$ moment of the belief for $α$ in between. Maximal alpha-leakage then quantifies the adversarial gain under $α$-loss over all possible functions of the data. In particular, for the extremal values of $α=1$ and $α=\infty$, maximal alpha-leakage simplifies to mutual information and maximal leakage, respectively. For $α\in(1,\infty)$ this measure is shown to be the Arimoto channel capacity of order $α$. We show that maximal alpha-leakage satisfies data processing inequalities and a sub-additivity property thereby allowing for a weak composition result. Building upon these properties, we use maximal alpha-leakage as the privacy measure and study the problem of data publishing with privacy guarantees, wherein the utility of the released data is ensured via a hard distortion constraint. Unlike average distortion, hard distortion provides a deterministic guarantee of fidelity. We show that under a hard distortion constraint, for $α>1$ the optimal mechanism is independent of $α$, and therefore, the resulting optimal tradeoff is the same for all values of $α>1$. Finally, the tunability of maximal alpha-leakage as a privacy measure is also illustrated for binary data with average Hamming distortion as the utility measure.
Submitted 19 August, 2019; v1 submitted 24 September, 2018;
originally announced September 2018.
-
Generative Adversarial Privacy
Authors:
Chong Huang,
Peter Kairouz,
Xiao Chen,
Lalitha Sankar,
Ram Rajagopal
Abstract:
We present a data-driven framework called generative adversarial privacy (GAP). Inspired by recent advancements in generative adversarial networks (GANs), GAP allows the data holder to learn the privatization mechanism directly from the data. Under GAP, finding the optimal privacy mechanism is formulated as a constrained minimax game between a privatizer and an adversary. We show that for appropriately chosen adversarial loss functions, GAP provides privacy guarantees against strong information-theoretic adversaries. We also evaluate GAP's performance on the GENKI face database.
Submitted 26 June, 2019; v1 submitted 13 July, 2018;
originally announced July 2018.