-
Certificates of Differential Privacy and Unlearning for Gradient-Based Training
Authors:
Matthew Wicker,
Philip Sosnin,
Adrianna Janik,
Mark N. Müller,
Adrian Weller,
Calvin Tsay
Abstract:
Proper data stewardship requires that model owners protect the privacy of individuals' data used during training. Whether through anonymization with differential privacy or the use of unlearning in non-anonymized settings, the gold-standard techniques for providing privacy guarantees can come with significant performance penalties or be too weak to provide practical assurances. In part, this is because the guarantee provided by differential privacy represents the worst-case privacy leakage for any individual, while the true privacy leakage of releasing the prediction for a given individual might be substantially smaller or even, as we show, non-existent. This work provides a novel framework based on convex relaxations and bound propagation that can compute formal guarantees (certificates) that releasing specific predictions satisfies $ε=0$ privacy guarantees or does not depend on data that is subject to an unlearning request. Our framework offers a new verification-centric approach to privacy and unlearning guarantees that can be used to further engender user trust with tighter privacy guarantees, provide formal proofs of robustness to certain membership inference attacks, identify potentially vulnerable records, and enhance current unlearning approaches. We validate the effectiveness of our approach on tasks from financial services, medical imaging, and natural language processing.
Submitted 19 June, 2024;
originally announced June 2024.
-
Certified Robustness to Data Poisoning in Gradient-Based Training
Authors:
Philip Sosnin,
Mark N. Müller,
Maximilian Baader,
Calvin Tsay,
Matthew Wicker
Abstract:
Modern machine learning pipelines leverage large amounts of public data, making it infeasible to guarantee data quality and leaving models open to poisoning and backdoor attacks. However, provably bounding model behavior under such attacks remains an open problem. In this work, we address this challenge and develop the first framework providing provable guarantees on the behavior of models trained with potentially manipulated data. In particular, our framework certifies robustness against untargeted and targeted poisoning as well as backdoor attacks for both input and label manipulations. Our method leverages convex relaxations to over-approximate the set of all possible parameter updates for a given poisoning threat model, allowing us to bound the set of all reachable parameters for any gradient-based learning algorithm. Given this set of parameters, we provide bounds on worst-case behavior, including model performance and backdoor success rate. We demonstrate our approach on multiple real-world datasets from applications including energy consumption, medical imaging, and autonomous driving.
Submitted 9 June, 2024;
originally announced June 2024.
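Illustrative sketch (not from the paper): the abstract above describes over-approximating the set of parameters reachable under a poisoning threat model. The toy example below swaps in plain interval arithmetic for the convex relaxations the authors mention, and applies it to one gradient step of one-dimensional least-squares regression where each label may be shifted by at most `eps`. The function name `interval_sgd_step`, the loss, and the threat model are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def interval_sgd_step(
    w_lo: float, w_hi: float,
    xs: List[float], ys: List[float],
    eps: float, lr: float,
) -> Tuple[float, float]:
    """Bound w after one gradient step on the loss 0.5 * (w * x - y)**2.

    Hypothetical threat model: every label y may be replaced by any value
    in [y - eps, y + eps]. Plain interval arithmetic gives a sound but
    possibly loose enclosure of all reachable parameters.
    """
    g_lo_sum, g_hi_sum = 0.0, 0.0
    for x, y in zip(xs, ys):
        # Interval for the prediction w * x with w in [w_lo, w_hi].
        p_lo, p_hi = min(w_lo * x, w_hi * x), max(w_lo * x, w_hi * x)
        # Interval for the residual w * x - y under label perturbation.
        r_lo, r_hi = p_lo - (y + eps), p_hi - (y - eps)
        # Per-sample gradient is residual * x; the sign of x orders the ends.
        g_lo, g_hi = (r_lo * x, r_hi * x) if x >= 0 else (r_hi * x, r_lo * x)
        g_lo_sum += g_lo
        g_hi_sum += g_hi
    n = len(xs)
    g_lo_mean, g_hi_mean = g_lo_sum / n, g_hi_sum / n
    # w_new = w - lr * g, so the largest gradient yields the smallest w_new.
    return w_lo - lr * g_hi_mean, w_hi - lr * g_lo_mean


xs, ys = [1.0, 2.0], [1.0, 2.0]
poisoned = interval_sgd_step(0.0, 0.0, xs, ys, eps=0.5, lr=0.25)
clean = interval_sgd_step(0.0, 0.0, xs, ys, eps=0.0, lr=0.25)
print(poisoned, clean)
```

With `eps=0.0` the interval collapses to the ordinary gradient step, and with `eps > 0` it widens to enclose every update an adversary within the assumed budget could cause. Repeating the step with the returned interval as the new `[w_lo, w_hi]` would bound the parameters over several iterations, at the cost of growing looseness.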
-
Scaling Mixed-Integer Programming for Certification of Neural Network Controllers Using Bounds Tightening
Authors:
Philip Sosnin,
Calvin Tsay
Abstract:
Neural networks offer a computationally efficient approximation of model predictive control, but they lack guarantees on the resulting controlled system's properties. Formal certification of neural networks is crucial for ensuring safety, particularly in safety-critical domains such as autonomous vehicles. One approach to formally certify properties of neural networks is to solve a mixed-integer program based on the network. This approach suffers from scalability issues due to the complexity of solving the resulting mixed-integer programs. Nevertheless, these issues can be (partially) mitigated via bound-tightening techniques prior to forming the mixed-integer program, which results in tighter formulations and faster optimisation. This paper presents bound-tightening techniques in the context of neural network explicit control policies. Bound tightening is particularly important when considering problems spanning multiple time steps of a controlled system, as the bounds must be propagated through the problem depth. Several strategies for bound tightening are evaluated in terms of both computational complexity and tightness of the bounds.
Submitted 26 March, 2024;
originally announced March 2024.
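Illustrative sketch (not from the paper): the abstract above refers to tightening bounds before forming the mixed-integer program. One of the simplest such techniques is interval bound propagation, which computes lower and upper bounds on each pre-activation of a ReLU network from a box of inputs. Those bounds can then serve as big-M constants, and a neuron whose upper bound is non-positive is provably inactive. The helper names and the tiny network below are assumptions for illustration; the paper evaluates several strategies that are not reproduced here.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine_bounds(W: Matrix, b: Vector, lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """Interval bounds on W @ x + b for every x in the box [lo, hi]."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        # A non-negative weight is minimised at lo, a negative one at hi.
        out_lo.append(bias + sum(w * (l if w >= 0 else h) for w, l, h in zip(row, lo, hi)))
        out_hi.append(bias + sum(w * (h if w >= 0 else l) for w, l, h in zip(row, lo, hi)))
    return out_lo, out_hi


def propagate(layers: List[Tuple[Matrix, Vector]], lo: Vector, hi: Vector):
    """Return (lower, upper) pre-activation bounds for each layer in turn."""
    pre_bounds = []
    for W, b in layers:
        z_lo, z_hi = affine_bounds(W, b, lo, hi)
        pre_bounds.append((z_lo, z_hi))
        # ReLU is monotone, so it can be applied to both ends of the interval.
        lo = [max(0.0, v) for v in z_lo]
        hi = [max(0.0, v) for v in z_hi]
    return pre_bounds


# Hypothetical two-layer network with inputs in the box [-1, 1]^2.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[1.0, 1.0]], [0.0]),
]
bounds = propagate(layers, [-1.0, -1.0], [1.0, 1.0])
for depth, (z_lo, z_hi) in enumerate(bounds, start=1):
    print(depth, z_lo, z_hi)
```

In this example the second hidden neuron has an upper bound of 0, so its ReLU never activates on the input box and the corresponding binary variable could be fixed before the mixed-integer program is solved. When the network is unrolled over several time steps of a controlled system, the same propagation is applied through the whole depth, which is where loose bounds compound.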