Skip to main content

Showing 1–4 of 4 results for author: Kadhe, S R

.
  1. arXiv:2406.11780  [pdf, other

    cs.LG cs.AI cs.CL

    Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs

    Authors: Swanand Ravindra Kadhe, Farhan Ahmed, Dennis Wei, Nathalie Baracaldo, Inkit Padhi

    Abstract: Large language models (LLMs) have shown to pose social and ethical risks such as generating toxic language or facilitating malicious use of hazardous knowledge. Machine unlearning is a promising approach to improve LLM safety by directly removing harmful behaviors and knowledge. In this paper, we propose "SPlit, UNlearn, MerGE" (SPUNGE), a framework that can be used with any unlearning method to a… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2312.07420  [pdf, other

    cs.LG cs.CY

    FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs

    Authors: Swanand Ravindra Kadhe, Anisa Halimi, Ambrish Rawat, Nathalie Baracaldo

    Abstract: Training large language models (LLMs) is a costly endeavour in terms of time and computational resources. The large amount of training data used during the unsupervised pre-training phase makes it difficult to verify all data and, unfortunately, undesirable data may be ingested during training. Re-training from scratch is impractical and has led to the creation of the 'unlearning' discipline where… ▽ More

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted in NeurIPS 2023 Workshop on Socially Responsible Language Modelling Research (SoLaR)

  3. arXiv:2312.04748  [pdf, other

    cs.CR cs.AI cs.CL

    Forcing Generative Models to Degenerate Ones: The Power of Data Poisoning Attacks

    Authors: Shuli Jiang, Swanand Ravindra Kadhe, Yi Zhou, Ling Cai, Nathalie Baracaldo

    Abstract: Growing applications of large language models (LLMs) trained by a third party raise serious concerns on the security vulnerability of LLMs.It has been demonstrated that malicious actors can covertly exploit these vulnerabilities in LLMs through poisoning attacks aimed at generating undesirable outputs. While poisoning attacks have received significant attention in the image domain (e.g., object de… ▽ More

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: 19 pages, 6 figures. Published at NeurIPS 2023 Workshop on Backdoors in Deep Learning: The Good, the Bad, and the Ugly

  4. arXiv:2310.19304  [pdf, other

    cs.CR cs.LG

    Privacy-Preserving Federated Learning over Vertically and Horizontally Partitioned Data for Financial Anomaly Detection

    Authors: Swanand Ravindra Kadhe, Heiko Ludwig, Nathalie Baracaldo, Alan King, Yi Zhou, Keith Houck, Ambrish Rawat, Mark Purcell, Naoise Holohan, Mikio Takeuchi, Ryo Kawahara, Nir Drucker, Hayim Shaul, Eyal Kushnir, Omri Soceanu

    Abstract: The effective detection of evidence of financial anomalies requires collaboration among multiple entities who own a diverse set of data, such as a payment network system (PNS) and its partner banks. Trust among these financial institutions is limited by regulation and competition. Federated learning (FL) enables entities to collaboratively train a model when data is either vertically or horizontal… ▽ More

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Prize Winner in the U.S. Privacy Enhancing Technologies (PETs) Prize Challenge