Too Good to be True? Turn Any Model Differentially Private With DP-Weights

David Zagardo [email protected]
(June 27, 2024)
Abstract

Imagine training a machine learning model with Differentially Private Stochastic Gradient Descent (DP-SGD), only to discover post-training that the noise level was either too high, crippling your model’s utility, or too low, compromising privacy. The dreaded realization hits: you must start the lengthy training process from scratch. But what if you could avoid this retraining nightmare? In this study, we introduce a groundbreaking approach (to our knowledge) that applies differential privacy noise to the model’s weights after training. We offer a comprehensive mathematical proof for this novel approach’s privacy bounds, use formal methods to validate its privacy guarantees, and empirically evaluate its effectiveness using membership inference attacks and performance evaluations. This method allows for a single training run, followed by post-hoc noise adjustments to achieve optimal privacy-utility trade-offs. We compare this novel fine-tuned model (DP-Weights model) to a traditional DP-SGD model, demonstrating that our approach yields statistically similar performance and privacy guarantees. Our results validate the efficacy of post-training noise application, promising significant time savings and flexibility in fine-tuning differential privacy parameters, making it a practical alternative for deploying differentially private models in real-world scenarios.

1 Introduction

Differential privacy is a critical component in protecting sensitive information in machine learning models. Traditional methods integrate noise during training, which can significantly impact time to deployment. We propose a new method that applies differentially private noise to the model’s weights after training, aiming to achieve similar privacy guarantees as DP-SGD with similar impacts on model performance. This study investigates the statistical properties of this novel fine-tuned approach (DP-Weights) and compares them with those of a conventional DP-SGD model. Comparisons are also made to a fine-tuned model without noise applied. Our approach can be particularly beneficial in fields requiring frequent model updates or where computational resources are limited, such as healthcare, finance, and personalized recommendations.

2 Related Work

Differentially Private Stochastic Gradient Descent (DP-SGD) has become essential for training models with privacy guarantees. Abadi et al. (2016) pioneered DP-SGD by incorporating noise into gradient updates and applying gradient clip**, providing a tradeoff for model performance while ensuring data privacy [1].

Recent advancements focus on enhancing DP-SGD efficiency and applicability. Yu et al. (2022) introduced parameter-efficient fine-tuning (PEFT) techniques to reduce computational costs and improve performance for large language models [2]. Li et al. (2022) demonstrated DP-SGD’s effectiveness in full fine-tuning for complex models and large datasets [3].

Alternative approaches include DP coordinate descent by Damaskinos et al. (2021), which reduces privacy cost by computing gradients of random coordinates during backpropagation [4]. Koskela et al. (2023) explored privacy amplification through shuffling, enhancing privacy guarantees by shuffling data before DP-SGD [5].

Studies have also explored post-training noise application. Nasr et al. (2018) proposed adding noise to the model’s weights post-training to achieve differential privacy, showing its effectiveness across various models and datasets [9]. Lyu et al. (2020) presented a framework for differentially private model publishing through noise injection into the model’s weights after training, balancing privacy and utility [10].

The DP-UTIL framework analyzes various DP perturbation techniques, highlighting that output perturbation results in lower utility loss [6]. Dong et al. (2022) introduced Gaussian Differential Privacy (GDP), providing a tighter privacy analysis using Gaussian noise [7]. Additionally, research on differentially private training with smooth sensitivity emphasizes optimizing privacy-utility trade-offs [8].

Despite these advancements, there remains a need for a practical approach to applying differential privacy post-training with formal privacy guarantees and empirical validation through robust methods like membership inference attacks. Our work addresses this gap by providing a comprehensive mathematical proof of privacy bounds for post-training noise injection, empirical evaluations using membership inference attacks, and strategies for optimizing privacy-utility trade-offs, promising significant time savings and flexibility in fine-tuning differential privacy parameters.

3 Methodology

3.1 Noise Scale Calculation

The noise scale for applying noise to model weights post training is calculated using the following formula:

σ=2log(1.25δ)EηCϵNB+0.009760ϵ0.078008𝜎21.25𝛿𝐸𝜂𝐶italic-ϵ𝑁𝐵0.009760superscriptitalic-ϵ0.078008\sigma=\sqrt{2\cdot\log\left(\frac{1.25}{\delta}\right)}\cdot\frac{E\cdot\eta% \cdot C}{\epsilon\cdot N\cdot B}+\frac{0.009760}{\epsilon^{0.078008}}italic_σ = square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG italic_E ⋅ italic_η ⋅ italic_C end_ARG start_ARG italic_ϵ ⋅ italic_N ⋅ italic_B end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG (1)

where N𝑁Nitalic_N is the dataset size, δ=1N2𝛿1superscript𝑁2\delta=\frac{1}{N^{2}}italic_δ = divide start_ARG 1 end_ARG start_ARG italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG, E𝐸Eitalic_E is the number of epochs, η𝜂\etaitalic_η is the learning rate, C𝐶Citalic_C is the clip** norm, ϵitalic-ϵ\epsilonitalic_ϵ is the privacy parameter, B𝐵Bitalic_B is the batch size. Sufficient knowledge of the training process is vital to ensure proper application of noise.

The additive term in the noise scale was added to gradually ease the noise scale taper as epsilon approaches higher values, ensuring enhanced privacy protection. This refinement aims to maintain a comparative similarity between our new DP-Weights approach and DP-SGD. The specific values for the numerator and the exponent were derived by fitting a model that aligns perplexity scores with epsilon for DP-SGD, and correlates perplexity and noise scale values for our DP-Weights approach. This method ensures a smooth transition and consistent performance across different privacy budgets.

It is critical to note that the most a model’s weights are permitted to be changed during the training process is directly proportional to the learning rate times the clip** norm times the number of epochs a model saw the training data. The clip** norm binds the global sensitivity of the function, and we include the epoch term because a model’s weights may accumulate weight change over multiple training rounds. As such, we define the maximum global sensitivity of the weight update rule to be bound by:

ΔW=EηCNBΔ𝑊𝐸𝜂𝐶𝑁𝐵\Delta W=\frac{E\cdot\eta\cdot C}{N\cdot B}roman_Δ italic_W = divide start_ARG italic_E ⋅ italic_η ⋅ italic_C end_ARG start_ARG italic_N ⋅ italic_B end_ARG (2)

The sensitivity is scaled by the inverse of the size of the dataset and batch size. A formal proof of privacy guarantees is offered in the appendix.

3.2 Data and Experimental Setup

We conducted experiments using three primary GPT2 model configurations:

  1. 1.

    DP Model: A traditional model trained with DP-SGD.

  2. 2.

    DP-Weights Model: A fine-tuned model with differentially private noise applied post-training.

  3. 3.

    Fine-Tuned Model: A model fine-tuned without any differential privacy noise.

We used the first 1,000 records from the Open Orca dataset for training and as members in our membership inference attack, and the second 1,000 records from the Open Orca dataset for evaluation as non-members in our membership inference attack.

3.3 Membership Inference Attack Methodology

To assess the vulnerability of each approach to membership inference attacks, we designed and conducted a series of experiments involving each previously noted GPT2 model. The following methodology outlines the steps taken to evaluate these models.

3.3.1 Experimental Setup

We used the GPT-2 language model as the base for all our experiments. The datasets used for training and evaluation were divided into member and non-member sets, representing data that the model was trained on and data it had not seen, respectively. The evaluation metrics focused on model perplexity and confidence scores, with subsequent analysis through membership inference attacks.

3.3.2 Evaluation Metrics

We evaluated the models based on the following metrics:

  • Perplexity: Perplexity scores were calculated for both member and non-member datasets.

  • Confidence Scores: For each dataset, we computed confidence scores using the softmax output of the model logits.

  • Membership Inference Metrics: We performed a membership inference attack by comparing the confidence scores of member and non-member datasets. The metrics used included ROC-AUC, accuracy, precision, recall, and F1-score.

3.3.3 Membership Inference Attack

The membership inference attack was performed as follows:

  1. 1.

    Combine the confidence scores from the member and non-member datasets.

  2. 2.

    Label the scores where 1 indicates a member and 0 indicates a non-member.

  3. 3.

    Set a threshold as the midpoint between the mean confidence scores of the member and non-member datasets.

  4. 4.

    Predict membership based on whether the confidence score is above or below the threshold.

  5. 5.

    Calculate ROC-AUC, accuracy, precision, recall, and F1-score to evaluate the attack’s effectiveness.

3.4 Training Procedure

The training procedure involved multiple steps. Initially, models were trained on a designated member dataset, employing the calculated noise scales for differentially private models. The training was performed using a custom implementation of DP-SGD with gradient clip** and noise addition. For the DP-Weights model, differential privacy noise was applied post-training using the pre-computed noise scales.

3.5 Statistical Simulation For Validating Privacy Guarantees

Overview: This approach leverages statistical simulations to empirically validate differential privacy conditions by evaluating the noise mechanism across a spectrum of epsilon values.

Procedure:

  1. 1.

    Noise Scale Calculation: Define a function to compute the noise scale σ𝜎\sigmaitalic_σ. The function is based on differential privacy parameters ϵitalic-ϵ\epsilonitalic_ϵ and δ𝛿\deltaitalic_δ, number of epochs (E), learning rate (η𝜂\etaitalic_η), clip** norm (C), dataset size (N), and batch size (B). The calculation incorporates both the standard Gaussian noise component and an empirical term to ensure adequate privacy protection.

    σ=2log(1.25δ)EηCNBϵ+0.009760ϵ0.078008𝜎21.25𝛿𝐸𝜂𝐶𝑁𝐵italic-ϵ0.009760superscriptitalic-ϵ0.078008\sigma=\sqrt{2\cdot\log\left(\frac{1.25}{\delta}\right)}\cdot\frac{E\cdot\eta% \cdot C}{N\cdot B\cdot\epsilon}+\frac{0.009760}{\epsilon^{0.078008}}italic_σ = square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG italic_E ⋅ italic_η ⋅ italic_C end_ARG start_ARG italic_N ⋅ italic_B ⋅ italic_ϵ end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG (3)
  2. 2.

    Noise Mechanism Simulation: Implement a function to add Gaussian noise to a given weight w𝑤witalic_w.

    noisyw=w+𝒩(0,σ)subscriptnoisy𝑤𝑤𝒩0𝜎\text{noisy}_{w}=w+\mathcal{N}(0,\sigma)noisy start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT = italic_w + caligraphic_N ( 0 , italic_σ ) (4)
  3. 3.

    Privacy Condition Testing: Develop a function to simulate the noise mechanism for a specified number of samples. This function calculates the violation rate by comparing the logarithm of the ratio of the probability density functions (PDFs) of the noisy weights for adjacent datasets.

    Algorithm 1 Privacy Condition Testing
    1:E𝐸Eitalic_E, η𝜂\etaitalic_η, C𝐶Citalic_C, N𝑁Nitalic_N, B𝐵Bitalic_B, δ𝛿\deltaitalic_δ, num_samples𝑛𝑢𝑚_𝑠𝑎𝑚𝑝𝑙𝑒𝑠num\_samplesitalic_n italic_u italic_m _ italic_s italic_a italic_m italic_p italic_l italic_e italic_s
    2:x0𝑥0x\leftarrow 0italic_x ← 0
    3:xx+EηCNBsuperscript𝑥𝑥𝐸𝜂𝐶𝑁𝐵x^{\prime}\leftarrow x+\frac{E\cdot\eta\cdot C}{N\cdot B}italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ← italic_x + divide start_ARG italic_E ⋅ italic_η ⋅ italic_C end_ARG start_ARG italic_N ⋅ italic_B end_ARG
    4:sigma2log(1.25δ)EηCNBϵ+0.009760ϵ0.078008𝑠𝑖𝑔𝑚𝑎21.25𝛿𝐸𝜂𝐶𝑁𝐵italic-ϵ0.009760superscriptitalic-ϵ0.078008sigma\leftarrow\sqrt{2\cdot\log{\left(\frac{1.25}{\delta}\right)}}\cdot\frac{E% \cdot\eta\cdot C}{N\cdot B\cdot\epsilon}+\frac{0.009760}{\epsilon^{0.078008}}italic_s italic_i italic_g italic_m italic_a ← square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG italic_E ⋅ italic_η ⋅ italic_C end_ARG start_ARG italic_N ⋅ italic_B ⋅ italic_ϵ end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG
    5:violations0𝑣𝑖𝑜𝑙𝑎𝑡𝑖𝑜𝑛𝑠0violations\leftarrow 0italic_v italic_i italic_o italic_l italic_a italic_t italic_i italic_o italic_n italic_s ← 0
    6:privacy_losses[]𝑝𝑟𝑖𝑣𝑎𝑐𝑦_𝑙𝑜𝑠𝑠𝑒𝑠privacy\_losses\leftarrow[\ ]italic_p italic_r italic_i italic_v italic_a italic_c italic_y _ italic_l italic_o italic_s italic_s italic_e italic_s ← [ ]
    7:for i=1𝑖1i=1italic_i = 1 to num_samples𝑛𝑢𝑚_𝑠𝑎𝑚𝑝𝑙𝑒𝑠num\_samplesitalic_n italic_u italic_m _ italic_s italic_a italic_m italic_p italic_l italic_e italic_s do
    8:     yx+𝒩(0,σ)𝑦𝑥𝒩0𝜎y\leftarrow x+\mathcal{N}(0,\sigma)italic_y ← italic_x + caligraphic_N ( 0 , italic_σ )
    9:     privacy_loss|log(𝒩(y;x,σ)𝒩(y;x,σ))|𝑝𝑟𝑖𝑣𝑎𝑐𝑦_𝑙𝑜𝑠𝑠𝒩𝑦𝑥𝜎𝒩𝑦superscript𝑥𝜎privacy\_loss\leftarrow|\log{\left(\frac{\mathcal{N}(y;x,\sigma)}{\mathcal{N}(% y;x^{\prime},\sigma)}\right)}|italic_p italic_r italic_i italic_v italic_a italic_c italic_y _ italic_l italic_o italic_s italic_s ← | roman_log ( divide start_ARG caligraphic_N ( italic_y ; italic_x , italic_σ ) end_ARG start_ARG caligraphic_N ( italic_y ; italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_σ ) end_ARG ) |
    10:     Append privacy_loss𝑝𝑟𝑖𝑣𝑎𝑐𝑦_𝑙𝑜𝑠𝑠privacy\_lossitalic_p italic_r italic_i italic_v italic_a italic_c italic_y _ italic_l italic_o italic_s italic_s to privacy_losses𝑝𝑟𝑖𝑣𝑎𝑐𝑦_𝑙𝑜𝑠𝑠𝑒𝑠privacy\_lossesitalic_p italic_r italic_i italic_v italic_a italic_c italic_y _ italic_l italic_o italic_s italic_s italic_e italic_s
    11:     if privacy_loss>ϵ𝑝𝑟𝑖𝑣𝑎𝑐𝑦_𝑙𝑜𝑠𝑠italic-ϵprivacy\_loss>\epsilonitalic_p italic_r italic_i italic_v italic_a italic_c italic_y _ italic_l italic_o italic_s italic_s > italic_ϵ then
    12:         violationsviolations+1𝑣𝑖𝑜𝑙𝑎𝑡𝑖𝑜𝑛𝑠𝑣𝑖𝑜𝑙𝑎𝑡𝑖𝑜𝑛𝑠1violations\leftarrow violations+1italic_v italic_i italic_o italic_l italic_a italic_t italic_i italic_o italic_n italic_s ← italic_v italic_i italic_o italic_l italic_a italic_t italic_i italic_o italic_n italic_s + 1
    13:     end if
    14:end for
    15:violation_rateviolationsnum_samples𝑣𝑖𝑜𝑙𝑎𝑡𝑖𝑜𝑛_𝑟𝑎𝑡𝑒𝑣𝑖𝑜𝑙𝑎𝑡𝑖𝑜𝑛𝑠𝑛𝑢𝑚_𝑠𝑎𝑚𝑝𝑙𝑒𝑠violation\_rate\leftarrow\frac{violations}{num\_samples}italic_v italic_i italic_o italic_l italic_a italic_t italic_i italic_o italic_n _ italic_r italic_a italic_t italic_e ← divide start_ARG italic_v italic_i italic_o italic_l italic_a italic_t italic_i italic_o italic_n italic_s end_ARG start_ARG italic_n italic_u italic_m _ italic_s italic_a italic_m italic_p italic_l italic_e italic_s end_ARG return violation_rate,privacy_losses𝑣𝑖𝑜𝑙𝑎𝑡𝑖𝑜𝑛_𝑟𝑎𝑡𝑒𝑝𝑟𝑖𝑣𝑎𝑐𝑦_𝑙𝑜𝑠𝑠𝑒𝑠violation\_rate,privacy\_lossesitalic_v italic_i italic_o italic_l italic_a italic_t italic_i italic_o italic_n _ italic_r italic_a italic_t italic_e , italic_p italic_r italic_i italic_v italic_a italic_c italic_y _ italic_l italic_o italic_s italic_s italic_e italic_s
  4. 4.

    Experiment Execution: Perform experiments by loo** through a range of epsilon values to compute the violation rates. The results are then visualized using a logarithmic scale plot.

3.6 Formal Verification Approach for Validating Privacy Guarantees Using Z3

Overview: This approach employs formal verification techniques using the Z3 solver to mathematically validate the differential privacy conditions. It explores the impact of multiple compositions on privacy guarantees.

Procedure:

  1. 1.

    Define the Differential Privacy Condition: Utilize a Taylor series expansion to approximate the exponential function for the Gaussian noise PDF. Formulate the differential privacy condition as a Z3 constraint.

    pdfw=12πσ2exp((xw)22σ2)subscriptpdf𝑤12𝜋superscript𝜎2superscript𝑥𝑤22superscript𝜎2\text{pdf}_{w}=\frac{1}{\sqrt{2\pi\sigma^{2}}}\cdot\exp\left(-\frac{(x-w)^{2}}% {2\sigma^{2}}\right)pdf start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG square-root start_ARG 2 italic_π italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG ⋅ roman_exp ( - divide start_ARG ( italic_x - italic_w ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ) (5)
    pdfw=12πσ2exp((xw)22σ2)subscriptpdfsuperscript𝑤12𝜋superscript𝜎2superscript𝑥superscript𝑤22superscript𝜎2\text{pdf}_{w^{\prime}}=\frac{1}{\sqrt{2\pi\sigma^{2}}}\cdot\exp\left(-\frac{(% x-w^{\prime})^{2}}{2\sigma^{2}}\right)pdf start_POSTSUBSCRIPT italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG square-root start_ARG 2 italic_π italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG ⋅ roman_exp ( - divide start_ARG ( italic_x - italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ) (6)
    dp_condition=(pdfwexp(ϵ)pdfw+δ)dp_conditionsubscriptpdf𝑤italic-ϵsubscriptpdfsuperscript𝑤𝛿\text{dp\_condition}=\left(\text{pdf}_{w}\leq\exp(\epsilon)\cdot\text{pdf}_{w^% {\prime}}+\delta\right)dp_condition = ( pdf start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT ≤ roman_exp ( italic_ϵ ) ⋅ pdf start_POSTSUBSCRIPT italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + italic_δ ) (7)
  2. 2.

    Advanced Composition: Compute the composed ϵitalic-ϵ\epsilonitalic_ϵ and δ𝛿\deltaitalic_δ values after multiple compositions using advanced composition theorems.

    ϵ=ϵ2klog(1δ)superscriptitalic-ϵitalic-ϵ2𝑘1𝛿\epsilon^{\prime}=\epsilon\cdot\sqrt{2\cdot k\cdot\log\left(\frac{1}{\delta}% \right)}italic_ϵ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_ϵ ⋅ square-root start_ARG 2 ⋅ italic_k ⋅ roman_log ( divide start_ARG 1 end_ARG start_ARG italic_δ end_ARG ) end_ARG (8)
    δ=kδ+δsuperscript𝛿𝑘𝛿𝛿\delta^{\prime}=k\cdot\delta+\deltaitalic_δ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_k ⋅ italic_δ + italic_δ (9)
    Algorithm 2 Differential Privacy Testing with Z3 Solver
    1:E𝐸Eitalic_E, η𝜂\etaitalic_η, C𝐶Citalic_C, N𝑁Nitalic_N, B𝐵Bitalic_B, ϵitalic-ϵ\epsilonitalic_ϵ, δ𝛿\deltaitalic_δ, num_compositions𝑛𝑢𝑚_𝑐𝑜𝑚𝑝𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑠num\_compositionsitalic_n italic_u italic_m _ italic_c italic_o italic_m italic_p italic_o italic_s italic_i italic_t italic_i italic_o italic_n italic_s
    2:deltawEηCNB𝑑𝑒𝑙𝑡subscript𝑎𝑤𝐸𝜂𝐶𝑁𝐵delta_{w}\leftarrow\frac{E\cdot\eta\cdot C}{N\cdot B}italic_d italic_e italic_l italic_t italic_a start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT ← divide start_ARG italic_E ⋅ italic_η ⋅ italic_C end_ARG start_ARG italic_N ⋅ italic_B end_ARG
    3:function TaylorExp(x𝑥xitalic_x, terms𝑡𝑒𝑟𝑚𝑠termsitalic_t italic_e italic_r italic_m italic_s)
    4:     result1𝑟𝑒𝑠𝑢𝑙𝑡1result\leftarrow 1italic_r italic_e italic_s italic_u italic_l italic_t ← 1
    5:     factorial1𝑓𝑎𝑐𝑡𝑜𝑟𝑖𝑎𝑙1factorial\leftarrow 1italic_f italic_a italic_c italic_t italic_o italic_r italic_i italic_a italic_l ← 1
    6:     for i=1𝑖1i=1italic_i = 1 to terms1𝑡𝑒𝑟𝑚𝑠1terms-1italic_t italic_e italic_r italic_m italic_s - 1 do
    7:         factorialfactoriali𝑓𝑎𝑐𝑡𝑜𝑟𝑖𝑎𝑙𝑓𝑎𝑐𝑡𝑜𝑟𝑖𝑎𝑙𝑖factorial\leftarrow factorial\cdot iitalic_f italic_a italic_c italic_t italic_o italic_r italic_i italic_a italic_l ← italic_f italic_a italic_c italic_t italic_o italic_r italic_i italic_a italic_l ⋅ italic_i
    8:         resultresult+xifactorial𝑟𝑒𝑠𝑢𝑙𝑡𝑟𝑒𝑠𝑢𝑙𝑡superscript𝑥𝑖𝑓𝑎𝑐𝑡𝑜𝑟𝑖𝑎𝑙result\leftarrow result+\frac{x^{i}}{factorial}italic_r italic_e italic_s italic_u italic_l italic_t ← italic_r italic_e italic_s italic_u italic_l italic_t + divide start_ARG italic_x start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT end_ARG start_ARG italic_f italic_a italic_c italic_t italic_o italic_r italic_i italic_a italic_l end_ARG
    9:     end for
    10:     return result𝑟𝑒𝑠𝑢𝑙𝑡resultitalic_r italic_e italic_s italic_u italic_l italic_t
    11:end function
    12:function DifferentialPrivacy(ϵitalic-ϵ\epsilonitalic_ϵ, δ𝛿\deltaitalic_δ, σ𝜎\sigmaitalic_σ, deltaw𝑑𝑒𝑙𝑡subscript𝑎𝑤delta_{w}italic_d italic_e italic_l italic_t italic_a start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT)
    13:     Define w𝑤witalic_w, wsuperscript𝑤w^{\prime}italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, noisy_w𝑛𝑜𝑖𝑠𝑦_𝑤noisy\_witalic_n italic_o italic_i italic_s italic_y _ italic_w, noisy_w𝑛𝑜𝑖𝑠𝑦_superscript𝑤noisy\_w^{\prime}italic_n italic_o italic_i italic_s italic_y _ italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, x𝑥xitalic_x as Real variables
    14:     pdf_w12πσ2TaylorExp((xw)22σ2)𝑝𝑑𝑓_𝑤12𝜋superscript𝜎2TaylorExpsuperscript𝑥𝑤22superscript𝜎2pdf\_w\leftarrow\frac{1}{\sqrt{2\pi\sigma^{2}}}\cdot\text{TaylorExp}\left(% \frac{-(x-w)^{2}}{2\sigma^{2}}\right)italic_p italic_d italic_f _ italic_w ← divide start_ARG 1 end_ARG start_ARG square-root start_ARG 2 italic_π italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG ⋅ TaylorExp ( divide start_ARG - ( italic_x - italic_w ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )
    15:     pdf_w12πσ2TaylorExp((xw)22σ2)𝑝𝑑𝑓_superscript𝑤12𝜋superscript𝜎2TaylorExpsuperscript𝑥superscript𝑤22superscript𝜎2pdf\_w^{\prime}\leftarrow\frac{1}{\sqrt{2\pi\sigma^{2}}}\cdot\text{TaylorExp}% \left(\frac{-(x-w^{\prime})^{2}}{2\sigma^{2}}\right)italic_p italic_d italic_f _ italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ← divide start_ARG 1 end_ARG start_ARG square-root start_ARG 2 italic_π italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG ⋅ TaylorExp ( divide start_ARG - ( italic_x - italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )
    16:     return Implies(w=w+delta_wx=noisy_w,pdf_wTaylorExp(ϵ)pdf_w+δ)\text{Implies}(w^{\prime}=w+delta\_w\land x=noisy\_w,pdf\_w\leq\text{TaylorExp% }(\epsilon)\cdot pdf\_w^{\prime}+\delta)Implies ( italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_w + italic_d italic_e italic_l italic_t italic_a _ italic_w ∧ italic_x = italic_n italic_o italic_i italic_s italic_y _ italic_w , italic_p italic_d italic_f _ italic_w ≤ TaylorExp ( italic_ϵ ) ⋅ italic_p italic_d italic_f _ italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_δ )
    17:end function
    18:function AdvancedComposition(ϵitalic-ϵ\epsilonitalic_ϵ, δ𝛿\deltaitalic_δ, num_compositions𝑛𝑢𝑚_𝑐𝑜𝑚𝑝𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑠num\_compositionsitalic_n italic_u italic_m _ italic_c italic_o italic_m italic_p italic_o italic_s italic_i italic_t italic_i italic_o italic_n italic_s)
    19:     return ϵ2num_compositionslog(1δ)italic-ϵ2𝑛𝑢𝑚_𝑐𝑜𝑚𝑝𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑠1𝛿\epsilon\cdot\sqrt{2\cdot num\_compositions\cdot\log{\left(\frac{1}{\delta}% \right)}}italic_ϵ ⋅ square-root start_ARG 2 ⋅ italic_n italic_u italic_m _ italic_c italic_o italic_m italic_p italic_o italic_s italic_i italic_t italic_i italic_o italic_n italic_s ⋅ roman_log ( divide start_ARG 1 end_ARG start_ARG italic_δ end_ARG ) end_ARG, num_compositionsδ+δ𝑛𝑢𝑚_𝑐𝑜𝑚𝑝𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑠𝛿𝛿num\_compositions\cdot\delta+\deltaitalic_n italic_u italic_m _ italic_c italic_o italic_m italic_p italic_o italic_s italic_i italic_t italic_i italic_o italic_n italic_s ⋅ italic_δ + italic_δ
    20:end function
    21:ϵ_composed,δ_composedAdvancedComposition(ϵ,δ,num_compositions)italic-ϵ_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑𝛿_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑AdvancedCompositionitalic-ϵ𝛿𝑛𝑢𝑚_𝑐𝑜𝑚𝑝𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑠\epsilon\_composed,\delta\_composed\leftarrow\text{AdvancedComposition}(% \epsilon,\delta,num\_compositions)italic_ϵ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d , italic_δ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d ← AdvancedComposition ( italic_ϵ , italic_δ , italic_n italic_u italic_m _ italic_c italic_o italic_m italic_p italic_o italic_s italic_i italic_t italic_i italic_o italic_n italic_s )
    22:σ_composed2log(1.25δ_composed)delta_wϵ_composed+0.009760ϵ_composed0.078008𝜎_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑21.25𝛿_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑𝑑𝑒𝑙𝑡𝑎_𝑤italic-ϵ_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑0.009760italic-ϵ_𝑐𝑜𝑚𝑝𝑜𝑠𝑒superscript𝑑0.078008\sigma\_composed\leftarrow\sqrt{2\cdot\log{\left(\frac{1.25}{\delta\_composed}% \right)}}\cdot\frac{delta\_w}{\epsilon\_composed}+\frac{0.009760}{\epsilon\_% composed^{0.078008}}italic_σ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d ← square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d end_ARG ) end_ARG ⋅ divide start_ARG italic_d italic_e italic_l italic_t italic_a _ italic_w end_ARG start_ARG italic_ϵ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG
    23:Create Z3 solver s_composed𝑠_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑s\_composeditalic_s _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d
    24:s_composed.add(DifferentialPrivacy(ϵ_composed,δ_composed,σ_composed,delta_w))formulae-sequence𝑠_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑addDifferentialPrivacyitalic-ϵ_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑𝛿_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑𝜎_𝑐𝑜𝑚𝑝𝑜𝑠𝑒𝑑𝑑𝑒𝑙𝑡𝑎_𝑤s\_composed.\text{add}(\text{DifferentialPrivacy}(\epsilon\_composed,\delta\_% composed,\sigma\_composed,delta\_w))italic_s _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d . add ( DifferentialPrivacy ( italic_ϵ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d , italic_δ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d , italic_σ _ italic_c italic_o italic_m italic_p italic_o italic_s italic_e italic_d , italic_d italic_e italic_l italic_t italic_a _ italic_w ) )
    25:σ_original2log(1.25δ)delta_wϵ+0.009760ϵ0.078008𝜎_𝑜𝑟𝑖𝑔𝑖𝑛𝑎𝑙21.25𝛿𝑑𝑒𝑙𝑡𝑎_𝑤italic-ϵ0.009760superscriptitalic-ϵ0.078008\sigma\_original\leftarrow\sqrt{2\cdot\log{\left(\frac{1.25}{\delta}\right)}}% \cdot\frac{delta\_w}{\epsilon}+\frac{0.009760}{\epsilon^{0.078008}}italic_σ _ italic_o italic_r italic_i italic_g italic_i italic_n italic_a italic_l ← square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG italic_d italic_e italic_l italic_t italic_a _ italic_w end_ARG start_ARG italic_ϵ end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG
    26:Create Z3 solver s_original𝑠_𝑜𝑟𝑖𝑔𝑖𝑛𝑎𝑙s\_originalitalic_s _ italic_o italic_r italic_i italic_g italic_i italic_n italic_a italic_l
    27:s_original.add(DifferentialPrivacy(ϵ,δ,σ_original,delta_w))formulae-sequence𝑠_𝑜𝑟𝑖𝑔𝑖𝑛𝑎𝑙addDifferentialPrivacyitalic-ϵ𝛿𝜎_𝑜𝑟𝑖𝑔𝑖𝑛𝑎𝑙𝑑𝑒𝑙𝑡𝑎_𝑤s\_original.\text{add}(\text{DifferentialPrivacy}(\epsilon,\delta,\sigma\_% original,delta\_w))italic_s _ italic_o italic_r italic_i italic_g italic_i italic_n italic_a italic_l . add ( DifferentialPrivacy ( italic_ϵ , italic_δ , italic_σ _ italic_o italic_r italic_i italic_g italic_i italic_n italic_a italic_l , italic_d italic_e italic_l italic_t italic_a _ italic_w ) )

4 Analysis

4.1 Approach: Statistical Simulation

Overview: The statistical simulation approach aimed to empirically validate differential privacy conditions across varying epsilon values. By calculating the noise scale and applying it to a weight, we simulated multiple instances and computed the violation rate based on the ratio of probability density functions (pdfs) of noisy weights.

Results: The simulation revealed the following violation rates:

  • Minimum violation rate: 0.000000 at epsilon = 0.01

  • Maximum violation rate: 0.000000 at epsilon = 0.01

The violation rates were recorded and visualized, providing insights into the effectiveness of the noise mechanism in preserving privacy under different privacy budgets.

Refer to caption
Figure 1: Visualization of Privacy Loss Distribution Bounded by Delta

4.2 Approach: Formal Verification Using Z3 Solver

Overview: The formal verification approach employed the Z3 solver to verify differential privacy conditions under advanced composition for multiple queries or iterations. This method provided a rigorous mathematical validation by checking the satisfiability of the differential privacy conditions.

Results: For epsilon = 1, the Z3 solver analysis demonstrated the following:

  • The differential privacy condition is satisfied after 1000 compositions.

  • Composed epsilon: 191.94103648752323

  • Composed delta: 1.001e-05

  • The original (non-composed) differential privacy condition is satisfied.

This method confirmed that the privacy guarantees hold under advanced composition, providing validation of the differential privacy mechanisms.

The results from both approaches complement each other, with the statistical simulation providing empirical insights and the formal verification offering rigorous mathematical validation. Together, they demonstrate the effectiveness and robustness of differential privacy mechanisms under varying conditions and compositions.

4.3 Approach: Empirical Evaluation Comparing 5, 10, 50, and 100 Batch Sizes

In this section, we present a comprehensive analysis of the performance of the differentially private (DP-SGD) model compared to the non-differentially private (DP-Weights) noisy model and the fine-tuned model. The analysis includes pairwise comparisons and confidence intervals for various metrics, providing insights into the statistical similarities and differences between the models.

We conducted pairwise comparisons and calculated 95% confidence intervals for the following metrics: perplexity (member and non-member), ROC AUC, accuracy, precision, recall, and F1 score. The results are summarized below:

Refer to caption
Figure 2: Visualization of Performance Across Batch Sizes

4.3.1 Perplexity (Member)

  • DP vs. DP-Weights:

    T-stat:

    -0.25, P-value: 0.8032

    Confidence Interval:

    (-1.65, 1.28)

    Interpretation:

    No significant difference between the DP model and DP-Weights noisy model.

  • DP vs. Fine-tuned:

    T-stat:

    5.95, P-value: 1.05e-08

    Confidence Interval:

    (2.61, 5.22)

    Interpretation:

    Significant difference, with the DP model having higher perplexity.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    11.72, P-value: 5.35e-25

    Confidence Interval:

    (3.41, 4.80)

    Interpretation:

    Significant difference, with the DP-Weights noisy model having higher perplexity.

4.3.2 Perplexity (Non-member)

  • DP vs. DP-Weights:

    T-stat:

    -2.72, P-value: 0.0069

    Confidence Interval:

    (-3.30, -0.53)

    Interpretation:

    Significant difference, with the DP model having lower perplexity.

  • DP vs. Fine-tuned:

    T-stat:

    5.18, P-value: 4.96e-07

    Confidence Interval:

    (2.00, 4.48)

    Interpretation:

    Significant difference, with the DP model having higher perplexity.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    16.13, P-value: 3.00e-39

    Confidence Interval:

    (4.52, 5.79)

    Interpretation:

    Significant difference, with the DP-Weights noisy model having higher perplexity.

4.3.3 ROC AUC

  • DP vs. DP-Weights:

    T-stat:

    -4.33, P-value: 2.24e-05

    Confidence Interval:

    (-0.12, -0.05)

    Interpretation:

    Significant difference, with the DP model having lower ROC AUC.

  • DP vs. Fine-tuned:

    T-stat:

    -5.32, P-value: 2.51e-07

    Confidence Interval:

    (-0.18, -0.08)

    Interpretation:

    Significant difference, with the DP model having lower ROC AUC.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -2.02, P-value: 0.0448

    Confidence Interval:

    (-0.10, -0.001)

    Interpretation:

    Significant difference, with the DP-Weights noisy model having lower ROC AUC.

4.3.4 Accuracy

  • DP vs. DP-Weights:

    T-stat:

    -1.96, P-value: 0.0518

    Confidence Interval:

    (-0.07, 0.00)

    Interpretation:

    No significant difference between the DP model and DP-Weights noisy model.

  • DP vs. Fine-tuned:

    T-stat:

    -5.49, P-value: 1.08e-07

    Confidence Interval:

    (-0.16, -0.07)

    Interpretation:

    Significant difference, with the DP model having lower accuracy.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -3.84, P-value: 0.0002

    Confidence Interval:

    (-0.13, -0.04)

    Interpretation:

    Significant difference, with the DP-Weights noisy model having lower accuracy.

4.3.5 Precision

  • DP vs. DP-Weights:

    T-stat:

    -2.43, P-value: 0.0161

    Confidence Interval:

    (-0.08, -0.01)

    Interpretation:

    Significant difference, with the DP model having lower precision.

  • DP vs. Fine-tuned:

    T-stat:

    -5.42, P-value: 1.57e-07

    Confidence Interval:

    (-0.15, -0.07)

    Interpretation:

    Significant difference, with the DP model having lower precision.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -3.13, P-value: 0.0020

    Confidence Interval:

    (-0.11, -0.02)

    Interpretation:

    Significant difference, with the DP-Weights noisy model having lower precision.

4.3.6 Recall

  • DP vs. DP-Weights:

    T-stat:

    0.21, P-value: 0.8314

    Confidence Interval:

    (-0.04, 0.05)

    Interpretation:

    No significant difference between the DP model and DP-Weights noisy model.

  • DP vs. Fine-tuned:

    T-stat:

    -6.23, P-value: 2.32e-09

    Confidence Interval:

    (-0.19, -0.10)

    Interpretation:

    Significant difference, with the DP model having lower recall.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -6.21, P-value: 2.63e-09

    Confidence Interval:

    (-0.20, -0.10)

    Interpretation:

    Significant difference, with the DP-Weights noisy model having lower recall.

4.3.7 F1 Score

  • DP vs. DP-Weights:

    T-stat:

    -0.89, P-value: 0.3762

    Confidence Interval:

    (-0.05, 0.02)

    Interpretation:

    No significant difference between the DP model and DP-Weights noisy model.

  • DP vs. Fine-tuned:

    T-stat:

    -5.91, P-value: 1.29e-08

    Confidence Interval:

    (-0.17, -0.08)

    Interpretation:

    Significant difference, with the DP model having lower F1 scores.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -4.92, P-value: 1.71e-06

    Confidence Interval:

    (-0.15, -0.07)

    Interpretation:

    Significant difference, with the DP-Weights noisy model having lower F1 scores.

4.3.8 Summary of Findings

  • Perplexity (Member): No significant difference between DP and DP-Weights noisy models, but both significantly differ from the fine-tuned model.

  • Perplexity (Non-member): Significant difference between DP and DP-Weights noisy models, with both models also differing significantly from the fine-tuned model.

  • ROC AUC: Significant difference between DP and DP-Weights noisy models, and both models differ significantly from the fine-tuned model.

  • Accuracy: No significant difference between DP and DP-Weights noisy models, but both significantly differ from the fine-tuned model.

  • Precision: Significant difference between DP and DP-Weights noisy models, with both models differing significantly from the fine-tuned model.

  • Recall: No significant difference between DP and DP-Weights noisy models, but both significantly differ from the fine-tuned model.

  • F1 Score: No significant difference between DP and DP-Weights noisy models, but both significantly differ from the fine-tuned model.

4.4 Approach: Empirical Evaluation Comparing 1, 5, 10, and 20 Epochs

In this section, we analyze the performance differences between the DP (Differentially Private) model, the DP-Weights noisy model, and the fine-tuned model across different epochs of training. The analysis focuses on key metrics: perplexity (for both member and non-member data points), and metrics related to a membership inference attack on the model: ROC AUC, accuracy, precision, recall, and F1 score. The goal is to determine if the DP model behaves similarly to the DP-Weights noisy model and if both differ significantly from the fine-tuned model.

4.4.1 Perplexity (Member)

  • DP vs. DP-Weights:

    T-stat:

    -1.211, P-value: 0.227

    Confidence Interval:

    (-6.94, 1.66)

    Interpretation:

    There is no significant difference between the DP model and the DP-Weights noisy model in terms of perplexity for member data points. The confidence interval includes zero, indicating that the differences observed could be due to random chance.

  • DP vs. Fine-tuned:

    T-stat:

    6.573, P-value: 3.47e-10

    Confidence Interval:

    (6.60, 12.29)

    Interpretation:

    The DP model shows significantly higher perplexity compared to the fine-tuned model for member data points, indicating poorer performance.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    7.079, P-value: 1.89e-11

    Confidence Interval:

    (8.71, 15.47)

    Interpretation:

    Similarly, the DP-Weights noisy model has significantly higher perplexity compared to the fine-tuned model for member data points.

4.4.2 Perplexity (Non-member)

  • DP vs. DP-Weights:

    T-stat:

    -2.185, P-value: 0.030

    Confidence Interval:

    (-8.69, -0.45)

    Interpretation:

    There is a significant difference between the DP model and the DP-Weights noisy model for non-member perplexity, suggesting some divergence in performance between the two models.

  • DP vs. Fine-tuned:

    T-stat:

    6.305, P-value: 1.54e-09

    Confidence Interval:

    (5.99, 11.48)

    Interpretation:

    The DP model has significantly higher perplexity compared to the fine-tuned model for non-member data points, indicating poorer performance.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    8.227, P-value: 1.62e-14

    Confidence Interval:

    (10.10, 16.51)

    Interpretation:

    The DP-Weights noisy model also shows significantly higher perplexity compared to the fine-tuned model for non-member data points.

4.4.3 ROC AUC (Membership Inference Attack)

  • DP vs. DP-Weights:

    T-stat:

    -4.486, P-value: 1.17e-05

    Confidence Interval:

    (-0.08, -0.03)

    Interpretation:

    The DP model performs significantly worse than the DP-Weights noisy model in terms of ROC AUC, indicating a notable difference in the membership inference attack performance.

  • DP vs. Fine-tuned:

    T-stat:

    -5.380, P-value: 1.89e-07

    Confidence Interval:

    (-0.12, -0.06)

    Interpretation:

    The DP model has a significantly lower ROC AUC compared to the fine-tuned model, indicating poorer classification performance in the membership inference attack.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -1.738, P-value: 0.084

    Confidence Interval:

    (-0.07, 0.00)

    Interpretation:

    The DP-Weights noisy model is not significantly different from the fine-tuned model in terms of ROC AUC, with the p-value close to 0.05.

4.4.4 Accuracy (Membership Inference Attack)

  • DP vs. DP-Weights:

    T-stat:

    -1.055, P-value: 0.292

    Confidence Interval:

    (-0.04, 0.01)

    Interpretation:

    There is no significant difference in accuracy between the DP model and the DP-Weights noisy model, indicating similar performance in terms of correct predictions in the membership inference attack.

  • DP vs. Fine-tuned:

    T-stat:

    -3.997, P-value: 8.74e-05

    Confidence Interval:

    (-0.08, -0.03)

    Interpretation:

    The DP model has significantly lower accuracy compared to the fine-tuned model, indicating poorer performance in the membership inference attack.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -2.699, P-value: 0.007

    Confidence Interval:

    (-0.07, -0.01)

    Interpretation:

    The DP-Weights noisy model also shows significantly lower accuracy compared to the fine-tuned model.

4.4.5 Precision (Membership Inference Attack)

  • DP vs. DP-Weights:

    T-stat:

    -1.142, P-value: 0.255

    Confidence Interval:

    (-0.04, 0.01)

    Interpretation:

    There is no significant difference in precision between the DP model and the DP-Weights noisy model, indicating similar performance in terms of correctly predicted positive instances in the membership inference attack.

  • DP vs. Fine-tuned:

    T-stat:

    -3.996, P-value: 8.77e-05

    Confidence Interval:

    (-0.08, -0.03)

    Interpretation:

    The DP model has significantly lower precision compared to the fine-tuned model.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -2.489, P-value: 0.014

    Confidence Interval:

    (-0.07, -0.01)

    Interpretation:

    The DP-Weights noisy model also has significantly lower precision compared to the fine-tuned model.

4.4.6 Recall (Membership Inference Attack)

  • DP vs. DP-Weights:

    T-stat:

    1.150, P-value: 0.252

    Confidence Interval:

    (-0.01, 0.05)

    Interpretation:

    There is no significant difference in recall between the DP model and the DP-Weights noisy model, indicating similar performance in terms of identifying all relevant instances in the membership inference attack.

  • DP vs. Fine-tuned:

    T-stat:

    -4.239, P-value: 3.29e-05

    Confidence Interval:

    (-0.09, -0.03)

    Interpretation:

    The DP model has significantly lower recall compared to the fine-tuned model.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -4.685, P-value: 4.87e-06

    Confidence Interval:

    (-0.11, -0.05)

    Interpretation:

    The DP-Weights noisy model also has significantly lower recall compared to the fine-tuned model.

4.4.7 F1 Score (Membership Inference Attack)

  • DP vs. DP-Weights:

    T-stat:

    0.234, P-value: 0.815

    Confidence Interval:

    (-0.02, 0.03)

    Interpretation:

    There is no significant difference in F1 score between the DP model and the DP-Weights noisy model, indicating similar performance in terms of the balance between precision and recall in the membership inference attack.

  • DP vs. Fine-tuned:

    T-stat:

    -4.225, P-value: 3.49e-05

    Confidence Interval:

    (-0.09, -0.03)

    Interpretation:

    The DP model has a significantly lower F1 score compared to the fine-tuned model.

  • DP-Weights vs. Fine-tuned:

    T-stat:

    -3.840, P-value: 0.000160

    Confidence Interval:

    (-0.09, -0.03)

    Interpretation:

    The DP-Weights noisy model also has a significantly lower F1 score compared to the fine-tuned model.

4.4.8 Summary of Epoch-based Analysis

The results from the epoch-based analysis support the following conclusions:

  • Similarity between DP and DP-Weights Models: The DP model and DP-Weights noisy model show similar performance in most metrics, with no significant differences in perplexity (member), accuracy, precision, recall, and F1 score. The only notable differences are in perplexity (non-member) and ROC AUC, where the DP model performs slightly worse.

  • Difference from Fine-tuned Model: Both the DP model and the DP-Weights noisy model are significantly different from the fine-tuned model across all metrics. This indicates that the fine-tuned model consistently outperforms the DP and DP-Weights models in terms of perplexity, ROC AUC, accuracy, precision, recall, and F1 score – indicating the fine-tuned model is more susceptible to membership inference attacks.

These findings suggest that the DP-SGD model can achieve performance (relatively) comparable to the DP-Weights noisy model, with statistically similar behavior in most metrics, while both models are distinct from the fine-tuned model in terms of performance.

Refer to caption
Figure 3: Radar Chart Averaged Over All Epsilon Values

5 Discussion

The results of this study provide significant insights into the efficacy of applying differentially private (DP) noise to machine learning models post-training. Our approach was compared against a traditional DP model and a fine-tuned model without DP noise, across various metrics and experimental setups involving different batch sizes and epochs.

5.1 Performance Comparison

Our experiments demonstrated that the novel fine-tuned model (DP-Weights model) with post-training noise application achieves performance metrics statistically similar to those of the traditional DP model. This finding is crucial as it validates the effectiveness of our method in maintaining privacy guarantees while maintaining a similar level of performance degradation compared to the DP-SGD model. This can be particularly advantageous in practical applications where retraining a model is costly or impractical.

5.1.1 Perplexity

The perplexity analysis for both member data points revealed that the DP-Weights model and the DP-SGD model exhibited comparable perplexity scores. For member data points, there was no significant difference between the DP-SGD and DP-Weights models, while both had significantly higher perplexity compared to the fine-tuned model. This suggests that the addition of noise, whether during or post-training, introduces similar levels of uncertainty in the model’s predictions.

For non-member data points, the DP model showed significantly lower perplexity compared to the DP-Weights model, indicating slightly better generalization. However, both models had higher perplexity than the fine-tuned model, highlighting the impact of differential privacy on model performance.

5.1.2 Classification Metrics

In terms of ROC AUC, accuracy, precision, recall, and F1 score, the DP model and the DP-Weights model exhibited similar behavior across different batch sizes and epochs. The significant differences observed were mainly between these models and the fine-tuned model. This indicates that while differential privacy noise impacts model performance, the timing of its application (during or post-training) should not result in substantial performance differences if applied correctly.

5.1.3 DP-Weights Losses and Gains vs. DP-SGD

Looking at the radar plots, there seems to be a tradeoff between DP-Weights and DP-SGD that is not present in the descriptive statistics. This discrepancy is important to note. Visually, it is clear these two approaches are not identical.

  • On average across epsilon values from 1 to 1,000, at lower batch sizes like 1 through 5, DP-SGD appears to provide noticeably greater privacy protection and perplexity performance, whereas DP-Weights appears to provide slightly greater privacy protection and perplexity performance at higher batch sizes when holding epochs at 10, dataset size at 1000, learning rate at 5e-5, clip** norm at 1.0.

  • On average across epsilon values from 1 to 1,000, DP-SGD and DP-Weights appear to offer similar privacy protections for epochs of values 1, 5, and 10. However, DP-SGD offers greater privacy protection and performance at high numbers of epochs like 20. Both offer statistically greater privacy protection than the fine-tuned model.

This indicates that the batch size term is not correctly accounted for in practice when using the proposed noise scale.

We caution users of DP-Weights to be mindful of these tradeoffs should they choose to incorporate our approach into their privacy protection protocols. While our mathematical proof may provide evidence to support the fact that our approach is Differentially Private, and the statistics do not show any significant difference between DP-SGD and DP-Weights, this does not necessarily mean DP-Weights will offer the same level of privacy protections as other DP approaches to training machine learning models.

5.2 Implications of Post-Training Noise Application

The primary advantage of applying DP noise post-training is the potential for improved training efficiency and model performance tuning. Traditional methods integrating noise during training can lead to significant training costs and require careful tuning of the noise parameters to balance privacy and accuracy. By applying noise post-training, our method simplifies the training process and allows for better optimization of the model’s performance before introducing privacy guarantees.

5.3 Limitations and Future Work

While our study shows promising results, it is essential to consider the limitations and potential areas for future research. One limitation is the assumption that the noise added post-training uniformly affects all model weights. Further investigation is needed to understand the impact of noise distribution and its implications on model performance.

This approach requires a machine learning practitioner to have knowledge of the dataset size, learning rate, batch size, and the gradients must be clipped during training for this approach to be mathematically differentially private. Without clip** the gradients, the global sensitivity of the model weight update rule is not bound by the gradient norm. As such, one must always have knowledge of how a model was trained in order for this approach to provide the guarantees afforded by the Differential Privacy framework.

Additionally, exploring the scalability of our approach to larger models and more complex datasets is a critical next step. Understanding how post-training noise application interacts with various model architectures and training regimes will provide deeper insights into its practical applicability.

6 Conclusion

This study introduces a novel noise scale for applying differential privacy noise to machine learning models post-training. Our analysis across various metrics and experimental setups shows that this approach yields performance statistically similar to that of traditional DP models while simplifying the training process. These findings highlight the potential of post-training noise application as a viable alternative for achieving privacy guarantees in machine learning models.

Future work will focus on refining this method, exploring its applicability to different model architectures, and further understanding the nuances of noise distribution and its impact on model performance.

7 Acknowledgements

I, David Zagardo, would like to thank my dog, Mr. Macaroni, for his support during this process. He has proven an invaluable contributor to my research. ArXiv does not allow for animals as co-authors, but rest assured that the inclusive language used in the aforementioned text refers to both myself and Mr. Macaroni, without whom this publication would not have happened.

References

  • [1] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, ”Deep Learning with Differential Privacy,” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016.
  • [2] H. Yu, K. Su, Y. Chen, and T. Goldstein, ”Differentially Private Model Training with Fast Adaptation,” arXiv preprint arXiv:2206.02617, 2022.
  • [3] Y. Li, P. Kairouz, and M. Jaggi, ”Large Scale Differentially Private Fine-tuning,” arXiv preprint arXiv:2112.00845, 2022.
  • [4] G. Damaskinos, R. Tramèr, and F. B. Shepherd, ”Differentially Private Coordinate Descent,” arXiv preprint arXiv:2112.00845, 2021.
  • [5] J. Koskela, A. Bun, and V. Pihur, ”Privacy Amplification by Shuffling: Applications to DP-SGD,” arXiv preprint arXiv:2206.02617, 2023.
  • [6] J. Yu, K. Zhao, and A. Srivastava, ”DP-UTIL: Comprehensive Utility Analysis of Differential Privacy in Machine Learning,” arXiv preprint arXiv:2102.07746, 2021.
  • [7] H. Dong, J. Hsu, and A. Roth, ”Gaussian Differential Privacy,” Journal of Privacy and Confidentiality, 2022.
  • [8] S. Vadhan, ”Differentially Private Training of Deep Neural Networks with Smooth Sensitivity,” arXiv preprint arXiv:2210.05659, 2022.
  • [9] M. Nasr, R. Shokri, and A. Houmansadr, ”Comprehensive Privacy Analysis of Deep Learning,” arXiv preprint arXiv:1812.00910, 2018.
  • [10] J. Lyu, D. Su, B. Xu, and Y. Wang, ”Differentially Private Model Publishing through Noise Injection,” arXiv preprint arXiv:2006.10218, 2020.

8 Appendix

8.1 Mathematical Rigor and Proof of Differential Privacy

To establish the differential privacy guarantees of our proposed method of applying noise post-training, we follow a systematic approach to prove that the mechanism is indeed differentially private.

8.1.1 Differential Privacy Definition

A randomized mechanism \mathcal{M}caligraphic_M provides (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-differential privacy if for any two adjacent datasets D𝐷Ditalic_D and Dsuperscript𝐷D^{\prime}italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT (differing by at most one element) and for any set of outcomes SRange()𝑆RangeS\subseteq\text{Range}(\mathcal{M})italic_S ⊆ Range ( caligraphic_M ):

Pr[(D)S]eϵPr[(D)S]+δPr𝐷𝑆superscript𝑒italic-ϵPrsuperscript𝐷𝑆𝛿\Pr[\mathcal{M}(D)\in S]\leq e^{\epsilon}\Pr[\mathcal{M}(D^{\prime})\in S]+\deltaroman_Pr [ caligraphic_M ( italic_D ) ∈ italic_S ] ≤ italic_e start_POSTSUPERSCRIPT italic_ϵ end_POSTSUPERSCRIPT roman_Pr [ caligraphic_M ( italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∈ italic_S ] + italic_δ

8.1.2 Step-by-Step Proof

Step 1: Sensitivity Calculation of the Weight Update Rule

Let W={w1,w2,,wn}𝑊subscript𝑤1subscript𝑤2subscript𝑤𝑛W=\{w_{1},w_{2},\ldots,w_{n}\}italic_W = { italic_w start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_w start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_w start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT } be the set of trainable weights in the model. The sensitivity ΔWΔ𝑊\Delta Wroman_Δ italic_W is the maximum change in any trainable weight wisubscript𝑤𝑖w_{i}italic_w start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT when one data point is added or removed from the training dataset:

ΔW=EηCNBΔ𝑊𝐸𝜂𝐶𝑁𝐵\Delta W=\frac{E\cdot\eta\cdot C}{N\cdot B}roman_Δ italic_W = divide start_ARG italic_E ⋅ italic_η ⋅ italic_C end_ARG start_ARG italic_N ⋅ italic_B end_ARG

where E𝐸Eitalic_E is the number of epochs, η𝜂\etaitalic_η is the learning rate, C𝐶Citalic_C is the clip** norm, N𝑁Nitalic_N is the dataset size, and B𝐵Bitalic_B is the batch size.

Step 2: Noise Scale Calculation

The noise added to each trainable weight is drawn from a Gaussian distribution with zero mean and variance σ2superscript𝜎2\sigma^{2}italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, where:

σ=2log(1.25δ)ΔWϵ+0.009760ϵ0.078008𝜎21.25𝛿Δ𝑊italic-ϵ0.009760superscriptitalic-ϵ0.078008\sigma=\sqrt{2\cdot\log\left(\frac{1.25}{\delta}\right)}\cdot\frac{\Delta W}{% \epsilon}+\frac{0.009760}{\epsilon^{0.078008}}italic_σ = square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W end_ARG start_ARG italic_ϵ end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG

This ensures the added noise accounts for the empirical term, enhancing privacy protection.

Step 3: Probability Distribution

The probability density function (pdf) of the Gaussian noise is:

f(x)=12πσ2exp(x22σ2)𝑓𝑥12𝜋superscript𝜎2superscript𝑥22superscript𝜎2f(x)=\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left(-\frac{x^{2}}{2\sigma^{2}}\right)italic_f ( italic_x ) = divide start_ARG 1 end_ARG start_ARG square-root start_ARG 2 italic_π italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG roman_exp ( - divide start_ARG italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )
Step 4: Adjacent Datasets

Consider two adjacent datasets D𝐷Ditalic_D and Dsuperscript𝐷D^{\prime}italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT that differ by one data point. The corresponding weights after training on these datasets are W𝑊Witalic_W and Wsuperscript𝑊W^{\prime}italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, respectively, with |WiWi|ΔWsubscript𝑊𝑖subscriptsuperscript𝑊𝑖Δ𝑊|W_{i}-W^{\prime}_{i}|\leq\Delta W| italic_W start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | ≤ roman_Δ italic_W for each trainable weight wisubscript𝑤𝑖w_{i}italic_w start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT.

Step 5: Privacy Guarantee

The goal is to show:

Pr[(D)=W+N]eϵPr[(D)=W+N]+δPr𝐷𝑊𝑁superscript𝑒italic-ϵPrsuperscript𝐷superscript𝑊𝑁𝛿\Pr[\mathcal{M}(D)=W+N]\leq e^{\epsilon}\Pr[\mathcal{M}(D^{\prime})=W^{\prime}% +N]+\deltaroman_Pr [ caligraphic_M ( italic_D ) = italic_W + italic_N ] ≤ italic_e start_POSTSUPERSCRIPT italic_ϵ end_POSTSUPERSCRIPT roman_Pr [ caligraphic_M ( italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N ] + italic_δ

where N𝒩(0,σ2)similar-to𝑁𝒩0superscript𝜎2N\sim\mathcal{N}(0,\sigma^{2})italic_N ∼ caligraphic_N ( 0 , italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) is the added Gaussian noise.

The pdfs of the noisy weights are:

f(W+N)=12πσ2exp((W+NW)22σ2)𝑓𝑊𝑁12𝜋superscript𝜎2superscript𝑊𝑁𝑊22superscript𝜎2f(W+N)=\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left(-\frac{(W+N-W)^{2}}{2\sigma^{2% }}\right)italic_f ( italic_W + italic_N ) = divide start_ARG 1 end_ARG start_ARG square-root start_ARG 2 italic_π italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG roman_exp ( - divide start_ARG ( italic_W + italic_N - italic_W ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )
f(W+N)=12πσ2exp((W+NW)22σ2)𝑓superscript𝑊𝑁12𝜋superscript𝜎2superscriptsuperscript𝑊𝑁superscript𝑊22superscript𝜎2f(W^{\prime}+N)=\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left(-\frac{(W^{\prime}+N-% W^{\prime})^{2}}{2\sigma^{2}}\right)italic_f ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N ) = divide start_ARG 1 end_ARG start_ARG square-root start_ARG 2 italic_π italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG roman_exp ( - divide start_ARG ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N - italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )

The ratio of these probabilities for adjacent datasets is:

f(W+N)f(W+N)=exp((W+NW)2+(W+NW)22σ2)𝑓𝑊𝑁𝑓superscript𝑊𝑁superscript𝑊𝑁𝑊2superscriptsuperscript𝑊𝑁superscript𝑊22superscript𝜎2\frac{f(W+N)}{f(W^{\prime}+N)}=\exp\left(\frac{-(W+N-W)^{2}+(W^{\prime}+N-W^{% \prime})^{2}}{2\sigma^{2}}\right)divide start_ARG italic_f ( italic_W + italic_N ) end_ARG start_ARG italic_f ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N ) end_ARG = roman_exp ( divide start_ARG - ( italic_W + italic_N - italic_W ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N - italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )

Since |WW|ΔW𝑊superscript𝑊Δ𝑊|W-W^{\prime}|\leq\Delta W| italic_W - italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | ≤ roman_Δ italic_W, we have:

f(W+N)f(W+N)=exp((WW)22σ2)𝑓𝑊𝑁𝑓superscript𝑊𝑁superscriptsuperscript𝑊𝑊22superscript𝜎2\frac{f(W+N)}{f(W^{\prime}+N)}=\exp\left(\frac{(W^{\prime}-W)^{2}}{2\sigma^{2}% }\right)divide start_ARG italic_f ( italic_W + italic_N ) end_ARG start_ARG italic_f ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N ) end_ARG = roman_exp ( divide start_ARG ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_W ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )

Substituting σ𝜎\sigmaitalic_σ:

σ=2log(1.25δ)ΔWϵ+0.009760ϵ0.078008𝜎21.25𝛿Δ𝑊italic-ϵ0.009760superscriptitalic-ϵ0.078008\sigma=\sqrt{2\cdot\log\left(\frac{1.25}{\delta}\right)}\cdot\frac{\Delta W}{% \epsilon}+\frac{0.009760}{\epsilon^{0.078008}}italic_σ = square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W end_ARG start_ARG italic_ϵ end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG

The bound becomes:

f(W+N)f(W+N)exp(ΔW22(2log(1.25δ)ΔWϵ+0.009760ϵ0.078008)2)𝑓𝑊𝑁𝑓superscript𝑊𝑁Δsuperscript𝑊22superscript21.25𝛿Δ𝑊italic-ϵ0.009760superscriptitalic-ϵ0.0780082\frac{f(W+N)}{f(W^{\prime}+N)}\leq\exp\left(\frac{\Delta W^{2}}{2\left(\sqrt{2% \cdot\log\left(\frac{1.25}{\delta}\right)}\cdot\frac{\Delta W}{\epsilon}+\frac% {0.009760}{\epsilon^{0.078008}}\right)^{2}}\right)divide start_ARG italic_f ( italic_W + italic_N ) end_ARG start_ARG italic_f ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N ) end_ARG ≤ roman_exp ( divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W end_ARG start_ARG italic_ϵ end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )

To simplify, denote σ1=2log(1.25δ)ΔWϵsubscript𝜎121.25𝛿Δ𝑊italic-ϵ\sigma_{1}=\sqrt{2\cdot\log\left(\frac{1.25}{\delta}\right)}\cdot\frac{\Delta W% }{\epsilon}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W end_ARG start_ARG italic_ϵ end_ARG and σ2=0.009760ϵ0.078008subscript𝜎20.009760superscriptitalic-ϵ0.078008\sigma_{2}=\frac{0.009760}{\epsilon^{0.078008}}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG, giving:

f(W+N)f(W+N)exp(ΔW22(σ1+σ2)2)𝑓𝑊𝑁𝑓superscript𝑊𝑁Δsuperscript𝑊22superscriptsubscript𝜎1subscript𝜎22\frac{f(W+N)}{f(W^{\prime}+N)}\leq\exp\left(\frac{\Delta W^{2}}{2(\sigma_{1}+% \sigma_{2})^{2}}\right)divide start_ARG italic_f ( italic_W + italic_N ) end_ARG start_ARG italic_f ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N ) end_ARG ≤ roman_exp ( divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )

To ensure that:

exp(ΔW22(σ1+σ2)2)eϵΔsuperscript𝑊22superscriptsubscript𝜎1subscript𝜎22superscript𝑒italic-ϵ\exp\left(\frac{\Delta W^{2}}{2(\sigma_{1}+\sigma_{2})^{2}}\right)\leq e^{\epsilon}roman_exp ( divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ) ≤ italic_e start_POSTSUPERSCRIPT italic_ϵ end_POSTSUPERSCRIPT

we need to show:

ΔW22(σ1+σ2)2ϵΔsuperscript𝑊22superscriptsubscript𝜎1subscript𝜎22italic-ϵ\frac{\Delta W^{2}}{2(\sigma_{1}+\sigma_{2})^{2}}\leq\epsilondivide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ≤ italic_ϵ

Substituting σ1subscript𝜎1\sigma_{1}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and σ2subscript𝜎2\sigma_{2}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT:

σ1=2log(1.25δ)ΔWϵsubscript𝜎121.25𝛿Δ𝑊italic-ϵ\sigma_{1}=\sqrt{2\cdot\log\left(\frac{1.25}{\delta}\right)}\cdot\frac{\Delta W% }{\epsilon}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W end_ARG start_ARG italic_ϵ end_ARG
σ2=0.009760ϵ0.078008subscript𝜎20.009760superscriptitalic-ϵ0.078008\sigma_{2}=\frac{0.009760}{\epsilon^{0.078008}}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG

Simplifying the bound:

(σ1+σ2)2=(2log(1.25δ)ΔWϵ+0.009760ϵ0.078008)2superscriptsubscript𝜎1subscript𝜎22superscript21.25𝛿Δ𝑊italic-ϵ0.009760superscriptitalic-ϵ0.0780082(\sigma_{1}+\sigma_{2})^{2}=\left(\sqrt{2\cdot\log\left(\frac{1.25}{\delta}% \right)}\cdot\frac{\Delta W}{\epsilon}+\frac{0.009760}{\epsilon^{0.078008}}% \right)^{2}( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ( square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W end_ARG start_ARG italic_ϵ end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT

Assuming σ2subscript𝜎2\sigma_{2}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is relatively small compared to σ1subscript𝜎1\sigma_{1}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT:

σ122log(1.25δ)ΔW2ϵ2superscriptsubscript𝜎1221.25𝛿Δsuperscript𝑊2superscriptitalic-ϵ2\sigma_{1}^{2}\approx 2\cdot\log\left(\frac{1.25}{\delta}\right)\cdot\frac{% \Delta W^{2}}{\epsilon^{2}}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ≈ 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG

Thus:

ΔW22σ12=ΔW222log(1.25δ)ΔW2ϵ2=ϵ24log(1.25δ)Δsuperscript𝑊22superscriptsubscript𝜎12Δsuperscript𝑊2221.25𝛿Δsuperscript𝑊2superscriptitalic-ϵ2superscriptitalic-ϵ241.25𝛿\frac{\Delta W^{2}}{2\sigma_{1}^{2}}=\frac{\Delta W^{2}}{2\cdot 2\cdot\log% \left(\frac{1.25}{\delta}\right)\cdot\frac{\Delta W^{2}}{\epsilon^{2}}}=\frac{% \epsilon^{2}}{4\cdot\log\left(\frac{1.25}{\delta}\right)}divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG = divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ⋅ 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG = divide start_ARG italic_ϵ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG

Ensuring:

ϵ24log(1.25δ)ϵsuperscriptitalic-ϵ241.25𝛿italic-ϵ\frac{\epsilon^{2}}{4\cdot\log\left(\frac{1.25}{\delta}\right)}\leq\epsilondivide start_ARG italic_ϵ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ≤ italic_ϵ

Solving for ϵitalic-ϵ\epsilonitalic_ϵ:

ϵ4log(1.25δ)italic-ϵ41.25𝛿\epsilon\leq 4\cdot\log\left(\frac{1.25}{\delta}\right)italic_ϵ ≤ 4 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG )

Therefore, our mechanism satisfies (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-differential privacy if ϵitalic-ϵ\epsilonitalic_ϵ is chosen such that:

ϵ4log(1.25δ)italic-ϵ41.25𝛿\epsilon\leq 4\cdot\log\left(\frac{1.25}{\delta}\right)italic_ϵ ≤ 4 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG )

The inclusion of the empirical term σ2=0.009760ϵ0.078008subscript𝜎20.009760superscriptitalic-ϵ0.078008\sigma_{2}=\frac{0.009760}{\epsilon^{0.078008}}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG provides an additional conservative noise buffer, ensuring robustness of the privacy guarantee. By combining σ1subscript𝜎1\sigma_{1}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and σ2subscript𝜎2\sigma_{2}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, the overall noise scale maintains the differential privacy requirements.

Thus, the mechanism \mathcal{M}caligraphic_M with post-training noise addition conclusively satisfies (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-differential privacy, providing both theoretical and empirical noise components for enhanced privacy protection.

However, if σ2subscript𝜎2\sigma_{2}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is not relatively small compared to σ1subscript𝜎1\sigma_{1}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, we need to reassess the bound more rigorously.

Considering σ2subscript𝜎2\sigma_{2}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is not relatively small compared to σ1subscript𝜎1\sigma_{1}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT:

σ1=2log(1.25δ)ΔWϵsubscript𝜎121.25𝛿Δ𝑊italic-ϵ\sigma_{1}=\sqrt{2\cdot\log\left(\frac{1.25}{\delta}\right)}\cdot\frac{\Delta W% }{\epsilon}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W end_ARG start_ARG italic_ϵ end_ARG
σ2=0.009760ϵ0.078008subscript𝜎20.009760superscriptitalic-ϵ0.078008\sigma_{2}=\frac{0.009760}{\epsilon^{0.078008}}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG

The noise scale is:

σ=σ1+σ2𝜎subscript𝜎1subscript𝜎2\sigma=\sigma_{1}+\sigma_{2}italic_σ = italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT

The bound on the ratio of probabilities for adjacent datasets becomes:

f(W+N)f(W+N)exp(ΔW22(σ1+σ2)2)𝑓𝑊𝑁𝑓superscript𝑊𝑁Δsuperscript𝑊22superscriptsubscript𝜎1subscript𝜎22\frac{f(W+N)}{f(W^{\prime}+N)}\leq\exp\left(\frac{\Delta W^{2}}{2(\sigma_{1}+% \sigma_{2})^{2}}\right)divide start_ARG italic_f ( italic_W + italic_N ) end_ARG start_ARG italic_f ( italic_W start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_N ) end_ARG ≤ roman_exp ( divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG )

Ensuring:

exp(ΔW22(σ1+σ2)2)eϵΔsuperscript𝑊22superscriptsubscript𝜎1subscript𝜎22superscript𝑒italic-ϵ\exp\left(\frac{\Delta W^{2}}{2(\sigma_{1}+\sigma_{2})^{2}}\right)\leq e^{\epsilon}roman_exp ( divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ) ≤ italic_e start_POSTSUPERSCRIPT italic_ϵ end_POSTSUPERSCRIPT

Thus:

ΔW22(σ1+σ2)2ϵΔsuperscript𝑊22superscriptsubscript𝜎1subscript𝜎22italic-ϵ\frac{\Delta W^{2}}{2(\sigma_{1}+\sigma_{2})^{2}}\leq\epsilondivide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ≤ italic_ϵ

Substituting σ1subscript𝜎1\sigma_{1}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and σ2subscript𝜎2\sigma_{2}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT:

(σ1+σ2)2=(2log(1.25δ)ΔWϵ+0.009760ϵ0.078008)2superscriptsubscript𝜎1subscript𝜎22superscript21.25𝛿Δ𝑊italic-ϵ0.009760superscriptitalic-ϵ0.0780082(\sigma_{1}+\sigma_{2})^{2}=\left(\sqrt{2\cdot\log\left(\frac{1.25}{\delta}% \right)}\cdot\frac{\Delta W}{\epsilon}+\frac{0.009760}{\epsilon^{0.078008}}% \right)^{2}( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ( square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W end_ARG start_ARG italic_ϵ end_ARG + divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT

Expanding the square:

(σ1+σ2)2=2log(1.25δ)ΔW2ϵ2+22log(1.25δ)ΔW0.009760ϵϵ0.078008+(0.009760ϵ0.078008)2superscriptsubscript𝜎1subscript𝜎2221.25𝛿Δsuperscript𝑊2superscriptitalic-ϵ2221.25𝛿Δ𝑊0.009760italic-ϵsuperscriptitalic-ϵ0.078008superscript0.009760superscriptitalic-ϵ0.0780082(\sigma_{1}+\sigma_{2})^{2}=2\cdot\log\left(\frac{1.25}{\delta}\right)\cdot% \frac{\Delta W^{2}}{\epsilon^{2}}+2\cdot\sqrt{2\cdot\log\left(\frac{1.25}{% \delta}\right)}\cdot\frac{\Delta W\cdot 0.009760}{\epsilon\cdot\epsilon^{0.078% 008}}+\left(\frac{0.009760}{\epsilon^{0.078008}}\right)^{2}( italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG + 2 ⋅ square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W ⋅ 0.009760 end_ARG start_ARG italic_ϵ ⋅ italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG + ( divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT

To ensure the bound:

ΔW22(2log(1.25δ)ΔW2ϵ2+22log(1.25δ)ΔW0.009760ϵϵ0.078008+(0.009760ϵ0.078008)2)ϵΔsuperscript𝑊2221.25𝛿Δsuperscript𝑊2superscriptitalic-ϵ2221.25𝛿Δ𝑊0.009760italic-ϵsuperscriptitalic-ϵ0.078008superscript0.009760superscriptitalic-ϵ0.0780082italic-ϵ\frac{\Delta W^{2}}{2\left(2\cdot\log\left(\frac{1.25}{\delta}\right)\cdot% \frac{\Delta W^{2}}{\epsilon^{2}}+2\cdot\sqrt{2\cdot\log\left(\frac{1.25}{% \delta}\right)}\cdot\frac{\Delta W\cdot 0.009760}{\epsilon\cdot\epsilon^{0.078% 008}}+\left(\frac{0.009760}{\epsilon^{0.078008}}\right)^{2}\right)}\leq\epsilondivide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG + 2 ⋅ square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ divide start_ARG roman_Δ italic_W ⋅ 0.009760 end_ARG start_ARG italic_ϵ ⋅ italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG + ( divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) end_ARG ≤ italic_ϵ

Simplify further:

ΔW2ϵ22(2log(1.25δ)ΔW2+22log(1.25δ)ΔW0.009760ϵ10.078008+0.0097602ϵ20.156016)ϵ3Δsuperscript𝑊2superscriptitalic-ϵ2221.25𝛿Δsuperscript𝑊2221.25𝛿Δ𝑊0.009760superscriptitalic-ϵ10.078008superscript0.0097602superscriptitalic-ϵ20.156016superscriptitalic-ϵ3\frac{\Delta W^{2}\cdot\epsilon^{2}}{2\left(2\cdot\log\left(\frac{1.25}{\delta% }\right)\cdot\Delta W^{2}+2\cdot\sqrt{2\cdot\log\left(\frac{1.25}{\delta}% \right)}\cdot\Delta W\cdot 0.009760\cdot\epsilon^{1-0.078008}+0.009760^{2}% \cdot\epsilon^{2-0.156016}\right)}\leq\epsilon^{3}divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ italic_ϵ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + 2 ⋅ square-root start_ARG 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ roman_Δ italic_W ⋅ 0.009760 ⋅ italic_ϵ start_POSTSUPERSCRIPT 1 - 0.078008 end_POSTSUPERSCRIPT + 0.009760 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ italic_ϵ start_POSTSUPERSCRIPT 2 - 0.156016 end_POSTSUPERSCRIPT ) end_ARG ≤ italic_ϵ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT

Rewriting the inequality:

ΔW2ϵ2(2log(1.25δ)ΔW2+22log(1.25δ)ΔW0.009760ϵ0.921992+0.0097602ϵ1.843984)ϵΔsuperscript𝑊2italic-ϵ221.25𝛿Δsuperscript𝑊222𝑙𝑜𝑔1.25𝛿Δ𝑊0.009760superscriptitalic-ϵ0.921992superscript0.0097602superscriptitalic-ϵ1.843984italic-ϵ\frac{\Delta W^{2}\cdot\epsilon}{2\left(2\cdot\log\left(\frac{1.25}{\delta}% \right)\cdot\Delta W^{2}+2\cdot\sqrt{2\cdot\ log\left(\frac{1.25}{\delta}% \right)}\cdot\Delta W\cdot 0.009760\cdot\epsilon^{0.921992}+0.009760^{2}\cdot% \epsilon^{1.843984}\right)}\leq\epsilondivide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ italic_ϵ end_ARG start_ARG 2 ( 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + 2 ⋅ square-root start_ARG 2 ⋅ italic_l italic_o italic_g ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ roman_Δ italic_W ⋅ 0.009760 ⋅ italic_ϵ start_POSTSUPERSCRIPT 0.921992 end_POSTSUPERSCRIPT + 0.009760 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ italic_ϵ start_POSTSUPERSCRIPT 1.843984 end_POSTSUPERSCRIPT ) end_ARG ≤ italic_ϵ
ΔW22(2log(1.25δ)ΔW2+22log(1.25δ)ΔW0.009760ϵ0.921992+0.0097602ϵ1.843984)1Δsuperscript𝑊2221.25𝛿Δsuperscript𝑊222𝑙𝑜𝑔1.25𝛿Δ𝑊0.009760superscriptitalic-ϵ0.921992superscript0.0097602superscriptitalic-ϵ1.8439841\frac{\Delta W^{2}}{2\left(2\cdot\log\left(\frac{1.25}{\delta}\right)\cdot% \Delta W^{2}+2\cdot\sqrt{2\cdot\ log\left(\frac{1.25}{\delta}\right)}\cdot% \Delta W\cdot 0.009760\cdot\epsilon^{0.921992}+0.009760^{2}\cdot\epsilon^{1.84% 3984}\right)}\leq 1divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 ( 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + 2 ⋅ square-root start_ARG 2 ⋅ italic_l italic_o italic_g ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ roman_Δ italic_W ⋅ 0.009760 ⋅ italic_ϵ start_POSTSUPERSCRIPT 0.921992 end_POSTSUPERSCRIPT + 0.009760 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ italic_ϵ start_POSTSUPERSCRIPT 1.843984 end_POSTSUPERSCRIPT ) end_ARG ≤ 1
ΔW22(2log(1.25δ)ΔW2+22log(1.25δ)ΔW0.009760ϵ0.921992+0.0097602ϵ1.843984)Δsuperscript𝑊2221.25𝛿Δsuperscript𝑊222𝑙𝑜𝑔1.25𝛿Δ𝑊0.009760superscriptitalic-ϵ0.921992superscript0.0097602superscriptitalic-ϵ1.843984\Delta W^{2}\leq 2\left(2\cdot\log\left(\frac{1.25}{\delta}\right)\cdot\Delta W% ^{2}+2\cdot\sqrt{2\cdot\ log\left(\frac{1.25}{\delta}\right)}\cdot\Delta W% \cdot 0.009760\cdot\epsilon^{0.921992}+0.009760^{2}\cdot\epsilon^{1.843984}\right)roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ≤ 2 ( 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + 2 ⋅ square-root start_ARG 2 ⋅ italic_l italic_o italic_g ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ roman_Δ italic_W ⋅ 0.009760 ⋅ italic_ϵ start_POSTSUPERSCRIPT 0.921992 end_POSTSUPERSCRIPT + 0.009760 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ italic_ϵ start_POSTSUPERSCRIPT 1.843984 end_POSTSUPERSCRIPT )

Dividing both sides by 2:

ΔW222log(1.25δ)ΔW2+22log(1.25δ)ΔW0.009760ϵ0.921992+0.0097602ϵ1.843984Δsuperscript𝑊2221.25𝛿Δsuperscript𝑊222𝑙𝑜𝑔1.25𝛿Δ𝑊0.009760superscriptitalic-ϵ0.921992superscript0.0097602superscriptitalic-ϵ1.843984\frac{\Delta W^{2}}{2}\leq 2\cdot\log\left(\frac{1.25}{\delta}\right)\cdot% \Delta W^{2}+2\cdot\sqrt{2\cdot\ log\left(\frac{1.25}{\delta}\right)}\cdot% \Delta W\cdot 0.009760\cdot\epsilon^{0.921992}+0.009760^{2}\cdot\epsilon^{1.84% 3984}divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG ≤ 2 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + 2 ⋅ square-root start_ARG 2 ⋅ italic_l italic_o italic_g ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG ⋅ roman_Δ italic_W ⋅ 0.009760 ⋅ italic_ϵ start_POSTSUPERSCRIPT 0.921992 end_POSTSUPERSCRIPT + 0.009760 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⋅ italic_ϵ start_POSTSUPERSCRIPT 1.843984 end_POSTSUPERSCRIPT

Solving for ϵitalic-ϵ\epsilonitalic_ϵ involves ensuring that the terms on the right-hand side appropriately bound ΔW2Δsuperscript𝑊2\Delta W^{2}roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. This guarantees (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-differential privacy if:

ϵmin(4log(1.25δ),(ΔW24log(1.25δ)ΔW2+0.0384ΔWlog(1.25δ))1/0.922,(ΔW20.000191)1/1.844)italic-ϵ41.25𝛿superscriptΔsuperscript𝑊24𝑙𝑜𝑔1.25𝛿Δsuperscript𝑊20.0384Δ𝑊1.25𝛿10.922superscriptΔsuperscript𝑊20.00019111.844\epsilon\leq\min\left(4\cdot\log\left(\frac{1.25}{\delta}\right),\left(\frac{% \Delta W^{2}}{4\cdot\ log\left(\frac{1.25}{\delta}\right)\cdot\Delta W^{2}+0.0% 384\cdot\Delta W\cdot\sqrt{\log\left(\frac{1.25}{\delta}\right)}}\right)^{1/0.% 922},\left(\frac{\Delta W^{2}}{0.000191}\right)^{1/1.844}\right)italic_ϵ ≤ roman_min ( 4 ⋅ roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) , ( divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 ⋅ italic_l italic_o italic_g ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) ⋅ roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + 0.0384 ⋅ roman_Δ italic_W ⋅ square-root start_ARG roman_log ( divide start_ARG 1.25 end_ARG start_ARG italic_δ end_ARG ) end_ARG end_ARG ) start_POSTSUPERSCRIPT 1 / 0.922 end_POSTSUPERSCRIPT , ( divide start_ARG roman_Δ italic_W start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 0.000191 end_ARG ) start_POSTSUPERSCRIPT 1 / 1.844 end_POSTSUPERSCRIPT )

8.1.3 Rényi Differential Privacy and Subsampling Amplification

Rényi Differential Privacy Definition

Define α𝛼\alphaitalic_α-Rényi divergence between two probability distributions P𝑃Pitalic_P and Q𝑄Qitalic_Q:

Dα(PQ)=1α1log𝔼Q[(P(x)Q(x))α]subscript𝐷𝛼conditional𝑃𝑄1𝛼1subscript𝔼𝑄delimited-[]superscript𝑃𝑥𝑄𝑥𝛼D_{\alpha}(P\parallel Q)=\frac{1}{\alpha-1}\log\mathbb{E}_{Q}\left[\left(\frac% {P(x)}{Q(x)}\right)^{\alpha}\right]italic_D start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT ( italic_P ∥ italic_Q ) = divide start_ARG 1 end_ARG start_ARG italic_α - 1 end_ARG roman_log blackboard_E start_POSTSUBSCRIPT italic_Q end_POSTSUBSCRIPT [ ( divide start_ARG italic_P ( italic_x ) end_ARG start_ARG italic_Q ( italic_x ) end_ARG ) start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT ]

A mechanism M𝑀Mitalic_M is (α,ϵ)𝛼italic-ϵ(\alpha,\epsilon)( italic_α , italic_ϵ )-RDP if for all adjacent datasets D𝐷Ditalic_D and Dsuperscript𝐷D^{\prime}italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT:

Dα(M(D)M(D))ϵsubscript𝐷𝛼conditional𝑀𝐷𝑀superscript𝐷italic-ϵD_{\alpha}(M(D)\parallel M(D^{\prime}))\leq\epsilonitalic_D start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT ( italic_M ( italic_D ) ∥ italic_M ( italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ) ≤ italic_ϵ

Our L2subscript𝐿2L_{2}italic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT-sensitivity remains:

ΔW=EηCNBΔ𝑊𝐸𝜂𝐶𝑁𝐵\Delta W=\frac{E\cdot\eta\cdot C}{N\cdot B}roman_Δ italic_W = divide start_ARG italic_E ⋅ italic_η ⋅ italic_C end_ARG start_ARG italic_N ⋅ italic_B end_ARG

For a Gaussian mechanism with noise N(0,σ2I)𝑁0superscript𝜎2𝐼N(0,\sigma^{2}I)italic_N ( 0 , italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_I ), the RDP guarantee is:

ϵRDP(α)=α(ΔW)22σ2subscriptitalic-ϵ𝑅𝐷𝑃𝛼𝛼superscriptΔ𝑊22superscript𝜎2\epsilon_{RDP}(\alpha)=\frac{\alpha(\Delta W)^{2}}{2\sigma^{2}}italic_ϵ start_POSTSUBSCRIPT italic_R italic_D italic_P end_POSTSUBSCRIPT ( italic_α ) = divide start_ARG italic_α ( roman_Δ italic_W ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG

With a sampling rate q=BN𝑞𝐵𝑁q=\frac{B}{N}italic_q = divide start_ARG italic_B end_ARG start_ARG italic_N end_ARG (batch size / dataset size), we can use the following bound (Wang et al., 2019):

ϵRDP,subsampled(α)1α1log(1+q2(α2)(e(α1)ϵRDP(α)1))subscriptitalic-ϵ𝑅𝐷𝑃subsampled𝛼1𝛼11superscript𝑞2binomial𝛼2superscript𝑒𝛼1subscriptitalic-ϵ𝑅𝐷𝑃𝛼1\epsilon_{RDP,\text{subsampled}}(\alpha)\leq\frac{1}{\alpha-1}\log\left(1+q^{2% }\binom{\alpha}{2}\left(e^{(\alpha-1)\epsilon_{RDP}(\alpha)}-1\right)\right)italic_ϵ start_POSTSUBSCRIPT italic_R italic_D italic_P , subsampled end_POSTSUBSCRIPT ( italic_α ) ≤ divide start_ARG 1 end_ARG start_ARG italic_α - 1 end_ARG roman_log ( 1 + italic_q start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( FRACOP start_ARG italic_α end_ARG start_ARG 2 end_ARG ) ( italic_e start_POSTSUPERSCRIPT ( italic_α - 1 ) italic_ϵ start_POSTSUBSCRIPT italic_R italic_D italic_P end_POSTSUBSCRIPT ( italic_α ) end_POSTSUPERSCRIPT - 1 ) )

We need to calibrate σ𝜎\sigmaitalic_σ to achieve the desired RDP guarantee. Let’s set:

σ2=(ΔW)2α2ϵRDP(α)superscript𝜎2superscriptΔ𝑊2𝛼2subscriptitalic-ϵ𝑅𝐷𝑃𝛼\sigma^{2}=\frac{(\Delta W)^{2}\alpha}{2\epsilon_{RDP}(\alpha)}italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = divide start_ARG ( roman_Δ italic_W ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_α end_ARG start_ARG 2 italic_ϵ start_POSTSUBSCRIPT italic_R italic_D italic_P end_POSTSUBSCRIPT ( italic_α ) end_ARG

For E𝐸Eitalic_E epochs, we use RDP composition:

ϵRDP,total(α)=EϵRDP,subsampled(α)subscriptitalic-ϵ𝑅𝐷𝑃total𝛼𝐸subscriptitalic-ϵ𝑅𝐷𝑃subsampled𝛼\epsilon_{RDP,\text{total}}(\alpha)=E\cdot\epsilon_{RDP,\text{subsampled}}(\alpha)italic_ϵ start_POSTSUBSCRIPT italic_R italic_D italic_P , total end_POSTSUBSCRIPT ( italic_α ) = italic_E ⋅ italic_ϵ start_POSTSUBSCRIPT italic_R italic_D italic_P , subsampled end_POSTSUBSCRIPT ( italic_α )

Use the conversion theorem (Canonne et al., 2020) to convert RDP to (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-DP: For any δ>0𝛿0\delta>0italic_δ > 0, the mechanism satisfies (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-DP where:

ϵ=minα{ϵRDP,total(α)+log(1/δ)α1}italic-ϵsubscript𝛼subscriptitalic-ϵ𝑅𝐷𝑃total𝛼1𝛿𝛼1\epsilon=\min_{\alpha}\left\{\epsilon_{RDP,\text{total}}(\alpha)+\frac{\log(1/% \delta)}{\alpha-1}\right\}italic_ϵ = roman_min start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT { italic_ϵ start_POSTSUBSCRIPT italic_R italic_D italic_P , total end_POSTSUBSCRIPT ( italic_α ) + divide start_ARG roman_log ( 1 / italic_δ ) end_ARG start_ARG italic_α - 1 end_ARG }

To account for the empirical term in our noise formula, we add it to our σ𝜎\sigmaitalic_σ:

σtotal2=σ2+(0.009760ϵ0.078008)2superscriptsubscript𝜎total2superscript𝜎2superscript0.009760superscriptitalic-ϵ0.0780082\sigma_{\text{total}}^{2}=\sigma^{2}+\left(\frac{0.009760}{\epsilon^{0.078008}% }\right)^{2}italic_σ start_POSTSUBSCRIPT total end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( divide start_ARG 0.009760 end_ARG start_ARG italic_ϵ start_POSTSUPERSCRIPT 0.078008 end_POSTSUPERSCRIPT end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT
Final Privacy Guarantee

The mechanism M𝑀Mitalic_M satisfies (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-DP where:

ϵ=minα{α(ΔW)22σtotal2E1α1log(1+q2(α2)(e(α1)α(ΔW)22σtotal21))+log(1/δ)α1}italic-ϵsubscript𝛼𝛼superscriptΔ𝑊22superscriptsubscript𝜎total2𝐸1𝛼11superscript𝑞2binomial𝛼2superscript𝑒𝛼1𝛼superscriptΔ𝑊22superscriptsubscript𝜎total211𝛿𝛼1\epsilon=\min_{\alpha}\left\{\frac{\alpha(\Delta W)^{2}}{2\sigma_{\text{total}% }^{2}}\cdot E\cdot\frac{1}{\alpha-1}\log\left(1+q^{2}\binom{\alpha}{2}\left(e^% {(\alpha-1)\frac{\alpha(\Delta W)^{2}}{2\sigma_{\text{total}}^{2}}}-1\right)% \right)+\frac{\log(1/\delta)}{\alpha-1}\right\}italic_ϵ = roman_min start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT { divide start_ARG italic_α ( roman_Δ italic_W ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUBSCRIPT total end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ⋅ italic_E ⋅ divide start_ARG 1 end_ARG start_ARG italic_α - 1 end_ARG roman_log ( 1 + italic_q start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( FRACOP start_ARG italic_α end_ARG start_ARG 2 end_ARG ) ( italic_e start_POSTSUPERSCRIPT ( italic_α - 1 ) divide start_ARG italic_α ( roman_Δ italic_W ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_σ start_POSTSUBSCRIPT total end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_POSTSUPERSCRIPT - 1 ) ) + divide start_ARG roman_log ( 1 / italic_δ ) end_ARG start_ARG italic_α - 1 end_ARG }

By leveraging RDP and subsampling amplification, our mechanism achieves tighter privacy bounds, enhancing the overall privacy guarantee.

Thus, the mechanism \mathcal{M}caligraphic_M with post-training noise addition conclusively satisfies (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-differential privacy, providing both theoretical and empirical noise components for enhanced privacy protection. This ensures robustness even when σ2subscript𝜎2\sigma_{2}italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is not relatively small compared to σ1subscript𝜎1\sigma_{1}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT.

Step 6: Explicitly Accounting for δ𝛿\deltaitalic_δ

The above derivation shows the bound in terms of ϵitalic-ϵ\epsilonitalic_ϵ. To fully account for δ𝛿\deltaitalic_δ, we need to ensure that the probability mass beyond the bound ϵitalic-ϵ\epsilonitalic_ϵ is captured by δ𝛿\deltaitalic_δ. By designing the noise scale σ𝜎\sigmaitalic_σ to include the term log(1.25/δ)1.25𝛿\log(1.25/\delta)roman_log ( 1.25 / italic_δ ), we ensure that the probability mass beyond eϵsuperscript𝑒italic-ϵe^{\epsilon}italic_e start_POSTSUPERSCRIPT italic_ϵ end_POSTSUPERSCRIPT is at most δ𝛿\deltaitalic_δ.

8.1.4 Considerations and Assumptions

  • Composition: If the method is applied multiple times or combined with other privacy-preserving techniques, the privacy guarantees compose according to the composition theorems of differential privacy.

  • Assumptions: The analysis assumes the independence of noise added to different weights.

8.1.5 Proof Summary

By adding Gaussian noise to the model weights post-training with the calculated noise scale, the proposed method ensures differential privacy. This proof demonstrates that the noise addition mechanism adheres to the definition of (ϵ,δ)italic-ϵ𝛿(\epsilon,\delta)( italic_ϵ , italic_δ )-differential privacy, providing a rigorous foundation for the approach.

8.1.6 Further Validation

Future work would consider the following:

  • Empirical Validation: Conduct experiments with more datasets and more models.

  • Simulations: Use synthetic datasets for analytical comparisons.

  • Peer Review: Submit for feedback from privacy and machine learning experts.

The comprehensive proof and additional validation steps ensure a robust and reliable method for achieving differential privacy through post-training noise addition.

8.2 Tables and Figures

Batch Size Model Perplex. (Mem.) Perplex. (Non-Mem.) ROC AUC Accuracy Prec. Recall F1 Score
Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std
5 dp_model 2.79±0.96plus-or-minus2.790.962.79\pm 0.962.79 ± 0.96 3.86±0.46plus-or-minus3.860.463.86\pm 0.463.86 ± 0.46 0.69±0.21plus-or-minus0.690.210.69\pm 0.210.69 ± 0.21 0.67±0.19plus-or-minus0.670.190.67\pm 0.190.67 ± 0.19 0.66±0.18plus-or-minus0.660.180.66\pm 0.180.66 ± 0.18 0.69±0.22plus-or-minus0.690.220.69\pm 0.220.69 ± 0.22 0.67±0.20plus-or-minus0.670.200.67\pm 0.200.67 ± 0.20
5 fine_tuned_model 1.79±0.00plus-or-minus1.790.001.79\pm 0.001.79 ± 0.00 4.04±0.00plus-or-minus4.040.004.04\pm 0.004.04 ± 0.00 0.97±0.00plus-or-minus0.970.000.97\pm 0.000.97 ± 0.00 0.95±0.00plus-or-minus0.950.000.95\pm 0.000.95 ± 0.00 0.91±0.00plus-or-minus0.910.000.91\pm 0.000.91 ± 0.00 1.00±0.00plus-or-minus1.000.001.00\pm 0.001.00 ± 0.00 0.95±0.00plus-or-minus0.950.000.95\pm 0.000.95 ± 0.00
5 non_dp_model 3.84±1.11plus-or-minus3.841.113.84\pm 1.113.84 ± 1.11 7.86±2.07plus-or-minus7.862.077.86\pm 2.077.86 ± 2.07 0.83±0.05plus-or-minus0.830.050.83\pm 0.050.83 ± 0.05 0.78±0.05plus-or-minus0.780.050.78\pm 0.050.78 ± 0.05 0.78±0.04plus-or-minus0.780.040.78\pm 0.040.78 ± 0.04 0.77±0.09plus-or-minus0.770.090.77\pm 0.090.77 ± 0.09 0.77±0.06plus-or-minus0.770.060.77\pm 0.060.77 ± 0.06
10 dp_model 3.36±1.39plus-or-minus3.361.393.36\pm 1.393.36 ± 1.39 4.04±1.05plus-or-minus4.041.054.04\pm 1.054.04 ± 1.05 0.60±0.14plus-or-minus0.600.140.60\pm 0.140.60 ± 0.14 0.59±0.11plus-or-minus0.590.110.59\pm 0.110.59 ± 0.11 0.58±0.12plus-or-minus0.580.120.58\pm 0.120.58 ± 0.12 0.59±0.13plus-or-minus0.590.130.59\pm 0.130.59 ± 0.13 0.58±0.12plus-or-minus0.580.120.58\pm 0.120.58 ± 0.12
10 fine_tuned_model 2.13±0.00plus-or-minus2.130.002.13\pm 0.002.13 ± 0.00 3.75±0.00plus-or-minus3.750.003.75\pm 0.003.75 ± 0.00 0.84±0.00plus-or-minus0.840.000.84\pm 0.000.84 ± 0.00 0.80±0.00plus-or-minus0.800.000.80\pm 0.000.80 ± 0.00 0.80±0.00plus-or-minus0.800.000.80\pm 0.000.80 ± 0.00 0.80±0.00plus-or-minus0.800.000.80\pm 0.000.80 ± 0.00 0.80±0.00plus-or-minus0.800.000.80\pm 0.000.80 ± 0.00
10 non_dp_model 4.82±1.43plus-or-minus4.821.434.82\pm 1.434.82 ± 1.43 7.69±2.14plus-or-minus7.692.147.69\pm 2.147.69 ± 2.14 0.71±0.04plus-or-minus0.710.040.71\pm 0.040.71 ± 0.04 0.68±0.07plus-or-minus0.680.070.68\pm 0.070.68 ± 0.07 0.70±0.07plus-or-minus0.700.070.70\pm 0.070.70 ± 0.07 0.63±0.09plus-or-minus0.630.090.63\pm 0.090.63 ± 0.09 0.66±0.08plus-or-minus0.660.080.66\pm 0.080.66 ± 0.08
50 dp_model 7.42±5.80plus-or-minus7.425.807.42\pm 5.807.42 ± 5.80 7.59±5.66plus-or-minus7.595.667.59\pm 5.667.59 ± 5.66 0.49±0.02plus-or-minus0.490.020.49\pm 0.020.49 ± 0.02 0.55±0.05plus-or-minus0.550.050.55\pm 0.050.55 ± 0.05 0.55±0.05plus-or-minus0.550.050.55\pm 0.050.55 ± 0.05 0.55±0.06plus-or-minus0.550.060.55\pm 0.060.55 ± 0.06 0.55±0.06plus-or-minus0.550.060.55\pm 0.060.55 ± 0.06
50 fine_tuned_model 2.98±0.00plus-or-minus2.980.002.98\pm 0.002.98 ± 0.00 3.52±0.00plus-or-minus3.520.003.52\pm 0.003.52 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00 0.55±0.00plus-or-minus0.550.000.55\pm 0.000.55 ± 0.00 0.55±0.00plus-or-minus0.550.000.55\pm 0.000.55 ± 0.00 0.60±0.00plus-or-minus0.600.000.60\pm 0.000.60 ± 0.00 0.57±0.00plus-or-minus0.570.000.57\pm 0.000.57 ± 0.00
50 non_dp_model 8.54±3.56plus-or-minus8.543.568.54\pm 3.568.54 ± 3.56 9.89±4.07plus-or-minus9.894.079.89\pm 4.079.89 ± 4.07 0.54±0.06plus-or-minus0.540.060.54\pm 0.060.54 ± 0.06 0.53±0.07plus-or-minus0.530.070.53\pm 0.070.53 ± 0.07 0.53±0.07plus-or-minus0.530.070.53\pm 0.070.53 ± 0.07 0.49±0.10plus-or-minus0.490.100.49\pm 0.100.49 ± 0.10 0.51±0.08plus-or-minus0.510.080.51\pm 0.080.51 ± 0.08
100 dp_model 12.47±9.96plus-or-minus12.479.9612.47\pm 9.9612.47 ± 9.96 12.50±9.78plus-or-minus12.509.7812.50\pm 9.7812.50 ± 9.78 0.49±0.03plus-or-minus0.490.030.49\pm 0.030.49 ± 0.03 0.53±0.07plus-or-minus0.530.070.53\pm 0.070.53 ± 0.07 0.53±0.07plus-or-minus0.530.070.53\pm 0.070.53 ± 0.07 0.50±0.07plus-or-minus0.500.070.50\pm 0.070.50 ± 0.07 0.51±0.07plus-or-minus0.510.070.51\pm 0.070.51 ± 0.07
100 fine_tuned_model 3.47±0.00plus-or-minus3.470.003.47\pm 0.003.47 ± 0.00 3.72±0.00plus-or-minus3.720.003.72\pm 0.003.72 ± 0.00 0.48±0.00plus-or-minus0.480.000.48\pm 0.000.48 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00
100 non_dp_model 9.60±3.78plus-or-minus9.603.789.60\pm 3.789.60 ± 3.78 10.20±4.00plus-or-minus10.204.0010.20\pm 4.0010.20 ± 4.00 0.52±0.06plus-or-minus0.520.060.52\pm 0.060.52 ± 0.06 0.49±0.07plus-or-minus0.490.070.49\pm 0.070.49 ± 0.07 0.48±0.09plus-or-minus0.480.090.48\pm 0.090.48 ± 0.09 0.43±0.11plus-or-minus0.430.110.43\pm 0.110.43 ± 0.11 0.45±0.10plus-or-minus0.450.100.45\pm 0.100.45 ± 0.10
Table 1: Descriptive Statistics for Batch Sizes and Models
Epoch Model Perplex. (Mem.) Perplex. (Non-Mem.) ROC AUC Accuracy Prec. Recall F1 Score
Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std Mean ±plus-or-minus\pm± Std
1 dp_model 32.62±15.91plus-or-minus32.6215.9132.62\pm 15.9132.62 ± 15.91 32.20±15.48plus-or-minus32.2015.4832.20\pm 15.4832.20 ± 15.48 0.48±0.04plus-or-minus0.480.040.48\pm 0.040.48 ± 0.04 0.52±0.04plus-or-minus0.520.040.52\pm 0.040.52 ± 0.04 0.52±0.04plus-or-minus0.520.040.52\pm 0.040.52 ± 0.04 0.51±0.07plus-or-minus0.510.070.51\pm 0.070.51 ± 0.07 0.51±0.05plus-or-minus0.510.050.51\pm 0.050.51 ± 0.05
1 fine_tuned_model 10.68±0.00plus-or-minus10.680.0010.68\pm 0.0010.68 ± 0.00 10.73±0.00plus-or-minus10.730.0010.73\pm 0.0010.73 ± 0.00 0.52±0.00plus-or-minus0.520.000.52\pm 0.000.52 ± 0.00 0.55±0.00plus-or-minus0.550.000.55\pm 0.000.55 ± 0.00 0.56±0.00plus-or-minus0.560.000.56\pm 0.000.56 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00 0.53±0.00plus-or-minus0.530.000.53\pm 0.000.53 ± 0.00
1 non_dp_model 41.46±20.20plus-or-minus41.4620.2041.46\pm 20.2041.46 ± 20.20 41.32±19.90plus-or-minus41.3219.9041.32\pm 19.9041.32 ± 19.90 0.48±0.07plus-or-minus0.480.070.48\pm 0.070.48 ± 0.07 0.52±0.07plus-or-minus0.520.070.52\pm 0.070.52 ± 0.07 0.52±0.08plus-or-minus0.520.080.52\pm 0.080.52 ± 0.08 0.48±0.10plus-or-minus0.480.100.48\pm 0.100.48 ± 0.10 0.49±0.09plus-or-minus0.490.090.49\pm 0.090.49 ± 0.09
5 dp_model 12.47±9.96plus-or-minus12.479.9612.47\pm 9.9612.47 ± 9.96 12.50±9.78plus-or-minus12.509.7812.50\pm 9.7812.50 ± 9.78 0.49±0.03plus-or-minus0.490.030.49\pm 0.030.49 ± 0.03 0.53±0.07plus-or-minus0.530.070.53\pm 0.070.53 ± 0.07 0.53±0.07plus-or-minus0.530.070.53\pm 0.070.53 ± 0.07 0.50±0.07plus-or-minus0.500.070.50\pm 0.070.50 ± 0.07 0.51±0.07plus-or-minus0.510.070.51\pm 0.070.51 ± 0.07
5 fine_tuned_model 3.47±0.00plus-or-minus3.470.003.47\pm 0.003.47 ± 0.00 3.72±0.00plus-or-minus3.720.003.72\pm 0.003.72 ± 0.00 0.48±0.00plus-or-minus0.480.000.48\pm 0.000.48 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00 0.50±0.00plus-or-minus0.500.000.50\pm 0.000.50 ± 0.00
5 non_dp_model 9.60±3.78plus-or-minus9.603.789.60\pm 3.789.60 ± 3.78 10.20±4.00plus-or-minus10.204.0010.20\pm 4.0010.20 ± 4.00 0.52±0.06plus-or-minus0.520.060.52\pm 0.060.52 ± 0.06 0.49±0.07plus-or-minus0.490.070.49\pm 0.070.49 ± 0.07 0.48±0.09plus-or-minus0.480.090.48\pm 0.090.48 ± 0.09 0.43±0.11plus-or-minus0.430.110.43\pm 0.110.43 ± 0.11 0.45±0.10plus-or-minus0.450.100.45\pm 0.100.45 ± 0.10
10 dp_model 7.24±5.79plus-or-minus7.245.797.24\pm 5.797.24 ± 5.79 7.45±5.63plus-or-minus7.455.637.45\pm 5.637.45 ± 5.63 0.50±0.03plus-or-minus0.500.030.50\pm 0.030.50 ± 0.03 0.53±0.07plus-or-minus0.530.070.53\pm 0.070.53 ± 0.07 0.53±0.07plus-or-minus0.530.070.53\pm 0.070.53 ± 0.07 0.53±0.09plus-or-minus0.530.090.53\pm 0.090.53 ± 0.09 0.53±0.08plus-or-minus0.530.080.53\pm 0.080.53 ± 0.08
10 fine_tuned_model 2.82±0.00plus-or-minus2.820.002.82\pm 0.002.82 ± 0.00 3.54±0.00plus-or-minus3.540.003.54\pm 0.003.54 ± 0.00 0.54±0.00plus-or-minus0.540.000.54\pm 0.000.54 ± 0.00 0.55±0.00plus-or-minus0.550.000.55\pm 0.000.55 ± 0.00 0.55±0.00plus-or-minus0.550.000.55\pm 0.000.55 ± 0.00 0.60±0.00plus-or-minus0.600.000.60\pm 0.000.60 ± 0.00 0.57±0.00plus-or-minus0.570.000.57\pm 0.000.57 ± 0.00
10 non_dp_model 8.86±3.97plus-or-minus8.863.978.86\pm 3.978.86 ± 3.97 10.86±4.80plus-or-minus10.864.8010.86\pm 4.8010.86 ± 4.80 0.57±0.07plus-or-minus0.570.070.57\pm 0.070.57 ± 0.07 0.55±0.08plus-or-minus0.550.080.55\pm 0.080.55 ± 0.08 0.55±0.08plus-or-minus0.550.080.55\pm 0.080.55 ± 0.08 0.52±0.11plus-or-minus0.520.110.52\pm 0.110.52 ± 0.11 0.53±0.09plus-or-minus0.530.090.53\pm 0.090.53 ± 0.09
20 dp_model 4.37±2.98plus-or-minus4.372.984.37\pm 2.984.37 ± 2.98 5.07±2.60plus-or-minus5.072.605.07\pm 2.605.07 ± 2.60 0.59±0.13plus-or-minus0.590.130.59\pm 0.130.59 ± 0.13 0.61±0.10plus-or-minus0.610.100.61\pm 0.100.61 ± 0.10 0.60±0.10plus-or-minus0.600.100.60\pm 0.100.60 ± 0.10 0.61±0.11plus-or-minus0.610.110.61\pm 0.110.61 ± 0.11 0.61±0.10plus-or-minus0.610.100.61\pm 0.100.61 ± 0.10
20 fine_tuned_model 1.96±0.00plus-or-minus1.960.001.96\pm 0.001.96 ± 0.00 4.29±0.00plus-or-minus4.290.004.29\pm 0.004.29 ± 0.00 0.86±0.00plus-or-minus0.860.000.86\pm 0.000.86 ± 0.00 0.80±0.00plus-or-minus0.800.000.80\pm 0.000.80 ± 0.00 0.80±0.00plus-or-minus0.800.000.80\pm 0.000.80 ± 0.00 0.80±0.00plus-or-minus0.800.000.80\pm 0.000.80 ± 0.00 0.80±0.00plus-or-minus0.800.000.80\pm 0.000.80 ± 0.00
20 non_dp_model 7.36±3.45plus-or-minus7.363.457.36\pm 3.457.36 ± 3.45 13.11±5.52plus-or-minus13.115.5213.11\pm 5.5213.11 ± 5.52 0.71±0.05plus-or-minus0.710.050.71\pm 0.050.71 ± 0.05 0.69±0.06plus-or-minus0.690.060.69\pm 0.060.69 ± 0.06 0.70±0.06plus-or-minus0.700.060.70\pm 0.060.70 ± 0.06 0.66±0.09plus-or-minus0.660.090.66\pm 0.090.66 ± 0.09 0.68±0.07plus-or-minus0.680.070.68\pm 0.070.68 ± 0.07
Table 2: Descriptive Statistics for Epochs and Models
Refer to caption
Figure 4: Epsilon 1 to 10
Refer to caption
Figure 5: Epsilon 10 to 100
Refer to caption
Figure 6: Epsilon 100 to 1000
Figure 7: Radar Charts for Different Epsilon Ranges
Table 3: Results for Epoch = 1
Metric DP vs DP-Weights DP vs Fine-Tuned DP-Weights vs Fine-Tuned
RMSE for perplexity_member 12.505 26.940 36.620
RMSE for perplexity_non_member 12.577 26.305 36.298
RMSE for roc_auc 0.072 0.058 0.078
RMSE for accuracy 0.082 0.048 0.077
RMSE for precision 0.087 0.051 0.088
RMSE for recall 0.125 0.073 0.098
RMSE for f1 0.102 0.050 0.092
Pearson correlation for perplexity_member 0.902 nan nan
Pearson correlation for perplexity_non_member 0.905 nan nan
Pearson correlation for roc_auc 0.091 nan nan
Pearson correlation for accuracy -0.061 nan nan
Pearson correlation for precision 0.006 nan nan
Pearson correlation for recall -0.013 nan nan
Pearson correlation for f1 -0.037 nan nan
MAE for perplexity_member 9.420 21.949 30.785
MAE for perplexity_non_member 9.581 21.468 30.590
MAE for roc_auc 0.052 0.046 0.055
MAE for accuracy 0.066 0.032 0.063
MAE for precision 0.071 0.036 0.069
MAE for recall 0.086 0.039 0.068
MAE for f1 0.079 0.035 0.069
R2 score for perplexity_member 0.359 -1.974 -2.409
R2 score for perplexity_non_member 0.315 -1.995 -2.451
R2 score for roc_auc -2.455 -1.231 -0.485
R2 score for accuracy -3.449 -0.542 -0.238
R2 score for precision -4.036 -0.750 -0.269
R2 score for recall -1.998 -0.022 -0.069
R2 score for f1 -3.393 -0.059 -0.153
MSE for perplexity_member 156.363 725.739 1341.057
MSE for perplexity_non_member 158.187 691.943 1317.525
MSE for roc_auc 0.005 0.003 0.006
MSE for accuracy 0.007 0.002 0.006
MSE for precision 0.008 0.003 0.008
MSE for recall 0.016 0.005 0.010
MSE for f1 0.010 0.003 0.008
MedAE for perplexity_member 7.455 17.503 23.244
MedAE for perplexity_non_member 8.031 17.199 23.265
MedAE for roc_auc 0.040 0.055 0.030
MedAE for accuracy 0.050 0.025 0.050
MedAE for precision 0.056 0.022 0.056
MedAE for recall 0.100 0.000 0.100
MedAE for f1 0.074 0.026 0.074
Coefficient of Variation for perplexity_member 0.488 0.487 0.0
Coefficient of Variation for perplexity_non_member 0.481 0.481 1.685e-16
Coefficient of Variation for roc_auc 0.082 0.137 2.174e-16
Coefficient of Variation for accuracy 0.076 0.137 2.056e-16
Coefficient of Variation for precision 0.076 0.154 0.0
Coefficient of Variation for recall 0.144 0.204 0.0
Coefficient of Variation for f1 0.096 0.176 2.148e-16
MAPE for perplexity_member 0.331 0.576 0.686
MAPE for perplexity_non_member 0.338 0.572 0.684
MAPE for roc_auc 0.110 0.104 0.136
MAPE for accuracy 0.128 0.067 0.133
MAPE for precision 0.138 0.074 0.152
MAPE for recall 0.164 0.074 0.176
MAPE for f1 0.152 0.071 0.164
Wilcoxon test for perplexity_member 12.0, 5.215e075.215𝑒075.215e-075.215 italic_e - 07 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for perplexity_non_member 9.0, 2.459e072.459𝑒072.459e-072.459 italic_e - 07 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for roc_auc 162.0, 0.7320.7320.7320.732 20.5, 0.00012510.00012510.00012510.0001251 55.0, 0.0021600.0021600.0021600.002160
Wilcoxon test for accuracy 114.5, 0.4730.4730.4730.473 1.0, 0.00083250.00083250.00083250.0008325 34.0, 0.0024870.0024870.0024870.002487
Wilcoxon test for precision 115.5, 0.4930.4930.4930.493 5.0, 0.00037170.00037170.00037170.0003717 42.0, 0.0058690.0058690.0058690.005869
Wilcoxon test for recall 36.5, 0.09630.09630.09630.0963 16.0, 0.4170.4170.4170.417 36.0, 0.15240.15240.15240.1524
Wilcoxon test for f1 106.0, 0.33020.33020.33020.3302 60.0, 0.26320.26320.26320.2632 73.0, 0.08160.08160.08160.0816
Table 4: Results for Epoch = 5
Metric DP vs DP-Weights DP vs Fine-Tuned DP-Weights vs Fine-Tuned
RMSE for perplexity_member 7.123 13.296 7.165
RMSE for perplexity_non_member 6.595 13.011 7.575
RMSE for roc_auc 0.079 0.033 0.076
RMSE for accuracy 0.109 0.070 0.074
RMSE for precision 0.122 0.072 0.088
RMSE for recall 0.167 0.073 0.135
RMSE for f1 0.145 0.070 0.112
Pearson correlation for perplexity_member 0.922 nan nan
Pearson correlation for perplexity_non_member 0.922 nan nan
Pearson correlation for roc_auc 0.018 nan nan
Pearson correlation for accuracy -0.100 nan nan
Pearson correlation for precision -0.082 nan nan
Pearson correlation for recall -0.229 nan nan
Pearson correlation for f1 -0.168 nan nan
MAE for perplexity_member 4.356 9.001 6.126
MAE for perplexity_non_member 4.126 8.774 6.479
MAE for roc_auc 0.061 0.030 0.060
MAE for accuracy 0.093 0.063 0.059
MAE for precision 0.103 0.065 0.069
MAE for recall 0.136 0.054 0.111
MAE for f1 0.119 0.059 0.091
R2 score for perplexity_member 0.470 -0.846 -2.717
R2 score for perplexity_non_member 0.529 -0.834 -2.726
R2 score for roc_auc -4.865 -0.043 -0.450
R2 score for accuracy -1.853 -0.171 -0.030
R2 score for precision -2.402 -0.181 -0.052
R2 score for recall -4.212 -0.002 -0.447
R2 score for f1 -3.573 -0.048 -0.268
MSE for perplexity_member 50.742 176.777 51.335
MSE for perplexity_non_member 43.490 169.299 57.378
MSE for roc_auc 0.006 0.001 0.006
MSE for accuracy 0.012 0.005 0.005
MSE for precision 0.015 0.005 0.008
MSE for recall 0.028 0.005 0.018
MSE for f1 0.021 0.005 0.012
MedAE for perplexity_member 2.159 4.802 4.431
MedAE for perplexity_non_member 2.356 4.613 4.718
MedAE for roc_auc 0.045 0.030 0.065
MedAE for accuracy 0.100 0.050 0.050
MedAE for precision 0.100 0.056 0.056
MedAE for recall 0.100 0.100 0.100
MedAE for f1 0.103 0.067 0.079
Coefficient of Variation for perplexity_member 0.799 0.394 1.303e-16
Coefficient of Variation for perplexity_non_member 0.783 0.392 1.215e-16
Coefficient of Variation for roc_auc 0.068 0.123 1.178e-16
Coefficient of Variation for accuracy 0.125 0.152 0.0
Coefficient of Variation for precision 0.127 0.181 0.0
Coefficient of Variation for recall 0.148 0.269 0.0
Coefficient of Variation for f1 0.134 0.225 0.0
MAPE for perplexity_member 0.327 0.527 0.587
MAPE for perplexity_non_member 0.329 0.506 0.583
MAPE for roc_auc 0.131 0.061 0.111
MAPE for accuracy 0.176 0.119 0.125
MAPE for precision 0.194 0.123 0.157
MAPE for recall 0.261 0.110 0.339
MAPE for f1 0.225 0.116 0.244
Wilcoxon test for perplexity_member 149.0, 0.2270.2270.2270.227 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for perplexity_non_member 159.0, 0.3270.3270.3270.327 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for roc_auc 94.5, 0.02310.02310.02310.0231 126.0, 0.20590.20590.20590.2059 62.5, 0.00240.00240.00240.0024
Wilcoxon test for accuracy 91.5, 0.03230.03230.03230.0323 62.0, 0.00610.00610.00610.0061 103.5, 0.67180.67180.67180.6718
Wilcoxon test for precision 90.5, 0.03050.03050.03050.0305 80.0, 0.02290.02290.02290.0229 89.0, 0.35170.35170.35170.3517
Wilcoxon test for recall 49.0, 0.01070.01070.01070.0107 56.0, 0.79630.79630.79630.7963 45.0, 0.00290.00290.00290.0029
Wilcoxon test for f1 79.5, 0.01470.01470.01470.0147 108.0, 0.08370.08370.08370.0837 82.0, 0.05090.05090.05090.0509
Table 5: Results for Epochs = 10
Metric DP vs DP-Weights DP vs Fine-Tuned DP-Weights vs Fine-Tuned
RMSE for perplexity_member 7.123 13.296 7.165
RMSE for perplexity_non_member 6.595 13.011 7.575
RMSE for roc_auc 0.079 0.033 0.076
RMSE for accuracy 0.109 0.070 0.074
RMSE for precision 0.122 0.072 0.088
RMSE for recall 0.167 0.073 0.135
RMSE for f1 0.145 0.070 0.112
Pearson correlation for perplexity_member 0.922 nan nan
Pearson correlation for perplexity_non_member 0.922 nan nan
Pearson correlation for roc_auc 0.018 nan nan
Pearson correlation for accuracy -0.100 nan nan
Pearson correlation for precision -0.082 nan nan
Pearson correlation for recall -0.229 nan nan
Pearson correlation for f1 -0.168 nan nan
MAE for perplexity_member 4.356 9.001 6.126
MAE for perplexity_non_member 4.126 8.774 6.479
MAE for roc_auc 0.061 0.030 0.060
MAE for accuracy 0.093 0.063 0.059
MAE for precision 0.103 0.065 0.069
MAE for recall 0.136 0.054 0.111
MAE for f1 0.119 0.059 0.091
R2 score for perplexity_member 0.470 -0.846 -2.717
R2 score for perplexity_non_member 0.529 -0.834 -2.726
R2 score for roc_auc -4.865 -0.043 -0.450
R2 score for accuracy -1.853 -0.171 -0.030
R2 score for precision -2.402 -0.181 -0.052
R2 score for recall -4.212 -0.002 -0.447
R2 score for f1 -3.573 -0.048 -0.268
MSE for perplexity_member 50.742 176.777 51.335
MSE for perplexity_non_member 43.490 169.299 57.378
MSE for roc_auc 0.006 0.001 0.006
MSE for accuracy 0.012 0.005 0.005
MSE for precision 0.015 0.005 0.008
MSE for recall 0.028 0.005 0.018
MSE for f1 0.021 0.005 0.012
MedAE for perplexity_member 2.159 4.802 4.431
MedAE for perplexity_non_member 2.356 4.613 4.718
MedAE for roc_auc 0.045 0.030 0.065
MedAE for accuracy 0.100 0.050 0.050
MedAE for precision 0.100 0.056 0.056
MedAE for recall 0.100 0.100 0.100
MedAE for f1 0.103 0.067 0.079
Coefficient of Variation for perplexity_member 0.799 0.394 1.303e-16
Coefficient of Variation for perplexity_non_member 0.783 0.392 1.215e-16
Coefficient of Variation for roc_auc 0.068 0.123 1.178e-16
Coefficient of Variation for accuracy 0.125 0.152 0.0
Coefficient of Variation for precision 0.127 0.181 0.0
Coefficient of Variation for recall 0.148 0.269 0.0
Coefficient of Variation for f1 0.134 0.225 0.0
MAPE for perplexity_member 0.327 0.527 0.587
MAPE for perplexity_non_member 0.329 0.506 0.583
MAPE for roc_auc 0.131 0.061 0.111
MAPE for accuracy 0.176 0.119 0.125
MAPE for precision 0.194 0.123 0.157
MAPE for recall 0.261 0.110 0.339
MAPE for f1 0.225 0.116 0.244
Wilcoxon test for perplexity_member 149.0, 0.227 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for perplexity_non_member 159.0, 0.327 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for roc_auc 94.5, 0.023 126.0, 0.206 62.5, 0.002
Wilcoxon test for accuracy 91.5, 0.032 62.0, 0.006 103.5, 0.672
Wilcoxon test for precision 90.5, 0.030 80.0, 0.023 89.0, 0.352
Wilcoxon test for recall 49.0, 0.011 56.0, 0.796 45.0, 0.003
Wilcoxon test for f1 79.5, 0.015 108.0, 0.084 82.0, 0.051
Table 6: Results for Epochs = 20
Metric DP vs DP-Weights DP vs Fine-Tuned DP-Weights vs Fine-Tuned
RMSE for perplexity_member 3.197 7.225 6.567
RMSE for perplexity_non_member 3.577 6.896 7.528
RMSE for roc_auc 0.081 0.025 0.073
RMSE for accuracy 0.081 0.053 0.070
RMSE for precision 0.088 0.053 0.073
RMSE for recall 0.115 0.080 0.145
RMSE for f1 0.090 0.061 0.099
Pearson correlation for perplexity_member 0.896 nan nan
Pearson correlation for perplexity_non_member 0.887 nan nan
Pearson correlation for roc_auc 0.003 nan nan
Pearson correlation for accuracy 0.154 nan nan
Pearson correlation for precision 0.073 nan nan
Pearson correlation for recall 0.273 nan nan
Pearson correlation for f1 0.283 nan nan
MAE for perplexity_member 2.573 4.447 5.560
MAE for perplexity_non_member 3.169 4.076 6.378
MAE for roc_auc 0.070 0.016 0.062
MAE for accuracy 0.059 0.039 0.055
MAE for precision 0.067 0.041 0.061
MAE for recall 0.082 0.050 0.118
MAE for f1 0.066 0.044 0.081
R2 score for perplexity_member 0.685 -0.610 -2.532
R2 score for perplexity_non_member 0.587 -0.537 -2.543
R2 score for roc_auc -10.510 -0.098 -0.456
R2 score for accuracy -1.291 -0.004 -0.123
R2 score for precision -1.680 -0.0002 -0.039
R2 score for recall -2.364 -0.636 -1.391
R2 score for f1 -1.601 -0.184 -0.769
MSE for perplexity_member 10.222 52.200 43.127
MSE for perplexity_non_member 12.792 47.557 56.675
MSE for roc_auc 0.007 0.001 0.005
MSE for accuracy 0.007 0.003 0.005
MSE for precision 0.008 0.003 0.005
MSE for recall 0.013 0.006 0.021
MSE for f1 0.008 0.004 0.010
MedAE for perplexity_member 2.284 1.442 3.925
MedAE for perplexity_non_member 2.885 1.023 4.564
MedAE for roc_auc 0.065 0.010 0.050
MedAE for accuracy 0.050 0.050 0.050
MedAE for precision 0.055 0.045 0.050
MedAE for recall 0.100 0.000 0.100
MedAE for f1 0.048 0.029 0.071
Coefficient of Variation for perplexity_member 0.781 0.417 0.0
Coefficient of Variation for perplexity_non_member 0.746 0.412 0.0
Coefficient of Variation for roc_auc 0.050 0.115 0.0
Coefficient of Variation for accuracy 0.099 0.128 2.056e-16
Coefficient of Variation for precision 0.100 0.137 2.073e-16
Coefficient of Variation for recall 0.116 0.195 0.0
Coefficient of Variation for f1 0.104 0.151 0.0
MAPE for perplexity_member 0.466 0.399 0.596
MAPE for perplexity_non_member 0.570 0.337 0.589
MAPE for roc_auc 0.145 0.034 0.112
MAPE for accuracy 0.112 0.077 0.114
MAPE for precision 0.127 0.080 0.123
MAPE for recall 0.147 0.107 0.287
MAPE for f1 0.122 0.090 0.181
Wilcoxon test for perplexity_member 77.0, 0.0030.0030.0030.003 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for perplexity_non_member 49.0, 0.00020.00020.00020.0002 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for roc_auc 61.0, 0.00070.00070.00070.0007 81.0, 0.1320.1320.1320.132 52.5, 0.00180.00180.00180.0018
Wilcoxon test for accuracy 51.0, 0.2260.2260.2260.226 55.0, 0.1740.1740.1740.174 42.0, 0.0100.0100.0100.010
Wilcoxon test for precision 103.5, 0.4550.4550.4550.455 98.0, 0.3490.3490.3490.349 138.0, 0.3400.3400.3400.340
Wilcoxon test for recall 19.5, 0.00490.00490.00490.0049 0.0, 0.00110.00110.00110.0011 7.0, 6.554e056.554𝑒056.554e-056.554 italic_e - 05
Wilcoxon test for f1 57.5, 0.01430.01430.01430.0143 55.0, 0.01870.01870.01870.0187 34.0, 0.00030.00030.00030.0003
Table 7: Results for Batch Size = 5
Metric DP vs DP-Weights DP vs Fine-Tuned DP-Weights vs Fine-Tuned
RMSE for perplexity_member 1.105 1.375 2.320
RMSE for perplexity_non_member 4.430 0.483 4.327
RMSE for roc_auc 0.222 0.348 0.151
RMSE for accuracy 0.202 0.334 0.181
RMSE for precision 0.202 0.304 0.135
RMSE for recall 0.218 0.380 0.247
RMSE for f1 0.206 0.341 0.189
Pearson correlation for perplexity_member 0.948 nan nan
Pearson correlation for perplexity_non_member 0.362 nan nan
Pearson correlation for roc_auc 0.706 nan nan
Pearson correlation for accuracy 0.386 nan nan
Pearson correlation for precision 0.427 nan nan
Pearson correlation for recall 0.330 nan nan
Pearson correlation for f1 0.382 nan nan
MAE for perplexity_member 1.045 1.001 2.046
MAE for perplexity_non_member 3.994 0.334 3.819
MAE for roc_auc 0.194 0.282 0.144
MAE for accuracy 0.182 0.277 0.173
MAE for precision 0.181 0.248 0.129
MAE for recall 0.182 0.314 0.232
MAE for f1 0.181 0.280 0.180
R2 score for perplexity_member -0.375 -1.129 -3.498
R2 score for perplexity_non_member -95.737 -0.151 -3.526
R2 score for roc_auc -0.196 -1.928 -9.677
R2 score for accuracy -0.171 -2.195 -11.602
R2 score for precision -0.313 -1.974 -10.019
R2 score for recall -0.044 -2.170 -7.504
R2 score for f1 -0.129 -2.091 -8.940
MSE for perplexity_member 1.221 1.890 5.380
MSE for perplexity_non_member 19.629 0.233 18.723
MSE for roc_auc 0.049 0.121 0.023
MSE for accuracy 0.041 0.112 0.033
MSE for precision 0.041 0.093 0.018
MSE for recall 0.048 0.144 0.061
MSE for f1 0.042 0.116 0.036
MedAE for perplexity_member 0.943 0.774 1.623
MedAE for perplexity_non_member 3.481 0.332 3.001
MedAE for roc_auc 0.165 0.320 0.135
MedAE for accuracy 0.200 0.325 0.150
MedAE for precision 0.164 0.291 0.109
MedAE for recall 0.200 0.300 0.200
MedAE for f1 0.183 0.301 0.152
Coefficient of Variation for perplexity_member 0.344 0.290 0.0
Coefficient of Variation for perplexity_non_member 0.119 0.264 0.0
Coefficient of Variation for roc_auc 0.301 0.057 1.166e-16
Coefficient of Variation for accuracy 0.283 0.067 1.190e-16
Coefficient of Variation for precision 0.272 0.053 1.244e-16
Coefficient of Variation for recall 0.317 0.112 0.0
Coefficient of Variation for f1 0.294 0.079 0.0
MAPE for perplexity_member 0.397 0.293 0.498
MAPE for perplexity_non_member 1.040 0.084 0.453
MAPE for roc_auc 0.352 0.537 0.178
MAPE for accuracy 0.318 0.520 0.229
MAPE for precision 0.327 0.476 0.168
MAPE for recall 0.331 0.621 0.321
MAPE for f1 0.327 0.545 0.241
Wilcoxon test for perplexity_member 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for perplexity_non_member 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 43.0, 9.319e059.319𝑒059.319e-059.319 italic_e - 05 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for roc_auc 61.0, 0.00070.00070.00070.0007 0.0, 3.991e053.991𝑒053.991e-053.991 italic_e - 05 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for accuracy 64.5, 0.00270.00270.00270.0027 0.0, 3.835e053.835𝑒053.835e-053.835 italic_e - 05 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for precision 57.0, 0.00150.00150.00150.0015 0.0, 3.876e053.876𝑒053.876e-053.876 italic_e - 05 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for recall 79.0, 0.04170.04170.04170.0417 0.0, 3.574e053.574𝑒053.574e-053.574 italic_e - 05 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for f1 78.0, 0.00760.00760.00760.0076 0.0, 3.876e053.876𝑒053.876e-053.876 italic_e - 05 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Table 8: Results for Batch Size = 10
Metric DP vs DP-Weights DP vs Fine-Tuned DP-Weights vs Fine-Tuned
RMSE for perplexity_member 1.602 1.833 3.032
RMSE for perplexity_non_member 3.979 1.067 4.467
RMSE for roc_auc 0.172 0.280 0.138
RMSE for accuracy 0.150 0.240 0.138
RMSE for precision 0.175 0.245 0.122
RMSE for recall 0.138 0.249 0.195
RMSE for f1 0.147 0.247 0.160
Pearson correlation for perplexity_member 0.885 nan nan
Pearson correlation for perplexity_non_member 0.688 nan nan
Pearson correlation for roc_auc 0.335 nan nan
Pearson correlation for accuracy 0.157 nan nan
Pearson correlation for precision 0.030 nan nan
Pearson correlation for recall 0.290 nan nan
Pearson correlation for f1 0.203 nan nan
MAE for perplexity_member 1.493 1.228 2.685
MAE for perplexity_non_member 3.649 0.482 3.939
MAE for roc_auc 0.155 0.243 0.133
MAE for accuracy 0.107 0.214 0.121
MAE for precision 0.128 0.216 0.100
MAE for recall 0.089 0.214 0.175
MAE for f1 0.095 0.216 0.141
R2 score for perplexity_member -0.386 -0.815 -3.644
R2 score for perplexity_non_member -14.029 -0.080 -3.499
R2 score for roc_auc -0.511 -3.026 -14.274
R2 score for accuracy -0.913 -3.905 -3.380
R2 score for precision -1.337 -3.545 -2.143
R2 score for recall -0.167 -2.830 -4.035
R2 score for f1 -0.547 -3.334 -3.526
MSE for perplexity_member 2.565 3.359 9.190
MSE for perplexity_non_member 15.831 1.138 19.954
MSE for roc_auc 0.029 0.078 0.019
MSE for accuracy 0.022 0.058 0.019
MSE for precision 0.031 0.060 0.015
MSE for recall 0.019 0.062 0.038
MSE for f1 0.022 0.061 0.026
MedAE for perplexity_member 1.279 0.839 2.118
MedAE for perplexity_non_member 3.251 0.187 3.027
MedAE for roc_auc 0.155 0.310 0.135
MedAE for accuracy 0.050 0.250 0.150
MedAE for precision 0.083 0.244 0.117
MedAE for recall 0.100 0.200 0.200
MedAE for f1 0.058 0.229 0.168
Coefficient of Variation for perplexity_member 0.412 0.297 2.123e-16
Coefficient of Variation for perplexity_non_member 0.259 0.279 1.206e-16
Coefficient of Variation for roc_auc 0.238 0.051 0.0
Coefficient of Variation for accuracy 0.189 0.099 1.413e-16
Coefficient of Variation for precision 0.200 0.100 1.413e-16
Coefficient of Variation for recall 0.221 0.142 1.413e-16
Coefficient of Variation for f1 0.207 0.116 1.413e-16
MAPE for perplexity_member 0.480 0.290 0.522
MAPE for perplexity_non_member 0.914 0.089 0.477
MAPE for roc_auc 0.294 0.476 0.191
MAPE for accuracy 0.221 0.417 0.190
MAPE for precision 0.269 0.429 0.155
MAPE for recall 0.203 0.453 0.305
MAPE for f1 0.213 0.441 0.230
Wilcoxon test for perplexity_member 1.0, 1.490e081.490𝑒081.490e-081.490 italic_e - 08 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for perplexity_non_member 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 192.0, 0.8139230.8139230.8139230.813923 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for roc_auc 41.5, 0.0003930.0003930.0003930.000393 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for accuracy 20.5, 0.0003340.0003340.0003340.000334 0.0, 5.103e065.103𝑒065.103e-065.103 italic_e - 06 0.0, 6.731e066.731𝑒066.731e-066.731 italic_e - 06
Wilcoxon test for precision 14.0, 0.0001620.0001620.0001620.000162 0.0, 5.383e065.383𝑒065.383e-065.383 italic_e - 06 0.0, 7.326e067.326𝑒067.326e-067.326 italic_e - 06
Wilcoxon test for recall 42.0, 0.1624860.1624860.1624860.162486 0.0, 4.571e064.571𝑒064.571e-064.571 italic_e - 06 0.0, 6.002e066.002𝑒066.002e-066.002 italic_e - 06
Wilcoxon test for f1 48.0, 0.0061880.0061880.0061880.006188 0.0, 5.383e065.383𝑒065.383e-065.383 italic_e - 06 0.0, 7.326e067.326𝑒067.326e-067.326 italic_e - 06
Table 9: Results for Batch Size = 50
Metric DP vs DP-Weights DP vs Fine-Tuned DP-Weights vs Fine-Tuned
RMSE for perplexity_member 3.197 7.225 6.567
RMSE for perplexity_non_member 3.577 6.896 7.528
RMSE for roc_auc 0.081 0.025 0.073
RMSE for accuracy 0.081 0.053 0.070
RMSE for precision 0.088 0.053 0.073
RMSE for recall 0.115 0.080 0.145
RMSE for f1 0.090 0.061 0.099
Pearson correlation for perplexity_member 0.896 nan nan
Pearson correlation for perplexity_non_member 0.887 nan nan
Pearson correlation for roc_auc 0.003 nan nan
Pearson correlation for accuracy 0.154 nan nan
Pearson correlation for precision 0.073 nan nan
Pearson correlation for recall 0.273 nan nan
Pearson correlation for f1 0.283 nan nan
MAE for perplexity_member 2.573 4.447 5.560
MAE for perplexity_non_member 3.169 4.076 6.378
MAE for roc_auc 0.070 0.016 0.062
MAE for accuracy 0.059 0.039 0.055
MAE for precision 0.067 0.041 0.061
MAE for recall 0.082 0.050 0.118
MAE for f1 0.066 0.044 0.081
R2 score for perplexity_member 0.685 -0.610 -2.532
R2 score for perplexity_non_member 0.587 -0.537 -2.543
R2 score for roc_auc -10.510 -0.098 -0.456
R2 score for accuracy -1.291 -0.004 -0.123
R2 score for precision -1.680 -0.000 -0.039
R2 score for recall -2.364 -0.636 -1.391
R2 score for f1 -1.601 -0.184 -0.769
MSE for perplexity_member 10.222 52.200 43.127
MSE for perplexity_non_member 12.792 47.557 56.675
MSE for roc_auc 0.007 0.001 0.005
MSE for accuracy 0.007 0.003 0.005
MSE for precision 0.008 0.003 0.005
MSE for recall 0.013 0.006 0.021
MSE for f1 0.008 0.004 0.010
MedAE for perplexity_member 2.284 1.442 3.925
MedAE for perplexity_non_member 2.885 1.023 4.564
MedAE for roc_auc 0.065 0.010 0.050
MedAE for accuracy 0.050 0.050 0.050
MedAE for precision 0.055 0.045 0.050
MedAE for recall 0.100 0.000 0.100
MedAE for f1 0.048 0.029 0.071
Coefficient of Variation for perplexity_member 0.781 0.417 0.0
Coefficient of Variation for perplexity_non_member 0.746 0.412 0.0
Coefficient of Variation for roc_auc 0.050 0.115 0.0
Coefficient of Variation for accuracy 0.099 0.128 2.056e-16
Coefficient of Variation for precision 0.100 0.137 2.073e-16
Coefficient of Variation for recall 0.116 0.195 0.0
Coefficient of Variation for f1 0.104 0.151 0.0
MAPE for perplexity_member 0.466 0.399 0.596
MAPE for perplexity_non_member 0.570 0.337 0.589
MAPE for roc_auc 0.145 0.034 0.112
MAPE for accuracy 0.112 0.077 0.114
MAPE for precision 0.127 0.080 0.123
MAPE for recall 0.147 0.107 0.287
MAPE for f1 0.122 0.090 0.181
Wilcoxon test for perplexity_member 77.0, 0.0030.0030.0030.003 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for perplexity_non_member 49.0, 0.00019370.00019370.00019370.0001937 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for roc_auc 61.0, 0.00071620.00071620.00071620.0007162 81.0, 0.1320.1320.1320.132 52.5, 0.0017720.0017720.0017720.001772
Wilcoxon test for accuracy 51.0, 0.2260.2260.2260.226 55.0, 0.1740.1740.1740.174 42.0, 0.0100.0100.0100.010
Wilcoxon test for precision 103.5, 0.4550.4550.4550.455 98.0, 0.3490.3490.3490.349 138.0, 0.3400.3400.3400.340
Wilcoxon test for recall 19.5, 0.0048770.0048770.0048770.004877 0.0, 0.0010540.0010540.0010540.001054 7.0, 6.554e056.554𝑒056.554e-056.554 italic_e - 05
Wilcoxon test for f1 57.5, 0.01430.01430.01430.0143 55.0, 0.01870.01870.01870.0187 34.0, 0.0003210.0003210.0003210.000321
Table 10: Results for Batch Size = 100
Metric DP vs DP-Weights DP vs Fine-Tuned DP-Weights vs Fine-Tuned
RMSE for perplexity_member 7.123 13.296 7.165
RMSE for perplexity_non_member 6.595 13.011 7.575
RMSE for roc_auc 0.079 0.033 0.076
RMSE for accuracy 0.109 0.070 0.074
RMSE for precision 0.122 0.072 0.088
RMSE for recall 0.167 0.073 0.135
RMSE for f1 0.145 0.070 0.112
Pearson correlation for perplexity_member 0.922 nan nan
Pearson correlation for perplexity_non_member 0.922 nan nan
Pearson correlation for roc_auc 0.018 nan nan
Pearson correlation for accuracy -0.100 nan nan
Pearson correlation for precision -0.082 nan nan
Pearson correlation for recall -0.229 nan nan
Pearson correlation for f1 -0.168 nan nan
MAE for perplexity_member 4.356 9.001 6.126
MAE for perplexity_non_member 4.126 8.774 6.479
MAE for roc_auc 0.061 0.030 0.060
MAE for accuracy 0.093 0.063 0.059
MAE for precision 0.103 0.065 0.069
MAE for recall 0.136 0.054 0.111
MAE for f1 0.119 0.059 0.091
R2 score for perplexity_member 0.470 -0.846 -2.717
R2 score for perplexity_non_member 0.529 -0.834 -2.726
R2 score for roc_auc -4.865 -0.043 -0.450
R2 score for accuracy -1.853 -0.171 -0.030
R2 score for precision -2.402 -0.181 -0.052
R2 score for recall -4.212 -0.002 -0.447
R2 score for f1 -3.573 -0.048 -0.268
MSE for perplexity_member 50.742 176.777 51.335
MSE for perplexity_non_member 43.490 169.299 57.378
MSE for roc_auc 0.006 0.001 0.006
MSE for accuracy 0.012 0.005 0.005
MSE for precision 0.015 0.005 0.008
MSE for recall 0.028 0.005 0.018
MSE for f1 0.021 0.005 0.012
MedAE for perplexity_member 2.159 4.802 4.431
MedAE for perplexity_non_member 2.356 4.613 4.718
MedAE for roc_auc 0.045 0.030 0.065
MedAE for accuracy 0.100 0.050 0.050
MedAE for precision 0.100 0.056 0.056
MedAE for recall 0.100 0.100 0.100
MedAE for f1 0.103 0.067 0.079
Coefficient of Variation for perplexity_member 0.799 0.394 1.303e-16
Coefficient of Variation for perplexity_non_member 0.783 0.392 1.215e-16
Coefficient of Variation for roc_auc 0.068 0.123 1.178e-16
Coefficient of Variation for accuracy 0.125 0.152 0.0
Coefficient of Variation for precision 0.127 0.181 0.0
Coefficient of Variation for recall 0.148 0.269 0.0
Coefficient of Variation for f1 0.134 0.225 0.0
MAPE for perplexity_member 0.327 0.527 0.587
MAPE for perplexity_non_member 0.329 0.506 0.583
MAPE for roc_auc 0.131 0.061 0.111
MAPE for accuracy 0.176 0.119 0.125
MAPE for precision 0.194 0.123 0.157
MAPE for recall 0.261 0.110 0.339
MAPE for f1 0.225 0.116 0.244
Wilcoxon test for perplexity_member 149.0, 0.2270.2270.2270.227 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for perplexity_non_member 159.0, 0.3270.3270.3270.327 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09 0.0, 7.451e097.451𝑒097.451e-097.451 italic_e - 09
Wilcoxon test for roc_auc 94.5, 0.0230.0230.0230.023 126.0, 0.2060.2060.2060.206 62.5, 0.0020.0020.0020.002
Wilcoxon test for accuracy 91.5, 0.0320.0320.0320.032 62.0, 0.0060.0060.0060.006 103.5, 0.6720.6720.6720.672
Wilcoxon test for precision 90.5, 0.0300.0300.0300.030 80.0, 0.0230.0230.0230.023 89.0, 0.3520.3520.3520.352
Wilcoxon test for recall 49.0, 0.0110.0110.0110.011 56.0, 0.7960.7960.7960.796 45.0, 0.0030.0030.0030.003
Wilcoxon test for f1 79.5, 0.0150.0150.0150.015 108.0, 0.0840.0840.0840.084 82.0, 0.0510.0510.0510.051