Metric Differential Privacy at the User-Level

Jacob Imola (University of Copenhagen, Copenhagen, Denmark), Amrita Roy Chowdhury (UCSD, La Jolla, California, USA), and Kamalika Chaudhuri (UCSD, La Jolla, California, USA)
Abstract.

Metric differential privacy (DP) provides heterogeneous privacy guarantees based on a distance between the pair of inputs. It is a widely popular notion of privacy since it captures the natural privacy semantics for many applications (such as location data) and results in better utility than standard DP. However, prior work in metric DP has primarily focused on the item-level setting where every user only reports a single data item. A more realistic setting is that of user-level DP where each user contributes multiple items and privacy is then desired at the granularity of the user’s entire contribution. In this paper, we initiate the study of metric DP at the user-level. Specifically, we use the earth-mover’s distance ($d_{\textsf{EM}}$) as our metric to obtain a notion of privacy as it captures both the magnitude and spatial aspects of changes in a user’s data.

We make three main technical contributions. First, we design two novel mechanisms under $d_{\textsf{EM}}$-DP to answer linear queries and item-wise queries. Specifically, our analysis for the latter involves a generalization of the privacy amplification by shuffling result which may be of independent interest. Second, we provide a black-box reduction from the general unbounded to bounded $d_{\textsf{EM}}$-DP (where the size of the dataset is fixed and public) with a novel sampling-based mechanism. Third, we show that our proposed mechanisms can provably provide improved utility over user-level DP, for certain types of linear queries and frequency estimation.

User-level Differential Privacy, Earth-Mover’s Distance, Couplings
CCS Concepts: Security and privacy; Theory of computation → Design and analysis of algorithms

1. Introduction

Differential privacy (DP) is the state-of-the-art technique that enables useful data analysis while still providing a strong privacy guarantee at the granularity of individuals (Dwork, 2006). Over nearly two decades, DP has received significant academic attention and has proven its efficacy in practical applications as well. It has been successfully deployed in diverse settings, including the US census (Abowd, 2018), Apple’s iOS platform (Cormode et al., 2018), and Google Chrome (Erlingsson et al., 2014).

Intuitively, a DP guarantee makes a pair of inputs indistinguishable from each other. The standard DP guarantee requires all pairs of inputs to be indistinguishable, thereby providing a uniform privacy guarantee to all pairs. This implies that every pair of inputs is considered equally sensitive. However, many practical applications call for more tailored privacy semantics based on the heterogeneity of the data. In particular, input pairs that are closer or more similar to each other are considered to be more sensitive. For instance, for location data, revealing the exact city of residence is far more sensitive than revealing just the country. Metric DP ($d_{\mathcal{X}}$-DP; (Chatzikokolakis et al., 2013)) is a notion of DP that formally captures this heterogeneity in privacy semantics. Specifically, similarity is measured via a distance metric $d_{\mathcal{X}}$ and the privacy guarantee degrades linearly with the $d_{\mathcal{X}}$ distance between the pair of inputs. In addition to offering a more nuanced privacy definition, metric DP also improves utility compared to standard DP. This improvement stems from metric DP requiring only similar pairs of inputs to be indistinguishable, which results in significantly lower noise than standard DP.

Prior work in metric DP has primarily focused on the item-level setting where every user only reports a single data item (e.g., a single record in a dataset). However, in many practical applications, a user contributes multiple items to a dataset. Privacy is then desired at the granularity of the user’s entire contribution. This has spurred a large body of work known as user-level DP (Amin et al., 2019; Bassily and Sun, 2023; Cummings et al., 2022; Acharya et al., 2023). However, all of this work considers only standard DP and is thus susceptible to the same limitations in utility as noted earlier. To this end, we initiate the study of metric DP at the user-level. While there have been some prior attempts at this, that work is limited to specific settings such as text data (Fernandes et al., 2019). To the best of our knowledge, this is the first work to give a general definition of metric DP at the user-level.

The immediate task is to define a metric on the entire collection of a user’s data. Recall that metric DP caters to the privacy semantics that similar data is more sensitive. But the challenge here is that the similarity between two collections (sets) of data points has to be measured along two dimensions: (1) the distance between the individual data items, and (2) the fraction of the data items in the set that are different. In particular, note that in addition to small changes in the item-wise distances, changes in a smaller amount of the data also indicate more similarity and hence correspond to more sensitive information (see below for concrete examples). This necessitates a measure that can express both of these quantities as a single metric. We tackle this challenge by using the earth-mover’s distance ($d_{\textsf{EM}}$; (Givens and Shortt, 1984)) on the normalized representation of the user’s data. Informally, the $d_{\textsf{EM}}$ between two distributions is the minimum cost of transporting one distribution to another, where the cost is determined by the quantity of data items moved multiplied by the distance (measured via $d_{\mathcal{X}}$) over which they are moved. Our resulting privacy definition, denoted as $d_{\textsf{EM}}$-DP, yields the following privacy semantics.
Under $d_{\textsf{EM}}$-DP, the strength of the privacy guarantee (indistinguishability) between a pair of inputs $K, K'$ (sets of data items) grows inversely with $\tau q$ if $K'$ can be obtained by changing a $\tau$ fraction of $K$ by an average distance of $q$ (Def. 3.1). $d_{\textsf{EM}}$ therefore takes into account both the structure of the distributions as well as the raw difference in their values. Consequently, the parameters $\tau$ and $q$ provide flexibility in interpretation and offer a nuanced privacy definition suitable for many practical applications. We illustrate this with the following examples:
Location Data. We will use a location dataset as a canonical example throughout the paper. Suppose that the location dataset consists of daily locations of users collected over a period of time. Here, the parameter $\tau$ can be interpreted in terms of the length of the time window the change in $K'$ pertains to, and $q$ corresponds to the extent of change in the location. Then, $d_{\textsf{EM}}$-DP makes it harder to distinguish between locations that are (1) close to each other, and (2) collected over a smaller time window. This is natural, since locations gathered over an extended period, such as a month, may reveal routine patterns that are less sensitive than locations recorded on a single day (for instance, a single-day location might reveal a non-routine visit to a friend or hospital).
Textual Data. Consider a natural language dataset of user conversations where each user’s data is represented as a set of words. Typically, word embeddings $\phi$ map each word into a high-dimensional space, and word similarity is measured using a distance, such as the Euclidean distance, between $\phi(x_1)$ and $\phi(x_2)$. Now, the parameter $\tau$ corresponds to what fraction of the user’s conversation has changed in $K'$ from $K$, while $q$ corresponds to the extent of the changes in the textual content. Thus, two conversations are harder to distinguish if (1) there is only a fine-grained difference in their textual semantics (for example, transitioning from text about algebra to trigonometry, versus changing it from "math" to "classical music"), and (2) the difference pertains to just a small fraction of the conversation (indicating a user rarely discussed the topic, which typically implies more sensitive information).
Graph Data. Consider a graph $G=(V,E)$ in which connections in $E$ are private. Suppose there is additional public information in the form of a covariate $\phi: V \rightarrow \mathbb{R}^{d}$, which captures some auxiliary information about a user, for instance, the interests of a user. Here similarity between users is measured via covariate distance. The parameter $\tau$ corresponds to the fraction of a user’s connections which has changed in $K'$ from $K$, and the parameter $q$ corresponds to the extent of the change in their interests. Thus, two graphs are harder to distinguish if (1) the change to the interest is fine-grained (for instance, shifting from movies featuring Dwayne Johnson to Vin Diesel, instead of from "action" to "rom-com"), and (2) it pertains to only a few of the user’s connections (say a small, private group of friends). This again captures natural privacy semantics as users are more likely to share common interests with their close friends than with a larger group, such as all workplace colleagues.
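To make the $\tau q$ semantics concrete, the following minimal sketch computes the earth-mover’s distance between two equal-size datasets. It is an illustration under stated assumptions rather than the paper’s implementation: the function names `emd_equal_size` and `d_line`, the one-dimensional domain $[0,1]$, and the toy datasets are all hypothetical. When both datasets have $m$ items, each item carries mass $1/m$ and an optimal transport plan can be taken to be a one-to-one matching, so a brute-force search over matchings suffices for tiny $m$.

```python
from itertools import permutations


def emd_equal_size(K, K_prime, dist):
    """Earth-mover's distance between the normalized versions of two
    datasets that hold the same number of items.

    Every item has mass 1/m, so an optimal plan is a one-to-one matching.
    Brute force over all matchings: only sensible for very small m.
    """
    m = len(K)
    if m == 0 or len(K_prime) != m:
        raise ValueError("datasets must be non-empty and of equal size")
    best_total = min(
        sum(dist(x, y) for x, y in zip(K, perm))
        for perm in permutations(K_prime)
    )
    return best_total / m


def d_line(x, y):
    # Hypothetical normalized item metric on the domain [0, 1].
    return abs(x - y)


# K_prime changes a tau = 1/4 fraction of K by a distance of q = 0.2,
# so the distance is tau * q (about 0.05).
K = [0.1, 0.1, 0.5, 0.9]
K_prime = [0.1, 0.3, 0.5, 0.9]
distance = emd_equal_size(K, K_prime, d_line)
print(distance)
```

Moving more items, or moving the same item farther, increases the distance and therefore weakens the indistinguishability guarantee between the two datasets.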

1.1. Details of Our Contributions

We consider $n$ users who hold datasets $\{K_i\}_{i=1}^{n}$, each containing elements from a data domain $\mathcal{X}$ of size $k=|\mathcal{X}|$. Let $d_{\mathcal{X}}$ denote the distance metric defined over $\mathcal{X}$. WLOG, we consider $d_{\mathcal{X}}$ to be a normalized distance metric, i.e., all distances are normalized to be at most $1$. Let $\tilde{K}_i$ denote the normalized version of the dataset $K_i$.
The distance $d_{\textsf{EM}}$ between any pair of datasets $\{K_i, K_i'\}$ can be defined by first normalizing them to $\{\tilde{K}_i, \tilde{K}_i'\}$, and then using $d_{\mathcal{X}}$ to measure the minimum cost of transporting $\tilde{K}_i$ to $\tilde{K}_i'$. The global dataset is given by $K_G = K_1 \cup \cdots \cup K_n$, and there is an aggregator who wants to privately compute a query $F(K)$. In the central model, the aggregator already holds $K_i$ from each user, and applies a private mechanism $\mathcal{M}(K)$ to obtain a private estimate for $F$.
In the local model, the users do not trust the aggregator, and communicate private messages $\{m_i = \mathcal{M}_i(K_i)\}$ to the aggregator. The aggregator then post-processes these messages as $\mathcal{F}(m_1, \ldots, m_n)$ to output a private estimate of $F$. For simplicity, in this work we assume the mechanisms $\mathcal{M}_i$ to be non-interactive.

We also make a distinction between bounded and unbounded data. Note that boundedness here refers to the size of each user’s dataset and not the number of users: throughout the paper, we assume that the number of users, $n$, is fixed and publicly known. In our specific context, bounded data corresponds to the case where the size of each user’s dataset is publicly known, and the mechanism $\mathcal{M}$ only needs to preserve privacy between datasets of the same size. Furthermore, in the central model, each user’s dataset has the same public size. The benefit of this simplification is that algorithm analysis is easier. Such a bounded data setting has been considered in many previous works (Li et al., 2016). We also consider the general unbounded data setting where each user can have datasets of varying sizes, with the size being private as well.

For each model and type of boundedness, we summarize how one would apply $d_{\textsf{EM}}$-DP, along with the resulting semantics, in Table 1. We also include a corresponding notion of standard user-level DP (Liu et al., 2023), which provides a uniform privacy guarantee to all pairs of datasets and serves as our baseline. In what follows, we elaborate on our main contributions.

| Model | Granularity | Data Boundedness | Privacy Guarantee | Semantics | Notes |
|---|---|---|---|---|---|
| Local (applies to each $\mathcal{M}_i$) | User | Unbounded | $(\varepsilon,\delta)$-user-level DP (Def. 2.1) | Two input datasets $K, K' \in \mathcal{X}^{*}$ are indistinguishable with parameters $(\varepsilon,\delta)$. | Recently proposed in (Acharya et al., 2023). Acts as our baseline for the local model. |
| Local | User | Bounded | $(\alpha,\delta)$-bounded $d_{\textsf{EM}}$-DP (Def. 3.1) | Two input datasets $K, K' \in \mathcal{X}^{m}$ are indistinguishable with parameters $(\alpha d_{\textsf{EM}}(\tilde{K},\tilde{K}'),\delta)$. | The size of each dataset, $m$, is public. Proofs of privacy are easier due to Lemma 2.1. |
| Local | User | Unbounded | $(\alpha,\delta)$-unbounded $d_{\textsf{EM}}$-DP (Def. 3.1) | Two input datasets $K, K' \in \mathcal{X}^{*}$ are indistinguishable with parameters $(\alpha d_{\textsf{EM}}(K,K'),\delta)$. | Implies user-level DP when $\alpha \leq \varepsilon$ since $d_{\textsf{EM}}(\cdot,\cdot) \leq 1$. |
| Local | Item | N/A | $(\alpha,\delta)$-$d_{\mathcal{X}}$-DP (Def. 2.3) | Two input items $x, x' \in \mathcal{X}$ are protected with parameters $(\alpha d_{\mathcal{X}}(x,x'),\delta)$. | Proposed in (Chatzikokolakis et al., 2015). |
| Central (applies to $\mathcal{M}$) | User | Unbounded | $(\varepsilon,\delta)$-user-level DP (Def. 2.2) | Let $K_G = K_1 \cup \cdots \cup K_n$ where $K_i \in \mathcal{X}^{*}$. Two input global datasets $K_G, K_G'$ that differ only on the dataset of a single user $\{K_i, K_i'\}$, $i \in [n]$, are indistinguishable with parameters $(\varepsilon,\delta)$. | Studied widely (Bassily and Sun, 2023; Liu et al., 2020, 2023). Acts as our baseline for the central model. |
| Central | User | Bounded | $(\alpha,\delta)$-bounded $d_{\textsf{EM}}$-DP (Def. 3.1) | Two input global datasets $K_G, K_G'$ that differ only on $\{K_i, K_i'\} \in \mathcal{X}^{m} \times \mathcal{X}^{m}$ are indistinguishable with parameters $(\alpha d_{\textsf{EM}}(\tilde{K}_i,\tilde{K}_i'),\delta)$. | Each $K_i$ has size $m$, which is public. |
| Central | User | Unbounded | $(\varepsilon,\delta,r)$-discrete $d_{\textsf{EM}}$-DP (Def. 3.2) | Two input global datasets $K_G, K_G'$ that differ only on $\{K_i, K_i'\}$ with $d_{\textsf{EM}}(\tilde{K}_i,\tilde{K}_i') \leq r$ are indistinguishable with parameters $(\varepsilon,\delta)$. Using group privacy, we can show the parameters $(\varepsilon\lceil\frac{d}{r}\rceil, \delta\exp(\varepsilon\lceil\frac{d}{r}\rceil))$ where $d = d_{\textsf{EM}}(\tilde{K}_i,\tilde{K}_i')$. | Implies user-level DP when $r \geq 1$ since $d_{\textsf{EM}}(\cdot,\cdot) \leq 1$. |

Table 1. Summary of privacy definitions for this paper. The number of users, $n$, is fixed and publicly known for all the definitions.
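The group-privacy parameters quoted in the last row of Table 1 can be evaluated directly. The short sketch below only transcribes that expression; the function name `group_privacy_params` and the numeric inputs are hypothetical and chosen purely for illustration.

```python
import math


def group_privacy_params(eps, delta, r, d):
    """Evaluate (eps * ceil(d / r), delta * exp(eps * ceil(d / r))),
    the parameters Table 1 states for two datasets at earth-mover's
    distance d under (eps, delta, r)-discrete d_EM-DP."""
    if r <= 0:
        raise ValueError("threshold r must be positive")
    steps = math.ceil(d / r)  # number of r-sized units covering distance d
    return eps * steps, delta * math.exp(eps * steps)


# Illustrative numbers: d = 0.25 needs ceil(0.25 / 0.1) = 3 units of size r.
eps_total, delta_total = group_privacy_params(eps=0.5, delta=1e-6, r=0.1, d=0.25)
print(eps_total, delta_total)  # eps_total is 1.5
```

Note that `d / r` is computed in floating point, so when `d` is meant to be an exact multiple of `r` the ceiling may round up by one unit; exact rational arithmetic would avoid this.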

1.1.1. Mechanism Design

We provide novel mechanisms for answering two types of queries under $d_{\textsf{EM}}$-DP.

Linear Query

First, we study how to release linear queries $V\tilde{K}_G$, where $\tilde{K}_G \in \mathbb{R}^{\mathcal{X}}$ is the normalized representation of the global dataset $K_G$ and $V \in \mathbb{R}^{d \times \mathcal{X}}$ is a real-valued matrix with bounded entries. While computing the sensitivity of a linear query is easy under user-level DP, bounding the sensitivity under $d_{\textsf{EM}}$-DP is quite challenging. Specifically, it requires analysis of a coupling between two possible datasets, along with a stronger assumption that $V$ is “Lipschitz” in a sense, rather than just being bounded. To this end, we first prove the following bound:

Theorem 1.1.

(Informal version of Thm. 4.1): The sensitivity of $V\tilde{K}_G$ is upper bounded by

$$\max_{K,K'} \frac{\|V\tilde{K}_G - V\tilde{K}_G'\|}{d_{\textsf{EM}}(\tilde{K}_G, \tilde{K}_G')} \leq \max_{x,x' \in \mathcal{X}} \frac{\|V[x] - V[x']\|}{d_{\mathcal{X}}(x,x')},$$

where the notation $V[x]$ indicates the column of $V$ indexed by $x$.

Using the above result, we show that the sensitivity of $V$, which is a maximum over the space of all datasets, can be reduced to a Lipschitz property of $V$ that is much easier to compute. In Sec. 6.1, we show that a special class of linear queries, which we call linear embedding queries, satisfies the above-mentioned Lipschitzness and can provide provably better utility than user-level DP.
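The right-hand side of Theorem 1.1 is a maximum over pairs of domain elements, so it can be computed by direct enumeration. The sketch below does this for a toy query and checks the inequality on one pair of datasets. Everything specific here is an assumption made for illustration: the helper names, the choice of the $\ell_2$ norm, the three-point domain on a line with $d_{\mathcal{X}}(x,x') = |x - x'|$, and the query columns $V[x] = (x, 1-x)$. The one-dimensional earth-mover’s distance is computed with the standard cumulative-distribution formula.

```python
import math
from itertools import combinations


def l2(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def lipschitz_bound(columns, dist):
    """max over distinct x, x' of ||V[x] - V[x']|| / d_X(x, x')."""
    return max(
        l2(columns[x], columns[y]) / dist(x, y)
        for x, y in combinations(columns, 2)
    )


def apply_query(columns, hist):
    """Compute V @ K~ for a normalized histogram hist (item -> mass)."""
    dim = len(next(iter(columns.values())))
    return [sum(hist.get(x, 0.0) * columns[x][j] for x in columns)
            for j in range(dim)]


def emd_1d(p, q, support):
    """Earth-mover's distance on a line via the gap between the CDFs."""
    pts = sorted(support)
    total, cdf_gap = 0.0, 0.0
    for a, b in zip(pts, pts[1:]):
        cdf_gap += p.get(a, 0.0) - q.get(a, 0.0)
        total += abs(cdf_gap) * (b - a)
    return total


# Hypothetical domain and query: V[x] = (x, 1 - x).
domain = [0.0, 0.5, 1.0]
V = {x: [x, 1.0 - x] for x in domain}
bound = lipschitz_bound(V, lambda x, y: abs(x - y))

# Two normalized datasets, written as histograms over the domain.
K_tilde = {0.0: 0.5, 0.5: 0.5}
K_tilde_prime = {0.5: 0.5, 1.0: 0.5}
ratio = l2(apply_query(V, K_tilde), apply_query(V, K_tilde_prime)) / emd_1d(
    K_tilde, K_tilde_prime, domain
)
print(ratio, bound)
```

For this particular pair of datasets the ratio meets the bound, since every unit of mass moves in the same direction; other pairs can only give a smaller or equal ratio.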

Unordered Release of Item-wise Queries

We design a mechanism for performing item-wise queries on the entire dataset $K$. Our approach is to simply apply a private mechanism $\mathcal{A}$ to each item $k_i \in K$ and release the set of noisy outputs $\{\mathcal{A}(k_i)\}$ after shuffling them. Here $\mathcal{A}$ can be an arbitrary mechanism satisfying $(\alpha,0)$-$d_{\mathcal{X}}$ DP, which makes our mechanism completely general-purpose (see Sec. 4.2 for some concrete examples of $\mathcal{A}$). Here we consider the bounded data setting since the size of $K$ is revealed. The main technical novelty lies in providing a tight privacy analysis of the above mechanism. Specifically, prior work shows that the above mechanism satisfies bounded $(m\alpha,0)$-$d_{\textsf{EM}}$-DP (Fernandes et al., 2019) by using the interplay between couplings and privacy via composition. However, we show that composition is not the right tool for tight privacy analysis since it does not take into account that the output of our mechanism is an unordered list, i.e., the outputs $\mathcal{A}(k_i)$ are released in a random order. Instead, we generalize a tight result from privacy amplification by shuffling (Feldman et al., 2022) to metric DP.

Theorem 1.2.

(Informal version of Thm. 4.3) Suppose that $\mathcal{A}: \mathcal{X} \rightarrow \mathcal{Y}$ is an $\alpha$-$d_{\mathcal{X}}$ DP algorithm with respect to $d_{\mathcal{X}}$. Let $(x_1, \ldots, x_m) \in \mathcal{X}^{m}$ be a dataset. Then, releasing $\mathsf{Shuffle}(\mathcal{A}(x_1), \ldots, \mathcal{A}(x_m))$ satisfies $(O(\alpha\sqrt{m e^{\alpha} \ln(m/\delta)}), \delta e^{\alpha})$-$d_{\textsf{EM}}$ DP.

This analysis reduces the cost of releasing $m$ points in the multiset from $m\alpha$ to $\sqrt{m}\alpha$, allowing for better utility. We keep the analysis general: we consider releasing the shuffled multiset of any black-box mechanism $\mathcal{A}$ that satisfies metric DP in the data domain $\mathcal{X}$, applied to each data point. Consequently, this result has broader applications to the shuffle model of privacy, and may be of independent interest.
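The mechanism analyzed above is simple to state in code: apply $\mathcal{A}$ to each item independently and release the outputs in a uniformly random order. The sketch below is a hedged illustration, not the paper’s implementation. It assumes a real-valued domain with $d_{\mathcal{X}}(x,x') = |x - x'|$ and instantiates $\mathcal{A}$ with Laplace noise of scale $1/\alpha$, a standard choice that satisfies $\alpha$-$d_{\mathcal{X}}$ DP for this metric; the function names are hypothetical.

```python
import random


def laplace_item_mechanism(x, alpha, rng):
    """Assumed item-level mechanism A: add Laplace noise of scale 1/alpha.
    The difference of two independent Exp(alpha) draws is Laplace(0, 1/alpha)."""
    return x + rng.expovariate(alpha) - rng.expovariate(alpha)


def shuffled_release(K, alpha, rng):
    """Apply A to every item of K, then release the outputs in random order."""
    outputs = [laplace_item_mechanism(x, alpha, rng) for x in K]
    rng.shuffle(outputs)  # in-place uniform shuffle hides which item produced which output
    return outputs


rng = random.Random(0)  # seeded only to make the illustration reproducible
K = [0.1, 0.1, 0.5, 0.9]
released = shuffled_release(K, alpha=2.0, rng=rng)
print(released)
```

In a real deployment the randomness would come from a cryptographically secure source rather than a seeded generator, and the released list reveals $m$, which is why the bounded setting is assumed.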

1.1.2. Extending $d_{\textsf{EM}}$-DP to the Unbounded Setting

We start our mechanism designs by considering the bounded data settings in both the local and central models of privacy (see Table 1), as this enables easier privacy analysis (Sec. 4). However, the bounded setting might be restrictive in practice as it cannot support use cases where users have different amounts of data, or where the data sizes are also private. To this end, we extend $d_{\textsf{EM}}$-DP to the more general unbounded setting. We show that when user data is relatively homogeneous (such as when it is i.i.d.), the privacy analysis of the unbounded setting may be reduced to the bounded setting.

Specifically, in Sec. 5 we create a black-box projection mechanism that projects any unbounded dataset onto a dataset in which each user contributes a fixed, predefined amount of data. This enables running any bounded $d_{\textsf{EM}}$-DP mechanism on the projected data. Our projection mechanism samples a fixed number of items with replacement from each user's dataset. The privacy analysis follows by showing that the $d_{\textsf{EM}}$ between any two datasets remains relatively unchanged by sampling, up to a small additive factor determined by a Chernoff bound.
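The sampling step of the projection can be sketched as below. This is only an illustration of drawing a fixed number of items with replacement from each user. The function names `project_user` and `project_all` are hypothetical, and the handling of empty datasets is an assumption not taken from the paper.

```python
import random

def project_user(dataset, s, rng):
    # Draw s items i.i.d. (with replacement) from one user's dataset, so the
    # projected dataset has size exactly s whatever the original size was.
    if not dataset:
        raise ValueError("cannot sample from an empty dataset")
    return [rng.choice(dataset) for _ in range(s)]

def project_all(user_datasets, s, rng):
    # Project every user's dataset independently onto size s.
    return [project_user(K, s, rng) for K in user_datasets]

if __name__ == "__main__":
    rng = random.Random(1)
    users = [[0.1, 0.2], [0.5], [0.3, 0.3, 0.9, 0.7, 0.1]]
    for projected in project_all(users, s=4, rng=rng):
        print(projected)
```

After projection every user holds exactly `s` items, so any mechanism designed for the bounded setting can be run on the result.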

One caveat is that the introduced additive factor necessitates a slight adjustment to the privacy semantics of $d_{\textsf{EM}}$-DP. Instead of protecting any change of $d_{\textsf{EM}}$ distance $d$ with a privacy parameter $d\alpha$, we consider a small threshold $r$ such that all changes smaller than $r$ are protected with a uniform parameter $r\alpha$. In essence, this privacy guarantee provides $d_{\textsf{EM}}$-DP at the granularity of units of $d_{\textsf{EM}}$ distance $r$. We refer to this notion as discrete user-level $d_{\textsf{EM}}$-DP (Def. 5.1) and have the following result:

Theorem 1.3.

(Informal version of Thm. 5.3) Suppose that for $n$ users, $\mathcal{M}$ is a mechanism which satisfies $(\alpha,\delta)$-bounded $d_{\textsf{EM}}$-DP. The algorithm which, given arbitrary user datasets $K_1,\ldots,K_n$, takes $s$ i.i.d. samples from each $K_i$ and then applies $\mathcal{M}$ on each of the sampled data items, satisfies $(\alpha r,\delta,r)$-discrete $d_{\textsf{EM}}$-DP for all $r\geq\frac{2\ln(1/\delta)}{s}$.

The two notions of privacy are nearly equivalent for small $r$, showing that unbounded $d_{\textsf{EM}}$-DP can be reduced to bounded $d_{\textsf{EM}}$-DP with an almost exact translation of the privacy guarantee.
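To make the condition $r\geq\frac{2\ln(1/\delta)}{s}$ concrete, the following sketch evaluates the smallest admissible threshold for a given sample count, and conversely the number of samples needed for a target threshold. The helper names are hypothetical, and the formula is taken directly from the informal theorem statement, without the constants or side conditions of the formal version.

```python
import math

def min_threshold(delta, s):
    # Smallest r allowed by the informal theorem: r >= 2 ln(1/delta) / s.
    return 2.0 * math.log(1.0 / delta) / s

def samples_needed(r, delta):
    # Smallest integer s for which the target threshold r is admissible.
    return math.ceil(2.0 * math.log(1.0 / delta) / r)

def discrete_privacy_parameter(alpha, r):
    # Changes of d_EM distance below r are protected with parameter alpha * r.
    return alpha * r

if __name__ == "__main__":
    delta = 1e-6
    print(min_threshold(delta, s=1000))
    print(samples_needed(0.05, delta))
    print(discrete_privacy_parameter(alpha=10.0, r=0.05))
```

The threshold shrinks as $1/s$, so taking more samples per user gives a finer granularity $r$.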

1.1.3. Demonstrating Improvements Over User-level DP

We compare the privacy and utility of our proposed $d_{\textsf{EM}}$-DP mechanisms with baseline mechanisms satisfying user-level DP. Specifically, in Sec. 6.1, we study a special type of linear query called linear embedding queries, and in Sec. 6.2, we study the problem of private frequency estimation. For simplicity, we consider the bounded data setting.

Let us start by understanding the relationship between $(\alpha,\delta)$-$d_{\textsf{EM}}$-DP and $(\varepsilon,\delta)$-user-level DP. The following observations hold in both the central and local models:

  • $\alpha=\varepsilon$: Since we assume $d_{\mathcal{X}}$ is normalized, we always have $d_{\textsf{EM}}\leq 1$. Thus, in this case $(\varepsilon,\delta)$-$d_{\textsf{EM}}$-DP implies $(\varepsilon,\delta)$-user-level DP. However, for any pair of inputs $K,K'$ such that $d_{\textsf{EM}}(\tilde{K},\tilde{K}')<1$, the privacy protection of $d_{\textsf{EM}}$-DP is actually stronger. (Footnote 3: If $\alpha\leq\varepsilon$, then user-level DP is strictly weaker than $d_{\textsf{EM}}$-DP; the more appropriate baseline is to use $\alpha=\varepsilon$.)

  • $\alpha>\varepsilon$: In this case, some pairs of inputs (with a large $d_{\textsf{EM}}$ distance between them) are protected less strongly than they are under user-level DP. However, as indicated in our aforementioned real-life examples, input pairs with high $d_{\textsf{EM}}$ (i.e., dissimilar input pairs) are typically less sensitive.

Now, we interpret the theoretical error bounds for linear embedding queries. From Table 2(a), the error for releasing a $d$-dimensional linear embedding query under user-level DP is $O(\frac{d}{\varepsilon n})$, while it is $O(\frac{d}{\alpha n})$ for $d_{\textsf{EM}}$-DP. When $\alpha=\varepsilon$, these utilities are identical, but $d_{\textsf{EM}}$-DP offers stronger privacy. When $\alpha>\varepsilon$, the utility of $d_{\textsf{EM}}$-DP is higher than that of user-level DP, with the two guarantees offering differing privacy semantics. Thus, in both cases, there is a clear benefit to using $d_{\textsf{EM}}$-DP. These observations are the same in the local model.

Finally, for frequency estimation in the local model, Table 2(b) shows that the error of user-level DP is $O\left(\sqrt{\frac{k^{2}\ln(m/\delta)}{n\varepsilon^{2}}}\right)$, while it is $O\left(\sqrt{\frac{k^{3}}{n\alpha^{2}}\max\left\{\ln(\frac{m}{\delta}),\alpha\right\}}\right)$ for $d_{\textsf{EM}}$-DP. For constant $\varepsilon$ and $\alpha\geq\varepsilon^{2}\frac{k}{\ln(m/\delta)}$, the utility is improved. In the central model, the error of the user-level DP algorithm is $O(\frac{k}{n\varepsilon})$ while it is $O\left(\frac{k^{3/2}}{n\alpha}\sqrt{\max\left\{\ln(\frac{m}{\delta}),\alpha\right\}}\right)$ for $d_{\textsf{EM}}$-DP. The algorithm under $d_{\textsf{EM}}$-DP has the added benefit that it can be implemented in the shuffle model of privacy, which requires less trust and parallels prior work in the shuffle model (Feldman et al., 2022). There is a utility improvement for $\alpha\geq\varepsilon^{2}k$. When $\varepsilon\leq\alpha\leq\varepsilon^{2}k$, we leave it as an interesting open problem whether $d_{\textsf{EM}}$-DP can offer utility improvements over user-level DP.
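The two thresholds quoted above can be recovered by comparing the bounds directly. The short derivation below is a sketch under one assumption that is ours rather than the paper's: it treats the regime $\alpha\geq\ln(m/\delta)$, where the maximum is attained by $\alpha$, and it ignores the constants hidden in the $O(\cdot)$ notation.

```latex
% Central model, assuming \max\{\ln(m/\delta), \alpha\} = \alpha:
\frac{k^{3/2}}{n\alpha}\sqrt{\alpha} \;=\; \frac{k^{3/2}}{n\sqrt{\alpha}}
\;\le\; \frac{k}{n\varepsilon}
\quad\Longleftrightarrow\quad
\sqrt{\alpha} \;\ge\; \varepsilon\sqrt{k}
\quad\Longleftrightarrow\quad
\alpha \;\ge\; \varepsilon^{2}k .

% Local model, under the same assumption:
\sqrt{\frac{k^{3}}{n\alpha^{2}}\,\alpha} \;=\; \sqrt{\frac{k^{3}}{n\alpha}}
\;\le\; \sqrt{\frac{k^{2}\ln(m/\delta)}{n\varepsilon^{2}}}
\quad\Longleftrightarrow\quad
\frac{k}{\alpha} \;\le\; \frac{\ln(m/\delta)}{\varepsilon^{2}}
\quad\Longleftrightarrow\quad
\alpha \;\ge\; \varepsilon^{2}\,\frac{k}{\ln(m/\delta)} .
```

Both conditions match the thresholds stated in the text; in the complementary regime $\alpha<\ln(m/\delta)$ the comparison takes a different form and is not covered by this sketch.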

Linear Embedding Queries

| Algorithm | Privacy Guarantee | Privacy Model | $\ell_2$ Error | Notes |
|---|---|---|---|---|
| Laplace Mechanism | $(\varepsilon,0)$-user-level DP | Central, Bounded | $O(\frac{d}{\varepsilon n})$ (Lemma 6.3) | $d_{\textsf{EM}}$-DP gives the same utility but stronger privacy for $\alpha=\varepsilon$; $d_{\textsf{EM}}$-DP gives better utility but different privacy for $\alpha>\varepsilon$. |
| PrivEMDLinear | $(\alpha,\delta)$-$d_{\textsf{EM}}$-DP | Central, Bounded | $O(\frac{d}{\alpha n}\sqrt{\ln\frac{1}{\delta}})$ (Lemma 6.2) | |

(a) Comparison of $d_{\textsf{EM}}$-DP to user-level DP in the central model for releasing a $d$-dimensional linear embedding query. The errors in the local model are a factor $\sqrt{n}$ higher.

Frequency Estimation

| Algorithm | Privacy Guarantee | Privacy Model | $d_{\textsf{EM}}$ Error | Notes |
|---|---|---|---|---|
| Hadamard Response | $(\varepsilon,0)$-user-level DP | Local, Bounded | $O\left(\sqrt{\frac{k^{2}\ln(m/\delta)}{n\varepsilon^{2}}}\right)$ (Lemma 6.4) | Assuming $k,\varepsilon,\alpha\leq\sqrt{m}$; $d_{\textsf{EM}}$-DP gives better utility for $\alpha\geq\varepsilon^{2}\frac{k}{\ln(m/\delta)}$. |
| PrivEMDItemWise | $(\alpha,\delta)$-$d_{\textsf{EM}}$-DP | Local, Bounded | $O\left(\sqrt{\frac{k^{3}}{n\alpha^{2}}\max\left\{\ln(\frac{m}{\delta}),\alpha\right\}}\right)$ (Thm. 6.6) | |
| Laplace Mechanism | $(\varepsilon,0)$-user-level DP | Central, Bounded | $O\left(\frac{k}{n\varepsilon}\right)$ (Lemma 6.7) | Assuming $n<\frac{m}{\alpha}$; $d_{\textsf{EM}}$-DP gives better utility when $\alpha>\varepsilon^{2}k$. |
| PrivEMDItemWise | $(\alpha,\delta)$-$d_{\textsf{EM}}$-DP | Central, Bounded (see footnote 4) | $O\left(\frac{k^{3/2}}{n\alpha}\sqrt{\max\left\{\ln(\frac{m}{\delta}),\alpha\right\}}\right)$ (Corollary 6.8) | |

Footnote 4: This algorithm works in the shuffle model, which requires less trust than the central model.

(b) Comparison of $d_{\textsf{EM}}$-DP to user-level DP for frequency estimation in the setting defined in Sec. 6.2. $k$ is the domain size $|\mathcal{X}|$.

Table 2. Summary of theoretical utility guarantees, assuming there are $n$ users who hold datasets of size $m$.

2. Background

2.1. Differential Privacy

Intuitively, DP is a property of a mechanism which ensures that its output distribution remains insensitive to changes in the data of a single individual. The standard DP guarantee, which is also known as item-level DP, considers each user $U_i$ to contribute only a single item $x_i\in\mathcal{X}$ to a global dataset, i.e., $K_i=x_i$. In this paper, we consider differential privacy at the user level. We start by considering the local model:

Definition 2.1 (Unbounded User-level Local DP (Acharya et al., 2023)).

We say a mechanism $\mathcal{M}$ acting on a dataset $K$ satisfies $(\varepsilon,\delta)$-unbounded user-level local DP if, for all $K,K'\in\mathcal{X}^{*}$ and all outputs $O$,

(1) $\Pr[\mathcal{M}(K)=O]\leq e^{\varepsilon}\Pr[\mathcal{M}(K')=O]+\delta.$

Note that here we consider the more general unbounded data setting where the two datasets $\{K,K'\}$ can have arbitrary sizes.

Next, we present the definition for the central model.

Definition 2.2 (Unbounded User-level Central DP (Liu et al., 2023)).

Let $K_G=K_1\cup\cdots\cup K_n$ denote a global dataset from $n$ users where $\forall i\in[n],K_i\in\mathcal{X}^{*}$. We say $K_G\sim K'_G$ if $K'_G$ can be obtained from $K_G$ by changing the dataset of a single user $U_i$ from $K_i$ to $K_i'$.
We say a mechanism $\mathcal{M}$ acting on a global dataset $K_G$ satisfies $(\varepsilon,\delta)$-unbounded user-level central DP if, for all $K_G,K_G'$ such that $K_G\sim K'_G$, and all outputs $O$,

(2) $\Pr[\mathcal{M}(K_G)=O]\leq e^{\varepsilon}\Pr[\mathcal{M}(K_G')=O]+\delta.$

Note that there is no restriction on the sizes of the datasets $\{K_i\},i\in[n]$ in the above definition.

Next, we define metric DP ($d_{\mathcal{X}}$-DP), which enables the privacy guarantee to depend on the $d_{\mathcal{X}}$ distance between the pair of inputs. We start by introducing it at the item level (so we consider changing an item $x\in\mathcal{X}$ to another item $x'\in\mathcal{X}$). For simplicity, we consider the local model, so the mechanism acts on just a single item:

Definition 2.3 (Local $d_{\mathcal{X}}$-DP (Alvim et al., 2018)).

We say $\mathcal{M}$ satisfies $(\alpha,\delta)$-local $d_{\mathcal{X}}$-DP if for all data elements $x,x'\in\mathcal{X}$, and all outputs $O$,

$\Pr[\mathcal{M}(x)=O]\leq e^{\alpha d_{\mathcal{X}}(x,x')}\Pr[\mathcal{M}(x')=O].$

We replace the traditional privacy parameter $\varepsilon$ with $\alpha$ in the above definition, because $\varepsilon$ in Definitions 2.1 and 2.2 is a unitless parameter while $\alpha$ has the inverse unit of $d_{\mathcal{X}}$.
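As a concrete instance of the item-level definition, the sketch below numerically checks the inequality for Laplace noise of scale $1/\alpha$ on the real line with $d_{\mathcal{X}}(x,x')=|x-x'|$. This choice of mechanism and metric is an assumption made for illustration only; the check compares output densities rather than probabilities, since the output is continuous, and the function names are hypothetical.

```python
import math

def laplace_density(y, x, alpha):
    # Density at y of x + Laplace noise with scale 1/alpha.
    return 0.5 * alpha * math.exp(-alpha * abs(y - x))

def satisfies_metric_dp(x, x_prime, alpha, outputs):
    # Check density(y | x) <= exp(alpha * |x - x'|) * density(y | x') on a grid
    # of outputs, with a tiny relative slack for floating-point rounding.
    bound = math.exp(alpha * abs(x - x_prime))
    return all(
        laplace_density(y, x, alpha)
        <= bound * laplace_density(y, x_prime, alpha) * (1 + 1e-12)
        for y in outputs
    )

if __name__ == "__main__":
    grid = [i / 100 for i in range(-300, 401)]
    print(satisfies_metric_dp(0.2, 0.7, alpha=3.0, outputs=grid))
```

The check succeeds because the ratio of the two densities is $e^{\alpha(|y-x'|-|y-x|)}$, which is at most $e^{\alpha|x-x'|}$ by the triangle inequality. Nearby inputs are therefore harder to distinguish than distant ones.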

2.2. Earth-Mover’s Distance

Notations. We denote the set of all possible datasets as $\mathcal{X}^{*}$. We will also view a dataset $K\in\mathcal{X}^{*}$ as a probability distribution defined by its normalized histogram $\tilde{K}$. To do so, let $\Delta^{\mathcal{X}}\subseteq\mathbb{R}^{\mathcal{X}}$ denote the probability simplex indexed by $\mathcal{X}$, i.e., the set of all vectors $\langle v_x\rangle_{x\in\mathcal{X}}$ such that $v_x\geq 0$ and $\sum_{x\in\mathcal{X}}v_x=1$. For a dataset $K$, $\tilde{K}\in\Delta^{\mathcal{X}}$ then denotes the probability distribution defined by $K$, meaning $\tilde{K}[x]=\frac{\text{number of occurrences of } x \text{ in } K}{|K|}$.
A natural way to extend the notion of distance from items in $\mathcal{X}$ to distributions in $\Delta^{\mathcal{X}}$ is to use the Earth-Mover's (or $1$-Wasserstein) distance (Givens and Shortt, 1984), which we now define. For a joint distribution $C(x_1,x_2)\in\Delta^{\mathcal{X}\times\mathcal{X}}$, let $C_{x_1}(x_2)$ denote the distribution conditioned on observing $x_1$, and let $C_1(x_1)$ denote the marginal distribution of $x_1$. We define $C_{x_2}(x_1)$ and $C_2(x_2)$ similarly.

Definition 2.4.

For distributions $P,Q\in\Delta^{\mathcal{X}}$, a joint distribution $C$ on $\mathcal{X}\times\mathcal{X}$ is a coupling between $P$ and $Q$ if $C_1=P$ and $C_2=Q$. We let $\mathcal{C}(P,Q)$ denote the set of couplings between $P$ and $Q$.

A coupling $C$ can be viewed as a “transportation plan” between $P$ and $Q$, in the sense that if $C$ places $m$ probability mass at a point $(x_1,x_2)$, then $m$ probability mass from $P$ at $x_1$ is transported to $Q$ at $x_2$ (or vice versa). We define the cost of a coupling as the expected transportation distance given by $\mathbb{E}_{(x,x')\sim C}\,d_{\mathcal{X}}(x,x')$. The earth-mover's distance ($d_{\textsf{EM}}$) between $P,Q$ is equal to the minimum possible cost of a coupling between $P$ and $Q$:

$d_{\textsf{EM}}(P,Q)=\inf_{C\in\mathcal{C}(P,Q)}\mathbb{E}_{(x,x')\sim C}\,d_{\mathcal{X}}(x,x').$

Since we assume that $d_{\mathcal{X}}$ is bounded by $1$, we have $d_{\textsf{EM}}(\cdot,\cdot)\leq 1$.
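The normalized histogram and the resulting distance can be illustrated with a small sketch. It assumes a one-dimensional domain, with items being real numbers in $[0,1]$ and $d_{\mathcal{X}}(x,x')=|x-x'|$, where the 1-Wasserstein distance equals the area between the two cumulative distribution functions. This special case is chosen only for illustration, and the function names are hypothetical.

```python
from collections import Counter

def normalized_histogram(dataset):
    # K~[x] = (number of occurrences of x in K) / |K|
    counts = Counter(dataset)
    size = len(dataset)
    return {x: c / size for x, c in counts.items()}

def emd_1d(K, K_prime):
    # On the real line with d(x, x') = |x - x'|, the earth-mover's distance
    # is the integral of |CDF_P - CDF_Q| over the domain.
    P = normalized_histogram(K)
    Q = normalized_histogram(K_prime)
    support = sorted(set(P) | set(Q))
    total = 0.0
    cdf_gap = 0.0
    for left, right in zip(support, support[1:]):
        cdf_gap += P.get(left, 0.0) - Q.get(left, 0.0)
        total += abs(cdf_gap) * (right - left)
    return total

if __name__ == "__main__":
    print(emd_1d([0.0, 0.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0]))
    print(emd_1d([0.2, 0.4], [0.2, 0.4, 0.2, 0.4]))
```

In the first call, a quarter of the mass moves across the whole unit interval. The second call compares datasets of different sizes with identical normalized histograms, so their distance is zero.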

Next, we present the Birkhoff-von Neumann Theorem, which is useful in our privacy analysis in Sec. 4.2. The theorem states that if both $P$ and $Q$ are empirical distributions with the same number of points, then the $d_{\textsf{EM}}$ between them is the cost of the coupling that moves the entire mass in each point to the same destination:

Lemma 2.1.

(Birkhoff-von Neumann Theorem (Konig, 2001); Lemma A.1 in (Fernandes et al., 2019)) For two datasets $K=\{x_1,\ldots,x_m\}$ and $K'=\{y_1,\ldots,y_m\}$, there is a permutation $\pi:[m]\rightarrow[m]$ such that

(3) $d_{\textsf{EM}}(\tilde{K},\tilde{K}')=\frac{1}{m}\sum_{i=1}^{m}d_{\mathcal{X}}(x_i,y_{\pi(i)}).$
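Equation (3) suggests a direct way to compute $d_{\textsf{EM}}$ between two equal-size datasets: minimize the average matched distance over permutations. The sketch below does this by brute force, which is feasible only for very small $m$; an assignment solver such as SciPy's `linear_sum_assignment` would be the practical choice. The function name and the example metric are assumptions made for illustration.

```python
from itertools import permutations

def emd_by_matching(K, K_prime, dist):
    # Minimize (1/m) * sum_i dist(x_i, y_pi(i)) over all permutations pi.
    m = len(K)
    if m == 0 or m != len(K_prime):
        raise ValueError("datasets must be non-empty and of equal size")
    best = min(
        sum(dist(x, K_prime[j]) for x, j in zip(K, perm))
        for perm in permutations(range(m))
    )
    return best / m

if __name__ == "__main__":
    absolute = lambda a, b: abs(a - b)
    print(emd_by_matching([0.0, 0.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0], absolute))
    print(emd_by_matching([0.0, 1.0], [1.0, 0.0], absolute))
```

The first example requires moving a single item across distance 1, giving an average cost of $1/4$. The second pair is the same multiset in a different order, so the optimal permutation yields zero.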

3. Definition of $d_{\textsf{EM}}$-DP

In this section, we introduce our generalization of metric DP to the user level. We start with the local model. We use the $d_{\textsf{EM}}$ metric to measure the distance between two datasets $K,K'$ since it captures the intuition that changes which move smaller amounts of data by smaller distances are more sensitive (as discussed in Sec. 1).

Definition 3.1 ((Un)Bounded Local $d_{\textsf{EM}}$-DP).

Let $\mathcal{M}$ be a mechanism which acts on a dataset $K$. We say $\mathcal{M}$ satisfies $(\alpha,\delta)$-bounded local $d_{\textsf{EM}}$-DP if for any two datasets $K,K'$ such that $|K|=|K'|$, and for any output $O$, we have

(4) $\Pr[\mathcal{M}(K)=O]\leq e^{\alpha d_{\textsf{EM}}(\tilde{K},\tilde{K}')}\Pr[\mathcal{M}(K')=O]+\delta.$

If the above inequality holds for all datasets $K,K'$, regardless of whether $|K|=|K'|$, we say that $\mathcal{M}$ satisfies $(\alpha,\delta)$-unbounded local $d_{\textsf{EM}}$-DP.

For bounded $d_{\textsf{EM}}$-DP, the size of the dataset is not protected, which is acceptable for applications where the amount of data is not sensitive. We explicitly differentiate between bounded and unbounded data since privacy analysis is easier under bounded $d_{\textsf{EM}}$-DP by leveraging Lemma 2.1 (see Section 4).
In the central model, our goal is to protect changes in a single user's dataset, transitioning from $K_i$ to $K'_i$, with a privacy guarantee that depends on $d_{\textsf{EM}}(\tilde{K}_i,\tilde{K}'_i)$. We consider the bounded data setting where each dataset $K_i$ has a publicly known fixed size $m$.

Definition 3.2 (Bounded Central $d_{\textsf{EM}}$-DP).

Let $K_G=K_1\cup\cdots\cup K_n$ denote a global dataset from $n$ users where $\forall i\in[n], K_i\in\mathcal{X}^m$. We say $K_G\sim K'_G$ if $K'_G$ can be obtained from $K_G$ by changing the dataset of a single user $U_i$ from $K_i$ to $K_i'$.
We say a mechanism $\mathcal{M}$ satisfies $(\alpha,\delta)$-bounded $d_{\textsf{EM}}$-DP if, for all $K_G, K_G'$ such that $K_G\sim K'_G$, and all outputs $O$, we have

$\Pr[\mathcal{M}(K_G)=O]\leq e^{\alpha d_{\textsf{EM}}(\tilde{K}_i,\tilde{K}_i')}\Pr[\mathcal{M}(K_G')=O]+\delta.$

In the above definition, the two global datasets $K_G, K_G'$ are indistinguishable with a privacy parameter $\alpha d_{\textsf{EM}}(\tilde{K}_i,\tilde{K}_i')$. Since we consider the bounded data setting, neither the total number of users, $n$, nor the size of the individual datasets, $m$, is protected.

It is important to note that the above definition cannot be directly translated to the unbounded data setting. This limitation arises from the fact that if each $K_i$ is allowed to have an arbitrary size, then changing a single $K_i$ could potentially change the entirety of $K_G$ in the worst case (where user $U_i$ contributes the entire global dataset). This essentially reduces the central model (Def. 3.2) to the local model (Def. 3.1). We circumvent this challenge and provide a privacy definition for the unbounded data setting in Sec. 5, by controlling the amount of data from each user.

Setting the Privacy Parameters. There are some semantic differences between the parameter $\alpha$ in Definitions 3.1 and 3.2, and $\varepsilon$ in Definitions 2.1 and 2.2. The privacy parameter $\varepsilon$ is unitless. On the other hand, $\alpha$ is not unitless: it has a unit inversely proportional to $d_{\textsf{EM}}$. While $\varepsilon\gg 1$ is usually not considered acceptable for standard DP, it is not unreasonable to set $\alpha\gg 1$ in our case. This is acceptable if a strong privacy guarantee is needed only for input pairs that are close to each other, since $d_{\textsf{EM}}(\cdot,\cdot)<1$. For all $q,\tau\in[0,1]$, let $\mathcal{E}(q,\tau)$ refer to the minimum privacy parameter that is acceptable over all data changes of the form

A $\tau$-fraction of $K$ is changed by average distance $q$.

Then, $\alpha$ may be set as $\alpha=\inf_{q,\tau\in[0,1]}\tfrac{\mathcal{E}(q,\tau)}{q\tau}$, and we can verify that Definition 3.1 will protect an input pair with the corresponding budget $\mathcal{E}(q,\tau)$. The parameter $\delta$ has the same interpretation as in standard DP, and should be set so that $\delta\ll\frac{1}{\mathrm{poly}(n)}$.

Concrete Example. Throughout this paper, we consider a dataset of $n=10^5$ users, each of whom contributes $m=10^3$ location data points over the period of a month. We use the length of the shortest path on Earth's surface as our metric $d_{\mathcal{X}}$. Suppose we want to protect a user's location over any particular day within a radius of $1000$ miles, and the user's location over the entire time period within a distance of $100$ miles. In the normalized metric space, these distances are $q_1=0.08$ and $q_2=0.008$, respectively (the maximum surface distance between two points on Earth is $\approx 12930$ miles). They correspond to a fraction $\tau_1=\frac{1}{30}$ and $\tau_2=1$ of the user's dataset changing, respectively. Suppose we want to protect both of these inputs with privacy parameter $\varepsilon=0.2$. Hence, we set $\alpha=\min\left\{\frac{\varepsilon}{\tau_1 q_1},\frac{\varepsilon}{\tau_2 q_2}\right\}=25$.
This value is much higher than typical privacy parameters used in DP, and yet it is able to adequately protect the desired inputs. Finally, we will set $\delta=10^{-12}$.
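The choice of $\alpha$ in this example can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the function name `metric_privacy_scale` and the tuple layout are assumptions introduced here, and the inputs are the rounded values $q_1$, $q_2$, $\tau_1$, $\tau_2$, $\varepsilon$ quoted above.

```python
def metric_privacy_scale(requirements):
    """Return the largest alpha compatible with every requirement.

    Each requirement is a tuple (budget, q, tau): a change of a tau-fraction
    of the data by average normalized distance q should be protected with a
    privacy parameter of at most `budget`, i.e. alpha * q * tau <= budget.
    (Hypothetical helper, not part of the original text.)
    """
    return min(budget / (q * tau) for budget, q, tau in requirements)


# Values taken from the concrete example (distances already normalized).
requirements = [
    (0.2, 0.08, 1 / 30),  # one day out of a month, within 1000 miles
    (0.2, 0.008, 1.0),    # the whole month, within 100 miles
]

alpha = metric_privacy_scale(requirements)
print(round(alpha, 6))
```

The first requirement alone would allow $\alpha = 75$; the second is the binding one, which is why the minimum is taken.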

4. Mechanisms for $d_{\textsf{EM}}$-DP

Now, we describe our mechanisms for releasing queries under $d_{\textsf{EM}}$-DP. Throughout this section, we focus on the bounded data setting, and consider both the local and central models. In Sec. 4.1, we show how to bound the sensitivity of linear queries, which can then be released with the addition of calibrated noise. Then, in Sec. 4.2, we show that we can release a noisy representation of $\tilde{K}$ under $d_{\textsf{EM}}$-DP by applying any $d_{\mathcal{X}}$-DP mechanism to each item in $K$, and aggregating the outputs.

4.1. Linear Queries

A non-adaptive linear query on a dataset $K$ computes the value of $F\tilde{K}$, where $F\in\mathbb{R}^{d\times|\mathcal{X}|}$ is a matrix with $d$ rows. The linearity comes from the linear transformation $F$; our linear queries are normalized since they operate on $\tilde{K}$ rather than $K$. Such normalized queries can be used for answering the fraction of users satisfying a predicate (Blum et al., 2013). Nevertheless, one can estimate the non-normalized query by multiplying by an estimate of $|K|$.

Let us represent $F$ by a function $f:\mathcal{X}\rightarrow\mathbb{R}^d$ where $f(x)=F[x]$, the $x$th column of $F$. The linear query can then be re-written as

(5) $q_f(K)=\mathbb{E}_{x\sim\tilde{K}}[f(x)].$

Thus, we may interpret a linear query on $\tilde{K}$ as the expected value of $f$ over a random item from $K$. Linear queries are simple but capable of expressing many indispensable tools in data analysis, and they are well-studied in differential privacy (Blum et al., 2013; Hardt and Talwar, 2010; Dwork et al., 2014). We will design a simple mechanism satisfying $d_{\textsf{EM}}$-DP for releasing a linear query, based on bounding the sensitivity of $q_f$ under $d_{\textsf{EM}}$. The sensitivity measures the maximum change in the output of $q_f$, measured according to some norm $\|\cdot\|$ on $\mathbb{R}^d$, relative to a change in the inputs by a certain $d_{\textsf{EM}}$. This is given by:

$\Delta_{\mathsf{EM}}(q_f)=\max_{K,K'\in\mathcal{X}^*}\frac{\|q_f(K)-q_f(K')\|}{d_{\textsf{EM}}(\tilde{K},\tilde{K}')}.$

Naively, it is intractable to compute this sensitivity since there are exponentially many datasets of a given size. Additionally, this sensitivity might not always be bounded. For instance, consider two points $x,x'$ that are close in $\mathcal{X}$, but for which $f(x)$ is very far from $f(x')$. In this case, we cannot bound $\Delta_{\mathsf{EM}}$, since the datasets $K$ and $K'$ which put all their mass on $x$ and $x'$, respectively, satisfy $d_{\textsf{EM}}(\tilde{K},\tilde{K}')=d_{\mathcal{X}}(x,x')$ and therefore have $\frac{\|q_f(K)-q_f(K')\|}{d_{\textsf{EM}}(\tilde{K},\tilde{K}')}=\frac{\|f(x)-f(x')\|}{d_{\mathcal{X}}(x,x')}$, which is large. However, if $f$ is $\ell$-Lipschitz, meaning

$\max_{x,x'\in\mathcal{X}}\frac{\|f(x)-f(x')\|}{d_{\mathcal{X}}(x,x')}\leq\ell,$

then it is possible to bound $\Delta_{\mathsf{EM}}(q_f)$ using $\ell$. We do this by observing that, for any coupling between $\tilde{K},\tilde{K}'$, each mass that moves a distance $d$ may change $q_f$ by up to $\ell d$ (based on Eq. (5)). This allows us to compute the following bound.

Theorem 4.1.

Let $q_f(K)$ be a linear query of the form in (5), where $f:\mathcal{X}\rightarrow\mathbb{R}^d$ is $\ell$-Lipschitz. Then, we have $\Delta_{\mathsf{EM}}(q_f)\leq\ell$.

Remarks. The above theorem implies that a reasonable upper bound for $\Delta_{\mathsf{EM}}(q_f)$ can be obtained when the query function $f$ is smooth with respect to $d_{\mathcal{X}}$. In Sec. 6.1 we outline a special type of linear query for which this is the case. Additionally, the aforementioned example illustrates that this sensitivity analysis is tight. This means that $d_{\mathcal{X}}$, in addition to defining the privacy semantics, also influences the types of queries that can be answered with good utility.

Proof Sketch. Let $C$ be the minimum-cost coupling between $\tilde{K},\tilde{K}'$. The quantity $\|\mathbb{E}_{x\sim\tilde{K}}[f(x)]-\mathbb{E}_{x\sim\tilde{K}'}[f(x)]\|$ can then be bounded by transporting $\tilde{K}'$ onto $\tilde{K}$ according to $C$ and adding up the amount by which $f$ can change when each mass is transported, which is at most $\ell d_{\textsf{EM}}(\tilde{K},\tilde{K}')$. ∎
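The bound of Theorem 4.1 can be sanity-checked numerically in a simple special case. The sketch below assumes $\mathcal{X}=[0,1]$ with $d_{\mathcal{X}}(x,x')=|x-x'|$ and two datasets of equal size, where the earth-mover's distance between the empirical distributions reduces to the mean absolute difference of the sorted values. The function names are hypothetical, and a randomized check of this kind illustrates the inequality without proving it.

```python
import random


def emd_1d(K, K_prime):
    """Earth-mover's distance between two equal-size datasets on the real line.

    In one dimension with d(x, y) = |x - y|, the optimal coupling matches
    the sorted points in order.
    """
    assert len(K) == len(K_prime)
    pairs = zip(sorted(K), sorted(K_prime))
    return sum(abs(a - b) for a, b in pairs) / len(K)


def linear_query(f, K):
    """Normalized scalar linear query q_f(K): the mean of f over the items of K."""
    return sum(f(x) for x in K) / len(K)


def check_sensitivity(f, lipschitz, trials=1000, m=50, seed=0):
    """Check |q_f(K) - q_f(K')| <= lipschitz * d_EM on random dataset pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        K = [rng.random() for _ in range(m)]
        K_prime = [rng.random() for _ in range(m)]
        change = abs(linear_query(f, K) - linear_query(f, K_prime))
        if change > lipschitz * emd_1d(K, K_prime) + 1e-12:
            return False
    return True


# f(x) = |x - 0.3| is 1-Lipschitz with respect to |x - x'|.
holds = check_sensitivity(lambda x: abs(x - 0.3), lipschitz=1.0)
print(holds)
```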

Using the upper bound on $\Delta_{\mathsf{EM}}(q_f)$, we follow a well-known approach for privately releasing a point with known sensitivity under a norm: Sample a point $U$ uniformly from the unit sphere $\{x\in\mathbb{R}^d:\|x\|=1\}$, and release $q_f+\ell gU$, where $g\sim\Gamma(d,\frac{\omega}{\alpha})$, the Gamma distribution with shape $d$ and scale $\frac{\omega}{\alpha}$ (Hardt and Talwar, 2010). Here, $\omega$ is a scale parameter that may be different in the central or local model, since the sensitivity of $f$ is less in the bounded central model. This mechanism, PrivEMDLinear, is outlined in Alg. 1. Combining Thm. 4.1 with a standard privacy analysis, we can show that PrivEMDLinear satisfies $(\alpha,0)$-$d_{\textsf{EM}}$ DP.

Lemma 4.2.

PrivEMDLinear (Alg. 1) with scale $\omega=\frac{1}{\alpha}$ satisfies $(\alpha,0)$-unbounded local $d_{\textsf{EM}}$-DP and with scale $\omega=\frac{1}{\alpha n}$ satisfies $(\alpha,0)$-bounded central $d_{\textsf{EM}}$-DP.

Remarks. When using the $1$-norm, PrivEMDLinear becomes the multidimensional Laplace mechanism. We may instantiate PrivEMDLinear with any noise mechanism that preserves $\alpha\|\cdot\|_p$-metric DP in the space $\mathbb{R}^d$. In particular, under the $2$-norm, we can add Gaussian noise of width $\frac{\omega\sqrt{1.25\ln(1/\delta)}}{\alpha}$ (Dwork et al., 2014), and this will give better utility than adding noise based on the Gamma distribution at the cost of $(\alpha,\delta)$-bounded local $d_{\textsf{EM}}$ DP.

Concrete Example. In our location example, consider releasing the average distance of each point in $K$ from a particular city in the local model. This can be expressed with $f(x)=d_{\mathcal{X}}(x,c)$, where $c$ is the city; by the triangle inequality this is $1$-Lipschitz. PrivEMDLinear could then be applied to release $q_f(K_i)$ plus noise of expected magnitude $\frac{\ell}{\alpha}=0.04$ per user; the total noise will be $\frac{0.04}{\sqrt{n}}=1.26\times 10^{-4}$, corresponding to an error of just $1.6$ miles.
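The numbers in this example follow from the running parameters $\alpha=25$, $\ell=1$, $n=10^5$ and the normalization constant of $12930$ miles. The short calculation below retraces them; the variable names are introduced here only for illustration, and the $1/\sqrt{n}$ scaling is taken from the text as stated.

```python
import math

alpha = 25.0          # privacy scale from the running example
lipschitz = 1.0       # f(x) = d_X(x, c) is 1-Lipschitz
n = 10**5             # number of users
max_miles = 12930.0   # normalization constant quoted in the text

per_user_noise = lipschitz / alpha               # expected noise magnitude per user
aggregate_noise = per_user_noise / math.sqrt(n)  # noise after averaging over n users
error_miles = aggregate_noise * max_miles        # back in unnormalized units

print(per_user_noise, aggregate_noise, error_miles)
```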

Data: $q_f$ – A $d$-dimensional linear query; $\ell$ – Upper bound of the Lipschitz constant of $f$; $K$ – Input dataset; $\omega$ – scale parameter
Result: An estimate of $q_f(K)$
Sample $U$ uniformly from $\{x\in\mathbb{R}^d:\|x\|=1\}$;
Sample $g\in\mathbb{R}$ from $\Gamma(d,\omega)$;
return $\hat{q}=q_f(\tilde{K})+\ell gU$;
Algorithm 1 PrivEMDLinear, an algorithm for releasing linear queries under bounded $d_{\textsf{EM}}$-DP.
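A minimal Python sketch of Alg. 1 follows, under the assumption that $\|\cdot\|$ is the Euclidean norm, so that a uniform direction on the unit sphere can be drawn by normalizing a standard Gaussian vector. The function name `priv_emd_linear` and its signature are hypothetical; the Gamma parameters follow Alg. 1 as written (shape $d$, scale $\omega$), and the code is meant as an illustration rather than a vetted private implementation (for example, it ignores floating-point issues).

```python
import math
import random


def priv_emd_linear(f, lipschitz, K, omega, rng=random):
    """Sketch of PrivEMDLinear for the Euclidean norm (an assumption).

    f maps an item to a length-d tuple, lipschitz is the bound ell on the
    Lipschitz constant of f, K is the input dataset, omega the Gamma scale.
    """
    values = [f(x) for x in K]
    d = len(values[0])
    # Normalized linear query q_f(K): coordinate-wise mean of f over K.
    q = [sum(v[j] for v in values) / len(K) for j in range(d)]

    # Uniform direction U on the Euclidean unit sphere.
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in z))
    U = [c / norm for c in z]

    # Noise magnitude g ~ Gamma(shape=d, scale=omega), as in Alg. 1.
    g = rng.gammavariate(d, omega)
    return [q[j] + lipschitz * g * U[j] for j in range(d)]


# Usage: average distance to the point c = 0.3 on the unit interval.
rng = random.Random(0)
K = [rng.random() for _ in range(1000)]
estimate = priv_emd_linear(lambda x: (abs(x - 0.3),), 1.0, K, 1 / 25, rng)
print(estimate)
```

For other norms, only the sampling of $U$ would change; with the $1$-norm the remark above indicates that the mechanism coincides with the multidimensional Laplace mechanism.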

4.2. Unordered Release of Item-wise Queries

We now consider the problem of directly releasing a private query applied to each item in $K$. This can provide a more fine-grained result than the aforementioned linear queries, which output the average over all the items. We release the query results as an unordered list to take advantage of the fact that subsequent computation (such as aggregation) often does not depend on the ordering of the data (Feldman et al., 2022). Specifically, our second mechanism PrivEMDItemWise applies a mechanism $\mathcal{A}$, which satisfies $\alpha_0$-$d_{\mathcal{X}}$ DP, to each item individually. We use $\mathcal{A}$ as a black box, making PrivEMDItemWise completely general-purpose. For example, one could let $\mathcal{A}$ be a private item-release mechanism (a number of metric DP mechanisms for releasing items in specific applications are mentioned in Sec. 7) and use PrivEMDItemWise to form a histogram of the dataset. $\mathcal{A}$ could also be a classifier, and PrivEMDItemWise can then release a simplified representation of the dataset.

Once PrivEMDItemWise applies 𝒜𝒜\mathcal{A}caligraphic_A to each element in the dataset, it shuffles the results (to remove any ordering of the data) and outputs the shuffled list. This appears in Alg. 2, and a precursor to this algorithm appeared in (Fernandes et al., 2019).

Data: Dataset $K\in\mathcal{X}^m$, Mechanism $\mathcal{A}:\mathcal{X}\rightarrow\mathcal{Y}$ satisfying $(\alpha_0,0)$-$d_{\mathcal{X}}$ DP
Result: $L\in\mathcal{Y}^m$, unordered list (multiset) of item-wise queries from $K$
$L=\emptyset$;
for $i=1,\ldots,m$ do
       Add $\mathcal{A}(x_i)$ to $L$;
end for
$\textrm{Shuffle}(L)$;
return $L$
Algorithm 2 PrivEMDItemWise, a general mechanism for releasing item-wise queries from $K$ as an unordered list under bounded $d_{\textsf{EM}}$-DP
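Alg. 2 translates almost line by line into code. In the sketch below, the item mechanism $\mathcal{A}$ is passed in as a black-box callable. The example mechanism `laplace_item_mechanism` is an assumption made for illustration: it adds Laplace noise of scale $1/\alpha_0$ to a real-valued item, which satisfies $\alpha_0$-$d_{\mathcal{X}}$ DP when $d_{\mathcal{X}}(x,x')=|x-x'|$. Both function names are hypothetical.

```python
import random


def laplace_item_mechanism(x, alpha0, rng):
    """Example item mechanism A: x plus Laplace noise with scale 1/alpha0.

    The difference of two independent exponentials with rate alpha0 is
    Laplace-distributed with scale 1/alpha0.
    """
    return x + rng.expovariate(alpha0) - rng.expovariate(alpha0)


def priv_emd_itemwise(K, item_mechanism, rng):
    """Sketch of PrivEMDItemWise: apply A to every item, then shuffle."""
    L = [item_mechanism(x) for x in K]
    rng.shuffle(L)  # remove any information carried by the item order
    return L


# Usage: release noisy copies of m = 1000 points in [0, 1] as an unordered list.
rng = random.Random(0)
K = [rng.random() for _ in range(1000)]
released = priv_emd_itemwise(K, lambda x: laplace_item_mechanism(x, 0.5, rng), rng)
print(len(released))
```

The output has the same length as $K$, which is consistent with the mechanism targeting the bounded setting, where the dataset size is public.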

As PrivEMDItemWise does not hide the size of $K$, we show it satisfies bounded $d_{\textsf{EM}}$ DP. We use the following argument: for a neighboring dataset $K'=\{x_1',\ldots,x_m'\}$, by Lemma 2.1 there exists a permutation $\pi:[m]\rightarrow[m]$ satisfying (3). Observe that we release the query responses in an unordered fashion by explicitly shuffling them. This allows us to pair up the element $x_i$ with $x_{\pi(i)}'$ and analyze the privacy guarantee of releasing $\mathcal{A}(x_1),\ldots,\mathcal{A}(x_m)$ versus $\mathcal{A}(x_{\pi(1)}'),\ldots,\mathcal{A}(x_{\pi(m)}')$. Prior work does this with composition (Fernandes et al., 2019).

However, composition is not the right tool for obtaining a tight privacy analysis. The reason is that composition assumes that each $\mathcal{A}(x_i)$ is output sequentially, and in particular it is possible to identify which point came from $\mathcal{A}(x_i)$ and which came from $\mathcal{A}(x_{\pi(i)})$. In our case, we output an unordered list, and it is not possible to link which point came from an index $i$. Based on this observation, our key idea is to leverage privacy amplification by shuffling (Feldman et al., 2022) instead, which can yield a much smaller privacy parameter when the output is order invariant.

In particular, our core technical contribution is to generalize privacy amplification by shuffling to $d_{\mathcal{X}}$-DP. Specifically, we analyze the privacy guarantee between two multisets $\mathsf{Shuffle}(\mathcal{A}(x_1),\ldots,\mathcal{A}(x_m))$ and $\mathsf{Shuffle}(\mathcal{A}(x_1'),\ldots,\mathcal{A}(x_m'))$ when $\mathcal{A}$ satisfies $d_{\mathcal{X}}$-DP, in terms of the vector of distances $v=(d_{\mathcal{X}}(x_i,x_i'))_{i=1}^m$. The parameters we will be interested in are $\|v\|_0$, the number of nonzero elements in $v$, and $\|v\|_1$, since in our different privacy models we will be able to bound both. Formally:

Theorem 4.3.

Suppose that $(\mathcal{X},d_{\mathcal{X}})$ is a metric space such that $d_{\mathcal{X}}(\cdot,\cdot)\leq 1$, and that $\mathcal{A}$ is an $(\alpha_0,0)$-$d_{\mathcal{X}}$-DP algorithm. Let $(x_1,\ldots,x_m)$ and $(x_1',\ldots,x_m')$ be two vectors, and define $v=(d_{\mathcal{X}}(x_i,x_i'))_{i=1}^m$. Let $0<\delta<1$ be a constant, and suppose it holds that $\alpha_0<\ln\left(\frac{m}{16\ln(4m/\delta)}\right)$. Then, for all outputs $O$, we have that

\[
\Pr[\mathsf{Shuffle}(\mathcal{A}(x_1),\ldots,\mathcal{A}(x_m))=O] \leq e^{\alpha}\Pr[\mathsf{Shuffle}(\mathcal{A}(x_1'),\ldots,\mathcal{A}(x_m'))=O]+\delta e^{\alpha},
\]

where

\[
\alpha\leq\|v\|_0\ln\left(1+\frac{\exp(\alpha_0\|v\|_1/\|v\|_0)-1}{\exp(\alpha_0\|v\|_1/\|v\|_0)+1}\left(\frac{8\sqrt{e^{\alpha_0}\ln(4\|v\|_0/\delta)}}{\sqrt{m}}+\frac{8e^{\alpha_0}}{m}\right)\right).
\]

Remarks. In particular, if $\alpha_0\leq\frac{\|v\|_0}{\|v\|_1}$, the above bound is roughly $\frac{\alpha_0\|v\|_1}{\sqrt{m}}$, which grows with just $\sqrt{m}$ (as $\|v\|_1\leq m$). The standard result is only applicable when $\mathcal{A}$ satisfies $\alpha$-local DP, and just $x_1$ is changed to $x_1'$ (since each user owns just one item). We generalize the state-of-the-art privacy amplification by shuffling analysis of Feldman et al. (Feldman et al., 2022) to $d_{\mathcal{X}}$-DP, and our result may be of independent interest. We state our result generally in terms of the norms $\|v\|_0$ and $\|v\|_1$ since we will be applying it with different known bounds on these quantities.
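To make the bound of Thm. 4.3 concrete, the following is a minimal numerical sketch that evaluates its right-hand side for a given distance vector. The function name `shuffle_alpha_bound` and its interface are assumptions made here for illustration and are not part of the paper; the code also uses the identity $(e^t-1)/(e^t+1)=\tanh(t/2)$.

```python
import math


def shuffle_alpha_bound(alpha0, delta, v):
    """Evaluate the upper bound on alpha stated in Thm. 4.3.

    v holds the per-item distances d_X(x_i, x_i'), each in [0, 1];
    its length is taken to be m. (Hypothetical helper, for illustration.)
    """
    m = len(v)
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    if any(d < 0 or d > 1 for d in v):
        raise ValueError("distances must lie in [0, 1]")
    # Precondition of the theorem: alpha0 < ln(m / (16 ln(4m / delta))).
    if alpha0 >= math.log(m / (16 * math.log(4 * m / delta))):
        raise ValueError("alpha0 violates the precondition of Thm. 4.3")

    v0 = sum(1 for d in v if d > 0)  # ||v||_0
    v1 = sum(v)                      # ||v||_1
    if v0 == 0:
        return 0.0                   # identical inputs: nothing to bound

    t = alpha0 * v1 / v0
    ratio = math.tanh(t / 2)         # equals (e^t - 1) / (e^t + 1)
    amplification = (
        8 * math.sqrt(math.exp(alpha0) * math.log(4 * v0 / delta) / m)
        + 8 * math.exp(alpha0) / m
    )
    return v0 * math.log1p(ratio * amplification)


# Example: 10 of m = 100000 items each move by distance 0.1.
m = 100_000
v = [0.1] * 10 + [0.0] * (m - 10)
bound = shuffle_alpha_bound(alpha0=0.5, delta=1e-6, v=v)
```

In this example the returned value can be compared against $\alpha_0\|v\|_1$, the parameter one would obtain by summing the per-item $d_{\mathcal{X}}$-DP guarantees without shuffling.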

Proof Sketch. We first generalize the analysis of amplification by shuffling to the datasets $(x_1,x_2,\ldots,x_m)$ and $(x_1',x_2,\ldots,x_m)$, where $d_{\mathcal{X}}(x_1,x_1')\leq w_1$. We show the resulting privacy parameter is given by $f(w_1)$ where

\[
f(w_1)=\ln\left(1+\frac{e^{\alpha_0 w_1}-1}{e^{\alpha_0 w_1}+1}\left(\frac{8\sqrt{e^{\alpha_0}\ln(4/\delta)}}{\sqrt{m}}+\frac{8e^{\alpha_0}}{m}\right)\right).
\]

Then, we apply group privacy $\|v\|_0$ times to show the general result holds with parameter $\sum_{i=1}^{\|v\|_0}f(w_i)$, where $w_i$ bounds the $i$-th nonzero distance. The function $f$ is concave so the worst-case amplification is simply $\|v\|_0 f\left(\frac{\|v\|_1}{\|v\|_0}\right)$. ∎
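The concavity step can be spelled out as a short derivation. Here $w_i$ denotes the $i$-th nonzero entry of $v$ and $f$ takes a distance as its argument, as in its definition above; this notation is an assumption of the sketch. Since $\sum_i w_i=\|v\|_1$, Jensen's inequality for the concave function $f$ gives:

```latex
\[
\sum_{i=1}^{\|v\|_0} f(w_i)
  \;=\; \|v\|_0 \cdot \frac{1}{\|v\|_0}\sum_{i=1}^{\|v\|_0} f(w_i)
  \;\leq\; \|v\|_0\, f\!\left(\frac{1}{\|v\|_0}\sum_{i=1}^{\|v\|_0} w_i\right)
  \;=\; \|v\|_0\, f\!\left(\frac{\|v\|_1}{\|v\|_0}\right).
\]
```

Equality holds when all nonzero distances are equal, which is why the worst case spreads the total distance $\|v\|_1$ evenly over the $\|v\|_0$ changed items.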

Comparison with Composition. Analyzing Thm. 4.3 using the state-of-the-art composition results (Kairouz et al., 2015) and $\alpha_0\leq 1$ gives us

\[
\alpha\leq O\left(\alpha_0\|v\|_2\sqrt{\ln\tfrac{1}{\delta}}\right).
\]

However, we cannot form a satisfying bound on the $2$-norm of $v$: it is only possible to say $\|v\|_2\leq\|v\|_1$, which is tight when e.g. $\|v\|_0=1$. The bound is thus missing the factor of $\frac{1}{\sqrt{m}}$, because composition here does not leverage the fact that all $m$ items are released in a random order.

Combining (3) and Thm. 4.3, we obtain an improved privacy guarantee for PrivEMDItemWise. We may state this guarantee in both the bounded local and central models. In the local model, recall that each user is applying PrivEMDItemWise to their data. In the central model, the central server applies PrivEMDItemWise to the entire dataset, and thus releases the frequencies of $mn$ itemwise queries.

Theorem 4.4.

For any $\delta\in(0,1)$, PrivEMDItemWise shown in Alg. 2 satisfies bounded local $(\alpha,\delta')$-$d_{\textsf{EM}}$ DP, where

\[
\alpha=\sup_{w\in[0,1]}\frac{h(m;m,mw)}{w}\quad\text{ and }\quad\delta'=\delta e^{h(m;m,m)},
\]

and

\[
h(m;x_0,x_1)=x_0\ln\left(1+\frac{\exp(\alpha_0 x_1/x_0)-1}{\exp(\alpha_0 x_1/x_0)+1}\left(\frac{8\sqrt{e^{\alpha_0}\ln(4x_0/\delta)}}{\sqrt{m}}+\frac{8e^{\alpha_0}}{m}\right)\right).
\]

Similarly, PrivEMDItemWise satisfies bounded central $(\alpha,\delta')$-$d_{\textsf{EM}}$ DP, where

\[
\alpha=\sup_{w\in[0,1]}\frac{h(mn;m,mw)}{w}\quad\text{ and }\quad\delta'=\delta e^{h(mn;m,m)}.
\]
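The parameters in Thm. 4.4 can be approximated numerically. The sketch below is an illustration only: the names `h` and `itemwise_guarantee` and the grid size are assumptions, and the supremum over $w$ is replaced by a maximum over a finite grid in $(0,1]$, so the returned $\alpha$ is a lower approximation of the true supremum. The code does not check the precondition on $\alpha_0$ from Thm. 4.3.

```python
import math


def h(total, x0, x1, alpha0, delta):
    """h(total; x0, x1) as defined in Thm. 4.4 (total = m or m * n)."""
    t = alpha0 * x1 / x0
    amplification = (
        8 * math.sqrt(math.exp(alpha0) * math.log(4 * x0 / delta) / total)
        + 8 * math.exp(alpha0) / total
    )
    # tanh(t / 2) equals (e^t - 1) / (e^t + 1).
    return x0 * math.log1p(math.tanh(t / 2) * amplification)


def itemwise_guarantee(alpha0, m, delta, n=1, grid=1000):
    """Approximate (alpha, delta') from Thm. 4.4.

    n = 1 corresponds to the bounded local model, n > 1 to the bounded
    central model with n users. The sup over w is taken on a grid.
    """
    total = m * n
    ws = [k / grid for k in range(1, grid + 1)]
    alpha = max(h(total, m, m * w, alpha0, delta) / w for w in ws)
    delta_prime = delta * math.exp(h(total, m, m, alpha0, delta))
    return alpha, delta_prime


local_alpha, local_delta = itemwise_guarantee(alpha0=0.1, m=1000, delta=1e-6)
central_alpha, central_delta = itemwise_guarantee(alpha0=0.1, m=1000, delta=1e-6, n=100)
```

Comparing `local_alpha` with `central_alpha` shows the effect of replacing $m$ by $mn$ in the first argument of $h$: the amplification term shrinks as more items are shuffled together.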

Remarks. Thm. 4.4 gives the tightest possible privacy parameters, but we may also give an asymptotic formula as follows. For desired privacy parameters $(\alpha,\delta)$, one should set

\[
\alpha_0=\begin{cases}\frac{\alpha}{32\sqrt{m\ln(4me^{\alpha}/\delta)}} & \text{if } \alpha\leq 32\sqrt{m\ln(4me^{\alpha}/\delta)}\\ 2\ln\left(\frac{\alpha}{16\sqrt{m\ln(4me^{\alpha}/\delta)}}\right) & \text{if } 32\sqrt{m\ln(4me^{\alpha}/\delta)}<\alpha<m\end{cases}
\tag{6}
\]

and respectively

\[
\alpha_0=\begin{cases}\frac{\alpha\sqrt{n}}{32\sqrt{m\ln(4me^{\alpha}/\delta)}} & \text{if } \alpha\sqrt{n}\leq 32\sqrt{m\ln(4me^{\alpha}/\delta)}\\ 2\ln\left(\frac{\alpha\sqrt{n}}{16\sqrt{m\ln(4me^{\alpha}/\delta)}}\right) & \text{if } 32\sqrt{m\ln(4me^{\alpha}/\delta)}<\alpha\sqrt{n}<m\sqrt{n}\end{cases}
\tag{7}
\]

in order to achieve $d_{\textsf{EM}}$ DP in the bounded local (respectively central) model. Assuming $\alpha\leq O(\ln(m))$, this means that the per-item privacy parameter $\alpha_0$ will be roughly $\frac{\alpha}{\sqrt{m}}$ (resp. $\ln\left(\frac{\alpha\sqrt{n}}{\sqrt{m}}\right)$) for releasing the $m$ samples; this is asymptotically better than the analysis with composition, which would require setting $\alpha_0=\frac{\alpha}{m}$ (resp. $\frac{\alpha}{m}$). Even with higher $\alpha=m^{c}$ for $c<1$, the budget is still $\frac{\alpha}{\sqrt{m^{1+c}}}$ (resp. $\ln\left(\frac{\alpha\sqrt{n}}{\sqrt{m^{1+c}}}\right)$), which are both significant asymptotic improvements.
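The settings in (6) and (7) are closed-form in the target parameters and can be computed directly. The following is a minimal sketch; the function name `alpha0_for_target` is an assumption, and $\ln(4me^{\alpha}/\delta)$ is evaluated as $\ln(4m/\delta)+\alpha$ to avoid overflow for large $\alpha$.

```python
import math


def alpha0_for_target(alpha, delta, m, n=1):
    """Per-item budget alpha0 following Eq. (6) (n = 1, bounded local model)
    or Eq. (7) (n > 1, bounded central model). Hypothetical helper."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    # ln(4 m e^alpha / delta) written as ln(4 m / delta) + alpha.
    root = math.sqrt(m * (math.log(4 * m / delta) + alpha))
    scaled = alpha * math.sqrt(n)  # alpha * sqrt(n); equals alpha when n = 1
    if scaled <= 32 * root:
        return scaled / (32 * root)
    if scaled < m * math.sqrt(n):
        return 2 * math.log(scaled / (16 * root))
    raise ValueError("alpha is outside the range covered by Eqs. (6)/(7)")


local_budget = alpha0_for_target(alpha=1.0, delta=1e-6, m=1000)
central_budget = alpha0_for_target(alpha=1.0, delta=1e-6, m=1000, n=100)
```

In the first regime the central-model budget is exactly $\sqrt{n}$ times the local-model budget, reflecting the additional amplification from shuffling the items of all $n$ users together.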

Concrete Example. Our improved analysis makes the most significant improvements in the central model. Here, under composition, we would have to apply PrivEMDItemWise with $\alpha_0=\frac{\alpha}{m}=0.025$ for each of the $m=10^{3}$ location data points per user. Using the guarantee of Thm. 4.4, it is possible to set $\alpha_0\approx 3.0$, an improvement of roughly two orders of magnitude.

5. Generalization to Unbounded DP

The mechanisms presented so far face two challenges when applied to the unbounded data setting. First, a direct privacy analysis of the unbounded data setting is difficult since we cannot leverage Lemma 2.1, which significantly simplifies the analysis (for the bounded data setting). Second, and more importantly, the unbounded central model offers no utility improvement over the local model. In the worst-case scenario, a single user may contribute nearly all the data in the dataset, effectively reducing any algorithm to satisfying only local $d_{\textsf{EM}}$-DP. This issue has been noted in previous work in user-level DP (Liu et al., 2023).

In this section, we tackle these challenges by showing a blackbox reduction from unbounded $d_{\textsf{EM}}$-DP to bounded $d_{\textsf{EM}}$-DP. Our reduction works in both the local and central models. The key idea of the reduction is to smoothly project a dataset $K$ of any size to a dataset $L$ of a given fixed size, such that the $d_{\textsf{EM}}$ distance between any two input datasets and the $d_{\textsf{EM}}$ distance between their projections are roughly the same. Then, it is easy to show that applying any bounded $d_{\textsf{EM}}$-DP algorithm to the smooth projections is sufficient to guarantee unbounded $d_{\textsf{EM}}$-DP for the entire scheme.

Our proposed projection mechanism is smooth in a near-multiplicative sense, albeit with a small additive penalty when the $d_{\textsf{EM}}$ distance between the two datasets is small. We account for this subtlety by slightly modifying the privacy semantics of $d_{\textsf{EM}}$-DP in the unbounded setting so that the guarantee does not grow arbitrarily strong as $d_{\textsf{EM}}(\tilde{K},\tilde{K}')\rightarrow 0$. Instead, we introduce a distance threshold $r$ such that all pairs of datasets with $d_{\textsf{EM}}(\tilde{K},\tilde{K}')\leq r$ enjoy a uniform privacy guarantee of $\varepsilon$. This refined privacy definition, termed discrete $d_{\textsf{EM}}$-DP, is formalized (in the local model) as:

Definition 5.1.

[Discrete Local $d_{\textsf{EM}}$-DP] Let $\mathcal{M}$ be a mechanism which acts on a dataset $K$. We say $\mathcal{M}$ satisfies $(\varepsilon,\delta,r)$-discrete local $d_{\textsf{EM}}$-DP if, for any two datasets $K,K'\in\mathcal{X}^{*}$ such that $d_{\textsf{EM}}(\tilde{K},\tilde{K}')\leq r$,

\[
\Pr[\mathcal{M}(K)=O]\leq e^{\varepsilon}\Pr[\mathcal{M}(K')=O]+\delta.
\]

Like in standard DP, the above definition uses the parameter $\varepsilon$ because it is a unitless privacy parameter; the unit of the metric is expressed in the parameter $r$.

Fact 5.1.

For any $K,K'$ such that $d=d_{\textsf{EM}}(\tilde{K},\tilde{K}')$, $\mathcal{M}$ satisfies

\[
\Pr[\mathcal{M}(K)=O]\leq e^{\varepsilon\lceil\frac{d}{r}\rceil}\Pr[\mathcal{M}(K')=O]+\delta\exp\left(\lceil\tfrac{d}{r}\rceil\right).
\]

Fact 5.1 is implied from Definition 5.1 by a direct application of group privacy (Dwork, 2006) (proven in Lemma A.2). This guarantee can be interpreted as providing $d_{\textsf{EM}}$-DP at the granularity of units of $d_{\textsf{EM}}$ distance $r$. Note that for all $d\geq r$, we have $\varepsilon\lceil\frac{d}{r}\rceil\leq\frac{2\varepsilon}{r}d$. Thus, $(\varepsilon,\delta,r)$-discrete local $d_{\textsf{EM}}$-DP is roughly equivalent to $(\frac{2\varepsilon}{r},\delta)$-unbounded local $d_{\textsf{EM}}$-DP, except if $d_{\textsf{EM}}(\tilde{K},\tilde{K}')\leq r$. In this case, the privacy parameter will not go below $\varepsilon$. This adjustment does not significantly alter the overall privacy semantics of $d_{\textsf{EM}}$-DP; one may simply set $\alpha$ as described in Sec. 4.
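As a small illustration of Fact 5.1, the sketch below computes the privacy parameters implied at $d_{\textsf{EM}}$ distance $d$ by an $(\varepsilon,\delta,r)$-discrete guarantee, exactly as stated in the fact. The function name `discrete_guarantee_at_distance` is an assumption made for this example.

```python
import math


def discrete_guarantee_at_distance(eps, delta, r, d):
    """Parameters given by Fact 5.1 for two datasets at d_EM distance d > 0.

    Returns (eps * ceil(d / r), delta * exp(ceil(d / r))).
    """
    if r <= 0 or d <= 0:
        raise ValueError("r and d must be positive")
    steps = math.ceil(d / r)  # number of distance units of size r
    return eps * steps, delta * math.exp(steps)


# Example: eps = 0.5 per unit of distance r = 0.1, datasets at distance 0.25.
eps_total, delta_total = discrete_guarantee_at_distance(0.5, 1e-6, 0.1, 0.25)

# For d >= r the multiplicative parameter is at most (2 * eps / r) * d.
for d in (0.1, 0.15, 0.3, 1.0):
    eps_d, _ = discrete_guarantee_at_distance(0.5, 1e-6, 0.1, d)
    assert eps_d <= 2 * 0.5 * d / 0.1
```

The loop checks the inequality $\varepsilon\lceil\frac{d}{r}\rceil\leq\frac{2\varepsilon}{r}d$ used above, which follows from $\lceil x\rceil\leq x+1\leq 2x$ for $x\geq 1$.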

In the central model, we make a similar definition:

Definition 5.2.

[Discrete Central $d_{\textsf{EM}}$-DP] Let $K_G=K_1\cup\cdots\cup K_n$ denote a global dataset from $n$ users (of any size). We say $K_G\sim_r K_G'$ if $K_G'$ can be obtained from $K_G$ by changing $K_i$ to $K_i'$ for just one user $i$, such that $d_{\textsf{EM}}(\tilde{K}_i,\tilde{K}_i')\leq r$.
We say a mechanism $\mathcal{M}(K_G)$ satisfies $(\varepsilon,\delta,r)$-discrete central $d_{\textsf{EM}}$-DP if, for all $K_G,K_G'$ such that $K_G\sim_r K_G'$, we have

\[
\Pr[\mathcal{M}(K_G)=O]\leq e^{\varepsilon}\Pr[\mathcal{M}(K_G')=O]+\delta.
\]

As before, $(\varepsilon,\delta,r)$-discrete central $d_{\textsf{EM}}$-DP is roughly equivalent to $(\frac{2\varepsilon}{r},\delta)$-bounded central $d_{\textsf{EM}}$-DP when all user datasets have size $m$. However, we will see that Definition 5.2 is the appropriate generalization to unbounded user datasets under our projection mechanism which is described below.

Our projection mechanism first generates a fixed number of samples with replacement from each user’s dataset $K$. Next, it applies a blackbox bounded $d_{\textsf{EM}}$-DP mechanism, $\mathcal{A}$, to the projected dataset $L$. By blackbox application we mean that $\mathcal{A}$ can be any arbitrary mechanism as long as it satisfies bounded $d_{\textsf{EM}}$-DP. The projection mechanism, in the central model, is outlined in Alg. 3 (in the local model, each user samples from their own $K_i$, so we would simply have $n=1$). The smoothness of the projection from $K_G$ to $L$ follows from the following claim:

Lemma 5.2.

Let $\tilde{K},\tilde{K}'\in\Delta^{\mathcal{X}}$ be probability distributions, and let $C^{*}$ be the minimum cost coupling between $\tilde{K},\tilde{K}'$. Let $\{(x_i,y_i)\}_{i=1}^{s}$ denote $s$ i.i.d. samples from $C^{*}$, and let $L=(x_1,\ldots,x_s)$ and $L'=(y_1,\ldots,y_s)$. Then,

\[
\Pr\left[d_{\textsf{EM}}(\tilde{L},\tilde{L}')\geq(1+\sqrt{2})\,d_{\textsf{EM}}(\tilde{K},\tilde{K}')+\tfrac{3}{s}\ln\left(\tfrac{1}{\delta}\right)\right]\leq\delta.
\]
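The statement of Lemma 5.2 can be explored with a small simulation. The sketch below makes simplifying assumptions that are not in the paper: the data lie on the real line with $d_{\mathcal{X}}(x,y)=|x-y|$, and both datasets have the same size, in which case matching sorted values is a minimum cost coupling. All function names are hypothetical.

```python
import math
import random


def emd_1d_equal_size(a, b):
    """EMD between two equal-size empirical distributions on the real line.

    In one dimension, matching the sorted values is an optimal transport plan.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-size datasets")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def sample_coupled(K, K_prime, s, rng):
    """Draw s i.i.d. pairs (x_i, y_i) from the sorted-matching coupling."""
    pairs = list(zip(sorted(K), sorted(K_prime)))
    drawn = rng.choices(pairs, k=s)  # uniform sampling with replacement
    L = [x for x, _ in drawn]
    L_prime = [y for _, y in drawn]
    return L, L_prime


def lemma_5_2_threshold(d_em, s, delta):
    """Right-hand threshold in Lemma 5.2: (1 + sqrt(2)) d_EM + (3 / s) ln(1 / delta)."""
    return (1 + math.sqrt(2)) * d_em + 3 / s * math.log(1 / delta)


rng = random.Random(0)
K = [0.9 * rng.random() for _ in range(500)]
K_prime = [x + 0.05 for x in K]  # every point shifted by 0.05
L, L_prime = sample_coupled(K, K_prime, s=50, rng=rng)
d_K = emd_1d_equal_size(K, K_prime)
d_L = emd_1d_equal_size(L, L_prime)
threshold = lemma_5_2_threshold(d_K, s=50, delta=0.01)
```

In this toy instance `d_L` can be compared with `threshold`; the lemma says the former exceeds the latter with probability at most $\delta$ over the sampling.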
Data: $K_{G}$ - Global datasets of $n$ users; $\mathcal{A}$ - A mechanism satisfying bounded $d_{\textsf{EM}}$-DP; $s$ - Number of samples.
$L=\emptyset$;
for $i=1$ to $n$ do
       Add $s$ uniform samples with replacement from $K_{i}$ to $L$
end for
$O=\mathcal{A}(L)$;
return $O$
Algorithm 3 BoundedEMDReduction, a reduction from unbounded $d_{\textsf{EM}}$-DP to bounded $d_{\textsf{EM}}$-DP.
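
The sampling step of Algorithm 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `bounded_emd_reduction`, the choice to keep the samples grouped per user, and the assumption that every user holds at least one item are ours. The argument `mechanism` stands in for $\mathcal{A}$ and is treated as a black box.

```python
import random

def bounded_emd_reduction(global_datasets, mechanism, s, rng=None):
    """Sketch of Algorithm 3: resample each user's data to exactly s items.

    global_datasets: list of non-empty lists, one per user (assumed layout).
    mechanism: callable standing in for the bounded d_EM-DP mechanism A.
    s: number of samples drawn per user.
    """
    rng = rng or random.Random()
    sampled = []
    for user_items in global_datasets:
        # s uniform draws with replacement from this user's items
        sampled.append([rng.choice(user_items) for _ in range(s)])
    # Privacy comes from the mechanism, not from the sampling step
    return mechanism(sampled)
```

With the identity function as a placeholder mechanism, the output is simply the resampled data, in which every user contributes exactly $s$ items regardless of their original dataset size.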

Proof Sketch: Intuitively, for any coupling $C$ between $\tilde{K}_{i}$ and $\tilde{K}_{i}^{\prime}$, we can simulate sampling $s$ times from $\tilde{K}_{i}^{\prime}$ by first sampling $\{x_{1},\ldots,x_{s}\}$ from $\tilde{K}_{i}$, and then sampling $y_{i}\sim C_{x_{i}}(\cdot)$.
This view shows there is a transportation plan between $L=\{x_{1},\ldots,x_{s}\}$ and $L^{\prime}=\{y_{1},\ldots,y_{s}\}$ of expected cost $\mathbb{E}_{x\sim C_{1},y\sim C_{x}}d_{\mathcal{X}}(x,y)=d_{\textsf{EM}}(\tilde{K}_{i},\tilde{K}_{i}^{\prime})$.
Using Bernstein's inequality, we can show that, with probability at least $1-\delta$, $d_{\textsf{EM}}(\tilde{L},\tilde{L}^{\prime})$ is upper bounded by $2d_{\textsf{EM}}(\tilde{K}_{i},\tilde{K}_{i}^{\prime})+\frac{6}{s}\log(\frac{1}{\delta})$. ∎
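
The coupling argument can be made concrete with a small simulation. The sketch below is only illustrative: the helper names and the dictionary representation of a coupling (pairs mapped to probabilities) are assumptions, and the coupling passed in need not be the optimal one. Matching each $x_i$ to its partner $y_i$ moves mass $1/s$ per pair, so the average pair distance is the cost of one valid transportation plan and hence an upper bound on $d_{\textsf{EM}}(\tilde{L},\tilde{L}^{\prime})$.

```python
import random

def sample_coupled_pairs(coupling, s, rng):
    """Draw s i.i.d. pairs (x_i, y_i) from a coupling {(x, y): probability}."""
    pairs = list(coupling.keys())
    weights = [coupling[p] for p in pairs]
    return rng.choices(pairs, weights=weights, k=s)

def pairing_cost(pairs, dist):
    """Cost of the plan that ships mass 1/s from each x_i to its partner y_i."""
    return sum(dist(x, y) for x, y in pairs) / len(pairs)

def expected_cost(coupling, dist):
    """Expected cost of the coupling; equals d_EM when the coupling is optimal."""
    return sum(prob * dist(x, y) for (x, y), prob in coupling.items())
```

For large $s$ the pairing cost concentrates around the expected cost of the coupling, which is the behaviour the concentration step of the proof quantifies.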

Thus, sampling is a smooth projection from $K_{G}$ to $L$, with the caveat that there is an additive $\frac{\log(1/\delta)}{s}$ term that comes into play if $s$ is too small to guarantee convergence. This is the reason for our relaxation to discrete $d_{\textsf{EM}}$-DP.

In summary, we have the following privacy guarantee:

Theorem 5.3.

Let $\varepsilon>0$ and $\delta,r\in[0,1]$ be arbitrary constants. Suppose $\mathcal{M}$ is a mechanism which satisfies $(\alpha,\delta)$-bounded local $d_{\textsf{EM}}$-DP (Definition 3.1), where

\[\alpha=\frac{\varepsilon}{(1+\sqrt{2})\,r+\tfrac{3}{s}\ln(\tfrac{1}{\delta})}.\]

Then, BoundedEMDReduction satisfies $(\varepsilon,2\delta,r)$-discrete local $d_{\textsf{EM}}$-DP. Similarly, if $\mathcal{M}$ satisfies $(\alpha,\delta)$-bounded central $d_{\textsf{EM}}$-DP (Definition 3.1), then BoundedEMDReduction satisfies $(\varepsilon,2\delta,r)$-discrete central $d_{\textsf{EM}}$-DP.

Remarks. If the number of samples $s$ is at least $\frac{\ln(1/\delta)}{r}$, then Thm. 5.3 shows there is only a small multiplicative cost to considering just bounded $d_{\textsf{EM}}$-DP (in the respective local or central model). In this case, the bounded algorithm will need to roughly satisfy $(\frac{\varepsilon}{r},\delta)$-bounded $d_{\textsf{EM}}$-DP, and this is roughly the same as the resulting $(\varepsilon,\delta,r)$-discrete $d_{\textsf{EM}}$-DP algorithm. There is no privacy disadvantage to taking a large number of samples, and the utility may also increase due to more information about the dataset being captured (recall that the projection does not provide privacy; privacy is provided by $\mathcal{M}$). Thus, the number of samples may be set to be large, with computational cost being the only constraint.
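
To see the effect of $s$ numerically, the following sketch evaluates the $\alpha$ required by Thm. 5.3. The function name `required_alpha` is ours, and we assume $r>0$ and $\delta\in(0,1)$ so that the expression is well defined. When $s\geq\ln(1/\delta)/r$, the additive term $\frac{3}{s}\ln\frac{1}{\delta}$ is at most $3r$, so $\alpha\geq\frac{\varepsilon}{(4+\sqrt{2})r}$, which is within a constant factor of $\frac{\varepsilon}{r}$.

```python
import math

def required_alpha(eps, r, s, delta):
    """alpha from Thm. 5.3: eps / ((1 + sqrt(2)) * r + (3 / s) * ln(1 / delta))."""
    return eps / ((1 + math.sqrt(2)) * r + (3.0 / s) * math.log(1.0 / delta))

# Example values (chosen for illustration only)
eps, r, delta = 1.0, 0.1, 1e-6
s_min = math.ceil(math.log(1.0 / delta) / r)   # smallest s with s >= ln(1/delta)/r
alpha_at_s_min = required_alpha(eps, r, s_min, delta)
alpha_large_s = required_alpha(eps, r, 10 * s_min, delta)
```

Increasing $s$ only increases the admissible $\alpha$, which approaches $\frac{\varepsilon}{(1+\sqrt{2})r}$ from below as $s$ grows.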

Proof Sketch. Let $L$ and $L^{\prime}$ denote the sampled data for $K_{G}$ and $K_{G}^{\prime}$, respectively. By convexity of DP, it suffices to analyze the privacy guarantee for any coupling between the random variables $L,L^{\prime}$. In particular, we use the optimal coupling between $K_{G},K_{G}^{\prime}$ to define this coupling. The resulting privacy parameter is bounded in terms of the expected cost $d_{\textsf{EM}}(\tilde{K},\tilde{K}^{\prime})$ by Lemma 5.2. ∎

BoundedEMDReduction can be used to bound the contribution of each user in the central setting, allowing us to apply the simpler Definition 3.2. In addition, it can be used to adapt PrivEMDItemWise to the unbounded data setting. One caveat is that utility may not be preserved if the number of samples per user is too small or, in the central setting, if the users' data distributions are heterogeneous. In particular, if users have varying numbers of samples, each from different distributions, applying BoundedEMDReduction equalizes the frequency of all user data. Nonetheless, it is often reasonable to assume the users have homogeneous data distributions (Liu et al., 2020; Acharya et al., 2023).

6. Applications of Proposed Mechanisms

In this section, we compare the utilities of PrivEMDLinear and PrivEMDItemWise to existing mechanisms satisfying user-level DP. For simplicity, we assume the bounded data setting.
Notations. We define the following quantities of a real matrix $M\in\mathbb{R}^{d\times k}$. First, the $(p,q)$ operator norm of $M$, denoted by $\|M\|_{p\rightarrow q}$, is given by $\|M\|_{p\rightarrow q}=\sup_{x\in\mathbb{R}^{k},\|x\|_{p}\leq 1}\|Mx\|_{q}$. We can show that $\|M\|_{1\rightarrow 2}$ is equal to the maximum $\ell_{2}$ norm of a column of $M$. Furthermore, $\|M\|_{2\rightarrow 2}$, more commonly written as $\|M\|_{2}$, is the spectral norm and is equal to the maximum singular value of $M$.
Matrix norms satisfy the important submultiplicative property, which states that $\|MN\|_{p\rightarrow r}\leq\|M\|_{q\rightarrow r}\|N\|_{p\rightarrow q}$ for any matrices $M,N$ and $p,q,r\geq 1$. Next, let $I_{d}$ denote the $d\times d$ identity matrix, and again suppose that $M\in\mathbb{R}^{d\times k}$ with $d\leq k$. If $M$ has full row rank, then there exists a matrix $N\in\mathbb{R}^{k\times d}$ such that $MN=I_{d}$. We call such a matrix $N$ a right inverse of $M$.
Finally, for $M\in\mathbb{R}^{s_{1}\times t_{1}}$ and $N\in\mathbb{R}^{s_{2}\times t_{2}}$, let $M\otimes N\in\mathbb{R}^{s_{1}s_{2}\times t_{1}t_{2}}$ denote the Kronecker product of two real matrices, whose entry in $((i_{1},i_{2}),(j_{1},j_{2}))$ is $M_{i_{1}j_{1}}N_{i_{2}j_{2}}$.
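
Two of these quantities are easy to compute directly, which may help when checking the bounds below on small examples. The sketch uses plain Python lists of rows; the helper names `one_to_two_norm` and `kron` are ours. It also exercises a standard fact not stated above: each column of $M\otimes N$ is the Kronecker product of a column of $M$ and a column of $N$, so $\|M\otimes N\|_{1\rightarrow 2}=\|M\|_{1\rightarrow 2}\|N\|_{1\rightarrow 2}$.

```python
import math

def one_to_two_norm(M):
    """(1,2) operator norm: the largest l2 norm among the columns of M."""
    return max(math.sqrt(sum(v * v for v in col)) for col in zip(*M))

def kron(M, N):
    """Kronecker product; entry ((i1, i2), (j1, j2)) equals M[i1][j1] * N[i2][j2]."""
    return [[m * n for m in row_m for n in row_n]
            for row_m in M for row_n in N]

M = [[3.0, 0.0],
     [4.0, 1.0]]
N = [[0.0, 1.0],
     [2.0, 0.0]]
norm_M = one_to_two_norm(M)            # columns (3, 4) and (0, 1)
norm_MN = one_to_two_norm(kron(M, N))  # multiplicative under the Kronecker product
```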

6.1. Linear Embedding Queries

Many applications of metric DP assume there is an embedding function $\phi:\mathcal{X}\rightarrow\mathbb{R}^{t}$, which maps an item to its semantic representation in $\mathbb{R}^{t}$ (each of the examples in Sec. 1 has an embedding representation). The metric $d_{\mathcal{X}}$ is then the distance between $\phi(x)$ and $\phi(x^{\prime})$; in this section, we consider the $\ell_{2}$ distance.

Since $\phi(x)$ also communicates information about the item $x$, we define linear embedding queries as linear queries applied to an item's embedding $\phi(x)$. Formally,

\[q_{f\circ\phi}(K)=\mathbb{E}_{x\sim\tilde{K}}[f\circ\phi(x)],\]

where $f(y)=Fy$ for a matrix $F\in\mathbb{R}^{d\times t}$ (meaning that $f$ is a linear function). Assume each row $F_{i}$ of $F$ is normalized so that $\|F_{i}\|_{2}\leq 1$. The $i$-th coordinate of $q_{f\circ\phi}(K)$ is equal to $\mathbb{E}_{x\sim\tilde{K}}[\langle F_{i},\phi(x)\rangle]$. This can be interpreted as the average similarity of the items in $K$ with the vector $F_{i}$. Our analysis will assume that $d<|\mathcal{X}|$ and $d\ll n$, which is usually the case in practice. Note that we may write $q_{f\circ\phi}(K)$ as $F\Phi\tilde{K}$, where $\Phi\in\mathbb{R}^{t\times\mathcal{X}}$ is the matrix whose columns are the embedding vectors of the items in $\mathcal{X}$.
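
As a small worked example, the non-private value of a linear embedding query can be computed from the item histogram. The function name `linear_embedding_query`, the dictionary `embed` (playing the role of $\phi$), and the toy data are assumptions made for illustration.

```python
from collections import Counter

def linear_embedding_query(items, embed, F):
    """Non-private q_{f∘phi}(K): the average of F·phi(x) over the items in K."""
    hist = Counter(items)
    m = len(items)
    q = [0.0] * len(F)
    for x, count in hist.items():
        phi = embed[x]
        for i, row in enumerate(F):
            # coordinate i accumulates weight(x) * <F_i, phi(x)>
            q[i] += (count / m) * sum(a * b for a, b in zip(row, phi))
    return q

embed = {"a": [1.0, 0.0], "b": [0.0, 1.0]}   # toy embedding with t = 2
F = [[1.0, 0.0], [0.6, 0.8]]                 # rows have l2 norm at most 1
q = linear_embedding_query(["a", "a", "b", "b"], embed, F)
```

Here $\tilde{K}=(\frac{1}{2},\frac{1}{2})$, so the first coordinate is $\frac{1}{2}\cdot 1+\frac{1}{2}\cdot 0=0.5$ and the second is $\frac{1}{2}\cdot 0.6+\frac{1}{2}\cdot 0.8=0.7$, matching $F\Phi\tilde{K}$.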

6.1.1. Local Model

Existing user-level DP solutions ask each user $U_{i}$ to privately release the query $\hat{q}_{i}=q_{f\circ\phi}(\tilde{K}_{i})$. The aggregator computes the average $\hat{q}=\frac{1}{n}\sum_{i=1}^{n}\hat{q}_{i}$. The current best solutions have the following error guarantee (Duchi et al., 2013; Bassily, 2019):

Lemma 6.1.

(From Proposition 3 in (Duchi et al., 2013)) There exists an $(\varepsilon,0)$-bounded user-level DP algorithm in the local model which produces an estimate $\hat{q}$ such that, for all $\tilde{K}$,

\[\mathbb{E}[\|\hat{q}-q_{f\circ\phi}(\tilde{K})\|_{2}]\leq O\left(\|F\Phi\|_{1\rightarrow 2}\tfrac{\sqrt{d}}{\varepsilon\sqrt{n}}\right).\]

To interpret the term $\|F\Phi\|_{1\rightarrow 2}$, we can use the inequality $\|F\Phi\|_{1\rightarrow 2}\leq\|F\|_{2}\|\Phi\|_{1\rightarrow 2}$, which is tight for certain choices of $F$ and $\Phi$. By assumption, we know $\|F\|_{2}\leq\sqrt{d}$ and $\|\Phi\|_{1\rightarrow 2}\leq 1$, both of which can also be tight. The bound is thus $O(\frac{d}{\varepsilon\sqrt{n}})$.

On the other hand, for $d_{\textsf{EM}}$-DP, by Thm. 4.1, we know that $\Delta_{\mathsf{EM}}(q_{f\circ\phi})$ is at most the Lipschitz constant of $f\circ\phi$ given by:

\[\max_{x,x^{\prime}\in\mathcal{X}}\tfrac{\|F(\phi(x))-F(\phi(x^{\prime}))\|}{\|\phi(x)-\phi(x^{\prime})\|}\leq\max_{x,x^{\prime}\in\mathcal{X}}\tfrac{\|F(\phi(x)-\phi(x^{\prime}))\|}{\|\phi(x)-\phi(x^{\prime})\|}\leq\|F\|_{2}.\]

Hence, each user can apply PrivEMDLinear with the Gaussian mechanism with $\ell=\|F\|_{2}$, which gives the following utility guarantee:

Lemma 6.2.

There exists an $(\alpha,\delta)$-bounded $d_{\textsf{EM}}$-DP algorithm in the local model which produces an estimate $\hat{q}$ such that, for all $\tilde{K}$,

\[\mathbb{E}[\|\hat{q}-q_{f\circ\phi}(\tilde{K})\|_{2}]\leq\|F\|_{2}\tfrac{\sqrt{1.25\,d\ln(1/\delta)}}{\alpha\sqrt{n}}.\]

Remarks. We use the Gaussian mechanism because it performs better under the $\ell_{2}$ error than the pure $(\alpha,0)$-bounded local $d_{\textsf{EM}}$-DP mechanism illustrated in Alg. 1. However, this forces us to use $\delta>0$. We leave it as an interesting open question whether similar error can be achieved with $\delta=0$. Compared to Lemma 6.1, the above bound differs by a factor of $\frac{\varepsilon}{\alpha}$ (and small $\ln\frac{1}{\delta}$ terms). When $\alpha=\varepsilon$, we know that $d_{\textsf{EM}}$-DP provides better privacy. When $\varepsilon\ll\alpha$, the mechanism of Lemma 6.2 offers lower error than that of Lemma 6.1.
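
The comparison between the two local-model bounds can be checked numerically. In the sketch below the function names are ours, and the hidden constant in the $O(\cdot)$ of Lemma 6.1 is exposed as a parameter `c` that we set to 1 purely for illustration; the true constant is not specified here.

```python
import math

def lemma_6_1_bound(norm_F_Phi, d, n, eps, c=1.0):
    """Lemma 6.1 bound; c stands in for the unspecified O(.) constant (assumed 1)."""
    return c * norm_F_Phi * math.sqrt(d) / (eps * math.sqrt(n))

def lemma_6_2_bound(norm_F, d, n, alpha, delta):
    """Lemma 6.2 bound for the Gaussian-mechanism variant."""
    return norm_F * math.sqrt(1.25 * d * math.log(1.0 / delta)) / (alpha * math.sqrt(n))

# Tight case ||F Phi||_{1->2} = ||F||_2, with example values
d, n, eps, alpha, delta = 16, 10_000, 1.0, 10.0, 1e-6
ratio = lemma_6_2_bound(4.0, d, n, alpha, delta) / lemma_6_1_bound(4.0, d, n, eps)
```

Under these assumptions the ratio equals $\frac{\varepsilon}{\alpha}\sqrt{1.25\ln(1/\delta)}$, so the $d_{\textsf{EM}}$-DP bound is smaller once $\alpha$ exceeds $\varepsilon$ by more than the logarithmic factor.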

6.1.2. Central Model

In the central model, linear query release has been extensively studied, and optimal algorithms under item-level DP are known (Hardt and Talwar, 2010; Bhaskara et al., 2012; Nikolov et al., 2013). These algorithms can be easily adapted to user-level DP, which will provide the following guarantee:

Lemma 6.3.

(From Thm. 1.3 in (Hardt and Talwar, 2010)) There exists an $(\varepsilon,0)$-bounded user-level DP algorithm in the central model which produces an estimate $\hat{q}$ such that, for all $\tilde{K}$,

\[\mathbb{E}[\|\hat{q}-q_{f\circ\phi}(\tilde{K})\|_{2}]\leq O\left(\|F\Phi\|_{1\rightarrow 2}\tfrac{\sqrt{d}}{\varepsilon n}\ln\left(\tfrac{k}{d}\right)\right).\]

To provide $(\alpha,\delta)$-bounded $d_{\textsf{EM}}$-DP in the central model, we can use PrivEMDLinear with the Gaussian mechanism with scale $\omega=\frac{1}{n\alpha}$. Following the same approach as in Lemma 6.2, this results in $O(\|F\|_{2}\frac{\sqrt{d\ln\frac{1}{\delta}}}{\alpha n})$ error. Again, this differs from Lemma 6.3 by a factor of $\frac{\varepsilon}{\alpha}$, and similar observations apply.

6.2. Frequency Estimation

Here, we evaluate the error of PrivEMDItemWise for private frequency estimation, where the goal is to obtain a private estimate $\tilde{F}$ of the (normalized) histogram $\tilde{K}$. This problem has been extensively studied in privacy (Hay et al., 2009; Xu et al., 2013; Suresh, 2019; Kairouz et al., 2016; Acharya et al., 2018; Chen et al., 2020; Acharya et al., 2023); the high-level goal is to minimize the $\ell_{p}$ distance between $\tilde{F}$ and $\tilde{K}$. However, when the data domain is a general metric space $\mathcal{X}$, not all $\ell_{p}$ perturbations to $\tilde{K}$ are the same. Thus, we will measure similarity between $\tilde{K},\tilde{F}$ with $d_{\textsf{EM}}(\tilde{K},\tilde{F})$, as we do in our privacy definition.

To ease the analysis while still demonstrating the effectiveness of our mechanisms, we fix $\mathcal{X}$ to be the following “clustered” metric space. Let $\mathcal{X}=\mathcal{B}\times\mathcal{C}$, where $\mathcal{B}=\{b_{1},\ldots,b_{s}\}$, $\mathcal{C}=\{c_{1},\ldots,c_{t}\}$ and $s\cdot t=k$. For some $r<\frac{1}{2}$, the distance is given by the following:

\[d_{\mathcal{B}\times\mathcal{C}}((b,c),(b^{\prime},c^{\prime}))=\begin{cases}0&\text{if $b=b^{\prime}$ and $c=c^{\prime}$}\\ r&\text{if $b=b^{\prime}$ and $c\neq c^{\prime}$}\\ 1&\text{otherwise}.\end{cases}\]

We can think of this metric space as a collection of $s$ clusters consisting of the $t$ items $\{(b,c_{1}),\ldots,(b,c_{t})\}$ for each $b\in\mathcal{B}$. Points in a cluster are more related, being at distance $r$ apart, than items in two different clusters, which are distance $1$ apart. We will assume that privacy is only needed between two items in the same cluster, so we will set $\alpha=\frac{\varepsilon}{r}$.
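
The clustered metric is simple enough to write down directly. In this sketch, items are represented as `(b, c)` tuples and the function names are ours. With $\alpha=\frac{\varepsilon}{r}$, two items in the same cluster are at privacy distance $\alpha\cdot r=\varepsilon$, while items in different clusters are at distance $\alpha\cdot 1=\frac{\varepsilon}{r}$.

```python
def clustered_distance(x, y, r):
    """Distance on B x C: 0 if identical, r within a cluster, 1 across clusters."""
    (b, c), (b2, c2) = x, y
    if b == b2 and c == c2:
        return 0.0
    if b == b2:
        return r
    return 1.0

def alpha_for_cluster_privacy(eps, r):
    """alpha = eps / r, so that same-cluster items are protected at level eps."""
    return eps / r

r = 0.25
points = [(b, c) for b in range(2) for c in range(3)]   # s = 2 clusters, t = 3 items
triangle_ok = all(
    clustered_distance(x, z, r)
    <= clustered_distance(x, y, r) + clustered_distance(y, z, r)
    for x in points for y in points for z in points
)
```

The brute-force check over this small space confirms that the triangle inequality holds, as expected of a metric.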

6.2.1. Algorithms in the Local Model

At a high level, in the local model each user applies a private mechanism $\mathcal{A}:\mathcal{X}\rightarrow\mathcal{Y}$ (with $\mathcal{Y}$ discrete and $|\mathcal{Y}|\geq|\mathcal{X}|$) to each sample and releases the result. The central server forms an aggregate vector $v\in\mathbb{R}^{\mathcal{Y}}$. Let $A\in\mathbb{R}^{\mathcal{X}\times\mathcal{Y}}$ denote the transition probability matrix of $\mathcal{A}$; we have by linearity of expectation that $\mathbb{E}[v]=\tilde{K}A$. Assuming that $A$ has a right inverse $B$, the central server returns the estimate $\tilde{F}=vB$, which is unbiased. All previous work in distribution estimation under local DP can be expressed in this way (Kairouz et al., 2016; Acharya et al., 2018; Chen et al., 2020; Acharya et al., 2023). We summarize this in Alg. 4.

Data: $K$, a family of datasets from $n$ users each with size $m$; $\mathcal{A}$, a mechanism from $\mathcal{X}$ to $\mathcal{Y}$; $B\in\mathbb{R}^{\mathcal{Y}\times\mathcal{X}}$, a right inverse of $\mathcal{A}$.
for each user $i$ from $1$ to $n$ do
       $L_i=\emptyset$;
       for $l_j\in K_i$ do
             $r_j=\mathcal{A}(l_j)$;
             Add $r_j$ to $L_i$;
       end for
       Release $\tilde{L}_i$;
end for
$v=\frac{1}{n}\sum_{i=1}^{n}\tilde{L}_i$;
$\tilde{F}=vB$;
return $\tilde{F}$
Algorithm 4 FreqEstLocal, a general framework for histogram estimation under local DP
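
To make the framework concrete, the following minimal sketch instantiates Alg. 4 with plain $k$-randomized response, which is neither the Hadamard response nor a mechanism proposed in this paper. The names `krr_sample` and `freq_est_local`, the encoding of $\mathcal{X}$ as the integers $0,\dots,k-1$, the choice $\mathcal{Y}=\mathcal{X}$, and the reading of $\tilde{L}_i$ as user $i$'s normalized histogram of reports are all assumptions made for illustration. For $k$-RR the transition matrix is $A=qI+pJ$ with $p=1/(e^{\varepsilon}+k-1)$, $q=(e^{\varepsilon}-1)p$ and $J$ the all-ones matrix, so $B=(I-pJ)/q$ is an inverse and $vB=(v-p)/q$ entrywise whenever $v$ sums to one.

```python
import math
import random
from collections import Counter


def krr_sample(x, k, eps, rng):
    """k-ary randomized response: report x with probability e^eps / (e^eps + k - 1),
    otherwise report one of the other k - 1 symbols uniformly at random."""
    keep = math.exp(eps) / (math.exp(eps) + k - 1)
    if rng.random() < keep:
        return x
    y = rng.randrange(k - 1)          # uniform over the k - 1 symbols other than x
    return y if y < x else y + 1


def freq_est_local(datasets, k, eps, seed=0):
    """Sketch of Alg. 4 with A = k-RR; eps > 0 is the per-sample budget (assumed given)."""
    rng = random.Random(seed)
    n = len(datasets)
    v = [0.0] * k
    for K_i in datasets:                              # each user i
        reports = Counter(krr_sample(x, k, eps, rng) for x in K_i)
        for y, count in reports.items():              # normalized histogram of user i
            v[y] += count / (len(K_i) * n)            # v = (1/n) * sum_i L_i
    p = 1.0 / (math.exp(eps) + k - 1)
    q = (math.exp(eps) - 1.0) * p
    return [(v_y - p) / q for v_y in v]               # F = v B, using B = (I - pJ) / q
```

The returned vector sums to one, but individual entries can be negative; any post-processing of such entries is outside the scope of this sketch.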

The state-of-the-art approach for frequency estimation is the Hadamard response (Acharya et al., 2018; Chen et al., 2020), which is based on Hadamard matrices (which form a robust encoding of $\mathcal{X}$). Specifically, the matrix $A$ is given by $q_1\mathbf{1}+q_2H$, where $H$ is a Hadamard matrix and $q_1,q_2$ are constants chosen so that $A$ is normalized and each element is proportional to either $e^{\varepsilon}$ or $1$. This mechanism has the following utility:

Lemma 6.4.

(From Thm. 3.1 in (Chen et al., 2020)) There exists a mechanism $\mathcal{A}$ such that FreqEstLocal satisfies $(\varepsilon,\delta)$-bounded user-level DP and returns an estimator $\tilde{F}$ such that

\[
\max_{K}\mathbb{E}[d_{\textsf{EM}}(\tilde{K},\tilde{F})]\leq O\left(\sqrt{\tfrac{k}{mn}}+\sqrt{\tfrac{k^{2}\ln\frac{m}{\delta}}{n\varepsilon^{2}}}\right).
\]

Remarks. To adapt the Hadamard response to the user-level setting, we suppose each user applies $\mathcal{A}$ to each sample with privacy budget $\frac{\varepsilon}{\sqrt{m\ln(m/\delta)}}$; $(\varepsilon,\delta)$-user-level DP then follows from composition (Kairouz et al., 2015). The term $\sqrt{\frac{k}{mn}}$ is a sampling error which does not depend on $\varepsilon$, and the second term, $\sqrt{\frac{k^{2}\ln(m/\delta)}{n\varepsilon^{2}}}$, is the cost of privacy. The cost of privacy usually dominates, and furthermore its dependence on $m$ is not significant. This is because increasing $m$ reduces both the effect of each sample on the final estimate and the privacy budget per sample, and these two effects offset each other.
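
The weak dependence on $m$ can be checked numerically. The sketch below simply evaluates the per-sample budget and the two terms of Lemma 6.4 with the hidden $O(\cdot)$ constant taken to be $1$, which is an assumption; the function names and the example parameter values are hypothetical.

```python
import math


def per_sample_budget(eps, m, delta):
    """Budget given to each of a user's m samples: eps / sqrt(m * ln(m / delta))."""
    return eps / math.sqrt(m * math.log(m / delta))


def lemma_6_4_terms(k, m, n, eps, delta):
    """Sampling and privacy terms of Lemma 6.4, with the O(.) constant taken as 1."""
    sampling = math.sqrt(k / (m * n))
    privacy = math.sqrt(k ** 2 * math.log(m / delta) / (n * eps ** 2))
    return sampling, privacy


if __name__ == "__main__":
    # Illustrative values only: the privacy term grows like sqrt(ln(m / delta)).
    for m in (10, 100, 1000):
        s_err, p_err = lemma_6_4_terms(k=16, m=m, n=10_000, eps=1.0, delta=1e-6)
        print(m, per_sample_budget(1.0, m, 1e-6), s_err, p_err)
```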

Under $d_{\textsf{EM}}$-DP, we can use a transition probability matrix $A$ that is less noisy. Specifically, each user may apply $\mathcal{A}$ to their dataset using PrivEMDItemWise, and by Thm. 4.3, $\mathcal{A}$ needs to satisfy $(O(\frac{\alpha}{\sqrt{m\ln(me^{\alpha}/\delta)}}),0)$ $d_{\mathcal{X}}$-DP. Note that for our choice of $\alpha$ and $\mathcal{X}$, this is often a less restrictive requirement than $(\frac{\varepsilon}{\sqrt{m\ln(m/\delta)}},0)$-DP since $\alpha>\varepsilon$. We first derive an error bound on FreqEstLocal in terms of $A$ (specifically its right inverse), which we then optimize.

Theorem 6.5.

For the metric space $\mathcal{X}=\mathcal{B}\times\mathcal{C}$ and any mechanism $\mathcal{A}$ satisfying $(\alpha_{0},0)$ $d_{\mathcal{X}}$-DP where $\alpha_{0}=O(\frac{\alpha}{\sqrt{m\ln(me^{\alpha}/\delta)}})$ ($\alpha_{0}$ is specifically defined in Thm. 4.4), FreqEstLocal satisfies $(\alpha,\delta)$-bounded $d_{\textsf{EM}}$-DP in the local model and returns an estimator $\tilde{F}$ such that

\[
\max_{K}\mathbb{E}[d_{\textsf{EM}}(\tilde{F},\tilde{K})]\leq r\sqrt{\frac{st(\|B^{T}\|_{1\rightarrow 2}^{2}-1)}{mn}}+\sqrt{\frac{s(\|P^{T}B^{T}\|_{1\rightarrow 2}^{2}-1)}{mn}},
\tag{8}
\]

where $B$ is a right inverse of $\mathcal{A}$, $P=I_{\mathcal{B}}\otimes 1^{+}_{\mathcal{C}}$, and $1^{+}_{\mathcal{C}}$ is a column vector of $1$s indexed by $\mathcal{C}$.
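
As a rough illustration of how the right-hand side of (8) could be evaluated for a given $B$, the sketch below assumes that $\|\cdot\|_{1\rightarrow 2}$ is the usual $\ell_1\to\ell_2$ operator norm (the largest Euclidean norm of a column), that $s=|\mathcal{B}|$ and $t=|\mathcal{C}|$, and that the columns of $B$ are ordered cluster by cluster (index $b\cdot t+c$). These conventions, and the helper names, are assumptions rather than definitions taken from the text.

```python
import math


def norm_1_to_2(M):
    """l1 -> l2 operator norm: the largest Euclidean norm of a column of M."""
    rows, cols = len(M), len(M[0])
    return max(math.sqrt(sum(M[i][j] ** 2 for i in range(rows))) for j in range(cols))


def transpose(M):
    return [list(col) for col in zip(*M)]


def cluster_sum(B, s, t):
    """Compute B P for P = I_B (x) 1_C: add up the columns of B within each cluster."""
    return [[sum(row[b * t + c] for c in range(t)) for b in range(s)] for row in B]


def eq8_bound(B, s, t, r, m, n):
    """Right-hand side of (8); max(0, .) only guards against floating-point rounding."""
    within = r * math.sqrt(
        max(0.0, s * t * (norm_1_to_2(transpose(B)) ** 2 - 1)) / (m * n))
    PtBt = transpose(cluster_sum(B, s, t))            # P^T B^T = (B P)^T
    across = math.sqrt(max(0.0, s * (norm_1_to_2(PtBt) ** 2 - 1)) / (m * n))
    return within + across
```

For a noiseless mechanism ($B=I$) both norms equal $1$ and the bound evaluates to $0$, which is a quick sanity check on the conventions above.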

Remarks. The first term on the RHS of (8) is the cost of equalizing mass within clusters, and the second term is the cost of equalizing mass across clusters (since the matrix $P$ essentially projects $\mathcal{A}$ to act between clusters). For small $r$, the first term approaches $0$, and the second term may also approach $0$ because $\mathcal{A}$ will not often map a point outside its cluster under $d_{\mathcal{X}}$-DP (and thus $\|P^{T}B^{T}\|^{2}_{1\rightarrow 2}-1\rightarrow 0$).

Proof Sketch. Our bound forms a transportation plan between $\tilde{F}$ and $\tilde{K}$ by first mapping the mass within each cluster arbitrarily, which incurs at most $r\|\tilde{F}-\tilde{K}\|_{1}$ cost, and then equalizing the mass between clusters, which incurs at most $\|\tilde{F}-\tilde{K}\|_{1}$ cost. Both error terms can then be bounded by viewing $\tilde{F}-\tilde{K}$ as the sum of $mn$ independent variables drawn from a Dirichlet distribution with mean $0$, and applying a standard variance analysis. ∎

We apply Thm. 6.5 with $\mathcal{A}$ being a generalization of $k$-randomized response (Kairouz et al., 2016) which is adapted to $d_{\mathcal{X}}$-DP. Specifically, $\textsf{GKRR}_{\alpha_{0}}$ has probabilities given by, for each $(b,c)\in\mathcal{X}$,

\begin{align*}
\Pr[\textsf{GKRR}(b,c)=(b,c)] &\propto e^{\alpha_{0}},\\
\Pr[\textsf{GKRR}(b,c)=(b,c^{\prime})] &\propto e^{(1-r)\alpha_{0}} \qquad \forall c^{\prime}\neq c,\\
\Pr[\textsf{GKRR}(b,c)=(b^{\prime},c^{\prime})] &\propto 1 \qquad \forall b^{\prime}\neq b,\ c^{\prime}.
\end{align*}
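
The definition above can be turned into an explicit transition matrix and checked against the $d_{\mathcal{X}}$-DP constraint. The sketch below assumes $s=|\mathcal{B}|$, $t=|\mathcal{C}|$, $0\le r\le 1$, and a metric in which two distinct points are at distance $r$ if they share a cluster and at distance $1$ otherwise; this metric and the function names are assumptions made for illustration.

```python
import math
from itertools import product


def gkrr_matrix(s, t, r, alpha0):
    """Transition probabilities A[x, y] of GKRR on points x = (b, c)."""
    points = list(product(range(s), range(t)))
    # Normalizer: one "same point", t - 1 "same cluster", (s - 1) * t "other cluster".
    Z = math.exp(alpha0) + (t - 1) * math.exp((1 - r) * alpha0) + (s - 1) * t
    A = {}
    for x in points:
        for y in points:
            if y == x:
                w = math.exp(alpha0)
            elif y[0] == x[0]:
                w = math.exp((1 - r) * alpha0)
            else:
                w = 1.0
            A[x, y] = w / Z
    return points, A


def dist(x, y, r):
    """Assumed metric: 0 for equal points, r within a cluster, 1 across clusters."""
    if x == y:
        return 0.0
    return r if x[0] == y[0] else 1.0


def satisfies_metric_dp(points, A, r, alpha0, tol=1e-9):
    """Check A[x, y] <= exp(alpha0 * d(x, x2)) * A[x2, y] for all x, x2, y."""
    return all(
        A[x, y] <= math.exp(alpha0 * dist(x, x2, r)) * A[x2, y] * (1 + tol)
        for x in points for x2 in points for y in points
    )
```

Under the assumed metric the constraint is tight for the self-report probability: the ratio is exactly $e^{r\alpha_{0}}$ within a cluster and $e^{\alpha_{0}}$ across clusters.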

Using this mechanism, the higher-order terms of Eq. (8) will approach $0$ with $r$, as follows:

Theorem 6.6.

For the metric space $\mathcal{X}=\mathcal{B}\times\mathcal{C}$, FreqEstLocal with the mechanism $\mathcal{A}=\textsf{GKRR}_{\alpha_{0}}$ satisfies $(\alpha,\delta)$-$d_{\textsf{EM}}$-DP in the local model and returns an estimator $\tilde{F}$ such that

\begin{multline*}
\max_{K}\mathbb{E}[d_{\textsf{EM}}(\tilde{F},\tilde{K})]\leq r\sqrt{\frac{st^{3}}{mn}}\left(\frac{e^{\alpha_{0}}+s}{e^{\alpha_{0}}-e^{(1-r)\alpha_{0}}}\right)\\
+\sqrt{\frac{s^{2}t^{2}}{mn}}\left(\frac{\sqrt{s+2(e^{\alpha_{0}}-1)}}{e^{\alpha_{0}}+(t-1)e^{(1-r)\alpha_{0}}-t}\right),
\end{multline*}

where $\alpha_{0}$ is defined in Eq. (6).
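
For readers who want to plug in numbers, the bound of Thm. 6.6 can be evaluated directly. The sketch below takes $\alpha_{0}$ as an input rather than computing it from Eq. (6), which is not reproduced here, and requires $r>0$ and $\alpha_{0}>0$ so that both denominators are positive; the function name is hypothetical.

```python
import math


def thm_6_6_bound(s, t, r, m, n, alpha0):
    """Return the two terms (within-cluster, across-cluster) of the Thm. 6.6 bound."""
    ea = math.exp(alpha0)
    eb = math.exp((1 - r) * alpha0)
    within = r * math.sqrt(s * t ** 3 / (m * n)) * (ea + s) / (ea - eb)
    across = (math.sqrt(s ** 2 * t ** 2 / (m * n))
              * math.sqrt(s + 2 * (ea - 1)) / (ea + (t - 1) * eb - t))
    return within, across
```

Both terms scale as $1/\sqrt{mn}$ for a fixed $\alpha_{0}$, so quadrupling $n$ halves each of them.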

Remarks: Specifically, for our choice of $\alpha=\frac{\varepsilon}{r}$, we have

\[
\max_{K}\mathbb{E}[d_{\textsf{EM}}(\tilde{K},\tilde{F})]\leq 4\sqrt{\tfrac{k^{3}}{mn}}+64\tfrac{\sqrt{k^{3}}}{\alpha\sqrt{n}}\sqrt{\ln(4m\exp(\alpha)/\delta)}.
\]

Similar to Lemma 6.4, the $\sqrt{\frac{k^{3}}{mn}}$ term is the cost of sampling. The $r\frac{\sqrt{k^{3}}}{\varepsilon\sqrt{n}}$ term is the cost of privacy, and it dominates when $\alpha\leq\sqrt{m}$. We compare Thm. 6.6 with Lemma 6.4 in the regime $k,\varepsilon,\alpha<\sqrt{m}$, where the cost of privacy dominates. Specifically, the cost of Lemma 6.4 is $O(\sqrt{\frac{k^{2}\ln(m/\delta)}{n\varepsilon^{2}}})$, and the cost of Thm. 6.6 is $O(\sqrt{\frac{k^{3}}{\alpha^{2}n}\max\{\ln(\frac{m}{\delta}),\alpha\}})$. Given $\varepsilon$, the error of Thm. 6.6 will be smaller if

\[
\alpha>\begin{cases}\varepsilon\sqrt{k}&\varepsilon<\frac{1}{\sqrt{k}}\ln(\frac{m}{\delta})\\ \varepsilon^{2}\frac{k}{\ln(m/\delta)}&\text{otherwise}\end{cases}
\]

i.e., if $\alpha$ exceeds $\varepsilon$ by a factor of at least $\sqrt{k}$. This is possible if $k\ll\frac{1}{r}$, and for these instances $d_{\textsf{EM}}$-DP offers better utility than user-level DP. In Thm. 6.6, the super-linear factor of $k^{3/2}$ comes from the fact that $k$-RR is suboptimal in terms of $k$ (Acharya et al., 2018).
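
The threshold can be recovered by comparing the squares of the two privacy costs stated above, with constants dropped as in the $O(\cdot)$ expressions. A sketch of that calculation, writing $\Lambda=\ln(m/\delta)$ as a shorthand introduced here:

```latex
\begin{align*}
\frac{k^{3}}{\alpha^{2}n}\max\{\Lambda,\alpha\} &< \frac{k^{2}\Lambda}{n\varepsilon^{2}}
\\
\text{Case } \alpha\le\Lambda:\quad
\frac{k^{3}\Lambda}{\alpha^{2}n} < \frac{k^{2}\Lambda}{n\varepsilon^{2}}
&\iff \alpha^{2} > k\varepsilon^{2} \iff \alpha > \varepsilon\sqrt{k},
\\
\text{Case } \alpha>\Lambda:\quad
\frac{k^{3}}{\alpha n} < \frac{k^{2}\Lambda}{n\varepsilon^{2}}
&\iff \alpha > \frac{\varepsilon^{2}k}{\Lambda}.
\end{align*}
```

The first case is self-consistent when $\varepsilon\sqrt{k}\le\Lambda$, i.e. $\varepsilon\le\Lambda/\sqrt{k}$, which agrees with the case split in the condition above.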

6.2.2. Algorithms in the Central Model

The Laplace mechanism has been shown to be optimal for many instances of frequency estimation (Dwork et al., 2014). To attain user-level privacy, the baseline Laplace mechanism releases, for each $x\in\mathcal{X}$, the value $F_{x}=\tilde{K}_{G}(x)+Y$, where $Y\sim\mathrm{Lap}(\frac{1}{n\varepsilon})$. The distribution function $\tilde{F}$ is then the normalization of $\langle F_{x}:x\in\mathcal{X}\rangle$. This gives us the following guarantees.

Lemma 6.7.

For the metric space $\mathcal{X}=\mathcal{B}\times\mathcal{C}$, the Laplace mechanism described above satisfies $(\varepsilon,0)$-user-level DP, and produces an estimate $\tilde{F}$ such that

\[
\max_{K}\mathbb{E}[d_{\textsf{EM}}(\tilde{K},\tilde{F})]\leq O\left(\tfrac{k}{n\varepsilon}\right).
\]

Again, this utility does not depend on $m$, since each user contributes a $\frac{1}{n}$ fraction of the whole dataset, which is independent of $m$. Consistent with central DP, the error decreases as $\frac{1}{n}$, which is much faster than the $\frac{1}{\sqrt{n}}$ rate in the local model.
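
The baseline Laplace mechanism described above can be sketched as follows. The noise scale $\frac{1}{n\varepsilon}$ is taken from the text; the normalization step (clip negative values to zero, then rescale to sum to one, falling back to uniform if everything is clipped) is one possible choice and an assumption here, as is the function name.

```python
import random


def laplace_histogram(global_hist, n, eps, seed=0):
    """Add Lap(1 / (n * eps)) noise to each entry of the normalized global histogram."""
    rng = random.Random(seed)
    rate = n * eps                       # Laplace scale b = 1 / rate
    # The difference of two i.i.d. exponentials with this rate is Laplace(0, 1 / rate).
    noisy = [p + rng.expovariate(rate) - rng.expovariate(rate) for p in global_hist]
    clipped = [max(0.0, f) for f in noisy]           # assumed normalization, step 1
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(clipped)] * len(clipped)   # degenerate case: uniform
    return [f / total for f in clipped]              # assumed normalization, step 2
```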

It is possible to adapt FreqEstLocal to bounded central $d_{\textsf{EM}}$-DP by simply pretending to be one user who holds the global dataset $K_{G}$. The privacy analysis of Thm. 4.4 and the utility analysis of Thm. 6.6 may be combined to obtain the following corollary:

Corollary 6.8.

For the metric space $\mathcal{X}=\mathcal{B}\times\mathcal{C}$, FreqEstLocal with the mechanism $\mathcal{A}=\textsf{GKRR}_{\alpha_{0}}$ with $\alpha_{0}$ given in (7) satisfies $(\alpha,\delta)$-$d_{\textsf{EM}}$-DP in the central model and returns an estimator $\tilde{F}$ with error given in (7).

Remarks. In particular,

\[
\max_{K}\mathbb{E}[d_{\textsf{EM}}(\tilde{F},\tilde{K})]\leq 4\tfrac{\sqrt{k^{3}}}{\sqrt{mn}}+64\tfrac{\sqrt{k^{3}}}{\alpha n}\sqrt{\ln(4m\exp(\alpha)/\delta)}.
\]

The same sampling error is present, but the cost of privacy is reduced from a $\frac{1}{\sqrt{n}}$ dependence in Thm. 6.6 to just $\frac{1}{n}$. To compare just the cost of privacy in Corollary 6.8 to Lemma 6.7, we assume we are in the regime $n\leq\frac{m}{\alpha}$. Then, the cost in Corollary 6.8 is $O(\frac{\sqrt{k^{3}}}{\alpha n})\sqrt{\max\{\ln(\frac{m}{\delta}),\alpha\}}$. The error of Corollary 6.8 will be less when

\[
\alpha\geq\begin{cases}\varepsilon\sqrt{k\ln(\frac{m}{\delta})}&\varepsilon\leq\sqrt{\frac{\ln(m/\delta)}{k}}\\ \varepsilon^{2}k&\text{otherwise}\end{cases}
\]

Thus, the utility is improved when $\alpha$ is bigger than $\varepsilon$ by a factor of at least $\sqrt{k}$, which is achieved when $k\ll\frac{1}{r}$. One final advantage of Corollary 6.8 is that it may be implemented in the shuffle model of DP, which requires less trust than the central model. This parallels prior results in the shuffle model of DP (Feldman et al., 2022).
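
The central-model threshold can be sanity-checked numerically: at the stated threshold the two privacy costs coincide, and above it the cost of Corollary 6.8 is smaller. The sketch below uses the cost expressions from the text with all $O(\cdot)$ constants set to $1$, which is an assumption; the function names and test values are hypothetical.

```python
import math


def central_alpha_threshold(eps, k, m, delta):
    """Threshold on alpha from the case split comparing Corollary 6.8 with Lemma 6.7."""
    log_term = math.log(m / delta)
    if eps <= math.sqrt(log_term / k):
        return eps * math.sqrt(k * log_term)
    return eps ** 2 * k


def corollary_cost(alpha, k, n, m, delta):
    """Privacy cost of Corollary 6.8 (constant 1): k^{3/2} / (alpha n) * sqrt(max{ln(m/delta), alpha})."""
    return k ** 1.5 / (alpha * n) * math.sqrt(max(math.log(m / delta), alpha))


def laplace_cost(eps, k, n):
    """Privacy cost of Lemma 6.7 (constant 1): k / (n eps)."""
    return k / (n * eps)
```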

7. Related Work

Item-level DP. DP was originally considered at the item level (Dwork, 2006), where a privacy guarantee is made when one item in the sensitive dataset is changed. Most relevant to our setting are results in distribution estimation (Hay et al., 2009; Xu et al., 2013; Suresh, 2019); these results study more complex estimation problems than frequency estimation. We also consider linear query release, for which there is a long line of work (Hardt and Talwar, 2010; Bhaskara et al., 2012; Nikolov et al., 2013; Blum et al., 2013; Li et al., 2015). The mechanism in (Hardt and Talwar, 2010) is often optimal and easy to adapt to our setting; we compare our algorithms with it.
User-level DP. With a vast increase in the data collected about users, user-level privacy is gaining more interest (Amin et al., 2019; Narayanan et al., 2022; Bassily and Sun, 2023; Levy et al., 2021; Liu et al., 2020; Cummings et al., 2022). The most relevant work to ours is on user-level private mean estimation (Cummings et al., 2022) and histogram estimation (Liu et al., 2023; Acharya et al., 2023), though these problems are again more complex than the ones we study. Another related area is the problem of deciding how much data to use from each user when users hold different amounts of data (Amin et al., 2019; Liu et al., 2023; Cummings et al., 2022), which is related to our unbounded DP setup. These techniques apply to more specialized settings than our general black-box reduction, and they are not immediately comparable.
Local DP. Local DP has also received a lot of attention recently. The results most relevant to our work are on locally private linear query release (Duchi et al., 2013; Bassily, 2019) and distribution estimation (Duchi et al., 2013; Kairouz et al., 2016; Acharya et al., 2018; Chen et al., 2020; Acharya et al., 2023). We directly compare our work to the optimal algorithms in (Bassily, 2019) and (Chen et al., 2020) for our problems, which can be adapted to user-level DP easily. Another related line of work is privacy amplification from the local model to the central model, given access to a trusted shuffler (Erlingsson et al., 2019; Girgis et al., 2021; Feldman et al., 2022). We extend the state-of-the-art analysis in (Feldman et al., 2022) to general metric DP.
Metric DP. Metric DP was first proposed in (Chatzikokolakis et al., 2013) in the central model. In the local model, this has led to work on releasing numeric data (Roy Chowdhury et al., 2022), location data (Andrés et al., 2013; Bordenabe et al., 2014; Chatzikokolakis et al., 2015; Weggenmann and Kerschbaum, 2021) and text (Feyisetan et al., 2019, 2020; Feyisetan and Kasiviswanathan, 2021; Imola et al., 2022). Unlike these works, we consider privacy in a general metric space. The most related work is that of (Fernandes et al., 2019), which proposes metric DP based on $d_{\textsf{EM}}$ for releasing text embeddings. As explained in the introduction, we consider a much more general setting than (Fernandes et al., 2019).

8. Conclusion

We have proposed metric DP at the user level using the earth-mover's distance $d_{\textsf{EM}}$. This captures both the magnitude and structural aspects of changes in the data, resulting in a tailored privacy semantic. We have designed two novel privacy mechanisms under $d_{\textsf{EM}}$-DP which improve the utility over standard DP. Additionally, we have shown that general (unbounded) $d_{\textsf{EM}}$-DP can be reduced to the simpler (bounded) case where all users have the same amount of data. Finally, we have demonstrated that $d_{\textsf{EM}}$-DP can provably provide improved utility over user-level DP for certain types of linear queries and frequency estimation.

References

  • Abowd (2018) John M Abowd. 2018. The US Census Bureau adopts differential privacy. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. 2867–2867.
  • Acharya et al. (2023) Jayadev Acharya, Yuhan Liu, and Ziteng Sun. 2023. Discrete distribution estimation under user-level local differential privacy. In International Conference on Artificial Intelligence and Statistics. PMLR, 8561–8585.
  • Acharya et al. (2018) Jayadev Acharya, Ziteng Sun, and Huanyu Zhang. 2018. Communication Efficient, Sample Optimal, Linear Time Locally Private Discrete Distribution Estimation. CoRR abs/1802.04705 (2018). arXiv:1802.04705 http://arxiv.org/abs/1802.04705
  • Alvim et al. (2018) Mário Alvim, Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Anna Pazii. 2018. Invited Paper: Local Differential Privacy on Metric Spaces: Optimizing the Trade-Off with Utility. In 2018 IEEE 31st Computer Security Foundations Symposium (CSF). 262–267. https://doi.org/10.1109/CSF.2018.00026
  • Amin et al. (2019) Kareem Amin, Alex Kulesza, Andres Munoz, and Sergei Vassilvitskii. 2019. Bounding user contributions: A bias-variance trade-off in differential privacy. In International Conference on Machine Learning. PMLR, 263–271.
  • Andrés et al. (2013) Miguel E Andrés, Nicolás E Bordenabe, Konstantinos Chatzikokolakis, and Catuscia Palamidessi. 2013. Geo-indistinguishability: Differential privacy for location-based systems. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. 901–914.
  • Balle et al. (2018) Borja Balle, Gilles Barthe, and Marco Gaboardi. 2018. Privacy amplification by subsampling: Tight analyses via couplings and divergences. Advances in neural information processing systems 31 (2018).
  • Barthe and Olmedo (2013) Gilles Barthe and Federico Olmedo. 2013. Beyond differential privacy: Composition theorems and relational logic for f-divergences between probabilistic programs. In International Colloquium on Automata, Languages, and Programming. Springer, 49–60.
  • Bassily (2019) Raef Bassily. 2019. Linear queries estimation with local differential privacy. In The 22nd International Conference on Artificial Intelligence and Statistics. PMLR, 721–729.
  • Bassily and Sun (2023) Raef Bassily and Ziteng Sun. 2023. User-level private stochastic convex optimization with optimal rates. In International Conference on Machine Learning. PMLR, 1838–1851.
  • Bhaskara et al. (2012) Aditya Bhaskara, Daniel Dadush, Ravishankar Krishnaswamy, and Kunal Talwar. 2012. Unconditional differentially private mechanisms for linear queries. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing. 1269–1284.
  • Blum et al. (2013) Avrim Blum, Katrina Ligett, and Aaron Roth. 2013. A learning theory approach to noninteractive database privacy. Journal of the ACM (JACM) 60, 2 (2013), 1–25.
  • Bordenabe et al. (2014) Nicolás E. Bordenabe, Konstantinos Chatzikokolakis, and Catuscia Palamidessi. 2014. Optimal Geo-Indistinguishable Mechanisms for Location Privacy. Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (Nov. 2014), 251–262. https://doi.org/10.1145/2660267.2660345 arXiv: 1402.5029.
  • Chatzikokolakis et al. (2013) Konstantinos Chatzikokolakis, Miguel E Andrés, Nicolás Emilio Bordenabe, and Catuscia Palamidessi. 2013. Broadening the scope of differential privacy using metrics. In PETS.
  • Chatzikokolakis et al. (2015) Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Marco Stronati. 2015. Constructing elastic distinguishability metrics for location privacy. arXiv preprint arXiv:1503.00756 (2015).
  • Chen et al. (2020) Wei-Ning Chen, Peter Kairouz, and Ayfer Ozgur. 2020. Breaking the communication-privacy-accuracy trilemma. Advances in Neural Information Processing Systems 33 (2020), 3312–3324.
  • Cormode et al. (2018) Graham Cormode, Somesh Jha, Tejas Kulkarni, Ninghui Li, Divesh Srivastava, and Tianhao Wang. 2018. Privacy at scale: Local differential privacy in practice. In Proceedings of the 2018 International Conference on Management of Data. 1655–1658.
  • Csiszár (1975) Imre Csiszár. 1975. I-divergence geometry of probability distributions and minimization problems. The annals of probability (1975), 146–158.
  • Cummings et al. (2022) Rachel Cummings, Vitaly Feldman, Audra McMillan, and Kunal Talwar. 2022. Mean estimation with user-level privacy under data heterogeneity. Advances in Neural Information Processing Systems 35 (2022), 29139–29151.
  • Duchi et al. (2013) John C Duchi, Michael I Jordan, and Martin J Wainwright. 2013. Local privacy, data processing inequalities, and minimax rates. arXiv preprint arXiv:1302.3203 (2013).
  • Dwork (2006) Cynthia Dwork. 2006. Differential privacy. In International colloquium on automata, languages, and programming. Springer, 1–12.
  • Dwork et al. (2014) Cynthia Dwork, Aaron Roth, et al. 2014. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science 9, 3–4 (2014), 211–407.
  • Erlingsson et al. (2019) Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, and Abhradeep Thakurta. 2019. Amplification by shuffling: From local to central differential privacy via anonymity. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 2468–2479.
  • Erlingsson et al. (2014) Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. 2014. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. 1054–1067.
  • Feldman et al. (2022) Vitaly Feldman, Audra McMillan, and Kunal Talwar. 2022. Hiding among the clones: A simple and nearly optimal analysis of privacy amplification by shuffling. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS). IEEE, 954–964.
  • Fernandes et al. (2019) Natasha Fernandes, Mark Dras, and Annabelle McIver. 2019. Generalised differential privacy for text document processing. In Principles of Security and Trust: 8th International Conference, POST 2019, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019, Prague, Czech Republic, April 6–11, 2019, Proceedings 8. Springer International Publishing, 123–148.
  • Feyisetan et al. (2020) Oluwaseyi Feyisetan, Borja Balle, Thomas Drake, and Tom Diethe. 2020. Privacy-and utility-preserving textual analysis via calibrated multivariate perturbations. In Proceedings of the 13th international conference on web search and data mining. 178–186.
  • Feyisetan et al. (2019) Oluwaseyi Feyisetan, Tom Diethe, and Thomas Drake. 2019. Leveraging hierarchical representations for preserving privacy and utility in text. In 2019 IEEE International Conference on Data Mining (ICDM). IEEE, 210–219.
  • Feyisetan and Kasiviswanathan (2021) Oluwaseyi Feyisetan and Shiva Kasiviswanathan. 2021. Private release of text embedding vectors. In Proceedings of the First Workshop on Trustworthy Natural Language Processing. 15–27.
  • Girgis et al. (2021) Antonious M Girgis, Deepesh Data, Suhas Diggavi, Ananda Theertha Suresh, and Peter Kairouz. 2021. On the renyi differential privacy of the shuffle model. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 2321–2341.
  • Givens and Shortt (1984) Clark R Givens and Rae Michael Shortt. 1984. A class of Wasserstein metrics for probability distributions. Michigan Mathematical Journal 31, 2 (1984), 231–240.
  • Hardt and Talwar (2010) Moritz Hardt and Kunal Talwar. 2010. On the geometry of differential privacy. In Proceedings of the forty-second ACM symposium on Theory of computing. 705–714.
  • Hay et al. (2009) Michael Hay, Vibhor Rastogi, Gerome Miklau, and Dan Suciu. 2009. Boosting the accuracy of differentially-private histograms through consistency. arXiv preprint arXiv:0904.0942 (2009).
  • Imola et al. (2022) Jacob Imola, Shiva Kasiviswanathan, Stephen White, Abhinav Aggarwal, and Nathanael Teissier. 2022. Balancing utility and scalability in metric differential privacy. In Uncertainty in Artificial Intelligence. PMLR, 885–894.
  • Kairouz et al. (2016) Peter Kairouz, Keith Bonawitz, and Daniel Ramage. 2016. Discrete distribution estimation under local privacy. In International Conference on Machine Learning. PMLR, 2436–2444.
  • Kairouz et al. (2015) Peter Kairouz, Sewoong Oh, and Pramod Viswanath. 2015. The composition theorem for differential privacy. In International conference on machine learning. PMLR, 1376–1385.
  • Konig (2001) Dénes Konig. 2001. Theorie der endlichen und unendlichen Graphen. Vol. 72. American Mathematical Soc.
  • Levy et al. (2021) Daniel Levy, Ziteng Sun, Kareem Amin, Satyen Kale, Alex Kulesza, Mehryar Mohri, and Ananda Theertha Suresh. 2021. Learning with User-Level Privacy. In Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (Eds.), Vol. 34. Curran Associates, Inc., 12466–12479. https://proceedings.neurips.cc/paper_files/paper/2021/file/67e235e7f2fa8800d8375409b566e6b6-Paper.pdf
  • Li et al. (2015) Chao Li, Gerome Miklau, Michael Hay, Andrew McGregor, and Vibhor Rastogi. 2015. The matrix mechanism: optimizing linear counting queries under differential privacy. The VLDB journal 24 (2015), 757–781.
  • Li et al. (2016) N. Li, M. Lyu, D. Su, and W. Yang. 2016. Differential Privacy: From Theory to Practice. Morgan and Claypool. https://ieeexplore.ieee.org/document/7731575
  • Liu et al. (2020) Yuhan Liu, Ananda Theertha Suresh, Felix Xinnan X Yu, Sanjiv Kumar, and Michael Riley. 2020. Learning discrete distributions: user vs item-level privacy. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 20965–20976. https://proceedings.neurips.cc/paper_files/paper/2020/file/f06edc8ab534b2c7ecbd4c2051d9cb1e-Paper.pdf
  • Liu et al. (2023) Yuhan Liu, Ananda Theertha Suresh, Wennan Zhu, Peter Kairouz, and Marco Gruteser. 2023. Algorithms for bounding contribution for histogram estimation under user-level privacy. In International Conference on Machine Learning. PMLR, 21969–21996.
  • Narayanan et al. (2022) Shyam Narayanan, Vahab Mirrokni, and Hossein Esfandiari. 2022. Tight and robust private mean estimation with few users. In International Conference on Machine Learning. PMLR, 16383–16412.
  • Nikolov et al. (2013) Aleksandar Nikolov, Kunal Talwar, and Li Zhang. 2013. The geometry of differential privacy: the sparse and approximate cases. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing. 351–360.
  • Roy Chowdhury et al. (2022) Amrita Roy Chowdhury, Bolin Ding, Somesh Jha, Weiran Liu, and Jingren Zhou. 2022. Strengthening Order Preserving Encryption with Differential Privacy. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (Los Angeles, CA, USA) (CCS ’22). Association for Computing Machinery, New York, NY, USA, 2519–2533. https://doi.org/10.1145/3548606.3560610
  • Suresh (2019) Ananda Theertha Suresh. 2019. Differentially private anonymized histograms. Advances in Neural Information Processing Systems 32 (2019).
  • Weggenmann and Kerschbaum (2021) Benjamin Weggenmann and Florian Kerschbaum. 2021. Differential privacy for directional data. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 1205–1222.
  • Xu et al. (2013) Jia Xu, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, Ge Yu, and Marianne Winslett. 2013. Differentially private histogram publication. The VLDB journal 22 (2013), 797–822.

Appendix A Omitted Technical Details

An alternative characterization of differential privacy is through the hockey-stick divergence (Barthe and Olmedo, 2013). For probability distributions $P, Q$ defined on a space $\mathcal{Y}$, this is given by the following:

Definition A.1.

Let $\varepsilon, \delta > 0$, and let $P, Q$ be distributions on a space $\mathcal{Y}$. The hockey-stick divergence is given by

$$D_{e^{\varepsilon}}(P\|Q)=\int_{\mathcal{Y}}\max\left\{\frac{P(y)}{Q(y)}-e^{\varepsilon},0\right\}Q(y)\,dy.$$

It is easy to show that $D_{e^{\varepsilon}}(M(K)\|M(K^{\prime}))\leq\delta$ implies (2), so Definition A.1 provides an alternative way to prove privacy.
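As a quick illustration of Definition A.1, the following is a minimal sketch for distributions on a finite space, where the integral becomes a sum of $\max\{P(y)-e^{\varepsilon}Q(y),0\}$. The function name `hockey_stick` and the binary randomized-response example are assumptions made for illustration and do not appear in the paper.

```python
import math

def hockey_stick(p, q, eps):
    """Discrete form of D_{e^eps}(P || Q): sum over y of max(P(y) - e^eps * Q(y), 0)."""
    return sum(max(pi - math.exp(eps) * qi, 0.0) for pi, qi in zip(p, q))

# Hypothetical example: binary randomized response with parameter eps0.
eps0 = 1.0
keep = math.exp(eps0) / (1.0 + math.exp(eps0))  # probability of reporting the true bit
p = [keep, 1.0 - keep]  # output distribution when the input bit is 0
q = [1.0 - keep, keep]  # output distribution when the input bit is 1

d_at_eps0 = hockey_stick(p, q, eps0)  # zero up to rounding, i.e. delta = 0 at eps0
d_at_zero = hockey_stick(p, q, 0.0)   # at eps = 0 this is the total variation distance
```

At $\varepsilon=\varepsilon_0$ the divergence vanishes, matching the fact that randomized response needs no $\delta$ at its own privacy level, while at $\varepsilon=0$ the divergence reduces to the total variation distance $2\cdot\texttt{keep}-1$.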

The divergence in Definition A.1 satisfies a number of useful properties. First, because it is an $f$-divergence (Csiszár, 1975), it satisfies the data-processing inequality: for any function $f$, we have

$$D_{e^{\varepsilon}}(f(P)\|f(Q))\leq D_{e^{\varepsilon}}(P\|Q).$$

This property is used to show that DP is invariant to post-processing by any function $f$. The second property, again holding for all $f$-divergences, is convexity. This states that for two pairs of distributions $P_{1},P_{2},Q_{1},Q_{2}\in\Delta^{\mathcal{Y}}$ and a real number $\lambda\in[0,1]$ we have

$$\begin{aligned}
&D_{e^{\varepsilon}}(\lambda P_{1}+(1-\lambda)P_{2}\|\lambda Q_{1}+(1-\lambda)Q_{2})\\
&\qquad\leq\lambda D_{e^{\varepsilon}}(P_{1}\|Q_{1})+(1-\lambda)D_{e^{\varepsilon}}(P_{2}\|Q_{2}).
\end{aligned}$$

Stated in terms of couplings, we may generalize convexity as follows:

Lemma A.1.

Suppose $X,Y\in\mathcal{X}$ are random variables with probability distributions $P_{X},P_{Y}\in\Delta^{\mathcal{X}}$. Suppose $\mathcal{M}:\mathcal{X}\rightarrow\mathcal{Y}$ is a randomized function. Then, for any coupling $C\in\mathcal{C}(P_{X},P_{Y})$, we have

$$D_{e^{\varepsilon}}(\mathcal{M}(X)\|\mathcal{M}(Y))\leq\mathbb{E}_{(x,y)\sim C}[D_{e^{\varepsilon}}(\mathcal{M}(x)\|\mathcal{M}(y))].$$
Proof.

We may write

$$\begin{aligned}
\mathcal{M}(X)&=\sum_{x\in\mathcal{X}}P_{X}(x)\mathcal{M}(x)=\sum_{x,y\in\mathcal{X}}C(x,y)\mathcal{M}(x)\\
\mathcal{M}(Y)&=\sum_{x,y\in\mathcal{X}}C(x,y)\mathcal{M}(y).
\end{aligned}$$

Applying convexity, we have

$$D_{e^{\varepsilon}}(\mathcal{M}(X)\|\mathcal{M}(Y))\leq\sum_{x,y\in\mathcal{X}}C(x,y)D_{e^{\varepsilon}}(\mathcal{M}(x)\|\mathcal{M}(y)),$$

and the claim follows. ∎
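Lemma A.1 can be sanity-checked numerically on a small finite example. The sketch below is only an illustration under assumed inputs: the channel `M`, the marginals `P_X`, `P_Y`, and the coupling `C` are hypothetical values chosen by hand, not taken from the paper.

```python
import math

def hockey_stick(p, q, eps):
    """Discrete hockey-stick divergence D_{e^eps}(P || Q)."""
    return sum(max(pi - math.exp(eps) * qi, 0.0) for pi, qi in zip(p, q))

def push_forward(prior, channel):
    """Distribution of M(X) when X is drawn from prior and row x of channel is M(x)."""
    n_out = len(channel[0])
    return [sum(prior[x] * channel[x][o] for x in range(len(prior)))
            for o in range(n_out)]

# Hypothetical randomized function on inputs {0, 1, 2}; row x is the law of M(x).
M = [[0.6, 0.3, 0.1],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]
P_X = [0.5, 0.3, 0.2]
P_Y = [0.2, 0.3, 0.5]
# A coupling of P_X and P_Y: row sums equal P_X, column sums equal P_Y.
C = [[0.2, 0.3, 0.0],
     [0.0, 0.0, 0.3],
     [0.0, 0.0, 0.2]]

eps = 0.5
# Left-hand side: divergence between the two mixture output distributions.
lhs = hockey_stick(push_forward(P_X, M), push_forward(P_Y, M), eps)
# Right-hand side: expected pointwise divergence under the coupling.
rhs = sum(C[x][y] * hockey_stick(M[x], M[y], eps)
          for x in range(3) for y in range(3))
```

Here `lhs` is the divergence between the mixtures and `rhs` is the coupling-weighted average on the right-hand side of the lemma. Different couplings of the same marginals give different values of `rhs`, which is why the choice of coupling matters when the lemma is applied.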

Third, $D_{e^{\varepsilon}}$ satisfies a “weak” triangle inequality (also known as group privacy):

Lemma A.2.

For distributions $P,Q,R$ on $\mathcal{Y}$, we have $D_{e^{\alpha+\beta}}(P\|R)\leq D_{e^{\alpha}}(P\|Q)+e^{\alpha}D_{e^{\beta}}(Q\|R)$.

Proof.

For any $P,Q,\varepsilon$, we may view $D_{e^{\varepsilon}}(P\|Q)$ through its dual form as

$$D_{e^{\varepsilon}}(P\|Q)=\sup_{Y\subseteq\mathcal{Y}}(P(Y)-e^{\varepsilon}Q(Y)).$$

Thus, let $Y^{*}$ denote the maximal set such that

$$D_{e^{\alpha+\beta}}(P\|R)=(P(Y^{*})-e^{\alpha+\beta}R(Y^{*})).$$

We may rewrite this as

$$\begin{aligned}
D_{e^{\alpha+\beta}}(P\|R)&=(P(Y^{*})-e^{\alpha}Q(Y^{*}))+e^{\alpha}(Q(Y^{*})-e^{\beta}R(Y^{*}))\\
&\leq D_{e^{\alpha}}(P\|Q)+e^{\alpha}D_{e^{\beta}}(Q\|R),
\end{aligned}$$

showing the claim. ∎
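The inequality of Lemma A.2 can likewise be checked on a concrete finite example. This is a sketch under assumed values: the three distributions and the parameters `alpha`, `beta` below are hypothetical and serve only to instantiate the statement.

```python
import math

def hockey_stick(p, q, eps):
    """Discrete hockey-stick divergence D_{e^eps}(P || Q)."""
    return sum(max(pi - math.exp(eps) * qi, 0.0) for pi, qi in zip(p, q))

# Hypothetical distributions on a three-element space.
P = [0.7, 0.2, 0.1]
Q = [0.3, 0.5, 0.2]
R = [0.2, 0.1, 0.7]
alpha, beta = 0.3, 0.4

# Left-hand side of Lemma A.2: divergence from P to R at level alpha + beta.
lhs = hockey_stick(P, R, alpha + beta)
# Right-hand side: the two intermediate divergences, the second scaled by e^alpha.
rhs = hockey_stick(P, Q, alpha) + math.exp(alpha) * hockey_stick(Q, R, beta)
```

The factor $e^{\alpha}$ on the second term is what makes the inequality "weak": chaining many steps through intermediate distributions accumulates the $\delta$ terms with growing multiplicative weights rather than as a plain sum.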

Appendix B Omitted Proofs from Section 4

B.1. Proof of Theorem 4.1

See Theorem 4.1. For any two distributions $\tilde{K},\tilde{K}^{\prime}$, we have

$$\begin{aligned}
q_{f}(K)-q_{f}(K^{\prime})&=\mathbb{E}_{x\sim\tilde{K}}[f(x)]-\mathbb{E}_{x\sim\tilde{K}^{\prime}}[f(x)]\\
&=\sum_{x\in\mathcal{X}}f(x)\tilde{K}(x)-\sum_{y\in\mathcal{X}}f(y)\tilde{K}^{\prime}(y).
\end{aligned}$$

Let $C(x,y)=\{C_{x}(y)\}_{x\in\mathcal{X}}$ be the minimum-transport coupling between $\tilde{K}$ and $\tilde{K}^{\prime}$, where $C_{x}(\cdot)$ denotes the conditional distribution of $y$ given $x$, so that $C(x,y)=C_{x}(y)\tilde{K}(x)$ and $\sum_{y\in\mathcal{X}}C_{x}(y)=1$. By Definition 2.4, we have $\tilde{K}^{\prime}(y)=\sum_{x\in\mathcal{X}}C(x,y)$, and $d_{\textsf{EM}}(\tilde{K},\tilde{K}^{\prime})=\sum_{x,y\in\mathcal{X}}d_{\mathcal{X}}(x,y)C(x,y)$. Now, we write

$$\begin{aligned}
&\sum_{x\in\mathcal{X}}f(x)\tilde{K}(x)-\sum_{y\in\mathcal{X}}f(y)\tilde{K}^{\prime}(y)\\
&\qquad=\sum_{x\in\mathcal{X}}f(x)\tilde{K}(x)-\sum_{y\in\mathcal{X}}f(y)\sum_{x\in\mathcal{X}}C(x,y)\\
&\qquad=\sum_{x\in\mathcal{X}}\left(f(x)-\sum_{y\in\mathcal{X}}f(y)C_{x}(y)\right)\tilde{K}(x)\\
&\qquad=\sum_{x\in\mathcal{X}}\left(\sum_{y\in\mathcal{X}}f(x)C_{x}(y)-\sum_{y\in\mathcal{X}}f(y)C_{x}(y)\right)\tilde{K}(x)\\
&\qquad=\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{X}}\left(f(x)-f(y)\right)C_{x}(y)\tilde{K}(x)\\
&\qquad=\sum_{x,y\in\mathcal{X}}\left(f(x)-f(y)\right)C(x,y).
\end{aligned}$$

By the triangle inequality and the fact that $f$ is $\ell$-Lipschitz, we may write

$$\begin{aligned}
\|q_{f}(K)-q_{f}(K^{\prime})\|&\leq\sum_{x,y\in\mathcal{X}}\|f(x)-f(y)\|C(x,y)\\
&\leq\sum_{x,y\in\mathcal{X}}\ell d_{\mathcal{X}}(x,y)C(x,y)\\
&=\ell d_{\textsf{EM}}(\tilde{K},\tilde{K}^{\prime}).
\end{aligned}$$

The last equation tells us that $\Delta_{d_{\textsf{EM}}}(q_{f})\leq\ell$.
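The sensitivity bound can be illustrated on the real line, where the earth-mover's distance with ground metric $|x-y|$ equals the integral of the absolute difference between the two cumulative distribution functions. The sketch below assumes this one-dimensional setting; the helper names `emd_1d` and `linear_query`, the support points, and the two histograms are hypothetical choices made for the example.

```python
def emd_1d(points, k, k_prime):
    """Earth-mover's distance between two distributions supported on sorted
    points of the real line, with ground metric |x - y| (CDF-difference formula)."""
    total, cdf_gap = 0.0, 0.0
    for i in range(len(points) - 1):
        cdf_gap += k[i] - k_prime[i]  # running difference of the two CDFs
        total += abs(cdf_gap) * (points[i + 1] - points[i])
    return total

def linear_query(f, points, k):
    """q_f(K): expectation of f under the normalized histogram k."""
    return sum(f(x) * w for x, w in zip(points, k))

points = [0.0, 1.0, 2.0, 3.0]
K = [0.5, 0.5, 0.0, 0.0]
K_prime = [0.0, 0.5, 0.5, 0.0]  # half of the mass moved from 0 to 2

ell = 2.0
f = lambda x: ell * abs(x - 1.5)  # ell-Lipschitz with respect to |x - y|
g = lambda x: ell * x             # ell-Lipschitz and attains the bound here

d_em = emd_1d(points, K, K_prime)
gap_f = abs(linear_query(f, points, K) - linear_query(f, points, K_prime))
gap_g = abs(linear_query(g, points, K) - linear_query(g, points, K_prime))
```

In this example `d_em` is 1, `gap_f` stays strictly below $\ell\cdot d_{\textsf{EM}}$, and `gap_g` meets it with equality, so the bound of Theorem 4.1 cannot be improved in general.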

B.2. Proof of Lemma 4.2

See Lemma 4.2. In the local model, by Theorem 4.1, we have $\|q_{f}(K)-q_{f}(K^{\prime})\|\leq\ell$. Adding noise drawn from $\Gamma(d,\frac{1}{\alpha})$ is known to satisfy $(\alpha,0)$-DP (Hardt and Talwar, 2010). In the bounded central setting, we have $\|q_{f}(K)-q_{f}(K^{\prime})\|\leq\frac{\ell}{n}$, and thus we may add noise drawn from $\Gamma(d,\frac{1}{n\alpha})$.
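One way to read "noise drawn from $\Gamma(d,\frac{1}{\alpha})$" is as a $d$-dimensional vector whose direction is uniform on the sphere and whose Euclidean norm follows a Gamma distribution with shape $d$ and scale $1/\alpha$, which gives a density proportional to $\exp(-\alpha\|z\|_{2})$. The sketch below implements that reading. The Euclidean norm, the function name `sample_gamma_norm_noise`, and the parameter values are assumptions for illustration; any rescaling by the sensitivity $\ell$ is left out and should follow the statement of Lemma 4.2.

```python
import math
import random

def sample_gamma_norm_noise(d, alpha, rng):
    """Sample z in R^d with density proportional to exp(-alpha * ||z||_2).

    Assumed reading of the Gamma noise: uniform direction, Gamma(d, 1/alpha) radius.
    """
    gauss = [rng.gauss(0.0, 1.0) for _ in range(d)]  # isotropic, gives a uniform direction
    norm = math.sqrt(sum(g * g for g in gauss))
    radius = rng.gammavariate(d, 1.0 / alpha)        # shape d, scale 1/alpha
    return [radius * g / norm for g in gauss]

rng = random.Random(0)
d, alpha = 3, 2.0
samples = [sample_gamma_norm_noise(d, alpha, rng) for _ in range(20000)]
# Empirical mean of the noise norm; the Gamma(d, 1/alpha) mean is d / alpha.
mean_radius = sum(math.sqrt(sum(c * c for c in z)) for z in samples) / len(samples)
```

A noisy answer would then be the query value plus one such sample. In the bounded central setting described above, the same sampler would be called with $n\alpha$ in place of $\alpha$, which shrinks the expected noise norm by a factor of $n$.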

B.3. Proof of Theorem 4.3

See Theorem 4.3. We will first assume the following lemma:

Lemma B.1.

Suppose that $\mathcal{A}$ is an $\alpha_{0}d_{\mathcal{X}}$-metric DP algorithm, where $d_{\mathcal{X}}\leq 1$. Let $x_{1}^{0},x_{1}^{1},x_{2},\ldots,x_{m}\in\mathcal{X}$ be a set of inputs such that $d_{\mathcal{X}}(x_{1}^{0},x_{1}^{1})\leq d$, and let $\delta>0$ be a constant such that $\alpha_{0}\leq\ln(\frac{m}{16\ln(2/\delta)})$. Then, we have that

$$\begin{aligned}
D_{e^{\alpha}}\big(&\mathsf{Shuffle}(\mathcal{A}(x_{1}^{0}),\ldots,\mathcal{A}(x_{m})),\\
&\mathsf{Shuffle}(\mathcal{A}(x_{1}^{1}),\mathcal{A}(x_{2}),\ldots,\mathcal{A}(x_{m}))\big)\leq\delta,
\end{aligned}$$

where

$$\alpha\leq\ln\left(1+\frac{e^{\alpha_{0}d}-1}{e^{\alpha_{0}d}+1}\left(\frac{8\sqrt{e^{\alpha_{0}}\ln(4/\delta)}}{\sqrt{m}}+\frac{8e^{\alpha_{0}}}{m}\right)\right).$$
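To get a feel for the bound in Lemma B.1, the expression for $\alpha$ can be evaluated directly. The following is a minimal sketch: the function name `shuffled_alpha_bound`, the input validation, and the example parameters are assumptions for illustration, while the formula and the precondition on $\alpha_{0}$ are transcribed from the lemma.

```python
import math

def shuffled_alpha_bound(alpha0, d, m, delta):
    """Evaluate the upper bound on alpha stated in Lemma B.1."""
    # Sanity checks (assumed here): delta in (0, 1) and d in [0, 1] since d_X <= 1.
    if not (0.0 < delta < 1.0) or not (0.0 <= d <= 1.0):
        raise ValueError("expected 0 < delta < 1 and 0 <= d <= 1")
    # Precondition of the lemma on the local privacy parameter.
    if alpha0 > math.log(m / (16.0 * math.log(2.0 / delta))):
        raise ValueError("precondition alpha0 <= ln(m / (16 ln(2/delta))) not met")
    # Distance-dependent factor (e^{alpha0 d} - 1) / (e^{alpha0 d} + 1).
    ratio = (math.exp(alpha0 * d) - 1.0) / (math.exp(alpha0 * d) + 1.0)
    # Shuffling term, which decays with the number of shuffled reports m.
    amplification = (8.0 * math.sqrt(math.exp(alpha0) * math.log(4.0 / delta)) / math.sqrt(m)
                     + 8.0 * math.exp(alpha0) / m)
    return math.log(1.0 + ratio * amplification)

# Hypothetical parameters: 10,000 shuffled reports, local parameter alpha0 = 1.
far = shuffled_alpha_bound(alpha0=1.0, d=1.0, m=10_000, delta=1e-6)
near = shuffled_alpha_bound(alpha0=1.0, d=0.1, m=10_000, delta=1e-6)
same = shuffled_alpha_bound(alpha0=1.0, d=0.0, m=10_000, delta=1e-6)
```

The factor $\frac{e^{\alpha_{0}d}-1}{e^{\alpha_{0}d}+1}$ is what makes the amplified guarantee depend on the distance $d$ between the two differing inputs: it vanishes at $d=0$ and grows with $d$, so nearby inputs receive a smaller $\alpha$ than distant ones.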

To prove Theorem 4.3, let

$$S(\mathbf{x}_{i},\mathbf{x}_{m-i}^{\prime})=\mathsf{Shuffle}(\mathcal{A}(x_{1}),\ldots,\mathcal{A}(x_{i}),\mathcal{A}(x_{i+1}^{\prime}),\ldots,\mathcal{A}(x_{m}^{\prime})).$$

Let $m^{\prime}=\|v\|_{0}$, and WLOG suppose that $x_{i}=x_{i}^{\prime}$ for $i>m^{\prime}$. Our goal is to show that

$$D_{e^{\alpha}}(S(\mathbf{x}_{m^{\prime}},\mathbf{x}_{0}^{\prime})\|S(\mathbf{x}_{0},\mathbf{x}_{m^{\prime}}^{\prime}))\leq\delta.$$

By Lemma B.1, we have for each $1 \leq i \leq m'$ that

\[
D_{\exp(\alpha(i))}(S(\mathbf{x}_{i-1}, \mathbf{x}_{m'-i+1}') \,\|\, S(\mathbf{x}_i, \mathbf{x}_{m'-i}')) \leq \frac{\delta}{m'},
\]

where

\[
\alpha(i) = \ln\left(1 + \frac{e^{\alpha_0 d_{\mathcal{X}}(x_i, x_i')} - 1}{e^{\alpha_0 d_{\mathcal{X}}(x_i, x_i')} + 1}\left(\frac{8\sqrt{e^{\alpha_0}\ln(4m'/\delta)}}{\sqrt{m}} + \frac{8e^{\alpha_0}}{m}\right)\right).
\]

Applying Lemma A.2 $m'$ times, we see

\[
\begin{aligned}
&D_{\exp(\alpha(1)+\cdots+\alpha(m'))}(S(\mathbf{x}_{m'}, \mathbf{x}_0') \,\|\, S(\mathbf{x}_0, \mathbf{x}_{m'}')) \\
&\quad\leq D_{\exp(\alpha(m'))}(S(\mathbf{x}_{m'}, \mathbf{x}_0') \,\|\, S(\mathbf{x}_{m'-1}, \mathbf{x}_1')) \\
&\qquad + e^{\alpha(m')} D_{\exp(\alpha(m'-1))}(S(\mathbf{x}_{m'-1}, \mathbf{x}_1') \,\|\, S(\mathbf{x}_{m'-2}, \mathbf{x}_2')) \\
&\qquad + \cdots \\
&\qquad + e^{\alpha(2)+\cdots+\alpha(m')} D_{\exp(\alpha(1))}(S(\mathbf{x}_1, \mathbf{x}_{m'-1}') \,\|\, S(\mathbf{x}_0, \mathbf{x}_{m'}')) \\
&\quad\leq e^{\alpha(1)+\cdots+\alpha(m')} \sum_{i=1}^{m'} D_{\exp(\alpha(i))}(S(\mathbf{x}_{i-1}, \mathbf{x}_{m'-i+1}') \,\|\, S(\mathbf{x}_i, \mathbf{x}_{m'-i}')) \\
&\quad\leq e^{\alpha(1)+\cdots+\alpha(m')}\,\delta.
\end{aligned}
\]
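For intuition, the case $m' = 2$ can be written out in full. The sketch below assumes that Lemma A.2 (stated outside this excerpt) is the triangle-type inequality for the hockey-stick divergence that the chain above appears to use, and that the per-step bound $\delta/m'$ is available in the direction needed. The shorthand $P_2, P_1, P_0$ is introduced here only for this illustration.

```latex
% Shorthand (illustrative): P_2 = S(x_2, x'_0), P_1 = S(x_1, x'_1), P_0 = S(x_0, x'_2).
% For any event E and any a, b >= 0:
%   P(E) <= e^{a} Q(E) + D_{e^{a}}(P || Q),   Q(E) <= e^{b} R(E) + D_{e^{b}}(Q || R),
% and combining the two gives
\[
D_{e^{a+b}}(P \,\|\, R) \;\leq\; D_{e^{a}}(P \,\|\, Q) + e^{a}\, D_{e^{b}}(Q \,\|\, R).
\]
% Taking a = \alpha(2), b = \alpha(1) and each single-step divergence at most \delta/2:
\[
\begin{aligned}
D_{e^{\alpha(1)+\alpha(2)}}(P_2 \,\|\, P_0)
  &\leq D_{e^{\alpha(2)}}(P_2 \,\|\, P_1) + e^{\alpha(2)}\, D_{e^{\alpha(1)}}(P_1 \,\|\, P_0) \\
  &\leq \frac{\delta}{2} + e^{\alpha(2)}\,\frac{\delta}{2}
   \;\leq\; e^{\alpha(1)+\alpha(2)}\,\delta .
\end{aligned}
\]
```

The last step only uses $1 \leq e^{\alpha(2)} \leq e^{\alpha(1)+\alpha(2)}$, which is the same coarse bound applied to every prefactor in the general chain.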

We now show that $\alpha(i)$ is a concave function of $d_{\mathcal{X}}(x_i, x_i')$; to do this we write $\alpha(i) = f(d) = \ln(1 + g(d)K)$, where $g(d) = \frac{e^d - 1}{e^d + 1}$ and $K > 0$ is a suitable constant. We will show that $f''(d) \leq 0$. Taking derivatives, it is easy to show that $f''(d)$ has the same sign as $(1 + Kg(d))g''(d) - Kg'(d)^2$. Thus, we will show that $(1 + Kg(d))g''(d) \leq Kg'(d)^2$. We may write

\[
\begin{aligned}
g(d) &= 1 - \frac{2}{e^d + 1}, \\
g'(d) &= \frac{2e^d}{(e^d + 1)^2}, \\
g''(d) &= 2\,\frac{(e^d + 1)^2 e^d - 2e^d(e^d + 1)e^d}{(e^d + 1)^4} = 2\,\frac{e^d - e^{2d}}{(e^d + 1)^3}.
\end{aligned}
\]

Now, we have

\[
\begin{aligned}
&(1 + Kg(d))g''(d) \leq Kg'(d)^2 \\
&\Longleftrightarrow \left(1 + K - \frac{2K}{e^d + 1}\right) 2\,\frac{e^d - e^{2d}}{(e^d + 1)^3} \leq K\,\frac{4e^{2d}}{(e^d + 1)^4} \\
&\Longleftrightarrow \bigl((e^d + 1)(K + 1) - 2K\bigr)(1 - e^d) \leq 2Ke^d \\
&\Longleftrightarrow (Ke^d - K + e^d + 1)(1 - e^d) \leq 2Ke^d \\
&\Longleftrightarrow Ke^d - K + 1 - Ke^{2d} + Ke^d - e^{2d} \leq 2Ke^d \\
&\Longleftrightarrow -K + 1 - Ke^{2d} - e^{2d} \leq 0.
\end{aligned}
\]

We are done by observing that $1 - e^{2d} \leq 0$ (since $d \geq 0$), and $-K - Ke^{2d} \leq 0$. Having shown concavity, Jensen's inequality establishes that, for a fixed total distance $\|v\|_1$, the sum $\alpha(1) + \cdots + \alpha(m')$ is maximized when each $d_{\mathcal{X}}(x_i, x_i')$ is equal to $\frac{\|v\|_1}{\|v\|_0}$. This gives us a bound of

\[
\alpha(1) + \cdots + \alpha(m') \leq \|v\|_0 \ln\left(1 + \frac{e^{\alpha_0\|v\|_1/\|v\|_0} - 1}{e^{\alpha_0\|v\|_1/\|v\|_0} + 1}\left(\frac{8\sqrt{e^{\alpha_0}\ln(4\|v\|_0/\delta)}}{\sqrt{m}} + \frac{8e^{\alpha_0}}{m}\right)\right).
\]
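The concavity argument and the resulting Jensen step can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the helper names, the grid of $d$ values, the choices of $K$, and the sample distances are all hypothetical, and a finite-difference check is of course not a substitute for the algebraic proof.

```python
import math


def f(d: float, K: float) -> float:
    """f(d) = ln(1 + K * g(d)) with g(d) = (e^d - 1) / (e^d + 1) = tanh(d / 2)."""
    return math.log1p(K * math.tanh(d / 2.0))


def second_difference(d: float, K: float, h: float = 1e-3) -> float:
    """Central second difference of f; approximately h^2 * f''(d)."""
    return f(d + h, K) - 2.0 * f(d, K) + f(d - h, K)


def jensen_gap(distances, K: float) -> float:
    """k * f(mean distance) minus the sum of f(d_i); nonnegative if f is concave."""
    k = len(distances)
    mean = sum(distances) / k
    return k * f(mean, K) - sum(f(d, K) for d in distances)


if __name__ == "__main__":
    sample = [0.1, 0.5, 2.0, 0.4]  # hypothetical per-user distances
    for K in (0.1, 1.0, 10.0):
        worst = max(second_difference(0.01 * j, K) for j in range(1, 501))
        print(f"K={K}: largest second difference on (0, 5] = {worst:.3e}")
        print(f"K={K}: Jensen gap = {jensen_gap(sample, K):.6f}")
```

A negative second difference at every grid point and a nonnegative Jensen gap are consistent with the claim that spreading the total distance evenly maximizes the sum.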

B.4. Proof of Lemma B.1

This lemma can be viewed as a generalization of amplification by shuffling, which has the same setup but sets $d = 1$ and merely requires that $\mathcal{M}$ satisfy $\varepsilon$-local DP. We generalize the approach of Feldman et al. (2022), starting with the following preliminary claims.

B.4.1. Preliminary Lemmas

Lemma B.2.

(Generalization of Lemma 3.3 in Feldman et al. (2022)). Let $X = \{x_1^0, x_1^1, x_2, \ldots, x_m\}$ be a set of indices, and for $x \in X$, let $R(x), Q(x)$ be two families of distributions and $\alpha \in [0, 1]$, $\beta \in [0, \frac{1}{2}]$ be coefficients such that

\[
\begin{aligned}
R(x_1^0) &= (1 - \alpha)Q(x_1^0) + \alpha Q(x_1^1), \\
R(x_1^1) &= \alpha Q(x_1^0) + (1 - \alpha)Q(x_1^1), \\
R(x_j) &= \beta Q(x_1^0) + \beta Q(x_1^1) + (1 - 2\beta)Q(x_j) \quad \forall j \geq 2.
\end{aligned}
\]

Then, there exists a post-processing mechanism $\mathcal{S}$ such that

\[
\begin{aligned}
\mathsf{Shuffle}(R(x_1^0), R(x_2), \ldots, R(x_m)) &= \mathcal{S}(A + 1 - \Delta,\ C - A + \Delta) \quad \text{and} \\
\mathsf{Shuffle}(R(x_1^1), R(x_2), \ldots, R(x_m)) &= \mathcal{S}(A + \Delta,\ C - A + 1 - \Delta),
\end{aligned}
\]

where $C \sim \text{Bin}(m - 1, 2\beta)$, $A \sim \text{Bin}(C, \frac{1}{2})$, $\Delta \sim \text{Bernoulli}(\alpha)$, and $\mathsf{Shuffle}$ is a uniformly random shuffle.

Proof.

Let $Y_1^0, Y_1^1, Y_2, \ldots, Y_m$ be distributions where $Y_1^b$ is defined over $\{0, 1\}$ and satisfies $Y_1^0(0) = 1 - \alpha$ and $Y_1^0(1) = \alpha$ (with reversed probabilities if $b = 1$), and $Y_j$ for $j \geq 2$ is defined over $\{0, 1, 2\}$ and satisfies $Y_j(0) = Y_j(1) = \beta$ and $Y_j(2) = 1 - 2\beta$. Let $F$ be a function returning a distribution satisfying

\[
F_j(v) = \begin{cases} Q(x_1^0) & v = 0 \\ Q(x_1^1) & v = 1 \\ Q(x_j) & \text{otherwise.} \end{cases}
\]

Observe that by definition, the following probability distributions are equal for $b \in \{0, 1\}$:

\[
R(x_1^b), R(x_2), \ldots, R(x_m) = F_1(Y_1^b), F_2(Y_2), \ldots, F_m(Y_m).
\]

Let $\mathbf{0}(Y_1, \ldots, Y_m)$ denote the number of indices $j$ such that $Y_j = 0$, and define $\mathbf{1}(Y_1, \ldots, Y_m)$ similarly. We will show that there exists a post-processing function $\mathcal{S}$ such that, for both $b \in \{0, 1\}$, we have

\[
\mathsf{Shuffle}(F_1(Y_1^b), F_2(Y_2), \ldots, F_m(Y_m)) = \mathcal{S}\bigl(\mathbf{0}(Y_1^b, \ldots, Y_m),\ \mathbf{1}(Y_1^b, \ldots, Y_m)\bigr). \tag{9}
\]

We will do this by conditioning on the event $E_{u,v}$ that

\[
\bigl(\mathbf{0}(Y_1^b, Y_2, \ldots, Y_m),\ \mathbf{1}(Y_1^b, Y_2, \ldots, Y_m)\bigr) = (u, v),
\]

where $u, v \in \mathbb{N}$ satisfy $1 \leq u + v \leq m$. Now, define the vector $r = \mathsf{Shuffle}(F_1(Y_1^b), F_2(Y_2), \ldots, F_m(Y_m))$. Conditioned on $E_{u,v}$, $r$ is distributed according to the following process: First, select a random partition $U \sqcup V \sqcup W = [m]$ such that $|U| = u$ and $|V| = v$, corresponding to the indices (after shuffling) where $Y_1^b, Y_2, \ldots, Y_m$ are equal to $0$, $1$, or $2$, respectively. Next, let $\pi$ be a random injection from $W$ to $[m] \setminus \{1\}$. Then, $r$ is distributed according to:

\begin{align}
r(u) &= Q(x_1^0) \quad \forall u \in U \tag{10} \\
r(v) &= Q(x_1^1) \quad \forall v \in V \tag{11} \\
r(w) &= Q(x_{\pi(w)}) \quad \forall w \in W. \tag{12}
\end{align}

The above process is independent of $\alpha, \beta$ given $E_{u,v}$. In particular, it does not care whether we replace $\alpha$ with $1 - \alpha$, and thus it serves as our process $\mathcal{S}$ satisfying (9) for both values of $b$. Having established this, it is easy to show that $\mathbf{0}(Y_1^0, \ldots, Y_m) = A + 1 - \Delta$, $\mathbf{1}(Y_1^0, \ldots, Y_m) = C - A + \Delta$ for $b = 0$, and $\mathbf{0}(Y_1^1, \ldots, Y_m) = A + \Delta$, $\mathbf{1}(Y_1^1, \ldots, Y_m) = C - A + 1 - \Delta$ for $b = 1$. ∎
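The final counting claim of the proof can be checked exactly for small $m$. The following sketch, with hypothetical function names and parameter values, enumerates the joint law of the number of zeros and ones among $Y_1^b, Y_2, \ldots, Y_m$ and compares it with the law of the pair built from $A$, $C$, and $\Delta$, assuming $C \sim \text{Bin}(m - 1, 2\beta)$.

```python
from collections import defaultdict
from math import comb


def direct_pmf(m, alpha, beta, b):
    """Exact law of (#zeros, #ones) among Y_1^b, Y_2, ..., Y_m."""
    p_zero = 1.0 - alpha if b == 0 else alpha  # P[Y_1^b = 0]
    pmf = {(1, 0): p_zero, (0, 1): 1.0 - p_zero}
    # Each Y_j (j >= 2) adds a zero w.p. beta, a one w.p. beta, neither otherwise.
    step = {(1, 0): beta, (0, 1): beta, (0, 0): 1.0 - 2.0 * beta}
    for _ in range(m - 1):
        nxt = defaultdict(float)
        for (z, o), p in pmf.items():
            for (dz, do), q in step.items():
                nxt[(z + dz, o + do)] += p * q
        pmf = dict(nxt)
    return pmf


def claimed_pmf(m, alpha, beta, b):
    """Law of (A + 1 - D, C - A + D) for b = 0, or (A + D, C - A + 1 - D) for b = 1."""
    pmf = defaultdict(float)
    for c in range(m):  # C ~ Bin(m - 1, 2 * beta)
        p_c = comb(m - 1, c) * (2.0 * beta) ** c * (1.0 - 2.0 * beta) ** (m - 1 - c)
        for a in range(c + 1):  # A | C ~ Bin(C, 1/2)
            p_a = comb(c, a) / 2.0 ** c
            for delta, p_d in ((0, 1.0 - alpha), (1, alpha)):  # D ~ Bernoulli(alpha)
                if b == 0:
                    key = (a + 1 - delta, c - a + delta)
                else:
                    key = (a + delta, c - a + 1 - delta)
                pmf[key] += p_c * p_a * p_d
    return dict(pmf)


def max_gap(p, q):
    """Largest pointwise difference between two pmfs given as dicts."""
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


if __name__ == "__main__":
    for b in (0, 1):
        gap = max_gap(direct_pmf(6, 0.3, 0.2, b), claimed_pmf(6, 0.3, 0.2, b))
        print(f"b={b}: max pmf gap = {gap:.2e}")
```

Exact enumeration is used instead of sampling so that the comparison is deterministic; it is only practical for small $m$.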

Having reduced the shuffling problem to a divergence between two fixed probability distributions, we follow the method of (Feldman et al., 2022) to compute this divergence. We use the following two results:

Lemma B.3.

(Restatement of Lemma A.1 from (Feldman et al., 2022)): Suppose $p \geq \frac{16\ln(2/\delta)}{m}$, $C \sim \mathrm{Bin}(m-1, p)$, and $A \sim \mathrm{Bin}(C, \frac{1}{2})$. Define $P = (A+1, C-A)$ and $Q = (A, C-A+1)$. Then, $D_{e^{\varepsilon}}(P \| Q) \leq \delta$, where

\[ \varepsilon = \ln\left(1 + \frac{8\sqrt{\ln(4/\delta)}}{\sqrt{pm}} + \frac{8}{pm}\right). \]
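As a quick numerical illustration of Lemma B.3, the following minimal sketch evaluates the stated $\varepsilon$ after checking the precondition on $p$. The function name `shuffle_epsilon` and its argument names are illustrative choices, not taken from the paper or from (Feldman et al., 2022).

```python
import math


def shuffle_epsilon(m: int, p: float, delta: float) -> float:
    """Evaluate epsilon = ln(1 + 8*sqrt(ln(4/delta))/sqrt(p*m) + 8/(p*m)).

    Raises ValueError when the precondition p >= 16*ln(2/delta)/m of the
    lemma does not hold, since the bound is not claimed in that regime.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p < 16.0 * math.log(2.0 / delta) / m:
        raise ValueError("precondition p >= 16 ln(2/delta) / m fails")
    pm = p * m
    return math.log(1.0 + 8.0 * math.sqrt(math.log(4.0 / delta)) / math.sqrt(pm) + 8.0 / pm)


if __name__ == "__main__":
    # Example values chosen only for illustration.
    print(shuffle_epsilon(m=10**6, p=0.5, delta=1e-6))
```

The value decreases as $pm$ grows, which reflects the usual intuition that more "blanket" reports give stronger amplification.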

The next result, advanced joint convexity, originally appeared in the privacy amplification by sampling literature and can be used to improve the parameter $\varepsilon$ when computing $D_{\alpha}(P \| Q)$ between two distributions which are nearly the same.

Lemma B.4.

(Restatement of Theorem 2 from (Balle et al., 2018)) Let $P, Q$ be probability distributions satisfying $P = \nu M + (1-\nu) N$ and $Q = \nu M' + (1-\nu) N$ for distributions $M, M', N$ and $\nu \in [0,1]$. Given $\alpha \geq 1$, define $\alpha' = 1 + \nu(\alpha - 1)$ and $\beta = \frac{\alpha'}{\alpha}$. Then,

\[ D_{\alpha'}(P \| Q) \leq \nu D_{\alpha}(M \| (1-\beta) N + \beta M'). \]

Finally, we require a result from local DP:

Lemma B.5.

(Restatement of Theorem 2.5 from (Kairouz et al., 2015)) Let $P, Q$ be two distributions and $\alpha \geq 1$ be a parameter such that $D_{\alpha}(P \| Q) = 0$. Then, there exist distributions $M, N$ such that

\[ P = \frac{\alpha}{\alpha+1} M + \frac{1}{\alpha+1} N \]
\[ Q = \frac{1}{\alpha+1} M + \frac{\alpha}{\alpha+1} N. \]
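To make the decomposition in Lemma B.5 concrete, here is a minimal sketch for distributions on a finite support. It assumes $\alpha > 1$ and that the probability ratios are bounded by $\alpha$ in both directions (as is the case for a symmetric guarantee such as $d_{\mathcal{X}}$-DP), and it uses the weights $\frac{\alpha}{\alpha+1}, \frac{1}{\alpha+1}$ for $P$ and the swapped weights for $Q$. The explicit choice $M = \frac{\alpha P - Q}{\alpha - 1}$, $N = \frac{\alpha Q - P}{\alpha - 1}$ is one valid construction, not necessarily the one used in (Kairouz et al., 2015); the helper names are hypothetical.

```python
def decompose(P, Q, alpha):
    """Return (M, N) with P = a/(a+1)*M + 1/(a+1)*N and Q = 1/(a+1)*M + a/(a+1)*N.

    P and Q are probability vectors on the same finite support; the
    construction requires alpha > 1, P <= alpha*Q and Q <= alpha*P pointwise.
    """
    if alpha <= 1.0:
        raise ValueError("this sketch requires alpha > 1")
    M = [(alpha * p - q) / (alpha - 1.0) for p, q in zip(P, Q)]
    N = [(alpha * q - p) / (alpha - 1.0) for p, q in zip(P, Q)]
    if min(M) < -1e-12 or min(N) < -1e-12:
        raise ValueError("probability ratios are not bounded by alpha")
    return M, N


def recombine(M, N, alpha):
    """Rebuild (P, Q) from (M, N) using the mixture weights of the lemma."""
    a, b = alpha / (alpha + 1.0), 1.0 / (alpha + 1.0)
    P = [a * m + b * n for m, n in zip(M, N)]
    Q = [b * m + a * n for m, n in zip(M, N)]
    return P, Q


if __name__ == "__main__":
    M, N = decompose([0.5, 0.3, 0.2], [0.4, 0.35, 0.25], alpha=2.0)
    print(M, N, recombine(M, N, alpha=2.0))
```

Both $M$ and $N$ sum to one because $\sum (\alpha P - Q) = \alpha - 1$, so they are genuine distributions whenever the ratio condition holds.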

With these results in order, we are ready to complete the proof.

B.4.2. Completing the proof of Lemma B.1

Using the definition of $d_{\mathcal{X}}$-DP and the fact that $d_{\mathcal{X}} \leq 1$, we have

\[ D_{\exp(\varepsilon_0 d)}(\mathcal{A}(x_1^0) \| \mathcal{A}(x_1^1)) = 0 \]
\[ D_{\exp(\varepsilon_0)}(\mathcal{A}(x_1^0) \| \mathcal{A}(x_j)) = 0 \quad \forall j \geq 2 \]
\[ D_{\exp(\varepsilon_0)}(\mathcal{A}(x_1^1) \| \mathcal{A}(x_j)) = 0 \quad \forall j \geq 2. \]

Applying Lemma B.5 to the first equation, we obtain

\[ \mathcal{A}(x_1^0) = (1-\beta) Q(x_1^0) + \beta Q(x_1^1) \tag{13} \]
\[ \mathcal{A}(x_1^1) = \beta Q(x_1^0) + (1-\beta) Q(x_1^1) \tag{14} \]

where $\beta = \frac{1}{1+\exp(\varepsilon_0 d)}$. Applying the lemma to the second and third sets of equations, we obtain

\[ \mathcal{A}(x_1^0) = (1-\gamma) R(x_1^0, x_j) + \gamma R'(x_1^0, x_j) \quad \forall j \geq 2 \tag{15} \]
\[ \mathcal{A}(x_j) = \gamma R(x_1^0, x_j) + (1-\gamma) R'(x_1^0, x_j) \quad \forall j \geq 2 \tag{16} \]
\[ \mathcal{A}(x_1^1) = (1-\gamma) R(x_1^1, x_j) + \gamma R'(x_1^1, x_j) \quad \forall j \geq 2 \tag{17} \]
\[ \mathcal{A}(x_j) = \gamma R(x_1^1, x_j) + (1-\gamma) R'(x_1^1, x_j) \quad \forall j \geq 2. \tag{18} \]

where $\gamma = \frac{1}{1+\exp(\varepsilon_0)}$. Eliminating $R(x_1^0, x_j)$ from (15) and (16), that is, subtracting $\frac{\gamma}{1-\gamma}$ times (15) from (16), we obtain that

\[ \mathcal{A}(x_j) = \frac{\gamma}{1-\gamma} \mathcal{A}(x_1^0) + \frac{1-2\gamma}{1-\gamma} R'(x_1^0, x_j) \quad \forall j \geq 2, \tag{19} \]

and likewise (17) and (18) imply

\[ \mathcal{A}(x_j) = \frac{\gamma}{1-\gamma} \mathcal{A}(x_1^1) + \frac{1-2\gamma}{1-\gamma} R'(x_1^1, x_j) \quad \forall j \geq 2. \tag{20} \]
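The elimination step behind (19) and (20) can be checked numerically. The sketch below builds $\mathcal{A}(x_1^0)$ and $\mathcal{A}(x_j)$ from arbitrary finite-support distributions $R, R'$ via (15) and (16) and confirms that (19) holds; the helper names `mix` and `check_eq19` and the example vectors are assumptions made for illustration only.

```python
import math


def mix(a, P, b, Q):
    """Pointwise combination a*P + b*Q of two vectors on the same support."""
    return [a * p + b * q for p, q in zip(P, Q)]


def check_eq19(R, R_prime, eps0, tol=1e-12):
    """Verify (19) for A(x_1^0), A(x_j) defined through (15) and (16)."""
    gamma = 1.0 / (1.0 + math.exp(eps0))
    A_x1 = mix(1.0 - gamma, R, gamma, R_prime)  # equation (15)
    A_xj = mix(gamma, R, 1.0 - gamma, R_prime)  # equation (16)
    # Right-hand side of (19): gamma/(1-gamma)*A(x_1^0) + (1-2gamma)/(1-gamma)*R'.
    rhs = mix(gamma / (1.0 - gamma), A_x1, (1.0 - 2.0 * gamma) / (1.0 - gamma), R_prime)
    return all(abs(l - r) <= tol for l, r in zip(A_xj, rhs))


if __name__ == "__main__":
    print(check_eq19([0.7, 0.2, 0.1], [0.1, 0.3, 0.6], eps0=1.0))
```

Note that the two coefficients in (19) sum to one and that $\frac{\gamma}{1-\gamma} = e^{-\varepsilon_0}$, so (19) expresses $\mathcal{A}(x_j)$ as a mixture in which $\mathcal{A}(x_1^0)$ appears with weight $e^{-\varepsilon_0}$.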

Taking the average of (19) and (20), we obtain

\[ \mathcal{A}(x_j) = \frac{\gamma}{2(1-\gamma)} \mathcal{A}(x_1^0) + \frac{\gamma}{2(1-\gamma)} \mathcal{A}(x_1^1) + \frac{1-2\gamma}{1-\gamma} Q(x_j) \quad \forall j \geq 2, \tag{21} \]

where $Q(x_j) = \frac{1}{2} R'(x_1^0, x_j) + \frac{1}{2} R'(x_1^1, x_j)$. Now, equations (13) and (14) imply that

\[ \mathcal{A}(x_1^0) + \mathcal{A}(x_1^1) = Q(x_1^0) + Q(x_1^1). \]

This implies

\[ \mathcal{A}(x_j) = \frac{\gamma}{2(1-\gamma)} Q(x_1^0) + \frac{\gamma}{2(1-\gamma)} Q(x_1^1) + \frac{1-2\gamma}{1-\gamma} Q(x_j) \quad \forall j \geq 2. \tag{22} \]

Applying Lemma B.2, there exists a function $S$ such that

\[ \mathsf{Shuffle}(\mathcal{A}(x_1^0), \mathcal{A}(x_2), \ldots, \mathcal{A}(x_m)) = S(A + 1 - \Delta, C - A + \Delta) \]
\[ \mathsf{Shuffle}(\mathcal{A}(x_1^1), \mathcal{A}(x_2), \ldots, \mathcal{A}(x_m)) = S(A + \Delta, C - A + 1 - \Delta), \]

where $C \sim \mathrm{Bin}(m-1, \frac{\gamma}{1-\gamma}) = \mathrm{Bin}(m-1, e^{-\varepsilon_0})$, $A \sim \mathrm{Bin}(C, \frac{1}{2})$, and $\Delta \sim \mathrm{Bernoulli}(\beta)$. By the post-processing inequality, we have for any $\alpha \geq 1$ that

\[ D_{\alpha}(\mathsf{Shuffle}(\mathcal{A}(x_1^0), \mathcal{A}(x_2), \ldots, \mathcal{A}(x_m)) \| \mathsf{Shuffle}(\mathcal{A}(x_1^1), \mathcal{A}(x_2), \ldots, \mathcal{A}(x_m))) \leq D_{\alpha}((A + 1 - \Delta, C - A + \Delta) \| (A + \Delta, C - A + 1 - \Delta)). \]
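The reduced pair of count distributions appearing on the right-hand side above can be simulated directly from the definitions of $C$, $A$, and $\Delta$. The following is a minimal sketch using only the Python standard library; the function name `sample_reduced_counts`, its parameters, and the example values are illustrative assumptions rather than part of the paper.

```python
import math
import random


def sample_reduced_counts(m, eps0, d, b, rng):
    """Draw one sample of the count pair for input bit b in {0, 1}.

    C ~ Bin(m-1, exp(-eps0)), A ~ Bin(C, 1/2), Delta ~ Bernoulli(beta),
    with beta = 1/(1+exp(eps0*d)).
    """
    gamma = 1.0 / (1.0 + math.exp(eps0))
    beta = 1.0 / (1.0 + math.exp(eps0 * d))
    p = gamma / (1.0 - gamma)  # equals exp(-eps0)
    C = sum(rng.random() < p for _ in range(m - 1))
    A = sum(rng.random() < 0.5 for _ in range(C))
    Delta = 1 if rng.random() < beta else 0
    if b == 0:
        return (A + 1 - Delta, C - A + Delta)
    return (A + Delta, C - A + 1 - Delta)


if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_reduced_counts(m=50, eps0=1.0, d=0.5, b=0, rng=rng) for _ in range(5)])
```

In both cases the two counts sum to $C + 1$, so the only difference between $b = 0$ and $b = 1$ is which coordinate is more likely to receive the extra unit.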

Observe we can write

\[ (A + 1 - \Delta, C - A + \Delta) = (1-\beta)(A+1, C-A) + \beta(A, C-A+1) \]
\[ (A + \Delta, C - A + 1 - \Delta) = \beta(A+1, C-A) + (1-\beta)(A, C-A+1). \]

Define $X = (A+1, C-A)$ and $Y = (A, C-A+1)$. We can rewrite the above as

\[ (A + 1 - \Delta, C - A + \Delta) = 2\beta \frac{X+Y}{2} + (1-2\beta) X \]
\[ (A + \Delta, C - A + 1 - \Delta) = 2\beta \frac{X+Y}{2} + (1-2\beta) Y. \]

Applying Lemma B.4, we have

\[ D_{\alpha'}((A + 1 - \Delta, C - A + \Delta) \| (A + \Delta, C - A + 1 - \Delta)) \leq (1-2\beta) D_{\alpha}(X \| (1-\eta)(\tfrac{X+Y}{2}) + \eta Y), \]

where $\alpha' = 1 + (1-2\beta)(\alpha - 1)$ and $\eta = \frac{\alpha'}{\alpha}$. Since $D_{\alpha}(X \| \cdot)$ is convex in its second argument and $D_{\alpha}(X \| X) = 0$ for $\alpha \geq 1$, the right-hand side above is at most $(1-2\beta) D_{\alpha}(X \| Y)$, so

\[ D_{\alpha'}((A + 1 - \Delta, C - A + \Delta) \| (A + \Delta, C - A + 1 - \Delta)) \leq (1-2\beta) D_{\alpha}(X \| Y). \]

Now, we finally set $\alpha = 1 + \frac{8\sqrt{\exp(\varepsilon_0)\ln(4/\delta)}}{\sqrt{m}} + \frac{8\exp(\varepsilon_0)}{m}$. Lemma B.3, applied with $p = e^{-\varepsilon_0}$ (using the assumption that $\varepsilon_0 \leq \ln(\frac{m}{16\ln(2/\delta)})$), implies $D_{\alpha}(X \| Y) \leq \delta$. From this, we obtain our desired result that

\[ D_{\alpha'}(\mathsf{Shuffle}(\mathcal{A}(x_1^0), \mathcal{A}(x_2), \ldots, \mathcal{A}(x_m)) \| \mathsf{Shuffle}(\mathcal{A}(x_1^1), \mathcal{A}(x_2), \ldots, \mathcal{A}(x_m))) \leq (1-2\beta) D_{\alpha}(X \| Y) \leq \delta, \]

where

\[ \alpha' = 1 + \frac{e^{\varepsilon_0 d} - 1}{e^{\varepsilon_0 d} + 1} \left( \frac{8\sqrt{e^{\varepsilon_0}\ln(4/\delta)}}{\sqrt{m}} + \frac{8 e^{\varepsilon_0}}{m} \right). \]
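The closed form for $\alpha'$ is easy to evaluate numerically. The sketch below computes it from $\varepsilon_0$, $d$, $m$, and $\delta$, using $1 - 2\beta = \frac{e^{\varepsilon_0 d} - 1}{e^{\varepsilon_0 d} + 1}$; the function name `amplified_alpha` and the example values are hypothetical and serve only to illustrate how the bound scales with the distance $d$.

```python
import math


def amplified_alpha(eps0, d, m, delta):
    """Evaluate alpha' = 1 + (1 - 2*beta) * (alpha - 1) from the proof.

    alpha = 1 + 8*sqrt(exp(eps0)*ln(4/delta))/sqrt(m) + 8*exp(eps0)/m and
    beta = 1/(1+exp(eps0*d)). Requires eps0 <= ln(m / (16 ln(2/delta))).
    """
    if eps0 > math.log(m / (16.0 * math.log(2.0 / delta))):
        raise ValueError("assumption eps0 <= ln(m / (16 ln(2/delta))) fails")
    alpha = (1.0
             + 8.0 * math.sqrt(math.exp(eps0) * math.log(4.0 / delta)) / math.sqrt(m)
             + 8.0 * math.exp(eps0) / m)
    beta = 1.0 / (1.0 + math.exp(eps0 * d))
    return 1.0 + (1.0 - 2.0 * beta) * (alpha - 1.0)


if __name__ == "__main__":
    # Illustrative values: the bound tightens as the distance d shrinks.
    for dist in (0.0, 0.25, 0.5, 1.0):
        print(dist, amplified_alpha(eps0=1.0, d=dist, m=10**6, delta=1e-6))
```

At $d = 0$ the factor $1 - 2\beta$ vanishes and $\alpha' = 1$, matching the intuition that identical inputs are perfectly indistinguishable.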

B.5. Proof of Theorem 4.4

See 4.4

First, consider the local model. Fix any two itemsets $K = \{x_1, \ldots, x_m\}$ and $K' = \{x_1', \ldots, x_m'\}$ such that $d_{\textsf{EM}}(\tilde{K}, \tilde{K}') \leq w$. By Lemma 2.1, there exists a permutation $\pi : [m] \rightarrow [m]$ such that

\[ \sum_{i=1}^{m} d_{\mathcal{X}}(x_i, x_{\pi(i)}') = mw. \]

Let

\[ \tilde{L} = \mathrm{Shuffle}(\mathcal{A}(x_1), \ldots, \mathcal{A}(x_m)) \tag{23} \]
\[ \tilde{L}' = \mathrm{Shuffle}(\mathcal{A}(x_{\pi(1)}'), \ldots, \mathcal{A}(x_{\pi(m)}')). \tag{24} \]

By Theorem 4.3, we know that $D_{\exp(\alpha(w))}(\tilde{L} \| \tilde{L}') \leq \delta e^{\alpha(w)}$, where $\alpha(w) = h(m; m, mw)$. The final privacy parameters for a fixed $w$ will be $\frac{\alpha(w)}{w}$ and $\delta e^{\alpha(w)}$; the worst-case privacy parameters are thus $\sup_{w \in [0,1]} \frac{\alpha(w)}{w}$ and $\sup_{w \in [0,1]} \delta e^{\alpha(w)}$. Since $\alpha(w)$ is an increasing function, the latter term reduces to $\delta e^{\alpha(1)}$.

In the bounded central model, the same logic applies, except that $\tilde{L}, \tilde{L}'$ have size $mn$, differ in only $m$ coordinates, and

\[ \sum_{i=1}^{mn} d_{\mathcal{X}}(x_i, x_{\pi(i)}') = mw. \]

We apply Theorem 4.3 to obtain $D_{\exp(\alpha(w))}(\tilde{L} \| \tilde{L}') \leq \delta e^{\alpha(w)}$, where $\alpha(w) = h(mn; m, mw)$, and we complete the proof similarly.

Appendix C Omitted Proofs from Section 5

C.1. Proof of Lemma 5.2

See 5.2 For i=1,,s𝑖1𝑠i=1,\ldots,sitalic_i = 1 , … , italic_s, define Xi=d𝒳(xi,yi)subscript𝑋𝑖subscript𝑑𝒳subscript𝑥𝑖subscript𝑦𝑖X_{i}=d_{\mathcal{X}}(x_{i},y_{i})italic_X start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_d start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ), and observe that dEM(L~,L~)1s(X1++Xs)subscript𝑑EM~𝐿superscript~𝐿1𝑠subscript𝑋1subscript𝑋𝑠d_{\textsf{EM}}(\tilde{L},\tilde{L}^{\prime})\leq\frac{1}{s}(X_{1}+\cdots+X_{s})italic_d start_POSTSUBSCRIPT EM end_POSTSUBSCRIPT ( over~ start_ARG italic_L end_ARG , over~ start_ARG italic_L end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ≤ divide start_ARG 1 end_ARG start_ARG italic_s end_ARG ( italic_X start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + ⋯ + italic_X start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ). Now, let μ𝜇\muitalic_μ denote dEM(K~,K~)subscript𝑑EM~𝐾superscript~𝐾d_{\textsf{EM}}(\tilde{K},\tilde{K}^{\prime})italic_d start_POSTSUBSCRIPT EM end_POSTSUBSCRIPT ( over~ start_ARG italic_K end_ARG , over~ start_ARG italic_K end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ). Observe each Xisubscript𝑋𝑖X_{i}italic_X start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is i.i.d. and satisfies 𝔼[Xi]=μ𝔼delimited-[]subscript𝑋𝑖𝜇\mathbb{E}[X_{i}]=\mublackboard_E [ italic_X start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ] = italic_μ and 0Xi10subscript𝑋𝑖10\leq X_{i}\leq 10 ≤ italic_X start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ≤ 1. Due to the last two facts, we have 𝔼[Xi2]μ𝔼delimited-[]superscriptsubscript𝑋𝑖2𝜇\mathbb{E}[X_{i}^{2}]\leq\mublackboard_E [ italic_X start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] ≤ italic_μ. By Bernstein’s inequality, we have, for all t0𝑡0t\geq 0italic_t ≥ 0,

\[
\Pr\left[X_1+\cdots+X_s-s\mu\geq t\right]\leq e^{-t^2/(2(v+bt/3))},
\]

where $v=\sum_{i=1}^{s}\mathbb{E}[X_i^2]\leq s\mu$ and $b=1$. By setting

\[
t=\max\left\{\sqrt{4s\mu\ln(1/\delta)},\ \tfrac{4}{3}\ln(1/\delta)\right\},
\]

we ensure that the probability is at most $\delta$. We have

\[
s\mu+t\leq s\mu+2\sqrt{s\mu\ln(1/\delta)}+\tfrac{4}{3}\ln(1/\delta)\leq(1+\sqrt{2})s\mu+\left(\tfrac{4}{3}+\sqrt{2}\right)\ln\tfrac{1}{\delta}.
\]

Finally,

\[
\Pr\left[d_{\textsf{EM}}(\tilde{L},\tilde{L}')\geq(1+\sqrt{2})\mu+\tfrac{3}{s}\ln\tfrac{1}{\delta}\right]\leq\Pr\left[X_1+\cdots+X_s\geq(1+\sqrt{2})s\mu+3\ln\tfrac{1}{\delta}\right]\leq\delta.
\]
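The choice of $t$ and the final simplification can be sanity-checked numerically. The Python sketch below is illustrative only: the helper names `bernstein_t`, `bernstein_tail`, and `check` are hypothetical, and $v$ is replaced by its upper bound $s\mu$, which can only weaken the tail bound. On a small grid of values for $s\mu$ and $\delta$, it checks that the Bernstein tail is at most $\delta$ and that $s\mu+t\leq(1+\sqrt{2})s\mu+3\ln\frac{1}{\delta}$.

```python
import math

def bernstein_t(s_mu, delta):
    """Deviation t from the proof: max{sqrt(4 s mu ln(1/delta)), (4/3) ln(1/delta)}."""
    log_term = math.log(1.0 / delta)
    return max(math.sqrt(4.0 * s_mu * log_term), 4.0 / 3.0 * log_term)

def bernstein_tail(s_mu, t, b=1.0):
    """Bernstein bound exp(-t^2 / (2 (v + b t / 3))) with v replaced by s * mu."""
    return math.exp(-t * t / (2.0 * (s_mu + b * t / 3.0)))

def check(s_mu, delta):
    """Return (tail bound <= delta, s*mu + t <= (1 + sqrt 2) s*mu + 3 ln(1/delta))."""
    t = bernstein_t(s_mu, delta)
    tail_ok = bernstein_tail(s_mu, t) <= delta * (1.0 + 1e-9)
    final_ok = s_mu + t <= (1.0 + math.sqrt(2.0)) * s_mu + 3.0 * math.log(1.0 / delta) + 1e-12
    return tail_ok, final_ok

# Grid of hypothetical values; delta must be strictly below 1 so that t > 0.
results = [check(s_mu, delta)
           for s_mu in (0.0, 0.01, 0.5, 3.0, 50.0, 1e4)
           for delta in (0.5, 0.1, 1e-3, 1e-9)]
all_ok = all(tail_ok and final_ok for tail_ok, final_ok in results)
```

The constant $3$ in the last display is a rounding up of $\tfrac{4}{3}+\sqrt{2}\approx 2.75$, which is why the second check has slack.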

C.2. Proof of Theorem 5.3

See Theorem 5.3. First, we will consider the local model. Let $K,K'$ denote two datasets such that $d_{\textsf{EM}}(\tilde{K},\tilde{K}')\leq r$. Let $L$, $L'$ denote the set of $s$ samples when $K$ (resp. $K'$) is used. Our goal is to show that $D_{\exp(\varepsilon)}(\mathcal{M}(L)\|\mathcal{M}(L'))\leq\delta$. Observe we may define the objects $\mathbf{L},\mathbf{L}'\in\Delta^{\mathcal{X}^s}$ to be the probability distributions of $L,L'$ (which lie in $\mathcal{X}^s$). By Lemma A.1, for any coupling $C\in\mathcal{C}(\mathbf{L},\mathbf{L}')$, we have

\[
D_{\exp(\varepsilon)}(\mathcal{M}(L)\|\mathcal{M}(L'))\leq\mathbb{E}_{(L,L')\sim C}\left[D_{\exp(\varepsilon)}(\mathcal{M}(L)\|\mathcal{M}(L'))\right].
\]

Let $A$ denote the event that $d_{\textsf{EM}}(\tilde{L},\tilde{L}')\leq(1+\sqrt{2})r+\frac{3}{s}\ln\tfrac{1}{\delta}$. When $A$ holds, $D_{\exp(\varepsilon)}(\mathcal{M}(L)\|\mathcal{M}(L'))\leq\delta$ by assumption. When $A$ does not hold, trivially $D_{\exp(\varepsilon)}(\mathcal{M}(L)\|\mathcal{M}(L'))\leq 1$. Splitting the above expectation according to whether $A$ holds, we have

\begin{align*}
\mathbb{E}_{(L,L')\sim C}\left[D_{\exp(\varepsilon)}(\mathcal{M}(L)\|\mathcal{M}(L'))\right] &\leq\delta\Pr[A]+\Pr[\overline{A}]\\
&\leq\delta+\Pr[\overline{A}].
\end{align*}

Now, let $C^*\in\Delta^{\mathcal{X}\times\mathcal{X}}$ denote the optimal coupling between $\tilde{K},\tilde{K}'$. We will take $C=(C^*)^s\in\Delta^{\mathcal{X}^s\times\mathcal{X}^s}$, the $s$-fold Kronecker product of $C^*$. Observe this is indeed a coupling between $\mathbf{L},\mathbf{L}'$, and each coordinate of $(L,L')\sim C$ is simply a sample from $C^*$. Thus, for the event $A$ above we have

\[
\Pr[A]=\Pr_{(L,L')\sim(C^*)^s}\left[d_{\textsf{EM}}(\tilde{L},\tilde{L}')\leq(1+\sqrt{2})r+\frac{3}{s}\ln\frac{1}{\delta}\right],
\]

where the notation $(L,L')\sim(C^*)^s$ indicates that $L=\{x_1,\ldots,x_s\}$ and $L'=\{y_1,\ldots,y_s\}$, and each $(x_i,y_i)\sim C^*$. By Lemma 5.2, we know that $\Pr[A]\geq 1-\delta$, and thus the above expectation is at most $2\delta$. This proof may be generalized easily to the central model.
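The product coupling used in this proof can be illustrated on a toy example. The Python sketch below is only an illustration under stated assumptions: the points lie on the line $[0,1]$ with metric $|x-y|$, the coupling `pairs`/`weights` is a hypothetical choice (not necessarily optimal), and the helpers `emd_1d` and `sample_product_coupling` are names introduced here. It draws $s$ i.i.d. pairs and compares the earth-mover's distance between the two empirical samples with the average pair cost $\frac{1}{s}(X_1+\cdots+X_s)$, which bounds it from above.

```python
import random

def emd_1d(xs, ys):
    """EMD between two equal-size empirical distributions on the line.

    In one dimension with metric |x - y|, matching the sorted samples is optimal.
    """
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def sample_product_coupling(pairs, weights, s, rng):
    """Draw s i.i.d. pairs (x_i, y_i) from a coupling given as weighted pairs."""
    draws = rng.choices(pairs, weights=weights, k=s)
    return [x for x, _ in draws], [y for _, y in draws]

# Hypothetical coupling on points of [0, 1], chosen only for illustration.
pairs = [(0.0, 0.0), (0.2, 0.3), (0.9, 0.5), (1.0, 1.0)]
weights = [0.4, 0.3, 0.1, 0.2]

rng = random.Random(0)
xs, ys = sample_product_coupling(pairs, weights, s=200, rng=rng)

avg_cost = sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)  # (1/s) * sum of X_i
emd = emd_1d(xs, ys)                                          # EMD of the two samples
```

The index-wise matching $x_i\mapsto y_i$ is one feasible transport plan between the two samples, so `emd` can never exceed `avg_cost`; this is the inequality that Lemma 5.2 starts from.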

Appendix D Omitted Proofs from Section 6

D.1. Proof of Lemma 6.2

See Lemma 6.2. As the sensitivity of the query is bounded by $\|F\|_2$, it is easy to show (e.g., Dwork et al., 2014) that adding $d$-dimensional Gaussian noise with width $\|F\|_2\frac{r\sqrt{1.25\ln\frac{1}{\delta}}}{\alpha}$ in each coordinate will satisfy $(\frac{\alpha}{r},\delta)$ local $d_{\textsf{EM}}$-DP. The standard deviation in each coordinate of $\hat{q}$ is thus $\|F\|_2\frac{r\sqrt{1.25\ln\frac{1}{\delta}}}{\alpha\sqrt{n}}$, and this gives the desired expected error.
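The $\sqrt{n}$ improvement comes from averaging independent noise. The Python sketch below is a hedged illustration: it assumes that $\hat{q}$ is the average of $n$ independent noisy reports, copies the noise width exactly as written in this proof, and uses hypothetical function names and parameter values.

```python
import math

def per_user_noise_std(f_norm, r, alpha, delta):
    """Noise width quoted in the proof: ||F||_2 * r * sqrt(1.25 ln(1/delta)) / alpha."""
    return f_norm * r * math.sqrt(1.25 * math.log(1.0 / delta)) / alpha

def averaged_noise_std(f_norm, r, alpha, delta, n):
    """Std of the mean of n independent reports that each carry the per-user width."""
    return per_user_noise_std(f_norm, r, alpha, delta) / math.sqrt(n)

# Hypothetical parameter values, for illustration only.
sigma_one = per_user_noise_std(f_norm=1.0, r=0.1, alpha=1.0, delta=1e-5)
sigma_avg = averaged_noise_std(f_norm=1.0, r=0.1, alpha=1.0, delta=1e-5, n=10_000)
```

With $n=10{,}000$ reports, the per-coordinate standard deviation of the average is $100$ times smaller than that of a single report.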

D.2. Proof of Theorem 6.5

See Theorem 6.5. First, we will introduce notation. For a cluster label $b\in\mathcal{B}$, let $\mathcal{X}[b]\subseteq\mathcal{X}$ denote the elements of $\mathcal{X}$ in cluster $b$. Define $\tilde{F}[b]\in\mathbb{R}^{\mathcal{B}\times\mathcal{C}}$ to be the restriction of $\tilde{F}$ to the indices in $\mathcal{X}[b]$ (so that indices outside $\mathcal{X}[b]$ are zeroed out). Define $\tilde{K}[b]$ similarly, and observe that $\tilde{F}[b],\tilde{K}[b]$ are not normalized.

For any estimate $\tilde{F}$, consider the following transportation plan from $\tilde{F}$ to $\tilde{K}$: For each $b\in\mathcal{B}$, transfer $\tilde{F}[b]$ to $\tilde{K}[b]$ arbitrarily, and put any excess weight in the bin $(b,c')$ for an arbitrary $c'\in\mathcal{C}$. The cost incurred by this is at most $r\|\tilde{F}[b]-\tilde{K}[b]\|_1+r|\mu(\tilde{F}[b])-\mu(\tilde{K}[b])|$, where $\mu(\cdot)$ denotes total mass of its argument. Finally, equalize the weights in the coordinates $\{(b,c'):b\in\mathcal{B}\}$. The cost incurred for this step is at most $(1-r)\sum_{b\in\mathcal{B}}|\mu(\tilde{F}[b])-\mu(\tilde{K}[b])|$. Thus, the total cost is

\[
\sum_{b\in\mathcal{B}}\left(r\|\tilde{F}[b]-\tilde{K}[b]\|_1+|\mu(\tilde{F}[b])-\mu(\tilde{K}[b])|\right)=r\|\tilde{F}-\tilde{K}\|_1+\sum_{b\in\mathcal{B}}|\mu(\tilde{F}[b])-\mu(\tilde{K}[b])|.
\]

Observe that the term $\sum_{b\in\mathcal{B}}|\mu(\tilde{F}[b])-\mu(\tilde{K}[b])|$ is simply the $\ell_1$ distance between $\tilde{F}P$ and $\tilde{K}P$, where $P\in\mathbb{R}^{(\mathcal{B}\times\mathcal{C})\times\mathcal{B}}$ is the matrix that maps a vector to its sum along each coordinate in $\mathcal{B}$. Thus, we may form the upper bound

\begin{align*}
\mathbb{E}[d_{\textsf{EM}}(\tilde{F},\tilde{K})] &\leq r\,\mathbb{E}[\|\tilde{F}-\tilde{K}\|_1]+\mathbb{E}[\|(\tilde{F}-\tilde{K})P\|_1]\\
&\leq r\,\mathbb{E}[\sqrt{st}\,\|\tilde{F}-\tilde{K}\|_2]+\mathbb{E}[\sqrt{s}\,\|(\tilde{F}-\tilde{K})P\|_2]\\
&\leq r\sqrt{st\,\mathbb{E}[\|\tilde{F}-\tilde{K}\|_2^2]}+\sqrt{s\,\mathbb{E}[\|(\tilde{F}-\tilde{K})P\|_2^2]}. \tag{25}
\end{align*}
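The role of the cluster-sum matrix $P$ and of the norm conversions in this chain can be checked on a small instance. The Python sketch below is illustrative only. It assumes $|\mathcal{B}|=s$ clusters with $|\mathcal{C}|=t$ items each (as suggested by the factors $\sqrt{st}$ and $\sqrt{s}$), flattens the index $(b,c)$ as $bt+c$, and uses randomly generated stand-ins for $\tilde{F}$ and $\tilde{K}$; all helper names are hypothetical.

```python
import math
import random

s, t = 3, 4  # assumed sizes: s clusters, t items per cluster
rng = random.Random(0)

def random_distribution(dim):
    """A hypothetical probability vector of the given length."""
    raw = [rng.random() for _ in range(dim)]
    total = sum(raw)
    return [x / total for x in raw]

# Cluster-sum matrix P of shape (s*t) x s: entry ((b, c), b') is 1 exactly when b == b'.
P = [[1.0 if row // t == b else 0.0 for b in range(s)] for row in range(s * t)]

def times_P(v):
    """Row vector v (length s*t) times P, giving the s cluster totals."""
    return [sum(v[row] * P[row][b] for row in range(s * t)) for b in range(s)]

def l1(v):
    return sum(abs(x) for x in v)

def l2(v):
    return math.sqrt(sum(x * x for x in v))

F = random_distribution(s * t)
K = random_distribution(s * t)
diff = [f - k for f, k in zip(F, K)]

# Sum over clusters of |mu(F[b]) - mu(K[b])|, computed directly and through P.
mass_gap = sum(abs(sum(F[b * t:(b + 1) * t]) - sum(K[b * t:(b + 1) * t])) for b in range(s))
proj_gap = l1(times_P(diff))
```

The second inequality in the chain is Cauchy-Schwarz, $\|v\|_1\leq\sqrt{\dim v}\,\|v\|_2$, applied in dimensions $st$ and $s$; the third is Jensen's inequality for the concave square root.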

Now, we will bound (25) given this estimator. In the following, let $A_x$ denote the $x$th row of the matrix $A$. Observe that

\begin{align*}
\tilde{F}-\tilde{K} &=\frac{1}{mn}\sum_{i=1}^{mn}z_iB-\tilde{K}AB\\
&=\frac{1}{mn}\sum_{i=1}^{mn}z_iB-\frac{1}{mn}\sum_{i=1}^{mn}e_{k_i}AB\\
&=\frac{1}{mn}\sum_{i=1}^{mn}z_iB-\frac{1}{mn}\sum_{i=1}^{mn}A_{k_i}B\\
&=\frac{1}{mn}\sum_{i=1}^{mn}(z_i-A_{k_i})B.
\end{align*}

Define $w_i=z_i-A_{k_i}$, and notice that $\mathbb{E}[w_i]=\mathbb{E}[z_i]-A_{k_i}=0$. Thus,

\begin{align*}
\mathbb{E}[\|\tilde{F}-\tilde{K}\|_2^2] &=\mathbb{E}[(\tilde{F}-\tilde{K})(\tilde{F}-\tilde{K})^T]\\
&=\left(\frac{1}{mn}\right)^2\mathbb{E}\left[\left(\sum_{i=1}^{mn}w_iB\right)\left(\sum_{i=1}^{mn}B^Tw_i^T\right)\right]\\
&=\left(\frac{1}{mn}\right)^2\sum_{i,j=1}^{mn}\mathbb{E}[w_iBB^Tw_j^T]\\
&=\left(\frac{1}{mn}\right)^2\sum_{i=1}^{mn}\mathbb{E}[w_iBB^Tw_i^T],
\end{align*}

where the last step holds because the $w_i$ are independent with mean zero, so the cross terms with $i\neq j$ vanish. Now, we have

\begin{align*}
\mathbb{E}[w_iBB^Tw_i^T] &=\mathbb{E}[z_iBB^Tz_i^T]-\mathbb{E}[A_{k_i}BB^TA_{k_i}^T]\\
&=\mathbb{E}[z_iBB^Tz_i^T]-e_{k_i}e_{k_i}^T\\
&\leq\|B^T\|_{1,2}^2-1.
\end{align*}
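The first equality above is the usual variance decomposition $\mathbb{E}[(z-\mathbb{E}z)M(z-\mathbb{E}z)^T]=\mathbb{E}[zMz^T]-(\mathbb{E}z)M(\mathbb{E}z)^T$ with $M=BB^T$. The Python sketch below verifies it by exact enumeration on a toy case. It is only an illustration: it assumes $z$ is a one-hot report whose mean is the probability vector `probs`, and the matrix `B` and the helper `quad_form` are hypothetical choices made here.

```python
def quad_form(u, M, v):
    """Compute u M v^T for row vectors u, v and a square matrix M."""
    return sum(u[i] * M[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

# Hypothetical distribution of a one-hot report z over 3 outputs; its mean is `probs`.
probs = [0.5, 0.3, 0.2]
dim = len(probs)
one_hot = [[1.0 if j == y else 0.0 for j in range(dim)] for y in range(dim)]

# M = B B^T for a hypothetical 3x3 matrix B.
B = [[2.0, -0.5, 0.0], [0.0, 1.5, -0.5], [-1.0, 0.0, 1.0]]
M = [[sum(B[i][k] * B[j][k] for k in range(dim)) for j in range(dim)] for i in range(dim)]

def centered(z):
    """w = z - E[z], the centered report."""
    return [zj - aj for zj, aj in zip(z, probs)]

# Left side: E[w M w^T], by exact enumeration of the three outcomes.
lhs = sum(p * quad_form(centered(z), M, centered(z)) for p, z in zip(probs, one_hot))

# Right side: E[z M z^T] - (E z) M (E z)^T.
rhs = sum(p * quad_form(z, M, z) for p, z in zip(probs, one_hot)) - quad_form(probs, M, probs)
```

The identity holds for any matrix $M$, since the two cross terms each equal $(\mathbb{E}z)M(\mathbb{E}z)^T$; only the final inequality in the display depends on the structure of $B$.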

Putting it all together, we have

\[
\mathbb{E}[\|\tilde{F}-\tilde{K}\|_2^2]\leq\frac{\|B^T\|_{1,2}^2-1}{mn}.
\]

To control the term $\|(\tilde{F}-\tilde{K})P\|_2^2$ in (25), using similar steps, we may write

\[
\mathbb{E}[\|(\tilde{F}-\tilde{K})P\|_2^2]\leq\left(\frac{1}{mn}\right)^2\sum_{i=1}^{mn}\mathbb{E}[w_iBPP^TB^Tw_i^T].
\]

Similarly, for any $i$ we have

\begin{align*}
\mathbb{E}[w_iBPP^TB^Tw_i^T] &=\mathbb{E}[z_iBPP^TB^Tz_i^T]-\mathbb{E}[A_{k_i}BPP^TB^TA_{k_i}^T]\\
&\leq\|P^TB^T\|_{1,2}^2-1,
\end{align*}

and this implies

\[
\mathbb{E}[\|(\tilde{F}-\tilde{K})P\|_2^2]\leq\frac{\|P^TB^T\|_{1,2}^2-1}{mn}.
\]

Substituting into (25), we obtain the desired bound.

D.3. Proof of Theorem 6.6

See Theorem 6.6. For positive constants $a, b, c$, the matrix $A$ is given by

\[
A = aI_{\mathcal{X}} + (bI_{\mathcal{B}} + c\textbf{1}_{\mathcal{B}}) \otimes \textbf{1}_{\mathcal{C}},
\]

where

\begin{align*}
a &= \frac{e^{\alpha_0} - e^{(1-r)\alpha_0}}{e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} + (s-1)t} \\
b &= \frac{e^{(1-r)\alpha_0} - 1}{e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} + (s-1)t} \\
c &= \frac{1}{e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} + (s-1)t}.
\end{align*}

The matrix $A$ is invertible, and

\[
A^{-1} = a'I_{\mathcal{X}} + (b'I_{\mathcal{B}} + c'\textbf{1}_{\mathcal{B}}) \otimes \textbf{1}_{\mathcal{C}},
\]

where

\begin{align*}
a' &= \frac{e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} + (s-1)t}{e^{\alpha_0} - e^{(1-r)\alpha_0}} \\
b' &= -\frac{(e^{(1-r)\alpha_0} - 1)(e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} + (s-1)t)}{(e^{\alpha_0} - e^{(1-r)\alpha_0})(e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} - t)} \\
c' &= -\frac{1}{e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} - t}.
\end{align*}
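The closed form for $A^{-1}$ can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes $|\mathcal{B}| = s$ and $|\mathcal{C}| = t$, reads the bold $\textbf{1}$ as an all-ones square matrix, and uses arbitrary illustrative values for $s$, $t$, $r$, $\alpha_0$ that do not come from the paper. The helper name `build_A_and_inverse` is hypothetical.

```python
import numpy as np

def build_A_and_inverse(s, t, r, alpha0):
    """Build A and its claimed closed-form inverse from the displayed coefficients."""
    e1 = np.exp(alpha0)
    e2 = np.exp((1 - r) * alpha0)
    D = e1 + (t - 1) * e2 + (s - 1) * t  # common denominator of a, b, c
    G = e1 + (t - 1) * e2 - t            # denominator appearing in b' and c'

    a, b, c = (e1 - e2) / D, (e2 - 1) / D, 1 / D
    a_p = D / (e1 - e2)
    b_p = -(e2 - 1) * D / ((e1 - e2) * G)
    c_p = -1 / G

    I_X, I_B = np.eye(s * t), np.eye(s)
    J_B, J_C = np.ones((s, s)), np.ones((t, t))  # assumed meaning of the bold 1
    A = a * I_X + np.kron(b * I_B + c * J_B, J_C)
    A_inv = a_p * I_X + np.kron(b_p * I_B + c_p * J_B, J_C)
    return A, A_inv

# Illustrative parameters only (assumed, not from the paper).
A, A_inv = build_A_and_inverse(s=3, t=2, r=0.5, alpha0=1.0)
print(np.allclose(A @ A_inv, np.eye(6)))
```

The check relies only on `np.kron`, so the block structure of the matrices matches the Kronecker-product notation used in the text.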

It is easy to show the identity $a' + tb' + stc' = 1$. Each row of $A^{-1}$ consists of one copy of $a' + b' + c'$, $t-1$ copies of $b' + c'$, and $(s-1)t$ copies of $c'$. Thus,

\begin{align*}
&\|(A^{-1})^T\|_{1\rightarrow 2}^2 - 1 \\
&\;= (a'+b'+c')^2 + (t-1)(b'+c')^2 + (s-1)t(c')^2 - 1 \\
&\;= (1 - (t-1)b' - (st-1)c')^2 + (t-1)(b')^2 \\
&\qquad + 2(t-1)b'c' + (t-1)(c')^2 + (s-1)t(c')^2 - 1 \\
&\;= -2(t-1)b' - 2(st-1)c' + 2(t-1)(st-1)c'b' \\
&\qquad + (t-1)^2(b')^2 + (st-1)^2(c')^2 + (t-1)(b')^2 \\
&\qquad + 2(t-1)b'c' + (t-1)(c')^2 + (s-1)t(c')^2 \\
&\;\leq (tb')^2 + 2st^2 b'c' + (stc')^2 - 2tb' - 2stc' \\
&\;\leq (tb' + stc')^2 - 2(tb' + stc') \\
&\;= (a')^2 - 1.
\end{align*}

Substituting, we obtain

\begin{align*}
(a')^2 - 1 &\leq \left(\frac{te^{(1-r)\alpha_0} + (s-1)t}{e^{\alpha_0} - e^{(1-r)\alpha_0}}\right)^2 + 2\left(\frac{te^{(1-r)\alpha_0} + (s-1)t}{e^{\alpha_0} - e^{(1-r)\alpha_0}}\right) \\
&\leq \frac{t^2e^{2\alpha_0} + 2(s-1)t^2e^{\alpha_0} + (s-1)^2t^2 + 2te^{2\alpha_0} + 2(s-1)te^{\alpha_0}}{(e^{\alpha_0} - e^{(1-r)\alpha_0})^2} \\
&\leq \left(t\,\frac{e^{\alpha_0} + s}{e^{\alpha_0} - e^{(1-r)\alpha_0}}\right)^2.
\end{align*}
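As a rough numerical illustration of this chain of bounds, the sketch below evaluates each quantity for one arbitrary parameter choice. It assumes that $\|M\|_{1\rightarrow 2}$ is the largest column $\ell_2$ norm of $M$ (so that $\|(A^{-1})^T\|_{1\rightarrow 2}$ is the largest row norm of $A^{-1}$), that $|\mathcal{B}| = s$ and $|\mathcal{C}| = t$, and that the bold $\textbf{1}$ is an all-ones matrix. The parameter values and the helper name `inverse_coeffs` are hypothetical.

```python
import numpy as np

def inverse_coeffs(s, t, r, alpha0):
    """Return (a', b', c') as given in the closed form for the inverse of A."""
    e1 = np.exp(alpha0)
    e2 = np.exp((1 - r) * alpha0)
    D = e1 + (t - 1) * e2 + (s - 1) * t
    G = e1 + (t - 1) * e2 - t
    return D / (e1 - e2), -(e2 - 1) * D / ((e1 - e2) * G), -1 / G

# Illustrative parameters only (assumed, not from the paper).
s, t, r, alpha0 = 3, 2, 0.5, 1.0
a_p, b_p, c_p = inverse_coeffs(s, t, r, alpha0)

A_inv = a_p * np.eye(s * t) + np.kron(
    b_p * np.eye(s) + c_p * np.ones((s, s)), np.ones((t, t))
)

# Largest squared row norm of A^{-1}, i.e. the squared 1->2 norm of its transpose.
row_sq = np.max(np.sum(A_inv**2, axis=1))
closed_form = (a_p + b_p + c_p) ** 2 + (t - 1) * (b_p + c_p) ** 2 + (s - 1) * t * c_p**2

lhs = row_sq - 1
mid = a_p**2 - 1
rhs = (t * (np.exp(alpha0) + s) / (np.exp(alpha0) - np.exp((1 - r) * alpha0))) ** 2

print(np.isclose(a_p + t * b_p + s * t * c_p, 1.0))  # row-sum identity
print(np.isclose(row_sq, closed_form))               # row structure of the inverse
print(lhs <= mid <= rhs)                             # chain of upper bounds
```

A single parameter setting is of course no substitute for the proof; the check only guards against transcription slips in the coefficients.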

Next, it’s easy to see that

\begin{align*}
A^{-1}P &= \left(a'I_{\mathcal{X}} + ((b'I_{\mathcal{B}} + c'\textbf{1}_{\mathcal{B}}) \otimes \textbf{1}_{\mathcal{C}})\right)(I_{\mathcal{B}} \otimes 1_{\mathcal{C}}) \\
&= a'I_{\mathcal{B}} \otimes 1_{\mathcal{C}} + (b'I_{\mathcal{B}} + c'\textbf{1}_{\mathcal{B}}) \otimes t1_{\mathcal{C}}.
\end{align*}

Each row of the latter consists of one copy of $a' + tb' + tc'$ and $s-1$ copies of $tc'$. This gives us

\begin{align*}
\|(A^{-1}P)^T\|_{1\rightarrow 2}^2 - 1 &= (a' + tb' + tc')^2 + (s-1)(tc')^2 - 1 \\
&= (1 - (s-1)tc')^2 + (s-1)(tc')^2 - 1 \\
&= s(s-1)(tc')^2 - 2(s-1)(tc') \\
&\leq (stc')^2 - 2(stc').
\end{align*}

Substituting, we obtain

\begin{align*}
(stc')^2 - 2(stc') &= \frac{st\left(st + 2(e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} - t)\right)}{(e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} - t)^2} \\
&\leq \frac{st^2(s + 2(e^{\alpha_0} - 1))}{(e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} - t)^2}.
\end{align*}
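The same kind of spot check applies to the bound on $A^{-1}P$. The sketch below assumes that $P = I_{\mathcal{B}} \otimes 1_{\mathcal{C}}$ with $1_{\mathcal{C}}$ a column vector of $t$ ones (so $P$ has $s$ columns), that $|\mathcal{B}| = s$ and $|\mathcal{C}| = t$, and that the $1 \rightarrow 2$ norm of a transpose is the largest row $\ell_2$ norm. The parameter values are arbitrary and the helper name `inverse_coeffs` is hypothetical.

```python
import numpy as np

def inverse_coeffs(s, t, r, alpha0):
    """Return (a', b', c') and the shared denominator G of b' and c'."""
    e1 = np.exp(alpha0)
    e2 = np.exp((1 - r) * alpha0)
    D = e1 + (t - 1) * e2 + (s - 1) * t
    G = e1 + (t - 1) * e2 - t
    return D / (e1 - e2), -(e2 - 1) * D / ((e1 - e2) * G), -1 / G, G

# Illustrative parameters only (assumed, not from the paper).
s, t, r, alpha0 = 3, 2, 0.5, 1.0
a_p, b_p, c_p, G = inverse_coeffs(s, t, r, alpha0)

A_inv = a_p * np.eye(s * t) + np.kron(
    b_p * np.eye(s) + c_p * np.ones((s, s)), np.ones((t, t))
)
P = np.kron(np.eye(s), np.ones((t, 1)))  # assumed form of P, shape (s*t, s)

M = A_inv @ P
row_sq = np.max(np.sum(M**2, axis=1))  # squared 1->2 norm of (A^{-1} P)^T
closed_form = (a_p + t * b_p + t * c_p) ** 2 + (s - 1) * (t * c_p) ** 2

lhs = row_sq - 1
mid = (s * t * c_p) ** 2 - 2 * (s * t * c_p)
rhs = s * t**2 * (s + 2 * (np.exp(alpha0) - 1)) / G**2

print(np.isclose(row_sq, closed_form))  # row structure of A^{-1} P
print(lhs <= mid <= rhs)                # chain of upper bounds
```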

Applying Theorem 6.5, we obtain

\begin{align*}
&\mathbb{E}[d_{\textsf{EM}}(\tilde{F}, \tilde{K})] \\
&\;\leq r\sqrt{\frac{st((a')^2 - 1)}{mn}} + \sqrt{\frac{s^2t(st(c')^2 - 2c')}{mn}} \\
&\;\leq r\sqrt{\frac{st^3}{mn}}\left(\frac{e^{\alpha_0} + s}{e^{\alpha_0} - e^{(1-r)\alpha_0}}\right) + \sqrt{\frac{s^2t^2}{mn}}\left(\frac{\sqrt{s + 2(e^{\alpha_0} - 1)}}{e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} - t}\right),
\end{align*}

finishing the claim. To obtain an asymptotic bound (with budget $\alpha = \varepsilon/r$), we plug in (6), which says that we may set

\[
\alpha_0 = \begin{cases}
\dfrac{\alpha}{32\sqrt{m\ln(4m\exp(\alpha)/\delta)}} & \text{if } \alpha \leq 32\sqrt{m\ln(4m\exp(\alpha)/\delta)} \\[2ex]
2\ln\left(\dfrac{\varepsilon}{16r\sqrt{m\ln(4m\exp(\alpha)/\delta)}}\right) & \text{if } 32r\sqrt{m\ln(4m\exp(\alpha)/\delta)} \leq \varepsilon \leq rm
\end{cases}.
\]

In the first case, we have

\begin{align*}
\frac{e^{\alpha_0} + s}{e^{\alpha_0} - e^{(1-r)\alpha_0}} &\leq \frac{s}{r\alpha_0} \\
\frac{\sqrt{s + 2(e^{\alpha_0} - 1)}}{e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} - t} &\leq \frac{2\sqrt{s}}{\alpha_0 t},
\end{align*}

and this implies

\begin{align*}
\mathbb{E}[d_{\textsf{EM}}(\tilde{K}, \tilde{F})] &\leq \sqrt{\frac{s^3t^3}{mn}}\,\frac{1}{\alpha_0} + \sqrt{\frac{s^3t^2}{mn}}\,\frac{2}{\alpha_0} \\
&\leq \frac{64r(st)^{3/2}\sqrt{\ln(4m\exp(\tfrac{\varepsilon}{r})/\delta)}}{\varepsilon\sqrt{n}}.
\end{align*}

In the second case, we have

\begin{align*}
\frac{e^{\alpha_0} + s}{e^{\alpha_0} - e^{(1-r)\alpha_0}} = \frac{1 + s/e^{\alpha_0}}{1 - e^{-r\alpha_0}} &\leq 2\,\frac{1 + se^{-\alpha_0}}{\min\{1, r\alpha_0\}} \\
\frac{\sqrt{s + 2(e^{\alpha_0} - 1)}}{e^{\alpha_0} + (t-1)e^{(1-r)\alpha_0} - t} &\leq \frac{\sqrt{2(s + e^{\alpha_0})}}{e^{\alpha_0}}.
\end{align*}

This implies

$$
\begin{aligned}
\mathbb{E}[d_{\textsf{EM}}(\tilde{K},\tilde{F})]
&\leq 2\left(1+\tfrac{1}{r\alpha_0}\right)(1+se^{-\alpha_0})\,r\sqrt{\frac{st^3}{mn}} + 2\left(e^{-\alpha_0}\sqrt{s}+e^{-\alpha_0/2}\right)\sqrt{\frac{s^2t^2}{mn}} \\
&\leq 2(1+se^{-\alpha_0})\sqrt{\frac{st^3}{mn}} + 2\left(e^{-\alpha_0}\sqrt{s}+e^{-\alpha_0/2}\right)\sqrt{\frac{s^2t^2}{mn}} \\
&\leq 2\left(1+\sqrt{s}\,e^{-\alpha_0/2}+se^{-\alpha_0}\right)\sqrt{\frac{st^3}{mn}} \\
&\leq 4(1+se^{-\alpha_0})\sqrt{\frac{st^3}{mn}} \\
&\leq 4\sqrt{\frac{st^3}{mn}} + 1024\,\frac{r^2\sqrt{ms^3t^3}}{\varepsilon^2\sqrt{n}}\ln\left(4m\exp(\varepsilon/r)/\delta\right) \\
&\leq 4\sqrt{\frac{st^3}{mn}} + 32\,\frac{r\sqrt{s^3t^3}}{\varepsilon\sqrt{n}}\sqrt{\ln\left(4m\exp(\varepsilon/r)/\delta\right)}.
\end{aligned}
$$

In both cases, the desired bound has been shown.

D.4. Proof of Lemma 6.7

We use the bound that $d_{\textsf{EM}}(\tilde{K},\tilde{F}) \leq \|\tilde{K}-\tilde{F}\|_1$. In each coordinate, the expected error introduced by the Laplace noise is at most $O(\frac{1}{n\varepsilon})$, and thus $\mathbb{E}[\|\tilde{K}-\tilde{F}\|_1] \leq O(\frac{k}{n\varepsilon})$. Normalizing will only reduce this error.
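
The per-coordinate step can be illustrated by simulation. The sketch below assumes that each of the $k$ coordinates receives independent Laplace noise of scale exactly $\frac{1}{n\varepsilon}$; the text above only states the order $O(\frac{1}{n\varepsilon})$, so the constant is an assumption, and the helper names are hypothetical. Since $\mathbb{E}|X| = b$ for $X \sim \mathrm{Lap}(b)$, the expected $\ell_1$ error before normalization should be close to $\frac{k}{n\varepsilon}$.

```python
import random

def laplace_sample(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def mean_l1_error(k, n, eps, trials=2000, seed=0):
    # Monte Carlo estimate of E[||noise||_1] over k coordinates.
    rng = random.Random(seed)
    scale = 1.0 / (n * eps)  # assumed noise scale, see lead-in
    total = 0.0
    for _ in range(trials):
        total += sum(abs(laplace_sample(scale, rng)) for _ in range(k))
    return total / trials

k, n, eps = 50, 1000, 0.5
print(mean_l1_error(k, n, eps), k / (n * eps))
```

The simulation covers only the noise term; it does not model the normalization step, which the proof argues can only reduce the error.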

D.5. Proof of Corollary 6.8

Our mechanism will simply combine the itemsets into one large itemset $K$ with $mn$ elements (and one global user), and then apply the algorithm of Theorem 6.6. By Theorem 4.4, the privacy budget is $(\alpha,\delta)$, where

$$
\alpha_0 = \begin{cases}
\dfrac{\alpha\sqrt{n}}{32\sqrt{m\ln(4me^{\alpha}/\delta)}} & \text{if } \alpha\sqrt{n} \leq 32\sqrt{m\ln(4me^{\alpha}/\delta)} \\[2ex]
2\ln\left(\dfrac{\alpha\sqrt{n}}{16\sqrt{m\ln(4me^{\alpha}/\delta)}}\right) & \text{if } 32\sqrt{m\ln(4me^{\alpha}/\delta)} < \alpha\sqrt{n} < m\sqrt{n}
\end{cases}
$$
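
To make the case split concrete, here is a minimal sketch that evaluates $\alpha_0$ from $(\alpha, n, m, \delta)$ exactly as written in the display above. The function name `alpha_0` and the choice to raise an error outside the two stated regimes are illustrative assumptions, not part of the paper.

```python
import math

def alpha_0(alpha, n, m, delta):
    # ln(4 m e^alpha / delta), expanded to avoid overflow for large alpha
    log_term = math.log(4 * m) + alpha - math.log(delta)
    root = math.sqrt(m * log_term)
    a = alpha * math.sqrt(n)
    if a <= 32 * root:
        # First regime: alpha_0 is linear in alpha * sqrt(n)
        return a / (32 * root)
    if a < m * math.sqrt(n):
        # Second regime: alpha_0 grows logarithmically
        return 2 * math.log(a / (16 * root))
    # The display above does not define alpha_0 for alpha >= m
    raise ValueError("alpha * sqrt(n) >= m * sqrt(n): outside the stated regimes")

print(alpha_0(alpha=1.0, n=100, m=2, delta=1e-6))    # first regime
print(alpha_0(alpha=5.0, n=10_000, m=10, delta=1e-6))  # second regime
```

At the boundary $\alpha\sqrt{n} = 32\sqrt{m\ln(4me^{\alpha}/\delta)}$ the first expression equals $1$ while the second tends to $2\ln 2$, so the two branches as stated do not meet continuously there.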

Following the proof in Section D.3 (and setting $\alpha = \frac{\varepsilon}{r}$), we can show that

$$\mathbb{E}[d_{\textsf{EM}}(\tilde{K},\tilde{F})] \leq 4\sqrt{\frac{st^3}{mn}} + 64\,\frac{r\sqrt{s^3t^3}}{\varepsilon n}\sqrt{\ln\left(4m\exp(\varepsilon/r)/\delta\right)}.$$