Private prediction for large-scale synthetic text generation
(Authors ordered alphabetically. Author contributions are listed at the end.)

Kareem Amin  Alex Bie  Weiwei Kong  Alexey Kurakin
Natalia Ponomareva  Umar Syed  Andreas Terzis  Sergei Vassilvitskii
Google
{kamin,alexbie,weiweikong,kurakin,nponomareva,usyed,aterzis,sergeiv}@google.com
Abstract

We present an approach for generating differentially private synthetic text using large language models (LLMs), via private prediction. In the private prediction framework, we only require the output synthetic data to satisfy differential privacy guarantees. This is in contrast to approaches that train a generative model on potentially sensitive user-supplied source data and seek to ensure the model itself is safe to release. We prompt a pretrained LLM with source data, but ensure that next-token predictions are made with differential privacy guarantees. Previous work in this paradigm reported generating a small number of examples (<10) at reasonable privacy levels, an amount of data that is useful only for downstream in-context learning or prompting. In contrast, we make changes that allow us to generate thousands of high-quality synthetic data points, greatly expanding the set of potential applications. Our improvements come from an improved privacy analysis and a better private selection mechanism, which makes use of the equivalence between the softmax layer for sampling tokens in LLMs and the exponential mechanism. Furthermore, we introduce a novel use of public predictions via the sparse vector technique, in which we do not pay privacy costs for tokens that are predictable without sensitive data; we find this to be particularly effective for structured data.

1 Introduction

Differentially private mechanisms process a source dataset potentially containing sensitive user information and perform a useful computation — as simple as calculating a mean, or as complex as training an ML model — whose output can be safely shared while protecting the privacy of users who contributed to the dataset.

Perhaps the most general-purpose differentially private mechanism is one that produces a synthetic version of its input dataset, as the output of such a mechanism would be suitable for all the same purposes as the original dataset. For example, a private synthetic dataset can be used to train an ML model, but can also be used for auxiliary tasks such as feature engineering, hyperparameter tuning, and quality monitoring.

There has been recent interest in using large language models (LLMs) to generate differentially private versions of text datasets. Existing approaches can be classified into several categories. Private fine-tuning methods privately adjust the parameters of an LLM on the input dataset, using an algorithm such as differentially private stochastic gradient descent (DP-SGD), and then prompt the LLM to generate similar text. Fine-tuning methods have been used to produce high-quality synthetic data, but the training procedure can be prohibitively expensive, available only to those with the time, compute, and access necessary to train state-of-the-art LLMs containing billions of parameters.

Private prediction methods do not modify the LLM parameters at all. Instead, they directly prompt the LLM with text from the source dataset, asking for similar text in response, and then perturb the LLM’s token distribution (i.e., its last layer) to ensure that each sampled token, and thus the entire generated response, is private. Since no training is required, private prediction methods can quickly generate synthetic data, typically producing some data within minutes instead of hours, which allows for rapid prototyping and iteration. However, unlike private fine-tuning, the guarantees of private prediction methods degrade with the volume of data that is generated. Consequently, existing private prediction methods have mostly been used in applications that require only small amounts of synthetic data [Tang et al., 2024], sharply limiting their practicality.

In this paper we describe a new private prediction method that produces hundreds of times as much synthetic data as a state-of-the-art private prediction method, while maintaining a comparable privacy guarantee. Similar to some existing work, our method is based on running LLM inference on several subsets of the input data in parallel and privately aggregating their token distributions to generate synthetic text. However, our approach is distinguished by three novel algorithmic elements that lead to its improved performance:

  1. Instead of protecting the privacy of the entire token distribution with Gaussian or Laplace noise, we leverage the uncertainty inherent in sampling to ensure privacy. We clip and aggregate token logits before standard softmax sampling, which is differentially private because it can be viewed as an instance of the exponential mechanism. Compared to prior work, our approach distorts the original token distributions much less to achieve the same level of privacy.

  2. Previous work generated each token using a random subset of the input data, leveraging privacy amplification by subsampling in their analysis. This is computationally undesirable, as it requires repeated re-computation of the prefix for each decoding step, and limits scalability towards generating large synthetic corpora. Instead, we generate each synthetic example using a fixed disjoint subset of the input data, which yields substantial savings in privacy cost (by leveraging parallel composition) while allowing us to pay a linear amount of non-attention FLOPs, rather than quadratic, in terms of sequence length (via KV cache accelerated decoding).

  3. Our method uses an auxiliary token distribution from an LLM without access to sensitive data, and draws the next token from that distribution whenever it is very similar to the token distribution induced by the sensitive data. Our method incurs no privacy cost when outputting “obvious” tokens, and as a result, only a fraction of the tokens in the synthetic data are generated using sensitive data (as little as 20% in structured datasets). We leverage the sparse vector technique to privately calculate distributional similarity.

Taken together, the combination of these algorithmic techniques leads to significant improvements over prior work. Informally, (1) and (2) above keep our inference closely aligned to standard (non-DP) inference.

In our experiments, we generate private synthetic versions of publicly available benchmark machine learning datasets, and then use the synthetic datasets for downstream classification and extraction tasks. Owing to the increased quantity and quality of our synthetic data, we improve over an existing state-of-the-art private prediction method in terms of downstream accuracy. Furthermore, while prior work in this paradigm only generated a small number of examples (<10), we demonstrate the ability to generate thousands of training examples, enough for fine-tuning downstream models.

Finally, since synthetic data is intended for a wide variety of applications, we also evaluate data quality using a metric that is orthogonal to performance on a downstream classification task. Specifically, we generate synthetic versions of a publicly available dataset containing highly structured data records, each of which is encoded as a JSON object. Our results demonstrate that the sparse vector technique helps preserve data structure at high privacy levels.

2 Related work

Private fine-tuning is widely used for synthetic text generation. Yue et al. [2023] created private synthetic versions of text datasets by using DP-SGD [Abadi et al., 2016] to fine-tune an LLM on the sensitive data. Kurakin et al. [2024] showed that parameter efficient approaches to fine-tuning, such as LoRA [Hu et al., 2022], can improve the quality of the synthetic data, since reducing the number of parameters also reduces the amount of noise injected into the optimization procedure. Wu et al. [2024a] took a two-stage approach: First they fine-tuned an LLM on a public dataset that closely resembled the sensitive data (which was itself generated by an LLM using carefully designed prompts), and then completed the fine-tuning process by running DP-SGD on the sensitive dataset. Concurrent to this work, Tran and Xiong [2024] describe a private fine-tuning approach for generating synthetic tabular data that is formatting compliant.

Private prediction [Dwork and Feldman, 2018] is an alternate approach to private machine learning that only guarantees the privacy of the predictions output by an ML model, and not the model itself. Private prediction has been applied to synthetic text generation by viewing each token sampled by an LLM as a ‘prediction’, and perturbing the LLM’s token distributions to ensure their privacy. Tang et al. [2024] added noise to several independent token distributions and averaged them, while Hong et al. [2024] selected the most popular token among the token distributions using the LimitedDomain mechanism [Durfee and Rogers, 2019]. These methods can avoid the time, compute, and access required to fine-tune an LLM with billions of parameters. However, a privacy loss is suffered for each token produced in this manner. As a result, previous work has only been able to generate a very small number of synthetic examples at reasonable privacy levels (fewer than 10). Other work has applied private prediction techniques to LLMs [Majmudar et al., 2022, Duan et al., 2023], including in combination with fine-tuning [Ginart et al., 2022, Flemings et al., 2024], but not for the purpose of synthetic text generation.

Finally, another distinct set of approaches are private filtering methods. Private filtering methods operate directly on whole LLM responses and a large corpus of public data that does not require protection. Yu et al. [2024] and Xie et al. [2024] used the sensitive responses to privately select similar responses from the public dataset. Similarly, Wu et al. [2024b] aggregate response embeddings and select the public response that is closest in embedding space. (Wu et al. [2024b] also proposed a non-filtering approach based on privately selecting common keywords among the sensitive data and using them to prompt an LLM.) One limitation of filtering methods is that the menu of possible responses is constructed without signal from the new source dataset.

3 Method

Before describing our algorithm for generating private synthetic text, we review the standard algorithm for LLM inference. Let $\mathcal{X}$ be the token vocabulary (i.e., the set of all possible tokens), and let $v = |\mathcal{X}|$ be the vocabulary size. A token sequence is an element of $\mathcal{X}^{*}$, and a logit vector is an element of $\mathbb{R}^{v}$ (one logit per token in the vocabulary). If $\mathbf{x}_{1}$ and $\mathbf{x}_{2}$ are token sequences then we write $\mathbf{x}_{1}\mathbf{x}_{2} \in \mathcal{X}^{*}$ to denote their concatenation.

Standard LLM inference.

A decoder-only LLM can be viewed as a function $\operatorname{logits} : \mathcal{X}^{*} \rightarrow \mathbb{R}^{v}$ that maps each token sequence to a logit vector. Standard LLM inference generates a response $\mathbf{x} \in \mathcal{X}^{*}$ by initializing $\mathbf{x} = \mathbf{p}$, where $\mathbf{p} \in \mathcal{X}^{*}$ is the prompt, and then repeatedly executing the following procedure: (1) let $\mathbf{z} = \operatorname{logits}(\mathbf{x})$; (2) draw token $x$ from $\operatorname{softmax}(\mathbf{z}/\tau)$; and (3) append $x$ to $\mathbf{x}$. Here $\operatorname{softmax}(\mathbf{z}/\tau)$ is the distribution that assigns probability proportional to $\exp(z_{i}/\tau)$ to the $i$th token, and $\tau > 0$ is a temperature parameter that flattens or sharpens the distribution. The procedure terminates when $x = \texttt{<eos>}$, a special token that indicates the end of the response.
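The three-step loop above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the paper: `toy_logits` is a hypothetical stand-in for a real LLM, the token id chosen for `<eos>` is arbitrary, and `max_new_tokens` is a safety cap that the procedure in the text does not have.

```python
import numpy as np

EOS = 0  # hypothetical token id for <eos>


def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


def toy_logits(tokens, v=5):
    """Hypothetical stand-in for an LLM: <eos> grows likelier with length."""
    z = np.zeros(v)
    z[EOS] = len(tokens) - 3.0
    return z


def generate(logits_fn, prompt, tau=1.0, max_new_tokens=50, seed=0):
    """Sample tokens one at a time until <eos> (or the safety cap)."""
    rng = np.random.default_rng(seed)
    x = list(prompt)                          # initialize x = p
    for _ in range(max_new_tokens):
        z = logits_fn(x)                      # step (1): z = logits(x)
        p = softmax(z / tau)                  # temperature-scaled distribution
        token = int(rng.choice(len(p), p=p))  # step (2): draw the next token
        x.append(token)                       # step (3): append it to x
        if token == EOS:
            break
    return x[len(prompt):]                    # response without the prompt


response = generate(toy_logits, prompt=[1, 2])
```

Raising `tau` flattens the sampling distribution and lowering it sharpens the distribution, as described above.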

Our algorithm.

One straightforward approach to generating a synthetic version of a sensitive piece of text would be to prompt an LLM with ‘Please generate text similar to: <sensitive text>’. However, this could easily lead to a privacy violation, as the response could retain the semantics of the input sensitive text.

Figure 1: Visualization of Algorithm 1 for a single token in a single batch.

Algorithm 1 describes our method for privately generating a dataset of synthetic examples $X$ from a dataset of sensitive prompts $D$. Each prompt in $D$ resembles the sample prompt given above. But instead of using a single prompt to generate a synthetic example, the algorithm uses a batch of the prompts to run several LLM inferences in parallel. Each synthetic example is generated one token at a time, with the average of the logit vectors across the inferences defining the distribution from which the next token is randomly selected. Before averaging, the logits are clipped and re-centered using the function

$$\operatorname{clip}_{c}(\mathbf{z})_{i} = \max\left(-c,\; \mathbf{z}_{i} - \max_{j}\{\mathbf{z}_{j}\} + c\right) \tag{1}$$

which maps each component of $\mathbf{z}$ into $[-c, c]$. Forcing each logit to lie in a bounded range is key to proving the privacy guarantee for our algorithm (see §4). While several functions can achieve this purpose, Eq. (1) has an additional desirable property: If the components of $\mathbf{z}$ can be shifted by a constant so that they all lie in the interval $[-c, c]$, then $\operatorname{clip}_{c}(\mathbf{z})$ is one such shift. This property is desirable because the distribution $\operatorname{softmax}(\mathbf{z})$ is invariant to any constant shift of $\mathbf{z}$. We also found that Eq. (1) performed better empirically than other functions we considered. For example, regular clipping to the range $[-c, c]$ without recentering requires twice as large $c$ to sample without distortion (see Appendix B).
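As a small illustration of Eq. (1) and its shift property, the sketch below compares recentered clipping with plain clipping on a hand-picked logit vector. The numeric values are illustrative assumptions and do not come from the experiments.

```python
import numpy as np


def clip_c(z, c):
    """Eq. (1): shift so the largest logit equals c, then floor at -c."""
    z = np.asarray(z, dtype=float)
    return np.maximum(-c, z - z.max() + c)


def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


c = 2.0
z = np.array([10.0, 9.0, 7.0])  # spread is 3 <= 2c, so a shift into [-c, c] exists
clipped = clip_c(z, c)          # [2, 1, -1]: a constant shift of z
naive = np.clip(z, -c, c)       # [2, 2, 2]: plain clipping erases the differences

same_distribution = np.allclose(softmax(clipped), softmax(z))
```

Here `same_distribution` holds because `clipped` differs from `z` only by a constant, whereas plain clipping collapses this particular `z` to a uniform distribution.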

Algorithm 1 Generate private synthetic examples using an LLM
1: Parameters: LLM logit function $\operatorname{logits}(\cdot)$, public prompt $\mathbf{p}_{\operatorname{public}}$, expected batch size $s$, maximum number of private tokens per batch $r$, clipping value $c$, noise level $\sigma$, distance function $d$, threshold $\theta$, public temperature $\tau_{\operatorname{public}}$, private temperature $\tau$
2: Input: Dataset of sensitive prompts $D$
3: Output: Dataset of synthetic examples $X$
4: Let $\mathcal{S}$ be a partition of $D$ into disjoint batches
5: for each batch $S \in \mathcal{S}$ do
6:     $\hat{\theta} \leftarrow \theta + \mathrm{Laplace}(\sigma)$
7:     $t \leftarrow 0$  # private token counter
8:     while $t < r$ do
9:         $\mathbf{x} \leftarrow$ empty token sequence
10:         while $\mathbf{x}$ does not end with <eos> do
11:             $Z \leftarrow \{\operatorname{logits}(\mathbf{p}\mathbf{x}) : \mathbf{p} \in S\}$
12:             $\mathbf{z}_{\operatorname{public}} \leftarrow \operatorname{logits}(\mathbf{p}_{\operatorname{public}}\mathbf{x})$
13:             $\hat{d} \leftarrow d(Z, \mathbf{z}_{\operatorname{public}}) + \mathrm{Laplace}(2\sigma)$
14:             if $\hat{d} > \hat{\theta}$ then
15:                 $\bar{\mathbf{z}} \leftarrow \frac{1}{s}\sum_{\mathbf{z} \in Z} \operatorname{clip}_{c}(\mathbf{z})$
16:                 $x \sim \operatorname{softmax}(\bar{\mathbf{z}}/\tau)$
17:                 $\hat{\theta} \leftarrow \theta + \mathrm{Laplace}(\sigma)$
18:                 $t \leftarrow t + 1$
19:             else
20:                 $x \sim \operatorname{softmax}(\mathbf{z}_{\operatorname{public}}/\tau_{\operatorname{public}})$
21:             Append $x$ to $\mathbf{x}$
22:         Add $\mathbf{x}$ to $X$
23: Return $X$
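To make the control flow of Algorithm 1 concrete, here is a toy sketch of the loop body for a single batch. Several pieces are assumptions for illustration only: `toy_logits` is a hypothetical stand-in for an LLM, prompts are short lists of token ids, Laplace($\sigma$) is read as Laplace noise with scale $\sigma$, and `max_len` and `max_examples` are safety caps that do not appear in Algorithm 1.

```python
import numpy as np

EOS, V = 0, 6  # hypothetical <eos> id and toy vocabulary size


def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()


def clip_c(z, c):
    return np.maximum(-c, z - np.max(z) + c)  # Eq. (1)


def distance(Z, z_public, s):
    p_private = sum(softmax(z) for z in Z) / s
    return float(np.abs(p_private - softmax(z_public)).sum())  # Eq. (2)


def toy_logits(tokens):
    """Hypothetical stand-in for an LLM; not a real model."""
    z = np.zeros(V)
    z[tokens[0] % V] += 3.0            # preference set by the first prompt token
    z[EOS] += 0.5 * len(tokens) - 3.0  # <eos> grows likelier as text grows
    return z


def generate_batch(logits, S, p_public, s, r, c, sigma, theta,
                   tau_public, tau, rng, max_len=64, max_examples=100):
    """Lines 6-22 of Algorithm 1 for one batch S, with added safety caps."""
    X = []
    theta_hat = theta + rng.laplace(scale=sigma)
    t = 0  # private token counter
    while t < r and len(X) < max_examples:
        x = []
        while (not x or x[-1] != EOS) and len(x) < max_len:
            Z = [logits(p + x) for p in S]
            z_public = logits(p_public + x)
            d_hat = distance(Z, z_public, s) + rng.laplace(scale=2 * sigma)
            if d_hat > theta_hat:
                # Private token: average clipped logits, then softmax-sample.
                z_bar = sum(clip_c(z, c) for z in Z) / s
                token = int(rng.choice(V, p=softmax(z_bar / tau)))
                theta_hat = theta + rng.laplace(scale=sigma)
                t += 1
            else:
                # Public token: sample from the public distribution.
                token = int(rng.choice(V, p=softmax(z_public / tau_public)))
            x.append(token)
        X.append(x)
    return X


rng = np.random.default_rng(0)
X = generate_batch(toy_logits, S=[[2, 1], [2, 3]], p_public=[4], s=2, r=3,
                   c=5.0, sigma=0.01, theta=0.5, tau_public=1.0, tau=1.0, rng=rng)
```

A real implementation would batch the calls to `logits` and reuse a KV cache across decoding steps; the sketch recomputes everything for clarity.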

Since the average logit vector is computed using a set of sensitive prompts, each token selected from a distribution determined by the average logit vector incurs a privacy cost. To minimize this cost, Algorithm 1 also has access to a non-sensitive public prompt, $\mathbf{p}_{\operatorname{public}}$, and uses this prompt to generate the next token whenever doing so does not significantly change the distribution from which the next token is drawn. The distance function used to make this determination is

$$d(Z, \mathbf{z}_{\operatorname{public}}) = \left\lVert \frac{1}{s}\sum_{\mathbf{z} \in Z} p_{\mathbf{z}} - p_{\mathbf{z}_{\operatorname{public}}} \right\rVert_{1} \tag{2}$$

where $p_{\mathbf{z}} = \operatorname{softmax}(\mathbf{z})$ is the token distribution corresponding to logit vector $\mathbf{z}$. When this distance is small, Algorithm 1 outputs a public token instead of a private token. The privacy guarantee for Algorithm 1 leverages the analysis of the sparse vector technique [Dwork et al., 2009], and shows that while privacy degrades with the number of private output tokens, it is independent of the number of public output tokens (see §4). Empirically, we observe that the fraction of output tokens that must be private in order to generate high-quality synthetic data can be as low as 20% for some datasets.
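The distance in Eq. (2) and the noisy threshold comparison can be sketched as follows. This is a minimal sketch under stated assumptions: Laplace($2\sigma$) is read as Laplace noise with scale $2\sigma$, and the helper name `use_private_token` is hypothetical.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()


def distance(Z, z_public, s):
    """Eq. (2): L1 gap between the (1/s)-scaled sum of private token
    distributions and the public token distribution."""
    p_private = sum(softmax(z) for z in Z) / s
    return float(np.abs(p_private - softmax(z_public)).sum())


def use_private_token(Z, z_public, s, theta_hat, sigma, rng):
    """Noisy comparison from Algorithm 1; True means sample a private token."""
    d_hat = distance(Z, z_public, s) + rng.laplace(scale=2 * sigma)
    return d_hat > theta_hat


z_pub = np.array([0.0, 100.0, 0.0])
agree = [z_pub.copy(), z_pub.copy()]          # private logits match public ones
disagree = [np.array([100.0, 0.0, 0.0])] * 2  # private logits favor another token

d_agree = distance(agree, z_pub, s=2)
d_disagree = distance(disagree, z_pub, s=2)
```

When the private and public distributions coincide the distance is zero, so the token can be drawn from the public distribution; when they put their mass on different tokens the distance approaches its maximum of 2.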

Note that the first step of Algorithm 1 partitions the input dataset of sensitive prompts into disjoint batches. We do not prescribe a procedure for assigning prompts to batches in Algorithm 1 since many batching approaches are admissible as long as they satisfy a minor technical assumption required for the privacy analysis of Algorithm 1, which we explain in §4. While the batches are not required to be any particular size, the algorithm makes most efficient use of the data if each batch has size equal to the expected batch size $s$. And while prompts can be batched together (almost) arbitrarily, more tailored batching can lead to better synthetic data quality. For example, in the experiments in §5, where we generate synthetic versions of ML training datasets, each sensitive prompt contains a label. In those experiments we assign prompts with the same label to the same batch.

Relationship to prior work.

Two major features of Algorithm 1 are that it leverages the inherent randomness of token sampling to guarantee privacy, and that it further reduces privacy cost by using public data to generate a portion of the synthetic data. Some prior work also incorporated these algorithmic ideas, but with key differences. Instead of clipping logits to ensure that the token sampling is private, Majmudar et al. [2022] mixed each sensitive token distribution with the uniform distribution. This approach induced a dependence on the vocabulary size in their privacy guarantee, and since LLM vocabularies are typically very large, the resulting privacy guarantee was quite weak: Majmudar et al. [2022] noted that setting the differential privacy parameter $\varepsilon$ (see Definition 1) lower than 50 produced synthetic data that was “unusable”. Flemings et al. [2024] guaranteed the privacy of token sampling by mixing each sensitive token distribution with a public token distribution, but their approach was based on aggregating a set of fine-tuned models, not a set of prompts. Neither Majmudar et al. [2022] nor Flemings et al. [2024] had a goal of generating synthetic data.

Tang et al. [2024] found that limiting the token vocabulary to a fixed set of the most popular 100 public tokens caused their synthetic data generation algorithm to exhibit greater stability. However, if the sensitive data contains many tokens that are rare in public data, their approach cannot produce synthetic data that is very similar to the sensitive data. By contrast, our approach compares public and private token distributions on-the-fly, and determines which one to use for sampling the next token by balancing a trade-off between privacy and quality. Also, Tang et al. [2024] used a different random subset of prompts to generate each token, and left as an open problem how to use a single subset to generate every token in a synthetic example. Our algorithm resolves this open problem, and consequently yields both improved privacy and greater computational efficiency (see §6).

4 Privacy analysis

In this section we prove that Algorithm 1 preserves the privacy of the sensitive prompts it uses to generate synthetic examples.

Let $\mathcal{D}$ be the set of all possible prompt datasets. A mechanism is a randomized function with domain $\mathcal{D}$. Note that Algorithm 1 is a mechanism. We say that a pair of prompt datasets $D, D' \in \mathcal{D}$ are neighbors if there exists a prompt $\mathbf{p}$ such that $D = D' \cup \{\mathbf{p}\}$ or $D' = D \cup \{\mathbf{p}\}$. In the differential privacy literature this is commonly referred to as the add/remove neighbor relation.

Definition 1 (Dwork et al. [2006]).

A mechanism $M$ satisfies $(\varepsilon, \delta)$-differential privacy if $\Pr[M(D) \in O] \leq e^{\varepsilon} \Pr[M(D') \in O] + \delta$ for any neighboring datasets $D, D' \in \mathcal{D}$ and any subset $O$ of the range of $M$.

Theorem 1 below provides a differential privacy guarantee for Algorithm 1. The proof of Theorem 1 requires a technical assumption about how the prompts are partitioned into batches in the first step of the algorithm.

Assumption 1.

In Algorithm 1, the assignment of a prompt to a batch depends only on the prompt itself, and not on the other prompts.

The most straightforward way to satisfy Assumption 1 is to apply a hash function to each prompt and then use the hash value to determine its assigned batch. For example, if $h$ is the hash value, $n$ is the number of prompts and $s$ is the expected batch size, then we can assign the prompt to the $(h \bmod \frac{n}{s})$th batch. If we want to batch together prompts that share a certain attribute (like a label), we can apply another hash function to that attribute and concatenate the hash values. Using hash functions for batch assignment can lead to batches whose sizes differ from the expected batch size $s$, but this does not impact the validity of Theorem 1.
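A minimal sketch of hash-based batch assignment is given below. It makes a few assumptions beyond the text: prompts are plain strings, SHA-256 serves as the hash function, the number of batches is passed in as a fixed parameter (the example in the text sets it to $n/s$), and the helper names are hypothetical.

```python
import hashlib


def stable_hash(text):
    """Deterministic integer hash (Python's built-in hash() is salted per run)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign_batch(prompt, num_batches, label=None):
    """Batch key that depends only on the prompt and, optionally, its own label."""
    index = stable_hash(prompt) % num_batches
    # Combining the label hash with the prompt hash keeps labels in separate batches.
    return index if label is None else (stable_hash(label), index)


def partition(prompts, num_batches, labels=None):
    """Group prompts into disjoint batches according to assign_batch."""
    batches = {}
    for i, prompt in enumerate(prompts):
        label = None if labels is None else labels[i]
        batches.setdefault(assign_batch(prompt, num_batches, label), []).append(prompt)
    return batches


prompts = [f"prompt {i}" for i in range(100)]
labels = ["sports" if i % 2 else "business" for i in range(100)]
batches = partition(prompts, num_batches=10, labels=labels)
```

Because each key is computed from a single prompt and its label, adding or removing one prompt leaves the batch assignment of every other prompt unchanged, which is the property Assumption 1 asks for.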

Theorem 1 (Privacy of Algorithm 1).

Suppose Assumption 1 holds. Let $\rho = r\left(\frac{1}{2}\left(\frac{c}{s\tau}\right)^{2} + \frac{2}{(s\sigma)^{2}}\right)$. For all $\varepsilon \geq 0$, Algorithm 1 satisfies $(\varepsilon, \delta)$-differential privacy, where

$$\delta = \inf_{\alpha \in (1, \infty)} \frac{e^{(\alpha - 1)(\alpha\rho - \varepsilon)}}{\alpha - 1}\left(1 - \frac{1}{\alpha}\right)^{\alpha}.$$

Also, for all $\delta \in (0, 1]$, Algorithm 1 satisfies $(\varepsilon, \delta)$-differential privacy, where

$$\varepsilon = \rho + \sqrt{4\rho\log(1/\delta)}.$$
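As a worked example, the quantities in Theorem 1 can be evaluated numerically. The sketch below is illustrative only: the parameter values are assumptions rather than settings from the experiments, and the infimum over $\alpha$ is approximated on a finite grid, which yields a valid but possibly slightly loose value of $\delta$.

```python
import math


def zcdp_rho(r, c, s, tau, sigma):
    """rho = r * (0.5 * (c / (s * tau))**2 + 2 / (s * sigma)**2)."""
    return r * (0.5 * (c / (s * tau)) ** 2 + 2.0 / (s * sigma) ** 2)


def epsilon_from_delta(rho, delta):
    """Closed-form bound: eps = rho + sqrt(4 * rho * log(1 / delta))."""
    return rho + math.sqrt(4.0 * rho * math.log(1.0 / delta))


def delta_from_epsilon(rho, eps):
    """Grid approximation of the infimum over alpha, computed in log space."""
    best = math.inf
    for k in range(-300, 301):
        a = 1.0 + 10.0 ** (k / 100.0)  # alpha - 1 ranges over [1e-3, 1e3]
        log_delta = ((a - 1.0) * (a * rho - eps)
                     - math.log(a - 1.0)
                     + a * math.log1p(-1.0 / a))
        best = min(best, log_delta)
    return 1.0 if best >= 0.0 else math.exp(best)


# Illustrative parameter values (assumptions, not the paper's settings).
rho = zcdp_rho(r=1000, c=10.0, s=1000, tau=2.0, sigma=0.05)
eps = epsilon_from_delta(rho, delta=1e-6)
delta_tight = delta_from_epsilon(rho, eps)
```

Since $\rho$ scales as $1/s^{2}$ for fixed $r$, doubling the batch size allows roughly four times as many private tokens at the same privacy level.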

The proof is in Appendix C and makes use of sharp privacy analyses of: (1) zCDP to approximate DP conversion [Canonne et al., 2020]; and (2) zCDP bounds for the exponential mechanism [Cesar and Rogers, 2021].
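One informal way to read the two terms of $\rho$ is sketched below. This is our own back-of-the-envelope reading under stated assumptions, and the proof in Appendix C may proceed differently. We assume that Laplace($\sigma$) denotes Laplace noise with scale $\sigma$, that removing one prompt changes each coordinate of the averaged clipped logits by at most $\Delta_{\bar{z}} = c/s$, that it changes the distance $d$ by at most $\Delta_{d} = 1/s$, that an $\varepsilon$-DP exponential mechanism satisfies $\varepsilon^{2}/8$-zCDP, and that a generic $\varepsilon$-DP mechanism satisfies $\varepsilon^{2}/2$-zCDP.

```latex
\begin{align*}
\varepsilon_{\mathrm{EM}} &= \frac{2\Delta_{\bar{z}}}{\tau} = \frac{2c}{s\tau},
  & \rho_{\mathrm{EM}} &= \frac{\varepsilon_{\mathrm{EM}}^{2}}{8}
     = \frac{1}{2}\left(\frac{c}{s\tau}\right)^{2}, \\
\varepsilon_{\mathrm{SVT}} &= \frac{\Delta_{d}}{\sigma} + \frac{2\Delta_{d}}{2\sigma}
     = \frac{2}{s\sigma},
  & \rho_{\mathrm{SVT}} &= \frac{\varepsilon_{\mathrm{SVT}}^{2}}{2}
     = \frac{2}{(s\sigma)^{2}}, \\
\rho &= r\left(\rho_{\mathrm{EM}} + \rho_{\mathrm{SVT}}\right).
\end{align*}
```

Under this reading, each private token pays once for the softmax sampling step and once for the noisy threshold test that triggered it, and composing over at most $r$ private tokens per batch gives the factor $r$.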

5 Experiments

Gemma 1.1 2B IT [Team, 2024] is the data generator in all private prediction experiments. We choose it for its lightweight, open-source JAX implementation, which makes it easy to implement and share sampling algorithms (https://github.com/google-deepmind/gemma). Tables 1(a) and 1(b) give an overview of the datasets and models used.

| Dataset | $n_{\text{train}}$ | Description |
|---|---|---|
| AGNews | 120,000 | 4-way news classification |
| TREC | 5,452 | 6-way query classification |
| DBPedia | 560,000 | 14-way topic classification |
| MIT-G | 2,953 | Movie genre extraction |
| MIT-D | 1,561 | Movie director extraction |
| IMDB | 25,000 | 2-way review classification |
| Yelp | 560,000 | 2-way review classification |
| WikiMoviesJSON | 27,412 | JSON with 6 fields |

(a) Overview of datasets used.

| Model | Usage |
|---|---|
| Gemma 2B 1.1 IT | Generation; private prediction |
| LaMDA 8B | Generation; DP fine-tuning |
| GPT-3 babbage-002 | Evaluation; in-context learning |
| BERT-Base 12/768 110M | Evaluation; fine-tuning |

(b) Overview of models used.
Table 1: Overview of datasets and models used. Datasets are benchmark classification and extraction tasks used in prior work on private synthetic text generation, with the exception of WikiMoviesJSON, which is used for structured data experiments. LaMDA and Gemma are used for synthetic data generation, while the other models are used to measure how useful our synthetic data is for improving accuracy on real test data.

We perform 3 sets of experiments, targeting various datasets and utility criteria:

  • In-context learning (§5.1): We generate examples to use as in-context exemplars for prompting an LLM. We report downstream accuracy on real test examples, when prompted with synthetic data, on 3 classification tasks (AGNews [Zhang et al., 2015], DBPedia [Zhang et al., 2015], TREC [Voorhees and Tice, 2000]) and 2 extraction tasks (MIT-G, MIT-D [Liu et al., 2012]).

  • Fine-tuning (§5.2): We generate synthetic examples to use for fine-tuning a BERT classifier. We report downstream accuracy on real test examples for 3 classification tasks (IMDB [Maas et al., 2011], Yelp [Zhang et al., 2015], AGNews [Zhang et al., 2015]).

  • Structured data (§5.3): We generate examples that must adhere to structural constraints to be useful synthetic data. We consider a JSON generation task (WikiMoviesJSON [Rust, 2024]), evaluating structure preservation.

5.1 In-context learning

Experimental setup.

Using our method, we generate 90 to 1500 examples using Gemma 2B 1.1 IT. We compare against real examples and against the results reported by Tang et al. [2024], who generated 4-shot examples for in-context learning. (It is no longer possible to reproduce their results, due to changes in the OpenAI API since publication: GPT-3 babbage is now deprecated, and it is no longer possible to query for top 100 logprobs, which is required by their method.) To evaluate generated synthetic data, we put synthetic examples in the context window before querying with the real test example, as shown in Figure 2.

    Classify the following examples:
    Text: lorem ipsum        # synthetic text 1
    Answer: label
    ...
    Text: sed do eiusmod     # synthetic text n
    Answer: label

    Text: excepteur sint     # test text
    Answer:

Figure 2: Example of $n$-shot in-context learning evaluation for synthetic data.
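The prompt layout in Figure 2 can be assembled programmatically. The following is a minimal sketch, not the evaluation code used in this work: the function name `build_icl_prompt` and the example labels are hypothetical, and only the line layout is taken from the figure.

```python
from typing import List, Tuple


def build_icl_prompt(exemplars: List[Tuple[str, str]], test_text: str) -> str:
    """Format n (text, label) exemplars followed by an unanswered test query."""
    lines = ["Classify the following examples:"]
    for text, label in exemplars:
        lines.append(f"Text: {text}")
        lines.append(f"Answer: {label}")
    lines.append("")  # blank line separates the exemplars from the test query
    lines.append(f"Text: {test_text}")
    lines.append("Answer:")  # left open for the in-context learner to complete
    return "\n".join(lines)


# Hypothetical 2-shot usage; the labels are placeholders.
prompt = build_icl_prompt(
    [("lorem ipsum", "World"), ("sed do eiusmod", "Sports")],
    "excepteur sint",
)
print(prompt)
```

With synthetic exemplars passed as `exemplars` and a real test example as `test_text`, the in-context learner's completion after the final `Answer:` is what gets scored.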

We perform this evaluation with GPT-3 babbage-002, which has a 16K context window. We report results on AGNews, DBPedia, TREC, MIT-G, and MIT-D using the implementation of Zhao et al. [2021].⁴ Following the work of Tang et al. [2024], we enable contextual calibration [Zhao et al., 2021] for classification but not extraction tasks. Our evaluation setup is a best-effort reproduction of their setup, which is no longer possible to completely reproduce due to changes to OpenAI API access (see Table 2 caption for more details). Due to cost, we follow prior work [Bertsch et al., 2024, Ratner et al., 2023, Lu et al., 2022, Zhao et al., 2021] and subsample test sets to 250 test examples. We run 3 seeds of label-stratified sampling of exemplars from synthetic/real data.

⁴ https://github.com/tonyzhaozh/few-shot-learning
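Label-stratified sampling of exemplars can be sketched as follows. This is an illustrative sketch rather than the sampling code used in the experiments. The function name is hypothetical, and so is the handling of the case where the number of shots is not a multiple of the number of labels: here the extra slots go to randomly chosen labels, which is our assumption.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (text, label)


def stratified_sample(pool: List[Example], shots: int, seed: int) -> List[Example]:
    """Draw `shots` exemplars, spread as evenly as possible across labels."""
    by_label: Dict[str, List[Example]] = defaultdict(list)
    for text, label in pool:
        by_label[label].append((text, label))
    labels = sorted(by_label)
    rng = random.Random(seed)  # one independent generator per seed
    base, extra = divmod(shots, len(labels))
    bonus = set(rng.sample(labels, extra))  # labels that receive one extra slot
    chosen: List[Example] = []
    for label in labels:
        k = base + (1 if label in bonus else 0)
        chosen.extend(rng.sample(by_label[label], k))
    rng.shuffle(chosen)  # avoid grouping exemplars by label in the prompt
    return chosen


# Toy pool with two labels; three seeds give three exemplar sets.
pool = [(f"text {i}", "A") for i in range(10)] + [(f"text {i}", "B") for i in range(10, 20)]
runs = [stratified_sample(pool, shots=4, seed=seed) for seed in range(3)]
```

Reporting the mean and standard deviation of accuracy over the three resulting exemplar sets matches the protocol described above.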

GPT-3 babbage-002 Acc. (%)* on each dataset; subscripts give the standard deviation. Blank cells repeat the value from the row above.

| $\varepsilon$ | Method | Shots | Reported in | Model | AGNews | DBPedia | TREC | MIT-G | MIT-D |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Zero shot | 0 | This work | - | $24.8_{0.0}$ | $12.0_{0.0}$ | $28.4_{0.0}$ | $29.6_{0.0}$ | $28.8_{0.0}$ |
| $\infty$ | Real data | 4 | This work | - | $75.3_{3.0}$ | $73.6_{0.3}$ | $34.9_{5.0}$ | $56.0_{2.0}$ | $83.1_{5.3}$ |
| | | 64 | | | $84.7_{1.5}$ | $92.5_{1.6}$ | $50.3_{6.1}$ | $56.4_{5.4}$ | $89.1_{0.7}$ |
| | Tang et al. [2024] | 4 | Tang et al. [2024]* | GPT-3 babbage | $69.3_{4.8}$ | $82.3_{3.7}$ | $50.6_{6.9}$ | $54.4_{7.0}$ | - |
| | Ours | 4 | This work | Gemma 1.1 2B IT | $76.8_{4.8}$ | $72.3_{2.5}$ | $38.8_{6.0}$ | $47.7_{2.5}$ | $81.7_{2.4}$ |
| | | 64 | | | $77.5_{1.8}$ | $91.5_{1.7}$ | $57.9_{3.4}$ | $56.4_{1.2}$ | $87.1_{0.2}$ |
| 1 | Tang et al. [2024] | 4 | Tang et al. [2024]* | GPT-3 babbage | $64.1_{3.9}$ | $81.2_{3.0}$ | $50.7_{4.1}$ | $46.3_{7.8}$ | $69.2_{7.9}$ |
| | Ours | 4 | This work | Gemma 1.1 2B IT | $75.9_{3.5}$ | $75.1_{0.5}$ | $39.2_{3.7}$ | $47.1_{6.0}$ | $84.5_{1.0}$ |
| | | 64 | | | $78.7_{1.8}$ | $90.4_{2.6}$ | $53.6_{1.3}$ | $51.6_{2.3}$ | $86.4_{0.6}$ |
Table 2: In-context learning results with GPT-3 babbage-002. We report mean and standard deviation over 3 random samplings (equally many from each label for classification; fully random for extraction) of synthetic/real data. (*) Note: For the results reported in Tang et al. [2024], they use GPT-3 babbage (now deprecated; we use GPT-3 babbage-002) as the in-context learner, and use the top 100 logprobs for contextual calibration (only top 5 are available now); while not directly comparable, we report their results for context.

Results.

Results are presented in Table 2. Our gains in quantity while maintaining quality are realized in terms of 64-shot in-context learning accuracy. In some cases, we can generate more examples, but we limit ourselves to 64 for these evaluations for cost and efficiency reasons. Our results at 64 shots are comparable to real data at 64 shots. Notably, our synthetic data at 64 shots improves over real data at 4 shots – which is roughly an upper bound on the performance of methods limited to generating 4 examples (e.g., Tang et al. [2024]). We also improve over results reported in Tang et al. [2024], although we note that there are differences in the experimental setup.

5.2 Fine-tuning

We achieve significant improvements over the best available private inference method for in-context learning tasks. Since our method is capable of generating thousands of synthetic examples at reasonable privacy budgets, it is natural to ask whether it can compete with state-of-the-art private fine-tuning methods, which can generate infinitely many synthetic examples once the up-front costs of model training are paid. This makes them capable of producing enough data to train downstream classification models.

Experiment setup.

We use our approach to generate a large quantity of synthetic data for the purpose of fine-tuning 110M-parameter BERT-Base models. We consider 3 classification tasks used in prior work on private fine-tuning [Kurakin et al., 2024], following the exact same evaluation procedure. We omit comparison to prior private prediction work (e.g., Tang et al. [2024]), as it generates only 4 examples, which is insufficient for fine-tuning.

Results.

Main results are presented in Table 3. Across various datasets and privacy levels, we generate between 2.5K (IMDB, $\varepsilon = 1$) and 200K (Yelp, $\varepsilon = 10$) examples for fine-tuning. Prior work that generated fewer than 10 examples using private prediction was unable to compare with private fine-tuning on these tasks at all. While there remains a gap between the best fine-tuning and best private inference methods on downstream classification tasks, we achieve reasonable performance, even matching or outperforming the baseline of privately tuning all the parameters in the model reported in Kurakin et al. [2024].

BERT Acc. (%) for each dataset at each $\varepsilon$; subscripts give the standard deviation. Blank cells repeat the value from the row above.

| Method | Reported in | Model | IMDB $\varepsilon=\infty$ | IMDB $\varepsilon=1$ | IMDB $\varepsilon=3$ | IMDB $\varepsilon=10$ | Yelp $\varepsilon=\infty$ | Yelp $\varepsilon=1$ | Yelp $\varepsilon=3$ | Yelp $\varepsilon=10$ | AGNews $\varepsilon=\infty$ | AGNews $\varepsilon=1$ | AGNews $\varepsilon=3$ | AGNews $\varepsilon=10$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Real data | [Kurakin et al., 2024] | - | $93.7_{0.1}$ | - | - | - | $97.6_{0.1}$ | - | - | - | $93.7_{0.1}$ | - | - | - |
| Fine-tune | [Kurakin et al., 2024] | LaMDA 8B | $93.2_{0.2}$ | $79.1_{1.7}$ | $83.9_{0.6}$ | $84.0_{0.7}$ | $95.9_{0.1}$ | $84.1_{0.3}$ | $84.6_{0.1}$ | $84.2_{0.3}$ | $91.1_{0.1}$ | $65.7_{2.9}$ | $65.3_{2.7}$ | $65.1_{5.3}$ |
| Prompt-tune | | | $92.0_{0.1}$ | $88.1_{0.4}$ | $87.4_{0.2}$ | $90.7_{0.2}$ | $93.9_{0.1}$ | $94.1_{0.1}$ | $93.5_{0.1}$ | $94.1_{0.1}$ | $88.3_{0.3}$ | $83.9_{0.8}$ | $86.2_{0.2}$ | $86.9_{0.1}$ |
| LoRA | | | $91.6_{0.2}$ | $90.0_{0.3}$ | $90.6_{0.2}$ | $91.3_{0.2}$ | $96.4_{0.1}$ | $95.5_{0.1}$ | $95.6_{0.1}$ | $95.9_{0.1}$ | $91.8_{0.2}$ | $89.4_{0.1}$ | $89.6_{0.1}$ | $90.0_{0.1}$ |
| Ours | This work | Gemma 1.1 2B IT | $83.6_{2.9}$ | $82.7_{2.1}$ | $83.6_{1.9}$ | $85.5_{2.3}$ | $91.8_{0.6}$ | $91.1_{0.2}$ | $91.6_{0.8}$ | $92.6_{0.2}$ | $81.2_{1.2}$ | $79.8_{1.8}$ | $79.3_{2.1}$ | $79.8_{0.3}$ |
| + SVT | This work | Gemma 1.1 2B IT | - | $84.3_{1.1}$ | $84.4_{1.5}$ | $85.0_{1.0}$ | - | $88.4_{0.6}$ | $89.1_{0.3}$ | $89.0_{1.9}$ | - | $79.2_{0.3}$ | $79.8_{0.4}$ | $80.4_{0.6}$ |
Table 3: Results of fine-tuning on real and synthetic data with BERT. We report mean and standard deviation over 3 runs of downstream fine-tuning and evaluation. We compare to results reported in Kurakin et al. [2024], who fine-tune a synthetic data generator with DP-SGD. We generate 2.5K-200K examples with private prediction, which suffices to train reasonably performing models.

Limited data regime.

We additionally consider the limited data regime. In Appendix A we present experiments on AGNews1K, a 1024-example subsample of AGNews. Our method, which employs parallel composition, is “pay-as-you-go”, i.e., we can put in a small amount of data to get out a small amount, while preserving quality. On the other hand, fine-tuning based approaches necessarily pay upfront to ensure the model and all future generations are private. This means that without sufficient data, all outputs will be low quality. Results in Table 5 demonstrate that our private prediction method generates more useful examples for in-context learning in this regime.

5.3 Structured data

We conclude our experiments with a demonstration of the lift in performance provided by using the sparse vector technique (SVT) against a public prompt. Informally, the privacy loss of our method only scales with the information density of a new example vis-a-vis the public prompt. This contrasts with other private inference methods that incur privacy loss on every token. This is especially useful for structured data, where we avoid incurring privacy loss on syntactic elements of the data.
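To make the role of the sparse vector technique concrete, the sketch below implements the textbook AboveThreshold variant of SVT, not the specific algorithm of this work. All names are hypothetical. The per-token scores stand in for a measure of disagreement between the private and the public next-token prediction, and both that measure and its sensitivity are assumptions for illustration.

```python
import random
from typing import Iterable, Optional


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def above_threshold(
    queries: Iterable[float],
    threshold: float,
    epsilon: float,
    sensitivity: float,
    rng: random.Random,
) -> Optional[int]:
    """Textbook AboveThreshold: return the index of the first query whose noisy
    value reaches a noisy threshold, or None if no query does."""
    noisy_threshold = threshold + laplace(2.0 * sensitivity / epsilon, rng)
    for i, q in enumerate(queries):
        if q + laplace(4.0 * sensitivity / epsilon, rng) >= noisy_threshold:
            return i  # the mechanism halts at the first above-threshold answer
    return None


# Hypothetical disagreement scores for five consecutive token positions.
scores = [0.05, 0.10, 0.02, 1.70, 0.08]
first_private = above_threshold(scores, threshold=0.9, epsilon=1.0,
                                sensitivity=0.01, rng=random.Random(0))
```

Under this reading, positions before `first_private` would be filled from the public prediction, and only the position where the test fires would require a private sample. How the threshold $\theta$ and the noise scales are set in practice is specified by the method itself, not by this sketch.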

Experiment setup.

For JSON generation, we evaluate on a dataset of information about American movies scraped from Wikipedia [Rust, 2024]. Entries contain fields such as title, year, cast, and extract (a short synopsis). We lightly curate the data: we omit uninteresting fields (i.e., thumbnail dimensions and URLs) and remove records with incomplete entries. We refer to the resulting 34,266 JSON examples with 6 fields as WikiMoviesJSON. We evaluate two criteria: the rate at which the generated output constitutes well-formed JSON (Parses (%)), and the rate at which the output passes basic schema validation (Validates (%)). Validation includes checks such as: no extra fields, all required fields are present, values are the correct type, and other custom constraints (e.g., no whitespace in the href field).
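The two criteria can be computed along the following lines. This is a sketch under assumptions, not the validator used in the experiments: the schema below lists only the five fields named in the text, the field types are guesses, and the function names are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple

# Hypothetical schema: field names follow the text, types are assumptions.
SCHEMA: Dict[str, type] = {
    "title": str,
    "year": int,
    "cast": list,
    "extract": str,
    "href": str,
}


def parses(raw: str) -> Tuple[bool, Any]:
    """Return (True, value) if `raw` is well-formed JSON, else (False, None)."""
    try:
        return True, json.loads(raw)
    except json.JSONDecodeError:
        return False, None


def validates(record: Any) -> bool:
    """Check required/extra fields, value types, and the href constraint."""
    if not isinstance(record, dict):
        return False
    if set(record) != set(SCHEMA):  # no extra fields and none missing
        return False
    for field, expected in SCHEMA.items():
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            return False
    return not any(ch.isspace() for ch in record["href"])


def rates(samples: List[str]) -> Tuple[float, float]:
    """Return (Parses %, Validates %) over all raw samples."""
    n_parse = n_valid = 0
    for raw in samples:
        ok, record = parses(raw)
        n_parse += ok
        n_valid += ok and validates(record)
    return 100.0 * n_parse / len(samples), 100.0 * n_valid / len(samples)
```

Both rates are taken over the raw samples, so a sample that fails to parse also counts against the validation rate.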

Results.

Results are in Table 4. Targeting a large number of examples at small $\varepsilon$ necessitates increasing the sampling temperature $\tau$ to ensure privacy, which compromises the well-formedness of outputs. For structured generation, there is a large number of tokens that (a) are crucial to get right for structure preservation, and (b) are easily predictable without access to sensitive data. Here the SVT enables us to get these tokens reliably and for free, leading to better structure preservation and generation quantity.
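The effect of temperature on structural tokens can be illustrated with a small softmax example. The logits below are made up, and the functions are a generic sketch of temperature sampling rather than the sampler used in this work. Sampling from a softmax over utilities divided by $\tau$ has the form of the exponential mechanism, so a larger $\tau$ corresponds to a smaller per-token privacy cost but a flatter distribution.

```python
import math
import random
from typing import List, Sequence


def softmax(logits: Sequence[float], tau: float) -> List[float]:
    """Numerically stable softmax of logits / tau."""
    m = max(logits)
    exps = [math.exp((z - m) / tau) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: Sequence[float], tau: float, rng: random.Random) -> int:
    """Draw one token index from softmax(logits / tau)."""
    probs = softmax(logits, tau)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Made-up logits: index 0 plays the role of a required structural token.
logits = [5.0, 1.0, 0.0]
for tau in (1.0, 2.0, 2.5):
    print(tau, round(softmax(logits, tau)[0], 3))
```

The probability of the required token shrinks as $\tau$ grows, and since a JSON record needs many such tokens in a row, small per-token losses compound into a much lower parse rate.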

Subscripts give the standard deviation. Blank cells repeat the value from the row above.

| $\varepsilon$ | Method | $\tau$ | Parses (%) | Validates (%) | $m$ |
|---|---|---|---|---|---|
| 1 | Ours | 2 | $80.6_{1.3}$ | $74.2_{1.9}$ | $94.3_{1.2}$ |
| | | 2.5 | $4.9_{1.1}$ | $1.5_{0.1}$ | $138.0_{7.5}$ |
| | + SVT, $\theta = 0.9$ | 2 | $91.7_{2.1}$ | $88.6_{3.2}$ | $289.7_{19.4}$ |
| | | 2.5 | $74.1_{2.7}$ | $64.0_{4.1}$ | $356.7_{25.9}$ |
| | + SVT, $\theta = 1.5$ | 2 | $95.5_{1.0}$ | $93.1_{0.7}$ | $893.0_{20.2}$ |
| | | 2.5 | $79.3_{1.0}$ | $72.7_{1.4}$ | $1178.3_{10.1}$ |

Table 4: Results for generating JSON records from WikiMoviesJSON. We report mean and standard deviation over 3 runs of dataset generation. $\tau$ refers to the sampling temperature, and $m$ refers to the number of raw samples produced (before parsing and validation checks). The batch size used is 255. We present results at two different SVT thresholds $\theta$, and see gains in structure preservation and quantity.

6 Discussion

We believe that our significantly improved performance relative to Tang et al. [2024] is primarily attributable to two algorithmic innovations.

First, for each generated token, Tang et al. [2024] preserve the privacy of the entire distribution from which the token is selected (by taking the argmax), even though only the token itself is included in the synthetic data. By contrast, our method uses a discrete selection mechanism, the exponential mechanism. As a result, we do not need to maintain a DP version of the entire token distribution to release a single token. This decision leads to significantly lower noise requirements, as a straightforward calculation reveals. Empirically, we obtained good synthetic data quality with $s = 250$, $\tau = 2$, $c = 10$ and $\delta = 10^{-6}$. Switching to the Gaussian mechanism with its standard $(\varepsilon, \delta)$-DP guarantee would require $\sigma \approx 0.53$ to achieve a comparable privacy guarantee (see Appendix D). Better analyses of the Gaussian mechanism exist, but do not offer much help. Using the improved analysis in Balle and Wang [2018] to attain the same $\varepsilon$ would require $\sigma \approx 0.34$. Conducting the analysis so that both mechanisms have equivalent privacy loss under zCDP yields $\sigma = 0.2$. These are all very large noise magnitudes relative to probabilities in $[0, 1]$.⁵

⁵ To put independent noise of magnitude $\sigma = 0.2$ into perspective: suppose the ground truth next-token prediction is deterministic, i.e., $\bar{\mathbf{p}} = [1, 0, \ldots, 0] \in \mathbb{R}^{v}$, with $v = 256128$ in the case of Gemma. Now with probability $\geq 0.15$, the noised distribution $\widetilde{\mathbf{p}}$ has $\widetilde{\mathbf{p}}_1 < 0.8$. Each other $\widetilde{\mathbf{p}}_i$ is $\geq 0.8$ with probability $\geq 3 \cdot 10^{-5}$, independently. Hence the probability of one of these being promoted to argmax is $\geq 0.15 \cdot (1 - (1 - 3 \cdot 10^{-5})^{v-1}) \approx 0.15$. At this rate, the chance of generating a 30-token span without a corruption is $< 1\%$.
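The noise magnitudes quoted above can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions that are not stated in this section (the derivation is in Appendix D): that the per-token $\varepsilon$ of the exponential mechanism is $2c/(s\tau)$, and that the $L_2$ sensitivity of the averaged probability vector is $1/s$. They are used here because they recover the quoted $\sigma \approx 0.53$ and $\sigma = 0.2$. The Balle and Wang [2018] calibration is not reproduced.

```python
import math

# Values quoted in the text.
s, tau, c, delta = 250, 2.0, 10.0, 1e-6

# ASSUMPTION: per-token epsilon of the exponential mechanism is 2c / (s * tau).
eps_token = 2.0 * c / (s * tau)
# ASSUMPTION: L2 sensitivity of the averaged probability vector is 1 / s.
l2_sens = 1.0 / s

# Classical Gaussian mechanism calibration (valid for epsilon < 1):
# sigma = sensitivity * sqrt(2 ln(1.25 / delta)) / epsilon.
sigma_standard = l2_sens * math.sqrt(2.0 * math.log(1.25 / delta)) / eps_token

# zCDP matching: an eps-DP exponential mechanism satisfies (eps^2 / 8)-zCDP,
# and a Gaussian mechanism satisfies (sensitivity^2 / (2 sigma^2))-zCDP.
rho = eps_token ** 2 / 8.0
sigma_zcdp = l2_sens / math.sqrt(2.0 * rho)


def normal_tail(z: float) -> float:
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Footnote arithmetic: sigma = 0.2, true distribution [1, 0, ..., 0].
sigma, v = 0.2, 256128
p_top_drops = normal_tail((1.0 - 0.8) / sigma)  # noised top entry falls below 0.8
p_other_rises = normal_tail(0.8 / sigma)        # a noised zero entry reaches 0.8
p_corrupt = p_top_drops * (1.0 - (1.0 - p_other_rises) ** (v - 1))
p_clean_span = (1.0 - 0.15) ** 30               # 30 tokens at the 0.15 lower bound

print(f"sigma_standard ~ {sigma_standard:.2f}, sigma_zcdp = {sigma_zcdp:.2f}")
print(f"p_corrupt ~ {p_corrupt:.3f}, clean 30-token span ~ {p_clean_span:.4f}")
```

Under these assumptions the per-token $\varepsilon$ is 0.04, which is why even a well-calibrated Gaussian mechanism needs noise that is large relative to probabilities in $[0, 1]$.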

Second, Tang et al. [2024] generated each token using a different random sample of the sensitive prompts. This is computationally very expensive: it prevents the use of KV cache-accelerated decoding, since the cache is invalidated upon every resample. While resampling less often would be more practical, Tang et al. [2024] noted that in this case the privacy amplification benefits of subsampling would not be adequately realized, and characterized this limitation as the “main weakness” of their approach. Instead, our method generates each synthetic example using a fixed disjoint subset of the sensitive prompts, allowing us to leverage parallel composition in our analysis, and thus avoid this privacy versus computation tradeoff.
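The fixed disjoint subsets can be formed with a single shuffle and split. The following is a minimal sketch, assuming batches of exactly $s$ records and that leftover records are dropped. The function name and the handling of leftovers are our assumptions, not details taken from the method.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def disjoint_batches(records: Sequence[T], s: int, seed: int) -> List[List[T]]:
    """Shuffle once, then cut into non-overlapping batches of exactly s records."""
    order = list(records)
    random.Random(seed).shuffle(order)
    n_full = len(order) // s  # leftover records are dropped (assumption)
    return [order[i * s:(i + 1) * s] for i in range(n_full)]


# Toy usage: 1,000 sensitive records split into batches of s = 250.
batches = disjoint_batches(list(range(1000)), s=250, seed=0)
```

Each record appears in at most one batch, so changing one record can affect the output of only one batch, which is the property parallel composition relies on. Because a batch stays fixed while its tokens are generated, the KV cache for its prompts does not need to be rebuilt between tokens.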

7 Conclusion

As proprietary models become increasingly powerful, we anticipate more practitioners will be able to generate inferences from state-of-the-art models, while fewer practitioners will be able to train networks that perform like state-of-the-art models. This makes it increasingly important to develop private prediction methods that compete with private fine-tuning.

We demonstrate that private prediction can be used to generate large amounts of synthetic text with reasonable differential privacy guarantees. We produce 2-3 orders of magnitude more private synthetic data than what was demonstrated in prior work in this paradigm. Access to more synthetic data lets us fine-tune downstream models, and yields performance improvements via many-shot in-context learning. Furthermore, we introduce a novel use of public models in which we are able to sample predictable tokens at no privacy cost, which is particularly effective for structured data.

Limitations

While our work demonstrates that private prediction is a practical technique for privately generating a large volume of high-quality synthetic data, there remains a small gap between our results and the results obtained from privately fine-tuning the parameters of the LLM. Currently, private prediction methods pay a privacy cost for every generated token, while private fine-tuning methods do not. We view correcting this limitation as a very important open problem. Finally, any method for ensuring data privacy will inevitably entail some loss of data utility.

Author contributions

  • Alex B is the main contributor. He implemented the method, tested variants to optimize utility and privacy, and ran most of the experiments. He also proposed the use of sparse vector.

  • Umar proposed the method, the use of sampling to preserve privacy, and conducted the theoretical analysis.

  • Umar and Kareem framed the structure of the paper and led writing.

  • Kareem proposed parallel composition. He also assisted with the privacy analysis.

  • Natalia proposed logits recentering.

  • Weiwei and Alexey provided infrastructure support and code for running experiments. Alexey suggested the limited data experiments and ran the fine-tuning baselines.

  • Natalia, Andreas, and Sergei advised the project.

  • Everyone contributed to discussing, interpreting, and iterating on experiment results as well as project management.

References

  • Tang et al. [2024] Xinyu Tang, Richard Shin, Huseyin A Inan, Andre Manoel, Fatemehsadat Mireshghallah, Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, and Robert Sim. Privacy-preserving in-context learning with differentially private few-shot generation. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=oZtt0pRnOl.
  • Yue et al. [2023] Xiang Yue, Huseyin Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, and Robert Sim. Synthetic text generation with differential privacy: A simple and practical recipe. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1321–1342, Toronto, Canada, July 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.acl-long.74. URL https://aclanthology.org/2023.acl-long.74.
  • Abadi et al. [2016] Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pages 308–318, 2016.
  • Kurakin et al. [2024] Alexey Kurakin, Natalia Ponomareva, Umar Syed, Liam MacDermed, and Andreas Terzis. Harnessing large-language models to generate private synthetic text, 2024.
  • Hu et al. [2022] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=nZeVKeeFYf9.
  • Wu et al. [2024a] Shanshan Wu, Zheng Xu, Yanxiang Zhang, Yuanbo Zhang, and Daniel Ramage. Prompt public large language models to synthesize data for private on-device applications, 2024a.
  • Tran and Xiong [2024] Toan V. Tran and Li Xiong. Differentially private tabular data synthesis using large language models, 2024.
  • Dwork and Feldman [2018] Cynthia Dwork and Vitaly Feldman. Privacy-preserving prediction. In Sébastien Bubeck, Vianney Perchet, and Philippe Rigollet, editors, Proceedings of the 31st Conference On Learning Theory, volume 75 of Proceedings of Machine Learning Research, pages 1693–1702. PMLR, 06–09 Jul 2018. URL https://proceedings.mlr.press/v75/dwork18a.html.
  • Hong et al. [2024] Junyuan Hong, Jiachen T. Wang, Chenhui Zhang, Zhangheng Li, Bo Li, and Zhangyang Wang. DP-OPT: Make large language model your privacy-preserving prompt engineer. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=Ifz3IgsEPX.
  • Durfee and Rogers [2019] David Durfee and Ryan M Rogers. Practical differentially private top-k selection with pay-what-you-get composition. Advances in Neural Information Processing Systems, 32, 2019.
  • Majmudar et al. [2022] Jimit Majmudar, Christophe Dupuy, Charith Peris, Sami Smaili, Rahul Gupta, and Richard Zemel. Differentially private decoding in large language models. In NAACL 2022 Second Workshop on Trustworthy Natural Language Processing (TrustNLP), 2022. URL https://www.amazon.science/publications/differentially-private-decoding-in-large-language-models.
  • Duan et al. [2023] Haonan Duan, Adam Dziedzic, Nicolas Papernot, and Franziska Boenisch. Flocks of stochastic parrots: Differentially private prompt learning for large language models. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, editors, Advances in Neural Information Processing Systems, volume 36, pages 76852–76871. Curran Associates, Inc., 2023. URL https://proceedings.neurips.cc/paper_files/paper/2023/file/f26119b4ffe38c24d97e4c49d334b99e-Paper-Conference.pdf.
  • Ginart et al. [2022] Antonio Ginart, Laurens van der Maaten, James Zou, and Chuan Guo. Submix: Practical private prediction for large-scale language models. CoRR, abs/2201.00971, 2022. URL https://arxiv.org/abs/2201.00971.
  • Flemings et al. [2024] James Flemings, Meisam Razaviyayn, and Murali Annavaram. Differentially private next-token prediction of large language models, 2024.
  • Yu et al. [2024] Da Yu, Peter Kairouz, Sewoong Oh, and Zheng Xu. Privacy-preserving instructions for aligning large language models, 2024.
  • Xie et al. [2024] Chulin Xie, Zinan Lin, Arturs Backurs, Sivakanth Gopi, Da Yu, Huseyin A Inan, Harsha Nori, Haotian Jiang, Huishuai Zhang, Yin Tat Lee, Bo Li, and Sergey Yekhanin. Differentially private synthetic data via foundation model APIs 2: Text. In ICLR 2024 Workshop on Secure and Trustworthy Large Language Models, 2024. URL https://openreview.net/forum?id=jnF53uXmBS.
  • Wu et al. [2024b] Tong Wu, Ashwinee Panda, Jiachen T. Wang, and Prateek Mittal. Privacy-preserving in-context learning for large language models. In The Twelfth International Conference on Learning Representations, 2024b. URL https://openreview.net/forum?id=x4OPJ7lHVU.
  • Dwork et al. [2009] Cynthia Dwork, Moni Naor, Omer Reingold, Guy N Rothblum, and Salil Vadhan. On the complexity of differentially private data release: efficient algorithms and hardness results. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 381–390, 2009.
  • Dwork et al. [2006] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology-EUROCRYPT 2006: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, May 28-June 1, 2006. Proceedings 25, pages 486–503. Springer, 2006.
  • Canonne et al. [2020] Clément L Canonne, Gautam Kamath, and Thomas Steinke. The discrete Gaussian for differential privacy. Advances in Neural Information Processing Systems, 33:15676–15688, 2020.
  • Cesar and Rogers [2021] Mark Cesar and Ryan Rogers. Bounding, concentrating, and truncating: Unifying privacy loss composition for data analytics. In Vitaly Feldman, Katrina Ligett, and Sivan Sabato, editors, Proceedings of the 32nd International Conference on Algorithmic Learning Theory, volume 132 of Proceedings of Machine Learning Research, pages 421–457. PMLR, 16–19 Mar 2021. URL https://proceedings.mlr.press/v132/cesar21a.html.
  • Team [2024] Gemma Team. Gemma: Open models based on Gemini research and technology, 2024.
  • Zhang et al. [2015] Xiang Zhang, Junbo Zhao, and Yann LeCun. Character-level convolutional networks for text classification. In C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 28. Curran Associates, Inc., 2015. URL https://proceedings.neurips.cc/paper_files/paper/2015/file/250cf8b51c773f3f8dc8b4be867a9a02-Paper.pdf.
  • Voorhees and Tice [2000] Ellen M. Voorhees and Dawn M. Tice. Building a question answering test collection. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’00, page 200–207, New York, NY, USA, 2000. Association for Computing Machinery. ISBN 1581132263. doi: 10.1145/345508.345577. URL https://doi.org/10.1145/345508.345577.
  • Liu et al. [2012] Jingjing Liu, Scott Cyphers, Panupong Pasupat, Ian McGraw, and James R. Glass. A conversational movie search system based on conditional random fields. In INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pages 2454–2457. ISCA, 2012. doi: 10.21437/INTERSPEECH.2012-563. URL https://doi.org/10.21437/Interspeech.2012-563.
  • Maas et al. [2011] Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. Learning word vectors for sentiment analysis. In Dekang Lin, Yuji Matsumoto, and Rada Mihalcea, editors, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 142–150, Portland, Oregon, USA, June 2011. Association for Computational Linguistics. URL https://aclanthology.org/P11-1015.
  • Rust [2024] Peter Rust. wikipedia-movie-data. https://github.com/prust/wikipedia-movie-data, 2024.
  • Zhao et al. [2021] Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, and Sameer Singh. Calibrate before use: Improving few-shot performance of language models. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 12697–12706. PMLR, 18–24 Jul 2021. URL https://proceedings.mlr.press/v139/zhao21c.html.
  • Bertsch et al. [2024] Amanda Bertsch, Maor Ivgi, Uri Alon, Jonathan Berant, Matthew R Gormley, and Graham Neubig. In-context learning with long-context models: An in-depth exploration. arXiv preprint arXiv:2405.00200, 2024.
  • Ratner et al. [2023] Nir Ratner, Yoav Levine, Yonatan Belinkov, Ori Ram, Inbal Magar, Omri Abend, Ehud Karpas, Amnon Shashua, Kevin Leyton-Brown, and Yoav Shoham. Parallel context windows for large language models. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6383–6402, Toronto, Canada, July 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.acl-long.352. URL https://aclanthology.org/2023.acl-long.352.
  • Lu et al. [2022] Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, and Pontus Stenetorp. Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio, editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8086–8098, Dublin, Ireland, May 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.acl-long.556. URL https://aclanthology.org/2022.acl-long.556.
  • Balle and Wang [2018] Borja Balle and Yu-Xiang Wang. Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising. In International Conference on Machine Learning, pages 394–403. PMLR, 2018.
  • Bun and Steinke [2016] Mark Bun and Thomas Steinke. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In Theory of Cryptography Conference, pages 635–658. Springer, 2016.
  • Rogers and Steinke [2021] Ryan Rogers and Thomas Steinke. A better privacy analysis of the exponential mechanism. DifferentialPrivacy.org, 07 2021. https://differentialprivacy.org/exponential-mechanism-bounded-range/.
  • Turc et al. [2019] Iulia Turc, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Well-read students learn better: On the importance of pre-training compact models. arXiv preprint arXiv:1908.08962v2, 2019.

Appendix A Private prediction beats fine-tuning in the limited data regime

We do LoRA fine-tuning with DP-SGD on AGNews1K, with the same setup that beats our method in the full data regime. We sample synthetic data from the fine-tuned model. We also run our private prediction method on AGNews1K. We evaluate performance on 4 and 16 shot in-context learning with GPT-3 babbage-002 (the same experimental setting as §5.1).

| $\varepsilon$ | Method | Shots | Model | Acc. (%) |
|---|---|---|---|---|
| 1 | LoRA | 4 | LaMDA 8B | $63.3_{8.0}$ |
|   |      | 16 |  | $68.1_{5.9}$ |
|   | Ours | 4 | Gemma 2B 1.1 IT | $73.9_{8.3}$ |
|   |      | 16 |  | $80.1_{2.5}$ |
Table 5: Results on AGNews1K, a 1024-example subsample of AGNews. Our method is “pay-as-you-go”, and is capable of generating a few high-quality examples for in-context learning in this regime. On the other hand, fine-tuning does worse due to the stricter requirement that all future model outputs must be private. We stop at 16 shots because our method generates only 38 examples and because 16 examples fill up the context length for LoRA.

Appendix B Design choices

B.1 Logits clipping function

In Figure 3, we compare results for different logits clipping functions. The baseline approach is to clip all logits to the interval $[-c, c]$ before aggregation and softmax; we refer to this as “fixed interval clipping”. Alternatively, we can clip to the range $[\max_j\{\mathbf{z}_j\} - 2c, \max_j\{\mathbf{z}_j\}]$ and then translate to the interval $[-c, c]$ (Equation 1). In Figure 3 we plot the distortion caused by clipping in terms of L1 error, and find that the latter approach allows us to clip more than twice as aggressively, thus improving the privacy guarantee, without compromising utility.

(a) Distribution of L1 error induced by fixed interval clipping.
(b) Distribution of L1 error induced by clipping with recentering.
Figure 3: We sample a few hundred tokens using logits aggregation with no clipping. At each sampling step, we compute the L1 distances between the post-softmax distributions of aggregated clipped logits vs. aggregated unclipped logits, at various settings of $c$, and plot them in a histogram. We observe less error at lower choices of $c$ when clipping with recentering (note the $x$-axis scales).
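The two clipping rules compared above can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the logit vector is hypothetical, a single vector is used instead of an aggregate over a batch, and `clip_recentered` follows our reading of the prose description of Equation 1.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of floats.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def clip_fixed(z, c):
    # Fixed interval clipping: clamp every logit to [-c, c].
    return [min(max(v, -c), c) for v in z]

def clip_recentered(z, c):
    # Clamp to [max(z) - 2c, max(z)], then translate so values lie in [-c, c].
    m = max(z)
    return [max(v, m - 2 * c) - m + c for v in z]

def l1_error(z, clip_fn, c):
    # L1 distance between softmax of unclipped and clipped logits.
    p, q = softmax(z), softmax(clip_fn(z, c))
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical logit vector whose entries are not centered around zero.
z = [25.0, 22.0, 18.0, 5.0, -3.0]
c = 5.0
err_fixed = l1_error(z, clip_fixed, c)
err_recentered = l1_error(z, clip_recentered, c)
print(err_fixed, err_recentered)
```

On this toy input, fixed interval clipping flattens the top four logits to the same value, while recentering preserves their gaps, which is the effect the histograms are meant to show.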

Appendix C Proof of Theorem 1

Our proof of Theorem 1 is organized into sections. §C.1 provides basic definitions. §C.2 and §C.3 establish key results related to composition and sensitivity. §C.4 proves the privacy of simpler mechanisms that each account for a portion of the functionality of Algorithm 1. §C.5 puts all the pieces together and completes the proof.

C.1 Definitions

In §4 we defined neighboring prompt datasets. We extend the definition to arbitrary sets.

Definition 2.

Let $\mathcal{U}$ be a set. Let $S, S' \subseteq \mathcal{U}$. We say that $S$ and $S'$ are neighbors if there exists $u \in \mathcal{U}$ such that $S = S' \cup \{u\}$ or $S' = S \cup \{u\}$.

The sensitivity of a function is an upper bound on how much its value can change over neighbors.

Definition 3.

Let $\mathcal{U}$ be a set. Let $k \geq 1$. Let $f : 2^{\mathcal{U}} \rightarrow \mathbb{R}^k$. The sensitivity of $f$ is

$$\sup_{S, S'} \left\lVert f(S) - f(S') \right\rVert_\infty$$

where the supremum is over neighbors $S, S' \subseteq \mathcal{U}$.

Zero-concentrated differential privacy (zCDP) is a relaxation of $\varepsilon$-differential privacy.

Definition 4 (Bun and Steinke [2016]).

A mechanism $M$ satisfies $\rho$-zCDP if

$$D_\alpha(M(D) \parallel M(D')) \leq \rho\alpha$$

for all $\alpha > 1$ and neighboring datasets $D, D' \in \mathcal{D}$, where $D_\alpha(P \parallel Q)$ is the Rényi divergence of order $\alpha$ between distributions $P$ and $Q$.

C.2 Composition

Zero-concentrated differential privacy obeys a simple sequential composition rule.

Lemma 1.

If mechanisms $M_1$ and $M_2$ satisfy $\rho_1$-zCDP and $\rho_2$-zCDP, respectively, then the sequential composition of $M_1$ and $M_2$ satisfies $(\rho_1 + \rho_2)$-zCDP.

Parallel composition is a well-known technique in differential privacy that is useful for establishing privacy guarantees in scenarios where a mechanism is independently applied to disjoint subsets of a dataset. Many versions of parallel composition require that the subsets are chosen in a fully data-independent manner. We show that the same result holds under a weaker assumption.

Lemma 2.

Let $k$ be a positive integer. Let $f$ be a function that maps prompts into $[k]$. For any dataset of prompts $D$ and $i \in [k]$ let

$$D_i = \{\mathbf{p} \in D : f(\mathbf{p}) = i\}.$$

Let $M$ be a mechanism that satisfies $\rho$-zCDP. If $\widehat{M}$ is the mechanism defined by

$$\widehat{M}(D) = (M(D_1), \ldots, M(D_k))$$

then $\widehat{M}$ satisfies $\rho$-zCDP.

Proof.

Let $D, D' \in \mathcal{D}$ be neighboring datasets. Without loss of generality assume $D = D' \cup \{\mathbf{p}\}$, where $\mathbf{p}$ is a prompt. There exists $j \in [k]$ such that $D_i = D'_i$ for all $i \neq j$ and $D_j = D'_j \cup \{\mathbf{p}\}$. We have for all $\alpha > 1$

$$\begin{aligned}
D_\alpha(\widehat{M}(D) \parallel \widehat{M}(D')) &= \sum_{i=1}^{k} D_\alpha(M(D_i) \parallel M(D'_i)) \\
&= D_\alpha(M(D_j) \parallel M(D'_j)) \\
&\leq \rho\alpha.
\end{aligned}$$

The first equality holds because the $k$ components of $\widehat{M}$ use independent randomness, so the Rényi divergence is additive, and the second because $D_i = D'_i$ for all $i \neq j$. ∎
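To make the hypothesis of Lemma 2 concrete, the following sketch assigns each prompt to a batch using a function of the prompt alone, and checks that adding one prompt changes exactly one batch. The hash-based `batch_index` is a hypothetical choice of $f$ for illustration (it maps into $\{0, \ldots, k-1\}$ rather than $[k]$); the paper does not specify this function here.

```python
import hashlib

def batch_index(prompt: str, k: int) -> int:
    # Hypothetical assignment f: depends only on the prompt itself,
    # never on which other prompts are in the dataset.
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % k

def partition(dataset, k):
    # Build D_0, ..., D_{k-1} with D_i = {p in D : f(p) = i}.
    batches = [set() for _ in range(k)]
    for prompt in dataset:
        batches[batch_index(prompt, k)].add(prompt)
    return batches

k = 4
d_prime = {f"prompt {i}" for i in range(20)}
d = d_prime | {"one extra prompt"}   # neighboring dataset

# Indices of batches that differ between the two partitions.
changed = [
    i for i, (a, b) in enumerate(zip(partition(d, k), partition(d_prime, k)))
    if a != b
]
print(changed)
```

Because every other batch is identical across the two datasets, only one invocation of $M$ sees a different input, which is what the proof exploits.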

C.3 Sensitivity analysis

In this section we compute the sensitivity of several functions used in Algorithm 1. Each function depends on a set of logit vectors. Recall that a logit vector is an element of $\mathbb{R}^v$. Let

$$\ell(Z) = \frac{1}{s} \sum_{\mathbf{z} \in Z} \operatorname{clip}_c(\mathbf{z})$$

where $\operatorname{clip}_c(\cdot)$ was defined in Eq. (1). Also recall the distance function defined in Eq. (2):

$$d(Z, \mathbf{z}) = \left\lVert \frac{1}{s} \sum_{\mathbf{z}' \in Z} p_{\mathbf{z}'} - p_{\mathbf{z}} \right\rVert_1$$

where $p_{\mathbf{z}} = \operatorname{softmax}(\mathbf{z})$.

Lemma 3.

The function $\ell$ has sensitivity $\frac{c}{s}$, and for all $\mathbf{z} \in \mathbb{R}^v$, the function $d(\cdot, \mathbf{z})$ has sensitivity $\frac{1}{s}$.

Proof.

Let $Z, Z' \subseteq \mathbb{R}^v$ be neighbors. Let $\tilde{\mathbf{z}} \in \mathbb{R}^v$ be the logit vector they do not have in common. We have

$$\left\lVert \ell(Z) - \ell(Z') \right\rVert_\infty = \frac{1}{s} \left\lVert \operatorname{clip}_c(\tilde{\mathbf{z}}) \right\rVert_\infty \leq \frac{c}{s}.$$

We also have

$$\begin{aligned}
\left| d(Z, \mathbf{z}) - d(Z', \mathbf{z}) \right|
&= \left| \left\lVert \frac{1}{s} \sum_{\mathbf{z}' \in Z} p_{\mathbf{z}'} - p_{\mathbf{z}} \right\rVert_1 - \left\lVert \frac{1}{s} \sum_{\mathbf{z}' \in Z'} p_{\mathbf{z}'} - p_{\mathbf{z}} \right\rVert_1 \right| \\
&\leq \left\lVert \frac{1}{s} \sum_{\mathbf{z}' \in Z} p_{\mathbf{z}'} - \frac{1}{s} \sum_{\mathbf{z}' \in Z'} p_{\mathbf{z}'} \right\rVert_1 \\
&= \left\lVert \frac{1}{s} p_{\tilde{\mathbf{z}}} \right\rVert_1 \\
&= \frac{1}{s}
\end{aligned}$$

where we used the reverse triangle inequality. ∎
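The bound on $d(\cdot, \mathbf{z})$ can be sanity-checked numerically. The sketch below draws random logit vectors (hypothetical data, small $s$ and $v$ chosen for speed) and records the largest observed change in $d$ when one logit vector is added; by Lemma 3 it should never exceed $\frac{1}{s}$. It is an empirical check, not a substitute for the proof.

```python
import math
import random

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def distance(Z, z_public, s):
    # d(Z, z) = || (1/s) * sum_{z' in Z} softmax(z') - softmax(z) ||_1,
    # where s is a fixed normalizer (not len(Z)).
    p_pub = softmax(z_public)
    probs = [softmax(z) for z in Z]
    mean = [sum(p[j] for p in probs) / s for j in range(len(p_pub))]
    return sum(abs(a - b) for a, b in zip(mean, p_pub))

rng = random.Random(0)
s, v = 8, 6   # hypothetical batch size and vocabulary size

def random_logits():
    return [rng.uniform(-10, 10) for _ in range(v)]

worst = 0.0
for _ in range(200):
    Z_prime = [random_logits() for _ in range(s - 1)]
    Z = Z_prime + [random_logits()]          # neighbor: one extra logit vector
    z_pub = random_logits()
    gap = abs(distance(Z, z_pub, s) - distance(Z_prime, z_pub, s))
    worst = max(worst, gap)

print(worst, 1 / s)
```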

C.4 Constituent mechanisms

In this section we prove privacy guarantees for several simpler mechanisms that we will later compose together to show that Algorithm 1 is private.

Both Algorithms 2 and 3 accept a sensitive prompt dataset and a token sequence as input. Algorithm 2 appends a private token to the token sequence, while Algorithm 3 appends zero or more public tokens to the token sequence. The operation of both algorithms is governed by the parameters of Algorithm 1 (e.g., temperature, noise level, etc.).

Algorithm 2 Private token generation
Input: Sensitive prompt dataset $D$, initial token sequence $\mathbf{x}_0$
Output: Token sequence $\mathbf{x} \in \mathcal{X}^*$
$\mathbf{x} \leftarrow \mathbf{x}_0$
$Z \leftarrow \{\operatorname{logits}(\mathbf{p}\mathbf{x}) : \mathbf{p} \in D\}$
$\bar{\mathbf{z}} \leftarrow \ell(Z)$
$x \sim \operatorname{softmax}(\bar{\mathbf{z}} / \tau)$
Append $x$ to $\mathbf{x}$
Return $\mathbf{x}$.
Lemma 4.

Let $A(D, \mathbf{x}_0)$ be Algorithm 2. For each $\mathbf{x}_0 \in \mathcal{X}^*$ the mechanism $M : D \mapsto A(D, \mathbf{x}_0)$ satisfies $\rho$-zCDP, where $\rho = \frac{1}{2}\left(\frac{c}{s\tau}\right)^2$.

Proof.

Consider a function $f : \mathcal{D} \rightarrow \mathbb{R}^v$ with sensitivity $\Delta$. By an analysis of the exponential mechanism due to Cesar and Rogers [2021] (see also Rogers and Steinke [2021]), choosing a token according to the distribution $\operatorname{softmax}\left(\frac{\varepsilon}{2\Delta} f(D)\right)$ satisfies $\frac{1}{8}\varepsilon^2$-zCDP. Observe that mechanism $M$ is the exponential mechanism with $f = \frac{1}{\tau}\ell$, which by Lemma 3 has sensitivity $\Delta = \frac{c}{s\tau}$. Since $M$ samples from $\operatorname{softmax}(f(D))$, we have $\frac{\varepsilon}{2\Delta} = 1$, so $\varepsilon = \frac{2c}{s\tau}$ and $\rho = \frac{1}{8}\varepsilon^2 = \frac{1}{2}\left(\frac{c}{s\tau}\right)^2$. ∎
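A minimal sketch of the private token step and its per-token zCDP cost follows. It is illustrative only: the LLM call $\operatorname{logits}(\mathbf{p}\mathbf{x})$ is replaced by pre-computed random logit vectors, `clip_recentered` is our reading of the clipping in Eq. (1), and the function names are hypothetical.

```python
import math
import random

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def clip_recentered(z, c):
    # Assumed form of clip_c: clamp to [max(z) - 2c, max(z)], shift into [-c, c].
    m = max(z)
    return [max(v, m - 2 * c) - m + c for v in z]

def private_token(logit_vectors, s, c, tau, rng):
    # Sample one token index from softmax(l(Z) / tau),
    # where l(Z) is the clipped logits summed and divided by s.
    v = len(logit_vectors[0])
    clipped = [clip_recentered(z, c) for z in logit_vectors]
    z_bar = [sum(z[j] for z in clipped) / s for j in range(v)]
    probs = softmax([x / tau for x in z_bar])
    return rng.choices(range(v), weights=probs, k=1)[0]

def rho_private(s, c, tau):
    # Per-token cost from Lemma 4: rho = (1/2) * (c / (s * tau))^2.
    return 0.5 * (c / (s * tau)) ** 2

rng = random.Random(0)
# Stand-in for {logits(p x) : p in D}: 250 random vectors over a 5-token vocabulary.
Z = [[rng.gauss(0, 3) for _ in range(5)] for _ in range(250)]
token = private_token(Z, s=250, c=10, tau=2, rng=rng)
rho = rho_private(s=250, c=10, tau=2)
print(token, rho)
```

The sampling itself is ordinary temperature sampling; the privacy guarantee comes entirely from clipping and averaging, with no explicit noise added.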

Algorithm 3 Public token generation
Input: Sensitive prompt dataset $D$, initial token sequence $\mathbf{x}_0$
Output: Token sequence $\mathbf{x} \in \mathcal{X}^*$
$\mathbf{x} \leftarrow \mathbf{x}_0$
$\hat{\theta} \leftarrow \theta + \textrm{Laplace}(\sigma)$
while True do
     $Z \leftarrow \{\operatorname{logits}(\mathbf{p}\mathbf{x}) : \mathbf{p} \in D\}$
     $\mathbf{z}_{\operatorname{public}} \leftarrow \operatorname{logits}(\mathbf{p}_{\operatorname{public}}\mathbf{x})$
     $\hat{d} \leftarrow d(Z, \mathbf{z}_{\operatorname{public}}) + \textrm{Laplace}(2\sigma)$
     if $\hat{d} > \hat{\theta}$ then
         Break
     else
         $x \sim \operatorname{softmax}(\mathbf{z}_{\operatorname{public}} / \tau_{\operatorname{public}})$
         Append $x$ to $\mathbf{x}$
Return $\mathbf{x}$.
Lemma 5.

Let $A(D, \mathbf{x}_0)$ be Algorithm 3. For each $\mathbf{x}_0 \in \mathcal{X}^*$ the mechanism $M : D \mapsto A(D, \mathbf{x}_0)$ satisfies $\rho$-zCDP, where $\rho = \frac{2}{(s\sigma)^2}$.

Proof.

Observe that mechanism $M$ is an instance of the AboveThreshold mechanism [Dwork et al., 2009], which accepts a private dataset, a threshold, and a sequence of queries as input. In each iteration, the AboveThreshold mechanism applies the next query in the sequence to the dataset and compares it to a noisy threshold, and returns the index of the first query that exceeds the threshold. The queries can be chosen adaptively and adversarially. In mechanism $M$, each query is specified by a token sequence $\mathbf{x}$, and the index of the first query that exceeds the threshold is determined by the length of the returned token sequence. Furthermore, by Lemma 3 each query has sensitivity $\frac{1}{s}$. Thus by the analysis due to Dwork et al. [2009], mechanism $M$ satisfies $\frac{2}{s\sigma}$-differential privacy, which by Bun and Steinke [2016] implies that mechanism $M$ satisfies $\frac{2}{(s\sigma)^2}$-zCDP (since $\varepsilon$-differential privacy implies $\frac{1}{2}\varepsilon^2$-zCDP). ∎
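The noisy-threshold test at the heart of Algorithm 3 can be sketched as follows. This is a simplification under assumptions: the adaptive queries $d(Z, \mathbf{z}_{\operatorname{public}})$ are replaced by a fixed list of hypothetical distance values, the loop is bounded by the list length, and the helper names are ours.

```python
import random

def laplace(scale, rng):
    # Laplace(0, scale) sample as the difference of two exponentials.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def count_public_tokens(distances, theta, sigma, rng):
    # Number of leading queries whose noisy value stays at or below
    # the noisy threshold; each such query yields one public token.
    # `distances` stands in for d(Z, z_public) at successive steps.
    noisy_theta = theta + laplace(sigma, rng)        # threshold noise, drawn once
    n = 0
    for d in distances:
        if d + laplace(2 * sigma, rng) > noisy_theta:  # fresh query noise each step
            break
        n += 1
    return n

def rho_public(s, sigma):
    eps = 2.0 / (s * sigma)   # pure-DP guarantee stated in the proof of Lemma 5
    return 0.5 * eps ** 2     # eps-DP implies (eps^2 / 2)-zCDP

rng = random.Random(0)
n_noisy = count_public_tokens([0.1, 0.2, 0.9, 0.1], theta=0.5, sigma=0.1, rng=rng)
n_exact = count_public_tokens([0.1, 0.2, 0.9, 0.1], theta=0.5, sigma=1e-9, rng=rng)
print(n_noisy, n_exact, rho_public(250, 0.2))
```

Note that the cost `rho_public` is paid once per run of the loop, regardless of how many public tokens are emitted before the threshold is crossed.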

C.5 Putting it all together

Consider a sequence of iterations of the inner loop of Algorithm 1 in which the value of $t$ (the private token counter) is constant. Observe that the operation of Algorithm 1 during these iterations is equivalent to the sequential composition of Algorithms 2 and 3, since these iterations generate zero or more public tokens followed by a private token. (The special treatment of the <eos> token complicates this story a little, but we can always assume that the LLM ignores any tokens before the last <eos> token.) Moreover, there are at most $r$ such sequences of iterations, since $r$ is an upper bound on the private token counter for any batch. By Lemmas 1, 4 and 5 we have that Algorithm 1 applied to a single batch satisfies $\rho$-zCDP (where $\rho$ is specified in the statement of Theorem 1). And therefore by Assumption 1 and Lemma 2 we have that Algorithm 1 applied to the whole dataset satisfies $\rho$-zCDP. It remains to convert this zCDP guarantee to an $(\varepsilon, \delta)$-differential privacy guarantee, which we do in two different ways using two existing results: Corollary 13 due to Canonne et al. [2020] and Lemma 3.5 due to Bun and Steinke [2016].
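The accounting described above can be sketched numerically. Two assumptions are made loudly here: the total $\rho$ is taken to be $r$ times the sum of the Lemma 4 and Lemma 5 costs, which is our reading of the argument (Theorem 1 is not restated in this appendix), and the conversion used is the widely known bound $\varepsilon = \rho + 2\sqrt{\rho \log(1/\delta)}$ from Bun and Steinke [2016]. The tighter conversion of Canonne et al. [2020] is not reproduced, and $r = 100$ is a hypothetical value.

```python
import math

def rho_total(r, s, c, tau, sigma=None):
    # Assumed composition: r rounds, each paying for one private token
    # (Lemma 4) and, if the sparse vector technique is used, one run of
    # Algorithm 3 (Lemma 5).
    rho = 0.5 * (c / (s * tau)) ** 2
    if sigma is not None:
        rho += 2.0 / (s * sigma) ** 2
    return r * rho

def zcdp_to_eps(rho, delta):
    # rho-zCDP implies (rho + 2 * sqrt(rho * log(1/delta)), delta)-DP.
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))

r = 100  # hypothetical cap on private tokens per batch
rho_no_svt = rho_total(r, s=250, c=10, tau=2)
rho_svt = rho_total(r, s=250, c=10, tau=2, sigma=0.2)
eps_no_svt = zcdp_to_eps(rho_no_svt, delta=1e-6)
print(rho_no_svt, rho_svt, eps_no_svt)
```

Because the cost is linear in $r$ and quadratic in $1/s$, doubling the batch size allows roughly four times as many private tokens at the same $\rho$.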

Appendix D Privacy-equivalent Gaussian noise

Given the average token distribution $\bar{\mathbf{p}}$ in a batch, Tang et al. [2024] protect the privacy of $\bar{\mathbf{p}}$ by using the Gaussian mechanism, which achieves $(\varepsilon, \delta)$-differential privacy with $\varepsilon = \frac{\sqrt{2\log(1.25/\delta)}}{s\sigma}$, where $s$ is the batch size and $\sigma$ is the standard deviation of the noise added to each probability in $\bar{\mathbf{p}}$. On the other hand, we use the exponential mechanism to protect the privacy of a sample drawn from $\bar{\mathbf{p}}$, which achieves $\varepsilon$-differential privacy with $\varepsilon = \frac{2c}{s\tau}$, where $c$ is the maximum absolute value of any log-probability in the batch and $\tau$ is the sampling temperature.

Empirically, we obtained good synthetic data quality with $s = 250$, $\tau = 2$, $c = 10$ and $\delta = 10^{-6}$.

Setting the $\varepsilon$ values equal to each other yields $\sigma = \frac{\tau\sqrt{\log(1.25/\delta)}}{\sqrt{2}c}$, which is the noise level needed for the two mechanisms to have comparable privacy guarantees (setting aside that $\delta > 0$, an omission that only favors the Gaussian mechanism). Plugging in the above parameters yields $\sigma \approx 0.53$.

The analysis in Theorem 8 of Balle and Wang [2018] does not admit a closed-form solution. Instead, we binary search for a solution to:

$$\Phi\left(\frac{\Delta}{2\sigma} - \frac{\varepsilon\sigma}{\Delta}\right) - \exp(\varepsilon)\,\Phi\left(-\frac{\Delta}{2\sigma} - \frac{\varepsilon\sigma}{\Delta}\right) \leq \delta$$

where $\Phi$ is the Gaussian cdf, $\varepsilon = \frac{2c}{s\tau}$, $\delta = 10^{-6}$, and $\Delta$ is the L2 sensitivity of a vector computed as the average of $s$ user-provided probability vectors, namely $\Delta = 1/s$. This procedure yields $\sigma \approx 0.34$.
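The binary search described above can be sketched with the standard library alone. The search bounds, iteration count, and function names are our choices, and the sketch assumes the left-hand side of the inequality is decreasing in $\sigma$ over the search interval.

```python
import math

def gaussian_cdf(x):
    # Standard normal cdf via the complementary error function.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def delta_for_sigma(sigma, eps, sens):
    # Left-hand side of the inequality above, with sens playing the role of Delta.
    a = sens / (2.0 * sigma)
    b = eps * sigma / sens
    return gaussian_cdf(a - b) - math.exp(eps) * gaussian_cdf(-a - b)

def smallest_sigma(eps, delta, sens, lo=1e-6, hi=10.0, iters=100):
    # Binary search for the smallest sigma satisfying the inequality.
    # Assumes the condition fails at lo and holds at hi.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if delta_for_sigma(mid, eps, sens) <= delta:
            hi = mid
        else:
            lo = mid
    return hi

s, c, tau, delta = 250, 10.0, 2.0, 1e-6
eps = 2.0 * c / (s * tau)
sigma = smallest_sigma(eps, delta, sens=1.0 / s)
print(round(sigma, 2))
```

With the parameters from the text, the result should land near the reported $\sigma \approx 0.34$.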

Finally, equating the zCDP loss for the exponential mechanism, given by $\frac{\varepsilon^2}{8} = \frac{c^2}{2s^2\tau^2}$ (Cesar and Rogers [2021]), to that of the Gaussian mechanism, given by $\frac{1}{2s^2\sigma^2}$ (Bun and Steinke [2016]), yields $\sigma = \frac{\tau}{c} = 0.2$.
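The two closed-form noise levels in this appendix can be reproduced directly from the stated parameters. The variable names are ours; the formulas are the ones given in the text, with the zCDP case solved for $\sigma$ by cancelling the common factor $\frac{1}{2s^2}$.

```python
import math

s, c, tau, delta = 250, 10.0, 2.0, 1e-6

# Classical Gaussian mechanism: equate sqrt(2 log(1.25/delta)) / (s sigma)
# with 2c / (s tau) and solve for sigma.
sigma_classical = tau * math.sqrt(math.log(1.25 / delta)) / (math.sqrt(2.0) * c)

# zCDP: equate c^2 / (2 s^2 tau^2) with 1 / (2 s^2 sigma^2), giving sigma = tau / c.
sigma_zcdp = tau / c

print(round(sigma_classical, 2), sigma_zcdp)
```

Neither expression depends on $s$, since the batch size scales both mechanisms' sensitivities in the same way.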

Appendix E Experiment details

E.1 Hyperparameter tuning

There is a significant number of hyperparameters associated with our approach. See Table 6 for a list of the main ones and the values they take. In this section we describe the hyperparameter evaluation procedure, as well as the rationale for which hyperparameter settings we couple together and which we avoid running altogether.

Hyperparameter evaluation procedure.

For fine-tuning experiments, we set aside a real validation set consisting of 10% of the real training set. We choose dataset generation parameters based on which resulting dataset induces the best classifier on this real validation set. However, the process of tuning the classifier itself on synthetic data (choosing the best learning rate and checkpoint) does not use real data; we do that tuning with synthetic data. This is because the output of our method is a dataset, and its usefulness for training a model includes how well subsets of it can be used for downstream task hyperparameter selection. After identifying the best synthetic dataset in this manner, we run the tuning process based on synthetic data only and report the accuracy of the resultant classifier on the real test set.

Hyperparameter choices.

Based on initial experiments, we found that setting $c=10$ and $\tau=2$ produced well-formed text, so we fix $c=10$ and additionally try a low-temperature ($\tau=1.5$) and a high-temperature ($\tau=2.25$) setting. At $\tau=2.25$, we observed text degeneration. This is due to the combination of Gemma’s large vocabulary (256K) and clipping, which raises the “probability floor” of nonsense tokens. So for $\tau=2.25$ settings only, we follow Tang et al. [2024] and reduce the vocabulary to the public prediction’s top 1024 tokens. We emphasize that (1) we do not do this for any of the other settings of $\tau$, and (2) we use a larger value than prior work (they use the top 100).
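To illustrate the “probability floor” effect, the sketch below computes the probability mass that falls on floor tokens in a simplified worst case. It assumes clipped logits lie in $[-c, c]$, that a single token sits at $+c$, and that every other token in the vocabulary sits at $-c$; the function name and this worst-case setup are ours for illustration and are not a measurement from our experiments.

```python
import math


def tail_mass(vocab_size: int, c: float, tau: float) -> float:
    """Softmax mass on the tokens at the clip floor, in the assumed worst case.

    One token has clipped logit +c; the other vocab_size - 1 tokens have -c.
    Sampling uses softmax(logits / tau).
    """
    # Weight of all floor tokens relative to the single top token.
    floor_weight = (vocab_size - 1) * math.exp(-2.0 * c / tau)
    return floor_weight / (1.0 + floor_weight)


if __name__ == "__main__":
    for vocab_size in (256_000, 1024):
        for tau in (1.5, 2.0, 2.25):
            mass = tail_mass(vocab_size, c=10.0, tau=tau)
            print(f"V={vocab_size:>7} tau={tau:<4} floor mass={mass:.3f}")
```

Under these assumptions the floor mass grows with both the temperature and the vocabulary size, which is consistent with restricting the vocabulary only at the highest temperature.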

Keeping other parameters fixed, increasing the batch size $s$ decreases $\varepsilon$. At the same time, it raises the amount of compute spent to decode a single example.[8] Hence our approach for selecting the batch size is based on the following: given a target $\varepsilon$ and dataset, choose $s$ large enough so that we can hit at least 1K examples at the low-temperature setting $\tau=1.5$. When targeting a large $\varepsilon$, choosing a large $s$ results in too many tokens to decode at too high a cost per token.

[8] The way we interpret this is that $s$ is a compute multiplier that broadens the search space to include better utility configurations in the low-$\varepsilon$ regime. This is analogous to the role of the noise multiplier $\sigma$ in DP-SGD, where the best results at low $\varepsilon$ come from taking more steps at higher noise levels.
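The trade-off can be made concrete with a rough calculation. The sketch below assumes the per-token zCDP cost $\frac{c^{2}}{2s^{2}\tau^{2}}$ quoted earlier for the exponential mechanism and plain additive zCDP composition across tokens; it ignores the sparse vector technique and the conversion from zCDP to $(\varepsilon,\delta)$, and `rho_budget` is a hypothetical total budget, so the numbers are indicative only.

```python
def per_token_rho(s: int, c: float, tau: float) -> float:
    """Assumed zCDP cost of one privately sampled token: c^2 / (2 s^2 tau^2)."""
    return c**2 / (2.0 * s**2 * tau**2)


def max_private_tokens(rho_budget: float, s: int, c: float, tau: float) -> int:
    """Tokens affordable under additive zCDP composition (a simplification)."""
    return int(rho_budget / per_token_rho(s, c, tau))


if __name__ == "__main__":
    # rho_budget = 1.0 is an arbitrary illustrative value, not one used in the paper.
    for s in (127, 255, 511, 1023, 1535, 2047):
        tokens = max_private_tokens(rho_budget=1.0, s=s, c=10.0, tau=1.5)
        print(f"s={s:>4}: per-token rho={per_token_rho(s, 10.0, 1.5):.2e}, tokens={tokens}")
```

Because the per-token cost scales as $1/s^{2}$, doubling $s$ roughly quadruples the number of tokens that fit in a fixed budget, while each token requires about twice as many forward passes.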

For the sparse vector hyperparameters, we consider the following paired $(\theta,\sigma)$ settings: $\{(-\infty,-),\ (0.3,0.1),\ (0.5,0.2),\ (0.7,0.3)\}$. The first setting corresponds to no use of the SVT; the next three represent different privacy levels per token: moving to the right uses noisier queries (less privacy budget) and uses the free public tokens more often. For large datasets and target $\varepsilon$, we do not run the high-privacy settings (they require too much compute to finish), and for smaller datasets and smaller $\varepsilon$ we omit the settings that do not produce at least 1K examples.

| $\alpha$ | Description | Values |
|---|---|---|
| $s$ | batch size | 127, 255, 511, 1023, 1535, 2047 |
| $c$ | logits clip bound | 10 |
| $\tau$ | temperature | 1.5, 2, 2.25 |
| $\theta$ | SVT threshold | $-\infty$, 0.3, 0.5, 0.7 |
| $\sigma$ | SVT noise level | --, 0.1, 0.2, 0.3 |
| $\tau_{\text{public}}$ | public temperature | 1.5 |

Table 6: Values for hyperparameters explored in this work.
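As a compact restatement of the couplings described in this section, the sketch below enumerates the full grid before any pruning. It assumes the paired $(\theta,\sigma)$ settings listed in the text and the top-1024 vocabulary restriction at $\tau=2.25$ only; the dictionary keys and the function name are our own, and in practice not every combination is run.

```python
from itertools import product

BATCH_SIZES = [127, 255, 511, 1023, 1535, 2047]
CLIP = 10
TEMPERATURES = [1.5, 2.0, 2.25]
# (theta, sigma) pairs; the first entry disables the SVT.
SVT_SETTINGS = [(float("-inf"), None), (0.3, 0.1), (0.5, 0.2), (0.7, 0.3)]
PUBLIC_TEMPERATURE = 1.5


def build_grid() -> list:
    """Enumerate every (s, tau, SVT) combination, with coupled settings applied."""
    grid = []
    for s, tau, (theta, sigma) in product(BATCH_SIZES, TEMPERATURES, SVT_SETTINGS):
        grid.append({
            "batch_size": s,
            "clip": CLIP,
            "temperature": tau,
            "svt_threshold": theta,
            "svt_noise": sigma,
            "public_temperature": PUBLIC_TEMPERATURE,
            # Vocabulary is restricted only at the highest temperature.
            "top_k": 1024 if tau == 2.25 else None,
        })
    return grid


if __name__ == "__main__":
    print(len(build_grid()), "candidate configurations before pruning")
```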

E.2 Prompts used

We report the prompts used for our experiments. Generally, we use the same prompt for private and public predictions, with "<text of xxx>" in the public prompt replaced with an actual private example in the private prompt. The exception is for WikiMoviesJSON (Figures 11 and 12), where the public prompt contains a schema description in place of the example.

    # [User]
    Here are texts with News Type: Business.

    Text: <text of News Type: Business>

    Please give me another one.

    # [Assistant]
    Text:
Figure 4: Generation prompt for AGNews.
    # [User]
    Here are questions with Answer Type: Entity.

    ‘‘‘
    Text: <question of Answer Type: Entity>
    ‘‘‘

    Please give me another one.

    # [Assistant]
    ‘‘‘
    Question:
Figure 5: Generation prompt for TREC.
    # [User]
    Here are entries of Category: School.

    Entry: <entry of Category: School>

    Please give me another one.

    # [Assistant]
    Entry:
Figure 6: Generation prompt for DBPedia.
    # [User]
    Give me text about a film and the extracted Phrase about its Director.

    Phrase: "josh trank"
    Text: "<text containing phrase "josh trank">"

    Please give me another Phrase and Text: "josh trank". IMPORTANT: The exact Director phrase "josh trank" must be mentioned in Text.

    # [Assistant]
    Phrase: "josh trank"
    Text: ""
Figure 7: Generation prompt for MIT-D.
    # [User]
    Give me text about a film and the extracted Phrase about its Genre.

    Phrase "comedy"
    Text: "<text containing phrase "comedy">"

    Please give me another Phrase and Text. IMPORTANT: The exact Genre phrase "comedy" must be mentioned in Text.

    # [Assistant]
    Phrase: "comedy"
    Text: ""
Figure 8: Generation prompt for MIT-G.
    # [User]
    Here are texts with Sentiment: Negative.

    Text: <text of Sentiment: Negative>

    Please give me another one.

    # [Assistant]
    Text:
Figure 9: Generation prompt for IMDB.
    # [User]
    Here are texts with Sentiment: Negative.

    Text: <text of Sentiment: Negative>

    Please give me another one.

    # [Assistant]
    Text:
Figure 10: Generation prompt for Yelp.
    # [User]
    Here is a JSON record:
    ‘‘‘
    {
      "title": "$50,000 Reward",
      "year": 1924,
      "cast": [
        "Ken Maynard",
        "Esther Ralston"
      ],
      "genres": [
        "Western",
        "Silent"
      ],
      "href": "$50,000_Reward",
      "extract": "$50,000 Reward is a 1924 American silent Western film directed by Clifford S. Elfelt and starring Ken Maynard, Esther Ralston and Bert Lindley."
    }
    ‘‘‘
    Please give me another JSON record that complies with the above schema.

    # [Assistant]
    ‘‘‘
    {
Figure 11: Private generation prompt for WikiMoviesJSON.
    # [User]
    Here is the schema for a JSON record:
    Schema:
    ‘‘‘
    {
      "title": "str",
      "year": int,
      "cast": [ # list of str
        "str1", # 0 or more total entries
      ],
      "genres": [ # list of str
        "str1", # 0 or more total entries
      ]
      "href": "str", # URL slug, e.g.: Link_to_Page
      "extract": "str"
    }
    ‘‘‘
    Please give me another JSON record that complies with the above schema.

    # [Assistant]
    ‘‘‘
    {
Figure 12: Public generation prompt for WikiMoviesJSON.
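The relationship between public and private prompts for the non-JSON datasets can be sketched as a simple template substitution. This is a minimal illustration using the AGNews template from Figure 4; the helper name `make_private_prompt` and the example text are hypothetical, and the actual prompt construction code may differ.

```python
PLACEHOLDER = "<text of News Type: Business>"

# The public prompt is the template itself, with the placeholder left in place.
AGNEWS_PUBLIC_PROMPT = (
    "# [User]\n"
    "Here are texts with News Type: Business.\n"
    "\n"
    f"Text: {PLACEHOLDER}\n"
    "\n"
    "Please give me another one.\n"
    "\n"
    "# [Assistant]\n"
    "Text:"
)


def make_private_prompt(template: str, placeholder: str, private_example: str) -> str:
    """Substitute one sensitive example for the placeholder in the public prompt."""
    if placeholder not in template:
        raise ValueError("placeholder not found in template")
    return template.replace(placeholder, private_example)


if __name__ == "__main__":
    # Made-up example text, standing in for one record of the private dataset.
    example = "Shares of a hypothetical retailer rose after its earnings report."
    print(make_private_prompt(AGNEWS_PUBLIC_PROMPT, PLACEHOLDER, example))
```

Keeping the two prompts identical apart from the substituted example means the public and private next-token distributions differ only through the sensitive text.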

Appendix F Artifacts

Tables 1(a) and 1(b) list all artifacts we use in this work. AGNews, TREC, DBPedia, MIT-G, MIT-D, IMDB, and Yelp are all standard academic datasets permissible for research use; we cite their original publications when introduced. WikiMoviesJSON is scraped from Wikipedia data, courtesy of [Rust, 2024]; their work is covered by an MIT license. Wikipedia content is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA) and the GNU Free Documentation License (GFDL).

We use the open-source models BERT-Base, released by [Turc et al., 2019], and Gemma. Our use of Gemma for academic purposes is in accordance with the Gemma terms of use: https://ai.google.dev/gemma/terms. GPT-3 is accessible for academic purposes under OpenAI’s terms of use, which support educational and research activities. LaMDA 8B is not publicly available, but we received sufficient authorization to use it for the academic purposes of this paper.

Appendix G Compute budget

All experiments for synthetic data generation run on Gemma 2B 1.1 IT. A run of synthetic data generation takes between 8 and 48 accelerator hours. Including exploratory runs and hyperparameter search, the total compute budget for this project is roughly 14,000 accelerator hours.