License: arXiv.org perpetual non-exclusive license
arXiv:2401.07561v1 [quant-ph] 15 Jan 2024

The Quantum Esscher Transform

Yixian Qiu, Centre for Quantum Technologies, National University of Singapore
Kelvin Koor, Centre for Quantum Technologies, National University of Singapore
Patrick Rebentrost, Centre for Quantum Technologies, National University of Singapore; Department of Computer Science, School of Computing, National University of Singapore
Abstract

The Esscher Transform is a tool of broad utility in various domains of applied probability. It provides the solution to a constrained minimum relative entropy optimization problem. In this work, we study the generalization of the Esscher Transform to the quantum setting. We examine a relative entropy minimization problem for a quantum density operator, potentially of wide relevance in quantum information theory. The resulting solution form motivates us to define the quantum Esscher Transform, which subsumes the classical Esscher Transform as a special case. Envisioning potential applications of the quantum Esscher Transform, we also discuss its implementation on fault-tolerant quantum computers. Our algorithm is based on the modern techniques of block-encoding and quantum singular value transformation (QSVT). We show that given block-encoded inputs, our algorithm outputs a subnormalized block-encoding of the quantum Esscher Transform within accuracy $\epsilon$ in $\tilde{O}(\kappa d\log^{2}1/\epsilon)$ queries to the inputs, where $\kappa$ is the condition number of the input density operator and $d$ is the number of constraints.

1 Introduction

In probability and statistics, it is often important to find distributions that have low relative entropy with respect to a given fixed distribution. In addition, further constraints, the form and interpretation of which depend on the problem at hand, are frequently imposed on the target distribution.

An interesting example is the following: consider the process of inferring probability distributions from a set of measurement data. These data play the role of the constraints—they put restrictions on what the true distribution could be—and the available data may not suffice to uniquely determine a probability distribution. In this situation, a common approach is to invoke Jaynes’ maximum entropy principle (MaxEnt) [Jay57]. In essence, MaxEnt advocates that the selected distribution be the one that simultaneously maximizes entropy and satisfies the given constraints.

However, the situation becomes more nuanced if we already possess some knowledge of the system, say, a prior distribution. In such cases, a more refined strategy emerges: the minimum relative entropy principle. As expounded in [SJ80, OP07, ZTF13], this principle, regarded as a generalization of MaxEnt, operates by minimizing the distinguishability (characterized by the relative entropy) between the prior distribution and the distribution to be selected, while respecting the imposed constraints. This systematic approach to incorporating new data makes it fundamental in Bayesian statistics. The updating procedure results in the posterior distribution which reflects the most current understanding of the system in light of the observed data.

In the case when the measurement data is presented in the form of expectation values of selected random variables, the solution to the corresponding relative entropy minimization problem takes the form known as an Esscher Transform. Named after Swedish mathematician and economist Fredrik Esscher, who introduced the concept in 1932 in his work on risk theory [Esc32], the Esscher Transform, also known as ‘exponential tilting’ in statistics, and its various extensions have since then found many applications beyond minimizing relative entropy. Notable examples include option pricing (in mathematical finance) [GS+93], importance sampling (for rare-event simulation) [Sie76] and Lévy processes (in financial economics) [HS06]. More recently, it has also made inroads into machine learning [BSS23], in the context of empirical risk minimization.

In this paper, we discuss the extension of the above problem to the quantum setting. We consider the following optimization problem:

$$
\begin{aligned}
\text{minimize}_{\sigma\geq 0}\quad & S(\sigma\|\rho) \\
\text{s.t.}\quad & \operatorname{Tr}(\sigma H_{i})=m_{i},\quad i\in[d] \\
& \operatorname{Tr}(\sigma)=1,
\end{aligned}
\tag{1.1}
$$

where $\rho$ is the a priori state and $H_{i}$, $i\in[d]$ are observables. Refer to Problem 2.4 for the precise formulation. In the first part of this work, we show the formal solution to this constrained optimization problem. The solution methodology is modelled after its classical predecessor, albeit with added technical intricacies to manage. The form of the corresponding solution then motivates us to define the quantum Esscher Transform, see Definition 2.8. The proof of the solution to the optimization problem is found in Theorem 2.5. The quantum Esscher Transform can be viewed as a generalization of the (classical) Esscher Transform, and indeed subsumes the latter as a special case. In the second part of this work, with an eye toward potential applications, we discuss the implementation of the quantum Esscher Transform on fault-tolerant quantum computers. Our algorithm is based on the modern techniques of block-encoding and the quantum singular value transformation (QSVT) [GSLW19, MRTC21]. As an input model we consider purifications of the density operator $\rho$ and block-encodings of the operators $H_{i}$. The main algorithm is Algorithm 1, whose complexity is discussed in Theorem 4.3. The quantum Esscher Transform could find applications in quantum analogues of problems in statistics, machine learning, and finance.

1.1 Preliminaries and notation

We define the following notations. Let $\mathbb{N}=\{1,2,\dots\}$ be the set of positive natural numbers. For $d\in\mathbb{N}$, $[d]=\{1,2,\dots,d\}$. Here $\|\cdot\|$, $\|\cdot\|_{1}$, $\|\cdot\|_{2}$ and $\|\cdot\|_{T}$ refer to the spectral, $l_{1}$-, $l_{2}$- and trace norms respectively. The symbol $\odot$ denotes component-wise product, e.g. for vectors $(v\odot w)_{i}=v_{i}w_{i}$, for matrices $(A\odot B)_{ij}=A_{ij}B_{ij}$. Throughout this paper, $\log$ will be base $2$. For convenience, when calculus is involved we shall differentiate as if it were base $e$. For a matrix $M$ we write $a\leq M\leq b$ to mean the eigenvalues of $M$ are in $[a,b]$. Thus, $M\geq 0$ means $M$ is positive semidefinite.
We denote a Hilbert space by $\mathcal{H}$, $\mathcal{H}_{N}$ if its dimension $N$ is to be explicitly specified, the set of linear operators on $\mathcal{H}$ by $\mathcal{L}(\mathcal{H})$, and the set of density operators on $\mathcal{H}$ by $\mathcal{D}(\mathcal{H})$. Let $A\in\mathcal{L}(\mathcal{H})$. The kernel of $A$ is $\ker(A):=\{\ket{\psi}\in\mathcal{H}:A\ket{\psi}=0\}$ and the support of $A$ is $\operatorname{supp}(A):=\ker(A)^{\perp}$. Note that $\ker(A)\oplus\operatorname{supp}(A)=\mathcal{H}$. $I_{n}$ denotes the $n$-qubit identity operator, i.e. it is of size $2^{n}\times 2^{n}$. We use $\tilde{O}(\cdot)$ to hide polylog factors, i.e., $\tilde{O}(f(n)):=O(f(n)\cdot{\rm polylog}(f(n)))$. We use $A:=B$ to define expression $A$ in terms of $B$.

A probability space is denoted by $(\Omega,\Sigma,P)$, where $\Omega$ is the sample space, $\Sigma$ is the $\sigma$-algebra over $\Omega$, and $P$ is the probability measure on $\Sigma$. While all the discussions in our work are well-defined for general probability spaces, for our purposes we shall restrict our discussion to finite sample spaces, i.e., $|\Omega|<\infty$, and set $\Sigma=2^{\Omega}$. In this setting, $P$ can be viewed as a $|\Omega|$-dimensional vector residing in the hypercube $[0,1]^{|\Omega|}\subseteq\mathbb{R}^{|\Omega|}$, with components $P(\omega)$, $\omega\in\Omega$ and normalization $\sum_{\omega\in\Omega}P(\omega)=1$. Note that technically, a probability measure $P$ is a function on the $\sigma$-algebra $\Sigma$, not $\Omega$. Since we are dealing with a finite sample space here, knowing $P(\{\omega\})$ for all $\omega\in\Omega$ gives us full knowledge of $P$, from the additivity property of measures. Thus we can and shall simply view $P$ as a function on $\Omega$ and write $P(\omega)$ in place of $P(\{\omega\})$.
Finally, given probability measures $P$ and $Q$, we say $Q$ is absolutely continuous with respect to $P$ (written $Q\ll P$) if $P(\omega)=0\implies Q(\omega)=0$ for all $\omega$.

2 Quantum Esscher Transform

2.1 Esscher Transform

The Esscher Transform was first defined by F. Esscher in his work on risk theory [Esc32]. Let $f:E\longrightarrow\mathbb{R}$ be a probability mass function, where $E\subset\mathbb{R}^{d}$ and $\theta\in\mathbb{R}^{d}$. The function $f_{\theta}(x):=\frac{e^{\theta\cdot x}f(x)}{\sum_{x\in E}e^{\theta\cdot x}f(x)}$ is also a probability mass function, and it is called the Esscher Transform of $f$ with parameter $\theta$. We can replace probability mass functions with probability density functions (accordingly, $\sum\longrightarrow\int$).

The Esscher Transform is a map from and onto the space of probability mass/density functions, as $\mathcal{E}(f;\theta)=f_{\theta}$. In this work, we never invoke $\mathcal{E}$ and simply call $f_{\theta}$ the Esscher Transform of $f$, in the same spirit as the Fourier Transform. In the context of probability theory, let $(\Omega,\Sigma,P)$ be a probability space and $X:\Omega\longrightarrow\mathbb{R}^{d}$ a random $d$-dimensional vector. This setting motivates the equivalent definition (see Remark 2.3 below) of Esscher Transforms for measures/distributions.

Definition 2.1 (Esscher Transform for probability distributions).

Let $P$ be a probability distribution on a finite sample space $\Omega$, $X:\Omega\longrightarrow\mathbb{R}^{d}$ a random variable, and $\theta\in\mathbb{R}^{d}$. The probability distribution

$$
P_{\theta,X}(\omega):=\frac{e^{\theta\cdot X(\omega)}P(\omega)}{\mathbb{E}_{P}[e^{\theta\cdot X}]}
$$

is called the Esscher Transform of $P$ with parameter $\theta$, with respect to $X$. For brevity, we say $P_{\theta,X}$ is the $(\theta,X)$-Esscher Transform of $P$.
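
As a minimal sketch of Definition 2.1, assuming a finite sample space stored as parallel Python lists, the tilted distribution can be computed directly. The function name `esscher_transform` and the toy numbers are illustrative choices made here, not taken from the paper.

```python
import math

def esscher_transform(p, x, theta):
    """Return the (theta, X)-Esscher Transform of a finite distribution.

    p     : list of probabilities P(omega), one per outcome
    x     : list of d-dimensional tuples X(omega), one per outcome
    theta : d-dimensional tuple of tilting parameters
    """
    # Unnormalized weights e^{theta . X(omega)} P(omega)
    weights = [
        math.exp(sum(t * xi for t, xi in zip(theta, x_omega))) * p_omega
        for p_omega, x_omega in zip(p, x)
    ]
    # Normalizer E_P[e^{theta . X}]
    z = sum(weights)
    return [w / z for w in weights]

# Toy example (hypothetical data): a fair three-sided die, X = face value, d = 1.
p = [1 / 3, 1 / 3, 1 / 3]
x = [(1.0,), (2.0,), (3.0,)]
q = esscher_transform(p, x, theta=(0.5,))
```

With `theta = (0.0,)` the function returns `p` up to rounding, and a positive `theta` shifts probability mass toward outcomes with larger values of X.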

This definition is connected to the following problem. Fix $m\in\mathbb{R}^{d}$. When and how can we derive from $P$ another probability measure $Q$ such that the expectation of $X$ with respect to $Q$, $\mathbb{E}_{Q}[X]$, is equal to $m$? Among such probability measures, if they exist, how can we find the one that is closest (in some sense) to $P$? Take as a measure of closeness the relative entropy between $P$ and $Q$,

$$
D(Q\|P)=\sum_{\omega\in\Omega}Q(\omega)\log\frac{Q(\omega)}{P(\omega)}.
$$

The definition of $D(Q\|P)$ requires that $Q$ be absolutely continuous with respect to $P$, otherwise $D(Q\|P)=\infty$. Without loss of generality, we can assume $P$ is strictly positive on $\Omega$. If this were not so, then let $S\subset\Omega$ denote the subset on which $P=0$. Since $Q$ is absolutely continuous w.r.t. $P$, we have $D(Q\|P)=\sum_{\omega\in\Omega\setminus S}Q(\omega)\log\frac{Q(\omega)}{P(\omega)}$, so we are reduced to an ‘effective $\Omega$’ on which $P$ is strictly positive. The aforementioned question can then be cast as an optimization problem with multiple constraints:

$$
\begin{aligned}
\text{minimize}_{Q\in[0,1]^{|\Omega|}}\quad & D(Q\|P) \\
\text{s.t.}\quad & \mathbb{E}_{Q}[X_{i}]=m_{i},\quad i\in[d] \\
& \sum_{\omega\in\Omega}Q(\omega)=1.
\end{aligned}
\tag{2.1}
$$

Note that there are $d+1$ constraints on $Q$, hence in feasible, non-redundant cases we have $d+1\leq|\Omega|$. We have the following solution to the optimization problem.

Theorem 2.2.

Let $X:\Omega\longrightarrow\mathbb{R}^{d}$ be a random vector and $m\in\mathbb{R}^{d}$ with $\min_{\omega\in\Omega}X_{i}(\omega)<m_{i}<\max_{\omega\in\Omega}X_{i}(\omega)$ for $i\in[d]$. Then there exists a unique solution $Q^{\star}$ to problem (2.1), given by

$$
Q^{\star}=\frac{e^{\lambda^{\star}\cdot X}P}{\mathbb{E}_{P}[e^{\lambda^{\star}\cdot X}]},
$$

where $\lambda^{\star}:=\operatorname*{argmin}_{\lambda\in\mathbb{R}^{d}}\mathbb{E}_{P}[e^{\lambda\cdot(X-m)}]$. Thus $Q^{\star}$ is the $(\lambda^{\star},X)$-Esscher Transform of $P$, see Definition 2.1.

The proof is elaborated in Appendix A.
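
The statement of Theorem 2.2 can be checked numerically on a toy instance. The sketch below assumes a scalar random variable (d = 1) and finds the minimizer of the convex objective $\mathbb{E}_{P}[e^{\lambda(X-m)}]$ by bisection on its derivative. The solver choice, the helper names (`solve_tilt`, `tilt`, `relative_entropy`) and the data are assumptions made here for illustration; this is not the argument of Appendix A.

```python
import math

def solve_tilt(p, x, m, tol=1e-12):
    """Find lambda* minimizing E_P[exp(lambda * (X - m))] for scalar X (d = 1)."""
    def grad(lam):
        # Derivative of the convex objective: E_P[(X - m) e^{lam (X - m)}]
        return sum(pi * (xi - m) * math.exp(lam * (xi - m)) for pi, xi in zip(p, x))

    # The derivative is strictly increasing, so bracket its root and bisect.
    lo, hi = -1.0, 1.0
    while grad(lo) > 0:
        lo *= 2
    while grad(hi) < 0:
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if grad(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tilt(p, x, lam):
    """Esscher Transform of P with parameter lam with respect to X."""
    w = [pi * math.exp(lam * xi) for pi, xi in zip(p, x)]
    z = sum(w)
    return [wi / z for wi in w]

def relative_entropy(q, p):
    """D(Q || P) with log base 2, assuming P > 0 wherever Q > 0."""
    return sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Toy example (hypothetical data): fair three-sided die, target mean m = 2.5.
p = [1 / 3, 1 / 3, 1 / 3]
x = [1.0, 2.0, 3.0]
m = 2.5
lam_star = solve_tilt(p, x, m)
q_star = tilt(p, x, lam_star)
mean_q = sum(qi * xi for qi, xi in zip(q_star, x))
d_star = relative_entropy(q_star, p)
```

Setting the derivative of the objective to zero is equivalent to $\mathbb{E}_{Q^{\star}}[X]=m$, which is why `mean_q` matches `m` up to the bisection tolerance.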

Remark 2.3.

Let us comment on a subtlety. Above, we have called $Q^{\star}$ the Esscher Transform of $P$. Recall that the Esscher Transform as originally defined by Esscher pertains to probability mass/density functions instead of measures. Here we show that using the same terminology for probability measures is well-justified (at least for the case when $\Omega$ is discrete). The random variable $X$ induces from the probability measure $P$ the probability mass function $P_{X}(x):=P(X^{-1}(x))$ on $E:=X(\Omega)$. Assume we have, for probability measures $Q,P$ and random variable $X$, that

$$
Q(\omega)=\frac{e^{\theta\cdot X(\omega)}P(\omega)}{\mathbb{E}_{P}[e^{\theta\cdot X}]}.
$$

Then for the probability mass functions $Q_{X}$ and $P_{X}$ we have

$$
\begin{aligned}
Q_{X}(x)=Q(X^{-1}(x)) &=\sum_{\omega:X(\omega)=x}Q(\omega) \\
&=\frac{\sum_{\omega:X(\omega)=x}e^{\theta\cdot X(\omega)}P(\omega)}{\sum_{\omega\in\Omega}e^{\theta\cdot X(\omega)}P(\omega)} \\
&=\frac{e^{\theta\cdot x}P_{X}(x)}{\sum_{x\in E}\sum_{\omega:X(\omega)=x}e^{\theta\cdot X(\omega)}P(\omega)} \\
&=\frac{e^{\theta\cdot x}P_{X}(x)}{\sum_{x\in E}e^{\theta\cdot x}P_{X}(x)},
\end{aligned}
$$

i.e., $Q_{X}$ is the Esscher Transform of $P_{X}$ as defined above.
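
The identity in Remark 2.3 can be illustrated with a small numerical check: tilting the measure on the sample space and then pushing forward along X gives the same mass function as pushing forward first and then tilting. The sketch assumes a scalar, non-injective X; the helper names `tilt` and `pushforward` and the data are hypothetical.

```python
import math
from collections import defaultdict

def tilt(weights, values, theta):
    """Esscher Transform of a finite distribution given as parallel lists."""
    w = [p * math.exp(theta * v) for p, v in zip(weights, values)]
    z = sum(w)
    return [wi / z for wi in w]

def pushforward(measure, x):
    """Probability mass function of X induced by a measure on Omega."""
    pmf = defaultdict(float)
    for prob, value in zip(measure, x):
        pmf[value] += prob
    return dict(pmf)

# Four outcomes; X is not injective: two outcomes share the value 1.0.
P = [0.1, 0.2, 0.3, 0.4]
X = [0.0, 1.0, 1.0, 2.0]
theta = 0.7

# Route 1: tilt the measure on Omega, then push forward to E = X(Omega).
Q = tilt(P, X, theta)
Q_X = pushforward(Q, X)

# Route 2: push forward first, then tilt the mass function on E.
P_X = pushforward(P, X)
support = sorted(P_X)
tilted = tilt([P_X[v] for v in support], support, theta)
Q_X_alt = dict(zip(support, tilted))
```

Both routes agree on every point of the support up to floating-point rounding, mirroring the chain of equalities in the remark.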

2.2 Quantum version

2.2.1 Problem statement

Many entities in classical probability theory have meaningful generalizations in quantum theory. For example, sample spaces, probability distributions and random variables find their respective counterparts in Hilbert spaces, density operators and observables (the latter also include the former as special instances). The quantum counterpart of the relative entropy is the quantum relative entropy,

$$
S(\sigma\|\rho):=\operatorname{Tr}\{\sigma(\log\sigma-\log\rho)\},
$$

defined for density operators $\sigma,\rho$. As in the classical case, the definition of $S(\sigma\|\rho)$ imposes constraints on $\sigma$ and $\rho$ in order to have $S(\sigma\|\rho)<\infty$. Namely, $\operatorname{supp}(\sigma)\subseteq\operatorname{supp}(\rho)$ (see Chapter 11, [Wil13]) or equivalently, $\ker(\rho)\subseteq\ker(\sigma)$. Using terminology from measure theory, if this condition is satisfied we say $\sigma$ is absolutely continuous with respect to $\rho$ ($\sigma\ll\rho$). This is analogous to the absolute continuity between probability distributions in classical probability theory. Now we formally state the quantized version of Problem 2.1.

Problem 2.4.

Let $\mathcal{H}_{N}$ be an $N$-dimensional Hilbert space and $\rho\in\mathcal{D}(\mathcal{H}_{N})$ be a density operator. With $d\in\mathbb{N}$, for $i\in[d]$, let $H_{i}$ be an observable with $h_{i,\min}$ and $h_{i,\max}$ denoting its smallest and largest eigenvalue respectively. For $m\in\mathbb{R}^{d}$ with $h_{i,\min}<m_{i}<h_{i,\max}$, solve

$$
\begin{aligned}
\text{minimize}_{\sigma\geq 0}\quad & S(\sigma\|\rho) \\
\text{s.t.}\quad & \operatorname{Tr}(\sigma H_{i})=m_{i},\quad i\in[d] \\
& \operatorname{Tr}(\sigma)=1.
\end{aligned}
\tag{2.2}
$$

Here $h_{i}$ denotes a generic eigenvalue of $H_{i}$. Note that because $\sigma,H_{i}$ are Hermitian, $\operatorname{Tr}(\sigma H_{i})$ is real. As before, we require $h_{i,\min}<m_{i}<h_{i,\max}$, otherwise the constraints $\operatorname{Tr}(\sigma H_{i})=m_{i}$ cannot be satisfied. Finally, we can assume WLOG that $\|H_{i}\|\leq 1$. This amounts to dividing the constraint $\operatorname{Tr}(\sigma H_{i})=m_{i}$ throughout by $\|H_{i}\|$ if necessary.
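
To see how Problem 2.4 contains the classical problem (2.1), consider the following sketch of the commuting special case. The orthonormal basis $\{\ket{\omega}\}_{\omega\in\Omega}$ and the restriction to diagonal $\sigma$ are assumptions made here for illustration; $P$ is taken strictly positive so that $\log\rho$ is well defined, and $0\log 0:=0$.

$$
\begin{aligned}
\rho &= \sum_{\omega\in\Omega} P(\omega)\,\ket{\omega}\bra{\omega}, \qquad
H_{i} = \sum_{\omega\in\Omega} X_{i}(\omega)\,\ket{\omega}\bra{\omega}, \qquad
\sigma = \sum_{\omega\in\Omega} Q(\omega)\,\ket{\omega}\bra{\omega}, \\
S(\sigma\|\rho) &= \operatorname{Tr}\{\sigma(\log\sigma-\log\rho)\}
 = \sum_{\omega\in\Omega} Q(\omega)\bigl(\log Q(\omega)-\log P(\omega)\bigr) = D(Q\|P), \\
\operatorname{Tr}(\sigma H_{i}) &= \sum_{\omega\in\Omega} Q(\omega)X_{i}(\omega) = \mathbb{E}_{Q}[X_{i}], \qquad
\operatorname{Tr}(\sigma) = \sum_{\omega\in\Omega} Q(\omega).
\end{aligned}
$$

Under these identifications, with $N=|\Omega|$, the objective and constraints of (2.2) restricted to diagonal $\sigma$ coincide term by term with those of (2.1).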

2.2.2 Solution

Before delving into the solution, let us briefly comment on a few possible concerns. First, S(σρ)𝑆conditional𝜎𝜌S(\sigma\|\rho)italic_S ( italic_σ ∥ italic_ρ ) requires taking the logarithm of ρ𝜌\rhoitalic_ρ, which poses a problem if ρ𝜌\rhoitalic_ρ is not strictly positive definite. This issue is circumvented if, as mentioned above, ker(ρ)ker(σ)kernel𝜌kernel𝜎\ker(\rho)\subseteq\ker(\sigma)roman_ker ( italic_ρ ) ⊆ roman_ker ( italic_σ ). The analysis becomes relatively straightforward if we partition the Hilbert space \mathcal{H}caligraphic_H into suitable subspaces and examine σ𝜎\sigmaitalic_σ over them separately. To this end, we introduce the following notation. Let 𝒢𝒢\mathcal{G}caligraphic_G be a subspace of \mathcal{H}caligraphic_H. For A()𝐴A\in\mathcal{L}(\mathcal{H})italic_A ∈ caligraphic_L ( caligraphic_H ), denote A𝒢:=Π𝒢AΠ𝒢(𝒢)assignsubscript𝐴𝒢subscriptΠ𝒢𝐴subscriptΠ𝒢𝒢A_{\mathcal{G}}:=\Pi_{\mathcal{G}}A\Pi_{\mathcal{G}}\in\mathcal{L}(\mathcal{G})italic_A start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT := roman_Π start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT italic_A roman_Π start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT ∈ caligraphic_L ( caligraphic_G ), where Π𝒢subscriptΠ𝒢\Pi_{\mathcal{G}}roman_Π start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT is the projector onto 𝒢𝒢\mathcal{G}caligraphic_G.

Second, as in the classical case, we hope to solve this optimization problem using Lagrange multipliers. With a fixed $\rho$, $S(\sigma\|\rho)$ is a real-valued function of complex matrices. How do we optimize such functions? In principle we could convert everything into real numbers: since $M_N(\mathbb{C}) \cong \mathbb{R}^{2N^2}$, we could view $S(\sigma\|\rho)$ as a function of $2N^2$ real parameters and implement conventional optimization methods. However, this conversion is generally tedious, and the resulting expression for $S(\sigma\|\rho)$ cumbersome. The ‘Wirtinger Calculus’ provides a relatively simple methodology for the optimization of such functions, through the use of ‘Wirtinger derivatives’. We state the main definitions and results of this framework in Appendix B.

We have the following result, which partially resolves Problem 2.4:

Theorem 2.5.

The solution to Problem 2.4 takes the form

\[ \sigma^{\star} = \sigma_{\operatorname{supp}\rho}^{\star} \oplus \sigma_{\ker\rho}^{\star}, \tag{2.3} \]

where

\[ \sigma_{\operatorname{supp}\rho}^{\star} = \frac{e^{\lambda^{\star}\cdot H_{\operatorname{supp}\rho} + \log\rho_{\operatorname{supp}\rho}}}{\operatorname{Tr}\bigl(e^{\lambda^{\star}\cdot H_{\operatorname{supp}\rho} + \log\rho_{\operatorname{supp}\rho}}\bigr)} \qquad\text{and}\qquad \sigma_{\ker\rho}^{\star} = \mathbf{0}. \tag{2.4} \]

The optimal values $\lambda^{\star} \in \mathbb{R}^{d}$ are to be determined from the constraints

\[ \operatorname{Tr}\left(e^{\lambda^{\star}\cdot(H_{\operatorname{supp}\rho} - m) + \log\rho_{\operatorname{supp}\rho}}\,(H_{i,\operatorname{supp}\rho} - m_{i})\right) = 0, \quad i \in [d]. \tag{2.5} \]
Proof.

To facilitate the presentation of the solution, certain parts of the argument sequence are collated into lemmas and placed below the main body of this proof.

Step 1. First, for any candidate solution $\sigma$ we enforce $\ker\rho \subseteq \ker\sigma$. By Lemma 2.6, this implies $\sigma_{\ker\rho} = \mathbf{0}$ and furthermore enables the decomposition of $\sigma$ into a direct sum: $\sigma = \sigma_{\operatorname{supp}\rho} \oplus \sigma_{\ker\rho}$. With this decomposition, we can consider the trace of the operators over just the subspace $\operatorname{supp}\rho$. More specifically, $\operatorname{Tr}(\sigma H_i) = \operatorname{Tr}(\sigma(\Pi_{\operatorname{supp}\rho} + \Pi_{\ker\rho}) H_i (\Pi_{\operatorname{supp}\rho} + \Pi_{\ker\rho})) = \operatorname{Tr}(\sigma_{\operatorname{supp}\rho} H_{i,\operatorname{supp}\rho})$ (recall that for any $A \in \mathcal{L}(\mathcal{H})$, $\ker A \oplus \operatorname{supp} A = \mathcal{H}$, so $\Pi_{\ker A} + \Pi_{\operatorname{supp} A} = I$), and

\begin{align*}
S(\sigma\|\rho) &= \operatorname{Tr}\{(\sigma_{\operatorname{supp}\rho} \oplus \sigma_{\ker\rho})(\log(\sigma_{\operatorname{supp}\rho} \oplus \sigma_{\ker\rho}) - \log(\rho_{\operatorname{supp}\rho} \oplus \rho_{\ker\rho}))\} \\
&= \operatorname{Tr}\{\sigma_{\operatorname{supp}\rho}(\log\sigma_{\operatorname{supp}\rho} - \log\rho_{\operatorname{supp}\rho})\} + \underbrace{\operatorname{Tr}\{\sigma_{\ker\rho}(\log\sigma_{\ker\rho} - \log\rho_{\ker\rho})\}}_{=0} \\
&= S(\sigma_{\operatorname{supp}\rho}\|\rho_{\operatorname{supp}\rho}).
\end{align*}

Thus, we can replace $\mathcal{H}$ in Problem 2.4 by $\operatorname{supp}\rho$, and the operators by their restrictions to $\operatorname{supp}\rho$. Note that $\rho_{\operatorname{supp}\rho}$ is positive definite.

Step 2. Next we obtain the form of $\sigma_{\operatorname{supp}\rho}$. For ease of presentation let us simply write $\sigma$, $\rho$, $H_i$ for $\sigma_{\operatorname{supp}\rho}$, $\rho_{\operatorname{supp}\rho}$, $H_{i,\operatorname{supp}\rho}$. With $\rho$ now positive definite, $\log\rho$ is well-defined. Now we invoke Proposition B.1 to extract the optimal $\sigma$ by setting $\frac{\partial\mathcal{L}}{\partial\sigma} = \mathbf{0}$.

Set up the Lagrangian

\[ \mathcal{L} = \operatorname{Tr}\{\sigma(\log\sigma - \log\rho)\} - \sum_{i} \lambda_{i}(\operatorname{Tr}(\sigma H_{i}) - m_{i}) - \eta(\operatorname{Tr}\sigma - 1), \tag{2.6} \]

where $\lambda_i$ and $\eta$ are the Lagrange multipliers. Making use of Propositions B.2 and B.3, setting $\frac{\partial\mathcal{L}}{\partial\sigma}$ to zero gives

\begin{align*}
\frac{\partial\mathcal{L}}{\partial\sigma} = \mathbf{0} &\implies (\log\sigma)^{T} + I - (\log\rho)^{T} - (\lambda\cdot H)^{T} - \eta I = \mathbf{0} \\
&\implies \sigma = e^{\eta - 1} e^{\lambda\cdot H + \log\rho} \\
&\implies \sigma = \frac{e^{\lambda\cdot H + \log\rho}}{\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})} \qquad \text{after normalization}.
\end{align*}

It remains to determine $\lambda$ from the constraints $\operatorname{Tr}(\sigma H) = m$. Plugging the above expression for $\sigma$ into the constraints we have

\begin{align*}
\frac{\operatorname{Tr}(e^{\lambda\cdot H + \log\rho} H)}{\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})} = m &\implies \frac{\operatorname{Tr}(e^{\lambda\cdot H + \log\rho}(H - m))}{\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})} = 0 \\
&\implies \operatorname{Tr}(e^{\lambda\cdot(H - m) + \log\rho}(H - m)) = 0.
\end{align*}
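Two algebraic steps in Step 2 are left implicit; the block below is a sketch that spells them out. It assumes the reading $\lambda\cdot(H - m) = \sum_i \lambda_i (H_i - m_i I)$ for the shorthand used in the text, and it uses that $\log\sigma$, $\log\rho$ and $\lambda\cdot H$ are Hermitian, so transposing the stationarity condition leaves an equivalent equation.

```latex
% Step A: from the stationarity condition to the exponential form.
\begin{align*}
(\log\sigma)^{T} + I - (\log\rho)^{T} - (\lambda\cdot H)^{T} - \eta I = \mathbf{0}
  &\implies \log\sigma = \lambda\cdot H + \log\rho + (\eta - 1) I \\
  &\implies \sigma = e^{\eta - 1}\, e^{\lambda\cdot H + \log\rho}.
\end{align*}
% Step B: shifting H by m only rescales the exponential by a positive scalar.
\begin{align*}
e^{\lambda\cdot(H - m) + \log\rho}
  &= e^{\lambda\cdot H + \log\rho - (\lambda\cdot m) I}
   = e^{-\lambda\cdot m}\, e^{\lambda\cdot H + \log\rho}, \\
\operatorname{Tr}\bigl(e^{\lambda\cdot(H - m) + \log\rho}(H_i - m_i I)\bigr)
  &= e^{-\lambda\cdot m}\, \operatorname{Tr}\bigl(e^{\lambda\cdot H + \log\rho}(H_i - m_i I)\bigr).
\end{align*}
```

Both steps rely only on the fact that a scalar multiple of the identity commutes with every operator, so that $e^{X + cI} = e^{c} e^{X}$. Since $e^{-\lambda\cdot m} > 0$, the two trace expressions in Step B vanish together.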

Step 3. Now we show that $\sigma^{\star}$ as given in Eq. 2.4 indeed minimizes $S(\sigma\|\rho)$. But this follows easily from Lemma 2.7. Furthermore, since $S(\sigma\|\rho)$ is a strictly convex functional of $\sigma$, it can have at most one minimizer in the convex set $M$, thereby showing the uniqueness of $\sigma^{\star}$. Finally, again by Lemma 2.7 we note that $\lambda^{\star}$ satisfies

\[ \lambda^{\star} = \operatorname*{argmax}_{\lambda\in\mathbb{R}^{d}} \left[\lambda\cdot m - \log\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})\right] = \operatorname*{argmin}_{\lambda\in\mathbb{R}^{d}} \log\operatorname{Tr}(e^{\lambda\cdot(H - m) + \log\rho}) = \operatorname*{argmin}_{\lambda\in\mathbb{R}^{d}} \operatorname{Tr}(e^{\lambda\cdot(H - m) + \log\rho}), \]

where the last equality holds because $\log f(x)$ and $f(x)$ share the same minimum/maximum points, provided $f(x) > 0$ at those points. ∎
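The following sketch illustrates Theorem 2.5 numerically for a single constraint ($d = 1$) on one qubit. The state `rho`, the observable `H`, the target `m`, the helper names, and the use of bisection are all assumptions made for illustration; the paper does not prescribe a classical solver. Bisection is applicable here because $\lambda \mapsto \operatorname{Tr}(\sigma_\lambda H)$ is the derivative of the convex function $\lambda \mapsto \log\operatorname{Tr}(e^{\lambda H + \log\rho})$ and is therefore nondecreasing.

```python
import numpy as np

def herm_fun(A, f):
    """Apply a scalar function f to a Hermitian matrix A via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def esscher(rho, H, lam):
    """sigma_lam = exp(lam*H + log rho) / Tr(...), for positive definite rho."""
    X = herm_fun(lam * H + herm_fun(rho, np.log), np.exp)
    return X / np.trace(X).real

def solve_lambda(rho, H, m, lo=-50.0, hi=50.0, iters=200):
    """Bisection on g(lam) = Tr(sigma_lam H) - m, which is nondecreasing in lam."""
    g = lambda lam: np.trace(esscher(rho, H, lam) @ H).real - m
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical single-qubit instance: rho > 0, H = Pauli X, target mean m.
rho = np.array([[0.8, 0.1], [0.1, 0.2]])
H = np.array([[0.0, 1.0], [1.0, 0.0]])
m = 0.5   # must lie strictly between the extreme eigenvalues of H, here -1 and 1

lam_star = solve_lambda(rho, H, m)
sigma_star = esscher(rho, H, lam_star)

def dual(lam):
    """Dual objective lam*m - log Tr exp(lam*H + log rho) from the argmax characterization."""
    Z = np.trace(herm_fun(lam * H + herm_fun(rho, np.log), np.exp)).real
    return lam * m - np.log(Z)
```

With these inputs, `sigma_star` should satisfy the constraint $\operatorname{Tr}(\sigma^{\star} H) = m$ up to rounding, and `dual(lam_star)` should agree with $S(\sigma^{\star}\|\rho)$ to numerical precision.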

Lemma 2.6.

Let $\sigma, \rho \in \mathcal{L}(\mathcal{H})$ be normal operators, so that they have spectral decompositions. If $\ker\rho \subseteq \ker\sigma$, then $\sigma_{\ker\rho} = \mathbf{0}$ and $\sigma$ can be partitioned into a direct sum:

\[ \sigma = \sigma_{\operatorname{supp}\rho} \oplus \sigma_{\ker\rho}. \]
Proof.

Expand $\sigma$ in terms of the eigenbasis of $\rho$, $\{|i\rangle\}_{i=0}^{N-1}$. Let $S \subseteq [N]-1$ be the index subset such that $\operatorname{span}\{|i\rangle : i \in S\} = \operatorname{supp}\rho$, so $\operatorname{span}\{|i\rangle : i \in S^{c}\} = \ker\rho$. We have

\begin{align*}
\sigma = \sum_{i,j=0}^{N-1} \langle i|\sigma|j\rangle\, |i\rangle\langle j| = {} & \underbrace{\sum_{i\in S}\sum_{j\in S} \langle i|\sigma|j\rangle\, |i\rangle\langle j|}_{=\;\sigma_{\operatorname{supp}\rho}} + \underbrace{\sum_{i\in S}\sum_{j\in S^{c}} \langle i|\sigma|j\rangle\, |i\rangle\langle j|}_{=\;\mathbf{0}} \\
& + \underbrace{\sum_{i\in S^{c}}\sum_{j\in S} \langle i|\sigma|j\rangle\, |i\rangle\langle j|}_{=\;\mathbf{0}} + \underbrace{\sum_{i\in S^{c}}\sum_{j\in S^{c}} \langle i|\sigma|j\rangle\, |i\rangle\langle j|}_{=\;\sigma_{\ker\rho}\;=\;\mathbf{0}},
\end{align*}

where the annihilation of the last three terms comes about because for $i \in S^{c}$, $|i\rangle \in \ker\rho \subseteq \ker\sigma$, so $\sigma|i\rangle = 0$ and, $\sigma$ being normal, also $\langle i|\sigma = 0$.

Note that the partition of an operator into a direct sum over another operator’s ker and supp subspaces does not hold in general. ∎
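The lemma and the closing caveat can be illustrated with a small numerical check. The matrices below are hypothetical examples chosen for illustration: `sigma_good` satisfies $\ker\rho \subseteq \ker\sigma$, whereas `sigma_bad` does not, and only the former splits into a direct sum over $\operatorname{supp}\rho$ and $\ker\rho$.

```python
import numpy as np

# Hypothetical 3-dimensional example: ker(rho) is spanned by the basis vector |2>.
rho = np.diag([0.6, 0.4, 0.0])
Pi_supp = np.diag([1.0, 1.0, 0.0])       # projector onto supp(rho)
Pi_ker = np.eye(3) - Pi_supp             # projector onto ker(rho)

def off_diagonal_blocks(A):
    """The two cross terms Pi_supp A Pi_ker and Pi_ker A Pi_supp."""
    return Pi_supp @ A @ Pi_ker, Pi_ker @ A @ Pi_supp

# sigma_good annihilates |2>, so ker(rho) is contained in ker(sigma_good).
sigma_good = np.array([[0.5, 0.5, 0.0],
                       [0.5, 0.5, 0.0],
                       [0.0, 0.0, 0.0]])

# sigma_bad is a pure state on span{|1>, |2>}; it does not annihilate |2>.
sigma_bad = np.array([[0.0, 0.0, 0.0],
                      [0.0, 0.5, 0.5],
                      [0.0, 0.5, 0.5]])

# A direct-sum decomposition requires both cross terms to vanish.
good_is_direct_sum = all(np.allclose(B, 0) for B in off_diagonal_blocks(sigma_good))
bad_is_direct_sum = all(np.allclose(B, 0) for B in off_diagonal_blocks(sigma_bad))
```

Here `good_is_direct_sum` is true and `bad_is_direct_sum` is false, since `sigma_bad` has a nonzero matrix element between $|1\rangle \in \operatorname{supp}\rho$ and $|2\rangle \in \ker\rho$.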

The following lemma is the quantized version of Lemma A.1. We employ analogous arguments and notation, starting with

\[ \Lambda = \left\{\frac{e^{\lambda\cdot H + \log\rho}}{\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})} : \lambda \in \mathbb{R}^{d}\right\} \quad\text{and}\quad M = \{\sigma : \operatorname{Tr}(\sigma H) = m\}. \]
Lemma 2.7.

Let $\rho \in \mathcal{D}(\mathcal{H})$ and $H_i$, $i \in [d]$, be observables on $\mathcal{H}$. Fix $m \in \mathbb{R}^{d}$. Then for any density operator $\sigma \in \mathcal{D}(\mathcal{H})$ satisfying $\operatorname{Tr}(\sigma H) = m$, we have

\[ S(\sigma\|\rho) \geq \sup_{\lambda\in\mathbb{R}^{d}} \left[\lambda\cdot m - \log\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})\right]. \tag{2.7} \]

Moreover the inequality is saturated if $\sigma = \sigma_{\lambda'} := e^{\lambda'\cdot H + \log\rho} / \operatorname{Tr}(e^{\lambda'\cdot H + \log\rho}) \in \Lambda \cap M$ for some $\lambda' \in \mathbb{R}^{d}$:

\[ S(\sigma_{\lambda'}\|\rho) = \lambda'\cdot m - \log\operatorname{Tr}(e^{\lambda'\cdot H + \log\rho}) = \sup_{\lambda\in\mathbb{R}^{d}} \left[\lambda\cdot m - \log\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})\right]. \tag{2.8} \]
Proof.

Each $\lambda \in \mathbb{R}^{d}$ gives rise to a corresponding $\sigma_{\lambda} \in \Lambda$ (note that $\sigma_{\lambda}$ need not be in $M$). Then for any $\sigma$ satisfying $\operatorname{Tr}(\sigma H) = m$, we have

\begin{align}
S(\sigma\|\rho) &= S(\sigma\|\sigma_{\lambda}) + \operatorname{Tr}\{\sigma(\log\sigma_{\lambda} - \log\rho)\} \tag{2.9} \\
&\geq \operatorname{Tr}\{\sigma(\log(e^{\lambda\cdot H + \log\rho}) - \log\operatorname{Tr}(e^{\lambda\cdot H + \log\rho}) - \log\rho)\} && \text{(nonnegativity of $S(\sigma\|\sigma_{\lambda})$)} \notag \\
&= \operatorname{Tr}\{\sigma(\lambda\cdot H)\} - \log\operatorname{Tr}(e^{\lambda\cdot H + \log\rho}) \notag \\
&= \lambda\cdot m - \log\operatorname{Tr}(e^{\lambda\cdot H + \log\rho}). \notag
\end{align}

Since this holds for all $\lambda \in \mathbb{R}^{d}$, we conclude that $S(\sigma\|\rho) \geq \sup_{\lambda\in\mathbb{R}^{d}} \left[\lambda\cdot m - \log\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})\right]$. Furthermore, if $\lambda' \in \mathbb{R}^{d}$ is such that $\sigma_{\lambda'} \in \Lambda \cap M$, then letting $\sigma = \sigma_{\lambda'}$ and rerunning the same argument sequence above gives

\begin{align*}
S(\sigma_{\lambda'}\|\rho) &= \operatorname{Tr}\{\sigma_{\lambda'}(\log\sigma_{\lambda'} - \log\rho)\} \\
&= \operatorname{Tr}\{\sigma_{\lambda'}(\log(e^{\lambda'\cdot H + \log\rho}) - \log\operatorname{Tr}(e^{\lambda'\cdot H + \log\rho}) - \log\rho)\} \\
&= \operatorname{Tr}\{\sigma_{\lambda'}(\lambda'\cdot H)\} - \log\operatorname{Tr}(e^{\lambda'\cdot H + \log\rho}) \\
&= \lambda'\cdot m - \log\operatorname{Tr}(e^{\lambda'\cdot H + \log\rho}).
\end{align*}

In particular, this also shows that $\lambda' = \operatorname*{argmax}_{\lambda\in\mathbb{R}^{d}} \left[\lambda\cdot m - \log\operatorname{Tr}(e^{\lambda\cdot H + \log\rho})\right]$. ∎
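Lemma 2.7 can be probed numerically. The sketch below is a sanity check on one hypothetical qubit instance, not a proof: the matrices, the grid of $\lambda$ values, and the helper names are illustrative assumptions. The target `m` is defined as $\operatorname{Tr}(\sigma H)$ so that `sigma` lies in $M$ by construction.

```python
import numpy as np

def herm_fun(A, f):
    """Apply a scalar function f to a Hermitian matrix A via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def rel_entropy(sigma, rho):
    """S(sigma||rho) = Tr sigma (log sigma - log rho), for positive definite inputs."""
    return np.trace(sigma @ (herm_fun(sigma, np.log) - herm_fun(rho, np.log))).real

def dual(lam, rho, H, m):
    """Right-hand side of (2.7) before the supremum: lam*m - log Tr exp(lam*H + log rho)."""
    Z = np.trace(herm_fun(lam * H + herm_fun(rho, np.log), np.exp)).real
    return lam * m - np.log(Z)

# Hypothetical qubit data (d = 1); both states are positive definite.
rho = np.array([[0.8, 0.1], [0.1, 0.2]])
sigma = np.array([[0.4, 0.2], [0.2, 0.6]])
H = np.array([[1.0, 0.0], [0.0, -1.0]])    # Pauli Z
m = np.trace(sigma @ H).real               # sigma satisfies the constraint by construction

# Inequality (2.7): S(sigma||rho) dominates the dual objective on a grid of lambdas.
lams = np.linspace(-5.0, 5.0, 201)
dual_values = np.array([dual(l, rho, H, m) for l in lams])
gap = rel_entropy(sigma, rho) - dual_values.max()

# Saturation (2.8): take sigma' in Lambda and set m' := Tr(sigma' H), so sigma' is in M too.
lam_p = 0.7
X = herm_fun(lam_p * H + herm_fun(rho, np.log), np.exp)
sigma_p = X / np.trace(X).real
m_p = np.trace(sigma_p @ H).real
saturation_error = abs(rel_entropy(sigma_p, rho) - dual(lam_p, rho, H, m_p))
```

For this instance `gap` is nonnegative, as (2.7) requires, and `saturation_error` is zero up to rounding, in line with (2.8).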

Motivated by the form of the state $\sigma_{\operatorname{supp}\rho}^{\star}$ in Theorem 2.5, we make the following definition:

Definition 2.8 (Quantum Esscher Transform).

Let $0<\rho\in\mathcal{D}(\mathcal{H})$ be a density operator, $H_i,\; i\in[d]$ be observables, and $\theta\in\mathbb{R}^{d}$. The density operator

$$\rho_{\theta,H} := \frac{e^{\theta\cdot H+\log\rho}}{\operatorname{Tr}(e^{\theta\cdot H+\log\rho})}$$

is called the $(\theta,H)$-quantum Esscher transform of $\rho$.

Remark 2.9.

The state $\sigma_{\operatorname{supp}\rho}^{\star}$ in Theorem 2.5 is thus a $(\lambda^{\star},H_{\operatorname{supp}\rho})$-quantum Esscher transform of $\rho_{\operatorname{supp}\rho}>0$. Also note that the quantum Esscher transform subsumes the classical Esscher transform as a special case, wherein $\rho, H_i$ are diagonal and thus commute.
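To make the definition and the classical special case concrete, here is a small NumPy sketch. The function names `quantum_esscher` and `classical_esscher`, the probability vector `p`, the values `hs` and the parameter `theta` are all illustrative assumptions. For diagonal $\rho$ and $H_i$ the quantum transform should reduce to the exponentially tilted distribution $p_k e^{\theta\cdot h(k)}/\sum_l p_l e^{\theta\cdot h(l)}$.

```python
import numpy as np

def herm_fun(A, f):
    """Apply a scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def quantum_esscher(rho, Hs, theta):
    """(theta, H)-quantum Esscher transform of a full-rank density operator rho."""
    X = sum(t * H for t, H in zip(theta, Hs)) + herm_fun(rho, np.log)
    E = herm_fun(X, np.exp)
    return E / np.trace(E).real

def classical_esscher(p, hs, theta):
    """Classical exponential tilting of a probability vector p."""
    weights = p * np.exp(sum(t * h for t, h in zip(theta, hs)))
    return weights / weights.sum()

# Diagonal (commuting) instance: rho = diag(p), H_i = diag(h_i).
p = np.array([0.5, 0.3, 0.2])
hs = [np.array([1.0, 0.0, -1.0]), np.array([0.5, 2.0, 0.0])]
theta = np.array([0.4, -0.2])

rho_tilted = quantum_esscher(np.diag(p), [np.diag(h) for h in hs], theta)
p_tilted = classical_esscher(p, hs, theta)
```

Here `rho_tilted` should be diagonal with diagonal entries equal to `p_tilted`.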

2.2.3 Connection to quantum imaginary time evolution

Quantum imaginary-time evolution (QITE) is a conceptual tool for finding ground states of Hamiltonians [MJE+19, MST+20]. From the real-time Schrödinger equation one obtains the imaginary-time Schrödinger equation $\frac{\partial|\psi\rangle}{\partial\tau} = -H|\psi\rangle$ by performing a Wick rotation, i.e. $\tau = it$. For general mixed states $\rho$, the imaginary-time Liouville-von Neumann equation [BK91] is given by

$$\frac{\partial\rho}{\partial\tau} = -\{H,\rho\} + 2\langle H\rangle\rho, \tag{2.10}$$

from which the solution is derived as

$$\rho(\tau) = A(\tau)\,e^{-\tau H}\rho(0)\,e^{-\tau H}, \tag{2.11}$$

where $A(\tau) = 1/\operatorname{Tr}(e^{-2\tau H}\rho(0))$ is the normalisation factor.
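That the normalised state $\rho(\tau)$ solves the imaginary-time Liouville-von Neumann equation can be verified numerically. The sketch below is an illustration under stated assumptions: the random Hamiltonian, the initial state, the value of `tau` and the helper names `herm_fun` and `ite_state` are not from the paper. It compares a central finite-difference estimate of $\partial\rho/\partial\tau$ with $-\{H,\rho\}+2\langle H\rangle\rho$.

```python
import numpy as np

def herm_fun(A, f):
    """Apply a scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def ite_state(tau, H, rho0):
    """rho(tau) = e^{-tau H} rho(0) e^{-tau H} / Tr(e^{-2 tau H} rho(0))."""
    K = herm_fun(H, lambda w: np.exp(-tau * w))
    M = K @ rho0 @ K
    return M / np.trace(M).real

rng = np.random.default_rng(1)
dim = 3
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (M + M.conj().T) / 2
G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
rho0 = G @ G.conj().T
rho0 /= np.trace(rho0).real

tau, h = 0.4, 1e-5
rho_tau = ite_state(tau, H, rho0)

# Left-hand side: finite-difference derivative of rho with respect to tau.
lhs = (ite_state(tau + h, H, rho0) - ite_state(tau - h, H, rho0)) / (2 * h)

# Right-hand side: -{H, rho} + 2 <H> rho with <H> = Tr(H rho).
mean_H = np.trace(H @ rho_tau).real
rhs = -(H @ rho_tau + rho_tau @ H) + 2 * mean_H * rho_tau
```

The two sides should agree up to finite-difference error, and `rho_tau` should have unit trace by construction.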

In [OP07] it was asserted that under certain conditions, namely ‘when the prior and posterior states are close to each other with respect to the Fisher information metric’, the relative entropy minimization problem could be solved by formally integrating a ‘quantum trajectory’ equation [OP07, Bra96]. This equation takes on the same form as Eq. (2.10), and thus its solution is given by Eq. (2.11). More specifically, we have

$$\rho(\theta) = \frac{e^{\theta\cdot H/2}\,\rho\,e^{\theta\cdot H/2}}{\operatorname{Tr}(e^{\theta\cdot H}\rho)},$$

where $\theta$ are the Lagrange multipliers. Here we simply observe that $\rho(\theta)$ resembles the imaginary-time-evolved state in Eq. (2.11) if $\theta$ is one-dimensional and after making the substitution $\tau = -\theta/2$. Since the quantum Esscher transform provides an exact solution to the problem, under the aforementioned condition we note the connection between the quantum Esscher transform and QITE.
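A single-qubit example illustrates how the symmetric form $\rho(\theta)$ relates to the quantum Esscher transform. This is a hedged sketch: the state `rho`, the Pauli observables, the choice `theta = 1.0` and the function names are illustrative assumptions. When $H$ commutes with $\rho$ the two expressions coincide, whereas for a non-commuting $H$ they differ in general.

```python
import numpy as np

def herm_fun(A, f):
    """Apply a scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def esscher(rho, H, theta):
    """exp(theta H + log rho) / Tr(exp(theta H + log rho)) for a single observable H."""
    E = herm_fun(theta * H + herm_fun(rho, np.log), np.exp)
    return E / np.trace(E).real

def symmetric_form(rho, H, theta):
    """e^{theta H / 2} rho e^{theta H / 2} / Tr(e^{theta H} rho)."""
    K = herm_fun(H, lambda w: np.exp(theta * w / 2))
    M = K @ rho @ K
    return M / np.trace(M).real

rho = np.diag([0.8, 0.2])
Z = np.diag([1.0, -1.0])                 # commutes with rho
X = np.array([[0.0, 1.0], [1.0, 0.0]])   # does not commute with rho
theta = 1.0

gap_commuting = np.linalg.norm(esscher(rho, Z, theta) - symmetric_form(rho, Z, theta))
gap_noncommuting = np.linalg.norm(esscher(rho, X, theta) - symmetric_form(rho, X, theta))
```

In this instance `gap_commuting` is zero up to rounding while `gap_noncommuting` is clearly nonzero. This matches the statement that $\rho(\theta)$ only resembles the exact solution in general.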

Next, we discuss how to implement the quantum Esscher Transform on quantum computers using modern techniques based on block-encodings (BE) and the quantum singular value transformation (QSVT). Before doing so we collate the relevant tools and techniques of the framework in the next section.

3 Overview on block-encodings and quantum singular value transformations

The technique of quantum signal processing [LYC16] and its lifting, via ‘qubitization’, to quantum singular value transformation (QSVT) [LC19, GSLW19] provide a concise way to formulate quantum algorithms, particularly for linear algebraic tasks. This framework has provided more efficient implementations of several existing quantum algorithms, such as Hamiltonian simulation [LC17, LC19], amplitude amplification and estimation [GSLW19, RF23] and quantum linear systems solving [GSLW19], and even led to the discovery of new algorithms. For our purposes, we do not actually need the full generality of QSVT. As our matrices of interest are Hermitian and thus admit spectral decompositions, a relaxed version of QSVT, namely the quantum eigenvalue transformation (QET), suffices. We direct readers interested in learning more about QSVT to [GSLW19, MRTC21, DMB+23].

Definition 3.1 (Block-Encoding).

Let $A$ be an $n$-qubit matrix, $\alpha,\varepsilon\in\mathbb{R}_{+}$ and $a\in\mathbb{N}$. We say that the $(n+a)$-qubit unitary $U$ is an $(\alpha,a,\varepsilon)$-block-encoding of $A$ if

$$\|A-\alpha(\langle 0^{a}|\otimes I_{n})U(|0^{a}\rangle\otimes I_{n})\|\leq\varepsilon.$$

Remark 3.2.

Note that if $U$ is an $(\alpha,a,\varepsilon)$-BE of $A$, then equivalently it is a $(1,a,\frac{\varepsilon}{\alpha})$-BE of $\frac{A}{\alpha}$. Also, if we have an $(\alpha,a,\varepsilon)$-BE of $A$ then we also have an $(\alpha,a+a',\varepsilon+\varepsilon')$-BE of $A$, where $1\leq a'\in\mathbb{N}$ and $\varepsilon'>0$. Making the increment $a'$ simply corresponds to tacking on an extra $a'$-qubit identity operator $I_{a'}$. More specifically, if $U$ is an $(\alpha,a,\varepsilon)$-BE of $A$ then $I_{a'}\otimes U$ is an $(\alpha,a+a',\varepsilon)$-BE of $A$, since

$$\|A-\alpha(\langle 0^{a}|\otimes I_{n})U(|0^{a}\rangle\otimes I_{n})\|\leq\varepsilon \implies \|A-\alpha(\langle 0^{a'+a}|\otimes I_{n})(I_{a'}\otimes U)(|0^{a'+a}\rangle\otimes I_{n})\|\leq\varepsilon.$$

Finally, if $\varepsilon$ is already an error bound, $\varepsilon+\varepsilon'$ clearly serves as another error bound, albeit a weaker one.
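The definition and the observations in the remark can be illustrated with dense matrices. The sketch below is only an illustration: it uses a standard one-ancilla unitary dilation of a Hermitian matrix, which is an assumption of this example and not a construction taken from the paper, and the names `block_encode_hermitian` and `be_error` are hypothetical. With the ancilla register placed first, the encoded block is the top-left corner of $U$.

```python
import numpy as np

def herm_fun(A, f):
    """Apply a scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def block_encode_hermitian(A, alpha):
    """One-ancilla dilation U = [[B, S], [S, -B]], B = A/alpha, S = sqrt(I - B^2).

    Assumes A is Hermitian with spectral norm at most alpha.
    """
    B = A / alpha
    S = herm_fun(B, lambda w: np.sqrt(np.clip(1.0 - w**2, 0.0, None)))
    return np.block([[B, S], [S, -B]])

def be_error(A, U, alpha):
    """Spectral-norm error || A - alpha * (<0^a| x I) U (|0^a> x I) ||."""
    dim = A.shape[0]
    return np.linalg.norm(A - alpha * U[:dim, :dim], 2)

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                # a 2-qubit Hermitian matrix (n = 2)
alpha = 1.5 * np.linalg.norm(A, 2)

U = block_encode_hermitian(A, alpha)    # (alpha, 1, 0)-BE of A
U_padded = np.kron(np.eye(4), U)        # I_{a'} tensor U with a' = 2 extra ancillas

# U encodes a slightly different matrix only approximately.
A_noisy = A + 1e-3 * np.eye(4)
eps = be_error(A_noisy, U, alpha)
eps_rescaled = be_error(A_noisy / alpha, U, 1.0)
```

Here `eps_rescaled` equals `eps / alpha`, matching the first statement of the remark, and padding with identity ancillas leaves the error unchanged.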

[GSLW19] provides a construction of exact block-encodings for density operators, assuming access to oracles which prepare the purifications of the density operators:

Definition 3.3 (Purified quantum query-access).

Let $\rho$ be an $n$-qubit density operator. We say $\rho$ has purified quantum query-access if we have access to an $(n_{\rho}+n)$-qubit unitary operator $O_{\rho}$, where

$$O_{\rho}|0^{n_{\rho}}\rangle|0^{n}\rangle = |\rho\rangle$$

prepares $|\rho\rangle$, the purification of $\rho$ (i.e. $\operatorname{tr}_{n_{\rho}}|\rho\rangle\langle\rho| = \rho$) with the help of $n_{\rho}$ ancilla qubits.

Footnote 2: Theoretically, any $n$-qubit quantum state can be purified with at most $n$ ancilla qubits, so one can assume $n_{\rho}\leq n$. In practice however, it could be more convenient to use more than $n$ ancillas for purification. Thus we make the more relaxed assumption that $n_{\rho}=\operatorname{poly}(n)$.

Proposition 3.4 (Block-encoding of density operators – Lemma 45, [GSLW19]).

Let $\rho$ be an $n$-qubit density operator with purified quantum query-access via $O_{\rho}$. Then $\widetilde{O_{\rho}} := (O_{\rho}^{\dagger}\otimes I_{n})(I_{n_{\rho}}\otimes\text{SWAP}_{n})(O_{\rho}\otimes I_{n})$ is a $(1,n+n_{\rho},0)$-BE of $\rho$. Here $\text{SWAP}_{n}$ swaps the two $n$-qubit system registers, so that every factor acts on $n_{\rho}+2n$ qubits.
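The construction can be checked on a small instance. The following NumPy sketch rests on several assumptions made for illustration: one system qubit and one ancilla qubit, a purification built from the eigendecomposition of $\rho$, and a unitary $O_\rho$ obtained by completing the purification to an orthonormal basis with a QR factorization. The helper names `unitary_with_first_column` and `swap_matrix` are hypothetical. Registers are ordered as (ancilla, system copy 1, system copy 2), so the encoded block is the top-left corner.

```python
import numpy as np

def unitary_with_first_column(v, rng):
    """Return a unitary whose first column equals the unit vector v."""
    N = len(v)
    M = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    M[:, 0] = v
    Q, _ = np.linalg.qr(M)
    Q[:, 0] *= np.vdot(Q[:, 0], v)  # remove the phase introduced by QR
    return Q

def swap_matrix(d):
    """SWAP of two d-dimensional registers: |k, l> -> |l, k>."""
    return np.eye(d * d).reshape(d, d, d, d).transpose(1, 0, 2, 3).reshape(d * d, d * d)

rng = np.random.default_rng(3)
d, d_anc = 2, 2  # system dimension 2^n and ancilla dimension 2^{n_rho}

G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = G @ G.conj().T
rho /= np.trace(rho).real

# Purification |rho> = sum_i sqrt(p_i) |i>_anc |v_i>_sys from rho = sum_i p_i |v_i><v_i|.
p, V = np.linalg.eigh(rho)
p = np.clip(p, 0.0, None)
purification = sum(np.sqrt(p[i]) * np.kron(np.eye(d_anc)[i], V[:, i]) for i in range(d))
purification = purification / np.linalg.norm(purification)

O_rho = unitary_with_first_column(purification, rng)  # O_rho |0>|0> = |rho>

O_tilde = (np.kron(O_rho.conj().T, np.eye(d))
           @ np.kron(np.eye(d_anc), swap_matrix(d))
           @ np.kron(O_rho, np.eye(d)))
block = O_tilde[:d, :d]  # (<0^{n_rho + n}| x I_n) O_tilde (|0^{n_rho + n}> x I_n)

# Partial trace of |rho><rho| over the ancilla register.
psi = purification.reshape(d_anc, d)
reduced = psi.T @ psi.conj()
```

In this sketch both `reduced` and `block` reproduce `rho`, in line with the claim that the encoding is exact.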

For general matrices which need not be density operators, [CGJ18, GSLW19] also showed how to implement their block-encodings efficiently, assuming the existence of quantum random access memory (QRAM) [GLM08]. Given block-encodings of operators $A_{i}$, we can construct block-encodings of their linear combinations and products. For linear combinations, we make use of an auxiliary tool known as a ‘state preparation pair’. Recall that $\|\cdot\|_{1}$ is the $l_{1}$/Manhattan norm.

Definition 3.5 (State Preparation Pair).

Let $y\in\mathbb{C}^{m}$ and $\|y\|_{1}\leq\beta$. The pair of unitaries $(P_{L},P_{R})$ is called a $(\beta,b,\varepsilon_{\text{SP}})$-state-preparation-pair for $y$ if

$$P_{L}|0^{b}\rangle=\sum_{j=0}^{2^{b}-1}c_{j}|j\rangle,\quad P_{R}|0^{b}\rangle=\sum_{j=0}^{2^{b}-1}d_{j}|j\rangle$$

such that $\sum_{j=0}^{m-1}|y_{j}-\beta c_{j}^{*}d_{j}|\leq\varepsilon_{\text{SP}}$ and $c_{j}^{*}d_{j}=0$ for $j=m,\dots,2^{b}-1$.

One can think of a state preparation pair as encoding the desired state/vector $y$ in the first $m$ elements of a length-$2^{b}$ column vector whose elements are $c_{j}^{*}d_{j}$, up to an error of $\varepsilon_{\text{SP}}$. The role of $\beta$ is to take care of normalization.

Proposition 3.6 (Linear combination of block-encoded matrices – Lemma 52, [GSLW19]).

Let

  i. $A_{j},\; j=0,\dots,m-1$ be $n$-qubit operators with respective $(\alpha,a,\varepsilon_{\text{BE}})$-BEs $U_{j}$,

  ii. $A=\sum_{j=0}^{m-1}y_{j}A_{j}$ for $y:=(y_{0},\dots,y_{m-1})\in\mathbb{C}^{m}$,

  iii. $(P_{L},P_{R})$ be a $(\beta,b,\varepsilon_{\text{SP}})$-state-preparation-pair for $y$.

Then there exists an $(\alpha\beta,a+b,\alpha\varepsilon_{\text{SP}}+\beta\varepsilon_{\text{BE}})$-BE of $A$, given by

$$\widetilde{W}=(P_{L}^{\dagger}\otimes I_{a}\otimes I_{n})W(P_{R}\otimes I_{a}\otimes I_{n}),$$

where

$$W=\sum_{j=0}^{m-1}|j\rangle\langle j|\otimes U_{j}+\sum_{j=m}^{2^{b}-1}|j\rangle\langle j|\otimes I_{a}\otimes I_{n}$$

is an $(n+a+b)$-qubit unitary.

In Proposition 3.6, the subnormalization factors of the $A_{j}$’s are required to be the same. Later on, we will need a slight generalization of the above result whereby this requirement is dropped.
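Proposition 3.6 can be exercised with dense matrices. The sketch below is illustrative only and makes several assumptions that are not from the paper: Hermitian $A_j$ encoded exactly by a one-ancilla dilation, an exact state-preparation pair with $\beta=\|y\|_1$, $c_j=\sqrt{|y_j|/\beta}$ and $d_j=c_j e^{i\arg y_j}$, and unitaries $P_L,P_R$ obtained by QR completion. All function names are hypothetical.

```python
import numpy as np

def herm_fun(A, f):
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def block_encode_hermitian(A, alpha):
    """One-ancilla dilation [[B, S], [S, -B]] with B = A/alpha; needs ||A|| <= alpha."""
    B = A / alpha
    S = herm_fun(B, lambda w: np.sqrt(np.clip(1.0 - w**2, 0.0, None)))
    return np.block([[B, S], [S, -B]])

def unitary_with_first_column(v, rng):
    N = len(v)
    M = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    M[:, 0] = v
    Q, _ = np.linalg.qr(M)
    Q[:, 0] *= np.vdot(Q[:, 0], v)  # fix the phase so the first column equals v
    return Q

def state_preparation_pair(y, b, rng):
    """Exact (beta, b, 0)-state-preparation-pair for y with beta = ||y||_1."""
    beta = np.abs(y).sum()
    c = np.zeros(2**b, dtype=complex)
    d = np.zeros(2**b, dtype=complex)
    c[:len(y)] = np.sqrt(np.abs(y) / beta)
    d[:len(y)] = c[:len(y)] * np.exp(1j * np.angle(y))
    return unitary_with_first_column(c, rng), unitary_with_first_column(d, rng), beta

rng = np.random.default_rng(4)
dim, m, b, alpha = 2, 3, 2, 2.0  # n = 1 system qubit, m terms, b index qubits

def random_hermitian_with_norm(norm):
    M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    H = (M + M.conj().T) / 2
    return norm * H / np.linalg.norm(H, 2)

As = [random_hermitian_with_norm(1.5) for _ in range(m)]
Us = [block_encode_hermitian(A, alpha) for A in As]  # common subnormalization alpha
y = np.array([0.5, -1.0 + 0.5j, 0.25j])
A_target = sum(yj * Aj for yj, Aj in zip(y, As))

P_L, P_R, beta = state_preparation_pair(y, b, rng)

# W = sum_j |j><j| x U_j, padded with identities for j >= m.
D = Us[0].shape[0]
W = np.eye(2**b * D, dtype=complex)
for j, U in enumerate(Us):
    W[j * D:(j + 1) * D, j * D:(j + 1) * D] = U

W_tilde = np.kron(P_L.conj().T, np.eye(D)) @ W @ np.kron(P_R, np.eye(D))
error = np.linalg.norm(A_target - alpha * beta * W_tilde[:dim, :dim], 2)
```

Since both the block-encodings and the state-preparation pair are exact in this toy instance, `error` should vanish up to rounding, consistent with the bound $\alpha\varepsilon_{\text{SP}}+\beta\varepsilon_{\text{BE}}=0$.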

Proposition 3.7 (Generalized linear combination of block-encoded matrices).

Let

  i. $A_{j},\; j=0,\dots,m-1$ be $n$-qubit operators with respective $(\alpha_{j},a,\varepsilon_{\text{BE}})$-BEs $U_{j}$ for $\alpha:=(\alpha_{0},\dots,\alpha_{m-1})\in\mathbb{C}^{m}$,

  ii. $A=\sum_{j=0}^{m-1}y_{j}A_{j}$ for $y:=(y_{0},\dots,y_{m-1})\in\mathbb{C}^{m}$,

  iii. $(P_{L},P_{R})$ be a $(\beta,b,\varepsilon_{\text{SP}})$-state-preparation-pair for $\alpha\odot y$.

Then there exists a $(\beta,a+b,\frac{\beta}{\inf_{j}\alpha_{j}}\varepsilon_{\text{BE}}+\varepsilon_{\text{SP}})$-BE of $A$, given by

$$\widetilde{W}=(P_{L}^{\dagger}\otimes I_{a}\otimes I_{n})W(P_{R}\otimes I_{a}\otimes I_{n}),$$

where

$$W=\sum_{j=0}^{m-1}|j\rangle\langle j|\otimes U_{j}+\sum_{j=m}^{2^{b}-1}|j\rangle\langle j|\otimes I_{a}\otimes I_{n}$$

is an $(n+a+b)$-qubit unitary.
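The generalized statement can be checked in the same spirit. The sketch below is a toy instance under assumptions that are not from the paper: positive subnormalization factors `alphas`, exact one-ancilla dilations of Hermitian $A_j$, and an exact state-preparation pair for $\alpha\odot y$ with $\beta=\|\alpha\odot y\|_1$. All helper names are hypothetical. The only change relative to the equal-$\alpha$ case is that the pair encodes $\alpha\odot y$ and the overall subnormalization is $\beta$.

```python
import numpy as np

def herm_fun(A, f):
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def block_encode_hermitian(A, alpha):
    """One-ancilla dilation [[B, S], [S, -B]] with B = A/alpha; needs ||A|| <= alpha."""
    B = A / alpha
    S = herm_fun(B, lambda w: np.sqrt(np.clip(1.0 - w**2, 0.0, None)))
    return np.block([[B, S], [S, -B]])

def unitary_with_first_column(v, rng):
    N = len(v)
    M = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    M[:, 0] = v
    Q, _ = np.linalg.qr(M)
    Q[:, 0] *= np.vdot(Q[:, 0], v)  # fix the phase so the first column equals v
    return Q

def state_preparation_pair(z, b, rng):
    """Exact (beta, b, 0)-state-preparation-pair for z with beta = ||z||_1."""
    beta = np.abs(z).sum()
    c = np.zeros(2**b, dtype=complex)
    d = np.zeros(2**b, dtype=complex)
    c[:len(z)] = np.sqrt(np.abs(z) / beta)
    d[:len(z)] = c[:len(z)] * np.exp(1j * np.angle(z))
    return unitary_with_first_column(c, rng), unitary_with_first_column(d, rng), beta

rng = np.random.default_rng(5)
dim, b = 2, 2
alphas = np.array([1.0, 2.0, 3.5])  # a different subnormalization for each term
y = np.array([0.7, -0.4j, 0.2 + 0.1j])

As = []
for alpha_j in alphas:
    M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    H = (M + M.conj().T) / 2
    As.append(0.9 * alpha_j * H / np.linalg.norm(H, 2))  # ensures ||A_j|| <= alpha_j
Us = [block_encode_hermitian(A, a_j) for A, a_j in zip(As, alphas)]
A_target = sum(yj * Aj for yj, Aj in zip(y, As))

# The pair is prepared for the entrywise product alpha * y, not for y itself.
P_L, P_R, beta = state_preparation_pair(alphas * y, b, rng)

D = Us[0].shape[0]
W = np.eye(2**b * D, dtype=complex)
for j, U in enumerate(Us):
    W[j * D:(j + 1) * D, j * D:(j + 1) * D] = U

W_tilde = np.kron(P_L.conj().T, np.eye(D)) @ W @ np.kron(P_R, np.eye(D))
error = np.linalg.norm(A_target - beta * W_tilde[:dim, :dim], 2)
```

With exact inputs the error should again vanish up to rounding. The block extracted from `W_tilde` is $\sum_j c_j^{*}d_j A_j/\alpha_j$, so multiplying by $\beta$ restores $\sum_j y_j A_j$.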

Proof.

The following is adapted from the proof of Lemma 52, [GSLW19]. By definition of state-preparation pairs (see Definition 3.5), $P_{L}|0^{b}\rangle=\sum_{j=0}^{2^{b}-1}c_{j}|j\rangle$ and $P_{R}|0^{b}\rangle=\sum_{j=0}^{2^{b}-1}d_{j}|j\rangle$ such that $\sum_{j=0}^{m-1}|\alpha_{j}y_{j}-\beta c_{j}^{*}d_{j}|\leq\varepsilon_{\text{SP}}$. First we evaluate the block extraction of $\widetilde{W}$. We have

$$\begin{aligned}
&(\langle 0^{b+a}|\otimes I_{n})\widetilde{W}(|0^{b+a}\rangle\otimes I_{n}) \\
&= (\langle 0^{b+a}|\otimes I_{n})(P_{L}^{\dagger}\otimes I_{a}\otimes I_{n})\left(\sum_{j=0}^{m-1}|j\rangle\langle j|\otimes U_{j}+\sum_{j=m}^{2^{b}-1}|j\rangle\langle j|\otimes I_{a}\otimes I_{n}\right)(P_{R}\otimes I_{a}\otimes I_{n})(|0^{b+a}\rangle\otimes I_{n}) \\
&= \sum_{j=0}^{m-1}\langle 0^{b}|P_{L}^{\dagger}|j\rangle\langle j|P_{R}|0^{b}\rangle\cdot(\langle 0^{a}|\otimes I_{n})U_{j}(|0^{a}\rangle\otimes I_{n}) \\
&= \sum_{j=0}^{m-1}c_{j}^{*}d_{j}\cdot(\langle 0^{a}|\otimes I_{n})U_{j}(|0^{a}\rangle\otimes I_{n}).
\end{aligned}$$

In going from the first equality to the second, we have made use of the fact that for state preparation pairs $c_{j}^{*}d_{j}=0$ for $j=m,\dots,2^{b}-1$. The second summand in $W$ is thus annihilated. Therefore,

$$\begin{aligned}
\left\| A - \beta (\bra{0^{b+a}} \otimes I_n) \widetilde{W} (\ket{0^{b+a}} \otimes I_n) \right\|
&= \left\| A - \sum_{j=0}^{m-1} (\beta c_j^{*} d_j - \alpha_j y_j + \alpha_j y_j) \cdot (\bra{0^{a}} \otimes I_n) U_j (\ket{0^{a}} \otimes I_n) \right\| \\
&\leq \sum_{j=0}^{m-1} |\beta c_j^{*} d_j - \alpha_j y_j| + \left\| A - \sum_{j=0}^{m-1} \alpha_j y_j (\bra{0^{a}} \otimes I_n) U_j (\ket{0^{a}} \otimes I_n) \right\| \\
&\leq \varepsilon_{\text{SP}} + \left\| \sum_{j=0}^{m-1} y_j A_j - \sum_{j=0}^{m-1} y_j \alpha_j (\bra{0^{a}} \otimes I_n) U_j (\ket{0^{a}} \otimes I_n) \right\| \\
&\leq \varepsilon_{\text{SP}} + \sum_{j=0}^{m-1} |y_j| \left\| A_j - \alpha_j (\bra{0^{a}} \otimes I_n) U_j (\ket{0^{a}} \otimes I_n) \right\| \\
&\leq \varepsilon_{\text{SP}} + \sum_{j=0}^{m-1} |y_j| \, \varepsilon_{\text{BE}} \\
&\leq \varepsilon_{\text{SP}} + \frac{\beta}{\inf_j \alpha_j} \varepsilon_{\text{BE}},
\end{aligned}$$

where the last inequality was obtained using $\beta \geq \sum_{j=0}^{m-1} |\alpha_j y_j| \geq \sum_{j=0}^{m-1} (\inf_k \alpha_k) |y_j|$. ∎
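The identity used in the proof, namely that the top-left block of $\widetilde{W}$ equals $\sum_j c_j^{*} d_j (\bra{0^{a}} \otimes I_n) U_j (\ket{0^{a}} \otimes I_n)$, can be checked numerically on a small instance. The following is a minimal sketch under simplifying assumptions that are not part of the proposition: only $m = 2$ terms, exact block-encodings ($\alpha_j = 1$, $\varepsilon_{\text{BE}} = 0$), an exact state-preparation pair ($\varepsilon_{\text{SP}} = 0$), and real nonzero coefficients $y_j$. The helper names `hermitian_block_encoding`, `prep_unitary` and `lcu` are hypothetical.

```python
import numpy as np

def hermitian_block_encoding(A):
    """Exact one-ancilla block-encoding of a Hermitian A with ||A|| <= 1.

    Uses the standard dilation [[A, S], [S, -A]] with S = sqrt(I - A^2).
    """
    evals, evecs = np.linalg.eigh(A)
    s = np.sqrt(np.clip(1.0 - evals**2, 0.0, None))
    S = (evecs * s) @ evecs.conj().T
    return np.block([[A, S], [S, -A]])

def prep_unitary(v):
    """2x2 real rotation whose first column is the unit vector v."""
    return np.array([[v[0], -v[1]], [v[1], v[0]]])

def lcu(y, mats):
    """Return (beta, W) with beta * (top-left block of W) = y[0]*A_0 + y[1]*A_1.

    Assumes exactly two Hermitian matrices with norm <= 1 and nonzero real y.
    """
    beta = np.sum(np.abs(y))
    c = np.sqrt(np.abs(y) / beta)      # amplitudes prepared by P_L
    d = np.sign(y) * c                 # amplitudes prepared by P_R (carry the signs)
    P_L, P_R = prep_unitary(c), prep_unitary(d)
    U0, U1 = (hermitian_block_encoding(A) for A in mats)
    dim = U0.shape[0]
    zeros = np.zeros((dim, dim))
    select = np.block([[U0, zeros], [zeros, U1]])   # sum_j |j><j| (x) U_j
    eye = np.eye(dim)
    W = np.kron(P_L.conj().T, eye) @ select @ np.kron(P_R, eye)
    return beta, W

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
y = np.array([0.6, -0.8])
beta, W = lcu(y, [0.5 * X, Z])
top_left = W[:2, :2]                   # block selected by <0^{b+a}| ... |0^{b+a}>
print(np.allclose(beta * top_left, 0.6 * 0.5 * X - 0.8 * Z))
```

With inexact inputs, the two error terms $\varepsilon_{\text{SP}}$ and $\frac{\beta}{\inf_j \alpha_j}\varepsilon_{\text{BE}}$ in the bound above would show up as a nonzero deviation in the final comparison.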

Remark 3.8.

In the special case where the block-encodings of the $A_j$'s have the same subnormalization factors, i.e., $\alpha_j = \alpha$ for all $j$, we recover Proposition 3.6 from Proposition 3.7. To see this, observe that if $(P_L, P_R)$ is a $(\beta, b, \varepsilon_{\text{SP}})$-state-preparation-pair for $\alpha \odot y$, then $\sum_j |\alpha_j y_j - \beta c_j^{*} d_j| \leq \varepsilon_{\text{SP}} \implies \sum_j |\alpha y_j - \beta c_j^{*} d_j| \leq \varepsilon_{\text{SP}} \implies \sum_j |y_j - \frac{\beta}{\alpha} c_j^{*} d_j| \leq \frac{\varepsilon_{\text{SP}}}{\alpha}$, thus implying $(P_L, P_R)$ is a $(\frac{\beta}{\alpha}, b, \frac{\varepsilon_{\text{SP}}}{\alpha})$-state-preparation-pair for $y$. According to Proposition 3.6, $\widetilde{W}$ is then a $(\alpha \cdot \frac{\beta}{\alpha},\; a+b,\; \alpha \cdot \frac{\varepsilon_{\text{SP}}}{\alpha} + \frac{\beta}{\alpha} \varepsilon_{\text{BE}})$-BE of $A$. This is in agreement with Proposition 3.7.

We now arrive at a milestone within the QSVT framework: the ability to implement block-encodings of polynomials of a matrix from a given block-encoding of the matrix. In many applications, however, the functions of interest are not polynomials. In such cases, one first has to approximate the desired function by a polynomial in order to apply QSVT/QET.

Theorem 3.9 (Polynomial Eigenvalue Transformation – Theorem 56, [GSLW19]).

Let $U$ be an $(\alpha, a, \varepsilon)$-encoding of a Hermitian matrix $A$ (equivalently, a $(1, a, \varepsilon/\alpha)$-encoding of $A/\alpha$) and $P \in \mathbb{R}[x]$ be a degree-$d$ polynomial satisfying $|P(x)| \leq \frac{1}{2}$ on $[-1,1]$. Then, one can construct a quantum circuit $\tilde{U}$ which is a $(1, a+2, 4d\sqrt{\varepsilon/\alpha})$-encoding of $P(A/\alpha)$. $\tilde{U}$ consists of $d$ $U$ and $U^{\dagger}$ gates, one controlled-$U$, and $\mathcal{O}((a+1)d)$ other one- and two-qubit gates.

Proposition 3.10 (Bounded Polynomial Approximation – Corollary 66, [GSLW19]).

Let $x_0 \in [-1,1]$, $r \in (0,2]$, $\delta \in (0,r]$ and let $f: [x_0 - r - \delta, x_0 + r + \delta] \longrightarrow \mathbb{C}$ be such that $f(x) = \sum_{l=0}^{\infty} a_l (x - x_0)^{l}$ for all $x \in [x_0 - r - \delta, x_0 + r + \delta]$. Suppose $B > 0$ is such that $\sum_{l=0}^{\infty} (r+\delta)^{l} |a_l| \leq B$. Let $\varepsilon \in (0, \frac{1}{2B}]$, then there is an efficiently computable polynomial $P \in \mathbb{C}[x]$ of degree $\mathcal{O}\left(\frac{1}{\delta} \log\left(\frac{B}{\varepsilon}\right)\right)$ such that

$$\begin{aligned}
\|f(x) - P(x)\|_{[x_0 - r,\, x_0 + r]} &\leq \varepsilon && (3.1) \\
\|P(x)\|_{[-1,1]} &\leq \varepsilon + \|f(x)\|_{[x_0 - r - \delta/2,\, x_0 + r + \delta/2]} \leq \varepsilon + B && (3.2) \\
\|P(x)\|_{[-1,1] \setminus [x_0 - r - \delta/2,\, x_0 + r + \delta/2]} &\leq \varepsilon. && (3.3)
\end{aligned}$$

If we choose $B$ sufficiently large such that $\frac{1}{2B} < 1$, then we also have an $\varepsilon$-independent bound on $P(x)$: $\|P(x)\|_{[-1,1]} \leq 1 + B$.

Theorem 3.9 and Proposition 3.10 are to be used in conjunction to produce block-encodings of general functions of Hermitian matrices. In doing so, we first note that Theorem 3.9 produces an encoding of $P(A/\alpha)$, not $P(A)$. Thus, with a polynomial approximation of $f$, say $P(x) \approx f(x)$, it is generally not true that $P(A/\alpha) \approx f(A)$. What we need is a polynomial approximation not of $f$, but of a (horizontally) scaled version of $f$, $f'(x) := f(\alpha x)$, so that $P(x) \approx f'(x) \implies P(A/\alpha) \approx f'(A/\alpha) = f(A)$. Second, we also have to take into account the polynomial approximation error incurred in producing the final desired block-encoding of $f(A)$. We take care of these matters in Corollary 3.11, which, given the block-encoding of an arbitrary Hermitian matrix $A$, produces a block-encoding of $f(A)$, where $f$ is a generic real-valued function.
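The need for the rescaled target $x \mapsto f(\alpha x)$ can be illustrated classically. The sketch below uses Chebyshev interpolation on $[-1,1]$ as a stand-in for the polynomial of Proposition 3.10 (a swapped-in technique chosen for brevity, not the construction of [GSLW19]), and evaluates matrix functions by eigendecomposition rather than by QSVT. The function names and the test matrix are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def apply_to_hermitian(g, A):
    """Evaluate g(A) for Hermitian A through its eigendecomposition."""
    evals, evecs = np.linalg.eigh(A)
    return (evecs * g(evals)) @ evecs.conj().T

def poly_error(f, A, alpha, degree, rescale=True):
    """Spectral-norm error || P(A/alpha) - f(A) ||.

    rescale=True : P interpolates x -> f(alpha * x) on [-1, 1].
    rescale=False: P interpolates f itself (the naive choice).
    """
    target = (lambda x: f(alpha * x)) if rescale else f
    P = C.Chebyshev.interpolate(target, degree)
    approx = apply_to_hermitian(P, A / alpha)
    exact = apply_to_hermitian(f, A)
    return np.linalg.norm(approx - exact, 2)

A = np.array([[1.0, 0.5], [0.5, -1.5]])   # Hermitian, ||A|| < 2
alpha = 2.0                               # subnormalization, alpha >= ||A||
print(poly_error(np.exp, A, alpha, degree=16, rescale=True))   # small
print(poly_error(np.exp, A, alpha, degree=16, rescale=False))  # order one
```

In the naive case the polynomial is an excellent approximation of $f$, yet $P(A/\alpha) \approx f(A/\alpha) \neq f(A)$, which is exactly the mismatch described above.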

Corollary 3.11 (Block-encoding functions of general Hermitian matrices).

Given

  i. A Hermitian matrix $A$ with $\lambda_{\min} \leq A \leq \lambda_{\max}$, $-\infty < \lambda_{\min} < \lambda_{\max} < \infty$, and $U$, an $(\alpha, a, \varepsilon)$-encoding of $A$.

  ii. $f: I \longrightarrow \mathbb{R}$, a smooth function on an open interval $I$ containing $[\lambda_{\min}, \lambda_{\max}]$. Assume the function $x \mapsto f(\alpha x)$ satisfies the conditions in Proposition 3.10 with $[\lambda_{\min}/\alpha, \lambda_{\max}/\alpha] \subseteq [x_0 - r, x_0 + r]$ and series-of-coefficients bound $B$.

  iii. Polynomial approximation error tolerance for $f$: $\varepsilon_{\text{poly}} \in (0, \frac{1}{2}]$.

Then there exists a quantum circuit $U_f$ which is a $\left(2(1+B),\; a+2,\; \varepsilon_{\text{poly}} + 2(1+B)(4d\sqrt{\varepsilon/\alpha})\right)$-encoding of $f(A)$. The construction of $U_f$ makes $d = \mathcal{O}\left(\frac{1}{\delta} \log \frac{B}{\varepsilon_{\text{poly}}}\right)$ queries to $U$.

Proof.

First, $\alpha \geq \|A\| = \max\{|\lambda_{\min}|, |\lambda_{\max}|\}$. Define the scaling map $t_{\alpha}: x \mapsto x/\alpha$, so that under this map $[\lambda_{\min}, \lambda_{\max}] \mapsto [\lambda_{\min}/\alpha, \lambda_{\max}/\alpha]$. By assumption on $f$ there exist $x_0 \in [-1,1]$, $r \in (0,2]$, $\delta \in (0,r]$ such that (i.) $[\lambda_{\min}/\alpha, \lambda_{\max}/\alpha] \subseteq [x_0 - r, x_0 + r]$, (ii.) $f \circ t_{\alpha}^{-1}(x) = \sum_{l=0}^{\infty} a_l (x - x_0)^{l}$ on $[x_0 - r - \delta, x_0 + r + \delta]$ and (iii.) $\sum_{l=0}^{\infty} (r+\delta)^{l} |a_l| \leq B$ for some $B > 0$.

By Proposition 3.10, given polynomial approximation error tolerance $\varepsilon_{\text{poly}}$ there exists a polynomial $Q \in \mathbb{C}[x]$ of degree $\mathcal{O}\left(\frac{1}{\delta} \log\left(\frac{B}{\varepsilon_{\text{poly}}}\right)\right)$ which $\varepsilon_{\text{poly}}$-approximates $f \circ t_{\alpha}^{-1}$ on $[x_0 - r, x_0 + r]$ and is bounded above by $1 + B$ on $[-1,1]$. Since the spectrum of $A/\alpha$ lies in $[\lambda_{\min}/\alpha, \lambda_{\max}/\alpha] \subseteq [x_0 - r, x_0 + r]$, we have

$$\left\| f \circ t_{\alpha}^{-1}\left(\frac{A}{\alpha}\right) - Q\left(\frac{A}{\alpha}\right) \right\| \leq \left\| f \circ t_{\alpha}^{-1}(x) - Q(x) \right\|_{[x_0 - r,\, x_0 + r]} \leq \varepsilon_{\text{poly}}.$$

In order to apply Theorem 3.9, our polynomial has to be real and upper-bounded by $1/2$ on $[-1,1]$. Observe that for any complex-valued function $F$ and domain $S$,

$$\|F\|_{S} = \sup_{x \in S} |F(x)| = \sup_{x \in S} \sqrt{(\operatorname{Re} F(x))^{2} + (\operatorname{Im} F(x))^{2}} \geq \sup_{x \in S} |\operatorname{Re} F(x)| = \|\operatorname{Re} F\|_{S}.$$

Since $f$ itself is real-valued, $\operatorname{Re} Q \in \mathbb{R}[x]$ is qualified to assume the role of $P$ in Proposition 3.10. That is, the real polynomial $\operatorname{Re} Q$ also $\varepsilon_{\text{poly}}$-approximates $f \circ t_{\alpha}^{-1}$ on $[x_0 - r, x_0 + r]$ and is bounded above by $1 + B$ on $[-1,1]$. Thus, letting $P \leftarrow \frac{\operatorname{Re} Q}{2(1+B)}$ in Theorem 3.9 we obtain $\tilde{U}$, a $(1, a+2, 4d\sqrt{\varepsilon/\alpha})$-encoding of $\frac{\operatorname{Re} Q}{2(1+B)}(A/\alpha)$, where $d = \mathcal{O}\left(\frac{1}{\delta} \log\left(\frac{B}{\varepsilon_{\text{poly}}}\right)\right)$. Putting these together and noting that $f \circ t_{\alpha}^{-1}(\frac{A}{\alpha}) = f(A)$, we have

$$\begin{aligned}
&\left\| \frac{f(A)}{2(1+B)} - (\bra{0^{a+2}} \otimes I) \tilde{U} (\ket{0^{a+2}} \otimes I) \right\| \\
&\qquad \leq \left\| \frac{f \circ t_{\alpha}^{-1}(\frac{A}{\alpha})}{2(1+B)} - \frac{\operatorname{Re} Q(\frac{A}{\alpha})}{2(1+B)} \right\| + \left\| \frac{\operatorname{Re} Q(\frac{A}{\alpha})}{2(1+B)} - (\bra{0^{a+2}} \otimes I) \tilde{U} (\ket{0^{a+2}} \otimes I) \right\| \\
&\qquad \leq \frac{\varepsilon_{\text{poly}}}{2(1+B)} + 4d\sqrt{\varepsilon/\alpha}.
\end{aligned}$$

Multiplying through by $2(1+B)$, we see that choosing $U_f = \tilde{U}$ gives us a $\left(2(1+B),\; a+2,\; \varepsilon_{\text{poly}} + 2(1+B)(4d\sqrt{\varepsilon/\alpha})\right)$-encoding of $f(A)$. ∎
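To get a feel for how the input error $\varepsilon$ propagates, the parameter triple of Corollary 3.11 can be evaluated for concrete numbers. The helper below is only a bookkeeping sketch: its name is hypothetical, and the degree $d$ must be supplied explicitly because the corollary fixes it only up to the constant hidden in the $\mathcal{O}$-notation.

```python
import math

def corollary_3_11_parameters(alpha, a, eps, B, eps_poly, d):
    """Return (subnormalization, ancilla count, error) for the encoding of f(A).

    alpha, a, eps : parameters of the input (alpha, a, eps)-encoding of A
    B             : series-of-coefficients bound from Proposition 3.10
    eps_poly      : polynomial approximation tolerance
    d             : polynomial degree (an explicit input here, since only
                    its O(.) scaling is specified)
    """
    qsvt_error = 4 * d * math.sqrt(eps / alpha)   # error term from Theorem 3.9
    scale = 2 * (1 + B)                           # from P = Re Q / (2(1+B))
    return scale, a + 2, eps_poly + scale * qsvt_error

print(corollary_3_11_parameters(alpha=2.0, a=3, eps=1e-12, B=1.5,
                                eps_poly=1e-3, d=20))
```

Because the input error enters under a square root and is multiplied by both $d$ and $2(1+B)$, it has to be much smaller than $\varepsilon_{\text{poly}}$ for the second term not to dominate.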

4 Implementation on quantum computers

In this section, we provide a quantum algorithm implementing the quantum Esscher Transform, based on block-encodings and QSVT. We assume the inputs come in the form of block-encodings. Our algorithm outputs the Esscher-transformed state in block-encoded form, from which the physical state itself can subsequently be obtained.

Reference [GSLW19] demonstrates how to construct block-encodings for density operators $\rho$ within the purified quantum query-access model (see Definition 3.3 and Proposition 3.4 above). For the Hermitian operators $H_i$, which are generally not density operators, block-encodings can be constructed efficiently for many physical Hamiltonians, or if the $H_i$'s are stored in sparse data structures or KP trees. Along the way we shall also need, as an auxiliary tool, ‘state-preparation pairs’ (see Definition 3.5) to prepare linear combinations of the Hamiltonians. We assume immediate access to these, as we do for block-encodings. For the construction of state-preparation pairs, one can refer to [vAG18].

4.1 Technical lemmas

The logarithm of the density matrix $\rho$ is a key ingredient of the quantum Esscher transform. Here we provide a technical lemma on constructing a block-encoding of the logarithm of a density matrix from the block-encoding of that matrix.

Lemma 4.1 (Block-encoding of $\log\rho$).

Let $U_\rho$ be a $(1,a,0)$-BE of an $n$-qubit density operator $\rho$ satisfying $\frac{1}{\kappa} \leq \rho \leq 1$, where $\kappa > 1$, and let $\varepsilon_{\text{poly}} > 0$ be a polynomial approximation error tolerance. Then there is a $\left(2(1+\log 2\kappa),\; a+2,\; \varepsilon_{\text{poly}}\right)$-BE of $\log\rho$, the construction of which makes $\mathcal{O}\left(\kappa\log\left(\frac{\log\kappa}{\varepsilon_{\text{poly}}}\right)\right)$ queries to $U_\rho$.

Proof.

First we construct a polynomial approximation of $\log x$. More specifically, we check that the function $\log x$ satisfies the conditions of Proposition 3.10, with the appropriate $x_0, r, \delta$ and $B$. Corollary 3.11 then gives us the desired block-encoding.

The following derivation is based on the proof of Corollary 67, [GSLW19] and Lemma 11, [GL19]. Negative power functions $x^{-c}$ share with $\log x$ the common property of going to infinity as $x$ approaches $0$, thus the Taylor expansions of these functions are performed about $x = 1$. Choose $x_0 = 1$, $r = 1 - \frac{1}{\kappa}$ and $\delta = \frac{1}{2\kappa}$. The Taylor series of $\log x$ about $x = 1$ is $\log x = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}(x-1)^{k}$. With $a_k = \frac{(-1)^{k+1}}{k}$, the series-of-coefficients bound $B$ in Proposition 3.10 is

$$\sum_{k=1}^{\infty}(r+\delta)^{k}|a_{k}| = \sum_{k=1}^{\infty}\frac{(1-1/2\kappa)^{k}}{k} = \sum_{k=1}^{\infty}\frac{(-1)^{k}}{k}\left(\frac{1}{2\kappa}-1\right)^{k} = -\log\frac{1}{2\kappa} = \log 2\kappa =: B.$$

Corollary 3.11 gives us the unitary $U_{\log\rho}$, which is a $\left(2(1+\log 2\kappa),\; a+2,\; \varepsilon_{\text{poly}}\right)$-encoding of $\log\rho$, which can be constructed using $\mathcal{O}\left(\kappa\log\left(\frac{\log\kappa}{\varepsilon_{\text{poly}}}\right)\right)$ queries to $U_\rho$. ∎
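The closed form for $B$ and the convergence of the Taylor series on $[\frac{1}{\kappa}, 1]$ can be checked numerically. The following is a minimal sketch, assuming natural logarithms; the function names, the value $\kappa = 4$ and the truncation degree are illustrative choices, and the plain Taylor truncation is only a stand-in for the polynomial supplied by Proposition 3.10.

```python
import math


def coefficient_bound(kappa, terms=2000):
    """Partial sum of sum_{k>=1} (r + delta)^k * |a_k| with |a_k| = 1/k."""
    r = 1.0 - 1.0 / kappa
    delta = 1.0 / (2.0 * kappa)
    y = r + delta  # equals 1 - 1/(2*kappa), strictly below 1
    total, power = 0.0, 1.0
    for k in range(1, terms + 1):
        power *= y
        total += power / k
    return total


def log_taylor(x, degree):
    """Truncated Taylor series of log(x) about x = 1."""
    total, power, sign = 0.0, 1.0, 1.0
    for k in range(1, degree + 1):
        power *= (x - 1.0)
        total += sign * power / k
        sign = -sign
    return total


kappa = 4.0  # illustrative condition number (assumption)
B_numeric = coefficient_bound(kappa)
B_closed = math.log(2.0 * kappa)

# Worst-case truncation error on a grid over the spectrum range [1/kappa, 1].
degree = 60  # illustrative truncation degree (assumption)
grid = [1.0 / kappa + i * (1.0 - 1.0 / kappa) / 400 for i in range(401)]
max_err = max(abs(math.log(x) - log_taylor(x, degree)) for x in grid)

print(B_numeric, B_closed, max_err)
```

The slowest convergence occurs at $x = \frac{1}{\kappa}$, where $|x - 1| = r$ is largest; this is the source of the linear dependence on $\kappa$ in the query count.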

Next, we provide a lemma to construct the block-encoding of an exponentiated matrix from the block-encoding of that matrix.

Lemma 4.2 (Block-encoding of $e^{H}$).

Given $U_H$, a $(\alpha, a, \varepsilon)$-BE of $H$ and polynomial approximation error tolerance $\varepsilon_{\text{poly}} > 0$, there is a $\left(4,\; a+2,\; \varepsilon_{\text{poly}} + 16t\sqrt{\varepsilon/\alpha}\right)$-BE of $e^{H}/e^{\alpha}$, constructible using $t$ queries to $U_H$. Here

$$t = \mathcal{O}\left(\sqrt{\max\left(\alpha, \log\frac{1}{\varepsilon_{\text{poly}}}\right)\log\frac{1}{\varepsilon_{\text{poly}}}}\right).$$

Proof.

By Corollary 64, [GSLW19], there exists $P \in \mathbb{R}[x]$ of degree $t = \mathcal{O}\left(\sqrt{\max\left(\alpha, \log\frac{1}{\varepsilon_{\text{poly}}}\right)\log\frac{1}{\varepsilon_{\text{poly}}}}\right)$ such that $\left\|\frac{e^{\alpha x}}{e^{\alpha}} - P(x)\right\|_{[-1,1]} \leq \varepsilon_{\text{poly}}$. Furthermore $\|P(x)\| \leq \left\|\frac{e^{\alpha x}}{e^{\alpha}} - P(x)\right\|_{[-1,1]} + \left\|\frac{e^{\alpha x}}{e^{\alpha}}\right\|_{[-1,1]} \leq 1 + B$, where $B = 1$.
Applying Corollary 3.11 with $f(x) = \frac{e^{x}}{e^{\alpha}}$ gives a $\left(4,\; a+2,\; \varepsilon_{\text{poly}} + 16t\sqrt{\varepsilon/\alpha}\right)$-encoding of $e^{H}/e^{\alpha}$, making $t$ queries to $U_H$. ∎
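To make the two polynomial requirements in this proof concrete (uniform error at most $\varepsilon_{\text{poly}}$ on $[-1,1]$ and $|P(x)| \leq 1 + B = 2$), the sketch below uses a truncated Taylor polynomial of $e^{\alpha x}/e^{\alpha}$. This is an assumption made for illustration only: it is not the polynomial of Corollary 64, [GSLW19], and its degree is generally not optimal. The function names and the values of $\alpha$ and $\varepsilon_{\text{poly}}$ are hypothetical.

```python
import math


def scaled_exp_taylor(x, alpha, degree):
    """Degree-`degree` Taylor polynomial of exp(alpha * x), scaled by exp(-alpha)."""
    term, total = 1.0, 1.0
    for k in range(1, degree + 1):
        term *= alpha * x / k
        total += term
    return math.exp(-alpha) * total


def sup_error(alpha, degree, points=2001):
    """Largest deviation from exp(alpha * (x - 1)) on a uniform grid over [-1, 1]."""
    worst = 0.0
    for i in range(points):
        x = -1.0 + 2.0 * i / (points - 1)
        exact = math.exp(alpha * (x - 1.0))
        worst = max(worst, abs(exact - scaled_exp_taylor(x, alpha, degree)))
    return worst


def smallest_degree(alpha, eps_poly, max_degree=200):
    """Smallest Taylor degree meeting the error tolerance on the grid."""
    for degree in range(max_degree + 1):
        if sup_error(alpha, degree) <= eps_poly:
            return degree
    raise ValueError("tolerance not reached within max_degree")


alpha = 3.0      # illustrative normalization of the block-encoding (assumption)
eps_poly = 1e-6  # illustrative tolerance (assumption)
degree = smallest_degree(alpha, eps_poly)

# The polynomial must stay bounded by 1 + B = 2 on [-1, 1].
max_abs = max(
    abs(scaled_exp_taylor(-1.0 + 2.0 * i / 2000, alpha, degree)) for i in range(2001)
)
print(degree, sup_error(alpha, degree), max_abs)
```

Dividing by $e^{\alpha}$ is what keeps the target function, and hence the polynomial, bounded on $[-1,1]$; without it the values would grow as large as $e^{\alpha}$.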

4.2 Algorithm

We now provide the algorithm implementing the quantum Esscher transform; see Algorithm 1. We specify the constraints on the inputs and the guarantees on the output in the algorithm itself. A step-by-step analysis of Algorithm 1 is provided below, after which the overall query complexity is stated. We summarize this information in Theorem 4.3.

Theorem 4.3.

Suppose we are given the block-encodings of $\rho$ and $H_j$, $j \in [d]$, parameters $\theta \in \mathbb{R}^{d}$ and error tolerance $\varepsilon$ as specified in Algorithm 1. Then Algorithm 1 outputs an $\varepsilon$-approximate block-encoding of the (subnormalized) quantum Esscher transform $\sigma = \frac{e^{\sum_i \theta_i H_i + \log\rho}}{\mathcal{N}}$, making

$$\widetilde{\mathcal{O}}\left(\kappa\log^{2}\left(\frac{1}{\varepsilon}\right)\right)$$

queries to $U_\rho$ and

$$\mathcal{O}\left(\log\frac{1}{\varepsilon}\right)$$

queries to each $U_j$.

Algorithm 1 Quantum Esscher Transform via QSVT – QEsscher($\rho, H, \theta$)
Input:
- Unitary $O_\rho$ preparing the purification of the $n$-qubit density operator $\frac{1}{\kappa} \leq \rho \leq 1$ using $n_\rho$ ancillary qubits
- Quantum circuits $U_j$ which are $(1, a, \varepsilon_{\text{BE}})$-BEs of $H_j$ for $j \in [d]$, where $\varepsilon_{\text{BE}} = \left(\frac{\varepsilon}{8\log\frac{1}{\varepsilon}}\right)^{2}$
- Parameters $\theta \in \mathbb{R}^{d}$
- Output block-encoding error $0 < \varepsilon < 2^{-\|\theta\|_1 - 2(1+\log 2\kappa)}$.
Output: A $(1,\; \max\{a, n+n_\rho\} + \lceil\log d\rceil + 4,\; \varepsilon)$-BE of
$$\sigma = \frac{e^{\sum_i \theta_i H_i + \log\rho}}{\mathcal{N}},$$
where $\mathcal{N} = 4e^{\|\theta\|_1 + 2(1+\log 2\kappa)}$ is a subnormalization factor.
1: Use $O_\rho$ to construct $U_\rho$, a $(1, n+n_\rho, 0)$-BE of $\rho$.
2: Construct $U_{\log\rho}$, a $(2(1+\log 2\kappa),\; n+n_\rho+2,\; \varepsilon_{\text{BE}})$-BE of $\log\rho$. This makes $t = \mathcal{O}\left(\kappa\log\left(\frac{\log\kappa}{\varepsilon_{\text{BE}}}\right)\right)$ queries to $U_\rho$, see Lemma 4.1.
3: Construct the $(\beta, b, \varepsilon_{\text{SP}})$-state-preparation-pair $(P_L, P_R)$ for $\alpha \odot \theta$, where
   $\beta \leftarrow \|\theta\|_1 + 2(1+\log 2\kappa)$
   $b \leftarrow \lceil\log d\rceil$
   $\varepsilon_{\text{SP}} \leftarrow \beta\varepsilon_{\text{BE}}$
4: Using $(P_L, P_R)$, combine $U_{\log\rho}$ and $U_j$, $j \in [d]$ to give $U_H$, a $(\beta,\; \max\{a, n+n_\rho\} + 2 + \lceil\log d\rceil,\; 2\beta\varepsilon_{\text{BE}})$-BE of $H := \sum_i \theta_i H_i + \log\rho$. This makes 1 query to $(P_L, P_R)$ and 1 query to $U_{\log\rho}$ and each $U_j$, see Proposition 3.7.
5: Construct $U_\sigma$, a $(1,\; \max\{a, n+n_\rho\} + 4 + \lceil\log d\rceil,\; \varepsilon)$-BE of $\sigma := e^{H}/\mathcal{N}$. This makes $t = \mathcal{O}\left(\log\frac{1}{\varepsilon}\right)$ queries to $U_H$, see Lemma 4.2.
6: return $U_\sigma$.
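For intuition about the output of Algorithm 1, the sketch below evaluates $\sigma$ classically in the special case where $\rho$ and all $H_i$ are diagonal in the same basis, so that the matrix exponential and logarithm act entrywise. This is an illustrative assumption, not part of the quantum algorithm; the function name and example values are hypothetical, natural logarithms are assumed, and the subnormalization $\mathcal{N} = 4e^{\beta}$ follows Step 5 of the proof of Theorem 4.3. Renormalizing the diagonal of $\sigma$ recovers the classical Esscher transform $p_x e^{\sum_i \theta_i h_i(x)} / \sum_y p_y e^{\sum_i \theta_i h_i(y)}$.

```python
import math


def esscher_diagonal(p, h_list, theta, kappa):
    """Diagonal of sigma = exp(sum_i theta_i H_i + log rho) / N for commuting inputs.

    p       : diagonal of rho (probabilities, each at least 1/kappa)
    h_list  : list of diagonals of the H_i (entries in [-1, 1])
    theta   : list of real parameters, one per H_i
    Returns (sigma_diagonal, renormalized_distribution).
    """
    beta = sum(abs(t) for t in theta) + 2.0 * (1.0 + math.log(2.0 * kappa))
    norm = 4.0 * math.exp(beta)  # subnormalization N = 4 e^beta (Step 5 of the proof)
    weights = [
        p_x * math.exp(sum(t * h[x] for t, h in zip(theta, h_list)))
        for x, p_x in enumerate(p)
    ]
    sigma = [w / norm for w in weights]
    total = sum(weights)
    classical = [w / total for w in weights]
    return sigma, classical


# Illustrative inputs (assumptions): a 3-outcome distribution and one observable.
p = [0.5, 0.3, 0.2]
kappa = 5.0                # smallest eigenvalue of rho is 1/kappa
h_list = [[1.0, 0.0, -1.0]]
theta = [0.7]

sigma, classical = esscher_diagonal(p, h_list, theta, kappa)
print(sigma, classical)
```

With $\theta_1 > 0$ the renormalized distribution shifts weight toward outcomes where $h_1$ is large, while the trace of the subnormalized $\sigma$ stays well below $1$.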
Proof of Theorem 4.3.

Now we analyze the steps of Algorithm 1 in more detail to give the query complexity of QEsscher($\rho, H, \theta$).

Step 1. From Proposition 3.4 we construct $U_\rho = \widetilde{O_\rho} := (O_\rho^{\dagger} \otimes I_n)(I_{n+n_\rho} \otimes \text{SWAP}_n)(O_\rho \otimes I_n)$, a $(1, n+n_\rho, 0)$-BE of $\rho$. This makes $\mathcal{O}(1)$ queries to $O_\rho$.

Step 2. This step entails a polynomial approximation to the logarithm function on the interval $[\frac{1}{\kappa}, 1]$. Denote by $\varepsilon_{\text{poly}}$ the approximation error tolerance. Choose $\varepsilon_{\text{poly}} \leq \varepsilon_{\text{BE}}$. Lemma 4.1 gives $U_{\log\rho}$, a $(2(1+\log 2\kappa),\; n+n_\rho+2,\; \varepsilon_{\text{BE}})$-BE of $\log\rho$. The construction of $U_{\log\rho}$ makes $t = \mathcal{O}\left(\kappa\log\left(\frac{\log\kappa}{\varepsilon_{\text{BE}}}\right)\right)$ queries to $U_\rho$, where $t$ is the degree of the approximating polynomial (see Proposition 3.10/Corollary 3.11).

Step 3. Construct a $(\beta, b, \varepsilon_{\text{SP}})$-state-preparation-pair $(P_L, P_R)$ for $\alpha \odot \theta \in \mathbb{R}^{d+1}$, where $\alpha = (1^{d}, 2(1+\log 2\kappa))$ and $\theta = (\theta_1, \dots, \theta_d, 1)$ (see Proposition 3.7). Choose $\beta = \|\alpha \odot \theta\|_1 = \|\theta\|_1 + 2(1+\log 2\kappa)$. $b$ has to be such that $d+1 \leq 2^{b}$, so choose $b = \lceil\log d\rceil$. Finally, choose $\varepsilon_{\text{SP}} \leq \beta\varepsilon_{\text{BE}}$. The construction of $(P_L, P_R)$ can be achieved using $\mathcal{O}(d)$ elementary gates [BCC+15].

Step 4. Now we make use of our access to the state-preparation-pair $(P_L, P_R)$. To form linear combinations of block-encodings, the number of ancilla qubits required for each constituent block-encoding should be the same, see Proposition 3.6/3.7. Remark 3.2 shows that we can always equalize this number of ancilla qubits by padding with additional ancillas. The equalized number of ancillas is $\max\{a, n+n_\rho+2\} \leq \max\{a, n+n_\rho\} + 2$. We could also take $a + n + n_\rho + 2$, but we want to minimize the number of ancilla qubits.
From Proposition 3.7 we get $U_H$, a $(\beta,\; \max\{a, n+n_\rho\} + 2 + \lceil\log d\rceil,\; 2\beta\varepsilon_{\text{BE}})$-BE of $H := \sum_i \theta_i H_i + \log\rho$, making 1 query to $(P_L, P_R)$ and 1 query to $U_{\log\rho}$ and each $U_j$.
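The bookkeeping of Steps 3 and 4 can be sketched classically. The code below assumes that Definition 3.5 follows the usual convention of [GSLW19], in which a state-preparation pair with amplitudes $c_j, d_j$ for a vector $y$ satisfies $\sum_j |\beta c_j^{*} d_j - y_j| \leq \varepsilon_{\text{SP}}$, and it uses the standard exact choice $c_j = \sqrt{|y_j|/\beta}$, $d_j = \operatorname{sign}(y_j)\sqrt{|y_j|/\beta}$. Scalars stand in for the block-encoded operators, natural logarithms are assumed, and all names and numerical values are hypothetical.

```python
import math


def state_preparation_amplitudes(y):
    """Exact amplitudes (c, d) with beta * c_j * d_j = y_j and beta = ||y||_1."""
    beta = sum(abs(v) for v in y)
    c = [math.sqrt(abs(v) / beta) for v in y]
    d = [math.copysign(math.sqrt(abs(v) / beta), v) for v in y]
    return beta, c, d


kappa = 5.0          # illustrative condition number (assumption)
theta = [0.7, -0.4]  # illustrative parameters, d = 2 (assumption)

# alpha holds the normalizations of the constituent block-encodings:
# 1 for each H_j and 2(1 + log 2kappa) for log(rho).
alpha = [1.0] * len(theta) + [2.0 * (1.0 + math.log(2.0 * kappa))]
theta_ext = theta + [1.0]
y = [a * t for a, t in zip(alpha, theta_ext)]

beta, c, d = state_preparation_amplitudes(y)

# Scalar stand-ins for the operators: each block-encoding stores A_j / alpha_j.
h_values = [0.9, -0.3]        # eigenvalues of H_1, H_2 in [-1, 1]
log_rho_value = math.log(0.5)  # an eigenvalue of log(rho) in [log(1/kappa), 0]
encoded = h_values + [log_rho_value / alpha[-1]]

# The linear combination places sum_j c_j d_j (A_j / alpha_j) in the top-left block.
lcu_block = sum(cj * dj * e for cj, dj, e in zip(c, d, encoded))
recovered = beta * lcu_block
target = sum(t * h for t, h in zip(theta, h_values)) + log_rho_value
print(beta, recovered, target)
```

Multiplying the encoded block by $\beta$ returns $\sum_i \theta_i H_i + \log\rho$, which is why $\beta$ appears as the normalization of $U_H$.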

Step 5. Finally, we construct a block-encoding for $e^{H}/\mathcal{N}$. At this stage, we have a $(\beta,\; \max\{a, n+n_\rho\} + 2 + \lceil\log d\rceil,\; 2\beta\varepsilon_{\text{BE}})$-BE of $H$. Lemma 4.2 gives a $(1,\; \max\{a, n+n_\rho\} + \lceil\log d\rceil + 4,\; \varepsilon_{\text{poly}}/4 + 4t\sqrt{2\varepsilon_{\text{BE}}})$-BE of $\sigma = e^{H}/(4e^{\beta})$ (thus $\mathcal{N} = 4e^{\beta}$), where $t = \mathcal{O}\left(\sqrt{\max\left(\beta, \log\frac{1}{\varepsilon_{\text{poly}}}\right)\log\frac{1}{\varepsilon_{\text{poly}}}}\right)$.
It remains to make judicious choices for $\varepsilon_{\text{poly}}$ (note that the $\varepsilon_{\text{poly}}$ at this step need not be the same as the one in Step 2) and $\varepsilon_{\text{BE}}$ in order to ensure the overall block-encoding error is less than $\varepsilon$, i.e.

$$\frac{\varepsilon_{\text{poly}}}{4} + 4t\sqrt{2\varepsilon_{\text{BE}}} \leq \varepsilon. \tag{4.1}$$

Now given a sufficiently small $\varepsilon$ such that $\varepsilon \leq 2^{-\beta}$, choose $\varepsilon_{\text{poly}} = \min\{\varepsilon, 2^{-\beta}\} = \varepsilon$ and

$$\varepsilon_{\text{BE}} = \left(\frac{\varepsilon}{8\log\frac{1}{\varepsilon}}\right)^{2}.$$

These choices ensure Equation 4.1 is satisfied. Note that $\lim_{x \rightarrow 0}\frac{x}{\log\frac{1}{x}} = 0$, so $\varepsilon_{\text{BE}} \rightarrow 0$ as $\varepsilon \rightarrow 0$. The degree of the approximating polynomial, and thus the number of queries to $U_H$ required, is $t = \mathcal{O}\left(\sqrt{\max\left(\beta, \log\frac{1}{\varepsilon_{\text{poly}}}\right)\log\frac{1}{\varepsilon_{\text{poly}}}}\right) = \mathcal{O}\left(\log\frac{1}{\varepsilon}\right)$. Recall that constructing $U_H$ itself makes 1 query to $U_{\log\rho}$ and each $U_j$.
Lastly, observe that $\|e^{H}\| \leq e^{\|H\|} \leq e^{\sum_i |\theta_i| + \log\kappa} \leq e^{\beta} < \mathcal{N}$, so $\mathcal{N}$ is a valid subnormalization factor.
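As a worked check of Equation 4.1, the substitution of the chosen tolerances can be written out explicitly. This is a sketch in which the shorthand $L := \log\frac{1}{\varepsilon}$ is introduced here for readability, and the constant hidden in $t = \mathcal{O}(\log\frac{1}{\varepsilon})$ is treated as an assumption rather than a value given in the text.

```latex
\begin{align*}
\frac{\varepsilon_{\text{poly}}}{4} + 4t\sqrt{2\varepsilon_{\text{BE}}}
  &= \frac{\varepsilon}{4} + 4\sqrt{2}\,t\cdot\frac{\varepsilon}{8L}
   = \varepsilon\left(\frac{1}{4} + \frac{t}{\sqrt{2}\,L}\right), \\
\varepsilon\left(\frac{1}{4} + \frac{t}{\sqrt{2}\,L}\right) \leq \varepsilon
  &\iff t \leq \frac{3\sqrt{2}}{4}\,L = \frac{3\sqrt{2}}{4}\log\frac{1}{\varepsilon}.
\end{align*}
```

Under this reading, the stated choices satisfy Equation 4.1 whenever the degree obeys $t \leq \frac{3\sqrt{2}}{4}\log\frac{1}{\varepsilon}$; a larger hidden constant in $t$ would be absorbed by replacing the factor $8$ in $\varepsilon_{\text{BE}}$ with a correspondingly larger constant.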

Overall complexity: $U_{\sigma}$ makes $\mathcal{O}(\log\frac{1}{\varepsilon})$ queries to $U_{H}$. $U_{H}$ queries $U_{\log\rho}$ and each $U_{j}$ exactly once, and $U_{\log\rho}$ in turn makes $\mathcal{O}\left(\kappa\log\left(\frac{\log\kappa}{\varepsilon_{\text{BE}}}\right)\right)$ queries to $U_{\rho}$. Accordingly, the implementation of $U_{\sigma}$ makes

$$\mathcal{O}\left(\log\frac{1}{\varepsilon}\right)\cdot\mathcal{O}\left(\kappa\log\left(\log\kappa\cdot\frac{1}{\varepsilon^{2}}\cdot\log^{2}\frac{1}{\varepsilon}\right)\right)\subseteq\mathcal{O}\left(\kappa\log\left(\frac{\log\kappa}{\varepsilon}\right)\log\left(\frac{1}{\varepsilon}\right)\right)\subseteq\widetilde{\mathcal{O}}\left(\kappa\log^{2}\left(\frac{1}{\varepsilon}\right)\right)$$

queries to $U_{\rho}$ and $\mathcal{O}\left(\log\frac{1}{\varepsilon}\right)$ queries to each $U_{j}$, thus

$$\mathcal{O}\left(d\log\frac{1}{\varepsilon}\right)$$

queries to $\{U_{j}\}_{j=1}^{d}$, the constraint operators collectively considered. ∎
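
As a rough illustration of how the counts in this proof compose, the following Python sketch multiplies the outer query count of $U_{\sigma}$ by the inner query count of $U_{\log\rho}$. It is only a scaling sketch under stated assumptions: all hidden constants are set to 1, logarithms are natural, and $\varepsilon_{\text{BE}}$ is taken to be $\varepsilon^{2}/\log^{2}(1/\varepsilon)$, which is what the displayed bound suggests rather than a definition quoted from the text. The function names are hypothetical.

```python
import math

def outer_queries(eps: float) -> float:
    """Queries U_sigma makes to U_H: O(log(1/eps)), hidden constant set to 1."""
    return math.log(1.0 / eps)

def inner_queries(kappa: float, eps: float) -> float:
    """Queries U_{log rho} makes to U_rho: O(kappa * log(log(kappa) / eps_BE)).

    Assumption: eps_BE = eps^2 / log^2(1/eps), inferred from the displayed bound.
    """
    eps_be = eps ** 2 / math.log(1.0 / eps) ** 2
    return kappa * math.log(math.log(kappa) / eps_be)

def query_counts(kappa: float, d: int, eps: float) -> tuple[float, float]:
    """Return (queries to U_rho, total queries to the d constraint unitaries)."""
    if kappa < 2 or not (0 < eps <= 0.1):
        raise ValueError("sketch assumes kappa >= 2 and 0 < eps <= 0.1")
    t = outer_queries(eps)
    # U_H calls U_{log rho} and each U_j exactly once per query made by U_sigma.
    return t * inner_queries(kappa, eps), d * t

rho_queries, constraint_queries = query_counts(kappa=10, d=3, eps=1e-3)
```

The product structure mirrors the proof: one factor $\log\frac{1}{\varepsilon}$ from the exponential polynomial, one factor $\kappa\log(\cdot)$ from the logarithm block-encoding, and a factor $d$ only for the constraint operators.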

4.3 Further discussion

If the positive definite $\rho\in\mathbb{C}^{N\times N}$ is full rank, the condition number satisfies $\kappa\geq N$: the $N$ eigenvalues sum to 1, so the smallest one, which is bounded below by $\frac{1}{\kappa}$, must be $\leq 1/N$. Then the $U_{\rho}$-query complexity grows at least linearly with $N$. Hence, our Esscher transform is most relevant for low-rank cases. Assume we have $r$ non-zero eigenvalues, each $\geq 1/\kappa$. As a consequence $r\leq\kappa$ holds. While the condition number can still be exponential if the smallest eigenvalue is exponentially small, when the smallest eigenvalue is $1/{\rm poly}(r)$, we obtain a well-behaved query complexity. In addition we can allow for smaller eigenvalues, especially when we are interested only in low-rank approximations of the Esscher transform. Let $\kappa_{\rm eff}$ be an effective condition number with $1/\kappa_{\rm eff}\geq 1/\kappa$. With slight adaptations, our method can implement the Esscher transform on the effectively well-conditioned subspace, spanned by the eigenvectors with eigenvalues $\geq 1/\kappa_{\rm eff}$, while leaving the other part undefined. This incurs an error compared to the full Esscher transform proportional to the importance of the neglected eigenvalues, but may be acceptable in many practical situations. Recall that low-rank approximations are frequently performed in statistics and machine learning.
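
To make the role of the effective condition number concrete, here is a minimal Python sketch that splits a spectrum at the threshold $1/\kappa_{\rm eff}$. The spectrum below is a made-up example, the function name is hypothetical, and using the total neglected eigenvalue mass as a proxy for the "importance of the neglected eigenvalues" is an assumption made for illustration only.

```python
def effective_truncation(eigs, kappa_eff):
    """Split a spectrum at the threshold 1/kappa_eff.

    Returns (kept eigenvalues, total mass of the neglected eigenvalues).
    """
    thresh = 1.0 / kappa_eff
    kept = [x for x in eigs if x >= thresh]
    neglected_mass = sum(x for x in eigs if x < thresh)
    return kept, neglected_mass

# Hypothetical spectrum of a density operator (non-negative, sums to 1).
eigs = [0.4, 0.3, 0.2, 0.0999, 1e-4]

# The full condition number is governed by the smallest eigenvalue: 1/1e-4.
kappa_full = 1.0 / min(eigs)

# With kappa_eff = 20 only the tiny eigenvalue 1e-4 is discarded.
kept, lost = effective_truncation(eigs, kappa_eff=20)
```

Here the retained rank is 4, which respects $r\leq\kappa_{\rm eff}$ as it must because the retained eigenvalues are each at least $1/\kappa_{\rm eff}$ and sum to at most 1, while the working condition number drops from about $10^{4}$ to 20.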

If the desired output model is a normalized state, one can apply techniques similar to those used for Gibbs sampling to extract the normalized Esscher-transformed state from the output of Algorithm 1. We briefly describe this procedure and the overhead cost it incurs. More details can be found in Chapter 3 of [Gil19]. Let $\varepsilon>0$ denote the desired precision in trace distance between our approximate output and the ideal state. First, we prepare a maximally entangled state on two registers. Use Algorithm 1 to construct a $1$-block-encoding $U$ of $e^{\frac{\sum_{i}\theta_{i}H_{i}+\log\rho}{2}}/\sqrt{\mathcal{N}}$ where $\mathcal{N}=e^{\|\theta\|_{1}+2(1+\log 2\kappa)}$, with block-encoding error $0<\varepsilon_{1}<\varepsilon/N^{2}$. Then apply $U$ to the second register to obtain a state $|\psi\rangle$, so that tracing out the first register yields an approximate subnormalized state with trace distance error of $\mathcal{O}\left(\varepsilon/N\right)$. That is,

$$\left\|\operatorname{Tr}_{1}(\langle 0|\otimes I)|\psi\rangle\langle\psi|(|0\rangle\otimes I)-\frac{e^{\sum_{i}\theta_{i}H_{i}+\log\rho}}{N\mathcal{N}}\right\|_{T}=\mathcal{O}\left(\frac{\varepsilon}{N}\right).$$

With $\mathcal{Z}:=\operatorname{Tr}\left(e^{\sum_{i}\theta_{i}H_{i}+\log\rho}\right)$, this state, when postselected after $\mathcal{O}\left(\sqrt{\frac{N\mathcal{N}}{\mathcal{Z}}}\log\frac{1}{\varepsilon}\right)$ steps of fixed-point amplitude amplification (refer to Theorem 27 in [GSLW19]), results in a density operator $\varepsilon$-close to the normalized Esscher-transformed state

$$\frac{e^{\sum_{i}\theta_{i}H_{i}+\log\rho}}{\operatorname{Tr}\left(e^{\sum_{i}\theta_{i}H_{i}+\log\rho}\right)}$$

in trace distance. Taking this overhead cost into account and assuming $\varepsilon$ is sufficiently small (such that the block-encoding error satisfies $\varepsilon_{1}<2^{-\|\theta\|_{1}-2(1+\log 2\kappa)}$), the total query complexity of preparing the approximate Esscher-transformed state is

$$\widetilde{\mathcal{O}}\left(\kappa\log^{2}\left(\frac{N^{2}}{\varepsilon}\right)\right)\cdot\mathcal{O}\left(\sqrt{\frac{N\mathcal{N}}{\mathcal{Z}}}\log\frac{1}{\varepsilon}\right)\subseteq\widetilde{\mathcal{O}}\left(\kappa\sqrt{\frac{N\mathcal{N}}{\mathcal{Z}}}\log^{3}\left(\frac{1}{\varepsilon}\right)\right).$$
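
The size of the amplification factor $\sqrt{N\mathcal{N}/\mathcal{Z}}$ can be explored classically in the commuting special case, where $\rho$ and all $H_{i}$ are diagonal and the quantum Esscher transform reduces to the classical one. The sketch below is an illustration under assumptions, not the quantum procedure: it takes $\kappa=1/\min_{k}p_{k}$, assumes diagonal entries of each $H_{i}$ lie in $[-1,1]$, uses natural logarithms in $\mathcal{N}$, and drops the $\log\frac{1}{\varepsilon}$ factor and all constants. The function name is hypothetical.

```python
import math

def amplification_overhead(p, h, theta):
    """Diagonal (commuting) special case of the normalization step.

    p     : eigenvalues of the diagonal state rho (positive, summing to 1)
    h     : h[i][k] is the k-th diagonal entry of H_i, assumed in [-1, 1]
    theta : real parameters theta_i
    Returns (normalized transformed distribution, sqrt(N * calN / Z)).
    """
    N = len(p)
    kappa = 1.0 / min(p)  # assumption: kappa is the inverse smallest eigenvalue
    # Diagonal of exp(sum_i theta_i H_i + log rho) is p_k * exp(theta . h_k).
    weights = [
        p[k] * math.exp(sum(t * hi[k] for t, hi in zip(theta, h)))
        for k in range(N)
    ]
    Z = sum(weights)
    norm1 = sum(abs(t) for t in theta)
    calN = math.exp(norm1 + 2 * (1 + math.log(2 * kappa)))
    sigma = [w / Z for w in weights]
    return sigma, math.sqrt(N * calN / Z)

sigma, rounds = amplification_overhead(
    p=[0.5, 0.5], h=[[1.0, -1.0]], theta=[0.5]
)
```

Because $\mathcal{Z}\leq e^{\|\theta\|_{1}}<\mathcal{N}$ in this setting, the returned factor is always at least $\sqrt{N}$, which shows why the normalization step can dominate the cost for large $N$.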

5 Conclusion

In this paper, we considered a minimum relative entropy problem for the density operator subject to equality constraints. We formally solved this problem and the solution form inspired us to define the Quantum Esscher Transform (QUEST), a generalization of the classical Esscher transform to the quantum setting. We discussed its implementation on fault-tolerant quantum computers, leveraging techniques based on the QSVT framework. Given as inputs block-encodings of the initial quantum state and the constraint operators, the algorithm outputs an $\varepsilon$-approximate block-encoding of the Esscher-transformed state with $U_{\rho}$-query complexity

$$\mathcal{O}\left(\kappa\log\left(\frac{\log\kappa}{\varepsilon}\right)\log\left(\frac{1}{\varepsilon}\right)\right)\subseteq\widetilde{\mathcal{O}}\left(\kappa\log^{2}\left(\frac{1}{\varepsilon}\right)\right)$$

and $\{U_{j}:j\in[d]\}$-query complexity

$$\mathcal{O}\left(d\log\frac{1}{\varepsilon}\right).$$

Several avenues remain open for future work:

  • Is there a quantum algorithmic framework that can fully solve the minimum relative entropy problem? Our current approach only presents the formal solution for the optimal parameter $\lambda^{*}$. Approaches such as Newton’s algorithm with backtracking were suggested in [ZTF13], and quantized versions of these could be studied. Additionally, [AAKS20] demonstrated that $\lambda^{*}$ can, in principle, be found with a convex optimization program. Can we design a quantum algorithm to effectively address this problem?

  • One could explore strategies for alternative input models. Our current work exclusively considered the purified access model, wherein the preparation of the purification of the input state was assumed. In contrast, the sampling access model, which assumes multiple independent copies of the input state, is another commonly used model. Gilyén et al. [GP22] have proposed an approach to implement approximate block-encodings of $\rho$, starting with sample access. This approach is based on a combination of density matrix exponentiation [LMR14, KLL+17] and QSVT, and allows us to implement the quantum Esscher transform in the sampling access model. We leave the total cost of this procedure for further analysis.

  • In Section 2.2.3, we noted potential connections between the quantum Esscher transform and imaginary-time evolution. To give these substance, further investigation is required.

  • Various applications could be envisioned for the quantum Esscher transform. Its classical version has found usage for numerous problems in domains such as statistics, machine learning, and finance. These problems have quantum analogues, which could benefit from the quantum Esscher transform and its implementation on quantum computers.

Acknowledgments

The authors would like to thank Po-Wei Huang, Xiufan Li, Zhan Yu, Serge Massar and Roberto Rubboli for helpful discussions. This work is supported by the National Research Foundation, Singapore, and A*STAR under its CQT Bridging Grant and its Quantum Engineering Programme under grant NRF2021-QEP2-02-P05. KK acknowledges support from Leong Chuan Kwek, under project grant R-710-000-007-135.

Appendix A Proof of Theorem 2.2

Before delving into the proof, we introduce some notation and state a lemma to facilitate its presentation. The exponential family of $P$ with respect to the random variable $X$ is the set of measures

$$\Lambda=\left\{\frac{e^{\lambda\cdot X}P}{\mathbb{E}_{P}[e^{\lambda\cdot X}]}:\lambda\in\mathbb{R}^{d}\right\}.$$

Also, let

$$M=\{Q:\mathbb{E}_{Q}[X]=m\}.$$
Lemma A.1.

(Proposition 3.24 in [FS11]) Let $P$ be a probability measure on $(\Omega,\Sigma)$ and $X$ be a random variable on $\Omega$. Fix $m\in\mathbb{R}^{d}$. Then for any probability measure $Q$ on $(\Omega,\Sigma)$ satisfying $\mathbb{E}_{Q}[X]=m$, we have

$$D(Q\|P)\geq\sup_{\lambda\in\mathbb{R}^{d}}\left[\lambda\cdot m-\log\mathbb{E}_{P}[e^{\lambda\cdot X}]\right]. \tag{A.1}$$

Moreover the inequality is saturated if $Q=Q_{\lambda^{\prime}}:=e^{\lambda^{\prime}\cdot X}P/\mathbb{E}_{P}[e^{\lambda^{\prime}\cdot X}]\in\Lambda\cap M$ for some $\lambda^{\prime}\in\mathbb{R}^{d}$:

$$D(Q_{\lambda^{\prime}}\|P)=\lambda^{\prime}\cdot m-\log\mathbb{E}_{P}[e^{\lambda^{\prime}\cdot X}]=\sup_{\lambda\in\mathbb{R}^{d}}\left[\lambda\cdot m-\log\mathbb{E}_{P}[e^{\lambda\cdot X}]\right]. \tag{A.2}$$
Proof.

Each $\lambda\in\mathbb{R}^{d}$ gives rise to a corresponding $Q_{\lambda}\in\Lambda$ (note that $Q_{\lambda}$ need not be in $M$). Then for any $Q\in M$, we have

\begin{align*}
D(Q\|P) &= \sum_{\omega\in\Omega}Q(\omega)\log\left(\frac{Q(\omega)}{Q_{\lambda}(\omega)}\cdot\frac{Q_{\lambda}(\omega)}{P(\omega)}\right)\\
&= D(Q\|Q_{\lambda})+\sum_{\omega\in\Omega}Q(\omega)\log\frac{Q_{\lambda}(\omega)}{P(\omega)}\\
&\geq \sum_{\omega\in\Omega}Q(\omega)\log\frac{Q_{\lambda}(\omega)}{P(\omega)} \qquad (\text{by Jensen, } D(Q\|Q_{\lambda})\geq 0)\\
&= \sum_{\omega\in\Omega}Q(\omega)\log\frac{e^{\lambda\cdot X(\omega)}}{\mathbb{E}_{P}[e^{\lambda\cdot X}]}\\
&= \mathbb{E}_{Q}[\lambda\cdot X]-\log\mathbb{E}_{P}[e^{\lambda\cdot X}]\\
&= \lambda\cdot m-\log\mathbb{E}_{P}[e^{\lambda\cdot X}].
\end{align*}

Since this holds for all $\lambda\in\mathbb{R}^{d}$, we conclude that $D(Q\|P)\geq\sup_{\lambda\in\mathbb{R}^{d}}\left[\lambda\cdot m-\log\mathbb{E}_{P}[e^{\lambda\cdot X}]\right]$. Furthermore, if $\lambda^{\prime}\in\mathbb{R}^{d}$ is such that $Q_{\lambda^{\prime}}\in\Lambda\cap M$, then letting $Q=Q_{\lambda^{\prime}}$ and rerunning the argument above with $\lambda=\lambda^{\prime}$, where the dropped term $D(Q_{\lambda^{\prime}}\|Q_{\lambda^{\prime}})$ now vanishes, gives

\begin{align*}
D(Q_{\lambda^{\prime}}\|P) &= \sum_{\omega\in\Omega}Q_{\lambda^{\prime}}(\omega)\log\frac{Q_{\lambda^{\prime}}(\omega)}{P(\omega)}\\
&= \sum_{\omega\in\Omega}Q_{\lambda^{\prime}}(\omega)\log\frac{e^{\lambda^{\prime}\cdot X(\omega)}}{\mathbb{E}_{P}[e^{\lambda^{\prime}\cdot X}]}\\
&= \mathbb{E}_{Q_{\lambda^{\prime}}}[\lambda^{\prime}\cdot X]-\log\mathbb{E}_{P}[e^{\lambda^{\prime}\cdot X}]\\
&= \lambda^{\prime}\cdot m-\log\mathbb{E}_{P}[e^{\lambda^{\prime}\cdot X}].
\end{align*}

The right-hand side is one of the values over which the supremum in (A.1) is taken, so it is at most the supremum, while (A.1) applied to $Q_{\lambda^{\prime}}\in M$ shows it is at least the supremum. Hence the two coincide, which is (A.2).
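
Lemma A.1 can be checked numerically on a small finite sample space. The following Python sketch does this for a scalar random variable ($d=1$): it finds $\lambda^{\prime}$ by bisection so that the tilted measure lies in $M$, then compares $D(Q\|P)$ with the dual value $\lambda\cdot m-\log\mathbb{E}_{P}[e^{\lambda\cdot X}]$. The distribution, the target mean, the bisection bracket, and all function names are illustrative assumptions, and bisection is used here only for simplicity.

```python
import math

def tilt(P, X, lam):
    """Exponentially tilted measure Q_lam = e^{lam X} P / E_P[e^{lam X}]."""
    w = [p * math.exp(lam * x) for p, x in zip(P, X)]
    z = sum(w)
    return [wi / z for wi in w]

def mean(Q, X):
    """Expectation E_Q[X] on a finite sample space."""
    return sum(q * x for q, x in zip(Q, X))

def kl(Q, P):
    """Relative entropy D(Q||P) in nats, with the convention 0 log 0 = 0."""
    return sum(q * math.log(q / p) for q, p in zip(Q, P) if q > 0)

def dual(P, X, m, lam):
    """Dual value lam * m - log E_P[e^{lam X}]."""
    return lam * m - math.log(sum(p * math.exp(lam * x) for p, x in zip(P, X)))

def solve_lambda(P, X, m, lo=-50.0, hi=50.0, iters=200):
    """Bisection on lam -> E_{Q_lam}[X], which is non-decreasing in lam."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(tilt(P, X, mid), X) < m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: three outcomes, target mean strictly between min and max.
P = [0.2, 0.5, 0.3]
X = [0.0, 1.0, 2.0]
m = 1.4

lam_prime = solve_lambda(P, X, m)
Q_prime = tilt(P, X, lam_prime)   # element of the intersection of Lambda and M
Q_other = [0.1, 0.4, 0.5]         # another measure with mean 1.4, not a tilt of P

gap_at_optimum = kl(Q_prime, P) - dual(P, X, m, lam_prime)
gap_other = kl(Q_other, P) - dual(P, X, m, lam_prime)
```

In this example `gap_at_optimum` vanishes up to floating-point error, matching (A.2), while `gap_other` is strictly positive, matching the strict form of (A.1) for a non-tilted member of $M$.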

Proof of Theorem 2.2.

First, we have required $\min_{\omega\in\Omega}X_{i}(\omega)<m_{i}<\max_{\omega\in\Omega}X_{i}(\omega)$ because otherwise the constraints $\mathbb{E}_{Q}[X_{i}]=m_{i}$ cannot be satisfied. The Lagrangian function is

$$\mathcal{L}(Q,\lambda,\eta)=\sum_{\omega}Q(\omega)\log\frac{Q(\omega)}{P(\omega)}-\sum_{i=1}^{d}\lambda_{i}\left(\sum_{\omega}Q(\omega)X_{i}(\omega)-m_{i}\right)-\eta\left(\sum_{\omega}Q(\omega)-1\right).$$

Setting the first-order derivatives of $\mathcal{L}(Q,\lambda,\eta)$ with respect to $Q(\omega)$ to zero gives $\log\frac{Q(\omega)}{P(\omega)}+1-\lambda\cdot X(\omega)-\eta=0$, that is, $Q(\omega)=e^{\eta-1}e^{\lambda\cdot X(\omega)}P(\omega)$. The normalization constraint $\sum_{\omega}Q(\omega)=1$ fixes $e^{\eta-1}=1/\mathbb{E}_{P}[e^{\lambda\cdot X}]$, so that

$$Q^{\star}(\omega)=\frac{e^{\lambda^{\star}\cdot X(\omega)}P(\omega)}{\mathbb{E}_{P}[e^{\lambda^{\star}\cdot X}]},$$

where $\lambda^{\star}$ is to be determined from the $d$ constraints $\mathbb{E}_{Q}[X]=m$:

\begin{align}
\mathbb{E}_{Q}[X]=m &\iff \frac{\mathbb{E}_{P}[Xe^{\lambda^{\star}\cdot X}]}{\mathbb{E}_{P}[e^{\lambda^{\star}\cdot X}]}-m=0 \tag{A.4}\\
&\iff \frac{\mathbb{E}_{P}[(X-m)e^{\lambda^{\star}\cdot(X-m)}]}{\mathbb{E}_{P}[e^{\lambda^{\star}\cdot(X-m)}]}=0 \notag\\
&\iff \frac{\partial}{\partial\lambda}\log\mathbb{E}_{P}[e^{\lambda\cdot(X-m)}]\Big|_{\lambda=\lambda^{\star}}=0 \notag\\
&\iff \frac{\partial}{\partial\lambda}\mathbb{E}_{P}[e^{\lambda\cdot(X-m)}]\Big|_{\lambda=\lambda^{\star}}=0. \notag
\end{align}

The last equivalence holds because $\log f(x)$ and $f(x)$ share the same minimum/maximum points, provided $f(x)>0$ at those points. It remains to show that $Q^{\star}$ indeed minimizes $D(Q\|P)$, subject to the constraints $\mathbb{E}_{Q}[X]=m$. But this follows easily from Lemma A.1. Furthermore, since $x\mapsto x\log x$ is a strictly convex function, $D(Q\|P)$ is a strictly convex functional of $Q$ and so it can have at most one minimizer in the convex set $M$, thereby showing the uniqueness of $Q^{\star}$. Finally, again using Lemma A.1, we have
\begin{equation*}
\lambda^{\star}=\operatorname*{argmax}_{\lambda\in\mathbb{R}^{d}}\left[\lambda\cdot m-\log\mathbb{E}_{P}[e^{\lambda\cdot X}]\right]=\operatorname*{argmin}_{\lambda\in\mathbb{R}^{d}}\left[\log\mathbb{E}_{P}[e^{\lambda\cdot(X-m)}]\right]=\operatorname*{argmin}_{\lambda\in\mathbb{R}^{d}}\mathbb{E}_{P}[e^{\lambda\cdot(X-m)}].
\end{equation*}
∎
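To make the dual characterization of $\lambda^{\star}$ concrete, the following is a minimal numerical sketch, assuming a finite sample space and a single constraint ($d=1$). The function name `esscher_tilt` and the fair-die example are illustrative assumptions, not part of the paper. It minimizes $\log\mathbb{E}_{P}[e^{\lambda(X-m)}]$ by Newton's method and returns the tilted distribution $Q^{\star}\propto P\,e^{\lambda^{\star}X}$.

```python
import math


def esscher_tilt(xs, ps, m, tol=1e-12, max_iter=200):
    """Return (lam, q) with q_i proportional to p_i * exp(lam * x_i) and mean m.

    Illustrative sketch for a discrete P and one scalar constraint E_Q[X] = m.
    Newton's method is applied to the convex function
    lam -> log E_P[exp(lam * (X - m))], whose stationary point is lam_star.
    """
    lam = 0.0
    for _ in range(max_iter):
        # Unnormalized tilted weights p_i * exp(lam * (x_i - m)).
        w = [p * math.exp(lam * (x - m)) for x, p in zip(xs, ps)]
        z = sum(w)
        # Gradient of the log-moment-generating function: E_Q[X - m].
        g = sum(wi * (x - m) for wi, x in zip(w, xs)) / z
        # Hessian: the variance of X under the tilted distribution.
        h = sum(wi * (x - m) ** 2 for wi, x in zip(w, xs)) / z - g * g
        if abs(g) < tol:
            break
        lam -= g / h
    # Normalize p_i * exp(lam * x_i) to obtain the tilted distribution.
    w = [p * math.exp(lam * x) for x, p in zip(xs, ps)]
    z = sum(w)
    return lam, [wi / z for wi in w]


# Hypothetical example: tilt a fair die so that its mean becomes 4.5.
faces = [1, 2, 3, 4, 5, 6]
lam_star, q_star = esscher_tilt(faces, [1 / 6] * 6, 4.5)
tilted_mean = sum(x * q for x, q in zip(faces, q_star))
```

Because the objective is strictly convex in $\lambda$, the stationary point found here is the unique minimizer, mirroring the uniqueness argument above. For $d>1$ one would replace the scalar Newton step by a linear solve against the covariance matrix of $X$ under the tilted distribution.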

Appendix B Wirtinger Calculus

The ‘Wirtinger Calculus’ provides a methodology for optimization problems involving complex matrices. It enables ‘differentiation as usual’ with respect to complex matrices. In this appendix, we state only the main definitions and results needed to solve Problem 2.4. For a more thorough exposition of this framework, we direct the reader to [KQKR23, Hjø11, KD09].

Consider functions of the form $f:\mathbb{C}^{n\times n}\longrightarrow\mathbb{C}$. Since $\mathbb{C}$ is $\mathbb{R}^{2}$ endowed with the multiplication operation $(a,b)\times(c,d)\mapsto(ac-bd,\,ad+bc)$, we can view

\begin{align*}
f:\;&\mathbb{R}^{2(n\times n)}\longrightarrow\mathbb{R}^{2}\\
&(x_{ij},y_{ij})_{i,j\in[n]}=(\mathbf{X},\mathbf{Y})\mapsto(u(\mathbf{X},\mathbf{Y}),v(\mathbf{X},\mathbf{Y})).
\end{align*}

For $i,j=1,\dots,n$, regard $z_{ij},z_{ij}^{*}$ as functions from $\mathbb{R}^{n\times n}\times\mathbb{R}^{n\times n}$ to $\mathbb{C}$, where $z_{ij}(\mathbf{X},\mathbf{Y})=x_{ij}+iy_{ij}$ and $z_{ij}^{*}(\mathbf{X},\mathbf{Y})=x_{ij}-iy_{ij}$. (Footnote 3: The notations $z,z^{*}$ may raise questions about independence. This is irrelevant; one may simply write $z_{1},z_{2}$ if one wishes. We emphasize that, for each $i,j$, the fundamental input variables are the two real numbers $x$ and $y$.) Then we have a function $\tilde{f}:\mathbb{C}^{n\times n}\times\mathbb{C}^{n\times n}\longrightarrow\mathbb{C}$ such that

\begin{equation}
f(\mathbf{X},\mathbf{Y}):=\underline{\tilde{f}\circ(\mathbf{Z},\mathbf{Z}^{*})}(\mathbf{X},\mathbf{Y})=\tilde{f}(\mathbf{Z}(\mathbf{X},\mathbf{Y}),\mathbf{Z}^{*}(\mathbf{X},\mathbf{Y}))=\tilde{f}(\mathbf{X}+i\mathbf{Y},\mathbf{X}-i\mathbf{Y}). \tag{B.1}
\end{equation}

Partially differentiating $f$ with respect to each $x_{ij}$ and $y_{ij}$, and then rearranging terms, we have for $1\leq i,j\leq n$

\begin{align}
\frac{\partial\tilde{f}}{\partial z_{ij}}(\mathbf{Z}(\mathbf{X},\mathbf{Y}),\mathbf{Z}^{*}(\mathbf{X},\mathbf{Y}))&=\frac{1}{2}\left(\frac{\partial f}{\partial x_{ij}}-i\frac{\partial f}{\partial y_{ij}}\right)(\mathbf{X},\mathbf{Y}) \tag{B.2}\\
\frac{\partial\tilde{f}}{\partial z_{ij}^{*}}(\mathbf{Z}(\mathbf{X},\mathbf{Y}),\mathbf{Z}^{*}(\mathbf{X},\mathbf{Y}))&=\frac{1}{2}\left(\frac{\partial f}{\partial x_{ij}}+i\frac{\partial f}{\partial y_{ij}}\right)(\mathbf{X},\mathbf{Y}). \notag
\end{align}

To preserve the matrix structure of the parameters $z_{ij}$ and $z_{ij}^{*}$ we use the standard notation

\begin{equation}
\frac{\partial}{\partial\mathbf{Z}}:=\begin{bmatrix}\frac{\partial}{\partial z_{11}}&\dots&\frac{\partial}{\partial z_{1n}}\\ \vdots&\ddots&\vdots\\ \frac{\partial}{\partial z_{n1}}&\dots&\frac{\partial}{\partial z_{nn}}\end{bmatrix}\qquad\frac{\partial}{\partial\mathbf{Z}^{*}}:=\begin{bmatrix}\frac{\partial}{\partial z_{11}^{*}}&\dots&\frac{\partial}{\partial z_{1n}^{*}}\\ \vdots&\ddots&\vdots\\ \frac{\partial}{\partial z_{n1}^{*}}&\dots&\frac{\partial}{\partial z_{nn}^{*}}\end{bmatrix} \tag{B.9}
\end{equation}

and similarly for $\frac{\partial}{\partial\mathbf{X}}$ and $\frac{\partial}{\partial\mathbf{Y}}$. Then Equation B.2 is concisely stated as

\begin{align}
\frac{\partial\tilde{f}}{\partial\mathbf{Z}}(\mathbf{Z}(\mathbf{X},\mathbf{Y}),\mathbf{Z}^{*}(\mathbf{X},\mathbf{Y}))&=\frac{1}{2}\left(\frac{\partial f}{\partial\mathbf{X}}-i\frac{\partial f}{\partial\mathbf{Y}}\right)(\mathbf{X},\mathbf{Y}) \tag{B.10}\\
\frac{\partial\tilde{f}}{\partial\mathbf{Z}^{*}}(\mathbf{Z}(\mathbf{X},\mathbf{Y}),\mathbf{Z}^{*}(\mathbf{X},\mathbf{Y}))&=\frac{1}{2}\left(\frac{\partial f}{\partial\mathbf{X}}+i\frac{\partial f}{\partial\mathbf{Y}}\right)(\mathbf{X},\mathbf{Y}). \notag
\end{align}

$\frac{\partial}{\partial\mathbf{Z}}$ and $\frac{\partial}{\partial\mathbf{Z}^{*}}$ are the matrix Wirtinger derivatives of $f$. Often, we abuse notation and write both $f(\mathbf{X},\mathbf{Y})$ and $f(\mathbf{Z},\mathbf{Z}^{*})$, so we can write

\begin{equation}
\frac{\partial}{\partial\mathbf{Z}}=\frac{1}{2}\left(\frac{\partial}{\partial\mathbf{X}}-i\frac{\partial}{\partial\mathbf{Y}}\right),\qquad\frac{\partial}{\partial\mathbf{Z}^{*}}=\frac{1}{2}\left(\frac{\partial}{\partial\mathbf{X}}+i\frac{\partial}{\partial\mathbf{Y}}\right). \tag{B.11}
\end{equation}
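The operator identities in (B.11) can be checked numerically. The sketch below is illustrative only: the helper `wirtinger`, the test function `frobenius_sq`, and the sample matrix are our assumptions, not objects from the paper. It approximates $\partial f/\partial\mathbf{X}$ and $\partial f/\partial\mathbf{Y}$ entrywise by central differences and combines them as in (B.11). For $f(\mathbf{Z},\mathbf{Z}^{*})=\sum_{i,j}z_{ij}z_{ij}^{*}$ one expects $\partial f/\partial\mathbf{Z}=\mathbf{Z}^{*}$ and $\partial f/\partial\mathbf{Z}^{*}=\mathbf{Z}$.

```python
def wirtinger(f, Z, h=1e-6):
    """Approximate the matrix Wirtinger derivatives of f at Z.

    Z is a square list-of-lists of complex numbers and f maps such a matrix
    to a number. Returns (dZ, dZc), the finite-difference estimates of
    df/dZ and df/dZ^*, built from 0.5 * (d/dX -/+ i d/dY) entry by entry.
    """
    n = len(Z)
    dZ = [[0j] * n for _ in range(n)]
    dZc = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(delta):
                # Evaluate f with only entry (i, j) perturbed by delta.
                W = [row[:] for row in Z]
                W[i][j] += delta
                return f(W)

            # Central differences in the real part x_ij and imaginary part y_ij.
            fx = (shifted(h) - shifted(-h)) / (2 * h)
            fy = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
            dZ[i][j] = 0.5 * (fx - 1j * fy)
            dZc[i][j] = 0.5 * (fx + 1j * fy)
    return dZ, dZc


def frobenius_sq(W):
    """f(Z, Z^*) = sum_ij z_ij * conj(z_ij), a real-valued test function."""
    return sum(abs(w) ** 2 for row in W for w in row)


# Hypothetical 2x2 example matrix.
Z_example = [[1 + 2j, -0.5j], [0.3 + 0j, 2 - 1j]]
dZ_example, dZc_example = wirtinger(frobenius_sq, Z_example)
```

Note that the function never treats $z_{ij}$ and $z_{ij}^{*}$ as independent inputs: in line with the remark above, only the real variables $x_{ij}$ and $y_{ij}$ are perturbed.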

The following three propositions are all we need in this paper. We omit their proofs, which can all be found in [KQKR23].

Proposition B.1.

Let $f:\mathbb{C}^{n\times n}\longrightarrow\mathbb{R}$ be a real-valued function of complex matrices. Then $f$ has a stationary point at $\mathbf{Z}=[z_{ij}]_{i,j\in[n]}$ if and only if

\begin{equation*}
\frac{\partial f}{\partial\mathbf{Z}}(\mathbf{Z})=0\quad\left(\text{or equivalently}\;\;\frac{\partial f}{\partial\mathbf{Z}^{*}}(\mathbf{Z})=0\right).
\end{equation*}

Whether the solution of the above equation actually gives a minimum/maximum/saddle point has to be checked via additional considerations or by inspecting higher-order derivatives.

Proposition B.2.

Let $\mathbf{Z}$ be a complex, unstructured (see below) matrix and $F(z)=\sum_{n=0}^{\infty}c_{n}z^{n}$ be analytic. Define the scalar function $f(\mathbf{Z},\mathbf{Z}^{*}):=\operatorname{Tr}(F(\mathbf{Z}))$. Then

\begin{equation*}
\frac{\partial\operatorname{Tr}(F(\mathbf{Z}))}{\partial\mathbf{Z}}=F^{\prime}(\mathbf{Z})^{T}
\end{equation*}

where $F^{\prime}(\cdot)$ is the complex derivative of $F(\cdot)$.
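As a quick sanity check on the transpose appearing in Proposition B.2, consider the illustrative choice $F(z)=z^{2}$ (our example, not one taken from the source), for which $F^{\prime}(z)=2z$. Writing out the trace entrywise and differentiating with respect to a single unstructured entry $z_{ij}$ gives:

```latex
\begin{align*}
\operatorname{Tr}(\mathbf{Z}^{2}) &= \sum_{k,l=1}^{n} z_{kl}\, z_{lk}, \\
\frac{\partial}{\partial z_{ij}} \operatorname{Tr}(\mathbf{Z}^{2})
  &= \sum_{k,l=1}^{n} \left( \delta_{ki}\delta_{lj}\, z_{lk} + z_{kl}\, \delta_{li}\delta_{kj} \right)
   = z_{ji} + z_{ji} = 2 z_{ji}, \\
\frac{\partial \operatorname{Tr}(\mathbf{Z}^{2})}{\partial \mathbf{Z}}
  &= 2\,\mathbf{Z}^{T} = F^{\prime}(\mathbf{Z})^{T}.
\end{align*}
```

The $(i,j)$ entry of the derivative picks up $z_{ji}$ rather than $z_{ij}$, which is where the transpose comes from. Since $\operatorname{Tr}(\mathbf{Z}^{2})$ does not involve $\mathbf{Z}^{*}$, its derivative with respect to $\mathbf{Z}^{*}$ vanishes in this example.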

So far, by writing $f:\mathbb{C}^{n\times n}\longrightarrow\mathbb{C}$ we have implicitly assumed the input matrices have independent components (we call such matrices ‘unstructured’). This condition often does not hold, e.g. when our matrices of interest are symmetric or Hermitian. To obtain the correct Wirtinger derivatives with respect to structured matrices, we resort to the chain rule.

Proposition B.3 (Wirtinger derivatives with respect to Hermitian matrices).

Let $f(\mathbf{Z},\mathbf{Z}^{*})$ be a function of complex Hermitian matrices. Then the Wirtinger derivatives of $f$ with respect to $\mathbf{Z},\mathbf{Z}^{*}$ are given by

\begin{equation*}
\frac{\partial f}{\partial\mathbf{Z}}=\left[\frac{\partial f}{\partial\tilde{\mathbf{Z}}}+\left(\frac{\partial f}{\partial\tilde{\mathbf{Z}}^{*}}\right)^{T}\right]_{\tilde{\mathbf{Z}}=\mathbf{Z}}\qquad\text{and}\qquad\frac{\partial f}{\partial\mathbf{Z}^{*}}=\left[\frac{\partial f}{\partial\tilde{\mathbf{Z}}^{*}}+\left(\frac{\partial f}{\partial\tilde{\mathbf{Z}}}\right)^{T}\right]_{\tilde{\mathbf{Z}}=\mathbf{Z}}.
\end{equation*}

Here, the tildes above $\tilde{\mathbf{Z}},\tilde{\mathbf{Z}}^{*}$ indicate that they are unstructured matrices. Thus, to derive the Wirtinger derivatives with respect to Hermitian matrices, first obtain the Wirtinger derivatives of $f$, assuming the inputs are unstructured. Then form the correct expressions given above and reinstate the structured matrices $\mathbf{Z},\mathbf{Z}^{*}$ as the arguments.
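To illustrate the two-step recipe of Proposition B.3, here is a short worked example under an assumption of ours: $f(\tilde{\mathbf{Z}},\tilde{\mathbf{Z}}^{*})=\operatorname{Tr}(\mathbf{A}\tilde{\mathbf{Z}})$ for a fixed matrix $\mathbf{A}\in\mathbb{C}^{n\times n}$ (the matrix $\mathbf{A}$ is hypothetical and introduced only for this illustration).

```latex
\begin{align*}
\text{Unstructured step:}\quad
  & \operatorname{Tr}(\mathbf{A}\tilde{\mathbf{Z}}) = \sum_{k,l=1}^{n} a_{kl}\,\tilde{z}_{lk}
    \;\Longrightarrow\;
    \frac{\partial f}{\partial \tilde{\mathbf{Z}}} = \mathbf{A}^{T},
    \qquad
    \frac{\partial f}{\partial \tilde{\mathbf{Z}}^{*}} = 0, \\
\text{Hermitian step:}\quad
  & \frac{\partial f}{\partial \mathbf{Z}}
    = \mathbf{A}^{T} + (0)^{T} = \mathbf{A}^{T},
    \qquad
    \frac{\partial f}{\partial \mathbf{Z}^{*}}
    = 0 + (\mathbf{A}^{T})^{T} = \mathbf{A}.
\end{align*}
```

The nonzero derivative with respect to $\mathbf{Z}^{*}$ reflects the Hermitian constraint: when $\mathbf{Z}$ is Hermitian, $\mathbf{Z}^{*}=\mathbf{Z}^{T}$ entrywise, so $\operatorname{Tr}(\mathbf{A}\mathbf{Z})=\sum_{i,j}a_{ij}z_{ij}^{*}$ also depends on $\mathbf{Z}^{*}$, even though the unstructured expression does not.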

References

  • [AAKS20] Anurag Anshu, Srinivasan Arunachalam, Tomotaka Kuwahara, and Mehdi Soleimanifar. Sample-efficient learning of quantum many-body systems. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 685–691. IEEE, 2020.
  • [BCC+15] Dominic W Berry, Andrew M Childs, Richard Cleve, Robin Kothari, and Rolando D Somma. Simulating hamiltonian dynamics with a truncated taylor series. Physical review letters, 114(9):090502, 2015.
  • [BK91] Michael Berman and Ronnie Kosloff. Time-dependent solution of the liouville-von neumann equation: Non-dissipative evolution. Computer physics communications, 63(1-3):1–20, 1991.
  • [Bra96] Samuel L Braunstein. Geometry of quantum inference. Physics Letters A, 219(3-4):169–174, 1996.
  • [BSS23] Ahmad Beirami, Maziar Sanjabi, and Virginia Smith. On tilted losses in machine learning: Theory and applications. Journal of Machine Learning Research, 24:1–79, 2023.
  • [CGJ18] Shantanav Chakraborty, András Gilyén, and Stacey Jeffery. The power of block-encoded matrix powers: improved regression techniques via faster hamiltonian simulation. arXiv preprint arXiv:1804.01973, 2018.
  • [DMB+23] Alexander M Dalzell, Sam McArdle, Mario Berta, Przemyslaw Bienias, Chi-Fang Chen, András Gilyén, Connor T Hann, Michael J Kastoryano, Emil T Khabiboulline, Aleksander Kubica, et al. Quantum algorithms: A survey of applications and end-to-end complexities. arXiv preprint arXiv:2310.03011, 2023.
  • [Esc32] F Esscher. On the probability function in the collective theory of risk. Skand. Aktuarie Tidskr., 15:175–195, 1932.
  • [FS11] Hans Föllmer and Alexander Schied. Stochastic finance: an introduction in discrete time. Walter de Gruyter, 2011.
  • [Gil19] András Gilyén. Quantum singular value transformation & its algorithmic applications. PhD thesis, University of Amsterdam, 2019.
  • [GL19] András Gilyén and Tongyang Li. Distributional property testing in a quantum world. arXiv preprint arXiv:1902.00814, 2019.
  • [GLM08] Vittorio Giovannetti, Seth Lloyd, and Lorenzo Maccone. Quantum random access memory. Physical review letters, 100(16):160501, 2008.
  • [GP22] András Gilyén and Alexander Poremba. Improved quantum algorithms for fidelity estimation. arXiv preprint arXiv:2203.15993, 2022.
  • [GS+93] Hans U Gerber, Elias SW Shiu, et al. Option pricing by Esscher transforms. HEC Ecole des hautes études commerciales, 1993.
  • [GSLW19] András Gilyén, Yuan Su, Guang Hao Low, and Nathan Wiebe. Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 193–204, 2019.
  • [Hjø11] Are Hjørungnes. Complex-valued matrix derivatives: with applications in signal processing and communications. Cambridge University Press, 2011.
  • [HS06] Friedrich Hubalek and Carlo Sgarra. Esscher transforms and the minimal entropy martingale measure for exponential lévy models. Quantitative finance, 6(02):125–145, 2006.
  • [Jay57] Edwin T Jaynes. Information theory and statistical mechanics. Physical review, 106(4):620, 1957.
  • [KD09] Ken Kreutz-Delgado. The complex gradient operator and the cr-calculus. arXiv preprint arXiv:0906.4835, 2009.
  • [KLL+17] Shelby Kimmel, Cedric Yen-Yu Lin, Guang Hao Low, Maris Ozols, and Theodore J Yoder. Hamiltonian simulation with optimal sample complexity. npj Quantum Information, 3(1):13, 2017.
  • [KQKR23] Kelvin Koor, Yixian Qiu, Leong Chuan Kwek, and Patrick Rebentrost. A short tutorial on Wirtinger Calculus with applications in quantum information. arXiv preprint arXiv:2312.04858, 2023.
  • [LC17] Guang Hao Low and Isaac L Chuang. Optimal hamiltonian simulation by quantum signal processing. Physical review letters, 118(1):010501, 2017.
  • [LC19] Guang Hao Low and Isaac L Chuang. Hamiltonian simulation by qubitization. Quantum, 3:163, 2019.
  • [LMR14] Seth Lloyd, Masoud Mohseni, and Patrick Rebentrost. Quantum principal component analysis. Nature Physics, 10(9):631–633, 2014.
  • [LYC16] Guang Hao Low, Theodore J Yoder, and Isaac L Chuang. Methodology of resonant equiangular composite quantum gates. Physical Review X, 6(4):041067, 2016.
  • [MJE+19] Sam McArdle, Tyson Jones, Suguru Endo, Ying Li, Simon C Benjamin, and Xiao Yuan. Variational ansatz-based quantum simulation of imaginary time evolution. npj Quantum Information, 5(1):75, 2019.
  • [MRTC21] John M Martyn, Zane M Rossi, Andrew K Tan, and Isaac L Chuang. Grand unification of quantum algorithms. PRX Quantum, 2(4):040203, 2021.
  • [MST+20] Mario Motta, Chong Sun, Adrian TK Tan, Matthew J O’Rourke, Erika Ye, Austin J Minnich, Fernando GSL Brandao, and Garnet Kin-Lic Chan. Determining eigenstates and thermal states on a quantum computer using quantum imaginary time evolution. Nature Physics, 16(2):205–210, 2020.
  • [OP07] Stefano Olivares and Matteo GA Paris. Quantum estimation via the minimum kullback entropy principle. Physical Review A, 76(4):042120, 2007.
  • [RF23] Patrick Rall and Bryce Fuller. Amplitude estimation from quantum signal processing. Quantum, 7:937, 2023.
  • [Sie76] David Siegmund. Importance sampling in the monte carlo study of sequential tests. The Annals of Statistics, pages 673–684, 1976.
  • [SJ80] John Shore and Rodney Johnson. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Transactions on information theory, 26(1):26–37, 1980.
  • [vAG18] Joran van Apeldoorn and András Gilyén. Improvements in quantum sdp-solving with applications. arXiv preprint arXiv:1804.05058, 2018.
  • [Wil13] Mark M Wilde. Quantum information theory. Cambridge university press, 2013.
  • [ZTF13] Mattia Zorzi, Francesco Ticozzi, and Augusto Ferrante. Minimum relative entropy for quantum estimation: Feasibility and general solution. IEEE transactions on information theory, 60(1):357–367, 2013.