Maximizing the Minimum Eigenvalue in Constant Dimension

Adam Brown [email protected], [email protected]; supported in part by NSF CCF-2106444 and NSF CCF-1910423. Georgia Institute of Technology. Aditi Laddha [email protected]; supported in part by the Institute for Foundations of Data Science at Yale and NSF CCF-2007443. Yale University. Mohit Singh; supported in part by NSF CCF-2106444 and NSF CCF-1910423. Georgia Institute of Technology.

Abstract

In an instance of the minimum eigenvalue problem, we are given a collection of $n$ vectors $v_{1},\ldots,v_{n}\in{\mathbb{R}^{d}}$, and the goal is to pick a subset $B\subseteq[n]$ of the given vectors to maximize the minimum eigenvalue of the matrix $\sum_{i\in B}v_{i}v_{i}^{\top}$. Often, additional combinatorial constraints such as a cardinality constraint $\left(|B|\leq k\right)$ or a matroid constraint ($B$ is a basis of a matroid defined on $[n]$) must be satisfied by the chosen set of vectors. The minimum eigenvalue problem with matroid constraints models a wide variety of problems including the Santa Claus problem, the E-design problem, and the constructive Kadison-Singer problem.

In this paper, we give a randomized algorithm that finds a set $B\subseteq[n]$ subject to any matroid constraint whose minimum eigenvalue is at least $(1-\epsilon)$ times the optimum, with high probability. The running time of the algorithm is $O\left(n^{O(d\log(d)/\epsilon^{2})}\right)$. In particular, our results give a polynomial time approximation scheme when the dimension of the vectors is constant. Our algorithm uses a convex programming relaxation of the problem after guessing a rescaling which allows us to apply pipage rounding and matrix Chernoff inequalities to round to a good solution. The key new component is a structural lemma which enables us to “guess” the appropriate rescaling, which could be of independent interest. Our approach generalizes the approximation guarantee to monotone, homogeneous functions and as such we can maximize $\det(\sum_{i\in B}v_{i}v_{i}^{\top})^{1/d}$, or minimize any norm of the eigenvalues of the matrix $\left(\sum_{i\in B}v_{i}v_{i}^{\top}\right)^{-1}$, with the same running time under some mild assumptions. As a byproduct, we also get a simple algorithm for an algorithmic version of the Kadison-Singer problem.

1 Introduction

Subset selection problems with spectral objectives offer a natural model for studying problems in a variety of fields, including numerical linear algebra [AB13], graph theory [BSS09], convex geometry [SEFM15, Nik15], resource allocation [AS07, CCK09], and optimal design of experiments [AZLSW17, NST19, LZ21], all under a single umbrella.

In this work, we consider the minimum eigenvalue problem. In an instance of the minimum eigenvalue problem, we are given a collection of $n$ vectors $v_{1},\ldots,v_{n}\in{\mathbb{R}^{d}}$, and the goal is to pick a subset $B\subseteq[n]$ of the given vectors to maximize the minimum eigenvalue of the matrix $\sum_{i\in B}v_{i}v_{i}^{\top}$. The selected set $B$ must satisfy additional constraints such as cardinality, partition, or more generally matroid constraints. While much of the focus in previous works [AZLSW17, NST19, LZ21] has been on cardinality constraints, in this work, we consider general matroid constraints. The generality of matroid constraints allows us to model the algorithmic version of the Kadison-Singer problem [MSS15, JMS22] as well as the Santa Claus allocation problem [AS07, AFS12, Fei08, CCK09, DRZ20] as special cases of the minimum eigenvalue problem.

The “discrepancy” formulation of the Kadison-Singer problem (shown to be equivalent to the original formulation in [Wea04] and proved in [MSS15]) states that given a set of vectors $v_{1},\ldots,v_{m}\in\mathbb{R}^{d}$ with $\left\lVert v_{i}\right\rVert\leq\alpha$ and $\sum_{i}v_{i}v_{i}^{\top}=I_{d}$ , there exists a partition of $[m]$ into two subsets $S_{1},S_{2}$ , such that for every $j$ , $\sum_{i\in S_{j}}v_{i}v_{i}^{\top}$ spectrally approximates $I_{d}/2$ to an additive factor of $O(\alpha)$ . Algorithmically, finding such a partition is equivalent to solving an instance of the minimum eigenvalue maximization problem under partition matroid constraints (see Section 2.1).

Another classical application of the minimum eigenvalue problem arises in the area of optimal design of experiments in statistics [Puk06, AZLSW17]. The goal in the design of experiments is to select a subset of vectors $S$ from a given list of vectors $\{v_{1},\ldots,v_{n}\}$ such that certain measures of the covariance matrix $\left(\sum_{i\in S}v_{i}v_{i}^{\top}\right)^{-1}$ are small. In particular, minimizing the maximum eigenvalue of the covariance matrix, classically known as the $E$ -design problem in statistics, is exactly the minimum eigenvalue problem. While much of the previous work has focused on the case when the selected set of measurements $S$ must satisfy cardinality constraints, our work generalizes this problem to be studied under general matroid constraints.

1.1 Our Results and Contributions

In this work, we present an approximation algorithm for the minimum eigenvalue problem for all matroids. We use the randomized rounding technique of pipage rounding to give a polynomial time approximation scheme (PTAS) when the dimension is constant.

Theorem 1

For any $\epsilon>0$ there is an $O\left(n^{O(d\log(d)/\epsilon^{2})}\right)$ -time algorithm which, given a collection of vectors $v_{1},\ldots,v_{n}\in\mathbb{R}^{d}$ and a matroid $\mathcal{M}=([n],\mathcal{I})$ returns a set $B\in\mathcal{I}$ such that with probability at least $1-d^{-4}$

\lambda_{\min}\left(\sum_{i\in B}v_{i}v_{i}^{\top}\right)\geq(1-\epsilon)\cdot\max_{B^{\star}\in\mathcal{I}}\;\lambda_{\min}\left(\sum_{i\in B^{\star}}v_{i}v_{i}^{\top}\right).

Our result generalizes to give a PTAS (for constant dimension) when the objective is a general matrix function satisfying certain technical properties. In particular, this implies that a similar result as in Theorem 1 is achievable when the objective is to maximize the determinant of $\sum_{i\in B}v_{i}v_{i}^{\top}$ or to minimize any norm of the eigenvalues of $(\sum_{i\in B}v_{i}v_{i}^{\top})^{-1}$ .

Theorem 2

Suppose we have a collection $\mathcal{V}=(v_{1},\ldots,v_{n})$ of vectors in $\mathbb{R}^{d}$, and a matroid $\mathcal{M}=([n],\mathcal{I})$. Let $f:\mathbb{S}_{d}^{+}\rightarrow\mathbb{R}$ be a concave, monotone, and homogeneous function given with a value and first order oracle. For any $\epsilon>0$, there is an $O\left(n^{O(d\log(d)/\epsilon^{2})}\right)$-time randomized algorithm, which takes $(\mathcal{V},\mathcal{M},f)$ as input and returns a set $B\in\mathcal{I}$ such that with probability at least $1-d^{-4}$,

f\left(\sum_{i\in B}v_{i}v_{i}^{\top}\right)\geq(1-\epsilon)\cdot\max_{B^{\star}\in\mathcal{I}}\;f\left(\sum_{i\in B^{\star}}v_{i}v_{i}^{\top}\right)\,.

Although Theorem 2 is stated in terms of maximizing concave functions, our algorithm can also be applied to minimize monotone and homogeneous convex functions (e.g., $\mathrm{trace}\left(\left(\sum_{i\in B}v_{i}v_{i}^{\top}\right)^{-1}\right)$) by considering the natural convex relaxation of the function over the matroid base polytope and using the same rounding strategy.

Technical Overview.

The first natural direction is to construct a convex programming relaxation for the problem and aim to apply randomized rounding methods to it.

\begin{array}[]{cl}\max&\lambda_{\min}\left(X\right)\\ &X=\sum\limits_{i=1}^{n}x_{i}\cdot v_{i}v_{i}^{\top}\\ &x\in\mathcal{P}(\mathcal{M})\\ &x\geq 0\end{array}

(CP)

Here, $\mathcal{P}(\mathcal{M})$ denotes the matroid base polytope of $\mathcal{M}$. Unfortunately, this direct approach fails because the natural relaxation has an unbounded integrality gap even in very special cases (see Appendix A.1). The main challenge is the presence of long vectors that contribute significantly towards the optimum solution. A natural way to formalize the contribution of a vector is to consider its leverage score. Indeed, if $T$ denotes the optimum solution and $A_{T}=\sum_{i\in T}v_{i}v_{i}^{\top}$, let $l_{i}=v_{i}^{\top}A_{T}^{-1}v_{i}$ be the leverage score of $v_{i}$ and let $S=\{i\in T:l_{i}\geq\frac{\epsilon^{2}}{\log d}\}$ be the set of vectors in the optimum solution with large leverage scores. The threshold $\frac{\epsilon^{2}}{\log d}$ is chosen to allow randomized rounding methods to work (see Lemma 2 for details). Since the leverage scores of all vectors in the optimum solution sum to exactly $d$, it follows that $|S|\leq\frac{d\log d}{\epsilon^{2}}$. Thus we could enumerate all such subsets $S$ in time $n^{O(d\log d/\epsilon^{2})}$. For each such guess $S$, we include $S$ in our solution, solve the convex program, and apply the randomized rounding method to its solution. The difficulty lies in ensuring that the convex program not only selects the vectors in $S$ (this is easily done by setting their indicator variables to one) but also avoids selecting any vector outside $T$ that has a large leverage score. The latter is crucial for the randomized rounding approach to work. However, since we did not guess $T$, we have no way to exclude these vectors in the convex program.
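The counting argument above can be checked numerically. The following is a minimal sketch on random data, not part of the algorithm: the helper name `leverage_scores`, the array `V_T` standing in for the vectors of an optimum solution $T$, and the chosen values of $d$ and $\epsilon$ are all illustrative assumptions.

```python
import numpy as np

def leverage_scores(V):
    """Rows of V are vectors v_i; returns l_i = v_i^T A^{-1} v_i for A = sum_i v_i v_i^T."""
    A = V.T @ V
    # Solve A X = V^T rather than forming A^{-1} explicitly.
    return np.einsum("ij,ji->i", V, np.linalg.solve(A, V.T))

rng = np.random.default_rng(0)
d, t, eps = 4, 60, 0.5
V_T = rng.standard_normal((t, d))   # hypothetical stand-in for the vectors of T
V_T[:3] *= 10.0                     # make a few vectors "long"

scores = leverage_scores(V_T)
threshold = eps**2 / np.log(d)
S = np.flatnonzero(scores >= threshold)   # indices with large leverage score
total = scores.sum()                      # equals trace(A^{-1} A) = d
size_bound = d * np.log(d) / eps**2       # |S| can never exceed this
```

Because the scores are nonnegative and sum to $d$, at most $d\log d/\epsilon^{2}$ of them can reach the threshold, whatever the data.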

To address this problem, we present a new structural lemma that enables us to compute the leverage score as given by matrix $A_{S}=\sum_{i\in S}v_{i}v_{i}^{\top}$. The lemma shows that there are few vectors with large leverage scores, even when using $A_{S}$ instead of $A_{T}$. Observe that $A_{S}^{-1}\succeq A_{T}^{-1}$ and therefore, the leverage scores with respect to $A_{S}$ are larger. Nevertheless, we still show a similar bound in the following lemma.

Lemma 1

For any set $T$ and a set of vectors $\{v_{i}:i\in T\}$ in $\mathbb{R}^{d}$ such that $\sum_{i\in T}v_{i}v_{i}^{\top}$ is invertible, there exists a subset $S\subseteq T$ such that $|S|=O(d\log(d)/\epsilon^{2})$ , $A_{S}=\sum_{i\in S}v_{i}v_{i}^{\top}$ is invertible, and for all $i\in T\backslash S$ ,

v_{i}^{\top}A_{S}^{-1}v_{i}\leq\frac{\epsilon^{2}}{10\log(d)}\,.

With the help of Lemma 1, we can now guess the set $S$ and insist that the convex program includes all these vectors in the chosen subset. More importantly, it allows us to insist that all vectors $v_{i}$ not in $S$ such that $v_{i}^{\top}A_{S}^{-1}v_{i}>\frac{\epsilon^{2}}{10\log(d)}$ not be included in the chosen solution. The last step can be done since we have guessed the set $S$ . This allows us to apply the randomized rounding approach to the convex programming solution.

There are some points worth mentioning about the randomized rounding approach. When the constraint matroid is a partition matroid, randomized rounding is a natural choice: for each part, the convex programming solution can be interpreted as a probability distribution over vectors in that part. Independently, for each part, pick one of the vectors with probability given by the convex programming solution. A simple application of the matrix Chernoff bound and the fact that leverage scores are all small due to Lemma 1 gives us the desired result. Due to the simplicity of the approach for partition matroids as well as the applicability of these constraints, we first prove the result for partition matroids in Section 2. We also show the application of our result to obtain an algorithmic version of the Kadison-Singer problem [MSS15] for constant dimension. We slightly improve the run time compared to the recent work [JMS22].

Corollary 1

Suppose we are given a collection $\mathcal{U}=(u_{1},\ldots,u_{n})$ of vectors in $\mathbb{R}^{d}$ with $\|u_{i}\|^{2}\leq\alpha$ for all $i\in[n]$ and $\sum_{i=1}^{n}u_{i}u_{i}^{\top}=I_{d}$, and a constant $c>0$ such that there exists a set $T^{*}$ satisfying

\left(\frac{1}{2}-c\sqrt{\alpha}\right)I_{d}\preceq\sum_{i\in T^{*}}u_{i}u_{i}^{\top}\preceq\left(\frac{1}{2}+c\sqrt{\alpha}\right)I_{d}\,.

For any $\epsilon>0$, there exists a randomized algorithm which, given $\mathcal{U}$ and $c$ as input, returns a set $T$ such that

(1-\epsilon)\cdot\left(\frac{1}{2}-c\sqrt{\alpha}\right)I_{d}\preceq\sum_{i\in T}u_{i}u_{i}^{\top}\preceq(1+\epsilon)\cdot\left(\frac{1}{2}+c\sqrt{\alpha}\right)I_{d}\,,

with probability at least $1-O(d^{-4})$ . The run time of the algorithm is $O(n^{O(d\log{d}/\epsilon^{2})})$ .

For general matroids, a straightforward application of randomized rounding does not work since it will not ensure that the chosen set is an independent set in the matroid. Instead, we use pipage rounding for general matroids, which involves randomly walking in the matroid polytope to return a vertex while ensuring that the output solution has even better concentration than is given by independent randomized rounding. To show these concentration results, we build on the work of Harvey and Olver [HO14] and give lower tail bounds on the distribution obtained via pipage rounding in Lemma 4.

1.2 Related Work

The minimum eigenvalue problem with partition constraints can be interpreted as a generalization of the max-min allocation problem. In the case of cardinality constraints, it can also model problems from experimental design and spectral sparsification. We give an overview of prior work for these special cases.

Max-min allocation and Santa Claus:

In the max-min allocation problem, we are given a set $[d]$ of agents and a set $[n]$ of items where agent $j\in[d]$ has valuation $h_{ij}\geq 0$ for item $i$ . The goal is to select an assignment $\sigma:[n]\rightarrow[d]$ which maximizes

\min_{j\in[d]}\sum_{i:\sigma(i)=j}h_{ij}.

This can be seen as a special case of the minimum eigenvalue problem with partition constraints.
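One standard way to see this reduction (the encoding is not spelled out in the text, so it is stated here as an assumption) is to create one part per item $i$ and set $v_{ij}=\sqrt{h_{ij}}\,e_{j}$, where $e_{j}$ is the $j$-th standard basis vector. The resulting matrix is diagonal, and its $j$-th diagonal entry is the total value received by agent $j$. A small sketch with hypothetical helper names and a toy valuation matrix:

```python
import numpy as np

def allocation_vectors(h):
    """h[i, j] = value of item i for agent j. Returns V with V[i, j] = sqrt(h[i, j]) * e_j."""
    n, d = h.shape
    V = np.zeros((n, d, d))
    for j in range(d):
        V[:, j, j] = np.sqrt(h[:, j])
    return V

def min_eigenvalue(V, sigma):
    """lambda_min of sum_i v_{i sigma(i)} v_{i sigma(i)}^T."""
    chosen = V[np.arange(len(sigma)), sigma]   # one vector per part (item)
    return np.linalg.eigvalsh(chosen.T @ chosen)[0]

def min_agent_value(h, sigma):
    """The max-min allocation objective for the assignment sigma."""
    return min(h[sigma == j, j].sum() for j in range(h.shape[1]))

h = np.array([[3.0, 1.0], [2.0, 5.0], [4.0, 4.0], [0.0, 2.0]])  # toy valuations
V = allocation_vectors(h)
sigma_a = np.array([0, 1, 0, 1])   # agent 0 gets items 0 and 2, agent 1 gets items 1 and 3
sigma_b = np.array([1, 1, 0, 0])   # agent 0 gets items 2 and 3, agent 1 gets items 0 and 1
```

Agents are indexed from 0 in the code. For both assignments the two objectives coincide, which is exactly the claimed special case.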

Bansal and Sviridenko [BS06] introduced the configuration LP as a relaxation for the max-min allocation problem but showed that it has an integrality gap of $\Omega(\sqrt{n})$ [BS06]. Asadpour and Saberi [AS07] gave a rounding scheme for the same LP, which achieves an $O(\sqrt{n}\log^{3}n)$ -approximation. This was later improved by Chakrabarty et al. [CCK09] to an $\tilde{O}(n^{\epsilon})$ -approximation for any $\epsilon\in\Omega(\log\log n/\log n)$ by iteratively constructing new instances with smaller integrality gap.

A further special case is the Santa Claus problem where each item $i$ has an intrinsic value $H_{i}\geq 0$ such that $h_{ij}\in\{0,H_{i}\}$ for all players $j\in[d]$ . Here, Bansal and Sviridenko [BS06] used the configuration LP to find an $O(\log\log n/\log\log\log n)$ -approximation. Feige [Fei08] non-constructively showed a constant upper bound on the integrality gap of the configuration LP for the Santa Claus problem by iteratively applying the Lovász Local Lemma. The current best bound is due to Haxell and Szabó [HS23], who used new topological techniques to show that the integrality gap is at most $3.534$ . Bounds on the integrality gap do not immediately lead to efficient approximation algorithms, but Davies et al. [DRZ20] recently gave an algorithm for a more general setting that can be used to achieve a $(4+\epsilon)$ -approximation for the Santa Claus problem.

Experimental Design (E-optimal Design):

Even with cardinality constraints (uniform matroid of rank $k$ ), the minimum eigenvalue problem is NP-hard [cMI09]. Allen-Zhu et al. [AZLSW17] showed that it is possible to deterministically find a $(1-\epsilon)$ -approximation so long as $k\geq\Omega(d/\epsilon^{2})$ by rounding the natural convex relaxation. They also conjectured that this requirement was necessary. This conjecture was confirmed in [NST19], where they showed an integrality gap instance for the convex relaxation. Recently Lau and Zhou [LZ21] have built on the regret minimization framework from [AZLSW17] to show that a modified local search algorithm with a “smoothed” objective works as long as there is a near-optimal solution with a good condition number.

Spectral Sparsification and Kadison-Singer.

The problem of rounding the natural convex programming relaxation for the minimum eigenvalue problem is closely related to spectral sparsification [BSS09] and the Kadison-Singer problem [MSS15]. In spectral sparsification [BSS09], the goal is to pick a small subset of vectors $S\subseteq[n]$ such that $\sum_{i\in S}w_{i}v_{i}v_{i}^{\top}$ spectrally approximates $\sum_{i\in[n]}v_{i}v_{i}^{\top}$ for some weights $w_{i}$ . In the cardinality constrained minimum eigenvalue problem, rounding the convex programming solution involves finding a small set $S$ , such that $\sum_{i\in S}v_{i}v_{i}^{\top}$ spectrally approximates $\sum_{i\in[n]}x_{i}v_{i}v_{i}^{\top}$ , where the weights $x_{i}$ form the solution to the convex relaxation. Indeed [AZLSW17] essentially build on this connection to obtain their results for the $E$ -design problem discussed earlier. The Kadison-Singer problem [MSS15] is closely related to the minimum eigenvalue problem under a partition matroid constraint. We utilize this connection in Corollary 1 to give an algorithmic version of the Kadison-Singer problem for constant dimensions. More generally, the Kadison-Singer problem can be reformulated as showing that the integrality gap of the natural relaxation of the minimum eigenvalue problem under partition matroid constraints is at most $1/(1-\epsilon)$ if the length of each vector is at most $O(\epsilon)$ . We discuss this connection in Section 4.

2 The Algorithm for Partition Matroids

To highlight the main idea of our algorithm, we first prove Theorem 1 for the special case of partition matroids. Let $\mathcal{M}=(E,\mathcal{I})$ be a partition matroid where $E=P_{1}\cup\cdots\cup P_{k}$ is a disjoint union of parts, each containing $n$ elements, and suppose we have a collection of vectors $v_{ij}$ for $i\in[k]$ and $j\in P_{i}$. The goal is to select an element $\sigma(i)\in P_{i}$ for each $i$ to maximize $\lambda_{\min}\left(\sum_{i=1}^{k}v_{i\sigma(i)}v_{i\sigma(i)}^{\top}\right).$

We can construct the natural convex relaxation of this problem as follows. For each $i\in[k]$ and $j\in P_{i}$, we add a decision variable $x_{ij}$ which represents whether we select the vector $v_{ij}$ from part $P_{i}$, i.e., whether $\sigma(i)=j$. Then we get the convex program

\begin{array}[]{cl}\max&\lambda_{\min}\left(X\right)\\ &X=\sum\limits_{i=1}^{k}\sum\limits_{j\in P_{i}}x_{ij}\cdot v_{ij}v_{ij}^{\top}\\ &\sum\limits_{j\in P_{i}}x_{ij}=1,\quad\forall i\in[k]\\ &x\geq 0\end{array}

The constraint $\sum_{j\in P_{i}}x_{ij}=1$ ensures that we have a probability distribution over the possible assignments within each part in the optimal solution.

Given an optimal solution $x^{\star}$ with value $OPT$ , a natural rounding strategy is to round independently within each part. Following this rounding strategy, we get a rank $1$ random matrix $M_{i}$ for each part $P_{i}$ with

\operatorname{\mathrm{Pr}}(M_{i}=v_{ij}v_{ij}^{\top})=x^{\star}_{ij}\,,\quad\forall j\in P_{i}\,.

The following matrix concentration inequality bounds the probability of failure of this rounding strategy.

Theorem 3

[Tro15, Theorem 5.1.1] Consider independent random matrices $M_{1},\ldots,M_{k}\in\mathbb{S}_{d}^{+}$ . Set

\mu_{\min}=\lambda_{\min}\left(\operatorname{\mathbb{E}}\left[\sum_{i=1}^{k}M_{i}\right]\right)\,.

If $\lambda_{\max}(M_{i})\leq R$ for all $i\in[k]$ a.s. then

\operatorname{\mathrm{Pr}}\left(\lambda_{\min}\left(\sum_{i=1}^{k}M_{i}\right)<(1-\epsilon)\mu_{\min}\right)\leq d\cdot\exp\left(\frac{-\epsilon^{2}\mu_{\min}}{2R}\right).

If we round according to the optimal solution $x^{\star}$ then $\operatorname{\mathbb{E}}\left[\sum_{i=1}^{k}M_{i}\right]=\sum_{i=1}^{k}\sum_{j\in P_{i}}x_{ij}^{\star}v_{ij}v_{ij}^{\top}$.

So $\mu_{\min}=OPT$ , and since for our particular case $M_{i}$ are rank $1$ , $R=\max_{i}\lambda_{\max}(M_{i})=\max_{ij}\|v_{ij}\|^{2}$ . To bound the failure probability, we want $R\approx\epsilon^{2}/\log(d)$ , which in turn requires that $\max_{ij}\|v_{ij}\|^{2}=O(\epsilon^{2}/\log(d))$ . This is a very strong assumption on an instance.
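To make the role of $R$ concrete, the sketch below rounds a fractional point independently within each part and evaluates the right-hand side of Theorem 3. It is only an illustration under stated assumptions: the point `x` is the uniform distribution on each part rather than an optimum of the convex program, the vectors are random, and the names `round_independently` and `chernoff_failure_bound` are hypothetical.

```python
import numpy as np

def round_independently(x, rng):
    """x[i] is a probability vector over part P_i; returns sigma with sigma[i] drawn from x[i]."""
    return np.array([rng.choice(len(p), p=p) for p in x])

def chernoff_failure_bound(d, eps, mu_min, R):
    """Right-hand side of Theorem 3: d * exp(-eps^2 * mu_min / (2 R))."""
    return d * np.exp(-eps**2 * mu_min / (2.0 * R))

rng = np.random.default_rng(1)
k, m, d, eps = 400, 3, 3, 0.5
V = rng.standard_normal((k, m, d)) * 0.2   # V[i, j] = v_ij, deliberately short
x = np.full((k, m), 1.0 / m)               # feasible fractional point (not an optimum)

X = np.einsum("ij,ija,ijb->ab", x, V, V)   # expectation of the rounded matrix
mu_min = np.linalg.eigvalsh(X)[0]
R = (V**2).sum(axis=2).max()               # max_ij ||v_ij||^2 bounds lambda_max(M_i)

sigma = round_independently(x, rng)
chosen = V[np.arange(k), sigma]
lam = np.linalg.eigvalsh(chosen.T @ chosen)[0]   # lambda_min after rounding
bound = chernoff_failure_bound(d, eps, mu_min, R)
```

The bound is only useful when $R$ is small relative to $\epsilon^{2}\mu_{\min}/\log d$; a single long vector in the support makes it vacuous, which motivates the rescaling that follows.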

The plan is to “guess” a suitable change of basis such that all the vectors in the support of our optimal solution have a small norm. This will be useful because of the following standard, but slightly more flexible, version of the preceding matrix concentration inequality.

Corollary 2

Consider independent random matrices $M_{1},\ldots,M_{k}\in\mathbb{S}_{d}^{+}$ and let $A$ be an arbitrary positive definite matrix. Define $\mu_{\min}:=\lambda_{\min}\left(A^{-1/2}\operatorname{\mathbb{E}}\left[\sum_{i=1}^{k}M_{i}\right]A^{-1/2}\right).$ If $\lambda_{\max}(A^{-1/2}M_{i}A^{-1/2})\leq R$ for all $i\in[k]$ a.s. then

\operatorname{\mathrm{Pr}}\left(\sum_{i=1}^{k}M_{i}\nsucceq(1-\epsilon)\mu_{\min}\cdot A\right)\leq d\cdot\exp\left(\frac{-\epsilon^{2}\mu_{\min}}{2R}\right).

Again, since $M_{i}$ is rank $1$ for our case, we have $R=\max_{i\in[k]}\;\lambda_{\max}(A^{-1/2}M_{i}A^{-1/2})=\max_{i,j}\;v_{ij}^{\top}A^{-1}v_{ij}\,.$ So, to use this corollary, we first need to find a matrix $A$ such that $v_{ij}^{\top}A^{-1}v_{ij}=O(\epsilon^{2}/\log(d))$ for all $i\in[k],j\in P_{i}$. We will only need to consider matrices $A$ of a specific form that uses the input vectors.

Given a subset $S\subseteq E$, we define $A_{S}:=\sum_{(i,j)\in S}v_{ij}v_{ij}^{\top}$, and consider the set of long vectors in the norm induced by $A_{S}$: $L(S):=\left\{(i,j)\in E\backslash S:v_{ij}^{\top}A_{S}^{-1}v_{ij}>\frac{\epsilon^{2}}{10\log(d)}\right\}.$ For a fixed set $S$, the following convex program ensures that $S$ is included in the solution and no “long” vectors from $L(S)$ are included in the solution.

\begin{array}[]{cl}\max&\lambda_{\min}\left(X\right)\\ &X=\sum\limits_{i=1}^{k}\sum\limits_{j\in P_{i}}x_{ij}\cdot v_{ij}v_{ij}^{\top}\\ &\sum\limits_{j\in P_{i}}x_{ij}=1,\,\,\forall i\in[k]\\ &x_{ij}=0,\,\,\forall(i,j)\in L(S)\\ &x_{ij}=1,\,\,\forall(i,j)\in S\\ &x\geq 0\end{array}

(CP(S))
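The extra constraints of CP(S) are easy to compute once $S$ is fixed. The sketch below is a hedged illustration: it flattens the ground set $E$ into indices $0,\ldots,N-1$, assumes $A_{S}$ is invertible, and uses the hypothetical helpers `long_vectors` and `fixed_variables`; solving the convex program itself is not shown.

```python
import numpy as np

def long_vectors(V, S, eps):
    """V: (N, d) array of ground-set vectors; S: list of indices with A_S invertible.
    Returns the indices outside S whose leverage w.r.t. A_S exceeds eps^2 / (10 log d)."""
    d = V.shape[1]
    A_S = V[S].T @ V[S]
    lev = np.einsum("ij,ji->i", V, np.linalg.solve(A_S, V.T))   # v^T A_S^{-1} v
    threshold = eps**2 / (10.0 * np.log(d))
    in_S = np.zeros(len(V), dtype=bool)
    in_S[S] = True
    return np.flatnonzero((lev > threshold) & ~in_S)

def fixed_variables(V, S, eps):
    """Values pinned by CP(S): 1.0 on S, 0.0 on L(S); None marks a free variable."""
    fixed = [None] * len(V)
    for e in S:
        fixed[e] = 1.0
    for e in long_vectors(V, S, eps):
        fixed[e] = 0.0
    return fixed

# Toy instance in d = 2 with S = {0, 1}, so that A_S is the identity.
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.01, 0.0], [0.5, 0.5]])
S = [0, 1]
L = long_vectors(V, S, eps=0.5)
fixed = fixed_variables(V, S, eps=0.5)
```

Here the threshold is $0.25/(10\log 2)\approx 0.036$, so the short vector at index 2 stays free while the vector at index 3 (leverage $0.5$) is excluded.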

Because of the extra constraints excluding “long” vectors, we could now use the flexible matrix concentration inequalities to randomly round the optimal solution.

But, it is not clear that there is a good choice of $S$ for which the convex program CP(S) is still a relaxation of the original problem. Lemma 1, which we restate here for the reader’s convenience, shows that there exists a suitable set $S$ that is not too large.

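Lemma 1 (restated)

For any set $T$ and a set of vectors $\{v_{i}:i\in T\}$ in $\mathbb{R}^{d}$ such that $\sum_{i\in T}v_{i}v_{i}^{\top}$ is invertible, there exists a subset $S\subseteq T$ such that $|S|=O(d\log(d)/\epsilon^{2})$, $A_{S}=\sum_{i\in S}v_{i}v_{i}^{\top}$ is invertible, and for all $i\in T\backslash S$,

v_{i}^{\top}A_{S}^{-1}v_{i}\leq\frac{\epsilon^{2}}{10\log(d)}\,.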

The proof of this lemma is inspired by the local search algorithm of [MSTX19].

At first glance, it may not be apparent why a subset satisfying the conditions of Lemma 1 should exist. However, in the proof, we show that any subset of $T$ that is locally optimal with respect to a local search criterion indeed satisfies the guarantees of Lemma 1.

Proof (of Lemma 1) We consider the local search process of [MSTX19]. Starting with a set $S\subseteq T$ of size $\ell$ such that $A=\sum_{i\in S}v_{i}v_{i}^{\top}$ is invertible, we apply the following update rule: for any $j\in T\backslash S$ and $i\in S$, if $\det(A)<\det(A-v_{i}v_{i}^{\top}+v_{j}v_{j}^{\top})$, update $S\leftarrow(S\backslash\{i\})\cup\{j\}$ and iterate.

Let $S\subseteq T$ be a locally optimal (under single element swaps) solution for this process (such an $S$ is a locally optimal solution of the determinant maximization problem subject to the cardinality constraint $|S|\leq\ell$), and let $A=\sum_{i\in S}v_{i}v_{i}^{\top}$. More concretely, this means that for all $i\in S$ and $j\in T\backslash S$,

\det(A)\geq\det(A-v_{i}v_{i}^{\top}+v_{j}v_{j}^{\top})\,.

We calculate the determinant on the right-hand side using the matrix determinant lemma,

\begin{aligned}\det(A-v_{i}v_{i}^{\top}+v_{j}v_{j}^{\top})&=\det\left(A+\begin{bmatrix}v_{i}&v_{j}\end{bmatrix}\begin{bmatrix}-v_{i}&v_{j}\end{bmatrix}^{\top}\right)\\ &=\det(A)\cdot\det\left(I_{2}+\begin{bmatrix}-v_{i}&v_{j}\end{bmatrix}^{\top}A^{-1}\begin{bmatrix}v_{i}&v_{j}\end{bmatrix}\right)\\ &=\det(A)\cdot\left((1-v_{i}^{\top}A^{-1}v_{i})(1+v_{j}^{\top}A^{-1}v_{j})+(v_{i}^{\top}A^{-1}v_{j})^{2}\right)\,.\end{aligned}
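As a quick numerical sanity check of this swap identity, the following sketch compares both sides on random data. The dimensions, the random vectors, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 5
V = rng.standard_normal((12, d))
A = V.T @ V                          # A = sum of v v^T over the current set
v_i = V[0]                           # a vector inside the set
v_j = rng.standard_normal(d)         # a candidate vector outside the set

A_inv = np.linalg.inv(A)
a = v_i @ A_inv @ v_i                # v_i^T A^{-1} v_i
b = v_j @ A_inv @ v_j                # v_j^T A^{-1} v_j
c = v_i @ A_inv @ v_j                # v_i^T A^{-1} v_j

lhs = np.linalg.det(A - np.outer(v_i, v_i) + np.outer(v_j, v_j))
rhs = np.linalg.det(A) * ((1 - a) * (1 + b) + c**2)
```

The two quantities `lhs` and `rhs` agree up to floating-point error, so a swap improves the determinant exactly when the bracketed factor exceeds $1$.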

So local optimality implies that for every $i\in S$ and $j\in T\backslash S$, $(1-v_{i}^{\top}A^{-1}v_{i})(1+v_{j}^{\top}A^{-1}v_{j})+(v_{i}^{\top}A^{-1}v_{j})^{2}\leq 1$. Rearranging this inequality we get

v_{j}^{\top}A^{-1}v_{j}-(v_{i}^{\top}A^{-1}v_{i})\cdot(v_{j}^{\top}A^{-1}v_{j})+(v_{i}^{\top}A^{-1}v_{j})^{2}\leq v_{i}^{\top}A^{-1}v_{i}\,.

(1)

Note that $\sum_{i\in S}v_{i}^{\top}A^{-1}v_{i}=\langle A,A^{-1}\rangle=d$ and $\sum_{i\in S}(v_{i}^{\top}A^{-1}v_{j})^{2}=v_{j}^{\top}A^{-1}AA^{-1}v_{j}=v_{j}^{\top}A^{-1}v_{j}$. So for a fixed $j\in T\backslash S$, summing inequality (1) over all $i\in S$ implies $\ell\cdot v_{j}^{\top}A^{-1}v_{j}-d\cdot v_{j}^{\top}A^{-1}v_{j}+v_{j}^{\top}A^{-1}v_{j}\leq d$. Rearranging, we see that for any $j\in T\backslash S$,

v_{j}^{\top}A^{-1}v_{j}\leq\frac{d}{\ell-d+1}=\frac{\epsilon^{2}}{10\log(d)}\,,

where the last equality follows by choosing $\ell=10d\log(d)/\epsilon^{2}+d-1$ . $\Box$
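The proof is constructive, and the swap rule can be sketched directly. The code below is a minimal illustration rather than the implementation of [MSTX19]: it assumes the first $\ell$ vectors span $\mathbb{R}^{d}$, uses a small tolerance `tol` to guard against floating-point cycling, and picks the best swap at each step (any improving swap would do). The function name `local_search` is hypothetical.

```python
import numpy as np

def local_search(V, ell, tol=1e-9):
    """Single-swap local search for det(sum_{i in S} v_i v_i^T) over |S| = ell. V: (t, d)."""
    t = len(V)
    S = list(range(ell))                      # assumes these vectors span R^d
    while True:
        out = [j for j in range(t) if j not in S]
        A = V[S].T @ V[S]
        G = V @ np.linalg.solve(A, V.T)       # G[p, q] = v_p^T A^{-1} v_q
        lev = np.diag(G)
        # ratio[p, q] = det after swapping S[p] for out[q], divided by det(A)
        ratio = (1 - lev[S])[:, None] * (1 + lev[out])[None, :] + G[np.ix_(S, out)] ** 2
        p, q = np.unravel_index(np.argmax(ratio), ratio.shape)
        if ratio[p, q] <= 1 + tol:
            return S                          # locally optimal up to tol
        S[p] = out[q]

rng = np.random.default_rng(4)
d, t, ell = 3, 40, 10
V = rng.standard_normal((t, d))
S = local_search(V, ell)
A_S = V[S].T @ V[S]
outside = [j for j in range(t) if j not in S]
lev_out = np.einsum("ij,ji->i", V[outside], np.linalg.solve(A_S, V[outside].T))
max_lev = lev_out.max()
guarantee = d / (ell - d + 1)                 # the bound from the proof
```

Each swap multiplies the determinant by more than $1+\texttt{tol}$, so no set repeats and the loop terminates; at termination every vector outside $S$ has leverage at most $d/(\ell-d+1)$ up to the tolerance.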

We will apply this lemma to the case when $T=\{v_{i\sigma^{\star}(i)}:i\in[k]\}$, where $\sigma^{\star}$ is the choice function that maximizes the minimum eigenvalue, i.e., when $T$ contains the vectors from an optimal integral assignment. In particular, we get the following corollary.

Lemma 2

There is a subset $S\subseteq E$ such that $|S|=O(d\log(d)/\epsilon^{2})$ , and the convex program CP(S) is a relaxation for the minimum eigenvalue problem.

As $d$ is a constant, the size of the set $S$ we search for is also constant. Thus, there are at most $O(n^{O(d\log(d)/\epsilon^{2})})$ possible choices for $S$ . We will consider each choice in turn to guess the correct set. Note that trying every set of the appropriate size will be the dominant factor in determining the algorithm’s runtime.

The following lemma proves that for any fixed subset $S$ , rounding the optimal solution to CP(S) gives a good approximation to the optimal value of CP(S).

Lemma 3

Let $S\subseteq E$ be an independent set, and let $x^{\star}$ be the optimal solution to CP(S). Then rounding randomly in each part outputs an assignment $\sigma:[k]\rightarrow E$ with $\sigma(i)\in P_{i}$ such that

\operatorname{\mathrm{Pr}}\left[\lambda_{\min}\left(\sum_{i=1}^{k}v_{i\sigma(i)}v_{i\sigma(i)}^{\top}\right)<(1-\epsilon)\cdot\lambda_{\min}\left(\sum_{i=1}^{k}\sum_{j\in P_{i}}x^{\star}_{ij}\,v_{ij}v_{ij}^{\top}\right)\right]<d^{-4}.

Proof Let $X=\sum_{i=1}^{k}\sum_{j\in P_{i}}x^{\star}_{ij}\,v_{ij}v_{ij}^{\top}$. The matrix $X$ contains $v_{ij}v_{ij}^{\top}$ with coefficient $1$ for every $(i,j)\in S$. Thus $X\succeq A_{S}=\sum_{(i,j)\in S}v_{ij}v_{ij}^{\top}$, so $X^{-1}\preceq A_{S}^{-1}$ and $v_{ij}^{\top}X^{-1}v_{ij}\leq\frac{\epsilon^{2}}{10\log(d)}$ for all $(i,j)\notin L(S)\cup S$. For the purposes of the analysis, for every $(i,j)\in S$ we can replace the vector $v_{ij}$ with $m_{ij}=\lceil(v_{ij}^{\top}X^{-1}v_{ij})\cdot 10\log(d)/\epsilon^{2}\rceil$ copies of the scaled vector $v_{ij}/\sqrt{m_{ij}}$, so that the copies sum to $v_{ij}v_{ij}^{\top}$ and each copy has squared length at most $\epsilon^{2}/(10\log(d))$ with respect to $X$. Since all elements of $S$ get value $1$ in $x^{\star}$, we can similarly extend the vector $x^{\star}$ so that it has a $1$ in all the copied entries. Since these entries of $x^{\star}$ are deterministic, nothing changes about the resulting distribution over matrices, but we can now assume that $v_{ij}^{\top}X^{-1}v_{ij}\leq\epsilon^{2}/(10\log(d))$ for all $(i,j)$ in the support of $x^{\star}$.

Next, define random matrices $M_{1},\ldots,M_{k}$ such that for any $i\in[k]$, $\operatorname{\mathrm{Pr}}\left(M_{i}=v_{ij}v_{ij}^{\top}\right)=x^{\star}_{ij}$ for all $j\in P_{i}$. We then apply Corollary 2 with $A=X$ on random matrices $M_{1},\ldots,M_{k}$. Since $x^{\star}$ is not supported on $L(S)$,

R=\max_{i}\;\lambda_{\max}(X^{-1/2}M_{i}X^{-1/2})\leq\frac{\epsilon^{2}}{10\log(d)}\,.

In addition, as $\operatorname{\mathbb{E}}\left(\sum_{i}M_{i}\right)=X$ by definition, we have $\mu_{\min}=1$ .

Thus, if $\sigma:[k]\rightarrow E$ is the choice function obtained by independent rounding,

\operatorname{\mathrm{Pr}}_{\sigma}\left[\sum_{i=1}^{k}v_{i\sigma(i)}v_{i\sigma(i)}^{\top}\nsucceq(1-\epsilon)X\right]\leq d\cdot\exp(-5\log(d))=d^{-4}.

We conclude that $\lambda_{\min}\left(\sum_{i=1}^{k}v_{i\sigma(i)}v_{i\sigma(i)}^{\top}\right)\geq(1-\epsilon)\lambda_{\min}\left(X\right)$ with probability at least $1-d^{-4}$. $\Box$

Combining this lemma with the earlier guarantee that there exists a set $S$ of reasonable size such that CP(S) is a relaxation, we get the following algorithm: try all possible choices for the set $S$ and return the solution with the best objective.

Algorithm 1 Algorithm to find an approximation to OPT

1: Input: Partition matroid $\mathcal{M}$ with $k$ parts $P_{1},\ldots,P_{k}$

2: for each $S\subseteq[n]$ such that $|S|=10d\log(d)/\epsilon^{2}+d-1$

3: $x^{*}\leftarrow$ optimal solution of CP(S) for matroid $\mathcal{M}$

4: For each $i\in[k]$, assign $\sigma_{S}(i)=j$ with probability $x^{*}_{ij}$

5: end for

6: Return the choice function $\sigma_{S}$ which maximizes $\lambda_{\min}\left(\sum_{i}v_{i\sigma_{S}(i)}v_{i\sigma_{S}(i)}^{\top}\right)$ over all choices of $S$

Proof (of Theorem 1 for Partition Matroids) By Lemma 2 there is a set $S\subseteq E$ with $|S|=O(d\log d/\epsilon^{2})$ such that CP(S) is a relaxation.

Let $x^{*}$ be an optimal solution of CP(S) and let $\sigma^{*}$ be the choice function of the optimal basis for the minimum eigenvalue problem. Since CP(S) is a relaxation, we have $\lambda_{\min}\left(\sum_{i=1}^{k}v_{i\sigma^{*}(i)}v_{i\sigma^{*}(i)}^{\top}\right)\leq\lambda_{\min}\left(\sum_{ij}x^{*}_{ij}v_{ij}v_{ij}^{\top}\right)$.

Lemma 3 implies that with high probability, the choice function $\sigma_{S}$ obtained by rounding $x^{*}$ is a good approximation to CP(S). So combining Lemma 3 with the previous inequality gives

\lambda_{\min}\left(\sum_{i=1}^{k}v_{i\sigma_{S}(i)}v_{i\sigma_{S}(i)}^{\top}\right)\geq(1-\epsilon)\cdot\lambda_{\min}\left(\sum_{ij}x^{*}_{ij}v_{ij}v_{ij}^{\top}\right)\geq(1-\epsilon)\cdot\lambda_{\min}\left(\sum_{i=1}^{k}v_{i\sigma^{*}(i)}v_{i\sigma^{*}(i)}^{\top}\right)\,,

with probability at least $1-d^{-4}$. Since step 6 of Algorithm 1 returns the best choice function over all guesses of $S$, the output $\sigma$ is at least as good as $\sigma_{S}$ with the same probability. $\Box$

2.1 Application: Algorithmic Kadison-Singer Problem

The Kadison-Singer conjecture was resolved in [MSS15] using the following theorem which can be interpreted as a generalization of Weaver’s conjecture [Wea04].

Theorem 4

[MSS15, Corollary 1.5 with $r=2$] Let $u_{1},\ldots,u_{m}\in\mathbb{R}^{d}$ be vectors such that $\sum_{i=1}^{m}u_{i}u_{i}^{\top}=I$ and $\|u_{i}\|^{2}\leq\alpha$ for all $i$. There exists a set $T\subseteq[m]$ such that

\left(\frac{1}{2}-3\sqrt{\alpha}\right)I_{d}\preceq\sum_{i\in T}u_{i}u_{i}^{\top}\preceq\left(\frac{1}{2}+3\sqrt{\alpha}\right)I_{d}.

Their proof is based on analyzing interlacing families of polynomials and does not lead to an efficient algorithm to find such a subset $T$ .

[JMS22] introduces an algorithmic form of the Kadison-Singer problem, which asks to find such a subset assuming it exists. Given a constant $c>0$ and a set of vectors $u_{1},\ldots,u_{m}\in\mathbb{R}^{d}$ such that $\|u_{i}\|^{2}\leq\alpha$ and $\sum_{i=1}^{m}u_{i}u_{i}^{\top}=I$, for which there exists a subset $T\subseteq[m]$ satisfying

\left(\frac{1}{2}-c\sqrt{\alpha}\right)I\preceq\sum_{i\in T}u_{i}u_{i}^{\top}\preceq\left(\frac{1}{2}+c\sqrt{\alpha}\right)I,

(2)

the goal is actually to find a set $T\subseteq[m]$ which satisfies the above condition. This problem is FNP-hard when $c=1/(4\sqrt{2})$ for general values of $d$ [JMS22, Theorem 2].

Their main result [JMS22, Theorem 1] is an algorithm with running time

O\left(\binom{m}{k}\cdot\text{poly}(m,d)\right)\text{ for }k=O\left(\frac{d}{% \epsilon^{2}}\log(d)\log\left(\frac{1}{c\sqrt{\alpha}}\right)\right)\,,

which returns a set $T^{\prime}\subseteq[m]$ such that

(1-\epsilon)\left(\frac{1}{2}-c\sqrt{\alpha}\right)I\preceq\sum_{i\in T^{\prime}}u_{i}u_{i}^{\top}\preceq(1+\epsilon)\left(\frac{1}{2}+c\sqrt{\alpha}\right)I.

(3)

In this section, we will show how to use the rounding technique for partition matroids to give a simpler algorithm that achieves the same guarantee with the same run time, except we save the small dependence on $\log(1/c\sqrt{\alpha})$ in the exponent.

Proof (of Corollary 1) Given vectors $u_{1},\ldots,u_{m}\in\mathbb{R}^{d}$, we construct an instance of the minimum eigenvalue problem with partition constraints as follows. Let $E=[m]\times\{1,2\}$, with $m$ parts $P_{1},\ldots,P_{m}$ so that $P_{i}=\{(i,1),(i,2)\}$ for $i\in[m]$. For each $i\in[m]$ define the vectors

v_{i1}=\begin{bmatrix}u_{i}\\ 0\end{bmatrix}\in\mathbb{R}^{2d},\text{ and }v_{i2}=\begin{bmatrix}0\\ u_{i}\end{bmatrix}\in\mathbb{R}^{2d}.

To see how $v$ and $u$ are related, note that for any $\delta\in[0,1/2)$ there is a choice function $\sigma:[m]\rightarrow\{1,2\}$ such that

\left(\frac{1}{2}-\delta\right)I_{2d}\preceq\sum_{i=1}^{m}v_{i\sigma(i)}v_{i% \sigma(i)}^{\top}

(4)

if and only if there is a set $T\subseteq[m]$ such that

\left(\frac{1}{2}-\delta\right)I\preceq\sum_{i\in T}u_{i}u_{i}^{\top}\preceq% \left(\frac{1}{2}+\delta\right)I.

(5)

Given $\sigma$ satisfying (4), let $X_{1}:=\sum_{i:\sigma(i)=1}u_{i}u_{i}^{\top}$ and $X_{2}:=\sum_{i:\sigma(i)=2}u_{i}u_{i}^{\top}$ . Then $X_{1}$ and $X_{2}$ are respectively the first and second diagonal $d\times d$ block of $\sum_{i=1}^{m}v_{i\sigma(i)}v_{i\sigma(i)}^{\top}$ . Therefore $\left(\frac{1}{2}-\delta\right)I_{2d}\preceq\sum_{i=1}^{m}v_{i\sigma(i)}v_{i% \sigma(i)}^{\top}$ if and only if $X_{1}\succeq\left(\frac{1}{2}-\delta\right)I_{d}$ and $X_{2}\succeq\left(\frac{1}{2}-\delta\right)I_{d}$ . In addition, since $X_{1}+X_{2}=I_{d}$ , this is equivalent to

\left(\frac{1}{2}-\delta\right)I_{d}\preceq\sum_{i:\sigma(i)=1}u_{i}u_{i}^{\top}=I_{d}-X_{2}\preceq\left(\frac{1}{2}+\delta\right)I_{d}.

We then use Algorithm 1 to find a $(1-\epsilon)$-approximate solution $\sigma^{*}:[m]\rightarrow\{1,2\}$ for the partition matroid $\mathcal{M}$ with parts $P_{1},\ldots,P_{m}$ and vectors $v_{ij}$. Since we assume there is a set $T$ satisfying (2), the equivalence between (4) and (5) with $\delta=c\sqrt{\alpha}$ shows that the optimum of this instance is at least $\frac{1}{2}-c\sqrt{\alpha}$. Theorem 1 then implies that with probability at least $1-O(d^{-4})$, Algorithm 1 will return a choice function $\sigma^{*}$ such that $(1-\epsilon)\left(\frac{1}{2}-c\sqrt{\alpha}\right)I_{2d}\preceq\sum_{i=1}^{m}v_{i\sigma^{*}(i)}v_{i\sigma^{*}(i)}^{\top}$, and we will return the set $T^{\prime}=\{i\in[m]:\sigma^{*}(i)=1\}$.

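We now apply the equivalence between (4) and (5) with $\delta=\frac{1}{2}-(1-\epsilon)\left(\frac{1}{2}-c\sqrt{\alpha}\right)$. Since $\frac{1}{2}+\delta=\frac{1}{2}+\frac{\epsilon}{2}+(1-\epsilon)c\sqrt{\alpha}\leq(1+\epsilon)\left(\frac{1}{2}+c\sqrt{\alpha}\right)$, the set $T^{\prime}=\{i\in[m]:\sigma^{*}(i)=1\}$ satisfies (3):

(1-\epsilon)\left(\frac{1}{2}-c\sqrt{\alpha}\right)I_{d}\preceq\sum_{i\in T^{\prime}}u_{i}u_{i}^{\top}\preceq(1+\epsilon)\left(\frac{1}{2}+c\sqrt{\alpha}\right)I_{d}\,.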

$\Box$
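The block structure used in the proof can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `lift` and `lifted_sum` and the toy instance are illustrative choices, not part of the paper. It builds the lifted vectors $v_{i1},v_{i2}$ and exposes the two diagonal blocks $X_{1}$ and $X_{2}$.

```python
import numpy as np

def lift(u):
    """Return (v_1, v_2): u placed in the first / second d coordinates of R^{2d}."""
    zero = np.zeros(u.shape[0])
    return np.concatenate([u, zero]), np.concatenate([zero, u])

def lifted_sum(U, sigma):
    """Sum of v_{i,sigma(i)} v_{i,sigma(i)}^T over the rows u_i of U, sigma(i) in {1, 2}."""
    d = U.shape[1]
    total = np.zeros((2 * d, 2 * d))
    for u, s in zip(U, sigma):
        v = lift(u)[s - 1]
        total += np.outer(v, v)
    return total

# Toy isotropic instance (an assumption for illustration): two scaled copies
# of the standard basis of R^2, so that sum_i u_i u_i^T = I.
U = np.vstack([np.eye(2), np.eye(2)]) / np.sqrt(2)
sigma = [1, 1, 2, 2]  # the first two vectors form T, the rest its complement

Y = lifted_sum(U, sigma)
X1 = sum(np.outer(u, u) for u, s in zip(U, sigma) if s == 1)
X2 = sum(np.outer(u, u) for u, s in zip(U, sigma) if s == 2)

# lambda_min of the lifted matrix versus the smaller of the two block values.
lam_lifted = np.linalg.eigvalsh(Y)[0]
lam_blocks = min(np.linalg.eigvalsh(X1)[0], np.linalg.eigvalsh(X2)[0])
```

Because the lifted matrix is block diagonal with blocks $X_{1}$ and $X_{2}$, the two quantities `lam_lifted` and `lam_blocks` agree, which is the step that turns a one-sided bound in dimension $2d$ into the two-sided bound (5).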

3 General Matroid Constraints

In the general form of the problem, we are given a collection of vectors $v_{1},\ldots,v_{n}\in\mathbb{R}^{d}$ and a matroid $\mathcal{M}=([n],\mathcal{I})$ , and the goal is to find a basis $B\in\mathcal{I}$ which maximizes $\lambda_{\min}\left(\sum_{i\in B}v_{i}v_{i}^{\top}\right).$ For background on matroids, see Appendix B.1.

For a general matroid, the idea of finding a linear transformation under which all elements in the optimal solution have a small norm generalizes easily. So we can use the same approach of first guessing a set $S\subseteq E$ of a reasonable size and then solving the convex relaxation of the problem conditioned on $S$ being included in the solution.

Given a subset $S\subseteq[n]$ , we can again set $A_{S}=\sum_{i\in S}v_{i}v_{i}^{\top}$ , and consider the set of long vectors:

L(S)=\left\{i\in[n]\backslash S:v_{i}^{\top}A_{S}^{-1}v_{i}>\frac{\epsilon^{2}% }{10\log(d)}\right\}.
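As an illustration, the set $L(S)$ can be computed directly from its definition. This is a minimal sketch, assuming NumPy, that $\log$ is the natural logarithm, that $d\geq 2$, and that the vectors indexed by $S$ span $\mathbb{R}^{d}$ so that $A_{S}$ is invertible; the function name `long_vectors` is hypothetical.

```python
import numpy as np

def long_vectors(V, S, eps):
    """Indices i not in S with v_i^T A_S^{-1} v_i > eps^2 / (10 log d).

    V is an (n, d) array whose rows are the vectors v_i, and S is a list of
    row indices. Assumes d >= 2 and that A_S is invertible.
    """
    n, d = V.shape
    A_S = V[S].T @ V[S]                      # A_S = sum_{i in S} v_i v_i^T
    A_inv = np.linalg.inv(A_S)
    threshold = eps ** 2 / (10 * np.log(d))  # natural log (assumption)
    # Squared norm of each v_i after rescaling by A_S^{-1/2}.
    scores = np.einsum("ij,jk,ik->i", V, A_inv, V)
    in_S = set(S)
    return [i for i in range(n) if i not in in_S and scores[i] > threshold]

# Small example: with S = {e_1, e_2}, a tiny vector is short and (1, 1) is long.
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.01, 0.0], [1.0, 1.0]])
long_idx = long_vectors(V, [0, 1], eps=0.5)
```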

For a matroid $\mathcal{M}$, let $\mathcal{P}(\mathcal{M})\subseteq[0,1]^{n}$ denote the matroid base polytope. Then the following is the natural convex programming relaxation which excludes the “long” vectors.

\begin{array}{cl}\max&\lambda_{\min}\left(X\right)\\ &X=\sum\limits_{i=1}^{n}x_{i}\cdot v_{i}v_{i}^{\top}\\ &x\in\mathcal{P}(\mathcal{M})\\ &x_{i}=0,\,\,\forall i\in L(S)\\ &x_{i}=1,\,\,\forall i\in S\\ &x\geq 0\end{array}

(CP(S))

This convex program can be solved in polynomial time (see Appendix B.1). Just like in the partition case, Lemma 1 guarantees that there is a set $S$ for which CP(S) is a relaxation for the minimum eigenvalue problem. As before, after solving CP(S), we can guarantee that all the vectors in the fractional support of the optimal solution will have a small norm with respect to $A_{S}$ .

The challenge in extending the earlier approach to general matroid constraints comes from the rounding step. For a partition matroid, we could simply round the fractional optimum of CP(S) independently in each part to obtain a basis. However, for more general constraints, it is not so clear how to round a fractional solution to a basis.

Instead of rounding independently, we will use the technique of pipage rounding to find a basis. The following lemma is the lower-tail version of the same concentration inequality proved in [HO14]. For completeness, we will include a proof of the version we need in Appendix B.

Lemma 4

Let $\mathcal{P}(\mathcal{M})$ be a matroid base polytope and $x\in\mathcal{P}(\mathcal{M})$. Let $M_{1},\ldots,M_{n}$ be self-adjoint matrices that satisfy $\lambda_{\max}(M_{i})\leq R$. Let $\mu=\lambda_{\min}\left(\sum_{i\in[n]}x_{i}M_{i}\right)$. If randomized pipage rounding (Algorithm 3) starts at $x$ and outputs the extreme point $\hat{x}=\chi(B)$ of $\mathcal{P}(\mathcal{M})$, then we have

\operatorname{\mathrm{Pr}}\left[\lambda_{\min}\left(\sum_{i\in B}M_{i}\right)\leq(1-\epsilon)\cdot\mu\right]\leq d\cdot\exp\left(\frac{-\epsilon^{2}\mu}{2R}\right)\,.

We use this lemma to generalize our earlier approach to all matroids.

Lemma 5

Let $S\subseteq E$ be an independent set in $\mathcal{M}$ and let $x^{\star}$ be the optimal solution to CP(S). Then pipage rounding starting at $x^{\star}$ outputs a basis $B$ such that

\operatorname{\mathrm{Pr}}\left[\lambda_{\min}\left(\sum_{i\in B}v_{i}v_{i}^{% \top}\right)<(1-\epsilon)\lambda_{\min}\left(\sum_{i\in[n]}x^{\star}_{i}\,v_{i% }v_{i}^{\top}\right)\right]<d^{-4}.

The proof is identical to that of Lemma 3, except we use the matrix concentration inequality from Lemma 4.

Using this lemma, the following algorithm gives a $(1-\epsilon)$ -approximation with high probability.

Algorithm 2 Algorithm to find an approximation to $OPT$

1: for each $S\subseteq[n]$ such that $|S|=10d\log(d)/\epsilon^{2}+d-1$
2:   Solve CP(S) to get optimal solution $x^{\star}$
3:   Let $B_{S}\leftarrow$ basis returned by Algorithm 3 for input $x^{\star}$
4: end for
5: Return the basis $B_{S}$ with the best objective

3.1 Proof of Theorem 1 and Theorem 2

In this section we prove Theorem 2. The proof of Theorem 1 follows identically.

Proof (of Theorem 2) Let $B^{*}=\arg\max_{B\in\mathcal{I}}f(\sum_{i\in B}v_{i}v_{i}^{\top})$ and let $OPT=f\left(\sum_{i\in B^{*}}v_{i}v_{i}^{\top}\right)$. Let $S\subseteq B^{*}$ be such that $|S|=O(d\log{d}/\epsilon^{2})$, $A=\sum_{i\in S}v_{i}v_{i}^{\top}$ is invertible, and for all $i\in B^{*}\backslash S$, $v_{i}^{\top}A^{-1}v_{i}\leq\epsilon^{2}/(10\log{d})$. By Lemma 1, such a set $S$ exists.

Let $x^{*}$ be the optimal solution of CP(S), and let $\tilde{B}$ be the basis returned by Algorithm 3 on input $x^{*}$. Since the indicator vector of $B^{*}$ satisfies the constraints of CP(S), $OPT=f\left(\sum_{i\in B^{*}}v_{i}v_{i}^{\top}\right)\leq f\left(\sum_{i\in E}x^{*}_{i}\cdot v_{i}v_{i}^{\top}\right)$. Therefore,

\Pr\left[f(\sum_{i\in\tilde{B}}v_{i}v_{i}^{\top})<(1-\epsilon)\cdot OPT\right]% \leq\Pr\left[f(\sum_{i\in\tilde{B}}v_{i}v_{i}^{\top})<(1-\epsilon)\cdot f\left% (\sum_{i\in E}x^{*}_{i}\cdot v_{i}v_{i}^{\top}\right)\right]\,.

Let $X:=\sum_{i\in E}x^{*}_{i}\cdot v_{i}v_{i}^{\top}$ . Since $f$ is monotone and homogeneous, we have

$\displaystyle\Pr\left[f(\sum_{i\in\tilde{B}}v_{i}v_{i}^{\top})<(1-\epsilon)% \cdot f(X)\right]$	$\displaystyle=\Pr\left[f(\sum_{i\in\tilde{B}}v_{i}v_{i}^{\top})<f((1-\epsilon)% \cdot X)\right]$	(6)
	$\displaystyle\leq\Pr\left[\sum_{i\in\tilde{B}}v_{i}v_{i}^{\top}\nsucceq(1-% \epsilon)\cdot X\right]$
	$\displaystyle=\Pr\left[\sum_{i\in\tilde{B}}X^{-1/2}v_{i}v_{i}^{\top}X^{-1/2}% \nsucceq(1-\epsilon)\cdot I\right]$
	$\displaystyle=\Pr\left[\lambda_{\min}(\sum_{i\in\tilde{B}}X^{-1/2}v_{i}v_{i}^{% \top}X^{-1/2})\leq(1-\epsilon)\right]$	(7)

Similar to the proof of Lemma 3, we will apply Lemma 4 to random matrices $M_{i}=v_{i}v_{i}^{\top}$ after appropriate transformations to ensure $R=O(\epsilon^{2}/\log{d})$. First note that since $x^{*}_{i}=1$ for any $i\in S$, we have $\sum_{i\in S}v_{i}v_{i}^{\top}\preceq X$, and $S\subseteq\tilde{B}$ because pipage rounding does not change the integral coordinates of $x^{*}$. For the same reason $\tilde{B}$ contains no element of $L(S)$, so for any $i\in\tilde{B}\backslash S$,

\lambda_{\max}(X^{-1/2}v_{i}v_{i}^{\top}X^{-1/2})=v_{i}^{\top}X^{-1}v_{i}\leq v% _{i}^{\top}\left(\sum_{j\in S}v_{j}v_{j}^{\top}\right)^{-1}v_{i}\leq\frac{% \epsilon^{2}}{10\log{d}}.

For any $i\in S$, $v_{i}^{\top}X^{-1}v_{i}\leq 1$, and since $x_{i}^{*}=1$ we can replace $M_{i}=v_{i}v_{i}^{\top}$ with $r=10\log{d}/\epsilon^{2}$ matrices $M_{i}^{1},\ldots,M_{i}^{r}$ such that $M_{i}^{j}=\frac{\epsilon^{2}}{10\log{d}}v_{i}v_{i}^{\top}$. This ensures that $\lambda_{\max}(X^{-1/2}M_{i}^{j}X^{-1/2})\leq\epsilon^{2}/(10\log{d})$ for all $j\in[r]$. We apply Lemma 4 to the conjugated matrices $X^{-1/2}M_{i}X^{-1/2}$ for $i\in[n]\backslash(S\cup L(S))$ and $X^{-1/2}M_{i}^{j}X^{-1/2}$ for $i\in S,j\in[r]$. Their $x^{*}$-weighted sum is the identity, so $\mu=1$, and Lemma 4 gives

\Pr\left[\lambda_{\min}(\sum_{i\in\tilde{B}}X^{-1/2}v_{i}v_{i}^{\top}X^{-1/2})% \leq(1-\epsilon)\right]\leq d^{-4}\,.

If $B$ is the basis returned by Algorithm 2, then $f(B)\geq f(\tilde{B})$ . Using the above inequality with (7) implies that $f(\sum_{i\in B}v_{i}v_{i}^{\top})\geq(1-\epsilon)\cdot OPT$ with probability at least $1-d^{-4}$ . $\Box$
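As a quick check of the constant in the failure probability (a sketch, assuming $\log$ denotes the natural logarithm): after conjugating by $X^{-1/2}$ the fractional solution sums to the identity, so Lemma 4 applies with $\mu=1$ and $R=\epsilon^{2}/(10\log d)$, and its bound evaluates to

d\cdot\exp\left(\frac{-\epsilon^{2}\mu}{2R}\right)=d\cdot\exp\left(-\frac{\epsilon^{2}}{2}\cdot\frac{10\log d}{\epsilon^{2}}\right)=d\cdot\exp\left(-5\log d\right)=d^{-4}.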

3.2 Pipage Rounding

The purpose of this section is to give an explanation of the pipage rounding technique and motivate the proof of Lemma 4. For a detailed discussion of pipage rounding, see [HO14].

For a set $S\subseteq E$ , let $x^{*}$ be the optimal solution of CP(S). If we round $x^{*}$ independently, i.e., add element $i$ to the output with probability $x_{i}^{*}$ , then the set obtained, say $B$ , might not be independent in $\mathcal{M}$ . But the following concentration inequality would still hold due to independence,

\operatorname{\mathrm{Pr}}\left[\lambda_{\min}\left(\sum_{i\in B}v_{i}v_{i}^{% \top}\right)<(1-\epsilon)\lambda_{\min}\left(\sum_{i\in[n]}x^{\star}_{i}\,v_{i% }v_{i}^{\top}\right)\right]<d^{-4}.

(8)

The main idea behind pipage rounding is to iteratively transform a point $x\in\mathcal{P}(\mathcal{M})$ to a basis of $\mathcal{M}$ while ensuring that the failure probability from equation (8) does not increase.

For a point $x\in[0,1]^{n}$ , let $D(x)$ represent the corresponding product distribution over $\{0,1\}^{n}$ with marginals given by $x$ , i.e., include element $i$ in the output with probability $x_{i}$ . For $x\in[0,1]^{n}$ and $\epsilon>0$ , define

p_{\epsilon}(x):=\Pr_{B\sim D(x)}\left(\lambda_{\min}(\sum_{i\in B}v_{i}v_{i}^% {\top})\leq(1-\epsilon)\cdot\lambda_{\min}\left(\sum_{i\in[n]}x_{i}v_{i}v_{i}^% {\top}\right)\right)\,.

So $p_{\epsilon}(x)$ is the failure probability of getting a $(1-\epsilon)$ -approximation when rounding independently at point $x$ . [HO14] showed that there exists a function $g_{\epsilon}(x)$ s.t. $p_{\epsilon}(x)\leq g_{\epsilon}(x)\leq d\cdot\exp\left(\frac{-\epsilon^{2}\mu% _{min}}{2R}\right)$ and $g_{\epsilon}$ is concave under swaps, i.e., for all $a,b\in[n]$ and $x\in\mathcal{P}(\mathcal{M})$ the map $z\mapsto g_{\epsilon}(x+z(e_{a}-e_{b}))$ is concave.
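For intuition, $p_{\epsilon}(x)$ can be estimated by sampling from the product distribution $D(x)$. The following is a minimal Monte Carlo sketch, assuming NumPy; the function names, the number of trials, and the seed are illustrative choices rather than anything prescribed by the paper.

```python
import numpy as np

def lam_min(V, w):
    """lambda_min of sum_i w[i] * v_i v_i^T, where the v_i are the rows of V."""
    return np.linalg.eigvalsh((V * w[:, None]).T @ V)[0]

def estimate_failure_probability(V, x, eps, trials=2000, seed=0):
    """Monte Carlo estimate of p_eps(x) under independent rounding."""
    rng = np.random.default_rng(seed)
    target = (1 - eps) * lam_min(V, x)
    failures = 0
    for _ in range(trials):
        # Include element i independently with probability x[i].
        b = (rng.random(len(x)) < x).astype(float)
        if lam_min(V, b) <= target:
            failures += 1
    return failures / trials

# Two copies of the standard basis of R^2, each taken with probability 1/2.
V = np.vstack([np.eye(2), np.eye(2)])
p_hat = estimate_failure_probability(V, np.full(4, 0.5), eps=0.1)
```

In this toy instance the rounding fails exactly when one of the two coordinate directions receives no vector, which happens with probability $1-(3/4)^{2}=7/16$, so the estimate should be close to that value.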

So, if $x$ is not an extreme point of $\mathcal{P}(\mathcal{M})$, then there exist $a,b\in[n]$ and $\epsilon>0$ such that $x\pm\epsilon(e_{a}-e_{b})\in\mathcal{P}(\mathcal{M})$. Let $l=\min\{z:x+z(e_{a}-e_{b})\in\mathcal{P}(\mathcal{M})\}$ and $u=\max\{z:x+z(e_{a}-e_{b})\in\mathcal{P}(\mathcal{M})\}$, so that $l<0<u$.

With this, we can define $x^{l}=x+l(e_{a}-e_{b})$ and $x^{u}=x+u(e_{a}-e_{b})$. Since $g(x+z(e_{a}-e_{b}))$ is concave as a function of $z$, its minimum over $[l,u]$ is attained at an endpoint, so either $g(x^{l})\leq g(x)$ or $g(x^{u})\leq g(x)$. Moreover, both $x^{l}$ and $x^{u}$ lie on a lower dimensional face than the initial point $x$. Thus, for any initial point $x_{0}\in\mathcal{P}(\mathcal{M})$, at most $n$ iterations suffice to find an extreme point $\hat{x}$ with $g(\hat{x})\leq g(x_{0})$.

In randomized pipage starting at $x\in\mathcal{P}(\mathcal{M})$, our next iterate $x^{\prime}$ of the rounding procedure will be $x^{l}$ with probability $\frac{u}{u-l}$ and $x^{u}$ with probability $\frac{-l}{u-l}$. This ensures that $\operatorname{\mathbb{E}}(x^{\prime})=x$, and the concavity under swaps guarantees that $\operatorname{\mathbb{E}}[g_{\epsilon}(x^{\prime})]\leq g_{\epsilon}(x)$ by Jensen’s inequality in the variable $z$. If we start at a point $x_{0}\in\mathcal{P}(\mathcal{M})$ and iterate this random procedure at most $n$ times, we get an extreme point $\hat{x}$ which satisfies $\operatorname{\mathbb{E}}[\hat{x}]=x_{0}$ and $\operatorname{\mathbb{E}}[g(\hat{x})]\leq g(x_{0})$.

This gives the intuition behind the proof of Lemma 6, and leads to the following algorithm.

Algorithm 3 Randomized Pipage Rounding

1: Input: Point $x\in\mathcal{P}(\mathcal{M})$, where $\mathcal{P}(\mathcal{M})$ is a matroid base polytope
2: while $x$ is not integral do
3:   $a,b\leftarrow$ distinct elements of $[n]$ s.t. $\exists\epsilon>0$ with $x\pm\epsilon(e_{a}-e_{b})\in\mathcal{P}(\mathcal{M})$
4:   $\ell\leftarrow\max\{y\geq 0:x-y(e_{a}-e_{b})\in\mathcal{P}(\mathcal{M})\}$
5:   $h\leftarrow\max\{y\geq 0:x+y(e_{a}-e_{b})\in\mathcal{P}(\mathcal{M})\}$
6:   $x\leftarrow\begin{cases}x-\ell(e_{a}-e_{b})&\text{w.p. }h/(\ell+h)\\ x+h(e_{a}-e_{b})&\text{w.p. }\ell/(\ell+h)\end{cases}$
7: end while
8: Return basis $B$ of $\mathcal{M}$ with indicator vector $x$
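To make the procedure concrete, here is a minimal sketch of randomized pipage rounding specialized to a uniform matroid, whose base polytope is $\{x\in[0,1]^{n}:\sum_{i}x_{i}=k\}$. This specialization is an assumption made so that the step sizes have closed forms; a general matroid would need a membership oracle for $\mathcal{P}(\mathcal{M})$. The function name is hypothetical, and exact rational arithmetic is used so that coordinates become exactly $0$ or $1$. The move probabilities are chosen so that $\operatorname{\mathbb{E}}(x^{\prime})=x$, matching the description above in which $x^{l}$ is taken with probability $\frac{u}{u-l}$.

```python
import random
from fractions import Fraction

def pipage_round_uniform(x, rng=random):
    """Randomized pipage rounding on the base polytope of a uniform matroid.

    x: list of Fractions in [0, 1] whose sum is an integer k.
    Returns the set of indices rounded to 1 (a basis of the uniform matroid).
    """
    x = list(x)
    while True:
        fractional = [i for i, xi in enumerate(x) if 0 < xi < 1]
        if not fractional:
            break
        # The coordinates sum to an integer, so at least two are fractional.
        a, b = fractional[0], fractional[1]
        ell = min(x[a], 1 - x[b])  # largest feasible step along -(e_a - e_b)
        h = min(1 - x[a], x[b])    # largest feasible step along +(e_a - e_b)
        if rng.random() < float(h / (ell + h)):
            x[a] -= ell            # move to x - ell * (e_a - e_b)
            x[b] += ell
        else:
            x[a] += h              # move to x + h * (e_a - e_b)
            x[b] -= h
    return {i for i, xi in enumerate(x) if xi == 1}

random.seed(0)
basis = pipage_round_uniform([Fraction(1, 2)] * 4)
```

Each iteration makes at least one more coordinate integral, so the loop ends after at most $n$ iterations, and coordinates that are already $0$ or $1$ are never touched.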

4 Conclusion and Remarks

The resolution of the Kadison-Singer problem in [MSS15] using the interlacing families of polynomials implies the following existential result about maximizing the minimum eigenvalue under partition matroid constraints.

Theorem 5

[MSS15, Theorem 1.4] For $\epsilon>0$ and vectors $\{v_{ij}\}_{i\in[k],j\in[n]}\in\mathbb{R}^{d}$ with $\|v_{ij}\|^{2}\leq\epsilon$ for all $i\in[k],j\in[n]$ , if there exist $x_{ij}\geq 0$ such that

\sum_{i=1}^{k}\sum_{j=1}^{n}x_{ij}\cdot v_{ij}v_{ij}^{\top}=I_{d}\quad\text{ % and }\quad\sum_{j=1}^{n}x_{ij}=1\text{ for all }i\in[k],

then there exists a choice function $\sigma:[k]\rightarrow[n]$ such that

(1-\sqrt{\epsilon})^{2}\cdot I_{d}\preceq\sum_{i=1}^{k}v_{i\sigma(i)}v_{i% \sigma(i)}^{\top}\preceq(1+\sqrt{\epsilon})^{2}\cdot I_{d}.

We can state this result equivalently as an “existential” rounding result. When $\|v_{ij}\|^{2}\leq\epsilon$ , Theorem 5 implies that the integrality gap of the natural convex relaxation (CP) for the minimum eigenvalue problem with partition constraints is only $1/(1-\sqrt{\epsilon})^{2}$ . It is an open problem to efficiently round the solution to the convex relaxation with comparable guarantees for any dimension $d$ .

More generally, the problem of designing an approximation algorithm for the minimum eigenvalue problem under partition or matroid constraints in arbitrary dimensions remains wide open. However, checking whether there is a solution with a non-zero objective can be done in polynomial time through matroid intersection. Recently, there has been significant progress in the case of maximizing the determinant [Nik15, SEFM15, NS16, AGV18, SX18, MNST20, BLP+22], but it remains open whether those techniques can be utilized for the minimum eigenvalue problem.

References

[AB13] Haim Avron and Christos Boutsidis. Faster subset selection for matrices and applications. SIAM Journal on Matrix Analysis and Applications, 34(4):1464–1499, 2013.
[AFS12] Arash Asadpour, Uriel Feige, and Amin Saberi. Santa claus meets hypergraph matchings. ACM Trans. Algorithms, 8(3), jul 2012.
[AGV18] Nima Anari, Shayan Oveis Gharan, and Cynthia Vinzant. Log-concave polynomials, entropy, and a deterministic approximation algorithm for counting bases of matroids. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 35–46. IEEE, 2018.
[AS07] Arash Asadpour and Amin Saberi. An approximation algorithm for max-min fair allocation of indivisible goods. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’07, page 114–121, New York, NY, USA, 2007. Association for Computing Machinery.
[AZLSW17] Zeyuan Allen-Zhu, Yuanzhi Li, Aarti Singh, and Yining Wang. Near-optimal discrete optimization for experimental design: A regret minimization approach. Mathematical Programming, 186, 11 2017.
[BLP+22] Adam Brown, Aditi Laddha, Madhusudhan Pittu, Mohit Singh, and Prasad Tetali. Determinant maximization via matroid intersection algorithms. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 255–266, Los Alamitos, CA, USA, nov 2022. IEEE Computer Society.
[BS06] Nikhil Bansal and Maxim Sviridenko. The santa claus problem. In Proceedings of the Thirty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’06, page 31–40, New York, NY, USA, 2006. Association for Computing Machinery.
[BSS09] Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-ramanujan sparsifiers. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing, STOC ’09, page 255–262, New York, NY, USA, 2009. Association for Computing Machinery.
[CCK09] Deeparnab Chakrabarty, Julia Chuzhoy, and Sanjeev Khanna. On allocating goods to maximize fairness. In Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’09, page 107–116, USA, 2009. IEEE Computer Society.
[cMI09] Ali Çivril and Malik Magdon-Ismail. On selecting a maximum volume sub-matrix of a matrix and related problems. Theoretical Computer Science, 410:4801–4811, 2009.
[Cun84] W. H. Cunningham. Testing membership in matroid polyhedra. Journal of Combinatorial Theory, Series B, 36:161–188, 1984.
[DRZ20] Sami Davies, Thomas Rothvoss, and Yihao Zhang. A tale of santa claus, hypergraphs and matroids. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’20, page 2748–2757, USA, 2020. Society for Industrial and Applied Mathematics.
[Fei08] Uriel Feige. On allocations that maximize fairness. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’08, page 287–293, USA, 2008. Society for Industrial and Applied Mathematics.
[HO14] Nicholas JA Harvey and Neil Olver. Pipage rounding, pessimistic estimators and matrix concentration. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 926–945. SIAM, 2014.
[HS23] Penny Haxell and Tibor Szabó. Improved integrality gap in max-min allocation: or topology at the north pole. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2875–2897, 2023.
[JMS22] Ben Jourdan, Peter Macgregor, and He Sun. Is the algorithmic kadison-singer problem hard?, 2022. arXiv:2205.02161.
[LZ21] Lap Chi Lau and Hong Zhou. A local search framework for experimental design. In Proceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’21, page 1039–1058, USA, 2021. Society for Industrial and Applied Mathematics.
[MNST20] Vivek Madan, Aleksandar Nikolov, Mohit Singh, and Uthaipon Tantipongpipat. Maximizing determinants under matroid constraints. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 565–576, 2020.
[MSS15] Adam W. Marcus, Daniel A. Spielman, and Nikhil Srivastava. Interlacing families ii: Mixed characteristic polynomials and the kadison–singer problem. Annals of Mathematics, 182:327–350, 2015.
[MSTX19] Vivek Madan, Mohit Singh, Uthaipon Tantipongpipat, and Weijun Xie. Combinatorial algorithms for optimal design. In Proceedings of the Thirty-Second Conference on Learning Theory, 06 2019.
[Nik15] Aleksandar Nikolov. Randomized rounding for the largest simplex problem. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, page 861–870, New York, NY, USA, 2015. Association for Computing Machinery.
[NS16] Aleksandar Nikolov and Mohit Singh. Maximizing determinants under partition constraints. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’16, page 192–201, New York, NY, USA, 2016. Association for Computing Machinery.
[NST19] Aleksandar Nikolov, Mohit Singh, and Uthaipon Tao Tantipongpipat. Proportional volume sampling and approximation algorithms for a-optimal design. Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1369–1386, 2019.
[Puk06] Friedrich Pukelsheim. Optimal design of experiments. SIAM, 2006.
[SEFM15] Marco Di Summa, Friedrich Eisenbrand, Yuri Faenza, and Carsten Moldenhauer. On largest volume simplices and sub-determinants. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’15, page 315–323, USA, 2015. Society for Industrial and Applied Mathematics.
[SX18] Mohit Singh and Weijun Xie. Approximate positive correlated distributions and approximation algorithms for D-optimal design. In Proceedings of SODA, 2018.
[Tro15] Joel A. Tropp. An introduction to matrix concentration inequalities. Foundations and Trends® in Machine Learning, 8(1-2):1–230, 2015.
[Wea04] Nik Weaver. The kadison–singer problem in discrepancy theory. Discrete mathematics, 278(1-3):227–239, 2004.

Appendix A Omitted proofs

Proof (of Corollary 2) This is a simple calculation, using the fact that the semidefinite order is preserved under conjugation.

	$\displaystyle\operatorname{\mathrm{Pr}}\left(\sum_{i=1}^{k}M_{i}\nsucceq(1-% \epsilon)\mu_{\min}\cdot A\right)$	$\displaystyle=\operatorname{\mathrm{Pr}}\left(\sum_{i=1}^{k}A^{-1/2}M_{i}A^{-1% /2}\nsucceq(1-\epsilon)\mu_{\min}\cdot I\right)$
		$\displaystyle=\operatorname{\mathrm{Pr}}\left(\lambda_{\min}\left(\sum_{i=1}^{% k}A^{-1/2}M_{i}A^{-1/2}\right)<(1-\epsilon)\mu_{\min}\right)$
		$\displaystyle\leq d\cdot\exp\left(\frac{-\epsilon^{2}\mu_{min}}{2R}\right)\,.$

$\Box$

A.1 Integrality Gap Example

Consider the vectors $v_{1}=e_{1},v_{2}=e_{1},v_{3}=e_{2},v_{4}=e_{3}$ in $\mathbb{R}^{3}$ and a partition matroid $\mathcal{M}=([4],\mathcal{I})$ defined by the bases $\{1,2,3\},\{1,2,4\}$ . The optimal value of maximizing the minimum eigenvalue for this instance is $0$ as we are forced to pick $v_{1}$ and $v_{2}$ in any basis and they are linearly dependent.

The convex relaxation of maximizing the minimum eigenvalue for this instance is given by

\begin{array}{cl}\max&\lambda_{\min}\left(X\right)\\ &X=x_{1}\cdot v_{1}v_{1}^{\top}+x_{2}\cdot v_{2}v_{2}^{\top}+x_{3}\cdot v_{3}v_{3}^{\top}+x_{4}\cdot v_{4}v_{4}^{\top}\\ &x_{1}=1\\ &x_{2}=1\\ &x_{3}+x_{4}=1\\ &x\geq 0\end{array}

(CP)

The optimum of (CP) is attained when $x_{1}=x_{2}=1$ and $x_{3}=x_{4}=1/2$, which gives

X=2e_{1}e_{1}^{\top}+\frac{1}{2}e_{2}e_{2}^{\top}+\frac{1}{2}e_{3}e_{3}^{\top}\,.

So the optimal value of (CP) is $1/2$, whereas the true optimum is $0$.
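The gap in this example is easy to confirm numerically. The sketch below assumes NumPy and uses 0-indexed element labels; the helper `lam_min` is an illustrative name.

```python
import numpy as np

E3 = np.eye(3)
V = np.array([E3[0], E3[0], E3[1], E3[2]])  # v_1 = v_2 = e_1, v_3 = e_2, v_4 = e_3
bases = [(0, 1, 2), (0, 1, 3)]              # 0-indexed versions of {1,2,3} and {1,2,4}

def lam_min(weights):
    """lambda_min of sum_i weights[i] * v_i v_i^T."""
    w = np.asarray(weights, dtype=float)
    return np.linalg.eigvalsh((V * w[:, None]).T @ V)[0]

# Best integral solution: the maximum over the two bases.
integral_opt = max(lam_min([1.0 if i in B else 0.0 for i in range(4)]) for B in bases)
# Fractional solution x = (1, 1, 1/2, 1/2) from the text.
fractional_val = lam_min([1.0, 1.0, 0.5, 0.5])
```

Here `integral_opt` evaluates to $0$ (up to floating-point error) and `fractional_val` to $1/2$, in line with the calculation above.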

Appendix B Matroids and Pipage Rounding

In this section, we provide the necessary background on matroids, as well as the lower tail versions of lemmas from [HO14], which let us prove Lemma 4.

B.1 Matroids

A pair $\mathcal{M}=(E,\mathcal{I})$ is a matroid if $E$ is a finite set and $\mathcal{I}$ is a collection of subsets of $E$ satisfying

(1) If $I\in\mathcal{I}$ and $J\subseteq I$ then $J\in\mathcal{I}$, and

(2) If $I,J\in\mathcal{I}$ and $|I|<|J|$ then there is $e\in J\backslash I$ such that $I\cup\{e\}\in\mathcal{I}$.

The sets in $\mathcal{I}$ are referred to as the independent sets of the matroid $\mathcal{M}$. The maximal sets in $\mathcal{I}$ are called bases, and it is a consequence of the matroid axioms that all bases have the same cardinality. For a subset $U\subseteq E$, we denote by $r(U)$ the maximum size of an independent set in $U$ and call this the rank of $U$. In this notation, we can say that every basis of $\mathcal{M}$ has cardinality exactly $r(E)$. Given a matroid $\mathcal{M}$, the matroid base polytope is the convex hull of indicator vectors of the bases of $\mathcal{M}$, and is denoted $\mathcal{P}(\mathcal{M})$. The base polytope has the following linear description

	$\displaystyle\mathcal{P}(\mathcal{M})$	$\displaystyle=\text{conv}\left\{\chi(B):B\text{ a basis of }\mathcal{M}\right\}$
		$\displaystyle=\left\{x\in\mathbb{R}^{E}:\sum_{e\in E}x_{e}=r(E),\sum_{e\in U}x% _{e}\leq r(U)\,\,\forall U\subseteq E,x\geq 0\right\}.$

Cunningham [Cun84] showed that given $x\in\mathbb{R}^{E}_{+}$ , it is possible to find a violated constraint for $\mathcal{P}(\mathcal{M})$ in strongly polynomial time using only an independence oracle for the matroid $\mathcal{M}$ .
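For very small ground sets, the linear description above can be checked by brute force. The sketch below is not Cunningham's algorithm: it enumerates all subsets and is exponential in $|E|$. The function names and the independence oracle passed as a Python callable are assumptions for illustration. The rank is computed greedily, which is valid in a matroid because every maximal independent subset of $U$ has size $r(U)$.

```python
from fractions import Fraction
from itertools import combinations

def rank(U, is_independent):
    """Size of a largest independent subset of U, found greedily."""
    chosen = []
    for e in U:
        if is_independent(chosen + [e]):
            chosen.append(e)
    return len(chosen)

def in_base_polytope(x, ground, is_independent):
    """Brute-force test of the linear description of P(M)."""
    if any(x[e] < 0 for e in ground):
        return False
    if sum(x[e] for e in ground) != rank(ground, is_independent):
        return False
    for size in range(1, len(ground) + 1):
        for U in combinations(ground, size):
            if sum(x[e] for e in U) > rank(U, is_independent):
                return False
    return True

# The partition matroid of Appendix A.1: at most one of the elements 3 and 4.
ground = [1, 2, 3, 4]
indep = lambda I: len(set(I) & {3, 4}) <= 1
half = Fraction(1, 2)
inside = in_base_polytope({1: 1, 2: 1, 3: half, 4: half}, ground, indep)
outside = in_base_polytope({1: 1, 2: half, 3: 1, 4: half}, ground, indep)
```

The second point has the right total but violates the rank constraint for $U=\{3,4\}$, so it is rejected.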

B.2 Pipage Rounding

The following theorem follows from the discussion in Section 3.2.

Theorem 6

[HO14] There is a randomized polynomial time algorithm that, given $x_{0}\in\mathcal{P}(\mathcal{M})$, outputs an extreme point $\hat{x}$ of $\mathcal{P}(\mathcal{M})$ with $\operatorname{\mathbb{E}}[\hat{x}]=x_{0}$ and such that for any $g$ concave under swaps $\operatorname{\mathbb{E}}[g(\hat{x})]\leq g(x_{0})$.

We will mainly make use of this theorem through the following claim. The conditions of the claim come from pessimistic estimators, but not all of them are strictly necessary. For a point $x\in[0,1]^{n}$ , let $D(x)$ represent the corresponding product distribution over $\{0,1\}^{n}$ with marginals given by $x$ .

Lemma 6

[HO14] Let $\mathcal{E}\subseteq\{0,1\}^{n}$ and $g:\mathcal{P}(\mathcal{M})\rightarrow\mathbb{R}$ satisfy

	$\displaystyle\operatorname{\mathrm{Pr}}_{x\sim D(x)}[x\in\mathcal{E}]\leq g(x)% ,\text{ and }$
	$\displaystyle\min\{g(x-x_{i}e_{i}),g(x+(1-x_{i})e_{i})\}\leq g(x)$

for all $x\in[0,1]^{n}$, and $g$ be concave under swaps. If pipage rounding is started at an initial point $x_{0}\in\mathcal{P}(\mathcal{M})$ and $\hat{x}$ is the random extreme point it outputs, then $\operatorname{\mathrm{Pr}}[\hat{x}\in\mathcal{E}]\leq g(x_{0})$.

Essentially, this lemma says that if we have a pessimistic estimator which is concave under swaps, then pipage rounding has the same type of concentration behavior as independent rounding, but will actually return a vertex of the matroid polytope.

For our particular application, we will be choosing the function $g$ to be an estimator for matrix concentration due to Tropp [Tro15].

Theorem 7

[Tro15, Theorem 5.1.1] Let $M_{1},\ldots,M_{n}$ be self-adjoint matrices with $\lambda_{\max}(M_{i})\leq R$ for all $i\in[n]$ and let $\mu_{\min}=\lambda_{\min}\left(\operatorname{\mathbb{E}}_{x\sim\mathcal{D}(x)}\left[\sum_{i\in[n]}x_{i}M_{i}\right]\right)$. For $t\in\mathbb{R}$, we have the bound

\operatorname{\mathrm{Pr}}_{x\sim D(x)}\left[\lambda_{\min}\left(\sum_{i\in[n]% }x_{i}M_{i}\right)\leq t\right]\leq\inf_{\theta<0}g_{t,\theta}(x)

where $g_{t,\theta}(x)=e^{-\theta t}\cdot\mathrm{tr}\exp\left(\sum_{i\in[n]}\log% \operatorname{\mathbb{E}}e^{\theta x_{i}\cdot M_{i}}\right)$ . Furthermore, for $t=(1-\epsilon)\mu_{\min}$ ,

g_{t,\theta}(x)\leq d\cdot\left(\frac{e^{-\epsilon}}{(1-\epsilon)^{1-\epsilon}% }\right)^{\mu_{\min}/R}.
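The bound above is at least as strong as the simpler exponential form $d\cdot\exp\left(-\epsilon^{2}\mu_{\min}/(2R)\right)$ that appears in Lemma 4, because $e^{-\epsilon}/(1-\epsilon)^{1-\epsilon}\leq e^{-\epsilon^{2}/2}$ for $\epsilon\in[0,1)$. The following sketch evaluates both expressions; the function names and the sample parameters are illustrative assumptions.

```python
import math

def tropp_lower_tail(d, eps, mu_min, R):
    """d * (e^{-eps} / (1 - eps)^{1 - eps})^{mu_min / R}, for 0 <= eps < 1."""
    base = math.exp(-eps) / (1 - eps) ** (1 - eps)
    return d * base ** (mu_min / R)

def simplified_lower_tail(d, eps, mu_min, R):
    """d * exp(-eps^2 * mu_min / (2 R)), the form used in Lemma 4."""
    return d * math.exp(-eps ** 2 * mu_min / (2 * R))

# Sample parameters: R at the threshold eps^2 / (10 log d), with mu_min = 1.
d, eps = 8, 0.3
R = eps ** 2 / (10 * math.log(d))
sharp = tropp_lower_tail(d, eps, 1.0, R)
simple = simplified_lower_tail(d, eps, 1.0, R)
```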

This is the lower-tail version of the same concentration inequality which was used in [HO14]. In that paper, they provide an upper-tail version of Lemma 4 using a new generalization of Lieb’s concavity theorem, stated below.

Lemma 7

[HO14] Let $L\in\mathbb{S}_{d}$ , $C_{1},C_{2}\in\mathbb{S}_{d}^{++}$ , and $K_{1},K_{2}\in\mathbb{S}_{d}^{+}$ . Then the univariate function

z\rightarrow\mathrm{tr}\exp\left(L+\log(C_{1}+zK_{1})+\log(C_{2}-zK_{2})\right)

is concave in a neighborhood of $0$ .

As a consequence, we get the following lemma.

Lemma 8

For $\theta<0$ and all $x\in[0,1]^{n}$, the function

g_{t,\theta}(x)=e^{-\theta t}\cdot\mathrm{tr}\exp\left(\sum_{i\in[n]}\log\operatorname{\mathbb{E}}_{x\sim\mathcal{D}(x)}e^{\theta x_{i}\cdot M_{i}}\right)

is concave under swaps.

Proof (of Lemma 8) Let $C_{i}:=\operatorname{\mathbb{E}}_{x\sim\mathcal{D}(x)}[e^{\theta x_{i}\cdot M_% {i}}]=x_{i}\cdot e^{\theta M_{i}}+(1-x_{i})\cdot I\succ 0$ , and for any $i\in[n]$

\displaystyle\operatorname{\mathbb{E}}_{x\sim\mathcal{D}(x+ze_{i})}[e^{\theta x% _{i}\cdot M_{i}}]=(x_{i}+z)e^{\theta M_{i}}+(1-x_{i}-z)\cdot I=C_{i}-z\cdot(I-% e^{\theta M_{i}}).

Then $\forall a,b\in[n]$ ,

	$\displaystyle g_{t,\theta}(x+z(e_{a}-e_{b}))$
	$\displaystyle=e^{-\theta t}\cdot\mathrm{tr}\exp\left(\sum_{i\in[n]\backslash\{% a,b\}}\log C_{i}+\log\left(C_{b}+z\cdot(I-e^{\theta M_{b}})\right)+\log\left(C% _{a}-z\cdot(I-e^{\theta M_{a}})\right)\right)$
	$\displaystyle=e^{-\theta t}\cdot\mathrm{tr}\exp\left(L+\log\left(C_{b}+z\cdot K% _{b}\right)+\log\left(C_{a}-z\cdot K_{a}\right)\right)\,,$

where $K_{a}=(I-e^{\theta M_{a}})$ , $K_{b}=(I-e^{\theta M_{b}})$ , and $L=\sum_{i\in[n]\backslash\{a,b\}}\log C_{i}\in\mathbb{S}_{d}$ . If $\theta\leq 0$ , then $K_{a}\succeq 0$ and $K_{b}\succeq 0$ . Using Lemma 7, $z\rightarrow g_{t,\theta}(x+z(e_{a}-e_{b}))$ is concave in $z$ , and the result follows. $\Box$
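Concavity under swaps can also be observed numerically on a small random instance. The sketch below, assuming NumPy, evaluates $g_{t,\theta}$ through eigendecompositions and computes second differences along a swap direction $e_{a}-e_{b}$. The instance, the helper names, and the step size are illustrative assumptions, and a numerical check of this kind is a sanity test, not a proof.

```python
import numpy as np

def sym_fun(A, fun):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    w, Q = np.linalg.eigh(A)
    return (Q * fun(w)) @ Q.T

def g(x, Ms, theta, t):
    """e^{-theta t} * tr exp(sum_i log(x_i e^{theta M_i} + (1 - x_i) I))."""
    d = Ms[0].shape[0]
    total = np.zeros((d, d))
    for xi, M in zip(x, Ms):
        C = xi * sym_fun(theta * M, np.exp) + (1 - xi) * np.eye(d)
        total += sym_fun(C, np.log)
    return np.exp(-theta * t) * np.trace(sym_fun(total, np.exp))

rng = np.random.default_rng(0)
Ms = [np.outer(v, v) for v in rng.normal(size=(4, 3))]  # rank-one PSD matrices
x = np.full(4, 0.5)
theta, t, step = -1.0, 0.5, 0.05
swap = np.zeros(4)
swap[0], swap[1] = 1.0, -1.0                            # direction e_a - e_b

# Second differences of z -> g(x + z * (e_a - e_b)); concavity predicts <= 0.
second_diffs = [
    g(x + (z + step) * swap, Ms, theta, t)
    - 2 * g(x + z * swap, Ms, theta, t)
    + g(x + (z - step) * swap, Ms, theta, t)
    for z in np.linspace(-0.3, 0.3, 7)
]
```

All sampled points stay inside $[0,1]^{4}$, so each $C_{i}$ remains positive definite and the matrix logarithm is well defined.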

Combining Lemma 8 and Theorem 7 with Lemma 6, we obtain Lemma 4.