CVXSADes: a stochastic algorithm for constructing optimal exact regression designs with single or multiple objectives

Chi-Kuang Yeh (corresponding author: [email protected]), Department of Statistics and Actuarial Science, University of Waterloo; Julie Zhou, Department of Mathematics and Statistics, University of Victoria

May 5, 2024

ABSTRACT

We propose an algorithm to construct optimal exact designs (EDs). Most of the work in the optimal regression design literature focuses on the approximate design (AD) paradigm because of its desirable properties, including the optimality verification conditions derived by Kiefer (1959, 1974). ADs may have unbalanced weights, and practitioners may have difficulty implementing them with a designated run size $n$. Some EDs are constructed using rounding methods to get an integer number of runs at each support point of an AD, but this approach may not yield optimal results. To construct EDs, one may need to perform new combinatorial constructions for each $n$, and there is no unified approach to construct them. Therefore, we develop a systematic way to construct EDs for any given $n$. Our method transforms ADs into EDs in two steps while retaining high statistical efficiency. The first step constructs an AD by exploiting the convexity of many design criteria. The second step employs a simulated annealing algorithm to search for the ED stochastically. Through several applications, we demonstrate the utility of our method for various design problems. Additionally, we show that the design efficiency approaches unity as the number of design points increases.

Keywords: design of experiment, optimal approximate design, exact design, multiple-objective design, maximin design, stochastic optimization, annealing algorithm, CVX solver

MSC 2020: 62K05, 62K20.

1 Introduction

Consider a general regression model,

y_{i}=\eta(\boldsymbol{\mathrm{x}}_{i},\boldsymbol{\theta})+\epsilon_{i},\quad i=1,\ldots,n,

(1)

where $y_{i}$ is the $i$-th observation of a response variable $y$ at design point $\boldsymbol{\mathrm{x}}_{i}\in S\subset\mathbb{R}^{p}$, $S$ is a design space, $\boldsymbol{\theta}\in\mathbb{R}^{q}$ is the unknown regression parameter vector, the response function $\eta(\boldsymbol{\mathrm{x}}_{i},\boldsymbol{\theta})$ can be a linear or nonlinear function of $\boldsymbol{\theta}$, and the errors $\epsilon_{i}$ are assumed to be uncorrelated with mean zero and finite variance $\sigma^{2}$. Let $\hat{\boldsymbol{\theta}}$ be an estimator of $\boldsymbol{\theta}$, such as the least squares estimator. Various optimal designs are defined by minimizing $\phi\left\{\mathcal{C}ov(\hat{\boldsymbol{\theta}})\right\}$ over the design points $\boldsymbol{\mathrm{x}}_{1},\ldots,\boldsymbol{\mathrm{x}}_{n}$, where the function $\phi(\cdot)$ can be the determinant, the trace, or another scalar function. The resulting designs are called optimal exact designs (OEDs), which depend on the response function $\eta(\cdot,\cdot)$, the design space $S$, the estimator $\hat{\boldsymbol{\theta}}$, the scalar function $\phi(\cdot)$, and the number of points $n$. For searching for OEDs, coordinate exchange and simulated annealing (SA) algorithms have been developed and used; see Meyer and Nachtsheim (1988, 1995), Wilmut and Zhou (2011), Smucker et al. (2012), Rempel and Zhou (2014) and Palhazi Cuervo et al. (2016) for a small sample of contributions to this problem. It is well known that it is difficult to construct OEDs, even for relatively simple problems; see Section 1.7 in Berger and Wong (2009) for more details. Note that, in addition to the model and the parameter values, exact designs also depend on the run size $n$, so the practitioner needs to recalculate the design for each $n$, which makes them challenging to construct in practice.

To avoid computing a separate exact design for every possible $n$, Kiefer (1959, 1974) proposed and developed the general equivalence theory for optimal approximate designs (OADs). With an approximate design, one does not need to recalculate the design for each $n$. Instead, one calculates the proportion of resources to allocate at each support point. The equivalence theorem is useful for constructing OADs analytically and numerically. After obtaining OADs, we may use rounding to convert them to exact designs. This method is usually suggested in research papers; e.g., Pukelsheim and Rieder (1992) and its follow-up works. Before explaining the details of the conversion, we first give a short review of OADs. Let $\xi(\boldsymbol{\mathrm{x}})$ be a discrete distribution (design) with $k$ support points in $S$, say ${\bf v}_{1},\ldots,{\bf v}_{k}$, and denote their corresponding weights by $w_{1},\ldots,w_{k}$, respectively. Note that $k$ is not fixed and can be any positive integer. Denote the set of all discrete distributions on $S$ as $\Xi_{S}$. The information matrix of a design $\xi(\boldsymbol{\mathrm{x}})\in\Xi_{S}$ for model (1) is given by

\displaystyle\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta}^{*})=\sum_{i=1}^{k}w_{i}\boldsymbol{\mathrm{f}}({\bf v}_{i},\boldsymbol{\theta}^{*})\boldsymbol{\mathrm{f}}^{\top}({\bf v}_{i},\boldsymbol{\theta}^{*}),

(2)

where vector $\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta})=\frac{\partial\eta(\boldsymbol{\mathrm{x}},\boldsymbol{\theta})}{\partial\boldsymbol{\theta}}$, and $\boldsymbol{\theta}^{*}$ is the true value of $\boldsymbol{\theta}$. The covariance matrix of $\hat{\boldsymbol{\theta}}$, $\mathcal{C}ov(\hat{\boldsymbol{\theta}})$, is proportional to $\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})$. An OAD is defined as the minimizer of $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}$ over all possible designs $\xi\in\Xi_{S}$ for a given function $\phi(\cdot)$. If $\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta}^{*})$ depends on $\boldsymbol{\theta}^{*}$, then the OAD is called a locally OAD or simply OAD in this paper. Notice that, for linear response functions, $\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta}^{*})$ does not depend on $\boldsymbol{\theta}^{*}$. In practice, we do not know $\boldsymbol{\theta}^{*}$, and replace $\boldsymbol{\theta}^{*}$ in (2) by an estimate, which may be available from pilot studies or domain knowledge.
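As a small illustration of (2), the following Python sketch assembles the information matrix of a three-point design for a quadratic model $\eta(x,\boldsymbol{\theta})=\theta_{0}+\theta_{1}x+\theta_{2}x^{2}$. The model, the design, and the helper names `info_matrix` and `f_quadratic` are assumptions chosen for illustration; they do not come from the paper's implementation.

```python
def info_matrix(points, weights, f):
    """Return I(xi) = sum_i w_i f(v_i) f(v_i)^T as a list of rows."""
    q = len(f(points[0]))
    mat = [[0.0] * q for _ in range(q)]
    for v, w in zip(points, weights):
        fv = f(v)
        for a in range(q):
            for b in range(q):
                mat[a][b] += w * fv[a] * fv[b]
    return mat


def f_quadratic(x):
    # Gradient of theta0 + theta1*x + theta2*x^2 with respect to theta
    # (hypothetical example model; linear in theta, so no theta* is needed).
    return [1.0, x, x * x]


# Equal weights at -1, 0, 1 on the design space S = [-1, 1].
I = info_matrix([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3], f_quadratic)
for row in I:
    print(row)
```

Because the model is linear in $\boldsymbol{\theta}$, the matrix does not depend on $\boldsymbol{\theta}^{*}$, in line with the remark above.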

Since it is easier to construct OADs than OEDs, OADs have been obtained for various models, design spaces, and optimality criteria. Numerical methods are often used for finding OADs, including, for example, the multiplicative algorithm (Zhang and Mukerjee, 2013; Bose and Mukerjee, 2015), the cocktail algorithm (Yu, 2011), genetic algorithms (Broudiscou et al., 1996; Hamada et al., 2001), semi-definite programming (Papp, 2012; Duarte and Wong, 2015; Ye et al., 2017), semi-infinite programming tools in Duarte and Wong (2014) and Duarte et al. (2015), the particle swarm method (Chen et al., 2015), the convex optimization method via the CVX toolbox (Grant and Boyd, 2020) in Gao and Zhou (2017), a general method in Yang et al. (2013), and an efficient method in Duan et al. (2022). Mandal et al. (2015) provide a comprehensive review of the algorithmic approaches used in most of the methods mentioned above. Recently, Wong and Zhou (2019, 2023) gave detailed discussions and comments on several algorithms, and they also developed effective numerical algorithms for finding OADs with multiple objective functions.

Although OADs have these advantages and have received tremendous attention, it remains unclear how to construct $n$ design points efficiently from them. Suppose $\xi_{\phi}^{*}$ is an OAD with $m$ support points in $S$, say ${\bf v}_{1}^{*},\ldots,{\bf v}_{m}^{*}$, and their weights are $w_{1}^{*},\ldots,w_{m}^{*}$, respectively. It is clear that $\xi_{\phi}^{*}$ does not depend on $n$. However, it may not be easy to implement $\xi_{\phi}^{*}$ with $n$ runs in practice, as $nw_{i}^{*}$, $i=1,\ldots,m$, are usually not integers. A general suggestion is to round each $nw_{i}^{*}$ to the nearest positive integer subject to the total number of design points being $n$ (Wong and Zhou, 2019). In some situations, how best to round $nw_{i}^{*}$ may not be clear. For instance, if an approximate design has weights $(w_{1}^{*},\ldots,w_{5}^{*})=(0.2493,0.2465,0.1033,0.1517,0.2492)$, from a design in Haines et al. (2018, Example 4.2), then it is not clear how to choose $n=10$ design points from the approximate design. Are there good strategies for selecting $n$ design points from OADs?
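The ambiguity in this example can be made concrete with a short Python sketch. Rounding each $nw_{i}^{*}$ to the nearest integer gives only nine runs, so some tie-breaking rule is needed to reach $n=10$. The largest-remainder rule below is just one possible choice, used here as an assumption for illustration; it is not the procedure of Pukelsheim and Rieder (1992), and the function name `largest_remainder_round` is hypothetical.

```python
import math


def largest_remainder_round(weights, n):
    """Floor each n*w_i, then give the leftover runs to the largest remainders."""
    targets = [n * w for w in weights]
    counts = [math.floor(t) for t in targets]
    shortfall = n - sum(counts)
    order = sorted(range(len(weights)),
                   key=lambda i: targets[i] - counts[i], reverse=True)
    for i in order[:shortfall]:
        counts[i] += 1
    return counts


weights = [0.2493, 0.2465, 0.1033, 0.1517, 0.2492]
nearest = [round(10 * w) for w in weights]      # naive nearest-integer rounding
counts = largest_remainder_round(weights, 10)   # one rule that sums to n
print(nearest, sum(nearest))
print(counts, sum(counts))
```

Under this rule the extra run at the first support point is decided by a remainder of 0.493 against 0.492 at the fifth point, which shows how arbitrary the allocation can be for small $n$.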

In addition, the debate surrounding this rounding method has persisted for a long time. In a review paper, John and Draper (1975) stated, “the rounding-off procedure may eliminate points with small measure, thereby changing the nature of the design.” López-Fidalgo (2023) discusses the controversy regarding exact designs in Section 1.4.4 of his recent book, noting that the rounding approach from approximate design to exact design is highly controversial, with some, including Box, dissenting. In particular, on page 7, López-Fidalgo (2023) states, “Box has never accepted the use of the approximate designs introduced by Kiefer (1959),” and on page 11, he further remarks, “This idea came from Kiefer (1974) and it used to be a controversial topic that George Box and others never have liked.” When the number of runs permitted by the available resources is small, it is unclear which support points of the approximate design to keep, or whether to keep any of them at all. Mukerjee and Huda (2016) also raised issues related to rounding for fractional factorial designs. They studied procedures for finding highly efficient exact designs from approximate designs; however, their procedure can only be applied to designs with a finite number of points.

In this paper, we propose a stochastic algorithm to construct OEDs to address these issues and systematically construct exact designs. Our proposed method utilizes a meta-heuristic algorithm, which does not rely on restrictive assumptions and does not require computing the gradient of the loss (objective) function in optimal design problems, which may be difficult to obtain in complicated design problems. Our algorithm first constructs an OAD and uses it as a starting point to search for exact designs. Additionally, OADs are used to compute the design efficiency of OEDs. Various optimality criteria are studied, including single-objective and multiple-objective criteria. To summarize, this paper makes three important contributions to optimal design of experiments as follows.

(i) Fast and gradient-free algorithm: Our first contribution is the development of a general algorithm for finding highly efficient OEDs via a CVX solver and an SA algorithm. Highly efficient OEDs can be found easily with the algorithm. In particular, our method requires neither the derivative/gradient nor the Hessian matrix of the objective function of the design problem.
(ii) Importance of finding both exact and approximate designs: Our second contribution is to use our algorithm for finding both the OAD and the OED, and to show why searching for both together matters. If we just use an SA algorithm to search for exact designs, we do not know if the resulting designs are optimal or efficient. If we just search for OADs and use a rounding method to obtain OEDs, then the resulting designs may not be efficient, or we may not know how to do the rounding. By computing an OAD first and then searching for an exact design, we can find highly efficient exact designs and use the OAD to compute their design efficiency.
(iii) Exact design on complex setup: Our third contribution is to construct highly efficient exact designs in complicated settings. The conventional methods for computing exact designs in the literature mainly centre around low-dimensional (small $p$) problems with one objective function; for instance, Duarte et al. (2020) consider only settings in at most three dimensions with a single objective function. There is a need to fill this gap by constructing exact designs in more complex settings, including high-dimensional design spaces and more than one objective function. Our method is demonstrated to find exact designs in higher dimensions, up to a seven-dimensional space, and exact designs with multiple competing objectives. We present four applications with various optimality criteria and design spaces, including multiple-objective criteria and high-dimensional design spaces.

The rest of the paper is organized as follows. In Section 2, we review the optimal design criteria with single and multiple objectives, related equivalence theory for OADs, and convex optimal design problems on discrete design spaces. In Section 3, we develop an effective algorithm to find OADs and OEDs for single and multiple objectives. We also give several properties about the OADs and OEDs and many remarks on the properties of the proposed algorithm. Section 4 presents several applications and their OEDs, which are difficult to find by conventional methods. Finally, we close the paper with the concluding remarks in Section 5. All proofs and derivations are in the Appendix. The implementation is available on the author’s GitHub page https://github.com/chikuang/CVXSADes.

2 Optimality criteria with single or multiple objectives

Various optimality criteria have been proposed and studied in the literature; see, for example, Fedorov (1972), Pukelsheim (1993), Berger and Wong (2009), and Dean et al. (2015). Here we recall several criteria to illustrate optimal design problems and equivalence theory, and present convex optimization problems on discrete design spaces.

2.1 Optimal design problems

A-, c-, D-, I-, and E-optimality criteria are commonly used to construct optimal designs with a single objective function. The optimal design problems for finding OADs can be written as

\displaystyle\min_{\xi\in\Xi_{S}}\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\},

(3)

where $\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta}^{*})$ is defined in (2), and $\phi$ is a scalar function. For example, $\phi$ is the determinant function for D-optimality, and trace for A-optimality. We denote trace and determinant functions as $\operatorname{tr}(\cdot)$ and $\det(\cdot)$ , respectively. Let $\xi_{\phi}^{*}$ be the solution to problem (3), which depends on $\phi$ , and $\xi_{\phi}^{*}$ is called an OAD.

The equivalence theory states the necessary and sufficient condition that an OAD satisfies. A general form for the condition is

\displaystyle d_{\phi}(\boldsymbol{\mathrm{x}},\xi_{\phi}^{*})\leq 0,\quad\mbox{for all}~\boldsymbol{\mathrm{x}}\in S,

(4)

where function $d_{\phi}(\boldsymbol{\mathrm{x}},\xi_{\phi}^{*})$ depends on the optimality criterion and is the negative of the directional derivative of $\phi$ . The equality in (4) holds at the support points of $\xi_{\phi}^{*}$ . For a D-optimal design $\xi_{\phi}^{*}$ ,

\displaystyle d_{\phi}(\boldsymbol{\mathrm{x}},\xi_{\phi}^{*})=\boldsymbol{\mathrm{f}}^{\top}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})-q.

If $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}=\operatorname{tr}\left\{{\bf C}^{\top}\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*}){\bf C}\right\}$ with a constant matrix ${\bf C}$ ($q\times r$; $r\leq q$), which includes A-, c-, and I-optimality criteria, then

\displaystyle d_{\phi}(\boldsymbol{\mathrm{x}},\xi_{\phi}^{*})=\boldsymbol{\mathrm{f}}^{\top}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*}){\bf C}{\bf C}^{\top}\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})-\operatorname{tr}\left\{{\bf C}^{\top}\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*}){\bf C}\right\}.
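Condition (4) can be checked numerically on a grid. The Python sketch below does this for D-optimality, assuming a quadratic model with $\boldsymbol{\mathrm{f}}(x)=(1,x,x^{2})^{\top}$ on $S=[-1,1]$ and the classical candidate design with equal weights at $-1$, $0$, and $1$. The model, the grid, and the helper names `inverse` and `d_D` are illustrative assumptions rather than part of the paper.

```python
def inverse(mat):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    q = len(mat)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(q)]
           for i, row in enumerate(mat)]
    for col in range(q):
        pivot = max(range(col, q), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(q):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[q:] for row in aug]


def f(x):
    return [1.0, x, x * x]  # hypothetical quadratic model, q = 3


def d_D(x, inv_info):
    """d_phi(x, xi) = f(x)^T I^{-1} f(x) - q for D-optimality."""
    fx = f(x)
    q = len(fx)
    quad = sum(fx[a] * inv_info[a][b] * fx[b]
               for a in range(q) for b in range(q))
    return quad - q


support, weights = [-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3]
info = [[sum(w * f(v)[a] * f(v)[b] for v, w in zip(support, weights))
         for b in range(3)] for a in range(3)]
inv_info = inverse(info)

grid = [-1 + 2 * j / 200 for j in range(201)]
max_d = max(d_D(x, inv_info) for x in grid)
print(max_d)  # should be zero up to rounding, attained at the support points
```

For this design the function works out to $-4.5x^{2}(1-x^{2})$, so it is nonpositive on $[-1,1]$ and vanishes exactly at the three support points.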

For a generalized linear model (GLM), the maximum likelihood estimator is often used to estimate parameter $\boldsymbol{\theta}$ . The information matrix of design $\xi(\boldsymbol{\mathrm{x}})$ is different from that in (2) and can be written as

\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta}^{*})=\sum_{i=1}^{k}w_{i}\lambda({\bf v}_{i},\boldsymbol{\theta}^{*})\boldsymbol{\mathrm{f}}({\bf v}_{i})\boldsymbol{\mathrm{f}}^{\top}({\bf v}_{i}),

(5)

where $\lambda(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$ and $\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}})$ depend on the link and predictor functions of the GLM. In function $d_{\phi}(\boldsymbol{\mathrm{x}},\xi_{\phi}^{*})$, we replace $\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$ by $\sqrt{\lambda({\boldsymbol{\mathrm{x}}},\boldsymbol{\theta}^{*})}\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}})$, and several examples are given in Section 4.

2.2 Optimal design problems on discrete design spaces

The necessary and sufficient condition in (4) enables us to find analytical and numerical solutions of $\xi_{\phi}^{*}$ for various models, including polynomial, second-order, and nonlinear models. However, it is still challenging to find $\xi_{\phi}^{*}$ for complicated models or design spaces. Wong and Zhou (2019) discussed and investigated OADs on discrete design spaces. They used the fact that optimal design problems are convex optimization problems for commonly used optimality criteria, and applied the CVX solver for finding OADs. Here we recall some details of the optimal design problems on discrete design spaces.

Let $S_{N}=\{{\bf u}_{1},\ldots,{\bf u}_{N}\}\subset S$ be a discrete design space with $N$ points, where ${\bf u}_{1},\ldots,{\bf u}_{N}$ are user selected points. One possible choice is to use equally spaced grid points in $S$ . Denote any distribution on $S_{N}$ by

\xi_{N}=\left(\begin{array}{cccc}{\bf u}_{1}&{\bf u}_{2}&\ldots&{\bf u}_{N}\\ w_{1}&w_{2}&\ldots&w_{N}\end{array}\right),

where the weights satisfy $w_{j}\geq 0$ for $j=1,\ldots,N$ and $\sum_{j=1}^{N}w_{j}=1$, and a point ${\bf u}_{j}$ is a support point of $\xi_{N}$ if the corresponding weight $w_{j}>0$. Let $\Xi_{S_{N}}$ be the set of all distributions on $S_{N}$. The information matrix of design $\xi_{N}$ becomes $\boldsymbol{\mathrm{I}}(\xi_{N},\boldsymbol{\theta}^{*})$, which can be computed from (2) or (5), replacing ${\bf v}_{i}$ by ${\bf u}_{i}$ and replacing $\sum_{i=1}^{k}$ by $\sum_{i=1}^{N}$. It is important to notice that in $\xi_{N}$, ${\bf u}_{1},\ldots,{\bf u}_{N}$ are fixed points, but $w_{1},\ldots,w_{N}$ are unknown weights. In addition, $\boldsymbol{\mathrm{I}}(\xi_{N},\boldsymbol{\theta}^{*})$ is linear in the weights $w_{1},\ldots,w_{N}$. Let the weight vector be ${\bf w}=(w_{1},\ldots,w_{N})^{\top}$. Then $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{N},\boldsymbol{\theta}^{*})\right\}$ is a convex function of ${\bf w}$ for commonly used optimality criteria; see Boyd and Vandenberghe (2004) and Wong and Zhou (2019). An OAD on $S_{N}$, denoted by $\xi_{\phi,N}^{*}$, is a solution to the following optimization problem,

\min_{\xi_{N}\in\Xi_{S_{N}}}\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{N},\boldsymbol{\theta}^{*})\right\},

which, written in terms of the weight vector ${\bf w}$, is

\displaystyle\left\{\begin{array}{l}\min_{\bf w}\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{N},\boldsymbol{\theta}^{*})\right\}\\ \mbox{subject to:}~~w_{j}\geq 0,~j=1,\ldots,N,~~\sum_{j=1}^{N}w_{j}=1.\end{array}\right.

(8)

Problem (8) is a constrained convex optimization problem if $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{N},\boldsymbol{\theta}^{*})\right\}$ is a convex function of ${\bf w}$. It can be solved by the CVX solver, and detailed procedures for using CVX to find $\xi_{\phi,N}^{*}$ are given in Wong and Zhou (2019).
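To show what solving (8) over the weights looks like without a convex solver at hand, the following Python sketch uses the classical multiplicative weight update for D-optimality as a stand-in. It is not CVX and not the procedure of Wong and Zhou (2019). The model $\boldsymbol{\mathrm{f}}(x)=(1,x)^{\top}$, the 21-point grid on $[-1,1]$, the iteration count, and the function name `d_optimal_weights` are all assumptions made for this illustration.

```python
def d_optimal_weights(grid, iters=300):
    """Multiplicative update w_j <- w_j * d_j / q for the model f(x) = (1, x)."""
    N = len(grid)
    q = 2
    w = [1.0 / N] * N                      # start from uniform weights on S_N
    for _ in range(iters):
        # Entries of I(xi_N) = [[m0, m1], [m1, m2]], linear in the weights.
        m0 = sum(w)
        m1 = sum(wj * x for wj, x in zip(w, grid))
        m2 = sum(wj * x * x for wj, x in zip(w, grid))
        det = m0 * m2 - m1 * m1
        # Variance function d_j = f(u_j)^T I^{-1} f(u_j) via the 2x2 inverse.
        d = [(m2 - 2.0 * m1 * x + m0 * x * x) / det for x in grid]
        # The weights stay on the simplex because sum_j w_j d_j = q.
        w = [wj * dj / q for wj, dj in zip(w, d)]
    return w


grid = [(j - 10) / 10 for j in range(21)]   # S_N: 21 equally spaced points
w = d_optimal_weights(grid)
print(round(w[0], 6), round(w[-1], 6), round(sum(w), 6))
```

For this simple model the weights concentrate on the two end points of the grid, which matches the known D-optimal design for straight-line regression on a symmetric interval.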

2.3 Multiple-objective optimal designs

For multiple-objective optimal designs, three optimality criteria are often used: the compound, multiple efficiency constraint, and maximin efficiency criteria (Wong and Zhou, 2023). Since the compound and multiple efficiency constraint criteria need extra information to form the design problems, we focus only on the maximin efficiency criterion to discuss OEDs. Let $\phi_{1}(\xi),\ldots,\phi_{\ell}(\xi)$ be $\ell$ objective functions. For instance, $\phi_{1}(\xi)=\det\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\},\ldots,\phi_{\ell}(\xi)=\operatorname{tr}\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}$, which are defined as different scalar functions of the same information matrix of a model. They can also be defined as the same scalar function of information matrices of several models. Suppose there are three competing models for an experiment, which leads to three different information matrices, say, $\boldsymbol{\mathrm{I}}_{1}(\xi,\boldsymbol{\theta}_{1}^{*})$, $\boldsymbol{\mathrm{I}}_{2}(\xi,\boldsymbol{\theta}_{2}^{*})$, and $\boldsymbol{\mathrm{I}}_{3}(\xi,\boldsymbol{\theta}_{3}^{*})$, where $\boldsymbol{\theta}_{1}^{*},\boldsymbol{\theta}_{2}^{*}$, and $\boldsymbol{\theta}_{3}^{*}$ are the true parameters for the three models, respectively. We may then define $\phi_{1}(\xi)=\det\left\{\boldsymbol{\mathrm{I}}_{1}^{-1}(\xi,\boldsymbol{\theta}_{1}^{*})\right\}$, $\phi_{2}(\xi)=\det\left\{\boldsymbol{\mathrm{I}}_{2}^{-1}(\xi,\boldsymbol{\theta}_{2}^{*})\right\}$, and $\phi_{3}(\xi)=\det\left\{\boldsymbol{\mathrm{I}}_{3}^{-1}(\xi,\boldsymbol{\theta}_{3}^{*})\right\}$, so that each objective is minimized as in (3). Alternatively, we can use different scalar functions of those information matrices.

A maximin efficiency design is defined below. First, we minimize each $\phi_{i}(\xi)$ over $\xi$ to obtain an OAD $\xi_{\phi_{i}}^{*}$ , $i=1,\ldots,\ell$ . Second, we define design efficiency for a design $\xi$ and a given criterion $\phi$ as

\displaystyle\operatorname{\mathbb{E}ff}_{\phi}(\xi)=\frac{\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}}{\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}}.

(9)

For the $\ell$ objective functions, there are $\ell$ efficiencies for a design $\xi$ , which can be written as

\operatorname{\mathbb{E}ff}_{\phi_{i}}(\xi)=\frac{\phi_{i}(\xi_{\phi_{i}}^{*})}{\phi_{i}(\xi)},\quad i=1,\ldots,\ell.

Third, we find a solution to the maximin problem given as

\max_{\xi\in\Xi_{S}}\min_{1\leq i\leq\ell}~\operatorname{\mathbb{E}ff}_{\phi_{i}}(\xi),

and the solution is called a maximin efficiency design. The maximin problem is hard to solve. However, this problem can be transformed into a convex optimization problem on a discrete design space $S_{N}$, which can be solved easily via the CVX solver. On $S_{N}$, in the maximin problem we replace $\xi$ and $\Xi_{S}$ by $\xi_{N}$ and $\Xi_{S_{N}}$, respectively, and use $\xi_{\phi_{i},N}^{*}$ to compute the efficiencies. Wong and Zhou (2023) developed an effective algorithm for finding maximin optimal designs with various kinds of objective functions. Gao et al. (2024) also derived the necessary and sufficient conditions for multiple-objective optimal designs on $S_{N}$.
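The inner part of the maximin problem, the minimum efficiency of a given design, can be sketched in a few lines of Python. The example assumes the straight-line model $\boldsymbol{\mathrm{f}}(x)=(1,x)^{\top}$ on $[-1,1]$ with $\phi_{1}=\det\{\boldsymbol{\mathrm{I}}^{-1}\}$ and $\phi_{2}=\operatorname{tr}\{\boldsymbol{\mathrm{I}}^{-1}\}$, and it takes the design with weight $1/2$ at each end point as both the D- and A-optimal reference, which holds for this model. The function names are hypothetical.

```python
def moments(points, weights):
    m1 = sum(w * x for x, w in zip(points, weights))
    m2 = sum(w * x * x for x, w in zip(points, weights))
    return m1, m2


def phi_D(points, weights):
    # det(I^{-1}) for I = [[1, m1], [m1, m2]]
    m1, m2 = moments(points, weights)
    return 1.0 / (m2 - m1 * m1)


def phi_A(points, weights):
    # tr(I^{-1}) for the same 2x2 information matrix
    m1, m2 = moments(points, weights)
    return (1.0 + m2) / (m2 - m1 * m1)


def min_efficiency(points, weights, criteria):
    """criteria: list of (phi, optimal value of phi); returns (min, all effs)."""
    effs = [opt / phi(points, weights) for phi, opt in criteria]
    return min(effs), effs


# Reference optimal values from the design with weight 1/2 at -1 and at 1.
ref_points, ref_weights = [-1.0, 1.0], [0.5, 0.5]
criteria = [(phi_D, phi_D(ref_points, ref_weights)),
            (phi_A, phi_A(ref_points, ref_weights))]

# Candidate design: equal weights at -1, 0, 1.
worst, effs = min_efficiency([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3], criteria)
print(worst, effs)
```

A maximin efficiency design maximizes the first returned value over all designs; here the candidate is limited by its efficiency under the determinant criterion.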

3 Algorithms for OEDs

In this section, we develop an algorithm to construct OEDs, which depend on $n$. Let $\xi_{n,\phi}^{*}$ be the OED for an optimality criterion $\phi$, and denote its support points by ${\tilde{\boldsymbol{\mathrm{x}}}}_{1},\ldots,{\tilde{\boldsymbol{\mathrm{x}}}}_{m}$ with corresponding weights ${\tilde{w}}_{1},\ldots,{\tilde{w}}_{m}$. In contrast to ${\bf u}_{1},\ldots,{\bf u}_{N}$, the points ${\tilde{\boldsymbol{\mathrm{x}}}}_{1},\ldots,{\tilde{\boldsymbol{\mathrm{x}}}}_{m}$ are not fixed or user-specified. The weights must satisfy (i) $n{\tilde{w}}_{i}$ is a positive integer for each $i=1,\ldots,m$, and (ii) $\sum_{i=1}^{m}{\tilde{w}}_{i}=1$. Define $\Xi_{n}$ to be the set of all exact designs on $S$ with size $n$. It is clear that $\Xi_{n}\subset\Xi_{S}$. Since $\xi_{n,\phi}^{*}\in\Xi_{n}\subset\Xi_{S}$ and the OAD $\xi_{\phi}^{*}$ is a solution to (3), it follows that

\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{n,\phi}^{*},\boldsymbol{\theta}^{*})\right\}\geq\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\},

which implies that $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})\leq 1$ . A design $\xi$ is said to be highly efficient if $\operatorname{\mathbb{E}ff}_{\phi}(\xi)$ is close to $1$ . In practice we may use $\operatorname{\mathbb{E}ff}_{\phi}(\xi)\geq 0.95$ to define highly efficient design $\xi$ .

For D-optimality, we may minimize $\det\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}$ or $\log\left(\det\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}\right)$ to find D-optimal designs, but the D-efficiency is given by

\operatorname{\mathbb{E}ff}_{D}(\xi)=\frac{\left(\det\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}\right)^{1/q}}{\left(\det\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}\right)^{1/q}},

where $\xi_{\phi}^{*}$ is a D-optimal design. For other optimality criteria, the design efficiency is usually given by (9).
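As a worked illustration of the D-efficiency formula, consider (as an assumed example, not one taken from the paper) the quadratic model with $\boldsymbol{\mathrm{f}}(x)=(1,x,x^{2})^{\top}$ on $[-1,1]$, so $q=3$, the D-optimal approximate design with weight $1/3$ at each of $-1,0,1$, and the exact design $\xi$ with $n=4$ runs at $-1,0,0,1$:

```latex
\boldsymbol{\mathrm{I}}(\xi)=
\begin{pmatrix}1&0&\tfrac{1}{2}\\ 0&\tfrac{1}{2}&0\\ \tfrac{1}{2}&0&\tfrac{1}{2}\end{pmatrix},
\qquad
\det\boldsymbol{\mathrm{I}}(\xi)=\tfrac{1}{2}\left(\tfrac{1}{2}-\tfrac{1}{4}\right)=\tfrac{1}{8},
\qquad
\det\boldsymbol{\mathrm{I}}(\xi_{\phi}^{*})=\tfrac{2}{3}\left(\tfrac{2}{3}-\tfrac{4}{9}\right)=\tfrac{4}{27},

\operatorname{\mathbb{E}ff}_{D}(\xi)
=\left(\frac{\det\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*})\}}{\det\{\boldsymbol{\mathrm{I}}^{-1}(\xi)\}}\right)^{1/3}
=\left(\frac{27/4}{8}\right)^{1/3}
=\left(\frac{27}{32}\right)^{1/3}\approx 0.945 .
```

Without the $1/q$ power the ratio would be $27/32\approx 0.844$, which shows why the exponent matters when efficiencies are compared against a threshold such as $0.95$.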

Notice that $\xi_{\phi}^{*}$ may not always be available. In those cases, the OAD on $S_{N}$, $\xi_{\phi,N}^{*}$, can be used to compute a modified design efficiency as

\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})=\frac{\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi,N}^{*},\boldsymbol{\theta}^{*})\right\}}{\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{n,\phi}^{*},\boldsymbol{\theta}^{*})\right\}}.

Since $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi,N}^{*},\boldsymbol{\theta}^{*})\right\}\geq\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}$, it is possible that $\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})>1$. Nevertheless, we can still say that $\xi_{n,\phi}^{*}$ is highly efficient when $\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})\geq 0.95$. Additional comments are given in Section 3.2.

We will introduce our proposal for an effective algorithm to find highly efficient $\xi_{n,\phi}^{*}$ for small $n$ and $\xi_{\phi,N}^{*}$ in Section 3.1. Then we discuss and explore various properties of the algorithm and $\xi_{n,\phi}^{*}$ in Section 3.2.

3.1 Algorithms

We develop a general algorithm for finding an approximate design $\xi_{\phi,N}^{*}$ for a given design problem, and then construct an exact design $\xi_{n,\phi}^{*}$ with high efficiency. In Algorithm 1, we first compute the OAD $\xi_{\phi,N}^{*}$ via CVX, and then $\xi_{n,\phi}^{*}$ is obtained through an SA algorithm with a starting design generated from $\xi_{\phi,N}^{*}$. To describe the algorithm clearly, we use a general design problem below to explain the detailed steps in the algorithm. The objective function is $\Phi(\xi)=\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}$, where $\xi$ is a design on design space $S\subset\mathbb{R}^{p}$.

Algorithm 1: CVXSADes for computing OADs and OEDs.

Input:

$S_{N}=\{{\bf u}_{1},\ldots,{\bf u}_{N}\}$: a discrete design space

Function $\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$, or functions $\lambda(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$ and $\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}})$, to compute the information matrix $\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta}^{*})$ in (2) or (5)

$\Phi(\xi)=\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\right\}$: a loss/objective function

$n$: the number of points in the exact design

$M$: the number of times to run the annealing algorithm

$T_{0},~T_{min},~\alpha,~K,~\delta=10^{-5}$: parameters in the annealing algorithm for initial temperature, minimum temperature, cooling factor, number of iterations for each temperature, and tolerance for a stopping criterion, respectively

Main steps of the algorithm:

Step 1: Compute an OAD $\xi_{\phi,N}^{*}$ on $S_{N}$ as a solution to problem (8) via the CVX solver;

Step 2: For $j=1,\dots,M$:

2.1: Get an initial exact design from $\xi_{\phi,N}^{*}$ and denote it as $\xi_{n,\phi,j}^{(0)}$. The support points of $\xi_{n,\phi,j}^{(0)}$ are denoted by $\boldsymbol{\mathrm{x}}_{1}^{(0)},\ldots,\boldsymbol{\mathrm{x}}_{m}^{(0)}$, which are the same as those of $\xi_{\phi,N}^{*}$. The weights $w_{1}^{(0)},\ldots,w_{m}^{(0)}$ at $\boldsymbol{\mathrm{x}}_{1}^{(0)},\ldots,\boldsymbol{\mathrm{x}}_{m}^{(0)}$, respectively, are obtained as follows. Let $n_{i}^{(0)}$ be the rounded integer from $nw_{i}^{*}$, satisfying $\sum_{i=1}^{m}n_{i}^{(0)}=n$, and let $w_{i}^{(0)}=n_{i}^{(0)}/n$ for $i=1,\ldots,m$. Note that $w_{1}^{*},\ldots,w_{m}^{*}$ are the weights at the support points of $\xi_{\phi,N}^{*}$;

2.2: Let $t\leftarrow 1$, $T\leftarrow T_{0}$, $\ell_{1}\leftarrow 1$, and $\ell_{2}\leftarrow 0$;

2.3: Simulated annealing: while $T>T_{min}$ and $\lvert\ell_{2}-\ell_{1}\rvert>\delta$:

2.3.1: Make a small change in $\xi_{n,\phi,j}^{(t-1)}$ to get $\xi_{n,\phi,j}^{(t)}$. This is done by moving a randomly selected design point in $\xi_{n,\phi,j}^{(t-1)}$, say $\tilde{\boldsymbol{\mathrm{x}}}$, to a new location in $S$, where the new location is randomly generated in a small hyper-cube centred at $\tilde{\boldsymbol{\mathrm{x}}}$;

2.3.2: $\xi_{n,\phi,j}^{(t)}$ is accepted if $\exp\left\{-(\Phi(\xi_{n,\phi,j}^{(t)})-\Phi(\xi_{n,\phi,j}^{(t-1)}))/T\right\}>u^{(t)}$, where $u^{(t)}\sim\mbox{unif}(0,1)$. If it is accepted, then $\ell_{1}\leftarrow\Phi(\xi_{n,\phi,j}^{(t-1)})$, $\ell_{2}\leftarrow\Phi(\xi_{n,\phi,j}^{(t)})$, and $t\leftarrow t+1$;

2.3.3: Let $T\leftarrow\alpha\cdot T$ after using the same temperature $K$ times;

2.4: Let $\xi_{n,\phi,j}^{*}$ be the last accepted design. Compute $\Phi(\xi_{n,\phi,j}^{*})$ and the modified design efficiency $\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi,j}^{*})\leftarrow\frac{\Phi(\xi_{\phi,N}^{*})}{\Phi(\xi_{n,\phi,j}^{*})}$;

Step 3: Select the best design from $\{\xi_{n,\phi,j}^{*}\}_{j=1}^{M}$, that is, the one with the highest efficiency $\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}$, and denote it as an OED $\xi_{n,\phi}^{*}$.

Figure 1: Plot of loss function values $\Phi(\xi_{n,\phi,j}^{(t)})$ in an annealing algorithm, where the red line is the loss value of the initial design after rounding, $\Phi(\xi_{n,\phi,j}^{(0)})$.
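The annealing part of Algorithm 1 (Steps 2 and 3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation, which is available at the GitHub link given in Section 1. It assumes D-optimality for the quadratic model $\boldsymbol{\mathrm{f}}(x)=(1,x,x^{2})^{\top}$ on $S=[-1,1]$, takes the known approximate design with weights $1/3$ at $-1,0,1$ in place of Step 1, samples proposals from the hyper-cube intersected with $S$, and checks the stopping rule once per temperature level. The names `loss` and `anneal`, the half-width `h`, and all parameter values are hypothetical.

```python
import math
import random


def loss(design):
    """Phi = det(I^{-1}) for f(x) = (1, x, x^2); inf if I is singular."""
    n = len(design)
    m = [sum(x ** k for x in design) / n for k in range(5)]  # moments m0..m4
    det = (m[0] * (m[2] * m[4] - m[3] ** 2)
           - m[1] * (m[1] * m[4] - m[3] * m[2])
           + m[2] * (m[1] * m[3] - m[2] ** 2))
    return 1.0 / det if det > 1e-12 else math.inf


def anneal(start, rng, T0=1.0, T_min=1e-4, alpha=0.9, K=50, delta=1e-5, h=0.1):
    """One annealing run (Steps 2.2 to 2.4); returns the last accepted design."""
    cur, cur_loss = list(start), loss(start)
    T, l1, l2 = T0, 1.0, 0.0
    while T > T_min and abs(l2 - l1) > delta:
        for _ in range(K):                      # K proposals per temperature
            i = rng.randrange(len(cur))         # pick a design point at random
            cand = list(cur)
            lo, hi = max(-1.0, cur[i] - h), min(1.0, cur[i] + h)
            cand[i] = rng.uniform(lo, hi)       # move it within the cube and S
            cand_loss = loss(cand)
            diff = cand_loss - cur_loss
            # Metropolis rule; the diff <= 0 shortcut avoids overflow in exp.
            if diff <= 0 or math.exp(-diff / T) > rng.random():
                l1, l2 = cur_loss, cand_loss
                cur, cur_loss = cand, cand_loss
        T *= alpha                              # cool down (Step 2.3.3)
    return cur, cur_loss


rng = random.Random(0)
M = 3
start = [-1.0, 0.0, 0.0, 1.0]       # n = 4, rounded counts (1, 2, 1)
phi_oad = 27.0 / 4.0                # det(I^{-1}) of the approximate design
runs = [anneal(start, rng) for _ in range(M)]
best_design, best_loss = min(runs, key=lambda r: r[1])   # Step 3
efficiency = phi_oad / best_loss    # ratio form used in Step 2.4
print(sorted(best_design), efficiency)
```

Because the reference design here is optimal on all of $S$, the reported ratio cannot exceed one; with a grid-based $\xi_{\phi,N}^{*}$ it could, as discussed for the modified efficiency above.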

Remark 1.

Wong and Zhou (2019, 2023) have provided details for finding both single- and multiple-objective OADs on $S_{N}$ via CVX. The CVX solver can solve problem (8) easily with $N$ as large as 10,000. From our numerical results, $N$ does not have to be very large to obtain a highly efficient OAD $\xi_{\phi,N}^{*}$. Often we can choose $N<10,000$. In Application 1 in Section 4, a highly efficient OAD $\xi_{\phi,N}^{*}$ is obtained with $N=21^{2}=441$, and additional comments are provided there.

Remark 2.

To get an initial design $\xi_{n,\phi}^{(0)}$, we need to get the rounded integers $n_{i}^{(0)}$. When it is not clear how to round $nw_{i}^{*}$ for some applications, we can just take the floor or ceiling of $nw_{i}^{*}$ such that $\sum_{i=1}^{m}n_{i}^{(0)}=n$. Since $\xi_{n,\phi}^{(0)}$ is just a starting point in the annealing algorithm, the choice of rounding is not crucial for the subsequent search. As usual, running the annealing algorithm several times is helpful to find an exact design with a high design efficiency value.

Remark 3.

The SA algorithm has been used to find optimal designs in the literature, and it has several parameters that need to be adjusted for each optimization problem. For instance, Wilmut and Zhou, (2011) discussed some strategies for adjusting those parameters. Here, for each fixed $j$ , it is helpful to plot the loss function values $\Phi(\xi_{n,\phi,j}^{(t)})$ versus the iteration $t$ to check the annealing parameters. Figure 1 (from Application 2 in Section 4) shows a plot from the annealing algorithm when the parameters are set appropriately. At the beginning of the search $\Phi(\xi_{n,\phi,j}^{(t)})$ fluctuates, and eventually, as $t$ increases, $\Phi(\xi_{n,\phi,j}^{(t)})$ decreases and converges to a limit.
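The role of the annealing parameters can be seen in a minimal sketch of a generic SA loop. It assumes the standard Metropolis acceptance rule and geometric cooling $T\leftarrow\alpha T$ after $K$ proposals; the names `anneal`, `T0`, `n_temps`, `toy_loss` and `toy_perturb` are hypothetical, and the toy problem (four runs for a straight-line model on $[-1,1]$ ) is not one of the paper's applications.

```python
import math
import random

def anneal(loss, perturb, x0, T0=1.0, alpha=0.9, K=50, n_temps=60, seed=0):
    """Minimise `loss` by simulated annealing with geometric cooling.

    Returns the last accepted state and the trace of current loss values.
    """
    rng = random.Random(seed)
    x, l1 = x0, loss(x0)
    trace = [l1]
    T = T0
    for _ in range(n_temps):
        for _ in range(K):                  # K proposals at each temperature
            y = perturb(x, rng)
            l2 = loss(y)
            # Metropolis rule (assumed): always accept an improvement, accept
            # a worse design with probability exp(-(l2 - l1) / T).
            if l2 <= l1 or rng.random() < math.exp(-(l2 - l1) / T):
                x, l1 = y, l2
            trace.append(l1)
        T *= alpha                          # cooling: T <- alpha * T
    return x, trace

def toy_loss(points):
    # 1/det of the information matrix for f(x) = (1, x) with weights 1/n
    n = len(points)
    mean = sum(points) / n
    det = sum(p * p for p in points) / n - mean ** 2
    return 1.0 / det if det > 1e-12 else math.inf

def toy_perturb(points, rng):
    # Move one randomly chosen design point slightly, staying in S = [-1, 1]
    new = list(points)
    i = rng.randrange(len(new))
    new[i] = min(1.0, max(-1.0, new[i] + rng.uniform(-0.2, 0.2)))
    return new

start = [-0.5, -0.2, 0.3, 0.6]
design, trace = anneal(toy_loss, toy_perturb, start)
print(toy_loss(start), trace[-1])
```

Plotting `trace` against its index gives the kind of diagnostic plot described in Remark 3: large fluctuations while $T$ is high, then a settling of the loss as $T$ shrinks.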

Remark 4.

If an OAD $\xi_{\phi}^{*}$ on $S$ is available, skip Step 1 and replace $\xi_{\phi,N}^{*}$ by it in Step 2.1.

Remark 5.

Since an SA algorithm is used in Step 2.3, Algorithm 1 is not guaranteed to find the best exact design. However, we can set a required efficiency value, say $0.95$ , and run the algorithm $M$ times to search for a highly efficient exact design.

Remark 6.

Algorithm 1 also works for the maximin optimal design by setting

\Phi(\xi)=-\min_{1\leq i\leq l}~\operatorname{\mathbb{E}ff}_{\phi_{i}}(\xi).
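The sign convention in Remark 6 can be illustrated with a short sketch: minimising $\Phi$ is the same as maximising the worst of the $l$ efficiencies. The efficiency values below are hypothetical numbers chosen for illustration, not results from the paper.

```python
def maximin_loss(efficiencies):
    """Phi(xi) = -min_i Eff_i(xi); a smaller value means a better worst case."""
    return -min(efficiencies)

# Hypothetical efficiencies of two candidate designs under l = 4 criteria
design_a = [0.95, 0.70, 0.88, 0.91]
design_b = [0.82, 0.80, 0.81, 0.84]

best = min([design_a, design_b], key=maximin_loss)
print(best is design_b)   # True: design_b has the larger minimum efficiency
```

Design A is better on three of the four criteria, but the maximin criterion prefers design B because its weakest efficiency (0.80) exceeds that of design A (0.70).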

In Section 4, we provide four representative applications with various models and optimality criteria to show that Algorithm 1 is effective and works well.

3.2 Properties of optimal designs

Figure 2 shows the relationship among the three sets of distributions, where $\Xi_{n}$ is the set of exact designs with run size $n$ on $S$ . It is obvious that the optimal designs, $\xi_{n,\phi}^{*}$ , $\xi_{\phi,N}^{*}$ , and $\xi_{\phi}^{*}$ for any criterion on $\Xi_{n},\Xi_{S_{N}},$ and $\Xi_{S}$ , respectively, satisfy

\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{n,\phi}^{*},\boldsymbol{\theta}^{*})\right\}\geq\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\},~~\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi,N}^{*},\boldsymbol{\theta}^{*})\right\}\geq\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\},

for any $n$ and $N$ . Theorem 1 below shows some asymptotic results of $\xi_{n,\phi}^{*}$ and $\xi_{\phi,N}^{*}$ under some mild conditions.

Figure 2: Illustration of the three sets of distributions, $\Xi_{S}$ , $\Xi_{S_{N}}$ and $\Xi_{n}$ , for the optimal designs, OADs and OEDs, respectively.

Theorem 1.

Consider a regression model and a design space $S$ . Let $\xi_{n,\phi}^{*}$ be the OED for a given $n$ , $\xi_{\phi,N}^{*}$ be the OAD on $S_{N}$ , and $\xi_{\phi}^{*}$ be the OAD on $S$ . Assume all entries of $I(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$ are continuous functions of $\boldsymbol{\mathrm{x}}$ and $S$ is a bounded region, where $I(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})={\bf f}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*}){\bf f}^{\top}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$ for model (1), or $I(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})=\lambda(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*}){\bf f}(\boldsymbol{\mathrm{x}}){\bf f}^{\top}(\boldsymbol{\mathrm{x}})$ for a GLM. In addition, $S_{N}$ is formed by the Cartesian product of the equally spaced points for each design variable. We have the following results:

(i) $\lim_{n\to\infty}\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})=1$ .

(ii) $\lim_{N\to\infty}\operatorname{\mathbb{E}ff}_{\phi}(\xi_{\phi,N}^{*})=1$ .

The detailed proof is given in the Appendix. The asymptotic results indicate that $\xi_{n,\phi}^{*}$ and $\xi_{\phi,N}^{*}$ can be highly efficient for large $n$ and $N$ . However, in practice we are often interested in exact designs with small $n$ , and Algorithm 1 is very helpful for finding those exact designs.

From Theorem 1, $\lim_{n\to\infty}\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})=1$ . However, $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})$ may not be an increasing function of $n$ . For example, if an OAD has 4 support points with equal weights, $1/4,1/4,1/4,1/4$ , then it is easy to construct OEDs for $n$ being multiples of 4 and those OEDs have $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})=1$ . However, when $n$ is not a multiple of 4, it is clear that $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})<1$ .
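A small numerical illustration of this non-monotone behaviour is sketched below. It uses an assumed example that is not from the paper: the linear model with $\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}})=(1,x_{1},x_{2},x_{1}x_{2})^{\top}$ on $[-1,1]^{2}$ , whose D-optimal approximate design is the $2^{2}$ factorial with weight $1/4$ at each corner. The exact designs are restricted to the corners with runs spread as evenly as possible, so the values are efficiencies of these particular designs rather than proven OED efficiencies.

```python
import numpy as np

corners = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)

def f(x):
    # Regressors of the assumed model: intercept, two main effects, interaction
    return np.array([1.0, x[0], x[1], x[0] * x[1]])

def d_efficiency(counts):
    """D-efficiency of a corner design relative to equal weights 1/4."""
    w = np.asarray(counts, dtype=float) / sum(counts)
    info = sum(wi * np.outer(f(x), f(x)) for wi, x in zip(w, corners))
    opt = sum(0.25 * np.outer(f(x), f(x)) for x in corners)
    return (np.linalg.det(info) / np.linalg.det(opt)) ** (1 / 4)

for counts in [(1, 1, 1, 1), (2, 1, 1, 1), (2, 2, 1, 1), (2, 2, 2, 1), (2, 2, 2, 2)]:
    print(sum(counts), round(d_efficiency(counts), 4))
```

For $n=4,5,6,7,8$ the printed efficiencies are approximately 1, 0.9514, 0.9428, 0.9610 and 1, so the efficiency equals one at multiples of 4 and dips, non-monotonically, in between.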

When the design space $S$ is discrete, we take $S_{N}=S$ . Then we have $\xi_{\phi}^{*}=\xi_{\phi,N}^{*}$ . In this situation, the two efficiency measures coincide, $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})=\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})$ . In general, for any $S$ and $S_{N}\subset S$ , the design efficiency measures satisfy

\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})=\operatorname{\mathbb{E}ff}_{\phi}(\xi_{\phi,N}^{*})\cdot\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*}),

by their definitions in Sections 2.3 and 3. For instance, if $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{\phi,N}^{*})=0.97$ and $\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})=0.98$ , then $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})=0.9506$ . Since $\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})$ may be larger than 1, $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})$ can be larger than $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{\phi,N}^{*})$ for some models and in particular for small $N$ . $\operatorname{\mathbb{E}ff}_{\phi}(\xi_{\phi,N}^{*})$ may not be an increasing function of $N$ , but $\lim_{N\to\infty}\operatorname{\mathbb{E}ff}_{\phi}(\xi_{\phi,N}^{*})=1$ from Theorem 1.
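The product identity can be checked numerically, and it can also be inverted to recover the modified efficiency from the two efficiencies reported in Table 1. The implied values computed below are derived here from that identity for illustration; they are not reported in the paper.

```python
# Worked example from the text: 0.97 * 0.98
print(round(0.97 * 0.98, 4))   # 0.9506

# (N, Eff of xi_{phi,N}^*, Eff of xi_{n,phi}^*) copied from Table 1 (n = 10)
table1 = [
    ("21^2", 0.9716, 0.9822),
    ("31^2", 0.9901, 0.9513),
    ("41^2", 0.9961, 0.9793),
    ("51^2", 0.9985, 0.9794),
    ("81^2", 0.9984, 0.9822),
]

implied = {}
for label, eff_oad, eff_oed in table1:
    # Modified efficiency implied by Eff(OED) = Eff(OAD) * modified Eff(OED)
    implied[label] = eff_oed / eff_oad
    print(label, round(implied[label], 4))
```

For $N=21^{2}$ the implied modified efficiency is about 1.0109, an instance of the modified efficiency exceeding 1 when the grid is coarse and the annealing step moves points off the grid.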

Algorithm 1 works well and is effective at finding highly efficient $\xi_{n,\phi}^{*}$ for small or moderate $n$ . The number of distinct support points in $\xi_{n,\phi}^{*}$ may not be the same as that in $\xi_{\phi}^{*}$ or $\xi_{\phi,N}^{*}$ . When $n$ is very large, the SA algorithm in Algorithm 1 can be slow, as is true of any SA algorithm. In this situation, we can use a rounding method to construct OEDs, as illustrated in the proof of Theorem 1 in the Appendix, and we can replace $\xi_{\phi}^{*}$ by $\xi_{\phi,N}^{*}$ when $\xi_{\phi}^{*}$ is not available. In that case the number of distinct support points in $\xi_{n,\phi}^{*}$ is the same as that in $\xi_{\phi}^{*}$ or $\xi_{\phi,N}^{*}$ .

4 Applications

We apply our proposed algorithm, CVXSADes, to construct OEDs for various models and various values of $n$ . Representative results are given and discussed in four applications below. Application 1 concerns the D-optimality criterion for a logistic model with two design variables, where it is not clear how to round $nw_{i}$ from OADs. In addition, we discuss the choice of $N$ and the efficiency of OADs $\xi_{\phi,N}^{*}$ on $S_{N}$ as $N$ varies. Application 2 is about constructing optimal group testing designs over a discrete design space $S$ . The OAD $\xi_{\phi,N}^{*}$ from Algorithm 1 is more appropriate than the optimal designs in Huang et al., (2017), which include non-integer support points. Application 3 concerns high-dimensional designs, where the design space comprises seven design variables. The high dimensionality presents a challenge in finding OADs and OEDs. Algorithm 1 provides an alternative method to find optimal designs and gives better approximate designs than those in a couple of examples in Xu et al., (2019). Application 4 shows results for maximin optimal designs in which multiple objectives compete against each other. In each of the applications, we demonstrate the usefulness and effectiveness of our proposed algorithm.

Application 1.

(Two-variable logit model) Consider a two-variable binary logistic regression model with interaction, as discussed in Haines et al., (2018), where it was used to study the effectiveness of different combinations of the concentration of two insecticides. The model is given as

\mbox{logit}(p)=\log\left(\frac{p}{1-p}\right)=\theta_{0}+\theta_{1}x_{1}+\theta_{2}x_{2}+\theta_{12}x_{1}x_{2},

where $p$ is the probability of success, i.e., $p=P(Y=1)$ , $Y$ is a binary response variable, $x_{1}$ and $x_{2}$ are two design variables, such as the doses of two drugs. D-OEDs were studied and derived analytically in Haines et al., (2018) for various parameter values $\boldsymbol{\theta}^{*}$ and design spaces. For this GLM, the information matrix of design $\xi(\boldsymbol{\mathrm{x}})$ in (5) becomes

\displaystyle\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta}^{*})=\sum_{i=1}^{k}w_{i}\frac{\exp(\boldsymbol{\mathrm{f}}^{\top}({\bf v}_{i})\boldsymbol{\theta}^{*})}{\left(1+\exp(\boldsymbol{\mathrm{f}}^{\top}({\bf v}_{i})\boldsymbol{\theta}^{*})\right)^{2}}\boldsymbol{\mathrm{f}}({\bf v}_{i})\boldsymbol{\mathrm{f}}^{\top}({\bf v}_{i}),

(10)

where $\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}})=(1,x_{1},x_{2},x_{1}x_{2})^{\top}$ .
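As a minimal sketch, the information matrix (10) and a D-optimality loss can be evaluated as follows. The loss is taken to be $(\det\{\boldsymbol{\mathrm{I}}^{-1}\})^{1/q}$ , the form quoted in Application 3; the function names and the four-point design are hypothetical and serve only to exercise the formula (the design is not the D-OAD).

```python
import numpy as np

def f(x):
    x1, x2 = x
    return np.array([1.0, x1, x2, x1 * x2])

def info_matrix(points, weights, theta):
    """Information matrix (10) of a design for the two-variable logit model."""
    info = np.zeros((4, 4))
    for v, w in zip(points, weights):
        fv = f(v)
        eta = fv @ theta                                # linear predictor
        lam = np.exp(eta) / (1.0 + np.exp(eta)) ** 2    # equals p(1 - p)
        info += w * lam * np.outer(fv, fv)
    return info

def d_loss(info):
    """(det I^{-1})^{1/q}, the D-optimality loss in the form used in Application 3."""
    q = info.shape[0]
    return np.linalg.det(info) ** (-1.0 / q)

theta = np.array([-3.0, 4.0, 6.0, 1.0])    # parameter values of case (i)
# Hypothetical design for illustration only
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
weights = [0.25] * 4
M = info_matrix(points, weights, theta)
print(d_loss(M))
```

Any candidate design, approximate or exact, can be compared through `d_loss`; for an exact design the weights are simply $n_{i}/n$ .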

Since the D-optimality is used as a loss function $\phi$ , corresponding OADs and OEDs are denoted as D-OADs and D-OEDs. Using Algorithm 1 we compute D-OADs $\xi_{\phi,N}^{*}$ on $S_{N}$ and construct D-OEDs for various values of $N$ and $n$ . Then we compare $\xi_{\phi,N}^{*}$ with $\xi_{\phi}^{*}$ in Haines et al., (2018) and comment on the D-OEDs. We have worked on several cases of $\boldsymbol{\theta}^{*}$ and design spaces. Here we give representative results for two cases:
(i) $\boldsymbol{\theta}^{*}=(-3,4,6,1)^{\top}$ and $S=[0,1]^{2}$ as in Haines et al., (2018, Example 4.2(b)),
(ii) $\boldsymbol{\theta}^{*}=(-2.2054,13.5803,2.2547,1.6262)^{\top}$ and $S=[0,2]^{2}$ as in Haines et al., (2018, Section 5).

For case (i), D-OADs and D-OEDs are plotted in Figure 3. In the approximate design in Figure 3(a), $S_{N}$ includes $N=51^{2}$ grid points in $S$ , formed by the Cartesian product of 51 equally spaced points in $[0,1]$ in each dimension. Note that $\xi_{\phi,N}^{*}$ and $\xi_{\phi}^{*}$ have 5 support points that are almost the same, but $\xi_{\phi,N}^{*}$ has one extra support point with a very small weight (0.0033). The weights of $\xi_{\phi,N}^{*}$ are displayed there; the weights of $\xi_{\phi}^{*}$ are slightly different and are omitted from the plot for clarity. In addition, the loss function values of the two designs are almost the same, with $\operatorname{\mathbb{E}ff}_{D}(\xi_{\phi,N}^{*})=0.9998$ . Three D-OEDs with $n=10,15,$ and 20 are plotted in Figure 3(b), (c) and (d), respectively, with efficiency $\widetilde{\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})=0.9836,0.9785$ and $1.0001$ . This indicates that these D-OEDs are highly efficient. The support points of each D-OED are clustered around 5 points.

For case (ii), D-OADs and D-OEDs are plotted in Figure 4. There are three D-OADs in Haines et al., (2018, Section 5), which have 4, 5, and 6 support points, respectively. The one with 6 support points has the smallest value of ${\phi}(\xi_{\phi}^{*})$ , and we use it for the efficiency calculation of $\xi_{\phi,N}^{*}$ and $\xi_{n,\phi}^{*}$ . Several values of $N$ are used to compute $\xi_{\phi,N}^{*}$ and $\xi_{n,\phi}^{*}$ . Representative results are given in Table 1, which indicate that ${\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{\phi,N}^{*})$ tends to increase as $N$ increases, that ${\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})$ fluctuates slightly, and that both $\xi_{n,\phi}^{*}$ and $\xi_{\phi,N}^{*}$ are highly efficient even when $N$ is as small as $21^{2}$ . The nearly constant ${\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})$ indicates that the annealing algorithm can effectively find highly efficient exact designs starting from different $\xi_{\phi,N}^{*}$ . This implies that the choice of $N$ does not affect the OED much, and in practice, we can choose a moderate $N$ . In Figure 4 (a) and (b), $\xi_{\phi,N}^{*}$ and $\xi_{\phi}^{*}$ are plotted. For clarity, the plots only display the weights in $\xi_{\phi,N}^{*}$ , which can have 5 or 6 support points depending on the value of $N$ . The locations of the support points in $\xi_{\phi,N}^{*}$ are similar to those in $\xi_{\phi}^{*}$ . In Figure 4(c) and (d), D-OEDs are plotted. Designs V and VI are the same, but the annealing algorithm used different starting designs. In Haines et al., (2018, Section 5), it is not clear how to get D-OEDs from $\xi_{\phi}^{*}$ for a given value of $n$ . Algorithm 1 provides an effective way to construct D-OEDs for any $n$ . The exact design in Figure 4(d) has ${\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})=0.9935$ for $n=15$ .

Table 1: Efficiency of $\xi_{\phi,N}^{*}$ and $\xi_{n,\phi}^{*}$ (with $n=10$ ) for case (ii) in Application 1.

$N$	${\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{\phi,N}^{*})$	${\operatorname{\mathbb{E}ff}}_{\phi}(\xi_{n,\phi}^{*})$
$21^{2}$	0.9716	0.9822
$31^{2}$	0.9901	0.9513
$41^{2}$	0.9961	0.9793
$51^{2}$	0.9985	0.9794
$81^{2}$	0.9984	0.9822

Application 2.

(Group testing design for disease prevalence) Group testing is employed to study rare diseases when testing individuals for a trait is costly (Hughes-Oliver and Swallow, 1994; Hughes-Oliver and Rosenberger, 2000). Instead of taking samples from each individual and testing them individually, it is more cost-efficient to conduct group testing, wherein samples from individuals are pooled as a group and tested together as a unit. In Huang et al., (2017), optimal group testing designs are studied, where optimal group sizes are selected for group testing experiments. Since the design space $S$ only includes integer values, finding OADs on discrete design spaces $S_{N}(=S)$ via CVX is extremely useful. We demonstrate this for the D- and c-optimality criteria in Huang et al., (2017), and present OADs and OEDs for various values of $n$ . In addition, we find and comment on interesting features of the OEDs.

To present the optimality criteria clearly, we rewrite the information matrix from Huang et al., (2017) as follows,

\displaystyle\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta})=\sum_{i=1}^{k}w_{i}\lambda(x_{i})\boldsymbol{\mathrm{f}}(x_{i})\boldsymbol{\mathrm{f}}^{\top}(x_{i}),

(11)

where $\boldsymbol{\theta}=(p_{0},p_{1},p_{2})^{\top}$ , $\lambda(x)=1/(\pi(x)(1-\pi(x)))$ with $\pi(x)=p_{1}-(p_{1}+p_{2}-1)(1-p_{0})^{x}$ , and

\boldsymbol{\mathrm{f}}(x)=\left(\begin{array}{c}x(p_{1}+p_{2}-1)(1-p_{0})^{x-1}\\ 1-(1-p_{0})^{x}\\ -(1-p_{0})^{x}\end{array}\right).

See the details of the group testing model and information matrix in Huang et al., (2017). We consider D-optimality and c-optimality with vector ${\bf c}_{1}=(1,0,0)^{\top}$ . We minimize $\phi\{\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*})\}={\bf c}_{1}^{\top}\boldsymbol{\mathrm{I}}^{-1}(\xi,\boldsymbol{\theta}^{*}){\bf c}_{1}$ to obtain c-optimal designs. Representative results are given in Table 2 for $\boldsymbol{\theta}^{*}=(0.07,0.93,0.96)^{\top}$ and $S_{N}=S=\{1,2,3,\ldots,61\}$ . Huang et al., (2017) found that the D-OAD has three support points, 1, 16.79 and 61, which are similar to those of $\xi_{\phi,N}^{*}$ in Table 2. However, one of their support points is not an integer and hence is not in the design space. Thus, $\xi_{\phi,N}^{*}$ is more appropriate for this experiment. Since $S$ only includes integers, we modify the annealing algorithm by adding 1 or $-1$ to the selected design point to obtain a new design point, and, as usual, we make sure that all new points are in $S$ . The D-OEDs for various values of $n$ from Algorithm 1 are the same as those obtained by rounding $nw_{i}$ to integers, and they are highly efficient.
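The D-optimality entries of Table 2 can be reproduced from (11) with a short sketch. It assumes that the tabulated loss is $(\det\{\boldsymbol{\mathrm{I}}^{-1}\})^{1/3}$ , the normalisation quoted in Application 3; the function names are hypothetical.

```python
import numpy as np

def info_matrix(points, weights, theta):
    """Information matrix (11) of a group testing design."""
    p0, p1, p2 = theta
    info = np.zeros((3, 3))
    for x, w in zip(points, weights):
        pi = p1 - (p1 + p2 - 1) * (1 - p0) ** x
        lam = 1.0 / (pi * (1 - pi))
        fx = np.array([
            x * (p1 + p2 - 1) * (1 - p0) ** (x - 1),
            1 - (1 - p0) ** x,
            -(1 - p0) ** x,
        ])
        info += w * lam * np.outer(fx, fx)
    return info

def d_loss(info):
    # Assumed normalisation: (det I^{-1})^{1/q} with q = 3 parameters
    return np.linalg.det(info) ** (-1.0 / info.shape[0])

theta = (0.07, 0.93, 0.96)
support = [1, 17, 61]
approx = d_loss(info_matrix(support, [1 / 3, 1 / 3, 1 / 3], theta))
exact10 = d_loss(info_matrix(support, [0.4, 0.3, 0.3], theta))   # n = 10
print(round(approx, 4), round(exact10, 4), round(approx / exact10, 4))
```

Under these assumptions the three printed values are approximately 0.1448, 0.1462 and 0.9906, in line with the D-optimality rows of Table 2. Because the design has as many support points as parameters, the determinant factorises into a term depending only on the support points times the product of the weights, which is why every permutation of the counts (4,3,3) gives the same loss.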

For c-optimal designs, $\xi_{\phi,N}^{*}$ in Table 2 is also more appropriate for this experiment, since the c-optimal design in Huang et al., (2017) also includes a non-integer support point (15.68). It is interesting to notice that the c-OEDs do not have the same support points as $\xi_{\phi,N}^{*}$ , yet their design efficiencies are very high. One of the support points in $\xi_{\phi,N}^{*}$ is $16$ with a large weight 0.6279, but some of the exact designs include a support point $15$ or $17$ and do not include $16$ as a support point. See the results for $n=10$ , $11$ and $14$ . Figure 1 gives a plot of loss function value versus iteration number in the annealing algorithm for finding the c-OED with $n=12$ , where the y-axis is in $\log_{10}$ scale. Note that it is easy to find OADs and OEDs using Algorithm 1 for any $\boldsymbol{\theta}^{*}$ value, any integer design space, and any optimality criterion.

Table 2: c- and D-OADs $\xi_{\phi,N}^{*}$ , and c- and D-OEDs $\xi_{n,\phi}^{*}$ , in Application 2, where Per( $i,j,k$ ) means any permutation of $i,j,k$ .

D-optimality	support points	weights/observations	${\phi}(\boldsymbol{\mathrm{I}}^{-1})$	$\widetilde{\operatorname{\mathbb{E}ff}}(\xi_{n,\phi}^{*})$
approximate $\xi_{\phi,N}^{*}$	1, 17, 61	1/3, 1/3, 1/3	0.1448	1.0000
$n=10$	1, 17, 61	Per(4,3,3)	0.1462	0.9906
$n=11$	1, 17, 61	Per(4,4,3)	0.1461	0.9912
$n=12$	1, 17, 61	4, 4, 4	0.1448	1.0000
$n=13$	1, 17, 61	Per(5,4,4)	0.1457	0.9944
$n=14$	1, 17, 61	Per(5,5,4)	0.1456	0.9946
c-optimality	support points	weights/observations	${\phi}(\boldsymbol{\mathrm{I}}^{-1})$	$\widetilde{\operatorname{\mathbb{E}ff}}(\xi_{n,\phi}^{*})$
approximate $\xi_{\phi,N}^{*}$	1, 16, 61	0.1310, 0.6279, 0.2411	0.0354	1.0000
$n=10$	1, 17, 61	1, 6, 3	0.0361	0.9799
$n=11$	1, 17, 61	1, 7, 3	0.0361	0.9808
$n=12$	1, 15, 16, 61	2, 4, 3, 3	0.0358	0.9891
$n=13$	1, 15, 16, 61	2, 7, 1, 3	0.0355	0.9968
$n=14$	1, 15, 61	2, 9, 3	0.0355	0.9970

Application 3.

(Seven-dimensional design space) In real-world applications, experiments often involve many design variables, and efficient designs can help save resources and prevent wasted time. Consider a logistic model with seven design variables, whose information matrix is similar to that in (10) with $\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}})=(1,x_{1},x_{2},x_{3},x_{4},x_{5},x_{6},x_{7})^{\top}$ . Xu et al., (2019) proposed an innovative algorithm using differential evolution for finding D-OADs with several design variables, and used this model as an example. When the design space is $S=[-1,1]^{7}$ , the D-OAD in Xu et al., (2019, Table 4) with

\boldsymbol{\theta}^{*}=(-0.4926,-0.6280,-0.3283,0.4378,0.5283,-0.6120,-0.6837,-0.2061)^{\top}

has 48 support points. The corresponding weights for the 48 support points are 0.0230, 0.0160, 0.0255, …, 0.0269. However, it is challenging to implement this D-OAD in practice. For instance, given a run size, say $n=30$ , it is unclear how to construct the exact design. We cannot have all 48 support points in the exact design. In addition, it is not easy to round $nw_{i}$ , as $nw_{i}<1$ for many of the support points in the D-OAD.

Algorithm 1 can be used to construct D-OADs and D-OEDs for various design spaces and $n$ . Representative results are given in Tables 3 and 4. For each design variable, we take 4 equally spaced points on $[-1,1]$ , i.e., $-1,-1/3,1/3,1$ , and construct $S_{N}$ as the Cartesian product of these points over the 7 variables, which gives $N=4^{7}=16,384$ . Table 3 presents the D-OAD $\xi_{\phi,N}^{*}$ obtained via CVX. It has 29 support points, far fewer than the 48 support points of the design in Xu et al., (2019). In addition, $\xi_{\phi,N}^{*}$ has a loss function value $\left(\det\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi,N}^{*},\boldsymbol{\theta}^{*})\right\}\right)^{1/q}=4.9485$ , which is smaller than $4.9573$ , the loss function value of the approximate design in Xu et al., (2019). Thus, the CVX solver finds a better D-OAD with a smaller loss function value and fewer support points. Table 4 gives an exact design $\xi_{n,\phi}^{*}$ with $n=30$ , which has 22 support points. Some of the support points are not at the corners of the hypercube $[-1,1]^{7}$ , since the annealing algorithm allows small changes of design points in $S$ . This $\xi_{n,\phi}^{*}$ has a loss function value $\left(\det\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{n,\phi}^{*},\boldsymbol{\theta}^{*})\right\}\right)^{1/q}=5.1231$ , which yields an efficiency $\widetilde{\operatorname{\mathbb{E}ff}}(\xi_{n,\phi}^{*})=0.9659$ .
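The grid size and the efficiency quoted above follow from simple arithmetic, sketched here. The variable names are illustrative; the ratio against the loss value of Xu et al., (2019) is computed here from the two reported loss values and is not a figure reported in the paper.

```python
from itertools import product

levels = [-1.0, -1 / 3, 1 / 3, 1.0]          # 4 equally spaced points on [-1, 1]
grid = list(product(levels, repeat=7))       # candidate set S_N for 7 variables
print(len(grid))                             # 16384

loss_oad = 4.9485    # D-OAD via CVX (Table 3)
loss_oed = 5.1231    # D-OED with n = 30 (Table 4)
loss_xu = 4.9573     # approximate design of Xu et al. (2019)

print(round(loss_oad / loss_oed, 4))         # 0.9659, modified efficiency of the OED
print(round(loss_oad / loss_xu, 4))          # 0.9982, derived ratio of the two OAD losses
```

The candidate set grows as $4^{7}$ , which explains why only 4 levels per variable are used: a finer grid in seven dimensions quickly becomes too large for the convex solver.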

Table 3: D-OAD $\xi_{\phi,N}^{*}$ via CVX with $N=4^{7}$ .

Support point	$x_{1}$	$x_{2}$	$x_{3}$	$x_{4}$	$x_{5}$	$x_{6}$	$x_{7}$	weight
1	-1	-1	-1	-1	-1	1	1	0.0627
2	-1	-1	-1	1	-1	1	-1	0.0732
3	-1	-1	-1	1	1	-1	1	0.0487
4	-1	-1	1	-1	1	-1	-1	0.0499
5	-1	-1	1	-1	1	1	-1	0.0460
6	-1	1	-1	-1	-1	-1	-1	0.0088
7	-1	1	-1	-1	1	-1	-1	0.0561
8	-1	1	-1	1	-1	1	1	0.0212
9	-1	1	-1	1	1	-1	-1	0.0226
10	-1	1	1	-1	-1	-1	1	0.0840
11	-1	1	1	1	-1	1	-1	0.0306
12	-1	1	1	1	1	-1	1	0.0023
13	-1	1	1	1	1	1	1	0.0730
14	1	-1	-1	-1	-1	-1	1	0.0135
15	1	-1	-1	1	-1	-1	1	0.0217
16	1	-1	-1	1	1	-1	1	0.0415
17	1	-1	1	-1	-1	-1	1	0.0409
18	1	-1	1	-1	-1	1	-1	0.0375
19	1	-1	1	-1	1	-1	-1	0.0073
20	1	-1	1	1	-1	1	1	0.0255
21	1	-1	1	1	1	-1	-1	0.0489
22	1	-1	1	1	1	1	-1	0.0100
23	1	1	-1	-1	-1	-1	-1	0.0491
24	1	1	-1	1	-1	-1	-1	0.0404
25	1	1	1	-1	-1	-1	1	0.0042
26	1	1	1	1	-1	-1	1	0.0058
27	1	1	1	1	-1	1	-1	0.0420
28	1	1	1	1	-1	1	1	0.0033
29	1	1	1	1	1	-1	1	0.0295

Table 4: A D-OED $\xi_{n,\phi}^{*}$ with $n=30$ , where $n_{i}$ denotes the number of observations at each support point.

Support point	$x_{1}$	$x_{2}$	$x_{3}$	$x_{4}$	$x_{5}$	$x_{6}$	$x_{7}$	$n_{i}$
1	-1	-1	-1	-1	-1	1	1	1
2	-1	-1	-1	1	-1	1	-1	2
3	-1	-1	-1	1	-1	1	-0.9999	1
4	-1	-1	-1	1	1	-1	1	2
5	-1	-1	1	-1	1	-1	-1	2
6	-1	-1	1	-1	1	1	-1	2
7	-1	1	-1	-1	-1	-1	-1	1
8	-1	1	-1	-1	1	-1	-1	2
9	-1	1	-1	1	-1	1	1	1
10	-1	1	-1	1	1	-1	-1	1
11	-1	1	1	-1	-1	-1	1	3
12	-1	1	1	1	-1	1	-1	1
13	-1	1	1	1	1	1	1	2
14	-0.9996	1	1	1	1	-1	1	1
15	-0.9995	-1	-1	-1	-1	0.9995	1	2
16	1	-1	-1	1	1	-1	1	1
17	1	-1	1	-1	-1	-1	1	1
18	1	-1	1	-1	-1	1	-1	1
19	1	-1	1	1	1	-1	-1	1
20	1	0.9998	-1	-1	-1	-1	-1	1
21	1	1	-1	1	-1	-1	-1	1
22	1	1	1	1	-1	1	-0.9978	1

Application 4.

(Maximin design for competing criteria in dose finding study) In a dose-finding study, the aim is to find a model that characterizes the dose-response relationship effectively (Dette et al., 2008). However, the decision-making process is complex and involves external considerations, such as efficacy and ethics tradeoffs. Often, more than one response model is necessary, and how to maximize the efficiency of the design across the models becomes a practical question. Here, we consider maximin optimality criteria for multiple objectives and construct maximin exact designs. We use one application with four response models in Wong and Zhou, (2023), which is also studied in Bretz et al., (2010). The four models are one linear response model, two Emax models with different true parameter values, and one logistic model. Let $\boldsymbol{\mathrm{I}}_{i}(\xi,\boldsymbol{\theta}_{i}^{*})$ , $i=1,\ldots,4$ , be the information matrices for the four models, respectively, where $\boldsymbol{\theta}_{i}^{*}$ is the true parameter for model $i$ , the same as in Wong and Zhou, (2023).

We define maximin optimal designs in the same way as in Section 2.3. If $\phi_{i}(\xi)=\operatorname{tr}\left\{\boldsymbol{\mathrm{I}}_{i}^{-1}(\xi,\boldsymbol{\theta}_{i}^{*})\right\}$ for $i=1,\ldots,4$ , we call the resulting design a maximin A-optimal design (maximin A-OAD/A-OED). If $\phi_{i}(\xi)=\det\left\{\boldsymbol{\mathrm{I}}_{i}^{-1}(\xi,\boldsymbol{\theta}_{i}^{*})\right\}$ for $i=1,\ldots,4$ , we call it a maximin D-optimal design (maximin D-OAD/D-OED).

Wong and Zhou, (2023) discussed an algorithm to compute maximin OADs on $S_{N}$ , and the main idea is to transform the maximin problem into a convex optimization problem and use CVX to find solutions. Applying the algorithm in Wong and Zhou, (2023), we can easily obtain maximin A- and D-OADs $\xi_{\phi,N}^{*}$ via CVX in Step 1 of Algorithm 1. Then we can compute maximin A- and D-OEDs for various values of $n$ . Representative results are plotted in Figures 5 and 6, where $S=[0,500]$ , $N=201$ , and $n=10$ , 20, and 30 are used. In Figure 5, the distribution functions of maximin A-OADs and maximin A-OEDs are plotted. From Figure 5(d), the maximin A-OAD and maximin A-OED with $n=30$ are very similar. The maximin A-OAD has $\min\{\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{1}}(\xi),\ldots,\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{4}}(\xi)\}=0.7155$ , and the maximin A-OEDs have $\min\{\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{1}}(\xi),\ldots,\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{4}}(\xi)\}=0.6813$ , $0.6983$ and $0.7121$ for $n=10$ , $20$ and $30$ , respectively.

In Figure 6, the distribution functions of the maximin D-OAD and D-OEDs are plotted. The maximin D-OAD has 5 support points, while the exact designs have more than 5 support points. The maximin D-OAD has $\min\{\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{1}}(\xi),\ldots,\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{4}}(\xi)\}=0.8538$ , and the maximin D-OEDs have $\min\{\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{1}}(\xi),\ldots,\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{4}}(\xi)\}=$ $0.8371$ , $0.8420$ and $0.8459$ for $n=10$ , 20 and 30, respectively. From our numerical results, $\min\{\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{1}}(\xi),\ldots,\widetilde{\operatorname{\mathbb{E}ff}}_{\phi_{4}}(\xi)\}$ of the maximin D-OED or A-OED increases as $n$ increases.

The gap between the distribution functions of the approximate and exact designs reflects the differences between their weights and their slightly different support points. For an exact design with $n$ runs, the weights have to be multiples of $1/n$ . There are usually more support points in exact designs than in approximate designs. Table 5 presents a maximin A-OED with $n=20$ , which has 11 support points. Algorithm 1 is flexible enough to find highly efficient maximin OEDs for various values of $n$ .

Table 5: A maximin A-OED with $n=20$ , where $n_{i}$ denotes the number of observations at each support point.

support point	0	23.07	28.47	36.48	45.77	80.95	83.54	91.30	173.32	217.51	500
$n_{i}$	8	1	1	1	1	1	1	1	1	1	3
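The distribution function of an exact design, as plotted in Figures 5 and 6, is a step function with jumps $n_{i}/n$ at the support points. A minimal sketch using the entries of Table 5 (variable names are illustrative):

```python
# Support points and run counts copied from Table 5
support = [0, 23.07, 28.47, 36.48, 45.77, 80.95, 83.54, 91.30, 173.32, 217.51, 500]
counts = [8, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3]

n = sum(counts)
cdf, running = [], 0
for c in counts:
    running += c
    cdf.append(running / n)      # every value is a multiple of 1/n

for x, c, F in zip(support, counts, cdf):
    print(x, c, F)
```

The counts sum to $n=20$ over 11 support points, and the design places 8 of the 20 runs at dose 0 and 3 at dose 500.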

5 Conclusion

Constructing an OED is challenging, as it involves solving an integer programming problem, which is NP-complete. The conventional methods for obtaining the exact design either involve deriving analytical solutions for low-dimensional problems or rounding approximate designs. However, closed-form solutions usually do not exist for designs involving more than two variables, and rounding may not yield exact designs with high efficiency. We have developed a general algorithm to search for OADs and highly efficient OEDs. This algorithm is applicable to any criterion with a convex loss function, any design space, and any sample size $n$ , although it is particularly useful when $n$ is small or moderate. For very large $n$ , a rounding method still performs well in obtaining highly efficient OEDs. Notably, our algorithm also computes the OADs concurrently with the OEDs. While there are other numerical algorithms for finding exact designs, they do not compute approximate designs, so it is difficult to assess the efficiency of the resulting exact designs; some of these numerical methods are cited in Section 1. By computing the OADs, we can assess the efficiency of the OEDs relative to the OADs through the modified design efficiency measure. Our method offers a new and alternative approach to finding highly efficient exact designs.

We have provided four applications to demonstrate the effectiveness of our algorithm, which include single- and multiple-objective optimality criteria, discrete and high-dimensional design spaces, and linear models, nonlinear models and GLMs. Our proposed algorithm can find highly efficient exact designs for small $n$ quickly, which is very useful for practical applications. A full implementation of the proposed algorithm is available online for practitioners to use in other real-world applications.

Appendix: Proofs and derivations

Proof of Theorem 1: Write the information matrices of design $\xi(\boldsymbol{\mathrm{x}})$ in (2) and (5) using a general form as

\displaystyle\boldsymbol{\mathrm{I}}(\xi,\boldsymbol{\theta}^{*})=\sum_{i=1}^{k}w_{i}I({\bf v}_{i},\boldsymbol{\theta}^{*}),

(12)

where $I(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})=\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})\boldsymbol{\mathrm{f}}^{\top}(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$ in (2), or $\lambda(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})\boldsymbol{\mathrm{f}}(\boldsymbol{\mathrm{x}})\boldsymbol{\mathrm{f}}^{\top}(\boldsymbol{\mathrm{x}})$ in (5). Suppose the OAD $\xi_{\phi}^{*}$ has support points $\boldsymbol{\mathrm{x}}_{1}^{*},\ldots,\boldsymbol{\mathrm{x}}_{m}^{*}\in S$ with corresponding weights, $w_{1}^{*},\ldots,w_{m}^{*}$ , respectively.

For part (i), we want to show that $\lim_{n\to\infty}\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})=1$ .

For each $n$ , compute $a_{i}=n\cdot w_{i}^{*}$ for $i=1,\ldots,m$ . For large $n$ such that $a_{i}\geq 1$ for all $i=1,\ldots,m$ , we construct an exact design $\xi^{(n)}$ as follows. The support points of $\xi^{(n)}$ are the same as those of $\xi_{\phi}^{*}$ . We choose integer $n_{i}$ to be either the floor or the ceiling of $a_{i}$ and satisfy $\sum_{i=1}^{m}n_{i}=n$ . Since $a_{i}\geq 1$ for all $i=1,\ldots,m$ , it is clear that $n_{i}\geq 1$ and $|n_{i}-a_{i}|\leq 1$ for all $i=1,\ldots,m$ . Define the weights of exact design $\xi^{(n)}$ to be $n_{1}/n,\ldots,n_{m}/n$ for support points $\boldsymbol{\mathrm{x}}_{1}^{*},\ldots,\boldsymbol{\mathrm{x}}_{m}^{*}$ , respectively.

Let $\delta_{i}=a_{i}-n_{i}$ for $i=1,\ldots,m$ . Then $w_{i}^{*}-n_{i}/n=w_{i}^{*}-a_{i}/n+\delta_{i}/n=\delta_{i}/n$ . Note that $|\delta_{i}|\leq 1$ . Evaluate the information matrix of the exact design $\xi^{(n)}$ using (12),

\begin{aligned}\boldsymbol{\mathrm{I}}(\xi^{(n)},\boldsymbol{\theta}^{*})&=\sum_{i=1}^{m}\frac{n_{i}}{n}~I(\boldsymbol{\mathrm{x}}_{i}^{*},\boldsymbol{\theta}^{*})\\ &=\sum_{i=1}^{m}\left(w_{i}^{*}-\delta_{i}/n\right)I(\boldsymbol{\mathrm{x}}_{i}^{*},\boldsymbol{\theta}^{*})\\ &=\boldsymbol{\mathrm{I}}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})-\frac{1}{n}\sum_{i=1}^{m}\delta_{i}I(\boldsymbol{\mathrm{x}}_{i}^{*},\boldsymbol{\theta}^{*})\\ &=\boldsymbol{\mathrm{I}}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})-\frac{1}{n}{\bf B},\end{aligned}

where ${\bf B}=\sum_{i=1}^{m}\delta_{i}I(\boldsymbol{\mathrm{x}}_{i}^{*},\boldsymbol{\theta}^{*})$ . Since all entries of $I(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$ are continuous functions of $\boldsymbol{\mathrm{x}}$ , $S$ is a bounded region, $m$ is a fixed number, and $|\delta_{i}|\leq 1$ , all entries of ${\bf B}$ are bounded. Thus, $\frac{1}{n}{\bf B}\to{\bf 0}$ (a zero matrix), as $n\to\infty$ . This implies that $\boldsymbol{\mathrm{I}}(\xi^{(n)},\boldsymbol{\theta}^{*})\to\boldsymbol{\mathrm{I}}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})$ as $n\to\infty$ , which leads to $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi^{(n)},\boldsymbol{\theta}^{*})\right\}\to\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}$ for commonly used optimality criteria.
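The boundedness of ${\bf B}$ can be made explicit as follows. Let $C=\max_{1\leq i\leq m}\max_{j,k}|I_{jk}(\boldsymbol{\mathrm{x}}_{i}^{*},\boldsymbol{\theta}^{*})|$ , a constant introduced here for illustration, which is finite because the support points of $\xi_{\phi}^{*}$ are fixed and finite in number. Then, for every entry $(j,k)$ ,

\left|\frac{1}{n}B_{jk}\right|\leq\frac{1}{n}\sum_{i=1}^{m}|\delta_{i}|\,\left|I_{jk}(\boldsymbol{\mathrm{x}}_{i}^{*},\boldsymbol{\theta}^{*})\right|\leq\frac{mC}{n}\to 0\quad\mbox{as}~n\to\infty,

so the information matrix of the rounded design $\xi^{(n)}$ differs from that of $\xi_{\phi}^{*}$ by at most $mC/n$ in each entry.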

Notice that $\xi^{(n)}$ is an exact design with $n$ points, but it is not necessarily optimal. It is clear that for any $n$

\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi^{(n)},\boldsymbol{\theta}^{*})\right\}\geq\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{n,\phi}^{*},\boldsymbol{\theta}^{*})\right\}\geq\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}.

From the above analysis, $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi^{(n)},\boldsymbol{\theta}^{*})\right\}\to\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}$ as $n\to\infty$ . Since $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{n,\phi}^{*},\boldsymbol{\theta}^{*})\right\}$ is squeezed between these two quantities, we must have $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{n,\phi}^{*},\boldsymbol{\theta}^{*})\right\}\to\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}$ as $n\to\infty$ , which gives $\lim_{n\to\infty}\operatorname{\mathbb{E}ff}_{\phi}(\xi_{n,\phi}^{*})=1$ .

For part (ii), we want to show that $\lim_{N\to\infty}\operatorname{\mathbb{E}ff}_{\phi}(\xi_{\phi,N}^{*})=1$ .

For each $S_{N}$ , we construct an approximate design $\xi_{N}^{(0)}$ with $m$ support points, which are selected as follows. By the construction of $S_{N}$ , we can find a sequence of points, ${\bf u}_{i,N}\in S_{N}$ , $i=1,\ldots,m$ , such that $\lim_{N\to\infty}{\bf u}_{i,N}=\boldsymbol{\mathrm{x}}_{i}^{*}$ , for all $i=1,\ldots,m$ . Then we choose ${\bf u}_{i,N}\in S_{N}$ , $i=1,\ldots,m$ , as the support points of $\xi_{N}^{(0)}$ , and their weights are $w_{1}^{*},\ldots,w_{m}^{*}$ (from $\xi_{\phi}^{*}$ ), respectively. The information matrix of the approximate design $\xi_{N}^{(0)}$ using (12) is,

$\displaystyle\boldsymbol{\mathrm{I}}(\xi_{N}^{(0)},\boldsymbol{\theta}^{*})$	$\displaystyle=$	$\displaystyle\sum_{i=1}^{m}w_{i}^{*}~I({\bf u}_{i,N},\boldsymbol{\theta}^{*})$
	$\displaystyle\to$	$\displaystyle\sum_{i=1}^{m}w_{i}^{*}~I(\boldsymbol{\mathrm{x}}_{i}^{*},\boldsymbol{\theta}^{*}),~\mbox{as}~N\to\infty$
	$\displaystyle=$	$\displaystyle\boldsymbol{\mathrm{I}}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*}),$

since all entries of $I(\boldsymbol{\mathrm{x}},\boldsymbol{\theta}^{*})$ are continuous functions of $\boldsymbol{\mathrm{x}}$. Similar to the proof of part (i), it is clear that $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{N}^{(0)},\boldsymbol{\theta}^{*})\right\}\to\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}$ as $N\to\infty$. Since $\xi_{N}^{(0)}$ is a design on $S_{N}$ and $\xi_{\phi,N}^{*}$ is an OAD on $S_{N}$, it follows that

\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{N}^{(0)},\boldsymbol{\theta}^{*})\right\}\geq\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi,N}^{*},\boldsymbol{\theta}^{*})\right\}\geq\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}.

Thus, $\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi,N}^{*},\boldsymbol{\theta}^{*})\right\}\to\phi\left\{\boldsymbol{\mathrm{I}}^{-1}(\xi_{\phi}^{*},\boldsymbol{\theta}^{*})\right\}$ as $N\to\infty$, which implies that $\lim_{N\to\infty}\operatorname{\mathbb{E}ff}_{\phi}(\xi_{\phi,N}^{*})=1$. $\Box$
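The construction of $\xi_{N}^{(0)}$ in part (ii) can also be illustrated numerically. The Python sketch below is an illustration under stated assumptions and not the authors' implementation. It assumes the cubic regression model on $S=[-1,1]$, whose D-optimal approximate design places weight $1/4$ on each of $\pm 1$ and $\pm 1/\sqrt{5}$, a discretized design space $S_{N}$ of $N$ equally spaced points, and the choice of ${\bf u}_{i,N}$ as the grid point nearest to $\boldsymbol{\mathrm{x}}_{i}^{*}$. The helper names `info_matrix` and `nearest_grid_points` are hypothetical.

```python
import numpy as np

def info_matrix(points, weights):
    """Information matrix sum_i w_i f(x_i) f(x_i)^T with f(x) = (1, x, x^2, x^3)."""
    F = np.vander(np.asarray(points, dtype=float), 4, increasing=True)
    return F.T @ (np.asarray(weights, dtype=float)[:, None] * F)

def nearest_grid_points(grid, targets):
    """For each target point, return the closest point of the grid."""
    grid = np.asarray(grid, dtype=float)
    return np.array([grid[np.argmin(np.abs(grid - t))] for t in targets])

# Assumed OAD on the continuous space: D-optimal design for cubic regression.
x_star = np.array([-1.0, -1 / np.sqrt(5), 1 / np.sqrt(5), 1.0])
w_star = np.full(4, 0.25)
det_star = np.linalg.det(info_matrix(x_star, w_star))

effs = []
for N in (5, 11, 101, 1001):
    S_N = np.linspace(-1.0, 1.0, N)         # discretized design space
    u = nearest_grid_points(S_N, x_star)    # support points u_{i,N} of xi_N^(0)
    # D-efficiency of xi_N^(0) relative to the OAD, with q = 4 parameters.
    eff = (np.linalg.det(info_matrix(u, w_star)) / det_star) ** 0.25
    effs.append(eff)
    print(N, u, round(eff, 6))
```

By the displayed inequality, the efficiency of $\xi_{N}^{(0)}$ is a lower bound for that of the OAD on $S_{N}$, so the values approaching one as the grid is refined agree with part (ii). The grid must be fine enough for the nearest grid points to be distinct, which is why the sketch starts at $N=5$.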

Acknowledgements

This research was partially supported by Discovery Grants from the Natural Sciences and Engineering Research Council of Canada. The first author is partially supported by the CANSSI Distinguished Postdoctoral Fellowship from the Canadian Statistical Sciences Institute.

Declarations

The authors have no conflicts of interest to declare.

References

Berger and Wong, (2009) Berger, M. P. and Wong, W.-K. (2009). An Introduction to Optimal Designs for Social and Biomedical Research. Wiley, New York.
Bose and Mukerjee, (2015) Bose, M. and Mukerjee, R. (2015). Optimal design measures under asymmetric errors, with application to binary design points. Journal of Statistical Planning and Inference, 159:28–36.
Boyd and Vandenberghe, (2004) Boyd, S. P. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press, New York.
Bretz et al., (2010) Bretz, F., Dette, H., and Pinheiro, J. C. (2010). Practical considerations for optimal designs in clinical dose finding studies. Statistics in Medicine, 29:731–742.
Broudiscou et al., (1996) Broudiscou, A., Leardi, R., and Phan-Tan-Luu, R. (1996). Genetic algorithm as a tool for selection of D-optimal design. Chemometrics and Intelligent Laboratory Systems, 35:105–116.
Chen et al., (2015) Chen, R.-B., Chang, S.-P., Wang, W., Tung, H.-C., and Wong, W. K. (2015). Minimax optimal designs via particle swarm optimization methods. Statistics and Computing, 25:975–988.
Dean et al., (2015) Dean, A. M., Morris, M., Stufken, J., and Bingham, D. (2015). Handbook of Design and Analysis of Experiments. CRC Press, Boca Raton.
Dette et al., (2008) Dette, H., Bretz, F., Pepelyshev, A., and Pinheiro, J. (2008). Optimal designs for dose-finding studies. Journal of the American Statistical Association, 103:1225–1237.
Duan et al., (2022) Duan, J., Gao, W., Ma, Y., and Ng, H. K. T. (2022). Efficient computational algorithms for approximate optimal designs. Journal of Statistical Computation and Simulation, 92:764–793.
Duarte et al., (2020) Duarte, B. P., Granjo, J. F., and Wong, W. K. (2020). Optimal exact designs of experiments via mixed integer nonlinear programming. Statistics and Computing, 30:93–112.
Duarte and Wong, (2014) Duarte, B. P. and Wong, W. K. (2014). A semi-infinite programming based algorithm for finding minimax optimal designs for nonlinear models. Statistics and Computing, 24:1063–1080.
Duarte and Wong, (2015) Duarte, B. P. and Wong, W. K. (2015). Finding Bayesian optimal designs for nonlinear models: a semidefinite programming-based approach. International Statistical Review, 83:239–262.
Duarte et al., (2015) Duarte, B. P., Wong, W. K., and Atkinson, A. C. (2015). A semi-infinite programming based algorithm for determining T-optimum designs for model discrimination. Journal of Multivariate Analysis, 135:11–24.
Fedorov, (1972) Fedorov, V. V. (1972). Theory of Optimal Experiments. Academic Press, New York.
Gao et al., (2024) Gao, L. L., Ye, J. J., Zeng, S., and Zhou, J. (2024+). Necessary and sufficient conditions for multiple objective optimal regression designs. Statistica Sinica, to appear.
Gao and Zhou, (2017) Gao, L. L. and Zhou, J. (2017). D-optimal designs based on the second-order least squares estimator. Statistical Papers, 58:77–94.
Grant and Boyd, (2020) Grant, M. C. and Boyd, S. P. (2020). The CVX users’ guide, release 2.2. https://cvxr.com/cvx/doc/CVX.pdf. Online; accessed 02 April 2024.
Haines et al., (2018) Haines, L. M., Kabera, G. M., and Ndlovu, P. (2018). D-optimal designs for the two-variable binary logistic regression model with interaction. Journal of Statistical Planning and Inference, 193:136–150.
Hamada et al., (2001) Hamada, M., Martz, H., Reese, C., and Wilson, A. (2001). Finding near-optimal Bayesian experimental designs via genetic algorithms. The American Statistician, 55:175–181.
Huang et al., (2017) Huang, S.-H., Huang, M.-N. L., Shedden, K., and Wong, W. K. (2017). Optimal group testing designs for estimating prevalence with uncertain testing errors. Journal of the Royal Statistical Society, Series B, 79:1547–1563.
Hughes-Oliver and Rosenberger, (2000) Hughes-Oliver, J. M. and Rosenberger, W. F. (2000). Efficient estimation of the prevalence of multiple rare traits. Biometrika, 87:315–327.
Hughes-Oliver and Swallow, (1994) Hughes-Oliver, J. M. and Swallow, W. H. (1994). A two-stage adaptive group-testing procedure for estimating small proportions. Journal of the American Statistical Association, 89:982–993.
John and Draper, (1975) John, R. S. and Draper, N. R. (1975). D-optimality for regression designs: a review. Technometrics, 17:15–23.
Kiefer, (1959) Kiefer, J. (1959). Optimum experimental designs. Journal of the Royal Statistical Society, Series B, 21:272–304.
Kiefer, (1974) Kiefer, J. (1974). General equivalence theory for optimum designs (approximate theory). The Annals of Statistics, 2:849–879.
López-Fidalgo, (2023) López-Fidalgo, J. (2023). Optimal Experimental Design: A Concise Introduction for Researchers. Springer, New York.
Mandal et al., (2015) Mandal, A., Wong, W. K., and Yu, Y. (2015). Algorithmic searches for optimal designs. In Dean, A. M., Morris, M., Stufken, J., and Bingham, D., editors, Handbook of Design and Analysis of Experiments, chapter 21, pages 755–783. CRC Press, Boca Raton.
Meyer and Nachtsheim, (1988) Meyer, R. K. and Nachtsheim, C. J. (1988). Constructing exact D-optimal experimental designs by simulated annealing. American Journal of Mathematical and Management Sciences, 8:329–359.
Meyer and Nachtsheim, (1995) Meyer, R. K. and Nachtsheim, C. J. (1995). The coordinate-exchange algorithm for constructing exact optimal experimental designs. Technometrics, 37:60–69.
Mukerjee and Huda, (2016) Mukerjee, R. and Huda, S. (2016). Approximate theory-aided robust efficient factorial fractions under baseline parametrization. Annals of the Institute of Statistical Mathematics, 68:787–803.
Palhazi Cuervo et al., (2016) Palhazi Cuervo, D., Goos, P., and Sörensen, K. (2016). Optimal design of large-scale screening experiments: a critical look at the coordinate-exchange algorithm. Statistics and Computing, 26:15–28.
Papp, (2012) Papp, D. (2012). Optimal designs for rational function regression. Journal of the American Statistical Association, 107:400–411.
Pukelsheim, (1993) Pukelsheim, F. (1993). Optimal Design of Experiments. Wiley, New York.
Pukelsheim and Rieder, (1992) Pukelsheim, F. and Rieder, S. (1992). Efficient rounding of approximate designs. Biometrika, 79:763–770.
Rempel and Zhou, (2014) Rempel, M. F. and Zhou, J. (2014). On exact K-optimal designs minimizing the condition number. Communications in Statistics-Theory and Methods, 43:1114–1131.
Smucker et al., (2012) Smucker, B. J., del Castillo, E., and Rosenberger, J. L. (2012). Model-robust two-level designs using coordinate exchange algorithms and a maximin criterion. Technometrics, 54:367–375.
Wilmut and Zhou, (2011) Wilmut, M. and Zhou, J. (2011). D-optimal minimax design criterion for two-level fractional factorial designs. Journal of Statistical Planning and Inference, 141:576–587.
Wong and Zhou, (2019) Wong, W. K. and Zhou, J. (2019). CVX-based algorithms for constructing various optimal regression designs. Canadian Journal of Statistics, 47:374–391.
Wong and Zhou, (2023) Wong, W. K. and Zhou, J. (2023). Using CVX to construct optimal designs for biomedical studies with multiple objectives. Journal of Computational and Graphical Statistics, 32:744–753.
Xu et al., (2019) Xu, W., Wong, W. K., Tan, K. C., and Xu, J.-X. (2019). Finding high-dimensional D-optimal designs for logistic models via differential evolution. IEEE Access, 7:7133–7146.
Yang et al., (2013) Yang, M., Biedermann, S., and Tang, E. (2013). On optimal designs for nonlinear models: a general and efficient algorithm. Journal of the American Statistical Association, 108:1411–1420.
Ye et al., (2017) Ye, J. J., Zhou, J., and Zhou, W. (2017). Computing A-optimal and E-optimal designs for regression models via semidefinite programming. Communications in Statistics-Simulation and Computation, 46:2011–2024.
Yu, (2011) Yu, Y. (2011). D-optimal designs via a cocktail algorithm. Statistics and Computing, 21:475–481.
Zhang and Mukerjee, (2013) Zhang, R. and Mukerjee, R. (2013). Highly efficient factorial designs for cDNA microarray experiments: use of approximate theory together with a step-up step-down procedure. Statistical Applications in Genetics and Molecular Biology, 12:489–503.