Efficient computational algorithms for approximate optimal designs
Jiangtao Duan (a), Wei Gao (a), Yanyuan Ma (b) and Hon Keung Tony Ng (c)
(a) Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun, Jilin 130024, China; (b) Department of Statistics, The Pennsylvania State University, University Park; (c) Department of Statistical Science, Southern Methodist University, Dallas
ARTICLE HISTORY
Compiled February 26, 2021
ABSTRACT
In this paper, we propose two simple yet efficient computational algorithms to obtain approximate optimal designs for multi-dimensional linear regression on a large variety of design spaces. We focus on the two commonly used optimality criteria, the D- and A-optimality criteria. For D-optimality, we provide an alternative proof of the monotonic convergence for the D-optimal criterion and propose an efficient computational algorithm to obtain the approximate D-optimal design. We further show that the proposed algorithm converges to the D-optimal design, and then prove that the approximate D-optimal design converges to the continuous D-optimal design under certain conditions. For A-optimality, we provide an efficient algorithm to obtain the approximate A-optimal design and conjecture the monotonicity of the proposed algorithm. Numerical comparisons suggest that the proposed algorithms perform well and that they are comparable or superior to some existing algorithms.

KEYWORDS
Approximate experimental design; D-optimal; A-optimal; Regression model
1. Introduction
Optimal designs are a class of experimental designs that are optimal with respect to some statistical criterion, such as minimizing the variance of the best linear unbiased estimators in regression problems or maximizing the amount of information obtained from the experiment. It is desirable to design experiments that provide more information and reduce the uncertainty relating to the goal of the study. In regression problems, we model the responses of a random experiment, denoted as Y_1, ..., Y_N, whose inputs are represented by vectors x_i ∈ X with respect to some known regression functions f(x_1), ..., f(x_N) ∈ R^p, i.e.,

Y(x_i) = f^T(x_i) β + ε,  i = 1, 2, ..., N,

where f(x) ∈ R^p is the vector of covariates (regressors) associated with x, β ∈ R^p is a p-dimensional parameter vector, and ε is the error term with E[ε] = 0 and Var[ε] = σ² > 0. For different independent trials, the errors are
assumed to be uncorrelated and independent, and the inputs x_i (a candidate set of design points) are chosen by the experimenter in the design space X. We assume that the model is non-singular in the sense that {f(x) : x ∈ X} spans R^p. We wish to pick a small subset of the input vectors x_i such that querying the corresponding responses will lead to a good estimator of the model. In this paper, we discuss the computation of approximate optimal designs for regression models with uncorrelated errors (see, for example, Atkinson et al., 2007; Fedorov, 1972; Harman et al., 2020).

Assume that the inputs x̃_i, for i = 1, 2, ..., N, are chosen within a set of distinct points x_1, ..., x_m with integer p ≤ m ≪ N (but in some special cases m can be equal to N; see Setting 5), let n_k denote the number of times the particular point x_k occurs among x̃_1, ..., x̃_N, and let Ñ = n_1 + ··· + n_m denote the number of all candidate experiments. The exact experimental design can be summarized by defining a design ξ_Ñ as

ξ_Ñ = ( x_1 ··· x_m ; n_1/Ñ ··· n_m/Ñ ).   (1)

In the design ξ_Ñ in Eq. (1), the first row gives the points in the design space X where the input parameters have to be taken, and the second row indicates the proportions of the experimental units assigned, or the frequencies of the experiments repeated, at these points. Strictly speaking, an exact experimental design ξ of size Ñ can be characterized by a probability distribution on X in which the probability that ξ_Ñ occurs at x_j is n_j/Ñ. If ξ follows a continuous probability distribution or a discrete probability distribution

ξ = ( x_1 ··· x_m ; w_1 ··· w_m ),   (2)

where Pr(ξ = x_j) = w_j, j = 1, 2, ..., m, and Σ_{j=1}^m w_j = 1, then ξ is a continuous design. The goal of the design of experiment theory is then to pick m out of the given N experiments so as to make the most accurate estimate of the parameter β.

CONTACT Wei Gao. Email: [email protected]
For a review and details related to the determination of optimal experimental designs, readers can refer to Dette & Studden (1997) and the references therein.

Let Ξ be the set of all exact designs or approximate designs (i.e., probability measures) on the design space X. For a given design ξ, we denote the information matrix of ξ for the experimental design by

M(ξ) = Σ_{x ∈ X} ξ(x) f(x) f^T(x)  or  M(ξ) = ∫_X f(x) f^T(x) dξ(x).

Based on this formulation, an approximate D-optimal design for a quadratic polynomial was provided by Chen (2003) when the design space is a circle, and Duan et al. (2019) provided two efficient computational algorithms for optimal continuous experimental designs for linear models. Under these model assumptions, the Fisher information matrix corresponding to β is proportional to the information matrix. Therefore, to obtain the most accurate estimate of certain parameters, we aim to choose the ξ such that M(ξ) is maximized according to some criterion. In the following, we focus on a probability measure on X with support given by the points x_i and weights w_i in Eq. (2).

In order to obtain the optimal design, a general approach is to consider some generally accepted statistical criteria proposed by Kiefer (1974), namely the Φ_q-criteria. The D-optimality and the A-optimality are two of the most commonly used optimality criteria due to their natural statistical interpretations. It has been shown that the computation of some important prediction-based optimality criteria, such as the I-optimality criterion (Cook & Nachtsheim, 1982; Goos et al., 2016) that minimizes the average prediction variance, can be converted into the computation of the A-optimality (Atkinson et al., 2007, Section 10.6). In particular, I-optimal designs on a finite design space can also be computed using the algorithm developed for A-optimality.
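For a discrete design, M(ξ) is simply a weighted sum of rank-one matrices and can be computed in a few lines. The following Python (numpy) sketch uses the quadratic regressor f(x) = (1, x, x²)^T and an equal-weight three-point design on [−1, 1] purely as an illustration; this particular design happens to be the classical D-optimal design for quadratic regression on [−1, 1]:

```python
import numpy as np

def regressor(x):
    """Quadratic regressor f(x) = (1, x, x^2)^T -- an illustrative choice."""
    return np.array([1.0, x, x * x])

def information_matrix(points, weights):
    """M(xi) = sum_i w_i f(x_i) f(x_i)^T for a discrete design."""
    F = np.stack([regressor(x) for x in points])      # N x p regressor matrix
    return F.T @ (np.asarray(weights)[:, None] * F)   # p x p information matrix

# Three-point design on [-1, 1] with equal weights 1/3.
points, weights = [-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3]
M = information_matrix(points, weights)
print(np.linalg.det(M))  # determinant of the information matrix, 4/27
```

The determinant 4/27 is the optimal value of det M(ξ) over all designs for this one-dimensional quadratic model.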
Thus, in this paper, we focus on the D- and A-optimal designs, where the objective functions are Φ_D(M) = det(M)^{-1} and Φ_A(M) = tr(M^{-1}), respectively, for any positive definite matrix M. The result by Welch (1982) about the NP-hardness of D-optimality is only valid for the exact design problem; in this paper, our aim is to develop efficient computational algorithms for searching for the solutions ξ* of the optimization problems min log Φ_D(M(ξ)) and min log Φ_A(M(ξ)) for D-optimality and A-optimality, respectively.

Optimal design is at the heart of statistical planning and inference using linear models (see, for example, Box et al., 1978). The theory of optimal designs and the development of numerical computational algorithms for obtaining optimal designs have long been studied in the literature under different scenarios. For instance, Meyer & Nachtsheim (1995) proposed the coordinate-exchange algorithm to construct D-optimal and linear-optimal exact experimental designs. The algorithm uses a variant of the Gauss-Southwell cyclic coordinate-descent algorithm within the K-exchange algorithm to achieve substantive reductions in required computing. Gao et al. (2014) developed a general class of multiplicative algorithms for continuous designs, which can be used to obtain the optimal allocation for a general regression model subject to the D- and A-optimality criteria. For continuous experimental designs, in general, the continuous factors are generated by the vector f(x) of linearly independent regular functions, with the design points x filling the design space X; the optimal design points are then chosen to maximize the information matrix.

There are many analytical methods for obtaining approximate optimal designs. Kiefer & Wolfowitz (1959) introduced the equivalence principle and proposed, in some cases, algorithms to solve the optimization problem.
Following the early work of Karlin & Studden (1966), the case of polynomial regression on a compact interval of R has been widely studied. The well-known equivalence theorem of Kiefer & Wolfowitz (1959) led to the development of practical algorithms called vertex direction methods (VDMs) for the construction of a D-optimal design (Fedorov, 1972; Wynn, 1970). They also proved the convergence of the resulting sequence to an optimal (in the appropriate sense) design. Silvey et al. (1978) proposed a multiplicative algorithm (MUL) for optimal designs on a finite design space, of which the analog in the mixture setting with finite, fixed support is an EM algorithm (Dempster et al., 1977). The VDM and MUL algorithms are all based on techniques from differentiable optimization: the general idea is to use directional derivatives to find a direction of improvement, and then employ a line search to determine an optimal step length. Yu (2011) proposed the cocktail algorithm, a mixture of the multiplicative, vertex-exchange, and VDM algorithms for D-optimum design; it includes a nearest-neighbor exchange strategy that helps to apportion weights between adjacent points and has the property that poor support points are quickly removed from the total support points. Harman et al. (2020) considered an extension and combination of both the VEM algorithm and the KL-exchange algorithm that is used to compute exact designs (Atkinson et al., 2007) and developed the randomized exchange method (REX) for the optimal design problem. Recent progress in this area has been obtained by employing hybrid methods that alternate between steps of the cocktail algorithm, or by using the randomized exchange method. Following the work of Gao et al. (2014), Duan et al.
(2019) proposed an efficient computational algorithm for computing continuous optimal experimental designs for linear models.

In this paper, we aim to propose a computational algorithm to obtain approximate D-optimal designs and a computational algorithm to obtain approximate A-optimal designs on any compact design space. This paper is organized as follows. The statistical inference based on a regression model, along with the form of the information matrix and the variance-covariance matrix of the model parameters, is presented in Section 2. After a review of the D- and A-optimality criteria, the proposed algorithms and the theoretical results related to the convergence and monotonicity of the proposed algorithms are also presented in Section 2. Section 3 presents numerical illustrations with several linear regression models on different types of design spaces, which are more general in practical applications, for D-optimal and A-optimal designs. The proofs of the main results are presented in the Appendix.
2. Algorithms for Approximate Optimal Designs
In this section, we introduce the method for searching for optimal designs when regression analysis is used. We focus on the numerical computation of approximate D- and A-optimal designs. For notational simplicity, we denote f_i = f(x_i), w_i = w(x_i), and y_i = y(x_i) in the following. Consider the linear regression model

Y(x) = f^T(x) β + ε,  x ∈ X,   (3)

where f(x) is the vector of covariates, β is a p-dimensional parameter vector, X is the design space, and ε is the error term with mean 0 and variance σ². When the observations (x_i, y_i = y(x_i)) are obtained based on the model in Eq. (3), the ordinary least squares estimator of β can be expressed as

β̂ = [ Σ_{i=1}^N w_i f_i f_i^T ]^{-1} Σ_{i=1}^N w_i y_i f_i,

where w_i ≥ 0, i = 1, 2, ..., N, and Σ_{i=1}^N w_i = 1. The variance of β̂ can be obtained as

Var(β̂) = [ Σ_{i=1}^N w_i f_i f_i^T ]^{-1} σ².

Most of the existing computational algorithms for obtaining optimal designs discretize the underlying continuous space by considering a finite design space X = {x_1, ..., x_N} ⊂ R^q. These existing algorithms rely on either complex algorithms or advanced mathematical programming solvers. Here, we propose algorithms that are simple yet effective in obtaining the optimal designs for D-optimality and A-optimality without relying on other complex algorithms or advanced mathematical programming solvers.

D-optimal Designs

In an experiment, researchers often wish to estimate the model parameters with the highest precision. One of the commonly used optimality criteria in experimental design is the D-optimal design, which maximizes the determinant of the Fisher information matrix and thereby yields the minimum volume of the Wald-type joint confidence region for the model parameters when the variance is known (Gilmour & Trinca, 2012). Specifically, the D-optimal design maximizes the log-determinant of the information matrix, i.e., it minimizes the log-determinant of the asymptotic variance-covariance matrix Var(β̂).
In other words, the D-optimality criterion results in minimizing the generalized variance of the parameter estimates. The D-optimality criterion can be described as follows.

D-optimal criterion:

min_{w_1,...,w_N} { − log | Σ_{i=1}^N w_i f_i f_i^T | : subject to w_i ≥ 0, Σ_{i=1}^N w_i = 1, i = 1, 2, ..., N }.   (4)

For D-optimality, we can obtain the following result.

Theorem 1. w* is the D-optimal solution for Eq. (4) if and only if

Σ_{i=1}^N w_i f^T(x_i) [ Σ_{j=1}^N w*_j f(x_j) f^T(x_j) ]^{-1} f(x_i) ≤ p

for all w_i ≥ 0 with Σ_{i=1}^N w_i = 1, i = 1, ..., N.

Theorem 1 is a special case of a part of the general equivalence theorem, and the detailed proof is provided in the Appendix. For the D-optimal criterion in Eq. (4), we propose the following algorithm to obtain the optimal choice of w* based on Theorem 1.

Algorithm 1 Algorithm for D-optimal design

Input:
Regressor f(x) ∈ R^p, design space X, stopping parameter γ, and tuning parameter δ.

Output:
Approximate design points x*_s and corresponding weights w*_s, s = 1, ..., k.

1. Generate random design points x_i ∈ X and corresponding starting weights w_i^{(0)}, i = 1, ..., N.
2. Compute the regressors f_1, ..., f_N ∈ R^p.
3. Repeat

w_i^{(h)} = w_i^{(h−1)} f_i^T D^{(h−1)} f_i / p,  i = 1, ..., N,  where  D^{(h−1)} = [ Σ_{i=1}^N w_i^{(h−1)} f_i f_i^T ]^{-1},

until Σ_{i=1}^N | w_i^{(h)} − w_i^{(h−1)} | < γ.
4. Find the w_i > δ and the corresponding design points x_i, i = 1, ..., N, and write them as x*_s, s = 1, ..., k.
5. Take x*_s, s = 1, ..., k, as the new design points and repeat Steps 2–5.
6. Output the optimal design points x*_s and corresponding weights w*_s, s = 1, ..., k.

In Algorithm 1, f(x) is the functional form of the regressors, the stopping parameter γ determines the stopping criterion of the algorithm, and the tuning parameter δ is used to choose the design points whose weights are larger than δ. Note that the choice of δ must be such that k ≥ p, which is crucial for guaranteeing the non-singularity of the information matrix. In general, the experimenter only needs to set the values of γ and δ to be very small; small values of δ and γ are used in the numerical study of Section 3.

Remark 1.
Algorithm 1 is efficient even when the sample size N is very large because, given the current matrix D^{(h−1)}, the update of each weight does not involve the weights of the other samples; hence, one can use a parallel strategy to speed up the computations required for Algorithm 1. Most of the existing algorithms do not share this advantage because their design points interact with each other. For example, the VDM algorithm is based on differentiable optimization techniques in which the basic idea is to move the current design toward some other design point along a direction of improvement. In addition, the algorithms derived from the VDM algorithm all depend on the sample size N; therefore, these algorithms suffer from a loss of efficiency when the sample size N is large.

In the following, we present a proof of the convergence of Algorithm 1 for D-optimality. To prove the convergence of the proposed algorithm, we first show that the log-determinant of the information matrix is monotonic; we also need a boundedness assumption and the following two lemmas.

Lemma 1.
Let A(x) be a nonnegative definite matrix function on X, let w(x) = (w(x_1), w(x_2), ..., w(x_N)) and w̃(x) = (w̃(x_1), w̃(x_2), ..., w̃(x_N)) be two probability vectors in R^N, and suppose that

Σ_{i=1}^N w(x_i) A(x_i)  and  Σ_{i=1}^N w̃(x_i) A(x_i)

are positive definite matrices. Then,

log | Σ_{i=1}^N w(x_i) A(x_i) | − log | Σ_{i=1}^N w̃(x_i) A(x_i) | ≥ Σ_{i=1}^N w̃(x_i) tr[ A(x_i) ( Σ_{j=1}^N w̃(x_j) A(x_j) )^{-1} ] log( w(x_i) / w̃(x_i) ).
Following the proofs of Lemma 1 in Duan et al. (2019) and Lemma 2 in Gao et al. (2014), the result in Lemma 1 can be obtained.
Lemma 2.
Suppose w(x) = (w(x_1), w(x_2), ..., w(x_N)) and w̃(x) = (w̃(x_1), w̃(x_2), ..., w̃(x_N)) with Σ_{i=1}^N w(x_i) = Σ_{i=1}^N w̃(x_i) = 1 are two probability vectors in R^N. Then

Σ_{i=1}^N | w(x_i) − w̃(x_i) | ≤ [ 2 Σ_{i=1}^N w(x_i) log( w(x_i) / w̃(x_i) ) ]^{1/2}.
See Kullback (1967).

The following theorem shows the convergence of the proposed algorithm for D-optimality.

Theorem 2.
Under the assumption that log | Σ_{i=1}^N f(x_i) f^T(x_i) | is bounded, we have

Σ_{i=1}^N | w^{(n)}(x_i) − w^{(n−1)}(x_i) | → 0  as  n → ∞.

The proof of Theorem 2 is provided in the Appendix. The algorithm proposed here can be considered a member of the general class of multiplicative algorithms (Silvey et al., 1978). Hence, the proposed algorithm shares the simplicity and monotonic convergence property of this class, and its convergence rate does not depend on N, in contrast to some exact algorithms such as the coordinate-exchange algorithm (Meyer & Nachtsheim, 1995). Now, we provide a theorem to show that the D-optimal approximate design converges to the continuous D-optimal design (see, for example, Duan et al., 2019) under certain conditions. Here, a design is approximate if it is a discrete probability measure, and a design is continuous if it is a probability measure with a density with respect to the Lebesgue measure on the observation domain. Assume that w_opt is the D-optimal design on the design space X and that it is a continuous probability distribution.

Theorem 3. Assume that the random sample points x_i, i = 1, 2, ..., N, are generated according to a density g(x) on the design space X with g(x) > ε > 0 for all x ∈ X. If ∫_X f^T(x) f(x) dx is bounded on X, then

det[ Σ_{i=1}^N w*(x_i) f(x_i) f^T(x_i) ] − det[ ∫_X w_opt(x) f(x) f^T(x) dx ] → 0  as  N → ∞.
Proof. First, we have

∫_X w_opt(x) f(x) f^T(x) dx = ∫_X [w_opt(x)/g(x)] f(x) f^T(x) g(x) dx = Σ_{i=1}^N [ w_opt(x_i) / {N g(x_i)} ] f(x_i) f^T(x_i) + c_N,

where c_N = O_p(N^{−1/2}). We also have 0 ≤ w_opt(x_i)/{N g(x_i)} ≤ C/(N ε) < ∞ for N sufficiently large. In addition,

K_N ≡ Σ_{i=1}^N w_opt(x_i)/{N g(x_i)} = ∫_X [w_opt(x)/g(x)] g(x) dx + d_N = 1 + d_N,

where d_N = O_p(N^{−1/2}). Note that w_opt(x_i)/{N K_N g(x_i)}, i = 1, 2, ..., N, are valid weights.

Write Σ_{i=1}^N w*(x_i) f(x_i) f^T(x_i) = ∫_X w_N(x) f(x) f^T(x) dμ(x), where w_N(x) = w*(x_i) if x ∈ {x_1, x_2, ..., x_N} and w_N(x) = 0 otherwise. Then, because w_opt is optimal on X among all possible bounded probability density functions and probability mass functions on X, we obtain

det[ ∫_X w_opt(x) f(x) f^T(x) dx ] ≥ det[ Σ_{i=1}^N w*(x_i) f(x_i) f^T(x_i) ].   (5)

Given x_i, i = 1, 2, ..., N, the weights w*(x_i) are optimal, so we have

det[ Σ_{i=1}^N w_opt(x_i)/{N K_N g(x_i)} f(x_i) f^T(x_i) ] ≤ det[ Σ_{i=1}^N w*(x_i) f(x_i) f^T(x_i) ].   (6)

Combining Eqs. (5) and (6), we have

det[ ∫_X w_opt(x) f(x) f^T(x) dx ] ≥ det[ Σ_{i=1}^N w*(x_i) f(x_i) f^T(x_i) ] ≥ det[ (1/K_N) Σ_{i=1}^N w_opt(x_i)/{N g(x_i)} f(x_i) f^T(x_i) ] = det[ {1/(1 + d_N)} { ∫_X w_opt(x) f(x) f^T(x) dx − c_N } ].   (7)

Because c_N → 0 and d_N → 0 as N → ∞, Eq. (7) leads to

det[ ∫_X w_N(x) f(x) f^T(x) dμ(x) ] − det[ ∫_X w_opt(x) f(x) f^T(x) dx ] = det[ Σ_{i=1}^N w*(x_i) f(x_i) f^T(x_i) ] − det[ ∫_X w_opt(x) f(x) f^T(x) dx ] → 0  as  N → ∞.

Theorem 3 verifies that the approximate optimal design obtained from the proposed algorithm eventually converges to the continuous optimal design at the rate O_p(N^{−1/2}). Theorem 3 also guarantees the convergence of the proposed algorithm.
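The iteration of Algorithm 1, together with the checks suggested by Theorems 1 and 2, can be sketched in a few lines of Python (numpy). The quadratic regressor f(x) = (1, x, x²)^T and the grid over [−1, 1] are illustrative assumptions, not settings from the paper:

```python
import numpy as np

def d_optimal_weights(F, gamma=1e-9, max_iter=50000):
    """Multiplicative update of Algorithm 1: w_i <- w_i * f_i^T D f_i / p,
    where D is the inverse information matrix of the current design.
    Also records the successive change sum_i |w_i^(h) - w_i^(h-1)|,
    which Theorem 2 asserts tends to zero."""
    N, p = F.shape
    w = np.full(N, 1.0 / N)                      # uniform starting weights
    changes = []
    for _ in range(max_iter):
        D = np.linalg.inv(F.T @ (w[:, None] * F))
        w_new = w * np.einsum("ij,jk,ik->i", F, D, F) / p
        changes.append(np.abs(w_new - w).sum())
        w = w_new
        if changes[-1] < gamma:                  # stopping rule of Algorithm 1
            break
    return w, changes

# Quadratic model f(x) = (1, x, x^2)^T on a 21-point grid (illustrative).
xs = np.linspace(-1, 1, 21)
F = np.stack([np.array([1.0, x, x * x]) for x in xs])
w, changes = d_optimal_weights(F)

# Theorem 1 (equivalence): at the optimum, max_i f_i^T M^{-1} f_i <= p = 3.
D = np.linalg.inv(F.T @ (w[:, None] * F))
print(np.einsum("ij,jk,ik->i", F, D, F).max(), xs[w > 0.01])
```

For this example the weights concentrate on {−1, 0, 1} with weight 1/3 each, the classical D-optimal design for quadratic regression on [−1, 1], and the entries of `changes` decay toward zero, consistent with Theorem 2.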
A-optimal Designs

Another commonly used optimality criterion is the A-optimality criterion, which minimizes the trace of the variance-covariance matrix of the maximum likelihood estimates (MLEs). The A-optimality criterion provides an overall measure of the variations in the model parameter estimates. The objective function being minimized in the A-optimal design is described as follows.

A-optimal criterion:

min_{w_1,...,w_N} tr[ Σ_{i=1}^N w_i f_i f_i^T ]^{-1},  subject to w_i ≥ 0, Σ_{i=1}^N w_i = 1, i = 1, ..., N.   (8)

For A-optimality, we can obtain the following theorem. For simplicity, we denote I_X(w, f) = Σ_{i=1}^N w(x_i) f(x_i) f^T(x_i) and I_X(w*, f) = Σ_{i=1}^N w*(x_i) f(x_i) f^T(x_i).

Theorem 4. w*(x) = (w*(x_1), ..., w*(x_N)) is the A-optimal solution for Eq. (8) if and only if

tr( [I_X(w*, f)]^{-1} I_X(w, f) [I_X(w*, f)]^{-1} ) / tr( [I_X(w*, f)]^{-1} ) ≤ 1

for all w(x_i) ≥ 0, i = 1, 2, ..., N, with Σ_{i=1}^N w(x_i) = 1.

The proof of Theorem 4 is provided in the Appendix. A similar approach to that of Theorem 3 can be used to show that the approximate A-optimal design converges to the continuous A-optimal design. Based on Theorem 4, for the A-optimal criterion in Eq. (8), the following algorithm is proposed to obtain the A-optimal design w*(x).
Algorithm for A -optimal design Input:
Algorithm for A-optimal design

Input: Regressor f(x) ∈ R^p, design space X, stopping parameter γ, and tuning parameter δ.

Output:
Approximate design points x*_s and corresponding weights w*_s, s = 1, 2, ..., k.

1. Generate random design points x_i ∈ X and corresponding starting weights w_i^{(0)}, i = 1, 2, ..., N.
2. Compute the regressors f_1, f_2, ..., f_N ∈ R^p.
3. Repeat

w_i^{(h)} = w_i^{(h−1)} [ ((p − 1)/p) tr( D^{(h−1)} f_i f_i^T D^{(h−1)} ) / tr( D^{(h−1)} ) + 1/p ],  i = 1, 2, ..., N,  where  D^{(h−1)} = [ Σ_{i=1}^N w_i^{(h−1)} f_i f_i^T ]^{-1},

until Σ_{i=1}^N | w_i^{(h)} − w_i^{(h−1)} | < γ.
4. Find the w_i > δ and the corresponding design points x_i, i = 1, 2, ..., N, and write them as x*_s, s = 1, 2, ..., k.
5. Take x*_s, s = 1, 2, ..., k, as the new design points and repeat Steps 2–5.
6. Output the optimal design points x*_s and corresponding weights w*_s, s = 1, ..., k.

In Algorithm 2, f(x) is the functional form of the regressors, the stopping parameter γ determines the stopping criterion of the algorithm, and the tuning parameter δ is used to choose the optimal design points whose weights are larger than δ. For the same reason as described in Remark 1, a parallel strategy can be used to speed up the computations required for Algorithm 2.

For A-optimality, the proposed algorithm provides a convergent solution that is robust to the choice of the initial value. We have attempted to develop a theoretical justification of the convergence of the proposed computational algorithm for A-optimality; however, a mathematical justification is not yet available. Instead, we provide simulation and numerical results to support the validity and reliability of the proposed algorithm in the subsequent section, and we conjecture the monotonic convergence of the algorithm for A-optimality based on these extensive simulation and numerical results.
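The update of Algorithm 2 differs from that of Algorithm 1 only in the multiplicative factor; note that tr(D f_i f_i^T D) = f_i^T D² f_i, so the factor is cheap to evaluate. A minimal Python (numpy) sketch, again with an illustrative quadratic regressor and grid, followed by the optimality check of Theorem 4:

```python
import numpy as np

def a_optimal_weights(F, gamma=1e-10, max_iter=50000):
    """Damped multiplicative update of Algorithm 2:
    w_i <- w_i * [ ((p-1)/p) * tr(D f_i f_i^T D) / tr(D) + 1/p ],
    with D the inverse information matrix of the current design.
    Uses tr(D f_i f_i^T D) = f_i^T D^2 f_i."""
    N, p = F.shape
    w = np.full(N, 1.0 / N)
    for _ in range(max_iter):
        D = np.linalg.inv(F.T @ (w[:, None] * F))
        t = np.einsum("ij,jk,ik->i", F, D @ D, F) / np.trace(D)
        w_new = w * ((p - 1) / p * t + 1.0 / p)
        if np.abs(w_new - w).sum() < gamma:
            return w_new
        w = w_new
    return w

# Quadratic model f(x) = (1, x, x^2)^T on a 21-point grid (illustrative).
xs = np.linspace(-1, 1, 21)
F = np.stack([np.array([1.0, x, x * x]) for x in xs])
w = a_optimal_weights(F)

# The A-optimal design for this model puts weights 1/4, 1/2, 1/4 at -1, 0, 1;
# Theorem 4's ratio t_i should be at most 1 at the optimum.
print(xs[w > 0.01], np.round(w[w > 0.01], 4))
```

The factor preserves Σ_i w_i = 1 exactly, since Σ_i w_i tr(D f_i f_i^T D) = tr(D M D) = tr(D) when D = M^{-1}.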
3. Numerical Illustrations
In this section, we consider five different settings (Castro et al., 2019) to evaluate the performance of the proposed algorithms for obtaining approximate D- and A-optimal designs. We use these numerical examples to illustrate that the proposed algorithms can efficiently identify the optimal design. For comparative purposes, we also apply the randomized exchange (REX) algorithm proposed by Harman et al. (2020), the cocktail (CO) algorithm proposed by Yu (2011), the vertex direction (VDM) algorithm proposed by Fedorov (1972) and Wynn (1970), and the multiplicative (MUL) algorithm proposed by Silvey et al. (1978) for computing the D- and A-optimal designs. Since the cocktail algorithm is for D-optimality only (Yu, 2011), it is not applied to obtain the A-optimal design.

Setting 1 (p = 6). In Setting 1, we consider the model

y(x) = β^T f(x),

where β = (β_0, β_1, ..., β_5)^T, f(x) = (1, x_1, x_2, x_1^2, x_1x_2, x_2^2)^T, and x ∈ X = [−1, 1] × [−1, 1]. The D- and A-optimal designs are presented in Tables 1 and 2, respectively, together with the optimal values of the corresponding objective functions. From Tables 1 and 2, we observe that the proposed algorithms identify the same optimal design points as the REX, CO, VDM, and MUL algorithms. The weights for the optimal points obtained from the REX and CO algorithms and the proposed algorithm are very close, and there is no significant difference between these three algorithms according to the values of the D-optimality objective function presented in Table 1. Furthermore, the performance of the VDM and MUL algorithms is not as good as that of the REX, CO, and proposed algorithms. Similar results and conclusions can be observed from Table 2 for the A-optimal designs.
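A setting of this kind is easy to reproduce. The sketch below runs the multiplicative D-optimality update of Algorithm 1 on a 5 × 5 candidate grid over [−1, 1] × [−1, 1]; the grid resolution is our illustrative choice, not the one used in the paper. The support of the resulting design falls on the 3 × 3 factorial points, in line with the known continuous D-optimal design for the full quadratic model on the square, with the corner points receiving the largest weights:

```python
import numpy as np
from itertools import product

def quad_regressor(x1, x2):
    """f(x) = (1, x1, x2, x1^2, x1*x2, x2^2)^T as in Setting 1."""
    return np.array([1.0, x1, x2, x1 * x1, x1 * x2, x2 * x2])

# Candidate grid over [-1, 1] x [-1, 1]; 5 levels per factor (illustrative).
levels = np.linspace(-1, 1, 5)
points = list(product(levels, levels))
F = np.stack([quad_regressor(x1, x2) for x1, x2 in points])
N, p = F.shape

w = np.full(N, 1.0 / N)
for _ in range(50000):
    D = np.linalg.inv(F.T @ (w[:, None] * F))
    w_new = w * np.einsum("ij,jk,ik->i", F, D, F) / p
    if np.abs(w_new - w).sum() < 1e-9:
        w = w_new
        break
    w = w_new

support = [pt for pt, wi in zip(points, w) if wi > 0.01]
print(len(support))  # support lands on the 3 x 3 factorial points
```

The equivalence-theorem bound max_i f_i^T M^{-1} f_i ≤ p = 6 can then be used to certify (near-)optimality of the computed weights.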
Setting 2 (p = 6). In Setting 2, we consider the model

y(x) = β^T f(x),

where β = (β_0, β_1, ..., β_5)^T, f(x) = (1, x_1, x_2, x_1^2, x_1x_2, x_2^2)^T, and x ∈ X = {(x_1, x_2)^T : x_1^2 + x_2^2 ≤ 1}. Figure 1 presents the D-optimal design points, in which the weight for the center of the unit circle (0, 0) is 1/6, and the other optimal design points are uniformly distributed on the ring (the vertices of a regular s-sided polygon inscribed in the circle) with a combined weight of 5/6, in agreement with the theoretical results in Castro et al. (2019). The optimal design points and their corresponding weights obtained from Algorithm 1, Algorithm 2, and the REX, CO, VDM, and MUL algorithms for the D- and A-optimal designs are presented in Tables 3 and 4, respectively, together with the optimal values of the corresponding objective functions. Note that different algorithms may report different design points because the optimal design points are distributed uniformly on a circle with almost equal weights; hence, the points may be located at different positions that are symmetric about the center of the circle.

From Table 3, the REX and CO algorithms and the proposed algorithm give the same weight as the theoretical value for the center point (0, 0), and the other optimal design points are evenly distributed on the ring with total weight 5/
6, while the VDM and MUL algorithms spread the weights over all design points, including the points at the center and on the ring. By comparing the values of the objective function −log(|Σ(x)|) across algorithms, we can see that the proposed algorithm and the REX algorithm give the same value of the objective function, which is better than the values obtained from the other algorithms. Once again, similar results and conclusions can be observed from Table 4 for the A-optimal designs.

Setting 3 (p = 6). In Setting 3, we consider a two-dimensional irregular design space called Wynn's polygon, with the model

y(x) = β^T f(x),

where β = (β_0, β_1, ..., β_5)^T, f(x) = (1, x_1, x_2, x_1^2, x_1x_2, x_2^2)^T, and x ∈ X = {(x_1, x_2)^T : x_1, x_2 ≥ −√2/2, x_2 ≤ 2(x_1 + √2), x_1 ≤ 2(x_2 + √2), x_1 + x_2 ≤ √2}. Figure 2 shows the D-optimal design points, and Table 5 shows the design points and corresponding weights obtained by the different algorithms. From Table 5, we can see that the proposed algorithm and the REX and CO algorithms can locate the optimal design points with the theoretical weights, while the VDM and MUL algorithms fail to do so. For A-optimality, the design points and corresponding weights obtained by the different algorithms are presented in Table 6. From Table 6, we observe that the proposed algorithm and the REX algorithm provide the same results, and they are superior to the other methods.

Setting 4 (p = 10). In Setting 4, we consider the model

y(x) = β^T f(x),

where β = (β_0, β_1, ..., β_9)^T, f(x) = (1, x_1, x_2, x_3, x_1^2, x_1x_2, x_1x_3, x_2^2, x_2x_3, x_3^2)^T, and x ∈ X = [−1, 1] × [−1, 1] × [−1, 1]. This setting has D- and A-optimal design points with different weights. The weights obtained from the different algorithms, with the corresponding values of the objective function, are presented in Tables 7 and 8 for D- and A-optimality, respectively. From Tables 7 and 8, we observe that the results obtained from the proposed algorithms are very close to the theoretical values in Duan et al. (2019).
Note that although the weights of the proposed optimal design are different from the weights of the continuous optimal design provided by Atkinson et al. (2007), our optimal design has a smaller D-optimal criterion value; therefore, according to the definition of the D-optimality criterion, the proposed optimal design is preferable. In this numerical study, we observe that the optimal design points obtained by the existing algorithms can be unstable, in the sense that the optimal design points may not be unique over multiple runs of the algorithms, which may lead to uncertainty in practical applications. Moreover, we found that the REX, VDM, and MUL algorithms cannot always obtain all of the optimal design points. This may lead to severe problems in some critical experiments; for example, in pharmaceutical or chemical experiments, ignoring some design points may lead to severe consequences. The D-optimal values for the proposed, REX, and CO algorithms presented in Tables 7 and 8 are the closest to the theoretical values.

In fact, the REX method may fail when the number of candidate points is small and the number of parameters is large. For example, in Setting 4, if the candidate set has only 27 design points, the REX method sometimes encounters a singular M matrix during the calculation, which causes it to fail to obtain the optimal design. Similarly, in Setting 1, when there are only 7 candidate points, we found that the REX method may also fail occasionally. In contrast, the proposed method is feasible and stable in obtaining the optimal design points even when the number of candidate points is small. This is a significant advantage of the proposed method in practical applications, because there are many scenarios in which the number of candidate points is small due to high cost or environmental factors.

Setting 5 (p = 9). In Setting 5, we consider the model

y(x) = β^T f(x),

where β = (β_1, β_2, ..., β_9)^T, f(x) = (x_1, x_2, x_3, x_1^2, x_1x_2, x_1x_3, x_2^2, x_2x_3, x_3^2)^T, and x ∈ X = {(x_1, x_2, x_3)^T : x_1^2 + x_2^2 + x_3^2 = 1}. For this setting, we use Fibonacci points on the three-dimensional unit sphere as the initial design points for the algorithms considered here. For more details related to this setting, the reader can refer to Castro et al. (2019). Regression problems with a unit-sphere design space have many applications in astrophysics, gravity induction, geophysics, climate laws, and global navigation, because there are countless signals on the surface of the earth, and satellite signals also affect our daily lives. Another important application of regression on a unit-sphere design space is three-dimensional human face recognition with sparse spherical representations in authentication and surveillance. In this setting, we find that every design point on the unit sphere can be considered an optimal design point. Using the proposed algorithms, we obtain all the design points with equal weights. Figures 4 and 5 display the design points of the D-optimal design when the number of supporting points is 500 and 10, respectively. However, when applying the REX, VDM, MUL, and CO algorithms, only a smaller number of design points are identified. For instance, the REX algorithm gives only 128 points as the optimal design points when we use 5000 supporting points on the three-dimensional unit sphere; moreover, the weights assigned to these 128 design points are not equal under the REX algorithm. Figure 6 presents the D-optimal design obtained from the REX algorithm when the number of supporting points is 500. We also present the values of the D- and A-optimality objective functions in Table 9. From Table 9, we observe that the D-optimal and A-optimal values of the proposed algorithm are smaller than those of the other algorithms considered here.
Thus, the proposed method performs well in this case.

To compare the speed of the proposed algorithms for D-optimality and A-optimality with the REX, CO, VDM, and MUL algorithms, we plot the D-efficiency and A-efficiency, i.e.,
\[
\Bigl|\sum_{i=1}^{N} w_i^{*} f_i f_i^{T}\Bigr| \Big/ \Bigl|\sum_{i=1}^{N} w_i f_i f_i^{T}\Bigr|
\quad\text{and}\quad
\operatorname{tr}\Bigl[\Bigl(\sum_{i=1}^{N} w_i^{*} f_i f_i^{T}\Bigr)^{-1}\Bigr] \Big/ \operatorname{tr}\Bigl[\Bigl(\sum_{i=1}^{N} w_i f_i f_i^{T}\Bigr)^{-1}\Bigr],
\]
where w_i^*, i = 1, ..., N, is the theoretical optimal design, versus the time (in seconds) for Setting 4 with varying sizes in Figures 7-12. From Figures 7-9, we can see that all of these algorithms ultimately converge to the theoretical optimal design (given enough time), but the proposed method is superior to the other methods for a large size N of the design space because of the parallel strategy for the computation, as mentioned in Remark 1. However, for a small N, the REX algorithm tends to perform better than the proposed method. A similar observation can also be drawn for the A-efficiency from Figures 10-12.

To illustrate the performance of the proposed method when the number of factors is large, we consider the full quadratic regression model
\[
y(x_1, x_2, \ldots, x_q) = \beta_0 + \sum_{i=1}^{q}\beta_i x_i + \sum_{j=1}^{q}\sum_{k=j}^{q}\beta_{j,k} x_j x_k + \varepsilon, \qquad (9)
\]
where \beta_{1,1}, \beta_{1,2}, \ldots, \beta_{q,q} correspond to the parameters \beta_{q+1}, \beta_{q+2}, \ldots, \beta_p, with p = (q+1)(q+2)/2. In Figure 13, the vertical axis is the value of −log(1 − eff) for q = 8, 10, 12,
14 and the horizontal axis denotes the number of iterations, where eff represents the lower bound of the D-efficiency (Pukelsheim, 2006):
\[
D\text{-efficiency} \;\geq\; \frac{p}{\max_{x\in\mathcal{X}} f'(x)\, M^{-1}(\xi)\, f(x)},
\]
where ξ is the current design. In other words, on the vertical axis of Figure 13, the values 1, 2, 3, ... correspond to D-efficiency 0.9, 0.99, 0.999, .... In Figure 14, the vertical axis is the D-criterion value of the designs produced by the proposed method for q = 8, 10, 12, 14, and the horizontal axis denotes the number of iterations. From Figures 13 and 14, as the number of iterations increases, −log(1 − eff) gradually increases and the D-criterion values converge to the D-optimal value. Thus, the proposed method remains effective even when the number of factors is large.

Based on the numerical evaluations of the five settings considered in this section, we found that the proposed algorithms for D-optimality and A-optimality converge in all cases, and that the optimal design points, as well as the corresponding weights, are close to the theoretical values. Furthermore, in some cases, the proposed method outperforms some existing algorithms for computing approximate D- and A-optimal designs. It is noteworthy that the proposed algorithms are simple and can be implemented without relying on any advanced mathematical programming solvers. Therefore, the proposed algorithms provide a convenient and effective way to approximate the D- and A-optimal solutions on a compact design space.
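The efficiency measures and the D-efficiency lower bound discussed in this section can be sketched as follows. This is an illustrative implementation under the notation above (`F` stacks the regression vectors f(x_i) as rows), not the authors' Matlab code.

```python
import numpy as np

def info_matrix(w, F):
    """Information matrix sum_i w_i f(x_i) f(x_i)^T for weights w and an N x p matrix F."""
    return F.T @ (w[:, None] * F)

def d_eff_lower_bound(w, F):
    """Lower bound p / max_x f'(x) M^{-1}(xi) f(x) on the D-efficiency (Pukelsheim, 2006)."""
    p = F.shape[1]
    Minv = np.linalg.inv(info_matrix(w, F))
    # einsum computes the quadratic form f_i' Minv f_i for every candidate point i
    return p / np.max(np.einsum('ij,jk,ik->i', F, Minv, F))

# Simple check: for f(x) = (1, x)' on {-1, 1}, equal weights are D-optimal,
# so the bound attains 1 by the general equivalence theorem.
F = np.array([[1.0, -1.0], [1.0, 1.0]])
w = np.array([0.5, 0.5])
d_value = -np.log(np.linalg.det(info_matrix(w, F)))  # -log|Sigma(x)|, as reported in the tables
```

A design can be accepted once `d_eff_lower_bound` exceeds a target such as 0.999, the kind of threshold tracked in Figure 13.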
4. Concluding Remarks
In this paper, we discussed approximate optimal design and proposed efficient iterative computational algorithms to obtain the approximate D-optimal and A-optimal designs for linear models on compact design spaces. Due to their simplicity and efficiency, the two proposed algorithms are easy to implement, and they are useful tools for many practical applications of optimal design of experiments.

We also provided a proof of the monotonic convergence of the proposed algorithm for D-optimality and demonstrated that the proposed algorithm provides a solution that converges to the optimal design. Furthermore, we proved that the optimal approximate designs converge to the continuous optimal design under certain conditions. A theoretical justification for the convergence of the proposed algorithm for A-optimality is not yet available, but our numerical results strongly support the validity and reliability of the proposed algorithm. These algorithms are implemented in Matlab, and the programs are available from the authors upon request. It is worth mentioning that although we focus on D-optimal and A-optimal designs for linear models in this paper, the ideas behind the proposed algorithms can be extended to other optimality criteria and other design spaces in high-dimensional situations.

Appendix
Proof of Theorem 1
Since log(|A|) is concave in A for a positive definite matrix A, we have
\[
\log\Bigl|(1-\lambda)\sum_{i=1}^{N} w^{*}(x_i)f(x_i)f^{T}(x_i) + \lambda\sum_{i=1}^{N} w(x_i)f(x_i)f^{T}(x_i)\Bigr|
\geq (1-\lambda)\log\Bigl|\sum_{i=1}^{N} w^{*}(x_i)f(x_i)f^{T}(x_i)\Bigr| + \lambda\log\Bigl|\sum_{i=1}^{N} w(x_i)f(x_i)f^{T}(x_i)\Bigr|.
\]
Then, w^* is the optimal solution for the D-optimal criterion in (2) if and only if
\[
\frac{\log\{|(1-\lambda)I_X(w^{*},f) + \lambda I_X(w,f)|\} - \log\{|I_X(w^{*},f)|\}}{\lambda} \leq 0,
\]
where I_X(w,f) = \sum_{i=1}^{N} w(x_i)f(x_i)f^{T}(x_i), for all w(x_i) that satisfy w(x_i) ≥ 0, i = 1, 2, ..., N, and \sum_{i=1}^{N} w(x_i) = 1, and all λ > 0. Thus, for λ ↓ 0, we have
\[
\begin{aligned}
\lim_{\lambda \downarrow 0} \frac{\log(|(1-\lambda)I_X(w^{*},f) + \lambda I_X(w,f)|) - \log(|I_X(w^{*},f)|)}{\lambda}
&= \left.\frac{\partial \log\{|(1-\lambda)I_X(w^{*},f) + \lambda I_X(w,f)|\}}{\partial\lambda}\right|_{\lambda=0} \\
&= \operatorname{tr}\Biggl[\Bigl(\sum_{i=1}^{N} w^{*}(x_i)f(x_i)f^{T}(x_i)\Bigr)^{-1}\sum_{i=1}^{N}\{w(x_i)-w^{*}(x_i)\}f(x_i)f^{T}(x_i)\Biggr] \\
&= \operatorname{tr}\Biggl[\Bigl(\sum_{i=1}^{N} w^{*}(x_i)f(x_i)f^{T}(x_i)\Bigr)^{-1}\sum_{i=1}^{N} w(x_i)f(x_i)f^{T}(x_i)\Biggr] - p \leq 0,
\end{aligned}
\]
which gives the result in Theorem 1.

Proof of Theorem 2

From Lemma 1, we have
\[
\begin{aligned}
\log\Bigl|\sum_{i=1}^{N} w^{(n)}(x_i)f(x_i)f^{T}(x_i)\Bigr| - \log\Bigl|\sum_{i=1}^{N} w^{(n-1)}(x_i)f(x_i)f^{T}(x_i)\Bigr|
&\geq \sum_{i=1}^{N} w^{(n-1)}(x_i)\operatorname{tr}\bigl\{f(x_i)f^{T}(x_i)D^{(n-1)}\bigr\}\log\frac{w^{(n)}(x_i)}{w^{(n-1)}(x_i)} \\
&= \sum_{i=1}^{N} w^{(n-1)}(x_i)f^{T}(x_i)D^{(n-1)}f(x_i)\log\frac{w^{(n)}(x_i)}{w^{(n-1)}(x_i)} \\
&= p\sum_{i=1}^{N} w^{(n)}(x_i)\log\frac{w^{(n)}(x_i)}{w^{(n-1)}(x_i)} \geq 0.
\end{aligned}
\]
Thus, we can conclude that \(\log\bigl|\sum_{i=1}^{N} w^{(n)}(x_i)f(x_i)f^{T}(x_i)\bigr|\) is increasing in n = 2, 3, .... We also have
\[
\log\Bigl|\sum_{i=1}^{N} w^{(n)}(x_i)f(x_i)f^{T}(x_i)\Bigr| \leq \log\Bigl|\sum_{i=1}^{N} f(x_i)f^{T}(x_i)\Bigr|.
\]
Therefore, under the boundedness assumption, the sequence \(\log\bigl|\sum_{i=1}^{N} w^{(n)}(x_i)f(x_i)f^{T}(x_i)\bigr|\) is uniformly bounded and increasing, and hence it is convergent. Using Lemma 1 and Lemma 2, we can obtain
\[
0 = \lim_{n\to\infty}\Biggl[\log\Bigl|\sum_{i=1}^{N} w^{(n)}(x_i)f(x_i)f^{T}(x_i)\Bigr| - \log\Bigl|\sum_{i=1}^{N} w^{(n-1)}(x_i)f(x_i)f^{T}(x_i)\Bigr|\Biggr]
\geq p\sum_{i=1}^{N} w^{(n)}(x_i)\log\frac{w^{(n)}(x_i)}{w^{(n-1)}(x_i)}
\geq \frac{p}{2}\Biggl[\sum_{i=1}^{N}\bigl|w^{(n)}(x_i)-w^{(n-1)}(x_i)\bigr|\Biggr]^{2} \geq 0.
\]
Then, we can conclude that
\[
\sum_{i=1}^{N}\bigl|w^{(n)}(x_i)-w^{(n-1)}(x_i)\bigr| \longrightarrow 0 \quad\text{as } n \longrightarrow +\infty.
\]

Proof of Theorem 4

One can check that tr(A^{-1}) is convex in A, where A is a positive definite matrix. We have
\[
\operatorname{tr}\bigl[(1-\lambda)I_X(w^{*},f) + \lambda I_X(w,f)\bigr]^{-1} \leq (1-\lambda)\operatorname{tr}\bigl(I_X^{-1}(w^{*},f)\bigr) + \lambda\operatorname{tr}\bigl(I_X^{-1}(w,f)\bigr).
\]
Then, w^* is the optimal solution for the A-optimal criterion in Eq. (8) if and only if, for all w(x_i) with w(x_i) ≥ 0 and \sum_{i=1}^{N} w(x_i) = 1,
\[
\frac{\operatorname{tr}\bigl[(1-\lambda)I_X(w^{*},f) + \lambda I_X(w,f)\bigr]^{-1} - \operatorname{tr}\bigl(I_X^{-1}(w^{*},f)\bigr)}{\lambda} \geq 0, \qquad \lambda > 0.
\]
Thus, for λ ↓ 0,
\[
\begin{aligned}
\lim_{\lambda\downarrow 0}\frac{\operatorname{tr}[(1-\lambda)I_X(w^{*},f)+\lambda I_X(w,f)]^{-1} - \operatorname{tr}(I_X^{-1}(w^{*},f))}{\lambda}
&= \left.\frac{\partial\operatorname{tr}\{[(1-\lambda)I_X(w^{*},f)+\lambda I_X(w,f)]^{-1}\}}{\partial\lambda}\right|_{\lambda=0} \\
&= -\operatorname{tr}\bigl[I_X^{-1}(w^{*},f)\bigl(I_X(w,f)-I_X(w^{*},f)\bigr)I_X^{-1}(w^{*},f)\bigr] \\
&= -\operatorname{tr}\bigl(I_X^{-1}(w^{*},f)I_X(w,f)I_X^{-1}(w^{*},f)\bigr) + \operatorname{tr}\bigl(I_X^{-1}(w^{*},f)\bigr) \geq 0,
\end{aligned}
\]
which implies Theorem 4.

References
Atkinson, A. C., Donev, A. N., & Tobias, R. D. 2007. Optimum Experimental Designs, with SAS. Oxford University Press.
Box, G. E. P., Hunter, W. G., & Hunter, J. S. 1978. Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building. John Wiley & Sons.
Castro, Y. D., Gamboa, F., Henrion, D., Hess, R., & Lasserre, J. B. 2019. Approximate optimal designs for multivariate polynomial regression. The Annals of Statistics, (1), 127-155.
Chen, Y. H. 2003. D-optimal designs for linear and quadratic polynomial models. National Sun Yat-Sen University, Taiwan.
Cook, R. D., & Nachtsheim, C. J. 1982. Model robust, linear-optimal designs. Technometrics, (1), 49-54.
Dempster, A. P., Laird, N. M., & Rubin, D. B. 1977. Maximum likelihood estimation from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, (1), 1-38.
Dette, H., & Studden, W. J. 1997. The Theory of Canonical Moments with Applications in Statistics, Probability, and Analysis. Wiley & Sons.
Duan, J. T., Gao, W., & Ng, H. K. T. 2019. Efficient computational algorithm for optimal continuous experimental designs. Journal of Computational and Applied Mathematics, 98-113.
Fedorov, V. 1972. Theory of Optimal Experiments. Academic Press.
Gao, W., Chan, P. S., Ng, H. K. T., & Lu, X. 2014. Efficient computational algorithm for optimal allocation in regression models. Journal of Computational and Applied Mathematics, (1), 118-126.
Gilmour, S. G., & Trinca, L. A. 2012. Optimum design of experiments for statistical inference. Journal of the Royal Statistical Society: Series C (Applied Statistics), (3), 345-401.
Goos, P., Jones, B., & Syafitri, U. 2016. I-optimal design of mixture experiments. Journal of the American Statistical Association, (514), 899-911.
Harman, R., Filová, L., & Richtárik, P. 2020. A randomized exchange algorithm for computing optimal approximate designs of experiments. Journal of the American Statistical Association, (529), 348-361.
Karlin, S., & Studden, W. J. 1966. Tchebycheff Systems: With Applications in Analysis and Statistics. Interscience Publishers, John Wiley & Sons.
Kiefer, J. 1974. General equivalence theory for optimum designs (approximate theory). The Annals of Statistics, (5), 849-879.
Kiefer, J., & Wolfowitz, J. 1959. Optimum designs in regression problems. The Annals of Mathematical Statistics, (2), 271-294.
Kullback, S. 1967. A lower bound for discrimination information in terms of variation. IEEE Transactions on Information Theory, (1), 126-127.
Meyer, R. K., & Nachtsheim, C. J. 1995. The coordinate-exchange algorithm for constructing exact optimal experimental designs. Technometrics, (1), 60-69.
Pukelsheim, F. 2006. Optimal Design of Experiments (Classics in Applied Mathematics). Philadelphia, PA: Society for Industrial and Applied Mathematics.
Silvey, S. D., Titterington, D. M., & Torsney, B. 1978. An algorithm for optimal designs on a finite design space. Communications in Statistics - Theory and Methods, (14), 1379-1389.
Welch, W. J. 1982. Algorithmic complexity: three NP-hard problems in computational statistics. Journal of Statistical Computation and Simulation, (1), 17-25.
Wynn, H. P. 1970. The sequential generation of D-optimum experimental designs. Annals of Mathematical Statistics, (5), 1655-1664.
Yu, Y. 2011. D-optimal designs via a cocktail algorithm. Statistics and Computing, (4), 475-481.

Table 1. D-optimal design points and weights, and the values of the D-optimality objective function for Setting 1

Design points   (REX)    (CO)     (VDM)    (multiplicative)  (proposed algorithm)
(-1, -1)        0.1458   0.1458   0.1625   0.1430            0.1457
(1, 1)          0.1458   0.1458   0.1196   0.1520            0.1457
(-1, 1)         0.1458   0.1458   0.1595   0.1406            0.1457
(1, -1)         0.1458   0.1458   0.1595   0.1436            0.1457
(0, -1)         0.0802   0.0802   0.0798   0.0914            0.0803
(0, 1)          0.0802   0.0802   0.1196   0.0822            0.0803
(-1, 0)         0.0802   0.0802   0.0798   0.1066            0.0803
(1, 0)          0.0802   0.0802   0.0798   0.1032            0.0803
(0, 0)          0.0962   0.0962   0.0399   0.0373            0.0960
−log(|Σ(x)|)    4.4706   4.4706   4.5760   4.5512            4.4718

Table 2. A-optimal design points and weights, and the values of the A-optimality objective function for Setting 1

Design points   (REX)    (VDM)    (multiplicative)  (proposed algorithm)
(-1, -1)        0.0940   0.1202   0.0962            0.0939
(1, 1)          0.0940   0.1202   0.0895            0.0939
(-1, 1)         0.0940   0.1202   0.0953            0.0939
(1, -1)         0.0940   0.1202   0.0978            0.0939
(0, -1)         0.0978   0.1183   0.1181            0.0978
(0, 1)          0.0978   0.0802   0.0972            0.0978
(-1, 0)         0.0978   0.0802   0.0972            0.0978
(1, 0)          0.0978   0.0802   0.0873            0.0978
(0, 0)          0.2332   0.1603   0.2214            0.2332
tr(Σ(x)^{-1})   17.8922  18.7300  17.9453           17.8922

Figure 1. D-optimal design for Setting 2

Table 3. D-optimal design points and weights, and the values of the D-optimality objective function for Setting 2. Each pair gives a design point and its weight; methods appear in the order (REX), (CO), (VDM), (MUL), (Proposed).

(-0.05, -1.00) 0.0682  (-1.00, 0.06) 0.1103   ( 0.08, 0.35) 0.0403   (-0.10, 0.92) 0.0332   (-1.00, 0.00) 0.0222
(-0.78, 0.62) 0.0614   (-0.94, -0.55) 0.0694  (-0.76, -0.64) 0.0403  (-0.33, -0.94) 0.0514  (-0.96, -0.28) 0.0214
( 0.37, -0.93) 0.0840  (-0.94, 0.55) 0.0064   ( 0.42, -0.91) 0.0403  (-0.01, 0.81) 0.0235   (-0.96, 0.28) 0.0812
( 0.01, 1.00) 0.1370   (-0.73, 0.22) 0.0569   ( 0.23, 0.97) 0.0403   ( 0.86, 0.50) 0.0414   (-0.80, -0.60) 0.0591
( 0.92, -0.38) 0.1153  (-0.73, -0.10) 0.0989  (-0.10, 0.34) 0.0403   (-0.56, 0.39) 0.0254   (-0.60, -0.80) 0.0591
( 0.86, 0.51) 0.1306   (-0.28, -0.82) 0.0794  (-0.35, 0.94) 0.0403   ( 0.62, -0.78) 0.0879  (-0.60, 0.80) 0.0924
(-0.78, -0.63) 0.1343  (-0.00, 0.00) 0.1667   (-0.91, -0.40) 0.0331  (-0.70, 0.71) 0.0499   (-0.28, -0.96) 0.0214
( 0.00, 0.00) 0.1667   ( 0.30, 0.32) 0.0825   (-0.01, -0.05) 0.0403  (-0.96, 0.27) 0.0514   (-0.28, 0.96) 0.0073
(-0.96, 0.29) 0.0978   ( 0.31, 0.00) 0.0061   ( 0.99, -0.13) 0.0403  ( 0.78, 0.57) 0.0375   ( 0.00, -1.00) 0.0222
(-0.39, 0.92) 0.0047   ( 0.51, 0.34) 0.1069   ( 0.53, -0.85) 0.0403  (-0.87, -0.50) 0.0517  ( 0.00, 0.00) 0.1667
( 0.51, -0.32) 0.0237  (-0.69, 0.72) 0.0403   ( 0.04, -0.01) 0.0448  ( 0.00, 1.00) 0.0075
( 0.57, -0.38) 0.0738  ( 1.00, -0.01) 0.0403  ( 0.13, -0.73) 0.0256  ( 0.28, -0.96) 0.0812
( 0.57, -0.78) 0.0207  (-0.25, 0.73) 0.0403   ( 0.98, 0.18) 0.0420   ( 0.28, 0.96) 0.1241
(-0.36, -0.93) 0.0806  ( 0.85, -0.52) 0.0450  ( 0.51, 0.86) 0.0014
( 0.63, -0.77) 0.0403  (-0.99, 0.12) 0.0509   ( 0.80, -0.60) 0.0924
(-0.96, -0.28) 0.0403  (-0.04, 0.69) 0.0213   ( 0.86, 0.51) 0.0014
(-0.96, 0.26) 0.0806   ( 0.01, -0.03) 0.0449  ( 0.96, -0.28) 0.0073
(-0.30, 0.95) 0.0403   ( 0.08, -1.00) 0.0458  ( 0.96, 0.28) 0.1241
( 0.60, 0.80) 0.0806   ( 0.99, 0.13) 0.0422   ( 1.00, 0.00) 0.0075
( 0.10, -0.45) 0.0403  ( 0.30, 0.95) 0.0427
( 0.00, 0.40) 0.0403   (-0.05, 1.00) 0.0433
( 0.91, 0.20) 0.0403   (-0.71, -0.71) 0.0531
( 0.01, -0.04) 0.0449
−log(|Σ(x)|)  8.2497  17.9858  8.6608  8.6669  8.2492

Table 4. A-optimal design points and weights, and the values of the A-optimality objective function for Setting 2. Each pair gives a design point and its weight; methods appear in the order (REX), (VDM), (MUL), (Proposed).

(-0.05, -1.00) 0.0371  (-1.00, -0.03) 0.0395  (-0.55, 0.15) 0.0174   (-0.96, 0.28) 0.0856
(-0.78, 0.62) 0.0355   ( 0.69, -0.73) 0.0395  (-0.37, -0.72) 0.0305  (-0.80, -0.60) 0.0700
( 0.37, -0.93) 0.0861  ( 0.75, -0.66) 0.0395  ( 0.78, 0.62) 0.0930   (-0.60, -0.80) 0.0700
( 0.01, 1.00) 0.1053   (-0.09, 0.13) 0.0395   (-0.59, 0.27) 0.0149   (-0.60, 0.80) 0.0878
(-0.89, 0.46) 0.0978   ( 0.04, -0.08) 0.0395  (-0.29, 0.96) 0.0383   (-0.01, 0.00) 0.0013
( 0.81, 0.59) 0.0781   (-0.82, 0.11) 0.0395   ( 0.86, 0.51) 0.0474   ( 0.00, -0.01) 0.0013
( 0.92, -0.38) 0.0984  ( 0.04, -0.02) 0.0395  (-0.36, 0.93) 0.0398   ( 0.00, 0.00) 0.2866
( 0.86, 0.51) 0.0374   ( 0.92, 0.40) 0.0791   (-1.00, 0.03) 0.0613   ( 0.00, 0.01) 0.0013
(-0.78, -0.63) 0.1324  (-0.18, 0.15) 0.0395   ( 0.05, 0.00) 0.0539   ( 0.01, 0.00) 0.0013
( 0.00, 0.00) 0.2919   ( 0.03, 0.79) 0.0395   ( 0.83, -0.55) 0.0396  ( 0.28, -0.96) 0.0856
( 0.37, 0.88) 0.0395   ( 0.69, -0.57) 0.0228  ( 0.28, 0.96) 0.0998
( 0.96, 0.28) 0.0395   ( 0.05, -0.01) 0.1077  ( 0.51, 0.86) 0.0108
(-0.30, -0.95) 0.0395  (-0.99, -0.11) 0.0609  ( 0.80, -0.60) 0.0878
( 0.03, -0.04) 0.0395  (-0.03, 1.00) 0.0354   ( 0.86, 0.51) 0.0108
( 0.06, -0.07) 0.0395  (-0.56, 0.65) 0.0242   ( 0.96, 0.28) 0.0998
( 0.67, -0.74) 0.0395  ( 0.67, -0.74) 0.0376
( 0.32, 0.95) 0.0395   ( 0.08, 0.08) 0.0532
(-0.35, -0.94) 0.0395  (-0.63, 0.17) 0.0144
(-0.71, -0.70) 0.0395  ( 0.08, -0.01) 0.0538
( 0.03, -0.02) 0.0514  (-0.31, -0.95) 0.0567
(-0.59, 0.81) 0.1186   (-0.39, -0.92) 0.0584
(-0.95, 0.00) 0.0395   ( 0.79, -0.62) 0.0388
tr(Σ(x)^{-1})  35.2212  38.1327  38.0870  35.2207

Figure 2. D-optimal design for Setting 3

Table 5. D-optimal design points and weights, and the values of the D-optimality objective function for Setting 3. Each pair gives a design point and its weight; methods appear in the order (REX), (CO), (VDM), (MUL), (Proposed).

(-0.35, -0.35) 0.1652  (-0.35, -0.35) 0.1652  (-0.35, -0.35) 0.1626  (-0.35, -0.35) 0.1619  (-0.35, -0.35) 0.1627
(-0.35, 0.35) 0.1652   (-0.35, 0.35) 0.1652   (-0.35, 0.35) 0.1652   (-0.35, 0.35) 0.1648   (-0.35, 0.35) 0.1652
( 0.12, 0.12) 0.0690   ( 0.12, 0.12) 0.0690   (-0.11, 0.25) 0.0002   ( 0.14, 0.00) 0.0602   ( 0.12, 0.12) 0.0690
( 0.18, 0.53) 0.1396   ( 0.18, 0.53) 0.1396   ( 0.03, -0.30) 0.0002  ( 0.21, 0.54) 0.1475   ( 0.18, 0.53) 0.1396
( 0.35, -0.35) 0.1652  ( 0.35, -0.35) 0.1652  ( 0.11, 0.11) 0.0003   ( 0.35, -0.35) 0.1666  ( 0.35, -0.35) 0.1652
( 0.53, 0.18) 0.1396   ( 0.53, 0.18) 0.1396   ( 0.11, 0.12) 0.0004   ( 0.54, 0.21) 0.1417   ( 0.53, 0.18) 0.1396
( 0.70, 0.70) 0.1587   ( 0.70, 0.70) 0.1587   ( 0.12, 0.12) 0.0677   ( 0.70, 0.70) 0.1573   ( 0.70, 0.70) 0.1587
( 0.18, 0.46) 0.0002
( 0.18, 0.53) 0.1392
( 0.27, 0.07) 0.0002
( 0.35, -0.35) 0.1652
( 0.48, 0.24) 0.0002
( 0.53, 0.18) 0.1394
( 0.57, 0.63) 0.0002
( 0.70, 0.70) 0.1585
−log(|Σ(x)|)  17.5100  17.5100  17.5121  17.5509  17.5100

Table 6. A-optimal design points and weights, and the values of the A-optimality objective function for Setting 3. Each pair gives a design point and its weight; methods appear in the order (REX), (VDM), (MUL), (Proposed).

(-0.35, -0.35) 0.1047  (-0.35, -0.35) 0.1045  (-0.35, -0.35) 0.1042  (-0.35, -0.35) 0.1047
(-0.35, 0.35) 0.1642   (-0.35, 0.35) 0.1641   (-0.35, 0.35) 0.1618   (-0.35, 0.35) 0.1642
( 0.07, 0.07) 0.1926   (-0.17, -0.05) 0.0002  ( 0.08, 0.05) 0.1924   ( 0.07, 0.07) 0.1926
( 0.21, 0.54) 0.1567   (-0.13, -0.02) 0.0002  ( 0.21, 0.54) 0.1572   ( 0.21, 0.54) 0.1567
( 0.35, -0.35) 0.1642  (-0.12, 0.43) 0.0002   ( 0.35, -0.35) 0.1670  ( 0.35, -0.35) 0.1642
( 0.54, 0.21) 0.1567   ( 0.07, 0.07) 0.1901   ( 0.54, 0.21) 0.1564   ( 0.54, 0.21) 0.1567
( 0.70, 0.70) 0.0609   ( 0.07, 0.08) 0.0005   ( 0.70, 0.70) 0.0609   ( 0.70, 0.70) 0.0609
( 0.08, 0.08) 0.0008
( 0.09, 0.09) 0.0002
( 0.09, 0.10) 0.0003
( 0.10, 0.12) 0.0002
( 0.13, 0.16) 0.0002
( 0.14, -0.23) 0.0002
( 0.21, 0.54) 0.1564
( 0.25, -0.31) 0.0002
( 0.35, -0.35) 0.1641
( 0.41, -0.05) 0.0002
( 0.52, 0.57) 0.0002
( 0.54, 0.21) 0.1564
( 0.55, 0.24) 0.0002
( 0.70, 0.70) 0.0607
tr(Σ(x)^{-1})  359.1845  359.4192  359.6661  359.1845

Figure 3. D-optimal design for Setting 4

Table 7. D-optimal design points and weights, and the values of the D-optimality objective function for Setting 4

Design points   (REX)    (CO)     (VDM)    (multiplicative)  (proposed algorithm)
(-1, -1, -1)    0.0668   0.0677   0.0489   0.0539            0.0684
(-1, -1, 1)     0.0810   0.0613   0.0489   0.0595            0.0684
(-1, 1, -1)     0.0625   0.0698   0.0489   0.0576            0.0684
(-1, 1, 1)      0.0766   0.0635   0.0733   0.0753            0.0684
(1, -1, -1)     0.0673   0.0803   0.0489   0.0721            0.0684
(1, -1, 1)      0.0815   0.0740   0.0733   0.0726            0.0684
(1, 1, -1)      0.0630   0.0825   0.0733   0.0594            0.0684
(1, 1, 1)       0.0771   0.0762   0.0733   0.0790            0.0684
(-1, -1, 0)     0.0151   0.0339   0.0489   0.0416            0.0262
(-1, 0, -1)     0.0336   0.0254   0.0244   0.0530            0.0262
(-1, 0, 1)      0.0053   0.0380   0.0489   0.0284            0.0262
(-1, 1, 0)      0.0238   0.0295   0.0244   0.0289            0.0262
(0, -1, -1)     0.0288   0.0150   0.0468   0.0533            0.0262
(0, -1, 1)      0.0005   0.0276   0.0489   0.0275            0.0262
(0, 1, -1)      0.0374   0.0106   0.0244   0.0482            0.0262
(0, 1, 1)       0.0091   0.0232   0.0244   0.0137            0.0262
(1, -1, 0)      0.0141   0.0086   0.0244   0.0266            0.0262
(1, 0, -1)      0.0326   0.0001   0.0244   0.0295            0.0262
(1, 1, 0)       0.0228   0.0043   0.0000   0.0270            0.0262
(1, 0, 1)       0.0043   0.0127   0.0000   0.0196            0.0262
(-1, 0, 0)      0.0318   0.0073   0.0000   0.0000            0.0183
(0, -1, 0)      0.0414   0.0282   0.0000   0.0000            0.0183
(0, 0, -1)      0.0045   0.0451   0.0244   0.0000            0.0183
(0, 0, 1)       0.0611   0.0199   0.0244   0.0000            0.0183
(0, 1, 0)       0.0241   0.0369   0.0489   0.0216            0.0183
(1, 0, 0)       0.0339   0.0578   0.0489   0.0164            0.0183
(0, 0, 0)       0.0000   0.0005   0.0244   0.0351            0.0290
−log(|Σ(x)|)    7.4554   7.6846   8.5046   8.0713            7.4514

Table 8. A-optimal design points and weights, and the values of the A-optimality objective function for Setting 4

Design points   (REX)    (VDM)    (MUL)    (proposed algorithm)
(-1, -1, -1)    0.0277   0.0477   0.0321   0.0402
(-1, -1, 1)     0.0264   0.0477   0.0316   0.0402
(-1, 1, -1)     0.0364   0.0477   0.0458   0.0402
(-1, 1, 1)      0.0351   0.0477   0.0343   0.0402
(1, -1, -1)     0.0450   0.0238   0.0268   0.0402
(1, -1, 1)      0.0438   0.0238   0.0301   0.0402
(1, 1, -1)      0.0538   0.0477   0.0389   0.0402
(1, 1, 1)       0.0525   0.0477   0.0424   0.0402
(-1, -1, 0)     0.0521   0.0238   0.0465   0.0259
(-1, 0, -1)     0.0421   0.0238   0.0282   0.0259
(-1, 0, 1)      0.0447   0.0000   0.0307   0.0259
(-1, 1, 0)      0.0347   0.0000   0.0271   0.0259
(0, -1, -1)     0.0335   0.0715   0.0485   0.0259
(0, -1, 1)      0.0361   0.0477   0.0482   0.0259
(0, 1, -1)      0.0161   0.0238   0.0529   0.0259
(0, 1, 1)       0.0186   0.0238   0.0386   0.0259
(1, -1, 0)      0.0175   0.0477   0.0518   0.0259
(1, 0, -1)      0.0075   0.0477   0.0455   0.0259
(1, 1, 0)       0.0000   0.0477   0.0243   0.0259
(1, 0, 1)       0.0100   0.0238   0.0277   0.0259
(-1, 0, 0)      0.0080   0.0715   0.0319   0.0430
(0, -1, 0)      0.0253   0.0238   0.0000   0.0430
(0, 0, -1)      0.0453   0.0000   0.0240   0.0430
(0, 0, 1)       0.0405   0.0715   0.0460   0.0430
(0, 1, 0)       0.0602   0.0477   0.0221   0.0430
(1, 0, 0)       0.0774   0.0000   0.0301   0.0430
(0, 0, 0)       0.1101   0.0703   0.0936   0.1096
tr(Σ(x)^{-1})   29.9135  31.2281  31.0049  29.9255

Figure 4. D-optimal design for Setting 5

Figure 5. D-optimal design for Setting 5

Figure 6. D-optimal design for Setting 5 by the REX method

Table 9.
Values of the D- and A-optimality objective functions for Setting 5

Methods         REX      CO       VDM      multiplicative  proposed algorithm
−log(|Σ(x)|)    53.2852  53.5729  52.3201  52.7256         50.5689
tr(Σ(x)^{-1})   72.0000  −  −
Figure 7.
The D-efficiency for Setting 4 when N = 21 × 21 × 21. The vertical axis denotes the D-efficiency. The horizontal axis corresponds to the computation time (in seconds).
Figure 8.
The D-efficiency for Setting 4 when N = 101 × 101 × 101. The vertical axis denotes the D-efficiency. The horizontal axis corresponds to the computation time (in seconds).
Figure 9.
The D-efficiency for Setting 4 when N = 201 × 201 × 201. The vertical axis denotes the D-efficiency. The horizontal axis corresponds to the computation time (in seconds).
Figure 10.
The A-efficiency for Setting 4 when N = 21 × 21 × 21. The vertical axis denotes the A-efficiency. The horizontal axis corresponds to the computation time (in seconds).
Figure 11.
The A-efficiency for Setting 4 when N = 101 × 101 × 101. The vertical axis denotes the A-efficiency. The horizontal axis corresponds to the computation time (in seconds).
Figure 12.
The A-efficiency for Setting 4 when N = 201 × 201 × 201. The vertical axis denotes the A-efficiency. The horizontal axis corresponds to the computation time (in seconds).
Figure 13.
The log-efficiency −log(1 − eff) of the proposed method with q = 8, 10, 12, 14 for model (9), where eff is the lower bound of the D-efficiency of the design. The values 1, 2, 3, ... on the vertical axis correspond to D-efficiency 0.9, 0.99, 0.999, .... The horizontal axis corresponds to the number of iterations.

Figure 14.
The D-criterion values of the proposed method with q = 8, 10, 12, 14 for the model in Eq. (9). The vertical axis denotes the D-criterion values and the horizontal axis denotes the number of iterations.