Improving D-Optimality in Nonlinear Situations
Hana Sulieman
Department of Mathematics and Statistics
American University of Sharjah, P.O. Box 26666, Sharjah, UAE
[email protected]
Abstract
Experimental designs based on the classical D-optimal criterion minimize the volume of the linear-approximation inference regions for the parameters using local sensitivity coefficients. For nonlinear models, these designs can be unreliable because the linearized inference regions do not always provide a true indication of the exact parameter inference regions. In this article, we apply the profile-based sensitivity coefficients developed by Sulieman et al. [12] in designing D-optimal experiments for parameter estimation in some selected nonlinear models. Profile-based sensitivity coefficients are defined by the total derivative of the model function with respect to the parameters. They have been shown to account for both parameter co-dependencies and model nonlinearity up to second-order derivatives. This work represents a first attempt to construct experiments using profile-based sensitivity coefficients. Two common nonlinear models are used to illustrate the computational aspects of the profile-based designs, and simulation studies are conducted to demonstrate the efficiency of the constructed experiments.
Keywords: D-optimality; Local sensitivity coefficient; Profile-based sensitivity coefficient; Sequential design.

1 Introduction
Design of experiments has been a very active research area in many scientific fields for the past two decades. For linear models, the theory of optimal designs is well established in the literature, and the properties of these designs are fairly well understood and used in various applications. For nonlinear models, on the other hand, the theory of design optimality is still emerging. The major difficulty when the underlying model is nonlinear is that the optimal design depends on the true values of the parameters; hence, poor estimates of the unknown parameter values generate poor designs. Several design-of-experiment techniques have been developed and applied successfully to a wide range of model systems (Franceschini and Macchietto [8]; Berger and Wong [4]). The objectives of these techniques typically focus on model precision and/or model discrimination. D-optimality is one of the most popular design criteria used. The criterion minimizes the volume of the linear-approximation inference regions for the parameters. The measure of information content used in D-optimal designs involves local sensitivity coefficients defined by the first-order derivative matrix of the model function with respect to the parameters; hence, the resulting designs are termed locally optimal. Locally optimal designs can be unreliable for highly nonlinear model functions. The works by Hamilton and Watts [10] and Vila and Gauchi [17] represent examples of successful attempts to take model nonlinearity into account in the design formulation. Sulieman et al. [12, 13] proposed a profile-based nonlinear sensitivity measure which simultaneously accounts for model nonlinearity and correlations among parameter estimates. Applications of the measure to different models by Sulieman et al. [15] have shown that it gives a more reliable reflection of the sensitivity behavior of the model to the parameters than that given by local sensitivity measures.
The profile-based sensitivity measure is defined by the total derivative of the model function with respect to the parameter of interest. Hence, like any derivative measure, it is inherently local; it provides, however, a broader picture of the model sensitivity in the presence of parameter co-dependencies and model nonlinearity. The primary goal of this article is to employ the profile-based sensitivity information in the construction of D-optimal designs. The resulting designs are compared with the classical local D-optimal designs in which the conventional local sensitivity coefficients are used. In Section 2 we give a brief review of the profile-based sensitivity measure and discuss its characteristics. In Section 3 we construct the profile-based D-optimal design and discuss its relation to the classical D-optimal design. Illustrative model cases are presented in Section 4, and conclusions are summarized in Section 5.
2 The Profile-Based Sensitivity Measure

Let us consider the general mathematical form of a single-response nonlinear regression model

y = f(X, Θ) + ε    (1)

where y is an n-element vector of observed values of the response variable for particular values of the m regressor variables X = {x_1, x_2, ..., x_m}, each x_i being an n-element vector of experimental settings. Θ is a k-element vector of unknown parameters, f is an n-element vector of predicted values of the response variable for given X and Θ, f(X, Θ) = {f(x_1, Θ), f(x_2, Θ), ..., f(x_m, Θ)}, and ε is an n-element vector of independent random errors with a specified joint distribution. In most cases, including the case here, ε is assumed to have a spherical normal distribution, with E(ε) = 0 and var(ε) = E(εε′) = σ²I. To emphasize the dependence of the predicted response values on the parameters Θ, the above model is expressed as:

y = η(Θ) + ε    (2)

where the jth element of the n-dimensional vector η(Θ) = (η_1(Θ), η_2(Θ), ..., η_n(Θ))′ is given by

η_j(Θ) = f(x_j, Θ),  j = 1, 2, ..., n.    (3)

Conventionally, the sensitivity of model predictions to variation in parameter values is measured by the first-order partial derivative of the predicted response function, η(Θ), with respect to the parameters. Sulieman et al. [12] proposed partitioning the k-element parameter vector Θ into Θ = (θ_i, Θ_{-i}) and η(Θ) into η(θ_i, Θ_{-i}), where θ_i is the parameter of interest for which sensitivity is measured. θ_i is varied across a specified range of uncertainty so that for each fixed value of θ_i, the conditional estimates of the remaining parameters Θ_{-i} are obtained. Accordingly, a sensitivity measure is defined by the total derivative of the predicted response with respect to θ_i. It is given by:

p_i(x) = Dη(θ_i, Θ̃_{-i}(θ_i)) / Dθ_i    (4)

where η = f(x, Θ) is the predicted response at a selected point x and the notation D/Dθ_i denotes the total derivative with respect to θ_i.
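To make the total-derivative definition in equation (4) concrete, the following sketch computes p_i numerically for a hypothetical two-parameter Michaelis-Menten-type model: θ_2 is re-estimated conditionally each time θ_1 is perturbed, and the total derivative is taken by a central difference. The design, parameter values, and noise level are illustrative assumptions, not taken from the paper, and scipy stands in for the paper's MATLAB routines.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical two-parameter model: eta(x, Theta) = theta1 * x / (theta2 + x)
def eta(x, th1, th2):
    return th1 * x / (th2 + x)

# Illustrative design and simulated data (assumed values, not from the paper)
x = np.array([0.02, 0.06, 0.11, 0.22, 0.56, 1.10])
rng = np.random.default_rng(0)
y = eta(x, 210.0, 0.06) + rng.normal(0.0, 3.0, x.size)

def theta2_conditional(th1):
    """Conditional least squares estimate of theta2 for a fixed theta1."""
    sse = lambda th2: np.sum((y - eta(x, th1, th2)) ** 2)
    return minimize_scalar(sse, bounds=(1e-4, 1.0), method="bounded").x

def profile_sensitivity_theta1(xpts, th1, h=1e-3):
    """Equation (4): total derivative D eta / D theta1 by central difference,
    re-estimating theta2 conditionally as theta1 is perturbed."""
    up = eta(xpts, th1 + h, theta2_conditional(th1 + h))
    dn = eta(xpts, th1 - h, theta2_conditional(th1 - h))
    return (up - dn) / (2.0 * h)

p1 = profile_sensitivity_theta1(x, 210.0)   # profile-based coefficients
v1 = x / (theta2_conditional(210.0) + x)    # local coefficients d eta / d theta1
print(p1 - v1)  # nonzero differences reflect the co-dependence of theta1 and theta2
```

The difference p1 − v1 is exactly the correction term that equation (5) below expresses in closed form through the Hessian of the sum of squares.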
Based on the least squares estimation criterion, equation (4) can be expressed as:

p_i(x) = ∂η/∂θ_i − (∂η/∂Θ_{-i})′ (∂²S/∂Θ_{-i}∂Θ′_{-i})^{-1} (∂²S/∂θ_i∂Θ_{-i}) |_{Θ̃_{-i}}    (5)

where S is the sum of squares function given by S(Θ) = Σ_{j=1}^{n} (y_j − η_j(Θ))² and Θ̃_{-i} is the conditional least squares estimate of Θ_{-i} given θ_i. The first term in equation (5), ∂η/∂θ_i, gives the conventional sensitivity coefficient of the predicted response with respect to θ_i. The second term is a correction term that involves the marginal effects of Θ_{-i} on η weighted by the correlations among the elements of Θ̃_{-i} and the correlations between θ̂_i and Θ̃_{-i}. These parameter correlations are based on the observed value of the Hessian matrix, H(Θ) = ∂²S(Θ)/∂Θ∂Θ′. p_i(x) is called the Profile-based Sensitivity Coefficient after the profiling algorithm used to assess the extent of nonlinearity in the model and construct likelihood regions for parameter estimates (Bates and Watts [1]). While p_i(x) is a local derivative-based measure, the incorporation of the correlation structure, based on the Hessian terms above, makes it a more reliable measure of sensitivity as it accounts for simultaneous changes in the parameter values and for the nonlinearity of the model.

p_i(x) can be expressed in terms of the first- and second-order derivative information of the model function η(Θ) as:

p_i(x) = v_i − v′_{-i} (V′_{-i}V_{-i} − [e′][V_{-i-i}])^{-1} (V′_{-i}v_i − V′_{-ii}e)    (6)

where v_i is the ith component of the k-element first-order derivative vector v = ∂η(Θ)/∂Θ evaluated at x; V_{-i} is the n × (k−1) matrix consisting of the first-derivative vectors of η(Θ) with respect to Θ_{-i}; v_{-i} is the (k−1)-dimensional vector consisting of the elements in the row of V_{-i} that corresponds to x; V_{-i-i} is the n × (k−1) × (k−1) array of the second-order derivatives of η(Θ) with respect to Θ_{-i}; V_{-ii} is the n × (k−1) matrix of the second derivatives of η(Θ) with respect to Θ_{-i} and θ_i; and e is the n-element residual vector. The elements of the (k−1) × (k−1) matrix [e′][V_{-i-i}] give the projections of the second-order derivative vectors on the residual vector. Similarly, the matrix V′_{-ii}e gives the model function curvature in the θ_i direction projected on the residual vector. Since the residual vector is orthogonal to the tangent plane spanned by the first-order derivative vectors, [e′][V_{-i-i}] is a function of only the intrinsic-curvature portion of the second-order derivative array. For significant intrinsic model nonlinearity, these projections play a major role in the extent to which the profile-based nonlinear sensitivities differ from the local linear sensitivities. The quantities in equation (6) are evaluated at (θ̂_i, Θ̃_{-i}(θ̂_i)).

In vector notation, equation (6) can be expressed as:

p_i = v_i − V_{-i} (V′_{-i}V_{-i} − [e′][V_{-i-i}])^{-1} (V′_{-i}v_i − V′_{-ii}e)    (7)

where p_i is the n × 1 vector of profile-based sensitivity coefficients for θ_i evaluated at the n prediction points and v_i is the corresponding vector of local sensitivity coefficients. If the linear approximation to the model function is adequate, i.e., the Hessian terms can be set to zero, or when the model fits the data exactly (e = 0), the vector of profile-based sensitivity coefficients in equation (7) reduces to:

p_i = v_i − V_{-i} (V′_{-i}V_{-i})^{-1} V′_{-i} v_i    (8)

which can be expressed as:

p_i = [I_n − P_{V_{-i}}] v_i    (9)

where I_n is the n-dimensional identity matrix and P_{V_{-i}} = V_{-i}(V′_{-i}V_{-i})^{-1}V′_{-i} is the matrix that orthogonally projects onto the column space of V_{-i}. The projection matrix I_n − P_{V_{-i}} projects onto the orthogonal complement of the space spanned by the columns of V_{-i}. A close examination of the formulation given in equation (9) reveals that p_i is the vector of least squares residuals obtained from linearly regressing v_i on the regressor variables given by V_{-i}.
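Equation (9) says that p_i is simply the residual from regressing the local sensitivity vector v_i on the remaining columns V_{-i}. A minimal numpy sketch, using an arbitrary illustrative sensitivity matrix, is:

```python
import numpy as np

def profile_coefficients(V, i):
    """Equation (9): p_i = (I - P_{V_{-i}}) v_i, i.e. the least squares
    residuals from regressing the local sensitivity vector v_i on V_{-i}."""
    v_i = V[:, i]
    V_mi = np.delete(V, i, axis=1)
    beta, *_ = np.linalg.lstsq(V_mi, v_i, rcond=None)
    return v_i - V_mi @ beta

# Arbitrary illustrative local sensitivity matrix (n = 5 points, k = 3 parameters)
rng = np.random.default_rng(1)
V = rng.normal(size=(5, 3))
p0 = profile_coefficients(V, 0)
# By construction p_0 lies in the orthogonal complement of span(V_{-0}):
print(np.allclose(np.delete(V, 0, axis=1).T @ p0, 0.0, atol=1e-8))  # True
```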
Note that the vector v_i can be orthogonally decomposed as:

v_i = P_{V_{-i}} v_i + [I_n − P_{V_{-i}}] v_i    (10)

If the local effect of θ_i on the predicted response is highly correlated with the effects of the remaining parameters, Θ_{-i}, the first term in equation (10) will have a large magnitude compared to the second term, indicating a small magnitude of the profile-based sensitivities. On the other hand, weak correlations among the effects of the parameters on the predicted response indicate a large magnitude of the profile-based sensitivities. That is, p_i measures the influence that θ_i exerts on the predicted response after the removal of its co-dependencies with the remaining parameters. With this understanding, profile-based sensitivities represent a particular orthogonalization of the parameter space so that the individual impacts of the resulting parameters are more independent of each other than those of the original parameters Θ. The inclusion of the second-order derivatives in equation (7) means that model nonlinearity is accounted for in this particular orthogonalization. Sulieman et al. [16] described the profile-based sensitivity procedure using the notion of model re-parametrization. They showed that the slopes of the profile traces in the re-parametrized model represent the foundation for the underlying definition of the profile-based sensitivity measure. Sulieman et al. [13] extended the profile-based sensitivity assessment to parameter estimation in multi-response regression models. Sulieman et al. [15] presented a comparative analysis of the profile-based sensitivity and the Fourier Amplitude Sensitivity Test (FAST). They showed that while FAST accounts for model nonlinearity to all orders, it fails to account for the parameter co-dependencies that are considered in the profile-based sensitivity measure.
3 Profile-Based D-Optimal Design

D-optimal designs are among the most commonly used alphabetic designs. A D-optimal design minimizes the volume of the parameter joint inference region or, equivalently, maximizes the determinant of the Fisher information matrix with respect to the design settings. Box and Lucas [5] gave the first formulation and geometric interpretation of the D-optimal design for nonlinear models. They defined the D-optimality objective function, using the local sensitivity coefficients, as maximizing:

D = |V′V|    (11)

with respect to the design settings x, where the matrix of local sensitivity coefficients V is evaluated at an initial parameter estimate Θ⁰. Under the linear approximation, the model response surface is replaced by its tangent plane and the usual ellipsoidal joint inference region for Θ is the image in parameter space of a spherical region on the tangent plane. The volume of the ellipsoidal region evaluated at Θ⁰ is proportional to |V′V|^{-1/2}. By maximizing D, this volume is minimized. Thus, for a given nonlinear model response, the D-optimal criterion ensures that the design is such that large regions on the tangent plane map into small regions in the parameter space. When model nonlinearity is pronounced, local D-optimality can produce designs with poor performance and little information about the parameters. Hamilton and Watts [10] introduced quadratic designs based on a second-order approximation to the volume of the inference region of Θ. Quadratic designs have the distinct advantage of taking the nonlinearity of the response function into account. Benabbas et al. [3] proposed a curvature-based method for optimal experimental design for parameter estimation in multi-response nonlinear dynamic models.
Vila and Gauchi [17] constructed non-sequential optimal designs based on the expected volume of exact parameter confidence regions. These designs generally result in repeated experiments on k support points for models with k parameters and tend to reduce parameter nonlinearities. Gao and Zhou [9] developed a nonlinear D-optimal design criterion based on the second-order least squares estimator for regression models with asymmetric error distributions.

Using the profile-based sensitivity coefficients defined in equation (7), profile-based D-optimality can be defined as maximizing:

D_P = |P′P|    (12)

with respect to x, where the matrix P = [p_1 p_2 ... p_k] is evaluated at Θ⁰, i.e., each element p_i is evaluated at Θ⁰. As discussed in the previous section, profile-based sensitivity coefficients orthogonalize the parameter space so that the resulting parameters are less correlated than the original parameters. By maximizing D_P, the volume of the inference region in the less-correlated parameter space is minimized. Hence, the resulting design produces more precise and less correlated parameter estimates than the corresponding local D-optimal design. The (i, j)th element of the k × k matrix P′P is given by:

p′_i p_j = v′_i v_j − v′_i V_{-j} H^{-1}_{-j-j} h_{-jj} − h′_{-ii} H^{-1}_{-i-i} V′_{-i} v_j + h′_{-ii} H^{-1}_{-i-i} V′_{-i} V_{-j} H^{-1}_{-j-j} h_{-jj}    (13)

where H_{-i-i} = V′_{-i}V_{-i} − [e′][V_{-i-i}] is the (k−1) × (k−1) Hessian matrix for Θ_{-i} and h_{-ii} = V′_{-i}v_i − V′_{-ii}e is the (k−1)-vector of Hessian terms for θ_i and Θ_{-i}. Similarly, the matrix H_{-j-j} and the vector h_{-jj} contain the corresponding Hessian terms for Θ_{-j} and θ_j. The first term in equation (13) is equal to the (i, j)th element of the matrix V′V used in the local D-optimality criterion. The remaining terms are proportionate to pairwise products of various co-dependency structures among the conditional parameter estimates in Θ̃_{-i} and those in Θ̃_{-j}, in addition to the corresponding co-dependencies with the conditioning parameters θ_i and θ_j.
Thus, the objective function in D_P-optimality is equal to the objective function in the conventional D-optimality corrected for the correlation structures among the parameters. Equation (13) suggests that the maximum D_P is obtained by maximizing D (the first term) and minimizing the magnitudes of the middle two terms representing the correlation structures. Hence D_P maximizes the information content of the design while accounting for the correlations among parameter estimates that result from an existing design and/or the model formulation. These correlations are ignored by the D-optimal criterion. Like the D-optimal designs, the D_P-optimal designs are invariant under nonsingular transformations of the parameters.

The ability of the D-optimal design to estimate the model parameters relative to the D_P-optimal design is measured by its D-efficiency, defined by:

D_eff = (|V′V| / |P′P|)^{1/k} × 100%    (14)

D_eff gives the percentage of the experimental effort of the D-optimal design required by the D_P-optimal design in order to produce parameter estimates of the same precision.

To carry out the computations for constructing the optimal D and D_P designs, we initially evaluate the two criteria at a selected set of fine grid points in the design region using the initial parameter values Θ⁰. We then apply a minimization algorithm to each of D^{-1} and D_P^{-1} using one of the two multivariable nonlinear minimization routines found in MATLAB's Optimization Toolbox: fminsearch for unconstrained optimization and fmincon for constrained optimization. The starting point x⁰ used for the minimization is the point at which D or D_P is maximized in the initial grid search. To ensure that the resulting optimal design point is a global one, a further grid search exploring the interior of the design region is carried out. A simulation study is conducted for each model example in order to evaluate the performance of the constructed designs.
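The grid-search-then-refine procedure above can be sketched for the linear-approximation criterion D = |V′V| applied to a two-point Michaelis-Menten design. This is an illustrative reimplementation in Python/scipy rather than the paper's MATLAB code; the preliminary value θ_2 = 0.1 and the bound x_max = 1.1 follow the Michaelis-Menten example of Section 4.

```python
import numpy as np
from scipy.optimize import minimize

theta2 = 0.1     # preliminary parameter value (the design depends on theta2 only)
x_max = 1.1

def V_matrix(design, th2):
    """Local sensitivity matrix for a Michaelis-Menten response at the design."""
    x = np.asarray(design)
    v1 = x / (th2 + x)            # d eta / d theta1 (with theta1 set to 1)
    v2 = -x / (th2 + x) ** 2      # d eta / d theta2, up to the factor theta1
    return np.column_stack([v1, v2])

def neg_log_D(design):
    V = V_matrix(design, theta2)
    sign, logdet = np.linalg.slogdet(V.T @ V)
    return np.inf if sign <= 0 else -logdet

# Coarse grid search for a starting point, then constrained refinement
grid = np.linspace(0.01, x_max, 60)
pairs = [(a, b) for a in grid for b in grid if a < b]
x0 = min(pairs, key=neg_log_D)
res = minimize(neg_log_D, x0, bounds=[(1e-3, x_max)] * 2)
print(res.x)  # approximately [theta2*x_max/(2*theta2 + x_max), x_max] ~ [0.085, 1.1]
```

The refined optimum reproduces the known analytic result for this model: one support point at the maximum concentration and one near the half-concentration, consistent with the design discussed in Section 4.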
4 Model Examples

4.1 The Michaelis-Menten Model

The Michaelis-Menten model is one of the most widely used models in the biological sciences. It is commonly used in enzymatic kinetics with the well-known formulation:

y = θ_1 x / (θ_2 + x) + ε    (15)

where y is the measured initial velocity of an enzymatic reaction and x is the substrate concentration. The unknown parameters θ_1 and θ_2 represent the maximum conversion rate and the Michaelis-Menten constant, respectively. One of the most widely used data sets for fitting the model is published in Bates and Watts [1], representing reaction velocity measurements with enzyme treated with Puromycin and with untreated enzyme. Table 1 depicts the design and velocity values for the treated enzyme.

Table 1: Data set for the Michaelis-Menten model reported in Bates & Watts [1] (columns: observation no., substrate concentration (ppm), velocity for the treated enzyme (counts/min²))

This original design was used to obtain the least squares estimates of θ_1 and θ_2, namely θ̂_1 and θ̂_2. Figure 1 depicts the 90% likelihood inference regions for the two parameters based on unconditional and conditional least squares estimation of the model. The unconditional likelihood contour (solid line) was determined by evaluating the sum of squares function for an arbitrary grid of (θ_1, θ_2) values in their respective ranges of uncertainty.
Figure 1: 90% joint likelihood regions for the parameters in the Michaelis-Menten model based on unconditional least squares estimation (solid line) and conditional least squares estimation (dashed-dotted line).

The unconditional least squares estimate is (θ̂_1 = 212.68, θ̂_2 = 0.064). The conditional likelihood contour (dashed-dotted line) was determined by evaluating the sum of squares function at (θ̃_1(θ_2), θ̃_2(θ_1)), where θ̃_2(θ_1) is the estimate of θ_2 conditional on selected values of θ_1 in its range of variation and θ̃_1(θ_2) is the least squares estimate of θ_1 conditional on selected values of θ_2 in its range of variation. It is evident from Figure 1 that the 90% conditional likelihood inference region is less elliptical, with a reduced inclination reflecting decreased correlation between the two parameters. The conditional estimation procedure reduces the correlations induced by the simultaneous search for the optimal least squares estimates in the θ_1 and θ_2 directions. Hence, D_P-optimality provides a basis for designing experimental settings that produce less correlated parameter estimates.
Figure 2: 95% approximate confidence regions from three designs for estimating the Michaelis-Menten model parameters θ_1 and θ_2. The largest region (solid line) represents the original design (Table 1), the middle region (dotted line) represents the D-optimal design, and the smallest region (dashed line) represents the D_P-optimal design.

Using the local D-optimal criterion in equation (11), Bates and Watts [1] constructed a starting design for the model in equation (15). They showed that the design does not depend on the conditionally linear parameter θ_1 and used the preliminary value θ_2 = 0.1. The maximum D occurred at x = 1.1 and x = 0.085, where the value 1.1 is the maximum concentration reported in the original design given in Table 1 and 0.085 is nearly the half-concentration in the same design. They compared their design to the original design and concluded that their design produced a smaller linear-approximation inference region for θ_1 and θ_2 with a lower correlation between the parameter estimates.

In constructing the starting D_P design, we set the unknown residual vector in equation (7) to zero, so the p_i vectors reduce to the form given in equation (9). We write the model equation (15) as f(x, Θ) = θ_1 g(x, θ_2), where g(x, θ_2) = x/(θ_2 + x). For notational convenience, we write g(x, θ_2) = g, and so f(x, Θ) = θ_1 g. It can be shown that the two-element vectors of profile-based sensitivity for θ_1 and θ_2 are given by:

p_1 = [I − g_{θ_2}(g′_{θ_2} g_{θ_2})^{-1} g′_{θ_2}] g    (16)

p_2 = θ_1 [I − g(g′g)^{-1}g′] g_{θ_2} ∝ [I − g(g′g)^{-1}g′] g_{θ_2}    (17)

where g = (g(x_1, θ_2), g(x_2, θ_2))′ and g_{θ_2} = (∂g(x_1, θ_2)/∂θ_2, ∂g(x_2, θ_2)/∂θ_2)′. It is obvious that p_1 is free of the conditionally linear parameter θ_1 and that θ_1 appears linearly in p_2 as a proportionality constant multiplying a term that depends on θ_2 only. This implies that the optimum value of D_P is independent of θ_1. As shown by Bates and Watts [1], the classical D-optimal designs are independent of the conditionally linear parameters for most nonlinear models. This property holds for the D_P-optimal designs only when the model is intrinsically linear (V_{-i-i} and V_{-ii} can be set to zero) or when the data fit the model perfectly (e is zero).

Using the MATLAB 8.2.0 optimizer, the locations of the maximum D_P were found at x = 1.1 and a second, smaller concentration point. Analogously to the D-optimal design constructed by Bates and Watts [1], we place six replications at each of the two points. The D-optimal design consists of six replications of x = 1.1 and six replications of x = 0.085.
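The profile-based vectors in equations (16) and (17) are direct projections and can be checked numerically. The sketch below (Python in place of MATLAB; the two-point design {0.085, 1.1} and the preliminary value θ_2 = 0.1 are taken from the discussion above) confirms that p_1 is orthogonal to g_{θ_2} and p_2 is orthogonal to g:

```python
import numpy as np

def g_vecs(x, th2):
    """g(x, theta2) = x/(theta2 + x) and its derivative with respect to theta2."""
    return x / (th2 + x), -x / (th2 + x) ** 2

def resid_projector(u):
    """I - u (u'u)^{-1} u' for a single column vector u."""
    u = u.reshape(-1, 1)
    return np.eye(u.shape[0]) - (u @ u.T) / float(u.T @ u)

x = np.array([0.085, 1.1])      # two-point design from the discussion above
g, g_t2 = g_vecs(x, 0.1)        # preliminary theta2 = 0.1

p1 = resid_projector(g_t2) @ g      # equation (16): free of theta1
p2 = resid_projector(g) @ g_t2      # equation (17) up to the factor theta1
print(float(g_t2 @ p1), float(g @ p2))  # both ~ 0 (orthogonality)
```

Since θ_1 never enters p1 (and scales p2 only as a constant), the code also illustrates why the optimum of D_P is independent of the conditionally linear parameter.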
Figure 2 compares the 95% approximate joint confidence regions for θ_1 and θ_2 using the three designs, assuming that all designs produced the same parameter estimates and residual variance. We can clearly see that the D_P-optimal design produces the smallest joint confidence region (dashed line) and smaller confidence intervals. The correlation between the two parameter estimates is also lowest (0.65) for the D_P-optimal design, compared to 0.68 for the D-optimal design and 0.76 for the original design. The D-efficiency is 95%, indicating that 95% of the optimal-D experimental effort is needed by the optimal D_P design in order to produce parameter estimates of similar accuracy. In other words, the D-optimal design requires 5% more experimental effort in order to obtain parameter estimates as accurate as those given by the D_P design.

Sequential Design

When some experiments have already been performed, a natural way of designing an experiment is to use a sequential method. Sequential designs are appealing because they offer the chance to change strategy after the first round of experiments has been completed and new information is available. Of particular interest is the case when one additional design point is desired. In D-optimality, this is achieved by maximizing |V′_{n+1}V_{n+1}| with respect to the (n+1)th design point, x_{n+1}, where

V_{n+1} = [V_n; v′_{n+1}]    (18)

with V_n the design matrix consisting of the local sensitivity coefficients for the pre-existing experimental settings and v_{n+1} the k-element vector of sensitivity coefficients, appended as a new row, corresponding to the new experimental setting being selected. The corresponding profile-based sequential design strategy maximizes |P′_{n+1}P_{n+1}|, where

P_{n+1} = [P_n; p′_{n+1}]    (19)

and p_{n+1} is the k-element vector of profile-based sensitivity coefficients corresponding to the (n+1)th experimental setting.
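The sequential step of equation (18) can be sketched for the Michaelis-Menten model: append a candidate sensitivity row to V_n and maximize the augmented determinant over the candidate concentration. The existing concentrations below are the Puromycin settings of Table 1 in Bates and Watts [1]; the value θ̂_2 = 0.064 is an assumed estimate used only for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

theta2_hat = 0.064   # assumed estimate from the existing 12 runs (illustrative)
x_existing = np.array([0.02, 0.02, 0.06, 0.06, 0.11, 0.11,
                       0.22, 0.22, 0.56, 0.56, 1.10, 1.10])

def sens_row(x, th2):
    """Local sensitivity row: (d eta/d theta1, d eta/d theta2 up to theta1)."""
    return np.array([x / (th2 + x), -x / (th2 + x) ** 2])

V_n = np.array([sens_row(xi, theta2_hat) for xi in x_existing])

def neg_det_augmented(x_new):
    """-|V_{n+1}' V_{n+1}| after appending the candidate row (equation (18))."""
    V = np.vstack([V_n, sens_row(x_new, theta2_hat)])
    return -np.linalg.det(V.T @ V)

res = minimize_scalar(neg_det_augmented, bounds=(1e-3, 1.1), method="bounded")
print(res.x)  # candidate 13th concentration point maximizing |V'V|
```

Adding any row can only increase |V′V|, so the search simply locates the candidate that increases it the most within the allowed concentration range.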
V_{n+1} and P_{n+1} are evaluated at the least squares parameter estimates from the already existing n experiments. The original design for the Michaelis-Menten model given in Table 1 was used to obtain the least squares estimates of θ_1 and θ_2. These estimates were used to evaluate V_{n+1} and P_{n+1} above. The 13th concentration point was generated using the MATLAB 8.2.0 optimizer for restricted x, 0 < x ≤ x_max = 1.1. Different optimal values of the additional concentration point were obtained: one at which D is maximized and a slightly different one at which D_P is maximized. With one additional design point, the efficiency of the new D_P design is 98%.

Figure 3: Simulated correlation coefficients between θ̂_1 and θ̂_2 in the Michaelis-Menten model using the 13-point sequential designs resulting from D-optimality (solid line) and D_P-optimality (dotted line). The dashed line gives the correlation coefficient (0.77) from the 12-point original design.

Simulation Study
In a simple attempt to evaluate the information content of the 13-point design, formed by adding the new optimal value of x to the existing design in Table 1, the parameters θ_1 and θ_2 were re-estimated twice: once using the new design derived by the D criterion and once using the design derived by the D_P criterion. The response variable for the 13th optimal concentration point is simulated using the fitted model obtained from the original 12-point design and adding normally distributed random noise. The estimation procedure for each design was carried out for 2000 simulations. For each simulation, the estimated linear-approximation variance-covariance matrix, s²(V′V)^{-1}, is evaluated at the least squares estimate Θ̂ and recorded.

Figure 3 shows the resulting correlation coefficients between θ̂_1 and θ̂_2 plotted against the simulation number. The dotted line represents the correlation coefficients resulting from the D_P design, while the solid line represents the corresponding correlations resulting from the D design. The horizontal dashed line gives the correlation coefficient (0.77) resulting from the original design given in Table 1. The line charts of Figure 3 clearly demonstrate that, for the vast majority of the simulations, the D_P design gives a correlation coefficient reduced from that given by the original design, and these correlations are remarkably lower than the correlations generated from the D design. Furthermore, in Figure 4 we present scatter plots of the estimated standard errors of θ̂_1 and θ̂_2 from the 2000 simulations.

Figure 4: Simulated standard errors of θ̂_1 and θ̂_2 in the Michaelis-Menten model using the 13-point sequential designs resulting from D-optimality and D_P-optimality (asterisk points). The solid line is the line of equality.
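The simulation loop described above can be sketched as follows, with scipy's curve_fit supplying the linear-approximation covariance s²(V′V)^{-1}. The added 13th point (0.085), the fitted values, the noise level, and the number of replicates are illustrative assumptions rather than the paper's exact settings:

```python
import numpy as np
from scipy.optimize import curve_fit

def mm(x, t1, t2):
    return t1 * x / (t2 + x)

x12 = np.array([0.02, 0.02, 0.06, 0.06, 0.11, 0.11,
                0.22, 0.22, 0.56, 0.56, 1.10, 1.10])
x13 = np.append(x12, 0.085)           # 13-point design with an assumed new point
th1, th2, sigma = 212.7, 0.064, 10.0  # assumed fit and noise level (illustrative)

rng = np.random.default_rng(42)
corrs = []
for _ in range(200):                  # the paper uses 2000; fewer here for speed
    y = mm(x13, th1, th2) + rng.normal(0.0, sigma, x13.size)
    popt, pcov = curve_fit(mm, x13, y, p0=[200.0, 0.05])
    # correlation from the linear-approximation covariance s^2 (V'V)^{-1}
    corrs.append(pcov[0, 1] / np.sqrt(pcov[0, 0] * pcov[1, 1]))
print(float(np.mean(corrs)))  # average simulated correlation between the estimates
```

Repeating the same loop with a different 13th point allows the D- and D_P-generated candidates to be compared exactly as in Figure 3.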
The parameter standard errors resulting from the D design (se_D) are plotted on the horizontal axis, while the D_P design standard errors (se_{D_P}) are plotted (asterisk points) on the vertical axis. The solid line is the line of equality of se_D and se_{D_P}, so it is easily seen that in a significant portion of the simulations se_{D_P} is less than se_D, reflecting a higher accuracy of the parameter estimates. The average value of the simulated D-efficiency scores is 97.6%, which suggests that the two designs are of comparable efficiency in estimating the two parameters. The D_P design, however, has the distinctive ability to increase the precision of the parameter estimates while reducing their correlations.

4.2 The Hougen-Watson Model

The Hougen-Watson model is common in the chemical kinetics of catalyzed reactions. It expresses the reaction rate in terms of the catalyst variables, the temperature, and the concentrations of the reactants. One of the expressions the model takes is:

f(x, Θ) = θ_1 θ_3 (x_2 − x_3/1.632) / (1 + θ_2 x_1 + θ_3 x_2 + θ_4 x_3)    (20)

where x_1, x_2 and x_3 represent the partial pressures of the reactants. The data set, reported in Bates and Watts [1], consists of 24 runs of the design variables (x_1, x_2, x_3). The model was fitted to this initial 24-point design. Table 2 below shows the results.

Table 2: Summary of parameter estimates for the Hougen-Watson model (columns: parameter, estimate, standard error, correlation, for θ_1 through θ_4)

Sequential D-optimality was applied with the design variables restricted to the ranges x_1 ≥ 100, x_2 ≤ 350 and x_3 ≥ 30. D was evaluated at the corner points of the restricted design region and inside its interior. The combination (x_1 = 100, x_2 = 350, x_3 = 30) was found to be the point at which the optimum D occurred, and hence this corner point was recommended for the next experimental run.

We implemented the same strategy using the MATLAB 8.2.0 optimizer. At first, we evaluated the D_P criterion using equation (7) at the corner points of the design region above and at the original 24 design points. The maximum D_P occurred at the design point (x_1 = 251, x_2 = 294, x_3 ≈ 41); a further search refined the optimum D_P. The 25th design point at which the optimum value was found is (x_1 = 245, x_2 = 300, x_3 = 40).

Simulation Study
Similar to the previous example, we ran 2000 simulations of the Hougen-Watson model estimation in order to evaluate the information content of
Figure 5: Simulated correlation coefficients between the four parameter estimates in the Hougen-Watson model. The solid lines represent correlations from the D-optimal design, the dotted lines represent correlations from the D_P-optimal design, and the horizontal dashed lines give the correlation coefficients from the 24-point original design.

each of the two 25-point designs constructed by the D and D_P criteria. In each simulation, the response variable for the additional point, (x_1 = 100, x_2 = 350, x_3 = 30) for D and (x_1 = 245, x_2 = 300, x_3 = 40) for D_P, was estimated by using the fitted model in Table 2 and adding randomly generated noise from a normal distribution. The results are depicted in Figures 5 and 6. Figure 5 clearly demonstrates that, for the vast majority of the simulations, the magnitudes of the correlations among the parameter estimates are lower for the D_P design (dotted lines) than for the D design (solid lines). The reduction in correlations is most pronounced for the relationships involving θ̂. That is, the location of the additional design point generated by the D_P criterion provides an informative experimental setting that reduces the dependencies between θ̂ and the other parameter estimates, thereby reducing the overall ill-conditioning of the model estimation. The additional design point generated by the classical D criterion gave rise to higher magnitudes of all correlations involving θ̂. As for the correlations involving the other three parameter estimates, the D_P design continued to produce lower values than those produced by the D design. Because the D_P criterion accounts for the correlation structure among the parameters based on second-order derivative information, the resulting correlations from the D_P design are seen to have more volatility than the correlations produced by the classical D design.

Figure 6: Simulated standard errors of the four parameter estimates in the Hougen-Watson model using the 25-point sequential designs resulting from D-optimality and D_P-optimality (asterisk points). The solid line is the line of equality.

Figure 6 shows the scatter plots of the estimated standard errors of the parameter estimates. The solid straight line is the line of equality of the estimated standard errors produced by the D_P design, se_{D_P}, and the corresponding standard errors produced by the D design, se_D. Clearly, the simulated standard errors of the four parameter estimates are lower for the D_P design (asterisk points) than for the D design in substantially all of the simulations. It should be pointed out that, except for θ̂, the simulated standard errors of the other three parameter estimates from the D and D_P designs are in general larger than those reported in Table 2 above. That is, the additional experimental setting in each design was not informative enough to improve the precision of the other three estimates. However, the additional design point in both D and D_P produced a significant improvement in the precision of θ̂ in nearly all of the simulations. The additional design point for the D_P-optimal design was significantly informative in reducing the correlations involving θ̂ and improving its precision.

5 Conclusions

In this article we have applied the profile-based sensitivity coefficients developed by Sulieman et al. [12] in designing experiments for nonlinear models. Given that the profile-based sensitivity coefficients account for both parameter correlations and model nonlinearity, it has been shown that utilizing them in the D-optimal criterion generates more informative experimental settings. Two model examples have been used to demonstrate the computational aspects of the profile-based D-optimal criterion.
Furthermore, simulation studies have shown that the constructed profile-based optimal designs are more efficient and informative than the classical D-optimal designs. Future work will include establishing a more detailed theoretical framework for the proposed design criterion and conducting further comparisons with existing nonlinear design criteria, including those of Hamilton and Watts [10] and Vila and Gauchi [17].

Acknowledgment

The author gratefully acknowledges the financial support of the American University of Sharjah, United Arab Emirates.
References

[1] Bates, D.M. and Watts, D.G. (1988) Nonlinear Regression Analysis and Its Applications. Wiley, New York.

[2] Bates, D.M. and Watts, D.G. (1981) Parameter transformations for improved approximate confidence regions in nonlinear least squares. Annals of Statistics 9: 1152-1167.

[3] Benabbas, L., Asprey, S.P. and Macchietto, S. (2005) Curvature-based methods for designing optimally informative experiments in multiresponse nonlinear dynamic situations. Industrial and Engineering Chemistry Research 44: 7120-7131.

[4] Berger, M.P.F. and Wong, W.K. (2009) An Introduction to Optimal Designs for Social and Biomedical Research. Wiley, Chichester.

[5] Box, G.E.P. and Lucas, H.L. (1959) Design of experiments in non-linear situations. Biometrika 46: 77-90.

[6] Buzzi Ferraris, G. and Donati, G. (1974) A powerful method for Hougen-Watson model parameter estimation with integral conversion data. Chemical Engineering Science 29: 1504-1509.

[7] Clarke, G.P.Y. (1987) Approximate confidence limits for a parameter function in nonlinear regression. Journal of the American Statistical Association.

[8] Franceschini, G. and Macchietto, S. (2008) Model-based design of experiments for parameter precision: State of the art. Chemical Engineering Science 63: 4846-4872.

[9] Gao, L.L. and Zhou, J. (2017) D-optimal designs based on the second-order least squares estimator. Statistical Papers 58: 77-94.

[10] Hamilton, D.C. and Watts, D.G. (1985) A quadratic design criterion for precise estimation in nonlinear regression models. Technometrics 27: 241-250.

[11] Ryan, T.P. (2007) Modern Experimental Design. John Wiley & Sons, Inc., Hoboken, New Jersey.

[12] Sulieman, H., McLellan, P.J. and Bacon, D.W. (2001) A profile-based approach to parametric sensitivity analysis of nonlinear regression models. Technometrics 43: 425-433.

[13] Sulieman, H., McLellan, P.J. and Bacon, D.W. (2004) A profile-based approach to parametric sensitivity in multiresponse regression models. Computational Statistics & Data Analysis 45: 721-740.

[14] Sulieman, H. (2008) Improved local sensitivity measures for regression models with correlated parameters. Proceedings in Computational Statistics, 18th Symposium, Porto, Portugal.

[15] Sulieman, H., Kucuk, I. and McLellan, J. (2009) Parametric sensitivity: A case study comparison. Computational Statistics & Data Analysis.

[16] Sulieman, H. et al. Journal of the Franklin Institute.

[17] Vila, J.P. and Gauchi, J.P. (2007) Optimal designs based on exact confidence regions for parameter estimation of a nonlinear regression model. Journal of Statistical Planning and Inference 137: 2935-2953.