Experimental Designs for Accelerated Degradation Tests Based on Linear Mixed Effects Models
EExperimental Designs for Accelerated Degradation Tests Based onLinear Mixed Effects Models
Helmi Shat a, ˚ , Rainer Schwabe a a Institute for Mathematical Stochastics, Otto-von-Guericke University Magdeburg,Universit¨atsplatz 2, 39106 Magdeburg, Germany
Abstract
Accelerated degradation tests are used to provide accurate estimation of lifetime properties of highlyreliable products within a relatively short testing time. There data from particular tests at high levelsof stress (e. g. temperature, voltage, or vibration) are extrapolated, through a physically meaningfulmodel, to obtain estimates of lifetime quantiles under normal use conditions. In this work, we considerrepeated measures accelerated degradation tests with multiple stress variables, where the degradationpaths are assumed to follow a linear mixed effects model which is quite common in settings whenrepeated measures are made. We derive optimal experimental designs for minimizing the asymptoticvariance for estimating the median failure time under normal use conditions when the time points formeasurements are either fixed in advance or are also to be optimized.
Keywords:
Accelerated degradation test, linear mixed effects model, failure time distribution, locallyoptimal design, destructive testing.
1. Introduction
Rapid advances in modern manufacturing technologies, along with market needs for sustainableand reliable products, have motivated industrial companies to design and manufacture products thatcan operate without failure for years or even decades. As a consequence, manufacturers are demandedto provide their customers with accurate information about the reliability of their products. However,when the products get more reliable, it becomes more difficult or even unfeasible to assess a sufficientamount of lifetime data on the basis of traditional reliability testing in order to accurately estimatecharacteristics of the lifetime distribution of the products because failure or fatigue can hardly beobserved under normal use conditions in a reasonable time period for testing. As an alternative,for highly reliable and enduring products, Accelerated Degradation Tests can be utilized to providesufficient information on the deterioration of the products to obtain a sufficiently accurate estimate oflifetime properties within a relatively short testing time period. In Accelerated Degradation Testingproducts are tested at various elevated stress levels (for e. g. temperature, voltage, or vibration), andthe grade of degradation of the product under these stress levels is measured over a tractable timeperiod. The measured data are then extrapolated, through a technically reasonable statistical model,to obtain estimates of lifetime characteristics under normal use conditions. The precision of theestimates is influenced by several factors, such as the number of units tested, the duration of thetesting period, the frequency of measurements, and, on particular, the choice of the stress levels towhich the units are exposed.A vast amount of literature is devoted to the analysis of Accelerated Degradation Tests, see,for example, (Meeker and Escobar, 2014) for a comprehensive survey on various approaches in theliterature used to assess reliable information from degradation data. In more detail, (Bagdonaviciusand Nikulin, 2001) present models and methods of statistical analysis for Accelerated DegradationTests and further references can be found there. As additional sources, (Nelson, 2005) providesan extensive list of references related to accelerated test planning and (Limon et al., 2017) reviewprominent methods for statistical inference and optimal design of accelerated testing plans. Theredifferent types of test planning strategies are categorized according to their merits and drawbacksand research trends are provided. (Li and Kececioglu, 2006) presents an analytical method forthe optimum planning of Accelerated Degradation Tests with an application to the reliability of ˚ Corresponding author
Email addresses: [email protected] (Helmi Shat), [email protected] (Rainer Schwabe) a r X i v : . [ s t a t . A P ] F e b ight-Emitting Diodes. There the author states that the variability of the measured units havea substantial impact on the accuracy of estimation. Therefore these random effects should beencountered in the choice of the experimental settings for the Accelerated Degradation Tests. Basedon the observation that ignoring the variability in the normal use conditions may lead to significantprediction errors, (Liao and Elsayed, 2006) extend Accelerated Degradation Test models to predictfield reliability by considering variations in the stress levels by considering a degradation processrepresented by a Brownian motion with linear drift via a stochastic differential equation. (Wanget al., 2017) propose a M -optimality criterion for designing constant stress Accelerated DegradationTests when the degradation path can be represented by an inverse Gaussian process with covariatesand random effects. This criterion focuses on a degradation mechanism equivalence rather thanon the evaluation precision or the prediction accuracy which are usually employed in traditionaloptimization criteria. Those authors prove that, with a slightly relaxed requirement of predictionaccuracy, the obtained optimum designs minimize the dispersion of the estimated acceleration factorbetween the normal stress level and a higher accelerated stress level. Wiener processes (Brownianmotions) are intensively used to represent degradation paths in Accelerated Degradation Testing,see (Ye et al., 2015) and (Guan et al., 2016). For instance, (Lim and Yum, 2011) develop optimalAccelerated Degradation Test plans assuming that the constant stress loading method is employedand the degradation characteristics follows a Wiener process. These authors determine the test stresslevels and the proportion of test units allocated to each stress level such that the asymptotic varianceof the maximum likelihood estimator of a particular quantile of the lifetime distribution at thenormal use condition is minimized. In addition, compromise plans are also developed for checking thevalidity of the relationship between the model parameters and the stress variable. In a case study forrandom effects in degradation of semiconductors, (Lu et al., 1997) propose a repeated measurementsmodel with random regression coefficients and a standard deviation function for analyzing lineardegradation data. The authors utilize several large sample interval estimation procedures to estimatethe failure time distribution and its quantiles.On the other hand, the general theory of optimal design of experiments is well developed in themathematical context of approximate designs which allow for analytical solutions (see e. g. (Silvey,1980) or (Atkinson et al., 2007)). In addition, (Schwabe, 1996b) deals with the theory of optimaldesigns for multi-factor models which can be used here to treat more than one stress variable andthe choice of time plans simultaneously under various interaction structures. In the presence ofrandom effects, (Entholzner et al., 2005) derive that for single samples the optimal designs for fixedeffects models retain their optimality for linear optimality criteria. (Debusho and Haines, 2008)show that this also holds for D -optimality in linear models when only the intercept is random.However, in a multi-sample situation (Schmelter, 2007) and (Schwabe and Schmelter, 2008) exhibitthat the variability of the intercept has a non-negligible influence on the D -optimal design. Inthe case of random slope effects this dependence already occurs in single samples as outlines by(Schmelter et al., 2007). (Dette et al., 2010) consider the problem of constructing D -optimal designsfor linear and nonlinear random effect models with applications in population pharmacokinetics.These authors present a new approach to determine efficient designs for nonlinear least squaresestimation which addresses the problem of additional correlation between observations within units.Based on geometrical arguments, (Graßhoff et al., 2012) derive D -optimal designs for randomcoefficient regression models when only one observation is available per unit, a situation which occursin destructive testing. (Mentre et al., 1997) present an approach to optimal design of experiments forrandom effects regression models in the presence of cost functions related to costs per unit and costsper measurement with applications to toxicokinetics.The present approach is based on the discussion paper by (Weaver and Meeker, 2014) in whichtwo case studies are introduced for optimal planning of repeated measures Accelerated DegradationTests. There the authors consider the influence of a single stress variable and use a criterion based ona large-sample approximation of the precision for estimating a quantile of the failure-time distributionunder normal use conditions. We will adopt this approach, generalize the results presented there tomore general models, and extend the design optimization also to generate optimal time plans.The present paper is organized as follows. Section 2 starts with a motivation example basedon a case study in (Weaver and Meeker, 2014). In Sections 3, 4 and 5 we state the general modelformulation, specify the maximum-likelihood estimation and exhibit the corresponding informationmatrix. Basic concepts of optimal design theory in the present context are collected in Section 6while Section 7 is devoted to the idea of soft failure due to degradation, where we derive thedesign optimality criterion for estimating a quantile of the failure time distribution under normaluse conditions. In Section 8 optimal designs are characterized when the time plan for repeatedmeasurements at the testing units is fixed in advance. In Section 9 also the measurement times are2 igure 1: Observed degradation paths (left panel) and corresponding mean degradation paths (right panel) optimized under the constraint that measurements are taken according to the same time plan forall units, and in Section 10 optimal measurement times are determined in the setting of destructivetesting. The paper closes with a short discussion in Section 11. Throughout the paper the theoreticalconcepts are illustrated by two accompanying running examples. Some technical result is deferred toan Appendix.
2. Introductory example
Before formulating our general degradation model in section 3, we start in this section formotivation with the description of a simple introductory example based on (Weaver and Meeker,2014).
Example 1.
The model proposed in (Weaver and Meeker, 2014) is a linear mixed effect model with asingle stress variable x . In this model there are n testing units for which degradation y ij is observed at k time points t j , j “ , ..., k . The (standardized) stress variable x can be chosen by the experimenterfrom the design region X “ r , s .On the unit level the response y ij for the degradation of testing unit i at time t j is represented by y ij “ β i, ` β x i ` β i, t j ` β x i t j ` ε ij , (2.1) where the intercept β i, is the mean degradation of unit i at time t “ under the stress level x “ , β is the common (not unit specific) mean increase in degradation depending on the stress variable x , β i, is the mean increase in degradation of unit i over time t when the stress level is set to x “ ,and β is the interaction effect between time and stress. The measurement errors ε ij are assumed tobe realizations of a normally distributed error variable with mean zero and error variance σ ε .On the whole experiment level the unit specific parameters p β i, , β i, q T of the units are assumedto be realizations of a bivariate normal distribution with mean p β , β q T and a variance covariancematrix Σ “ ˆ σ ρσ σ ρσ σ σ ˙ . All random effect parameters and measurement errors are assumedto be independent both within as well as between units. Under well controlled measuring testingconditions, the variability of the response is completely described by both the unit to unit variability Σ and the within unit variability of the measurement errors.To illustrate the situation some virtual degradation paths y ij , j “ , ..., k , are depicted in Figure 1(left panel) for three different values of the stress variable x . There are three units shown at eachvalue of the stress level ( n “ ) and k “ equally spaced measurement times t j . The roughness ofthe paths is due to the measurement errors ε ij The corresponding underlying mean degradation paths µ i p x i , t q “ β i, ` β x i ` β i, t ` β x i t , corrected for the measurement errors, are shown in the rightpanel of Figure 1. These mean degradation paths are represented by straight lines over time, whereboth the intercept and the slope may vary across units around an aggregate value determined by thevalue x i of the stress variable. For the analysis of degradation under normal use it is assumed thatthe model equation 2.1 is also valid at the normal use condition x u , where typically x u ă on thestandardized scale, i. e. µ u p x u , t q “ β u, ` β x u ` β u, t ` β x u t describes the mean degradation of afuture unit u at normal use condition x u and time t , and the unit specific parameters p β u, , β u, q T are bivariate normal with mean p β , β q T and variance covariance matrix Σ . able 1: Nominal values for Example 1 β β β β σ σ ρ σ ε x u y f j p x, t q x t xt .
397 1 .
629 1 .
018 0 . .
114 0 . ´ .
143 0 . ´ .
056 3 . The degradation is assumed to undergo soft failure, i. e. the unit u will be considered to fail due todegradation when its mean degradation exceeds a predetermined threshold y . The corresponding time t u , for which µ u p x u , t u q “ y , will be called the failure time of unit u under normal use condition dueto degradation. These failure times vary across different unit because of the unit specific parameters β u, and β u, .In both panels of Figure 1 the predetermined failure threshold y “ is indicated by a horizontalline. As typical for degradation studies failure does not occur during the time of experiment even forthe highest stress level. (Weaver and Meeker, 2014) derived locally c -optimal designs for certain characteristics of themean degradation curves (for future units) under normal use condition x u . The optimal design isalways supported on the extremal points of the design region X and the optimal weights dependon the nominal values of the model parameters. In Table 1 we reproduce the nominal values ofExample 7.2 by (Weaver and Meeker, 2014) on scar width growth after standardization for furtheruse.In the following section we extend the work of (Weaver and Meeker, 2014) by considering a generalmixed effects model with multiple stress variables x given that the general model can be applied tolinear mixed effect models, linear additive models or exponential models. Further, this generalizationallows for degradation models with and without interactions between the multiple design variables x .
3. Formulation of the model
In this section, we give a general formulation of a mixed effects regression model incorporating aproduct-type structure with complete interactions between the stress and the time variable. Thisgeneral formulation can cover the situation described in the introductory Example1 of the previoussection, but may readily be extended to more than one stress variable and to more complex marginalmodels for both the stress variables and the time variable, as indicated later. To become more specificwe assume that there are n testing units i “ , ..., n , for which degradation y ij is to be measured at k subsequent time points t j , j “ , ..., k , t ă ... ă t k . Each unit i is observed under a value x i ofthe stress variable(s), which is kept fixed for each unit throughout the degradation process, but maydiffer from unit to unit. The number k of measurements and the time points are the same for allunits. The measurements y ij are regarded as realizations of random variables Y ij which are describedby a hierarchical model.For each unit i the observation Y ij at time point t j is given by Y ij “ µ i p x i , t q ` ε ij , (3.1)where µ i p x , t q is the mean degradation of unit i at time t , when stress x is applied to unit i , and ε ij is the associated measurement error at time point t j . The mean degradation µ i p x , t q is assumed tobe given by a linear model equation in the stress variable x and time t , µ i p x , t q “ p ÿ r “ β i,r f r p x i , t j q “ f p x i , t j q T β i (3.2)where f p x , t q “ p f p x , t q , ..., f p p x , t qq T is a p -dimensional vector of known regression functions f q p x , t q in both the stress variable(s) x and the time t , β i “ p β i, , ..., β i,p q T is a p -dimensional vector of unitspecific parameters β i,q . Hence, the response is given by Y ij “ f p x i , t j q T β i ` ε ij . (3.3)The measurement error ε ij is assumed to be normally distributed with zero mean and some potentiallytime dependent error variance σ (cid:15),j ( ε ij „ N p , σ (cid:15),j q ). Moreover, the error terms may be correlatedwithin a unit over time. So, in general the vector ε i “ p ε i , ..., ε ik q T of errors associated with the k observations within one unit i is k -dimensional multivariate normally distributed with zero meanand positive definite variance covariance matrix Σ ε ( ε i „ N p , Σ ε q ).4 xample (Example 1 cont.) . In the introductory model Y ij “ β i, ` β x i ` β i, t j ` β x i t j ` ε ij of Section 2 there is only one stress variable x “ x , the vector of regression functions is given by f p x, t q “ p , x, t, xt q T , and the unit specific vector of parameters is β i “ p β i, , β , β i, , β q T ( p “ ).The error terms are assumed to be homoscedastic and independent, i. e. Σ ε “ σ ε I k , where I k denotesthe k ˆ k identity matrix. For the regression functions f p x , t q we suppose a product-type structure with complete interactionsbetween the stress variable x and the time t , i. e. there are marginal regression functions f p x q “p f p x q , ..., f p p x qq T and f p t q “ p f p t q , ..., f p p t qq T of dimension p and p which only depend onthe stress variable x and the time t , respectively, and the vector f p x , t q “ f p x q b f p t q of regressionfunctions factorizes into its marginal counterparts ( p “ p p ). Here “ b ” denotes the Kroneckerproduct of matrices or vectors. Then the observation Y ij can be written as Y ij “ p ÿ r “ p ÿ s “ β i,rs f r p x i q f s p t j q ` ε ij “ p f p x q b f p t qq T β i ` ε ij , (3.4)where for notational convenience the entries of the vector β i “ p β i, , ..., β i, p , ..., β i,p p q of pa-rameters are relabeled lexicographically according to their associated marginal regression functions( q “ p r ´ q p ` s , r “ , ..., p , s “ , ..., p ). Example (Example 1 cont.) . In the introductory model of Section 2 the rearranged vector ofregression functions f p x, t q “ p , t, x, xt q T “ pp , x q b p , t qq T factorizes into the vector f p x q “ p , x q T of a simple linear regression in the stress variable x ( p “ ) and the vector f p t q “ p , t q T of asimple linear regression in time t ( p “ ). The unit specific vector of parameters β i is relabeled by β i, “ β i, , β i, “ β i, , β “ β , and β “ β , where for β and β the index i of the unit issuppressed as these parameters do not depend on the unit. Thus the model equation can be rewrittenas Y ij “ β i, ` β i, t j ` β x i ` β x i t j ` ε ij “ pp , x i q b p , t j qq T β i ` ε ij . Moreover, as in the introductory example, we will assume throughout that the marginal regressionfunction f p x q “ p f p x q , ..., f p p x qq T of the stress variable x contains a constant term, f p x q ” p parameters β i, , ..., β i, p of β i associated with this constant term are unit specific. All otherparameters in β i are assumed to take the same value β rs , r “ , ..., p , s “ , ..., p , for all individuals i “ , ..., n . Hence, for unit i the model (3.4) can be rewritten as Y ij “ p f p x i q b f p t j qq T β ` f p t j q T γ i ` ε ij , (3.5)where β “ p β , ..., β p p q T is the vector of fixed effect (aggregate) parameters (averaged overthe units) associated with the constant term in the regression functions of the stress variable x and γ i “ p γ i , ..., γ ip q T is the p -dimensional vector of unit specific deviations γ is “ β i, s ´ β s , s “ , ..., p , from the corresponding aggregate parameters.On the aggregate level it is assumed that the units are representatives of a larger entity. Thedeviations of the units from the aggregate value are then modeled as random effects, i. e. they are p -dimensional multivariate normal with zero mean and variance-covariance matrix Σ γ ( γ i „ N p , Σ γ q ).All vectors γ i of random effects and all vectors ε i of random errors are assumed to be independent. Example (Example 1 cont.) . In the introductory model of Section 2 the two-dimensional vectorof deviations associated with the regression functions f p x, t q “ and f p x, t q “ t is γ i “ p β i, ´ β , β i, ´ β q T with general ˆ variance-covariance matrix Σ γ “ Σ . This leads to the mixedeffects model equation in standard notation Y ij “ β ` β t j ` β x i ` β x i t j ` γ i ` γ i t j ` ε ij “ pp , x i q b p , t j qq T β ` p , t j q T γ i ` ε ij . In vector notation the k -dimensional vector Y i “ p Y i , ..., Y ik q T of observations for unit i can beexpressed as Y i “ p f p x i q T b F q β ` F γ i ` ε i , where F “ p f p t q , ..., f p t k qq T is the k ˆ p marginal design matrix for the time variable. Then Y i is k -dimensional multivariate normally distributed with mean p f p x i q T b F q β and variance covariancematrix V “ F Σ γ F T ` Σ ε . The variance covariance matrix V is not affected by the choice of thestress level x i and, hence, equal for all units i . 5 xample (Example 1 cont.) . In the introductory model of Section 2 the marginal design matrix F “ ˆ ... t ... t k ˙ T is the common design matrix for simple linear regression over time. The variance covariance matrix V “ F ΣF T ` σ ε I k has diagonal entries v jj “ Var p Y ij q “ σ ` ρσ σ t j ` σ t j ` σ ε and off-diagonalentries v jj “ cov p Y ij , Y ij q “ σ ` ρσ σ p t j ` t j q ` σ t j t j . In total, for the observations of all n units the stacked nk -dimensional response vector Y “p Y T , ..., Y Tn q T can be represented in matrix notation as Y “ p F b F q β ` p I n b F q γ ` ε , (3.6)where F “ p f p x q , ..., f p x n qq T is the n ˆ p marginal design matrix for the stress variables acrossunits, γ “ p γ T , ..., γ Tn q T is the np -dimensional stacked parameter vector of random effects and ε “ p ε T , ..., ε Tn q T is the nk -dimensional stacked vector of random errors. Such a model equation issometimes called the “marginal model” for the response Y , but should not be confused with modelsmarginalized for the covariates x and t , respectively (see the decomposition at the end of Section 5).Note that the vectors γ „ N p , I n b Σ γ q of all random effects and the vector ε „ N p , I n b Σ ε q aremultivariate normal. Hence, the vector Y of all observations is nk -dimensional multivariate normal, Y „ N p , I n b V q . Example (Example 1 cont.) . In the introductory model of Section 2 the marginal design matrix F “ ˆ ... x ... x n ˙ T is the common design matrix for simple linear regression on the stress variable x . For the analysis of degradation under normal use we further assume that the general model 3.5 isalso valid at the normal use condition x u , where typically x u R X , i. e. µ p x u , t q “ p f p x u q b f p t qq T β ` f p t q T γ u (3.7)describes the mean degradation of a future unit u at normal use condition x u and time t , and therandom effects γ u are k -dimensional multivariate normal with mean zero and variance covariancematrix Σ γ .In the following we will consider thoroughly a more complex example, where two stress variablesare involved. Example 2.
In this example the degradation is influenced by two standardized accelerating stressvariables x and x which act linearly on the response with a potential interaction effect associatedwith x x . The two stress variables x and x can be chosen independently from marginal designregions X “ X “ r , s , respectively. Also the time is assumed to act linearly on the degradationand all interactions between stress variables and time are present as in Example 1.If, for testing unit i , the stress variables are set to x i and x i the response y ij at time t j is givenby y ij “ β i, ` β x i ` β x i ` β x i x i ` β i, t j ` β x i t j ` β x i t j ` β x i x i t j ` ε ij , (3.8) where the intercept β i, is the mean degradation of unit i at time t “ under the stress levels x “ and x “ , β is the common (not unit specific) mean increase in degradation depending on thestress variable x when x “ , β is the common mean increase in degradation depending on thestress variable x when x “ , and β is the interaction effect between the two stress variables.Accordingly β i, is the mean increase in degradation of unit i over time t when the stress levels areset to x “ and x “ , β is the interaction effect between time and the stress variable x when x “ , β is the interaction effect between time and the stress variable x when x “ , and β is the second-order interaction effect between time and the two stress variables. Also here only theparameters β i, and β i, associated with the constant term in the stress variables may vary acrossunits. On the aggregate level these two unit parameters are again assumed to be normally distributedwith means E p β i, q “ β and E p β i, q “ β and variance covariance matrix Σ “ ˆ σ ρσ σ ρσ σ σ ˙ . fter rearranging terms and relabeling the parameters the model can be rewritten as Y ij “ p f p x i q b f p x i q b f p t j qq T β ` f p t j q T γ i ` ε ij , (3.9) where f p x q “ p , x q T , f p x q “ p , x q T and f p t q “ p , t q T are the marginal regression functionsfor the stress variables x , x and the time variable t , respectively, β “ p β , β , β , β , β , β ,β , β q T “ p β , β , β , β , β , β , β , β q T is the rearranged vector of aggregate parameters, and γ i “ p γ i , γ i q T “ p β i, ´ β , β i, ´ β q T is the vector of parameters for the deviations of unit i fromthe aggregate values. These deviations constitute again random effects with zero mean and variancecovariance matrix Σ . With x “ p x , x q , X “ X ˆ X “ r , s , f p x q “ p f p x q T , f p x q T q T , p “ , p “ , and p “ model 3.9 fits into the framework of the general product-type model 3.4. Later we will also consider shortly a model in two stress variables without interaction termsbetween the stress variables.
4. Estimation of the model parameters
Under the distributional assumptions of normality for both the random effects and the measurementerrors the model parameters may be estimated by means of the maximum likelihood method. Denoteby θ “ p β T , ς T q T the vector of all model parameters, where ς collects all variance covarianceparameters from Σ γ and Σ ε For the general model (3.6) the log-likelihood is given by (cid:96) p θ ; y q “ ´ nk log p π q ´ n log p det p V qq ´ p y ´ p F b F q β q T p I n b V q ´ p y ´ p F b F q β q , (4.1)where the variance covariance matrix V “ V p ς q of measurements per unit depends only on ς . Themaximum likelihood estimator of β can be calculated as p β “ pp F b F q T p I n b p V q ´ p F b F qq ´ p F b F q T p I n b p V q ´ Y “ pp F T F q ´ F T q b pp F T p V ´ F q T F T p V ´ q Y , (4.2)if both F and F are of full column rank p and p , respectively, and p V “ V p p ς q , where p ς is the maximum likelihood estimator of ς . When V is known, at least up to a multiplicativeconstant V “ σ V , then p β is the best liner unbiased (general least squares) estimator p β GLS “pp F T F q ´ F T q b pp F T V ´ F q T F T V ´ q Y of β . In particular, when the measurement errors areuncorrelated and homoscedastic, i. e. Σ ε “ σ ε I k , then this estimator reduces to the ordinary leastsquares estimator p β OLS “ pp F T F q ´ F T q b pp F T F q T F T q Y because VF “ F p Σ γ F T F ` σ ε I p q by a result of (Zyskind, 1967). Hence, in the case of uncorrelated homoscedastic measurement errorsthe maximum likelihood estimator of the location parameters β does neither depend on the variancecovariance parameters nor on their estimates.In general, the quality of the estimator p β can be measured in terms of its variance covariancematrix which is given by Cov p p β q “ p F T F q ´ b p F T V ´ F q ´ . (4.3)By using the structure V “ F Σ γ F T ` Σ ε the last term can be calculated as p F T V ´ F q ´ “ p F T Σ ´ ε F q ´ ` Σ γ (4.4)in terms of the variance covariance matrices Σ γ and Σ ε of the random effects and the measurementerrors, respectively.
5. Information
In general, the Fisher information matrix is defined as the variance covariance matrix of the scorefunction U which itself is defined as the vector of first derivatives of the log likelihood with respectto the components of the parameter vector θ . More precisely, let U “ p BB θ (cid:96) p θ ; Y q , ..., BB θ q (cid:96) p θ ; Y qq T ,where q is the dimension of θ . Then for the full parameter vector θ the Fisher information matrixis defined as M θ “ Cov p U q , where the expectation is taken with respect to the distribution of Y . The Fisher information can also be computed as the expectations of the second derivativesof the score function U , i. e. M θ “ ´ E ´ B B θ B θ T (cid:96) p θ ; Y q ¯ . Under common regularity conditions themaximum likelihood estimator p θ of θ is consistent and asymptotically normal with asymptoticvariance covariance matrix equal to the inverse M ´ θ of the Fisher information matrix M θ .7o specify the Fisher information matrix further, denote by M β “ ´ E ´ B B β B β T (cid:96) p θ ; Y q ¯ , M ς “´ E ´ B B ς B ς T (cid:96) p θ ; Y q ¯ , M βς “ ´ E ´ B B β B ς T (cid:96) p θ ; Y q ¯ and M ςβ “ M T βς the blocks of the Fisher informa-tion matrix corresponding to the second derivatives with respect to β and ς and the mixed derivatives,respectively. The mixed blocks can be seen to be zero and the Fisher information matrix is blockdiagonal, M θ “ ˆ M β
00 M ς ˙ . (5.1)Moreover, the block M β associated with the aggregate location parameters β can be determined as M β “ p F T F q b p F T V ´ F q (5.2)which turns out to be the inverse of the variance covariance matrix for the estimator p β of β , when V is known. Actually, because the Fisher information matrix for θ is block diagonal, the inverse M ´ β “ p F T F q ´ b p F T V ´ F q ´ of the block associated with β is the corresponding block of theinverse of M θ and is, hence, the asymptotic variance covariance matrix of p β .Accordingly the asymptotic variance covariance matrix for estimating the variance parameters ς is the inverse of the block M ς . In the following we will call M β and M ς the information matrices for β and ς , respectively, for short. The particular form of M ς will be not of interest here. However, asthe information matrix M ς for the variance parameters ς is given by M ς “ n ˆ B log p det p V qqB ς B ς T ` tr ˆ V B V ´ B ς B ς T ˙˙ . It is important to note that M ς does not depend on the settings x , ..., x n of the stress variable incontrast to the information matrix M β of the aggregate location parameters β . For the generalproduct-type model (3.6) the information matrix M β for the aggregate parameters β factorizesaccording to M β “ M b M (5.3)into the information matrix M “ F T F in the marginal model Y p q i “ f p x i q T β p q ` ε p q i , (5.4) i “ , ..., n , in the stress variable x with standardized uncorrelated homoscedastic error terms, σ ε p q “
1, and the information matrix M “ F T V ´ F in the mixed effects marginal model Y p q j “ f p t j q T β p q ` f p t j q T γ p q ` ε p q j , (5.5) j “ , ..., k , in the time variable t with variance covariance matrices Σ γ and Σ ε for the random effects γ p q and measurement errors ε p q “ p ε p q , ..., ε p q k q T , respectively. Then the information matrix M θ inthe full model depends on the settings x , ..., x n of the stress variable only through the informationmatrix M in the first marginal model.
6. Design
The quality of the estimates will be measured in terms of the information matrix and, hence,depends on both the settings of the stress variable and the time points of measurements. Whenthese variables are under the control of the experimenter, then their choice will be called the designof the experiment. Here we assume that the time plan t “ p t , ..., t k q T for the time points ofmeasurements within units is fixed in advance and is not under disposition of the experimenter. Thenonly the settings x , ..., x n of the stress variable x can be adjusted to the units i “ , ..., n . Theirchoice p x , ..., x n q is then called an “exact” design, and their influence on the performance of theexperiment is indicated by adding them as an argument to the information matrices, M θ p x , ..., x n q , M β p x , ..., x n q , and M p x , ..., x n q , where appropriate. Remind that both M ς and M do not dependon the design for the stress variable.As M p x , ..., x n q “ ř ni “ f p x i q f p x i q T it can easily be seen that the information matrices do notdepend on the order of the setting but only on their mutually distinct settings, x , ..., x m say, andtheir corresponding frequencies n , ..., n m , such that ř mi “ n i “ n , i. e. M “ ř mi “ n i f p x i q f p x i q T .Finding optimal exact designs is, in general, a difficult task of discrete optimization. To circumventthis problem we follow the approach of approximate designs propagated by (Kiefer, 1959) in which8he requirement of integer numbers n i of testing units at a stress level x i is relaxed. Then continuousmethods of convex optimization can be employed (see e. g. (Silvey, 1980)) and efficient exact designscan be derived by rounding the optimal numbers to nearest integers. This approach is, in particular,of use when the number n of units is sufficiently large. Moreover, the frequencies n i will be replaced byproportions w i “ n i { n , because the total number n of units does not play a role in the optimization.Thus an approximate design ξ is defined by a finite number of settings x i , i “ , ..., m , from anexperimental region X with corresponding weights w i ě ř mi “ w i “ ξ “ ˆ x ... x m w ... w m ˙ , (6.1)The corresponding standardized, per unit information matrices are accordingly defined as M p ξ q “ m ÿ i “ w i f p x i q f p x i q T (6.2)for the marginal model on itself or by plugging (6.2) in into the standardized, per unit informationmatrix M β p ξ q “ M p ξ q b M (6.3)for the aggregate parameters β , where again M “ F T V ´ F , and M θ p ξ q “ ˆ M p ξ q b M Ă M ς ˙ (6.4)or the full parameter vector θ , where now Ă M ς “ n M ς is the standardized, per unit information forthe variance parameters ς . If all nw i are integer, then these standardized versions coincide with theinformation matrices of the corresponding exact design up to the normalizing factor 1 { n and are ,hence, an adequate generalization.In order to optimize information matrices, some optimality criterion has to be employed which isa real valued function of the information matrix and reflects the main interest in the experiment.
7. Optimality criterion based on the failure time under normal use condition
As in (Weaver and Meeker, 2014) we are interested in some characteristics of the failure timedistribution of soft failure due to degradation (see the introductory example in Section 2). Thereforeit is assumed that the model equation (3.7) µ u p t q “ µ p x u , t q “ p f p x u q b f p t qq T β ` f p t q T γ u for themean degradation paths is also valid under normal use condition x u , where µ u denotes the degradationpath under normal use condition for short. We further denote by µ p t q “ E p µ u p t qq “ p f p x u qb f p t qq T β the aggregate degradation path under normal use condition and by δ “ δ p β q “ p δ p β q , ..., δ p p β qq T the vector of its coefficients δ s “ δ s p β q “ ř p r “ f r p x u q β rs , s “ , ...., p , in the regression functions f s in t , i. e. µ p t q “ δ f p t q “ ř p s “ δ s f s p t q .For the following it is assumed that the mean degradation paths are strictly increasing overtime. Then a soft failure due to degradation is defined as the exceedance of the degradationover a failure threshold y . This definition is based on the mean degradation path and not on a“real” path subject to measurement errors. The failure time T under normal use condition is thendefined as the first time t the mean degradation path µ u p t q reaches or exceeds the threshold y , i. e. T “ min t t ě µ u p t q ě y u . As random effects γ u are involved in the mean degradation path, thefailure time T is random. Actually, T may become infinite, if the mean degradation path does notreach the threshold, or may degenerate to T “
0, if the degradation already exceeds the threshold attime t “
0, because of unfortunate values of the random effects γ u , but this will happen only withlow probability and will not affect the further argumentation.In order to describe certain characteristics of the distribution of the failure time, we will determinethe distribution function F T p t q “ P p T ď t q . First note that T ď t if and only if µ u p t q ě y . Hence F T p t q “ P p µ u p t q ě y q“ P p µ p t q ` f p t q T γ u ě y q“ P p´ f p t q T γ u ď µ p t q ´ y q“ Φ p h p t qq , (7.1)9here h p t q “ µ p t q ´ y σ u p t q , (7.2) σ u p t q “ f p t q T Σ γ f p t q is the variance of the mean degradation path µ u p t q at time t , and Φ denotesthe distribution function of the standard normal distribution. Here it is tacitly assumed that thevariance σ µ p t q of the mean degradation path is greater than zero for every t ě
0. This condition issatisfied, in particular, when the variance covariance matrix Σ γ of the random effects is positivedefinite.We will be interested in quantiles t α of the failure time distribution, i. e. P p T ď t α q “ α . For each α the quantile t α gives the time up to which under normal use conditions (at least) α ¨
100 percent ofthe units fail and (at least) p ´ α q ¨
100 percent of the units persist. The quantiles t α are increasingin α . Note that this standard definition of quantiles is in contrast to the“upper” quantiles ( t ´ α )used in (Weaver and Meeker, 2014) where percentages of failures and persistence are reversed. Ofparticular importance is the median t . up to which under normal use conditions half of the unitsfails and half of the units persist ( α “ . t . and t . which give the times up to which 95 or 90 % percent of the unitspersist, respectively. By (7.1) these quantiles can be determined as the solutions of the equation h p t α q “ z α , (7.3)where z α “ Φ ´ p α q is the α -quantile of the standard normal distribution. For the median ( α “ { z . “ t . is the solution of µ p t q “ y , i. e. the aggregatedegradation path reaches the threshold at time t . . Note that the function h represents the failuretime distribution function F T on a normal Q-Q-plot scale.In the particular case of straight lines for the mean degradation paths, i. e. f p t q “ p , t q T asconsidered in Examples 1 and 2, the function h p t q specifies to h p t q “ δ t ` δ ´ y a σ ` ρσ σ t ` σ t , (7.4)where δ “ ř p r “ f r p x u q β r and δ “ ř p r “ f r p x u q β r are the intercept and the slope of theaggregate degradation path µ p t q “ δ ` δ t under normal use condition, respectively. The medianfailure time is then given by t . “ p y ´ δ q{ δ which provides a proper solution t . ą δ ą
0, and that theaggregate degradation at the beginning of the testing at time t “ δ ă y . Example (Example 1 cont.) . In the introductory example the aggregate degradation path µ p t q “ δ ` δ t has intercept δ “ β ` β x u and slope δ “ β ` β x u . Hence, the median failure timeis given by t . “ p y ´ β ´ β x u q{p β ` β x u q . If we use the standardized nominal values ofTable 1 for Example 7.2 by (Weaver and Meeker, 2014), the aggregate degradation path becomes µ p t q “ . ` . t under normal use condition, and the median failure time is t . “ . . Notethat, as typical for degradation experiments, the median failure time is larger than the maximalexperimental time t max “ . Under the additional assumption that the correlation of the random effects is non-negativefor the intercept and the slope of the mean degradation path, ρ ě
0, the function h p t q can beseen to be strictly increasing, h p t q ą
0, in t ą
0. This also remains true for small to moderatenegative correlations. However, the range of h p t q is bounded and does not cover the whole real linesuch that not all quantiles are non-degenerate. For small α the α -quantile to be positive requires z α ą h p q “ ´p y ´ δ q{ σ , i. e. the variance σ of the intercept of the mean degradation path has tobe sufficiently small compared to the distance from its mean δ to the threshold y . In particular, inthe case of the 5 %-quantile σ ă . p y ´ δ q is needed for t . ą
0. For large α the α -quantile isfinite if z α ă lim t Ñ8 h p t q “ δ { σ , i. e. the variance σ of the slope of the mean degradation pathhas to be sufficiently small compared to its mean δ . Note that Φ p h p qq “ ´ Φ pp y ´ δ q{ σ q is theprobability that under normal use condition the mean degradation path exceeds the threshold y already at the initial time t “
0. Note also that formally 1 ´ Φ p lim t Ñ8 h p t qq “ Φ p´ δ { σ q is theprobability that the mean degradation path has a negative slope which may be interpreted as theprobability that soft failure due to degradation will not occur at all under normal use condition.When the α -quantile is non-degenerate (0 ă t α ă 8 ), then t α is a solution of the quadratic equation p δ t ` δ ´ y q “ z α p σ ` ρσ σ t ` σ t q , F T ( t ) h ( t ) Figure 2: Failure time distribution F T (left panel) and defining function h (right panel) for Example 1Table 2: Nominal values for Example 2 β β β β β β β β σ γ σ ε x u x u y f j p x , t q x x x x t x t x t x x t . . .
75 1 . . .
25 0 .
25 4 .
03 0 . . ´ . ´ . . σ “
0, all α -quantiles t α finitely exist for α ě Φ p´p y ´ δ q{ σ q and can be determined as the solution of a linear equation to t α “ p y ´ δ ` z α σ q{ δ . Example (Example 1 cont.) . For the introductory example the function h p t q is plotted in the rightpanel of Figure 7 under the standardized nominal values of (Weaver and Meeker, 2014) given inTable 1. The defining function h p t q is seen to be strictly increasing although the correlation ismoderately negative ( ρ “ ´ . ). Thus the distribution function F T p t q “ Φ p h p t qq is well-defined,and it is represented in the left panel of Figure 7. In both plots the median failure time t . “ . is indicated by a dashed vertical line. Moreover, as h p q “ ´ . and lim t Ñ8 h p t q “ . , the rangeof h covers all reasonable quantiles. It has to be noted that in the case of the standardized nominal values of Table 1 for highstress levels ( x h “
1) the mean degradation path µ i exceeds the threshold y for soft failure dueto degradation with high probability (P p µ i p , q ě y q ą {
2) already at the initial experimentaltime t min “
0. Hence, care has to be taken that the model equation for the mean degradation pathsis also valid beyond the threshold, i. e. in the case that soft failure has already occurred. To avoidthis complication we consider in Example 2 nominal values which guarantee that soft failure occursduring the experiment only with negligible probability.
Example (Example 2 cont.) . For the model with two interacting stress variables x and x weconsider the virtual nominal values for the parameters, normal use conditions and threshold given inTable 2. The aggregate degradation path µ p t q is a straight line with intercept δ “ β ` β x u ` β x u ` β x u x u “ . and slope δ “ β ` β x u ` β x u ` β x u x u “ . , i. e. µ p t q “ . ` . t under normal use condition. With a threshold of y “ . for soft failure the median failure timeresults in t . “ p y ´ δ q{ δ “ . which is substantially larger than the maximal experimentaltime t max “ . For the characterization of other quantiles the function h p t q is plotted in the rightpanel of Figure 7 together with the corresponding distribution function F T p t q “ Φ p h p t qq (left panel).The median failure time t . “ . is indicated in both plots by a dashed vertical line. As ρ “ the function h p t q is strictly increasing and ranges from h p q “ ´ . to h max “ lim t Ñ8 h p t q “ . .Thus, quantiles t α are non-degenerate as long as α ď α max , where α max “ Φ p h max q “ . , and p ´ α max q ¨ “ . percent of the mean degradation paths do not lead to a soft failure. Both α max and h max are indicated in the respective plots by a dashed horizontal line. Note that for the nominalvalues of Table 2 the mean degradation µ p x , t q under experimental conditions attains its maximum . for the maximal stress levels ( x “ x “ ) and maximal experimental time ( t “ ), and, hence,the mean degradation paths do not exceed the threshold for all experimental settings. F T ( t ) h ( t ) Figure 3: Failure time distribution F T (left panel) and defining function h (right panel) for Example 2 In any case the quantile t α “ t α p θ q is a function of both the aggregate location parameters β andthe variance parameters ς , in general. Hence, the maximum likelihood estimator of the quantile t α isgiven by p t α “ t α p p θ q in terms of the maximum likelihood estimator p θ of θ . The task of designing theexperiment will now be to provide an as precise estimate of the α -quantile as possible.By the delta-method p t α is seen to be asymptotically normal with asymptotic varianceaVar p p t α q “ c T M ´ θ c , (7.5)where c “ BB θ t α is the gradient vector of partial derivatives of t α with respect to the components ofthe parameter vector θ . The asymptotic variance depends on the design of the experiment throughthe information matrix M θ and will be chosen as the optimality criterion for the design.The gradient c can be seen to be equal to c “ ´ c ` BB θ µ p t q| t “ t α ´ z α BB θ σ u p t q| t “ t α ˘ , (7.6)in view of (7.2) and (7.3) by the implicit function theorem (see e. g. (Krantz and Parks, 2012)), where c “ {p µ p t α q ´ z α σ u p t α qq is the inverse of the derivative of the defining function h with respect to t .As the aggregate mean degradation µ p t q only depends on the aggregate location parameters β and the variance σ u p t q only depends on the variance parameters ς the gradient simplifies to c “ ´ c p c T β , c T ς q T , where c β “ BB β µ p t q| t “ t α “ f p x u , t α q is the gradient of µ p t q with respect to β and c ς “ ´ z α BB ς σ u p t q| t “ t α is ´ z α times the gradient of σ u p t q with respect to ς . The particular shape of c ς does not play a rolehere, in general. But note that c ς “ in the case of the median ( α “ . p t α becomes aVar p p t α q “ c ´ c T β M ´ β c β ` c T ς M ´ ς c ς ¯ (7.7)which simplifies to aVar p p t . q “ c c T β M ´ β c β (7.8)in the case of the median.For the product-type model (3.6) the expression related to the aggregate parameters β furtherdecomposes, c T β M ´ β c β “ f p x u q T M ´ f p x u q ¨ f p t α q T M ´ f p t α q , (7.9)as c β “ f p x u q b f p t α q factorizes. 12 . Optimal designs in the case of predetermined measurement times From (7.7) and (7.9) it can be seen that for obtaining a minimal asymptotic variance for p t α only f p x u q T M ´ f p x u q has to be minimized, because all other terms do not depend on the experimentalsettings x , ..., x n of the stress variable, when the measurement times t , ..., t k are predetermined.The optimality criterion of minimization of the asymptotic variance of p t α thus reduces to a c -criterion c T M p ξ q ´ c for extrapolation of the marginal response at normal use condition x u in the firstmarginal model (5.4), c “ f p x u q , which is a well-known problem from the literature (see (Kieferand Wolfowitz, 1964)). It is remarkable that this criterion and, hence, the corresponding optimaldesign is the same whatever the value of α is, as long as there is a proper solution 0 ă t α ă 8 forthe α -quantile of the failure time. Proposition 8.1.
If the design ξ ˚ is c -optimal for extrapolation of the mean response at the normaluse condition in the marginal model (5.4) for the stress variable, then ξ ˚ minimizes the asymptoticvariance for the estimator p t α of the α -quantile of the failure time for every α when ă t α ă 8 (forpredetermined measurement times t , ..., t k ). Although the normal use condition is typically outside the experimental region, the aboveproposition also would hold for interpolation, i. e. x u P X . The result of Proposition 8.1 is next usedto derive optimal designs for the situation in Examples 1 and 2. Example (Example 1 cont.) . In the introductory model of Section 2 the marginal model for thestress variable x is given by a simple linear regression, f p x q “ p , x q T . In this marginal model the c -optimal design ξ ˚ for extrapolation of the mean response µ p q p x u q “ β p q ` β p q x u under normaluse condition x u ă assigns weight w ˚ “ | x u |{p ` | x u |q to the highest stress level x h “ andweight ´ w ˚ “ p ` | x u |q{p ` | x u |q to the lowest stress level x l “ on the standardized scale X “ r , s (see (Kiefer and Wolfowitz, 1964)). Note that larger weight ´ w ˚ ą w ˚ is assigned tothe lowest stress level x l “ which is closer to x u ă than x h “ and that the weight ´ w ˚ at x l decreases from to { , when the distance between the normal use condition and the experimentalregion gets larger, i. e. x u decreases. For the standardized value x u “ ´ . of the normal usecondition from Table 1 the optimal weights for extrapolation at x u “ ´ . are w ˚ “ . at x h “ and ´ w ˚ “ . at x l “ , and the optimal design is ξ ˚ “ ˆ .
95 0 . ˙ . Further examples for extrapolation at x u “ ´ . , ´ . , and ´ give optimal weights w ˚ “ . , . ,and . at x h “ , and ´ w ˚ “ . , . , and . at x l “ , respectively.By Proposition 8.1 the design ξ ˚ is also optimal for minimization of the asymptotic variancefor estimating the α -quantile t α of the failure time for soft failure due to degradation under normaluse condition x u , when ă t α ă 8 and the measurement times t , ..., t k are predetermined (see(Schwabe et al., 2014) for estimation of the median, α “ . ). In particular, for the standardizedvalue x u “ ´ . of the normal use condition from Table 1 the optimal design for estimating any α -quantile t α assigns weight . to x l “ and weight . to x h “ , as found numerically by(Weaver and Meeker, 2014) in the case of the median t . . The so obtained designs for one stress variable can now be used to construct c -optimal designs inthe presence of two stress variables with interactions. Example (Example 2 cont.) . In the model with two interacting stress variables x and x themarginal model for the combined stress variable x “ p x , x q is given itself by a product-type structure, f p x q “ f p x q b f p x q , where both components x and x are specified by f v p x v q “ p , x v q T assimple linear regressions in their corresponding submarginal models Y p v q i “ β p v q ` β p v q x v ` ε p v q i , v “ , , with standardized homoscedastic and uncorrelated error terms. Moreover, the experimentalregion X “ r , s for the combined stress variable x is the Cartesian product of the marginalexperimental regions X “ X “ r , s for the components x and x , respectively. The vector c forextrapolation of the mean response c T β p q “ µ p q p x u q “ β p q ` β p q x u ` β p q x u ` β p q x u x u undernormal use condition x u “ p x u , x u q , x u , x u ă , is given by f p x u q “ f p x u q b f p x u q and,hence, also factorizes as c “ c b c , where c v “ f v p x uv q . In this setting the c -optimal design ξ ˚ for extrapolation at x u can be obtained as the product ξ ˚ “ ξ ˚ b ξ ˚ of the c -optimal designs ξ ˚ v forextrapolation at x uv in the submarginal models (see Theorem 4.4 in (Schwabe, 1996b)). he submarginal c -optimal designs ξ ˚ v can be derived as in Example 1. They assign weight w ˚ v “ | x uv |{p ` | x uv |q to x v “ and weight ´ w ˚ v “ p ` | x uv |q{p ` | x uv q to x v “ . Hence, the c -optimal design ξ ˚ “ ξ ˚ b ξ ˚ for extrapolation at x u is given by ξ ˚ “ ˆ p , q p , q p , q p , qp ´ w ˚ qp ´ w ˚ q p ´ w ˚ q w ˚ w ˚ p ´ w ˚ q w ˚ w ˚ ˙ . Then, by Proposition 8.1, the design ξ ˚ is also optimal for minimization of the asymptoticvariance for estimating the α -quantile t α of the failure time for soft failure due to degradation, when ă t α ă 8 and the measurement times t , ..., t k are predetermined. For example, when the normaluse conditions are x u “ ´ . for the first component and x u “ ´ . for the second component asspecified in Table 2, then by the results in Example 1 the optimal marginal weights are w ˚ “ . and w ˚ “ . , and the optimal design ξ ˚ “ ξ ˚ b ξ ˚ is given by ξ ˚ “ ˆ p , q p , q p , q p , q .
58 0 .
17 0 .
19 0 . ˙ . The c -optimal extrapolation designs can be obtained in both Examples 1 and 2 by Elfving’stheorem ((Elfving, 1952)) which provides a geometrical construction of a c -optimal design (see(Schwabe, 1996b), Theorem 2.13). To give a rough idea of this construction one has to consider theElfving set which is the convex hull E “ conv pt f p x q ; x P X u Y t´ f p x q ; x P X uq of the union of the so-called induced design region t f p x q ; x P X u and its image t´ f p x q ; x P X u underreflection at the origin in R p . Here x and f denote variables and regression functions associated witha generic model Y i “ f p x i q T β ` ε i . In Example 1 we have x “ x , f p x q “ p , x q T and X “ r , s , andthe Elfving set E is given as a parallelogram in R with one edge from p , q T to p , q T representingthe induced design region and the opposite edge from p´ , q T to p´ , ´ q T representing its imageunder reflection. For another illustration of an Elfving set see Figure 4 in Section 10, where anoptimal design is sought for the time variable t .The c -optimal design for estimating c T β can then be constructed as follows: Determine theintersection point of the ray λ c , λ ą
0, with the boundary of the Elfving set, λ c c say. This point canbe represented as a convex combination of (extremal) points ˘ f p x i q of the induced design region andits reflection, λ c c “ m ÿ i “ w i z i f p x i q , where z i “
1, when the (extremal) point f p x i q is from the induced design region, and z i “ ´
1, whenthe point ´ f p x i q is from the reflection, and the weights w i of the convex combination satisfy w i ą ř mi “ w i “
1. Then Elfving’s theorem states that the design ξ ˚ which assigns weights w i to thesettings x i is c -optimal (for c ). Moreover, this construction also provides the value of the c -criterion, c T M p ξ ˚ q ´ c “ { λ c .In Example 1 the ray λ p , x u q T intersects the boundary of the Elfving set E at the connectingline from p , q T “ f p q to p´ , ´ q T “ ´ f p q at λ c p , x u q T “ w p´ , ´ q T ` w p , q T with λ c “ {p ´ x u q , w “ ´ x u {p ´ x u q ą w “ ´ w “ p ´ x u q{p ´ x u q ą
0. Hence, theoptimality of the given design follows.We will use Elfving’s theorem next to characterize optimal designs for the situation with twonon-interacting stress variables.
Example 3.
In the case of two non-interacting stress variables x and x we consider the modelequation y ij “ β i, ` β x i ` β x i ` β i, t j ` β x i t j ` β x i t j ` ε ij (8.1) for the combined stress variable x “ p x , x q and the time variable t . This model contains all termsof the full interaction model 3.8) with the exception of the terms x x and x x t related to potentialinteractions between the stress variables. The interpretation of all other terms in (8.1) is the same asin Example 2. Model (8.1) is constructed from the marginal model Y p q i “ β p q ` β p q x i ` β p q x i ` ε p q i which is additive in the effects of the stress variables x and x , i. e. f p x q “ p , x , x q T . or this marginal model of multiple regression, the Elfving set is an oblique prism with quadraticbase with vertices p , , q T , p , , q T , p , , q T , p , , q T and quadratic top with vertices p´ , , q T , p´ , , ´ q T , p´ , ´ , q T , p´ , ´ , ´ q T . To find the c -optimal extrapolation design at the normaluse condition x u “ p x u , x u q by Elfving’s theorem we have to determine the intersection point ofthe ray λ p , x u , x u q T with the surface of the Elfving set. For x u ă x u ă the ray intersectsthe surface at the quadrangular face of the prism spanned by p , , q T , p , , q T , p´ , ´ , q T , and p´ , ´ , ´ q T when λ c “ {p ` | x u |q . The representation of the intersection point λ c p , x u , x u q T by the vertices of the quadrangle is not unique. There are two c -optimal designs ξ ˚ “ ˆ p , q p , q p , qp ` | x u |q λ c p| x u | ´ | x u |q λ c | x u | λ c ˙ and ξ ˚ “ ˆ p , q p , q p , qp ` | x u |q λ c p| x u | ´ | x u |q λ c | x u | λ c ˙ which are supported on three vertices. As a consequence, also for all coefficients a , ă a ă , theconvex combination ξ ˚ a “ p ´ a q ξ ˚ ` aξ ˚ “ ˆ p , q p , q p , q p , qp ` a | x u | ` p ´ a q| x u |q λ c p ´ a qp| x u | ´ | x u |q λ c a p| x u | ´ | x u |q λ c pp ´ a q| x u | ` a | x u |q λ c ˙ supported on all four vertices is c -optimal for extrapolation at x u , ă t α ă 8 . Then, by Proposi-tion 8.1, the designs ξ ˚ a are also optimal for minimization of the asymptotic variance for estimatingthe α -quantile t α of the failure time for soft failure due to degradation, when ă t α ă 8 and themeasurement times t , ..., t k are predetermined, ď α ď . For example, when the normal useconditions are x u “ ´ . for the first component and x u “ ´ . for the second component as inExample 2, then the optimal design ξ ˚ a is given by ξ ˚ a “ ˆ p , q p , q p , q p , q . ` . a . ´ . a . a . ´ . a ˙ with the special cases ξ ˚ “ ˆ p , q p , q p , q .
70 0 .
05 0 . ˙ and ξ ˚ “ ˆ p , q p , q p , q .
75 0 .
05 0 . ˙ supported on three vertices.Note that there are also other designs which are c -optimal for extrapolation at x u , but whichare not solely supported on the vertices. For example, for x u ă x u ă the two-point designwhich assigns weight w “ | x u |{p ` | x u |q to p , q and weight ´ w “ p ` | x u |q{p ` | x u |q to p , x u { x u q is c -optimal by Elfving’s theorem. However, these designs can be used for estimating t α by means of maximum-likelihood only when the resulting information matrix is non-singular, i. e.when the design has, at least, three distinct support points.For x u ă x u ă optimal designs can be obtained from the above case by interchanging the rolesof the two components x and x .In the case x u “ x u ă there is only one c -optimal design for extrapolation. This design issupported on two vertices and assigns weight w “ | x u |{p ` | x u |q to p , q and weight ´ w “p ` | x u |q{p ` | x u |q to p , q . As the resulting information matrix is singular, this design cannotbe used for estimating the α -quantile t α of the failure time for soft failure due to degradation. Hence,no suitable optimal design exists in this case, but the c -optimal design may serve as a benchmark forjudging the quality of a competing design in terms of efficiency. To quantify the quality of a standard design ξ for estimating the quantile t α of the mean failuretime under normal use condition we make use of the efficiencyeff aVar p ξ q “ aVar p p t α ; ξ ˚ q aVar p p t α ; ξ q , (8.2)where aVar p p t α ; ξ q “ c T M θ p ξ q ´ c denotes the standardized asymptotic variance for estimating t α byequation (6.4) when design ξ is used, and ξ ˚ is the corresponding optimal design. The efficiencygives the proportion of units to be used under the optimal design ξ ˚ which provides (asymptotically)the same accuracy (in terms of the asymptotic variance) compared to the standard design ξ . For15 able 3: Efficiency of uniform designs ¯ ξ m for various normal use conditions x u in Example 1 m x u ´ . ´ . ´ . ´ . ´8 .
50 0 .
55 0 .
76 0 .
80 0 .
90 1 .
003 0 .
40 0 .
43 0 .
55 0 .
57 0 .
62 0 .
674 0 .
36 0 .
38 0 .
47 0 .
49 0 .
52 0 .
565 0 .
33 0 .
36 0 .
43 0 .
44 0 .
47 0 . .
25 0 .
26 0 .
30 0 .
31 0 .
32 0 . . ξ than underthe optimal design ξ ˚ to get the same accuracy. Note that both the asymptotic variance and theefficiency may also depend on the parameter vector θ , at least, through t α and are, hence, localquantities (at θ ) without explicitly stated in the notation.In the case of estimating the median t . the standardized asymptotic variance factorizes asaVar p p t . ; ξ q “ n c f p x u q T M p ξ q ´ f p x u q ¨ f p t α q T M ´ f p t α q (8.3)by equations (7.8) and (7.9) for the general product-type model (3.6). Thus the efficiency defined in(8.2) reduces to the c -efficiency eff c p ξ q “ f p x u q T M p ξ ˚ q ´ f p x u q f p x u q T M p ξ q ´ f p x u q for extrapolation at the normal use condition x u in the first marginal model with uncorrelatedhomoscedastic errors and does not depend on θ . Example (Example 1 cont.) . In the introductory model of Section 2 the c -criterion for extrapolationat x u is defined by Φ c p ξ q “ f p x u q T M p ξ q ´ f p x u q . For x u ă it attains its minimal value value Φ c p ξ ˚ q “ p ` | x u |q for the optimal design ξ ˚ which assigns weight w ˚ “ | x u |{p ` | x u |q to x h “ and weight ´ w ˚ “ p ` | x u |q{p ` | x u |q to x l “ . Common alternatives would be uniformdesigns ¯ ξ m which assign equal weights w “ { m to m experimental settings on an equidistant grid t x , ..., x m u “ t , {p m ´ q , ..., u of the experimental region X “ r , s . For these designs the c -criterion for extrapolation at x u ă can be calculated as Φ c p ¯ ξ m q “ ` a m p ` | x u |q , where a m “ p m ´ q{p m ` q . Their c -efficiency for extrapolation at x u and, hence, their efficiency for estimatingthe median failure time under normal use condition x u is equal to eff c p ¯ ξ m q “ Φ c p ξ ˚ q{ Φ c p ¯ ξ m q “p ´ {p ` a m p ` | x u |q qq{ a m which increases from {p ` a m q “ p m ` q{p m ´ q for x u close tothe lowest stress level x l “ to { a m “ p m ` q{p m ´ q when x u tends to minus infinity. Moreover,for fixed x u , the efficiency decreases when m increases, i. e. when the grid becomes more dense. Forselected values of the normal use condition x u and numbers m of grid points numerical values of theefficiency are reported in Table 3 Note that in Table 3 the row m “ 8 corresponds to a continuousuniform design as an approximation to large numbers m of grid points, while the columns x u “ and x u “ ´8 give approximations for normal use conditions x u close to the lowest experimental stresslevel or far away, respectively.For the particular case m “ , where the design ¯ ξ assigns equal weights w “ ´ w “ { to both the highest and the lowest stress level x h “ and x l “ , we have a “ and, hence, Φ c p ξ q “ ` p ` | x u |q for the c -criterion. The c -efficiency of ¯ ξ for extrapolation at x u and,thus, its efficiency for estimating the median failure time under normal use condition x u is equal to eff c p ¯ ξ q “ ´ {p ` p ` | x u |q q which ranges from { for x u close to the lowest stress level x l “ to when x u tends to minus infinity.For the nominal value x u “ ´ . of the normal use condition in Table 1 the efficiency ofthe equidistant grid designs ¯ ξ m is reported in the third column of Table 3. In particular, for theuniform design ¯ ξ on the endpoints of the experimental region this efficiency is . which meansthat eff c p ¯ ξ q ´ ´ “ {p ` | x u |q “
81 % more units would have to be used for design ¯ ξ to obtainthe same quality for estimating the median failure time than for the optimal design ξ ˚ . The efficiencies in the case of one stress variable can be used to compute the efficiency in thepresence of two stress variables with interactions.
Example (Example 2 cont.) . In the model with two interacting stress variables x and x the c -criterion Φ c p ξ q “ f p x u q T M p ξ q ´ f p x u q for extrapolation at x u “ p x u , x u q factorizes into its coun-terparts in the submarginal models, Φ c p ξ q “ f p x u q T M p ξ q ´ f p x u q¨ f p x u q T M p ξ q ´ f p x u q . ecause also the c -optimal design ξ ˚ “ ξ ˚ b ξ ˚ has product-type structure, the c -efficiency forextrapolation at x u and, hence, the efficiency for estimating the median failure time factorizes, eff c p ξ b ξ q “ eff c p ξ q ¨ eff c p ξ q , where eff cv p ξ v q is the corresponding efficiency in the v th sub-marginal model, v “ , .The design ¯ ξ which assigns equal weights { to the four vertices p , q , p , q , p , q , and p , q ofthe experimental region serves as a natural standard design. This design can be seen to be the product ¯ ξ “ ¯ ξ b ¯ ξ of submarginal designs ¯ ξ which assign equal weights { to the lowest and highest stresslevel x vl and x vh in the submarginal models, v “ , . Hence, from Example 1 we get the efficiency of ¯ ξ as eff c p ¯ ξ q “ pp ` | x u |qp ` | x u |qq {pp ` p ` | x u |q qp ` p ` | x u |q qq which ranges from { for x u close to the combination p x l , x l q of lowest stress levels x l “ and x l “ to whenboth normal use conditions x u and x u tend to minus infinity.For example, when the normal use conditions are x u “ ´ . for the first component and x u “´ . for the second component as specified in Table 2, then according to Table 3 the efficiency eff cv p ¯ ξ q of ¯ ξ is . and . in the respective submarginal models, v “ , . By the above considerations theefficiency of ¯ ξ is eff c p ¯ ξ q “ . ¨ . “ . . This means that eff c p ¯ ξ q ´ ´ “ . { . “
64 % moreunits have to be used for design ¯ ξ to obtain the same quality for estimating the median failure timethan for the optimal design ξ ˚ . Hence, the optimal design ξ ˚ performs much better than the standarddesign ¯ ξ in this situation. Even more prominent results can be obtained for the marginal model without interactions betweenthe stress variables.
Example (Example 3 cont.) . In the model with two non-interacting stress variables x and x the value of the c -criterion for the locally c -optimal design ξ ˚ for extrapolation at x u “ p x u , x u q , x u ă x u ă , is given by Φ c p ξ ˚ q “ { λ c “ p ` | x u |q as seen before. The uniform design ¯ ξ whichassigns equal weights { to the four vertices p , q , p , q , p , q , and p , q of the experimental regionhas a value of Φ c p ¯ ξ q “ ` p ` | x u |q ` p ` | x u |q . Hence, the uniform design ¯ ξ has efficiency eff c p ¯ ξ q “ pp ` | x u |q {p ` p ` | x u |q ` p ` | x u |q q which ranges from { for x u close to thecombination p x l , x l q of lowest stress levels x l “ and x l “ to when the lower normal usecondition x u tends to minus infinity while x u remains fixed. Moreover, the efficiency approaches { when x u « x u and both normal use conditions tend to minus infinity simultaneously.For example, when the normal use conditions are x u “ ´ . for the first component and x u “ ´ . for the second component as specified in Table 2, then the values of the c -criterion are Φ c p ξ ˚ q “ . for the optimal design ξ ˚ and Φ c p ¯ ξ q “ . for the uniform design ¯ ξ , respectively.Hence, the efficiency of the uniform design ¯ ξ is eff c p ¯ ξ q “ . { . “ . . This means that morethan twice as many units have to be used for design ¯ ξ to obtain the same quality for estimatingthe median failure time than for the optimal design ξ ˚ . This highlights that the optimal design ξ ˚ performs substantially better than the standard design ¯ ξ in the current model of two non-interactingstress variables. The above efficiency calculations are all related to estimating the median failure time for softfailure due to degradation under normal use conditions x u . For estimating any other quantile t α ofthe failure time distribution, the efficiency of a design ξ can be written aseff aVar p ξ q “ eff c p ξ q ` p ´ eff c p ξ qq c T ς Ă M ´ ς c ς { aVar p p t α ; ξ q (8.4)by equations (7.7) and (6.4). This efficiency depends on the variance parameters, but it is boundedfrom below by the c -efficiency eff c p ξ q of ξ for extrapolation at x u . Hence, designs with a highefficiency for estimating the median failure time are also suitable for estimating any other reasonablequantile t α , 0 ă t α ă 8 .
9. Optimization of the time plan
In Section 8, we considered the situation when there is a fixed time plan t “ p t , ..., t k q T , andonly the stress variables x P X are to be optimized across units. In contrast to that we considernow the situation when also the settings t , ..., t k for the standardized time variable t P r , s may beoptimized within units. As in Section 3 we still assume that the same time plan t “ p t , ..., t k q T isused for all units.Similar to the case of fixed time points t , ..., t k , the standardized asymptotic variance is given byaVar p p t α ; ξ, t , ..., t k q“ c ´ f p x u q T M p ξ q ´ f p x u q ¨ f p t α q T M p t , ..., t k q ´ f p t α q ` c T ς Ă M ς p t , ..., t k q ´ c ς ¯ (9.1)17y equations (7.7) and (7.9). But here the dependence of the asymptotic variance on the set-tings t , ..., t k for the time variable is explicitly stated which comes through the information ma-trix M p t , ..., t k q in the second marginal model as well as the standardized information matrix Ă M ς p t , ..., t k q for the variance parameters.As has been seen in Section 8 the optimization with respect to the stress variables x u can bedone independently of the second marginal model and the settings t , ..., t k of the time variable. Thereverse does not hold true for the time points t , ..., t k , in general, by the extra dependence of theasymptotic variance on the information matrix Ă M ς p t , ..., t k q for the variance parameters.To circumvent this problem we restrict to the case of estimating the median failure time t . .There the second term on the right hand side of equation (9.1) vanishes and the standardizedasymptotic variance simplifies toaVar p p t . ; ξ, t , ..., t k q “ c f p x u q T M p ξ q ´ f p x u q ¨ f p t . q T M p t , ..., t k q ´ f p t . q . (9.2)Hence, for the median failure time also the optimization of the measurement times t , .., t k can beperformed independently of the marginal model of the stress variables and their settings. Then, onlythe marginal c -criterion f p t . q T M p t , ..., t k q ´ f p t . q has to be minimized.Remember that the marginal information matrix M p t , ..., t k q is given by M p t , ..., t k q “ F p t , ..., t k q T V p t , ..., t k q ´ F p t , ..., t k q in the second marginal model for the time variable t , wherethe variance covariance matrix V p t , ..., t k q of the vector Y i of measurements for each unit i is givenby V p t , ..., t k q “ F p t , ..., t k q Σ γ F p t , ..., t k q T ` Σ ε and Σ γ and Σ ε are the variance covariancematrices for the random effects γ and for the measurement errors ε , respectively.(Schmelter, 2007) proposed a representation of the inverse of the information matrix in randomeffects models in the case of uncorrelated homoscedastic errors ( Σ ε “ σ ε I k ). This can be readilyextended to a general, non-singular variance covariance structure Σ ε for the measurement errors ε (seeLemma A.1 in the Appendix). From this we obtain for the marginal information matrix M p t , ..., t k q that its inverse can be decomposed to M p t , ..., t k q ´ “ F p t , ..., t k q T Σ ´ ε F p t , ..., t k q ` Σ γ .As a consequence, the marginal c -criterion f p t . q T M p t , ..., t k q ´ f p t . q can be split up into f p t . q T M p t , ..., t k q ´ f p t . q “ f p t . q T M p q p t , ..., t k q ´ f p t . q ` f p t . q T Σ γ f p t . q , where M p q p t , ..., t k q “ F p t , ..., t k q T Σ ´ ε F p t , ..., t k q is the information matrix in the marginalfixed effect model Y p , q j “ f p t j q T β p q ` ε p q j , (9.3) j “ , ..., k , in the time variable t with variance covariance matrix Σ ε for the vector ε p q “p ε p q , ..., ε p q k q T of error terms. Hence, as for any linear criterion, the optimization of the asymptoticvariance (9.2) of the median failure time t . with respect to the time plan t , ..., t k does not de-pend on the variance covariance matrix Σ γ of the random effects. For optimization only the term f p t . q T M p q p t , ..., t k q ´ f p t . q has to be minimized which is the c -criterion for extrapolation ofthe mean response at t . in the marginal fixed effect model in t with Cov p ε p q q “ Σ ε . This leads tothe following result which is similar to Proposition 8.1 for optimization with respect to the stressvariable x . Proposition 9.1.
If the time plan t ˚ , ..., t ˚ k is c -optimal for extrapolation of the mean response at t . in the marginal fixed effect model (9.3) with covariance Σ ε for the time variable t , then t ˚ , ..., t ˚ k minimize the asymptotic variance for the estimator p t . of the median failure time t . under normaluse condition. Note that in degradation experiments the median failure time t . under normal use condition istypically much larger than the time horizon of the experiment, i. e. t . ą t . P r , s .The optimal time plan depends on the location parameters β through t . , but also on the variancecovariance structure Σ ε of the measurement errors within units, and are, hence, local. In Examples 1and 2 we assumed uncorrelated homoscedastic measurement errors, Σ ε “ σ ε I k . In that case, the c -criterion for extrapolation can be reduced to f p t . q T p F p t , ..., t k q T F p t , ..., t k qq ´ f p t . q , andthe optimization does not depend on the error variance σ ε .If the number k of measurement times t , ..., t k is large, one might be tempted to use theconcept of approximate designs also here. In that case, the approximate design τ will be defined bymutually distinct time points t , ..., t (cid:96) from the standardized experimental time interval T “ r , s with corresponding proportions π , ..., π (cid:96) ą ř (cid:96)j “ π j “ π j k measurements are performed at time point t j , j “ , ..., (cid:96) , for each unit. In thesituation of a linear time trend, f p t q “ p , t q T , as in Examples 1 and 2, this leads to essentiallythe same extrapolation problem as for the stress variable in Example 1. The c -optimal designfor extrapolation at t . ą (cid:96) “ t “ t “ T “ r , s with corresponding proportions π “ t . ´ t . ´ and π “ t . t . ´ (see (Schwabe, 1996b), Example 2.1). Similarly, as for extrapolation of the stressvariable in Example 1, the proportion π at the endpoint t “ t . decreases from 1, when t . is close to 1, to 0 .
5, when the distance gets large ( t . Ñ 8 ).However, from a practical point of view, it does not seem meaningful to have replications, i. e. tohave more than one measurement at a time at the same unit. Moreover, even if this would be possible,these measurements would be expected to be strongly correlated with a correlation beyond thatcaused by the random effects. To avoid these technical and modeling problem additional constraintshave to be imposed on the time plan to guarantee independence of the measurement errors. Forinstance, the different measurement times t , ..., t k have to be at least some sufficiently large ∆ t ą t . Nevertheless approximate design theory can be used to determine optimaltime plans. To do so the standardized experimental time interval is discretized to a sufficientlycoarse grid, T “ t j ∆ t ; j “ , , ..., J u , where ∆ t “ { J , i. e. T “ t , ∆ t, t, ..., u . Additionallyconstraints are imposed on the proportions π j that none of these proportions is larger than 1 { k , i. e.the number of measurements at a time is bounded by one and, hence, there are at least k differenttime points. Optimal approximate time plans can then be obtained under these constraints by usingstandard algorithms for design optimization. Actually, the so obtained proportions may be smallerthan one for some of the time points. But a theoretical result based on an equivalence theorem underconstraints (convex optimization with Kuhn-Tucker conditions) guarantees that this only occurs fora small number of time points: (Sahm and Schwabe, 2001) proved for the D -optimality criterionthat the experimental region splits up in subregions where the weight of the optimal design attainsthe upper bound 1 { k and where it is 0. Only at the boundary of these subregions weights differentfrom 0 and 1 { k may occur. In particular, for straight line regression on r , s , the subregions withmaximal weight 1 { k are adjacent to the endpoints 0 and 1 of the interval while in the interior theoptimal design has zero weight. This result carries over directly to other criteria like the c -criterionunder appropriate conditions which are met in the present case of extrapolation. From this approachefficient exact designs can be obtained by adjusting the weights to the admissible values 0 and 1 { k under the constraint of total weight 1. Example (Example 2 cont.) . In the setup of Example 2 we deal with straight line regression Y p , q j “ β p q ` β p q t j ` ε p q j for the fixed effects marginal model in the time variable t . The measurement errors ε p q j are supposedto be uncorrelated and homoscedastic ( Σ ε “ σ ε I k ). We are searching for a locally c -optimal design τ ˚ for extrapolation at the median failure time t . which is equal to . under the nominal valuesof Table 2. As constraints for the design we assume that k “ observation can be taken on a gridwith increment ∆ t “ . of the standardized experimental time interval r , s , i. e. J “ and T “ t , . , . , ..., u .By the multiplicative algorithm (see e.g. (Silvey et al., 1978) ) adapted to the present constraintsituation the following numerical solution is obtained for the locally c -optimal design τ ˚ “ ˆ .
00 0 .
05 0 .
10 0 .
85 0 .
90 0 .
95 1 . .
166 0 .
166 0 .
130 0 .
040 0 .
166 0 .
166 0 . ˙ . The optimal approximate design τ ˚ is supported on seven time points t ˚ , ..., t ˚ which are concentratedto the ends of the standardized experimental time interval r , s with maximal admitted weight { k forall but the two boundary points t ˚ “ . and t ˚ “ . separating grid points t ˚ “ . , t ˚ “ . , t ˚ “ . , t ˚ “ . , and t ˚ “ . with full weight { k from those with zero weight. This shapeof the optimal design is in accordance with the findings of (Sahm and Schwabe, 2001) in the caseof D -optimality. In view of Proposition 9.1 the design τ ˚ is also optimal for the estimation of themedian failure time.For practical use, the optimal approximate design τ ˚ may be adjusted to τ “ ˆ .
00 0 .
05 0 .
10 0 .
90 0 .
95 1 . .
166 0 .
166 0 .
166 0 .
166 0 .
166 0 . ˙ hich is supported on exactly k “ time points by transferring the weight from the boundary point t ˚ “ . with the lower weight to the boundary point t ˚ “ . with the higher weight (see (Dorfleitnerand Klein, 1999)). As a consequence all weights in the adjusted design τ are equal to { . Thenthe design τ can be realized as an exact design by taking one measurement at each of the six timepoints . , . , . , . , . , and . . To quantify what might have got lost, the quality of theadjusted design τ may be measured in terms of the local c -efficiency eff c p τ q “ f p t . q T M p τ ˚ q ´ f p t . q f p t . q T M p τ q ´ f p t . q “ . . “ . for extrapolation at t . in the marginal mixed effects model and, hence, for estimation of the medianfailure time. This indicates that the adjusted design τ is highly efficient and can be recommended asthe time plan for conducting an accelerated degradation testing experiment. Note that by Proposition 9.1 the optimal design τ ˚ does not depend on the variance covariancestructure Σ γ of the random effects, but the efficiency of the adjusted design τ may be affected by therandom effects. Nevertheless, similar to equation (8.4), also here the c -efficiency for extrapolation at t . provides a lower bound for the efficiency in estimating the median failure time. The assumption Σ ε “ σ ε I k of uncorrelated measurement errors could be replaced by equal correlations ρ ε between allmeasurements, i. e. Σ ε is compound symmetric with all diagonal entries equal to σ ε and all off-diagonalentries equal to ρ ε σ ε . For estimating the variance covariance parameters an additional identifiabilitycondition would be required in this case to distinguish the correlation ρ ε of the measurement errorsfrom the variance σ of the random intercept. Then estimation of the location parameters β is notaffected, and the design τ ˚ retains its optimality.For further approaches allowing for correlations depending on the distance of the measurementtimes, e. g. when measurements follow a random process, we refer to (N¨ather, 1985) and (M¨uller,2007) for optimization of the measurement times.
10. Experimental design with a cross-sectional time plan
In contrast to the previous sections we will allow here for time plans t i , ..., t ik differing acrossunits, Y ij “ p f p x i q b f p t ij qq T β ` f p t ij q T γ i ` ε ij , (10.1) j “ , ..., k , i “ , ..., n , where all other expressions have the same meaning as in the general modelequation 3.5. Differing time plans are, for example, required when the number k of time points is lessthan the number p of parameters in the marginal model for the degradation paths. In particular, inthe case of destructive testing only k “ Y i “ p f p x i q b f p t i qq T β ` f p t i q T γ i ` ε i , (10.2) i “ , ..., n . We will restrict to the case of k “ Y i are assumed to be independent. Denote by σ p t q “ f p t q T Σ γ f p t q ` σ ε ą t , i. e. Var p Y i q “ σ p t i q . As discussed at the end of theprevious section also here an identifiability condition has to be imposed on the variance parameters ς to distinguish between the variance σ ε of the measurement error and the variance σ of the randomintercept, see (Graßhoff et al., 2012) who derived D -optimal designs.Under the assumption of normally distributed random effects and measurement errors theinformation matrix M β for the location parameter β can be written as M β “ n ÿ i “ p f p x i q f p x i q T q b ´ σ p t i q f p t i q f p t i q T ¯ according to the product-type structure of the model (10.2). For estimating the median failuretime t . the asymptotic variance is given by aVar p p t . q “ c c T M ´ β c as in Section 7. Hence, weare searching for a c -optimal design, where the vector c factorizes according to c “ c b c intocomponents c “ f p x u q and c “ f p t . q associated with the marginal models for the stress andthe time variables, see (7.9). Moreover, we suppose, as before, that the experimental settings x i for the stress variable and t i for the time of measurement may be chosen independently from theirstandardized experimental regions X and T , respectively, i. e. p x i , t i q P X ˆ T .20lso here we make use of the concept of approximate designs denoted by ζ which are specified by m mutually distinct combinations p x i , t i q P X ˆ T with corresponding weights η i ą i “ , ..., ν , ř νi “ η i “
1. Accordingly the corresponding normalized information matrix is defined as M p ζ q “ ν ÿ i “ η i p f p x i q f p x i q T q b p ˜ f p t i q ˜ f p t i q T q , where ˜ f p t q “ f p t q{ σ p t q is the weighted version of the marginal regression function for the timevariable (standardized by the standard deviation for measurement at time t ). Then a c -optimaldesign for c “ c b c can be obtained as the product-type design generated from c -optimal designsfor c v in the associated marginal models.To be more specific, let ξ “ ˆ x ... x m w ... w m ˙ and τ “ ˆ t ... t (cid:96) π ... π (cid:96) ˙ be approximate designs in the marginal models for the stress variables and the time, respectively. Thenthe product-type design ζ “ ξ b τ is supported on the ν “ m(cid:96) mutually distinct combinations p x i , t j q with corresponding weights η ij “ w i π (cid:96) , i “ , ..., m , j “ , ..., (cid:96) , and its standardized informationmatrix M p ξ b τ q “ M p ξ q b ˜ M p τ q factorizes into the standardized information matrix M p ξ q “ ř mi “ w i f p x i q f p x i q T of ξ in the marginalmodel for the stress variables and the standardized information matrix ˜ M p τ q “ ř kj “ π j ˜ f p t j q ˜ f p t j q T of τ in the weighted marginal model Y p q j “ ˜ f p t j q T β p q ` ε p q (10.3)for the time variable. Then also the c -criterion c T M p ξ b τ q ´ c “ c T M p ξ q ´ c ¨ c T ˜ M p τ q ´ c for c “ c b c factorizes into the c -criteria for c and c in the corresponding marginal models.This expression is minimized if ζ ˚ is the product ζ ˚ “ ξ ˚ b τ ˚ of the c -optimal marginal designs ξ ˚ and τ ˚ for c and c in their marginal models, respectively. The design ζ ˚ “ ξ ˚ b τ ˚ is not only c -optimal for c “ c b c in the class of product-type designs ξ b τ , but also compared to all designs ζ on Z (see (Schwabe, 1996a), Theorem 4.4) which establishes the following result. Theorem 10.1. If ξ ˚ is c -optimal for extrapolation at normal use condition x u in the marginalmodel for the stress variables x and τ ˚ is c -optimal for extrapolation at the median failure time t . in the weighted marginal model (10.3) for the time variable t , then the design ζ ˚ “ ξ ˚ b τ ˚ is optimalfor estimating the median failure time t . in the combined model (10.2). Note that for the second marginal model we utilize that ˜ f p t . q and c “ f p t . q only differ bythe factor 1 { σ p t . q ą c -optimality.The c -optimal designs ξ ˚ for extrapolation at x u are the same as obtained in Section 8, and the c -optimal designs for extrapolation at t . can be obtained similarly by Elfving’s theorem applied tothe weighted regression functions ˜ f p t q “ f p t q{ σ p t q . Example (Example 1 cont.) . In the introductory model of Section 2 we have straight line regressionin both the stress variable x and the time variable t with an interaction term xt . For the stress variable,the c -optimal marginal design ξ ˚ for extrapolation at x u ă assigns weight w ˚ “ | x u |{p ` | x u |q to x h “ and weight ´ w ˚ “ p ` | x u |q{p ` | x u |q to x l “ , see Section 8. For the timevariable t , the c -optimal marginal design τ ˚ for extrapolation at t . ą can be similarly obtainedby Elfving’s theorem applied to the vector of weighted regression functions ˜ f p t q . The shape of theElfving set is exhibited in Figure 4. Note that the induced design region t ˜ f p t q ; t P r , su and itsnegative image constitute non-overlapping arc segments of the ellipse defined by z T Σz “ , z P R ,centered at p , q T , where Σ “ Σ γ ` ˆ σ ε
00 0 ˙ is the variance covariance matrix of the randomeffects augmented by the variance of the measurement error. The ray λ f p t . q intersects the boundaryof the Elfving set at the line segment connecting ˜ f p q “ p , q T { σ p q and ´ ˜ f p q “ p , q T { σ p q when t . ą . Hence, the c -optimal design τ ˚ is supported by the endpoints of the standardizedexperimental time interval T “ r , s , and the optimal weights can be calculated by Elfving’s theorem to z Figure 4: Elfving set for t in Example 1: Induced design region (right solid line), negative image (left solid line),boundary (dashed lines), c “ p , t . q T (arrow) and corresponding ray (dotted line) π ˚ “ t . σ p q{p t . σ p q`p t . ´ q σ p qq at t max “ and ´ π ˚ “ p t . ´ q σ p q{p t . σ p q`p t . ´ q σ p qq at t min “ .By Theorem 10.1, the design ζ ˚ “ ξ ˚ b τ ˚ “ ˆ p x l , t min q p x l , t max q p x h , t min q p x h , t max qp ´ w ˚ qp ´ π ˚ q p ´ w ˚ q π ˚ w ˚ p ´ π ˚ q w ˚ π ˚ ˙ is optimal for estimating the median failure time t . . Under the nominal values of Table 1 theoptimal weights are w ˚ “ . for extrapolation at x u “ ´ . in the marginal model for the stressvariable, see Section 8, and π ˚ “ . for extrapolation at t . “ . in the weighted marginal modelfor the time variable. Hence, the optimal design for estimating the median failure time is given by ζ ˚ “ ˆ p , q p , q p , q p , q .
22 0 .
73 0 .
01 0 . ˙ . According to this design
73 % of the units should be exposed to the lowest stress level x l “ andmeasured at the maximum allowed time t max “ while only of the units should be exposed tothe highest stress level x h “ and measured at the initial time t min “ . Accordingly
22 % should beexposed to low stress and measured initially, and should be measured at the end of the experimentaltime interval under high exposure.
A similar results can be obtained for the situation of Example 2.
Example (Example 2 cont.) . In the model with two interacting stress variables (Example 2) the c -optimal marginal design ξ ˚ for extrapolation at x u “ p x u , x u q T is the same as in Section 8, ξ ˚ “ ξ ˚ b ξ ˚ . For the time variable t , the c -optimal marginal design τ ˚ for extrapolation at t . ą is obtained as in Example 1: τ ˚ assigns weights π ˚ “ t . σ p q{p t . σ p q ` p t . ´ q σ p qq to t max “ and ´ π ˚ to t min “ . By Theorem 10.1, the optimal design for estimation of the median failuretime is then ζ ˚ “ ξ ˚ b ξ ˚ b τ ˚ .Under the nominal values of Table 2 the optimal weights are w ˚ “ . and w ˚ “ . forextrapolation at x u “ ´ . and x u “ ´ . in the submarginal models of the stress variables,respectively, see Section 8. For the time variable the optimal weight results in π ˚ “ . forextrapolation at t . “ . in the weighted marginal model. Hence, the optimal design for estimatingthe median failure time is given by ξ ˚ b ξ ˚ b τ ˚ “ ˆ p , , q p , , q p , , q p , , q p , , q p , , q p , , q p , , q .
25 0 .
33 0 .
07 0 .
10 0 .
09 0 .
11 0 .
02 0 . ˙ . p s ( ) s ( ) p Figure 5: Optimal weights π ˚ in dependence on t . (left panel) and σ p q{ σ p q (right panel) for Example 1 The locally optimal designs for estimating the median failure time are influenced by both thelocation and the variance parameters β and ς . Therefore we make a sensitivity analysis, howthe optimal designs change with the parameters and how well they perform under parametermisspecification. For this we note first that the optimal marginal design ξ ˚ for extrapolation at x u is globally optimal and does not depend on the parameters. Moreover, in the case of straight linedegradation paths as in Examples 1 and 2 only the weight π ˚ “ t . σ p q{p t . σ p q ` p t . ´ q σ p qq of the optimal marginal design τ ˚ depends on the location parameters β through t . and on thevariance parameters ς through the ratio σ p q{ σ p q of the standard deviations at the endpoints ofthe experimental time region. Note that the c -efficiency for extrapolation at t . in the weightedmarginal model for t coincides with the efficiency for estimating the median failure time when theoptimal design ξ ˚ is used for the stress variables. Example (Example 1 cont.) . For the introductory model with one stress variable (Example 1), theoptimal weight π ˚ “ t . σ p q{p t . σ p q ` p t . ´ q σ p qq is plotted in Figure 10 as a function of t . while the ratio σ p q{ σ p q is held fixed to . induced by the nominal values in Table 1 (left panel)and as a function of the ratio σ p q{ σ p q while t . is held fixed to the nominal value . (rightpanel). When t . increases, the optimal weight π ˚ decreases from for t . close to the maximalexperimental time t max “ to σ p q{p σ p q ` σ p qq “ . for t . Ñ 8 . On the other hand the optimalweight π ˚ increases in the ratio σ p q{ σ p q of standard deviations from for the ratio close to to when the ratio tends to infinity. The limits for π ˚ are indicated in Figure 10 by horizontal dashedlines in both panels while the nominal values for t . and σ p q{ σ p q are indicated by vertical dottedlines, respectively. To assess the influence of the variation of the optimal weights we consider the efficiency of alocally optimal design when the underlying parameters are misspecified. Define byeff aVar p ζ ; θ q “ aVar θ p p t . ; ζ q aVar θ p p t . ; ζ ˚ θ q the asymptotic efficiency of the design ζ for estimating t . , where aVar θ p p t . ; ζ q denotes the asymptoticvariance of p t . , when ζ is used and when θ is the true vector of underlying parameters, and by ζ ˚ θ the locally optimal design at θ . Similar to the situation of submarginal models in Example 2 (seeSection 8), the asymptotic efficiencyeff aVar p ξ b τ ; θ q “ eff c p ξ q ¨ eff c p τ ; θ q , factorizes for product-type designs ζ “ ξ b τ into the marginal efficiency eff c p ξ q of ξ for extrapolationat x u in the marginal model for the stress variable and the marginal efficiency eff c p τ ; θ q of τ forextrapolation at t . in the weighted marginal model for the time variable. Remember that, for thestress variable, the marginal information matrix M p ξ q and, hence, the marginal efficiency eff c p ξ q does not depend on θ . Thus, also the c -optimal marginal design ξ ˚ does not rely on the parametersand can be used throughout in the comparison. The asymptotic efficiency of a product-type design ζ “ ξ ˚ b τ with optimal marginal ξ ˚ reduces to the c -efficiency of the second marginal τ ,eff aVar p ξ ˚ b τ ; θ q “ eff c p τ ; θ q “ c p θ q T M p τ ; θ q c p θ q c p θ q T M p τ ˚ θ ; θ q c p θ q , e ff a V a r s ( ) s ( ) e ff a V a r Figure 6: Efficiency of xi ˚ b τ ˚ (solid line), ξ ˚ b ¯ τ (dashed line) and ξ ˚ b ¯ τ (dashed and dotted line) in Example 1 where c p θ q “ f p t . q and τ ˚ θ is the locally c -optimal design at θ for extrapolation at t . in theweighted marginal model for the time variable. The c -vector c p θ q “ f p t . q depends on the locationparameters β merely through t . , but not on ς . In contrast to that, the information matrices M p τ ; θ q are only affected by the variance parameters ς , but not by β . For straight line regressionin the time variable (cf. Examples 1 and 2), only the ratio σ p q{ σ p q of the standard deviations formeasurements at the endpoints of the experimental time interval has an effect on the informationmatrix M p τ ; θ q , when the design τ is supported by these endpoints, as is the case for the locally c -optimal design τ ˚ θ . Example (Example 1 cont.) . For the setting of the introductory model with one stress variable(Example 1), we examine the efficiency of the design ζ ˚ “ ξ ˚ b τ ˚ which is locally optimal forestimation of the median failure time under the nominal values of Table 1 when the nominalparameters are misspecified. In Figure 10 we plot the efficiency of the locally optimal design xi ˚ b τ ˚ at the nominal values (solid line) together with the efficiency of the designs xi ˚ b ¯ τ (dashed line) and xi ˚ b ¯ τ (dashed and dotted line) for which the marginal design ¯ τ k is uniform on k equally spaced timepoints, i. e. ¯ τ assigns weight { to the endpoints and of the time interval, and τ assigns weight { to the time points . , . , . , . , . , and . (cf. the definition of ¯ ξ m in Section 8). In the leftpanel of Figure 10 the efficiency is displayed in dependence on the true value of the median failuretime t . while the ratio σ p q{ σ p q of standard deviations is held fixed to the nominal value . .In the right panel the efficiency is shown in dependence on the true value of the ratio σ p q{ σ p q ofstandard deviations while the median failure time t . is held fixed to . derived from the nominalvalues. Also here the nominal values for t . and σ p q{ σ p q are indicated by vertical dotted lines inthe corresponding panel, respectively.The efficiency seems to be more sensitive with respect to deviations in the ratio than in the medianfailure time t . . However, commonly neither small values of t . nor small values of σ p q{ σ p q seemto be reasonable. In particular, we may expect that the variance σ p q at the end of the experimentaltime interval is larger than the variance σ p q at the initial time, i. e. σ p q{ σ p q ě . This is satisfiedwhen ρ ě ´ σ {p σ q . In total, the design ξ ˚ b τ ˚ which is locally optimal for the nominal valuesseems to perform quite well over a wide range of parameters while the design ξ ˚ b ¯ τ with the samenumber of units at the endpoints of the intervals is only preferable for larger values of the medianfailure time t . . The design ξ ˚ b ¯ τ with six equally spaced time points performs substantially worsethroughout for reasonable parameter values. It has to be noted in this context that the efficiency of ¯ τ depends on the variance parameters ς not only through the ratio σ p q{ σ p q because measurementsare also to be taken in the interior of the interval. The current plot has been generated by fixing σ “ σ ` σ ε and letting ρ vary. However, other choices of the variance parameters do not changemuch in the performance of the design. Similar results can be obtained for the settings of Example 2.
11. Discussion and conclusion
During the design stage of highly reliable systems it is extremely important to assess the reliabilityrelated properties of the product. One method to handle this issue is to conduct accelerated24egradation testing. Accelerated degradation tests have the advantage to provide an estimation oflifetime and reliability of the system under study in a relatively short period of time. To account forvariability between units in accelerated degradation tests, it is assumed that the degradation functioncan be described by a mixed-effects linear model. This also leads to a non-degenerate distribution ofthe failure time, due to soft failure by exceedance of the expected (conditionally per unit) degradationpath over a threshold, under normal use conditions. Therefore it is desirable to estimate certainquantiles of this failure time distribution as a characteristic of the reliability of the product. In thiscontext we discussed the existence of non-degenerate solutions for the quantiles. The purpose ofoptimal experimental design is then to find the best settings for the stress variable and/or the timevariable to obtain most accurate estimates for these quantities.In the present model for accelerated degradation testing, it is further assumed that stress remainsconstant within each testing unit during the whole period of experimental measurements but mayvary between units. Hence, in the corresponding experiment a cross-sectional design between unitshas to be chosen for the stress variable while for repeated measurements the time variable variesaccording to a longitudinal design within units.In the present paper we assumed a model with complete interactions between the time andthe stress variables and random effects only associated with time but not with stress. Then thecross-sectional design for the stress variables and the longitudinal design for the time variable canbe optimized independently, and the resulting common optimal design can be generated as thecross-product of the optimal marginal designs for stress and time, respectively. In particular, thesame time plan for measurements can be used for all units in the test. Moreover, the marginal optimaldesign for the stress variables can be chosen independently of any model parameters. Optimal timeplans may depend on the aggregate location parameters via the median failure time, but do notdepend on which quantile of the failure distribution is to be estimated. These results were extendedto a model of destructive testing in which also the time variable has to be chosen cross-sectionally.There the optimal choice of measurement times may also be affected by the variance covarianceparameters of the random effects. In both cases (longitudinal and cross-sectional time settings) theefficiency of the designs considered factorizes which facilitates to assess their performance when thenominal values for these parameters are misspecified at the design stage.Finding optimal designs may become more complicated when the above assumptions are not met.In particular, the designs for stress and time variables may no longer be optimized independently ifthere are only additive effects in the model (lacking interaction terms xt , cf. Example 3 for a similarsituation in the marginal stress model) or when also the stress variables are accompanied by randomeffects. The impact of these deviations from the model assumptions on optimal designs are object offurther research as well as the construction of designs which are robust against misspecification ofthe nominal parameters, such as maximin efficient or weighted (“Bayesian”) optimal designs. Offurther interest would be to consider optimality criteria accounting for simultaneous estimation ofvarious characteristics of the failure time distribution. Acknowledgement
The work of the first author has been supported by the German Academic Exchange Service(DAAD) under grant no. 2017-18/ID-57299294. 25 eferences
Atkinson, A., Donev, A., and Tobias, R. (2007).
Optimum Experimental Designs, with SAS . OxfordStatistical Science Series. Oxford University Press.Bagdonavicius, V. and Nikulin, M. (2001).
Accelerated Life Models: Modeling and Statistical Analysis .Chapman & Hall/CRC Monographs on Statistics & Applied Probability. CRC Press.Debusho, L. K. and Haines, L. M. (2008). V- and D-optimal population designs for the simple linearregression model with a random intercept term.
Journal of Statistical Planning and Inference ,138(4):1116–1130.Dette, H., Pepelyshev, A., and Holland-Letz, T. (2010). Optimal designs for random effect modelswith correlated errors with applications in population pharmacokinetics.
The Annals of AppliedStatistics , 4(3):1430–1450.Dorfleitner, G. and Klein, T. (1999). Rounding with multiplier methods: An efficient algorithm andapplications in statistics.
Statistical Papers , 40(2):143–157.Elfving, G. (1952). Optimum allocation in linear regression theory.
The Annals of MathematicalStatistics , 23(2):255–262.Entholzner, M., Benda, N., Schmelter, T., and Schwabe, R. (2005). A note on designs for estimatingpopulation parameters.
Biometrical Letters – Listy Biometryczne , 42:25–41.Graßhoff, U., Doebler, A., Holling, H., and Schwabe, R. (2012). Optimal design for linear regressionmodels in the presence of heteroscedasticity caused by random coefficients.
Journal of StatisticalPlanning and Inference , 142(5):1108–1113.Guan, Q., Tang, Y., and Xu, A. (2016). Objective bayesian analysis accelerated degradation testbased on wiener process models.
Applied Mathematical Modelling , 40(4):2743–2755.Kiefer, J. (1959). Optimum experimental designs.
Journal of the Royal Statistical Society, Series B ,21(2):272–304.Kiefer, J. and Wolfowitz, J. (1964). Optimum extrapolation and interpolation designs, i.
Annals ofthe Institute of Statistical Mathematics , 16(1):79–108.Krantz, S. G. and Parks, H. R. (2012).
The implicit function theorem: history, theory, and applications .Springer.Li, Q. and Kececioglu, D. B. (2006). Design of an optimal plan for an accelerated degradation test: acase study.
International Journal of Quality & Reliability Management , 23(4):426–440.Liao, H. and Elsayed, E. A. (2006). Reliability inference for field conditions from accelerateddegradation testing.
Naval Research Logistics (NRL) , 53(6):576–587.Lim, H. and Yum, B.-J. (2011). Optimal design of accelerated degradation tests based on wienerprocess models.
Journal of Applied Statistics , 38(2):309–325.Limon, S., Yadav, O. P., and Liao, H. (2017). A literature review on planning and analysis ofaccelerated testing for reliability assessment.
Quality and Reliability Engineering International ,33(8):2361–2383.Lu, J.-C., Park, J., and Yang, Q. (1997). Statistical inference of a time-to-failure distribution derivedfrom linear degradation data.
Technometrics , 39(4):391–400.Meeker, W. Q. and Escobar, L. A. (2014).
Statistical methods for reliability data . John Wiley & Sons.Mentre, F., Mallet, A., and Baccar, D. (1997). Optimal design in random-effects regression models.
Biometrika , 84(2):429–442.M¨uller, W. G. (2007).
Collecting Spatial Data : Optimum Design of Experiments for Random Fields.3rd rev. and extended ed.
Springer.N¨ather, W. (1985).
Effective Observation of Random Fields . Teubner.26elson, W. B. (2005). A bibliography of accelerated test plans part ii - references.
IEEE Transactionson Reliability , 54(3):370–373.Sahm, M. and Schwabe, R. (2001). A note on optimal bounded designs. In
Optimum Design 2000 ,pages 131–140. Kluwer.Schmelter, T. (2007). The optimality of single-group designs for certain mixed models.
Metrika ,65(2):183–193.Schmelter, T., Benda, N., and Schwabe, R. (2007). Some curiosities in optimal designs for randomslopes. In mODa 8 - Advances in Model-Oriented Design and Analysis , pages 189–195. Physica.Schwabe, R. (1996a). Optimal design of experiments for multivariate re-sponse in two-factor linearmodels. In
Multidimensional Statistical Analysis and Theory of Random Matrices. Proceedings ofthe Sixth Eugene Lukacs Symposium, Bowling Green, OH. VSP, Utrecht , pages 235–242.Schwabe, R. (1996b).
Optimum designs for multi-factor models . Springer.Schwabe, R., Prus, M., and Graßhoff, U. (2014). Discussion of ‘methods for planning repeatedmeasures accelerated degradation tests’ by brian p. weaver and william q. meeker.
Applied StochasticModels in Business and Industry , 30(6):677–679.Schwabe, R. and Schmelter, T. (2008). On optimal designs in random intercept models.
Tatra Mt.Math. Publ , 39:145–153.Silvey, S. D. (1980).
Optimal design , volume 1. Chapman and Hall.Silvey, S. D., Titterington, D. H., and Torsney, B. (1978). An algorithm for optimal designs on adesign space.
Communications in Statistics – Theory and Methods , 7(14):1379–1389.Wang, H., Zhao, Y., Ma, X., and Wang, H. (2017). Optimal design of constant-stress accelerateddegradation tests using the m-optimality criterion.
Reliability Engineering & System Safety ,164:45–54.Weaver, B. P. and Meeker, W. Q. (2014). Methods for planning repeated measures accelerateddegradation tests.
Applied Stochastic Models in Business and Industry , 30(6):658–671.Ye, Z.-S., Chen, N., and Shen, Y. (2015). A new class of Wiener process models for degradationanalysis.
Reliability Engineering & System Safety , 139:58–67.Zyskind, G. (1967). On canonical forms, non-negative covariance matrices and best and simple leastsquares linear estimators in linear models.
Annals of Mathematical Statistics , 38(4):1092–1109.27 . AppendixLemma A.1.
Let F be a k ˆ p matrix of rank p , Σ γ a non-negative definite p ˆ p matrix, Σ ε apositive definite k ˆ k matrix, V “ FΣ γ F T ` Σ ε , and M “ F T V ´ F . Then M ´ “ p F T Σ ´ ε F q ´ ` Σ γ . (A.1) Proof.
We prove the statement of the Lemma by showing that multiplication of M with the righthand side C “ p F T Σ ´ ε F q ´ ` Σ γ results in the k ˆ k identity matrix I k . For this note first thatafter premultiplication with F the right hand side can be expanded to FC “ Σ ε Σ ´ ε F p F T Σ ´ ε F q ´ ` FΣ γ p F T Σ ´ ε F qp F T Σ ´ ε F q ´ “ VΣ ´ ε F p F T Σ ´ ε F q ´ . Hence, by straightforward multiplication MC “ F T V ´ FC “ F T Σ ´ ε F p F T Σ ´ ε F q ´ “ I kk