Linear Estimation of Location and Scale Parameters Using Partial Maxima
Nickos Papadatos
Department of Mathematics, Section of Statistics and O.R., University of Athens, Panepistemiopolis, 157 84 Athens, Greece. e-mail: [email protected]
Abstract
Consider an i.i.d. sample $X^*_1, X^*_2, \ldots, X^*_n$ from a location-scale family, and assume that the only available observations consist of the partial maxima (or minima) sequence, $X^*_{1:1}, X^*_{2:2}, \ldots, X^*_{n:n}$, where $X^*_{j:j} = \max\{X^*_1, \ldots, X^*_j\}$. This kind of truncation appears in several circumstances, including best performances in athletics events. In the case of partial maxima, the form of the BLUEs (best linear unbiased estimators) is quite similar to the form of the well-known Lloyd's (1952, Least-squares estimation of location and scale parameters using order statistics, Biometrika, vol. 39, pp. 88-95) BLUEs, based on (the sufficient sample of) order statistics, but, in contrast to the classical case, their consistency is no longer obvious. The present paper is mainly concerned with the scale parameter, showing that the variance of the partial maxima BLUE is at most of order $O(1/\log n)$, for a wide class of distributions.

Key words and phrases: Partial Maxima BLUEs; Location-scale family; Partial Maxima Spacings; S/BSW-type condition; NCP and NCS class; Log-concave distributions; Consistency for the scale estimator; Records.
There are several situations where the ordered random sample,
$$X^*_{1:n} \le X^*_{2:n} \le \cdots \le X^*_{n:n}, \qquad (1.1)$$
corresponding to the i.i.d. random sample $X^*_1, X^*_2, \ldots, X^*_n$, is not fully reported, because the values of interest are the higher (or lower), up-to-the-present, record values based on the initial sample, i.e., the partial maxima (or minima) sequence
$$X^*_{1:1} \le X^*_{2:2} \le \cdots \le X^*_{n:n}, \qquad (1.2)$$
where $X^*_{j:j} = \max\{X^*_1, \ldots, X^*_j\}$. A situation of this kind commonly appears in athletics, when only the best performances are recorded.

Throughout this article we assume that the i.i.d. data arise from a location-scale family, $\{F((\cdot - \theta_1)/\theta_2):\ \theta_1 \in \mathbb{R},\ \theta_2 > 0\}$, where the d.f. $F(\cdot)$ is free of parameters and has finite, non-zero variance (so that $F$ is non-degenerate), and we consider the partial maxima BLUE (best linear unbiased estimator) for both parameters $\theta_1$ and $\theta_2$. This consideration is along the lines of the classical Lloyd's (1952) BLUEs, the only difference being that the linear estimators are now based on the "insufficient sample" (1.2), rather than (1.1), and this fact implies a substantial reduction in the available information. Tryfos and Blackmore (1985) used this kind of data to predict future records in athletic events, Samaniego and Whitaker (1986, 1988) estimated the population characteristics, while Hofmann and Nagaraja (2003) investigated the amount of Fisher information contained in such data; see also Arnold, Balakrishnan & Nagaraja (1998, Section 5.9).

A natural question concerns the consistency of the resulting BLUEs, since too much lack of information would presumably result in inconsistency (see the end of Section 6). Thus, our main focus is on conditions guaranteeing consistency, and the main result shows that this is indeed the case for the scale parameter BLUE for a wide class of distributions.
Specifically, it is shown that the variance of the BLUE is at most of order $O(1/\log n)$ when $F(x)$ has a log-concave density $f(x)$ and satisfies the von Mises-type condition (5.11) or (6.1) (cf. Galambos (1978)) on the right end-point of its support (Theorem 5.2, Corollary 6.1). The result is applicable to several commonly used distributions, like the Power distribution (Uniform), the Weibull (Exponential), the Pareto, the Negative Exponential, the Logistic, the Extreme Value (Gumbel) and the Normal (see section 6). A consistency result for the partial maxima BLUE of the location parameter would be desirable, but it seems that the proposed technique (based on partial maxima spacings, section 4) does not suffice for deriving it. Therefore, the consistency for the location parameter remains an open problem in general, and it is just highlighted by a particular application to the Uniform location-scale family (section 3).

The proof of the main result depends on the fact that, under mild conditions, the partial maxima spacings have non-positive correlation. The class of distributions having this property is called NCP (negative correlation for partial maxima spacings). It is shown here that any log-concave distribution with finite variance belongs to NCP (Theorem 4.2). In particular, if a distribution function has a density which is either log-concave or non-increasing then it is a member of NCP. For ordinary spacings, similar sufficient conditions were shown by Sarkadi (1985) and Bai, Sarkar & Wang (1997) -- see also David and Nagaraja (2003, pp. 187-188) and Burkschat (2009), Theorem 3.5 -- and will be referred to as "S/BSW-type conditions".

In every experiment where the i.i.d. observations arise in a sequential manner, the partial maxima data describe the best performances in a natural way, as the experiment goes on, in contrast to the first $n$ record values, $R_1, R_2, \ldots$
, $R_n$, which are obtained from an inverse sampling scheme -- see, e.g., Berger and Gulati (2001). Due to the very rare appearance of records, in the latter case it is implicitly assumed that the sample size is, roughly, $e^n$. This has a similar effect in the partial maxima setup, since the number of different values is about $\log n$, for large sample size $n$. Clearly, the total amount of information in the partial maxima sample is the same as that given by the (few) record values augmented by record times. The essential difference of these models (records / partial maxima) in statistical applications is highlighted, e.g., in Tryfos and Blackmore (1985), Samaniego and Whitaker (1986, 1988), Smith (1988), Berger and Gulati (2001) and Hofmann and Nagaraja (2003) -- see also Arnold, Balakrishnan & Nagaraja (1998, Chapter 5).

Consider the random sample $X^*_1, X^*_2, \ldots, X^*_n$ from $F((x - \theta_1)/\theta_2)$ and the corresponding partial maxima sample $X^*_{1:1} \le X^*_{2:2} \le \cdots \le X^*_{n:n}$ ($\theta_1 \in \mathbb{R}$ is the location parameter and $\theta_2 > 0$ is the scale parameter). Let $X_1, X_2, \ldots, X_n$ and $X_{1:1} \le X_{2:2} \le \cdots \le X_{n:n}$ be the corresponding samples from the completely specified d.f. $F(x)$, that generates the location-scale family. Since
$$(X^*_{1:1}, X^*_{2:2}, \ldots, X^*_{n:n})' \stackrel{d}{=} (\theta_1 + \theta_2 X_{1:1},\ \theta_1 + \theta_2 X_{2:2},\ \ldots,\ \theta_1 + \theta_2 X_{n:n})',$$
a linear estimator based on partial maxima has the form
$$L = \sum_{i=1}^n c_i X^*_{i:i} \stackrel{d}{=} \theta_1 \sum_{i=1}^n c_i + \theta_2 \sum_{i=1}^n c_i X_{i:i},$$
for some constants $c_i$, $i = 1, 2, \ldots, n$.

Let $X = (X_{1:1}, X_{2:2}, \ldots, X_{n:n})'$ be the random vector of partial maxima from the known d.f. $F(x)$, and use the notation
$$\mu = \mathrm{IE}[X], \quad \Sigma = \mathrm{ID}[X] \quad \text{and} \quad E = \mathrm{IE}[XX'], \qquad (2.1)$$
where $\mathrm{ID}[\xi]$ denotes the dispersion matrix of any random vector $\xi$. Clearly, $\Sigma = E - \mu\mu'$, $\Sigma > 0$, $E > 0$. The linear estimator $L$ is called BLUE for $\theta_k$ ($k = 1,$
2) if it is unbiased for $\theta_k$ and its variance is minimal, while it is called BLIE (best linear invariant estimator) for $\theta_k$ if it is invariant for $\theta_k$ and its mean squared error, $\mathrm{MSE}[L] = \mathrm{IE}[L - \theta_k]^2$, is minimal. Here "invariance" is understood in the sense of location-scale invariance as it is defined, e.g., in Shao (2005, p. xix).

Using the above notation it is easy to verify the following formulae for the BLUEs and their variances. They are the partial maxima analogues of Lloyd's (1952) estimators and, in the case of partial minima, have been obtained by Tryfos and Blackmore (1985), using least squares. A proof is attached here for easy reference.

Proposition 2.1
The partial maxima
BLUEs for $\theta_1$ and for $\theta_2$ are, respectively,
$$L_1 = -\frac{1}{\Delta}\,\mu'\Gamma X^* \quad \text{and} \quad L_2 = \frac{1}{\Delta}\,\mathbf{1}'\Gamma X^*, \qquad (2.2)$$
where $X^* = (X^*_{1:1}, X^*_{2:2}, \ldots, X^*_{n:n})'$, $\Delta = (\mathbf{1}'\Sigma^{-1}\mathbf{1})(\mu'\Sigma^{-1}\mu) - (\mathbf{1}'\Sigma^{-1}\mu)^2 > 0$, $\mathbf{1} = (1, 1, \ldots, 1)' \in \mathbb{R}^n$ and $\Gamma = \Sigma^{-1}(\mathbf{1}\mu' - \mu\mathbf{1}')\Sigma^{-1}$. The corresponding variances are
$$\mathrm{Var}[L_1] = \frac{1}{\Delta}(\mu'\Sigma^{-1}\mu)\,\theta_2^2 \quad \text{and} \quad \mathrm{Var}[L_2] = \frac{1}{\Delta}(\mathbf{1}'\Sigma^{-1}\mathbf{1})\,\theta_2^2. \qquad (2.3)$$

Proof: Let $c = (c_1, c_2, \ldots, c_n)' \in \mathbb{R}^n$ and $L = c'X^*$. Since $\mathrm{IE}[L] = (c'\mathbf{1})\theta_1 + (c'\mu)\theta_2$, $L$ is unbiased for $\theta_1$ iff $c'\mathbf{1} = 1$ and $c'\mu = 0$, while it is unbiased for $\theta_2$ iff $c'\mathbf{1} = 0$ and $c'\mu = 1$. Since $\mathrm{Var}[L] = (c'\Sigma c)\theta_2^2$, a simple minimization argument for $c'\Sigma c$ with respect to $c$, using Lagrange multipliers, yields the expressions (2.2) and (2.3). □

Similarly, one can derive the partial maxima version of Mann's (1969) best linear invariant estimators (BLIEs), as follows.
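Formulae (2.2)-(2.3) translate directly into code. The following sketch (not part of the paper) implements them with numpy and checks unbiasedness by simulation for a Uniform location-scale family, whose partial maxima moments (given in section 3) are $\mu_i = i/(i+1)$ and $\sigma_{ij} = i/((i+1)(j+1)(j+2))$ for $i \le j$; the parameter values, sample size and number of replications are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def pm_blues(mu, Sigma, Xstar):
    """Partial maxima BLUEs (2.2): L1 = -mu'Gamma X*/Delta, L2 = 1'Gamma X*/Delta,
    with Gamma = Sigma^{-1}(1 mu' - mu 1')Sigma^{-1}."""
    n = len(mu)
    one = np.ones(n)
    Si = np.linalg.inv(Sigma)
    Delta = (one @ Si @ one) * (mu @ Si @ mu) - (one @ Si @ mu) ** 2
    Gamma = Si @ (np.outer(one, mu) - np.outer(mu, one)) @ Si
    return -(mu @ Gamma @ Xstar) / Delta, (one @ Gamma @ Xstar) / Delta

# Exact partial maxima moments for the standard uniform (see section 3):
n = 10
i = np.arange(1, n + 1)
mu = i / (i + 1.0)
I, J = np.meshgrid(i, i, indexing="ij")
lo, hi = np.minimum(I, J), np.maximum(I, J)
Sigma = lo / ((lo + 1.0) * (hi + 1.0) * (hi + 2.0))

theta1, theta2 = 3.0, 2.0   # arbitrary true parameter values
est = np.array([pm_blues(mu, Sigma, theta1 + theta2 * np.maximum.accumulate(rng.random(n)))
                for _ in range(20000)])
print(est.mean(axis=0))      # both averages approach (theta1, theta2) = (3, 2)
```

For clarity the inverse of $\Sigma$ is recomputed on each call; in practice one would factor $\Sigma$ once.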
Proposition 2.2
The partial maxima
BLIEs for $\theta_1$ and for $\theta_2$ are, respectively,
$$T_1 = \frac{\mathbf{1}'E^{-1}X^*}{\mathbf{1}'E^{-1}\mathbf{1}} \quad \text{and} \quad T_2 = \frac{\mathbf{1}'G X^*}{\mathbf{1}'E^{-1}\mathbf{1}}, \qquad (2.4)$$
where $X^*$ and $\mathbf{1}$ are as in Proposition 2.1 and $G = E^{-1}(\mathbf{1}\mu' - \mu\mathbf{1}')E^{-1}$. The corresponding mean squared errors are
$$\mathrm{MSE}[T_1] = \frac{\theta_2^2}{\mathbf{1}'E^{-1}\mathbf{1}} \quad \text{and} \quad \mathrm{MSE}[T_2] = \left(1 - \frac{D}{\mathbf{1}'E^{-1}\mathbf{1}}\right)\theta_2^2, \qquad (2.5)$$
where $D = (\mathbf{1}'E^{-1}\mathbf{1})(\mu'E^{-1}\mu) - (\mathbf{1}'E^{-1}\mu)^2 > 0$.

Proof:
Let $L = L(X^*) = c'X^*$ be an arbitrary linear statistic. Since $L(bX^* + a\mathbf{1}) = a(c'\mathbf{1}) + bL(X^*)$ for arbitrary $a \in \mathbb{R}$ and $b >$
0, it follows that $L$ is invariant for $\theta_1$ iff $c'\mathbf{1} = 1$, while it is invariant for $\theta_2$ iff $c'\mathbf{1} = 0$. Both (2.4) and (2.5) now follow by a simple minimization argument, since in the first case we have to minimize the mean squared error $\mathrm{IE}[L - \theta_1]^2 = (c'Ec)\theta_2^2$ under $c'\mathbf{1} = 1$, while in the second one, we have to minimize the mean squared error $\mathrm{IE}[L - \theta_2]^2 = (c'Ec - 2\mu'c + 1)\theta_2^2$ under $c'\mathbf{1} = 0$. □

The above formulae (2.2)-(2.5) are well-known for order statistics and records -- see David (1981, Chapter 6), Arnold, Balakrishnan & Nagaraja (1992, Chapter 7; 1998, Chapter 5), David and Nagaraja (2003, Chapter 8). In the present setup, however, the meaning of $X^*$, $X$, $\mu$, $\Sigma$ and $E$ is completely different. In the case of order statistics, for example, the vector $\mu$, which is the mean vector of the order statistics $X = (X_{1:n}, X_{2:n}, \ldots, X_{n:n})'$ from the known distribution $F(x)$, depends on the sample size $n$, in the sense that the components of the vector $\mu$ completely change with $n$. In the present case of partial maxima, the first $n$ entries of the vector $\mu$, which is the mean vector of the partial maxima $X = (X_{1:1}, X_{2:2}, \ldots, X_{n:n})'$ from the known distribution $F(x)$, remain constant for all sample sizes $n'$ greater than or equal to $n$. Similar observations apply for the matrices $\Sigma$ and $E$. This fact seems to be quite helpful for the construction of tables giving the means, variances and covariances of partial maxima for samples up to a size $n$. It should be noted, however, that even when $F(x)$ is absolutely continuous with density $f(x)$ (as is usually the case for location-scale families), the joint distribution of $(X_{i:i}, X_{j:j})$ has a singular part, since $\mathrm{IP}[X_{i:i} = X_{j:j}] = i/j > 0$ for $i < j$. Nevertheless, there exist simple expectation and covariance formulae (Lemma 2.2).

As in the order statistics setup, the actual application of formulae (2.2) and (2.4) requires closed forms for $\mu$ and $\Sigma$, and also to invert the $n \times n$ matrix $\Sigma$.
This can be done only for very particular distributions (see the next section, where we apply the results to the Uniform distribution). Therefore, numerical methods should be applied in general. This, however, has a theoretical cost: it is not a trivial fact to verify consistency of the estimators, even in the classical case of order statistics. The main purpose of this article is to verify consistency for the partial maxima BLUEs. Surprisingly, it seems that a solution to this problem is not well-known, at least to our knowledge, even for the classical BLUEs based on order statistics. However, even if the result of the following lemma is known, its proof has an independent interest, because it proposes alternative (to BLUEs) $n^{-1/2}$-consistent unbiased linear estimators and provides the intuition for the derivation of the main result of the present article.

Lemma 2.1
The classical
BLUEs of $\theta_1$ and $\theta_2$, based on order statistics from a location-scale family, created by a distribution $F(x)$ with finite non-zero variance, are consistent. Moreover, their variance is at most of order $O(1/n)$.

Proof:
Let $X^* = (X^*_{1:n}, X^*_{2:n}, \ldots, X^*_{n:n})'$ and $X = (X_{1:n}, X_{2:n}, \ldots, X_{n:n})'$ be the ordered samples from $F((x - \theta_1)/\theta_2)$ and $F(x)$, respectively, so that $X^* \stackrel{d}{=} \theta_1\mathbf{1} + \theta_2 X$. Also write $X^*_1, X^*_2, \ldots, X^*_n$ and $X_1, X_2, \ldots, X_n$ for the corresponding i.i.d. samples. We consider the linear estimators
$$S_1 = \overline{X^*} = \frac{1}{n}\sum_{i=1}^n X^*_i \stackrel{d}{=} \theta_1 + \theta_2\overline{X}$$
and
$$S_2 = \frac{1}{n(n-1)}\sum_{i=1}^n\sum_{j=1}^n |X^*_j - X^*_i| \stackrel{d}{=} \frac{\theta_2}{n(n-1)}\sum_{i=1}^n\sum_{j=1}^n |X_j - X_i|,$$
i.e., $S_1$ is the sample mean and $S_2$ is a multiple of Gini's statistic. Observe that both $S_1$ and $S_2$ are linear estimators in order statistics. [In particular, $S_2$ can be written as $S_2 = 4(n(n-1))^{-1}\sum_{i=1}^n (i - (n+1)/2)X^*_{i:n}$.] Clearly, $\mathrm{IE}(S_1) = \theta_1 + \theta_2\mu_0$, $\mathrm{IE}(S_2) = \theta_2\tau$, where $\mu_0$ is the mean, $\mathrm{IE}(X_1)$, of the distribution $F(x)$ and $\tau$ is the positive finite parameter $\mathrm{IE}|X_1 - X_2|$. Since $F$ is known, both $\mu_0 \in \mathbb{R}$ and $\tau > 0$ are known constants, and we can define $U_1 = S_1 - (\mu_0/\tau)S_2$ and $U_2 = S_2/\tau$. Obviously, $\mathrm{IE}(U_k) = \theta_k$, $k = 1,$
2, and both $U_1$, $U_2$ are linear estimators of the form $T_n = (1/n)\sum_{i=1}^n \delta(i, n)X^*_{i:n}$, with $|\delta(i, n)|$ uniformly bounded for all $i$ and $n$. If $\sigma^2$ is the (assumed finite) variance of $F(x)$, it follows (recalling that the covariances of order statistics are non-negative) that
$$\mathrm{Var}[T_n] \le \frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n |\delta(i,n)||\delta(j,n)|\,\mathrm{Cov}(X^*_{i:n}, X^*_{j:n}) \le \frac{1}{n^2}\Big(\max_{1\le i\le n}|\delta(i,n)|\Big)^2\,\mathrm{Var}(X^*_{1:n} + X^*_{2:n} + \cdots + X^*_{n:n}) = \frac{1}{n}\Big(\max_{1\le i\le n}|\delta(i,n)|\Big)^2\theta_2^2\sigma^2 = O(n^{-1}) \to 0,$$
as $n \to \infty$, so $\mathrm{Var}(U_k) \to$
0, and thus $U_k$ is consistent for $\theta_k$, $k = 1,$
2. Since $L_k$ has minimum variance among all linear unbiased estimators, it follows that $\mathrm{Var}(L_k) \le \mathrm{Var}(U_k) = O(1/n)$, and the result follows. □

The above lemma implies that the mean squared error of the BLIEs, based on order statistics, is at most of order $O(1/n)$, since they have smaller mean squared error than the BLUEs, and thus they are also consistent. More important is the fact that, with the technique used in Lemma 2.1, one can avoid all computations involving means, variances and covariances of order statistics, and one does not need to invert any matrix, in order to prove consistency (and in order to obtain $O(n^{-1/2})$-consistent estimators). Arguments of a similar kind will be applied in section 5, where the problem of consistency for the partial maxima BLUE of $\theta_2$ will be taken under consideration.

We now turn to the partial maxima case. Since actual application of partial maxima BLUEs and BLIEs requires the computation of the first two moments of $X = (X_{1:1}, X_{2:2}, \ldots, X_{n:n})'$ in terms of the completely specified d.f. $F(x)$, the following formulae are mentioned here (cf. Jones and Balakrishnan (2002)).

Lemma 2.2
Let $X_{1:1} \le X_{2:2} \le \cdots \le X_{n:n}$ be the partial maxima sequence based on an arbitrary d.f. $F(x)$.
(i) For $i \le j$, the joint d.f. of $(X_{i:i}, X_{j:j})$ is
$$F_{X_{i:i}, X_{j:j}}(x, y) = \begin{cases} F^j(y) & \text{if } x > y, \\ F^i(x)F^{j-i}(y) & \text{if } x \le y. \end{cases} \qquad (2.6)$$
(ii) If $F$ has finite first moment, then
$$\mu_i = \mathrm{IE}[X_{i:i}] = \int_0^{\infty} (1 - F^i(x))\,dx - \int_{-\infty}^0 F^i(x)\,dx \qquad (2.7)$$
is finite for all $i$.
(iii) If $F$ has finite second moment, then
$$\sigma_{ij} = \mathrm{Cov}[X_{i:i}, X_{j:j}] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \big(F_{X_{i:i}, X_{j:j}}(x, y) - F^i(x)F^j(y)\big)\,dy\,dx \qquad (2.8)$$
is finite for all $i \le j$.

Proof: Part (i) is immediate and part (ii) is standard; part (iii) follows from Hoeffding's covariance formula,
$$\mathrm{Cov}[X, Y] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (F_{X,Y}(x, y) - F_X(x)F_Y(y))\,dy\,dx \qquad (2.9)$$
(see Hoeffding (1940), Lehmann (1966), Jones and Balakrishnan (2002), among others), applied to $(X, Y) = (X_{i:i}, X_{j:j})$ with joint d.f. given by (2.6) and marginals $F^i(x)$ and $F^j(y)$. □

Note that the above formulae hold even if $F$ does not have a density. Tryfos and Blackmore (1985) obtained an expression for the covariance of partial minima involving means and covariances of order statistics from lower sample sizes.

Let $X^*_1, X^*_2, \ldots, X^*_n \sim U(\theta_1, \theta_1 + \theta_2)$, so that $(X^*_{1:1}, X^*_{2:2}, \ldots, X^*_{n:n})' \stackrel{d}{=} \theta_1\mathbf{1} + \theta_2 X$, where $X = (X_{1:1}, X_{2:2}, \ldots, X_{n:n})'$ is the partial maxima sample from the standard Uniform distribution. Simple calculations, using (2.7)-(2.9), show that the mean vector $\mu = (\mu_i)$ and the dispersion matrix $\Sigma = (\sigma_{ij})$ of $X$ are given by (see also Tryfos and Blackmore (1985), eq. (3.1))
$$\mu_i = \frac{i}{i+1} \quad \text{and} \quad \sigma_{ij} = \frac{i}{(i+1)(j+1)(j+2)} \quad \text{for } 1 \le i \le j \le n.$$
Therefore, $\Sigma$ is a patterned matrix of the form $\sigma_{ij} = a_i b_j$ for $i \le j$, and thus its inverse is tridiagonal; see Graybill (1969, Chapter 8), Arnold, Balakrishnan & Nagaraja (1992, Lemma 7.5.1). Specifically,
$$\Sigma^{-1} = \begin{pmatrix} \gamma_1 & -\delta_1 & & & \\ -\delta_1 & \gamma_2 & -\delta_2 & & \\ & -\delta_2 & \gamma_3 & \ddots & \\ & & \ddots & \gamma_{n-1} & -\delta_{n-1} \\ & & & -\delta_{n-1} & \gamma_n \end{pmatrix},$$
where
$$\gamma_i = \frac{4(i+1)^3(i+2)^2}{(2i+1)(2i+3)}, \quad \delta_i = \frac{(i+1)(i+2)^2(i+3)}{2i+3}, \quad i = 1, 2, \ldots, n-1,$$
and
$$\gamma_n = \frac{(n+1)^2(n+2)^2}{2n+1}.$$
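These closed forms are easy to check numerically. The following sketch (not part of the paper) rebuilds $\mu$ and $\Sigma$ for the standard uniform, confirms that $\Sigma^{-1}$ is tridiagonal with $\gamma_1 = 96/5$ and $\delta_1 = 72/5$, and evaluates the BLUE variances of Proposition 2.1, which are seen to shrink only very slowly with $n$; the sample sizes are arbitrary illustrative choices.

```python
import numpy as np

def pm_moments_uniform(n):
    """mu_i = i/(i+1) and sigma_ij = i/((i+1)(j+1)(j+2)), i <= j, for the
    partial maxima of a standard-uniform sample."""
    i = np.arange(1, n + 1)
    mu = i / (i + 1.0)
    I, J = np.meshgrid(i, i, indexing="ij")
    lo, hi = np.minimum(I, J), np.maximum(I, J)
    return mu, lo / ((lo + 1.0) * (hi + 1.0) * (hi + 2.0))

def blue_variances(n):
    """(Var[L1], Var[L2])/theta2^2 via the formulas (2.3)."""
    mu, Sigma = pm_moments_uniform(n)
    one = np.ones(n)
    Si = np.linalg.inv(Sigma)
    A11, A22, A12 = one @ Si @ one, mu @ Si @ mu, one @ Si @ mu
    Delta = A11 * A22 - A12 ** 2
    return A22 / Delta, A11 / Delta

# Sigma^{-1} is tridiagonal: gamma_1 = 288/15 = 19.2, delta_1 = 72/5 = 14.4.
_, Sigma10 = pm_moments_uniform(10)
Si10 = np.linalg.inv(Sigma10)
print(Si10[0, 0], Si10[0, 1], Si10[0, 2])     # 19.2, -14.4, ~0
for n in (10, 50, 200):
    print(n, blue_variances(n))               # variances decay very slowly in n
```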
Setting $a(n) = \mathbf{1}'\Sigma^{-1}\mathbf{1}$, $b(n) = (\mathbf{1}-\mu)'\Sigma^{-1}(\mathbf{1}-\mu)$ and $c(n) = (\mathbf{1}-\mu)'\Sigma^{-1}\mathbf{1}$, we get
$$a(n) = \frac{(n+1)^2(n+2)^2}{2n+1} - 2\sum_{i=1}^{n-1}\frac{(i+1)(i+2)^2(3i+1)}{(2i+1)(2i+3)} = n^2 + o(n^2),$$
$$b(n) = \frac{(n+2)^2}{2n+1} - 2\sum_{i=1}^{n-1}\frac{(i-1)(i+2)}{(2i+1)(2i+3)} = \frac{1}{2}\log n + o(\log n),$$
$$c(n) = \frac{(n+1)(n+2)^2}{2n+1} - \sum_{i=1}^{n-1}\frac{(i+2)(4i^2+7i+1)}{(2i+1)(2i+3)} = n + o(n).$$
Therefore,
$$\mathrm{Var}[L_1] = \frac{a(n) + b(n) - 2c(n)}{a(n)b(n) - c^2(n)}\,\theta_2^2 = \left(\frac{2}{\log n} + o\Big(\frac{1}{\log n}\Big)\right)\theta_2^2,$$
and
$$\mathrm{Var}[L_2] = \frac{a(n)}{a(n)b(n) - c^2(n)}\,\theta_2^2 = \left(\frac{2}{\log n} + o\Big(\frac{1}{\log n}\Big)\right)\theta_2^2.$$
The preceding computation shows that, for the Uniform location-scale family, the partial maxima BLUEs are consistent for both the location and the scale parameters, since their variance goes to zero at the speed of $2/\log n$. This fact, as expected, contrasts with the behavior of the ordinary order statistics BLUEs, where the speed of convergence is of order $n^{-2}$ for the variance of both Lloyd's estimators. However, the comparison is quite unfair here, since Lloyd's estimators are based on the complete sufficient statistic $(X^*_{1:n}, X^*_{n:n})$, and thus the variance of the order statistics BLUE is minimal among all unbiased estimators.

On the other hand we should emphasize that, under the same model, the BLUEs (and the BLIEs) based solely on the first $n$ upper records are not even consistent. In fact, the variance of both BLUEs converges to $\theta_2^2/3$, and the MSE of both BLIEs approaches $\theta_2^2/4$, as $n \to \infty$; see Arnold, Balakrishnan & Nagaraja (1998, Examples 5.3.7 and 5.4.3).

In the classical order statistics setup, Balakrishnan and Papadatos (2002) observed that the computation of the BLUE (and BLIE) of the scale parameter is simplified considerably if one uses spacings instead of order statistics -- cf. Sarkadi (1985).
Their observation applies here too, and simplifies the form of the partial maxima BLUE (and BLIE). Specifically, define the partial maxima spacings as $Z^*_i = X^*_{i+1:i+1} - X^*_{i:i} \ge 0$ and $Z_i = X_{i+1:i+1} - X_{i:i} \ge 0$, for $i = 1, 2, \ldots, n-1$, and let $Z^* = (Z^*_1, Z^*_2, \ldots, Z^*_{n-1})'$ and $Z = (Z_1, Z_2, \ldots, Z_{n-1})'$. Clearly, $Z^* \stackrel{d}{=} \theta_2 Z$, and any unbiased (or even invariant) linear estimator of $\theta_2$ based on the partial maxima sample, $L = c'X^*$, should necessarily satisfy $\sum_{i=1}^n c_i = 0$ (see the proofs of Propositions 2.1 and 2.2). Therefore, $L$ can be expressed as a linear function of the $Z^*_i$'s, $L = b'Z^*$, where now $b = (b_1, b_2, \ldots, b_{n-1})' \in \mathbb{R}^{n-1}$. Consider the mean vector $m = \mathrm{IE}[Z]$, the dispersion matrix $S = \mathrm{ID}[Z]$, and the second moment matrix $D = \mathrm{IE}[ZZ']$ of $Z$. Clearly, $S = D - mm'$, $S > 0$, $D > 0$, and the vector $m$ and the matrices $S$ and $D$ are of order $n-1$. Using exactly the same arguments as in Balakrishnan and Papadatos (2002), it is easy to verify the following.

Proposition 4.1 The partial maxima BLUE of $\theta_2$, given in Proposition 2.1, has the alternative form
$$L_2 = \frac{m'S^{-1}Z^*}{m'S^{-1}m}, \quad \text{with} \quad \mathrm{Var}[L_2] = \frac{\theta_2^2}{m'S^{-1}m}, \qquad (4.1)$$
while the corresponding BLIE, given in Proposition 2.2, has the alternative form
$$T_2 = m'D^{-1}Z^*, \quad \text{with} \quad \mathrm{MSE}[T_2] = (1 - m'D^{-1}m)\theta_2^2. \qquad (4.2)$$

It should be noted that, in general, the non-negativity of the BLUE of $\theta_2$ does not follow automatically, even for order statistics. In the order statistics setup, this problem was posed by Arnold, Balakrishnan & Nagaraja (1992), and the best known result, till now, is the one given by Bai, Sarkar & Wang (1997) and Sarkadi (1985). Even after the slight improvement given by Balakrishnan and Papadatos (2002) and by Burkschat (2009), the general case remains unsolved. The same question (of non-negativity of the BLUE) arises in the partial maxima setup, and the following theorem provides a partial positive answer.
We omit the proof, since it again follows by a straightforward application of the arguments given in Balakrishnan and Papadatos (2002).

Theorem 4.1 (i) There exists a constant $a = a_n(F)$, $0 < a < 1$, depending only on the sample size $n$ and the d.f. $F(x)$ (i.e., $a$ is free of the parameters $\theta_1$ and $\theta_2$), such that $T_2 = aL_2$. This constant is given by $a = m'D^{-1}m = m'S^{-1}m/(1 + m'S^{-1}m)$.
(ii) If either $n = 2$ or the (free of parameters) d.f. $F(x)$ is such that
$$\mathrm{Cov}[Z_i, Z_j] \le 0 \quad \text{for all } i \ne j,\ i, j = 1, \ldots, n-1, \qquad (4.3)$$
then the partial maxima BLUE (and BLIE) of $\theta_2$ is non-negative.

Note that, as in order statistics, the non-negativity of $L_2$ is equivalent to the fact that the vector $S^{-1}m$ (or, equivalently, the vector $D^{-1}m$) has non-negative entries; see Balakrishnan and Papadatos (2002) and Sarkadi (1985). Since it is important to know whether (4.3) holds, in the sequel we shall make use of the following definition.

Definition 4.1 A d.f. $F(x)$ with finite second moment (or the corresponding density $f(x)$, if it exists)
(i) belongs to the class NCS (negatively correlated spacings) if its order statistics have negatively correlated spacings for all sample sizes $n \ge 2$;
(ii) belongs to the class NCP (negatively correlated partial maxima spacings) if (4.3) holds for all sample sizes $n \ge 2$.

Bai, Sarkar & Wang (1997) showed that any log-concave density $f(x)$ with finite variance belongs to NCS -- cf. Sarkadi (1985). We call this sufficient condition the S/BSW-condition (for ordinary spacings). Burkschat (2009, Theorem 3.5) showed an extended S/BSW-condition, under which the log-concavity of both $F$ and $1-F$ suffices for the NCS class. Due to the existence of simple formulae like (4.1) and (4.2), the NCS and NCP classes provide useful tools for verifying consistency for the scale estimator, as well as non-negativity. Our purpose is to prove an S/BSW-type condition for partial maxima (see Theorem 4.2, Corollary 4.1, below). To this end, we first state Lemma 4.1, which will be used in the sequel. Only through the rest of the present section, we shall use the notation $Y_k = \max\{X_1, \ldots, X_k\}$, for any integer $k \ge 1$.
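The spacings form (4.1) is convenient computationally. The sketch below (not from the paper) builds $m$ and $S$ for the uniform family by differencing the partial maxima moments of section 3, checks (4.1) by simulation, and also illustrates the non-negativity of $S^{-1}m$ asserted above; parameter values, seed and replication count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Moments of the partial maxima of a standard-uniform sample (section 3):
n = 10
i = np.arange(1, n + 1)
mu = i / (i + 1.0)
I, J = np.meshgrid(i, i, indexing="ij")
lo, hi = np.minimum(I, J), np.maximum(I, J)
Sigma = lo / ((lo + 1.0) * (hi + 1.0) * (hi + 2.0))

A = np.eye(n)[1:] - np.eye(n)[:-1]       # differencing operator: Z = A X
m = A @ mu                                # m_k = 1/((k+1)(k+2)) for the uniform
S = A @ Sigma @ A.T                       # dispersion matrix of Z
w = np.linalg.solve(S, m)                 # S^{-1} m: entrywise non-negative here
var_L2 = 1.0 / (m @ w)                    # Var[L2]/theta2^2 = 1/(m'S^{-1}m)

theta1, theta2 = -1.0, 3.0                # arbitrary true parameter values
sims = np.array([w @ (A @ (theta1 + theta2 * np.maximum.accumulate(rng.random(n)))) / (m @ w)
                 for _ in range(20000)])
print(sims.mean(), sims.var(), var_L2 * theta2 ** 2)   # mean near 3; variances agree
```

Note that the location parameter cancels because the weights of a spacings-based statistic sum to zero across the original coordinates.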
Lemma 4.1 Fix two integers $i$, $j$, with $1 \le i < j$, and suppose that the i.i.d. r.v.'s $X_1, X_2, \ldots$ have a common d.f. $F(x)$. Let $I(\text{expression})$ denote the indicator function taking the value $1$, if the expression holds true, and $0$ otherwise.
(i) The conditional d.f. of $Y_{j+1}$ given $Y_j$ is
$$\mathrm{IP}[Y_{j+1} \le y \mid Y_j] = \begin{cases} 0, & \text{if } y < Y_j, \\ F(y), & \text{if } y \ge Y_j, \end{cases} \;=\; F(y)\,I(y \ge Y_j), \quad y \in \mathbb{R}.$$
If, in addition, $i + 1 < j$, then the following property (which is an immediate consequence of the Markovian character of the extremal process) holds:
$$\mathrm{IP}[Y_{j+1} \le y \mid Y_{i+1}, Y_j] = \mathrm{IP}[Y_{j+1} \le y \mid Y_j], \quad y \in \mathbb{R}.$$
(ii) The conditional d.f. of $Y_i$ given $Y_{i+1}$ is
$$\mathrm{IP}[Y_i \le x \mid Y_{i+1}] = \begin{cases} \dfrac{F^i(x)}{\sum_{j=0}^{i} F^j(Y_{i+1})F^{i-j}(Y_{i+1}-)}, & \text{if } x < Y_{i+1}, \\ 1, & \text{if } x \ge Y_{i+1}, \end{cases} \;=\; I(x \ge Y_{i+1}) + I(x < Y_{i+1})\,\frac{F^i(x)}{\sum_{j=0}^{i} F^j(Y_{i+1})F^{i-j}(Y_{i+1}-)}, \quad x \in \mathbb{R}.$$
If, in addition, $i + 1 < j$, then the following property (which is again an immediate consequence of the Markovian character of the extremal process) holds:
$$\mathrm{IP}[Y_i \le x \mid Y_{i+1}, Y_j] = \mathrm{IP}[Y_i \le x \mid Y_{i+1}], \quad x \in \mathbb{R}.$$
(iii) Given $(Y_{i+1}, Y_j)$, the random variables $Y_i$ and $Y_{j+1}$ are independent.

We omit the proof, since the assertions are simple by-products of the Markovian character of the process $\{Y_k,\ k \ge 1\}$, which can be embedded in a continuous time extremal process $\{Y(t),\ t > 0\}$; see Resnick (1987, Chapter 4). We merely note that a version of the Radon-Nikodym derivative of $F^{i+1}$ w.r.t. $F$ is given by
$$h_{i+1}(x) = \frac{dF^{i+1}(x)}{dF(x)} = \sum_{j=0}^{i} F^j(x)F^{i-j}(x-), \quad x \in \mathbb{R}, \qquad (4.4)$$
which is equal to $(i+1)F^i(x)$ only if $x$ is a continuity point of $F$. To see this, it suffices to verify the identity
$$\int_B dF^{i+1}(x) = \int_B h_{i+1}(x)\,dF(x) \quad \text{for all Borel sets } B \subseteq \mathbb{R}. \qquad (4.5)$$
Now (4.5) is proved by splitting the event $\{Y_{i+1} \in B\}$ according to the number and positions of the $X_k$'s attaining the maximum:
$$\int_B dF^{i+1} = \mathrm{IP}(Y_{i+1} \in B) = \sum_{j=1}^{i+1} \mathrm{IP}\Big[Y_{i+1} \in B,\ \sum_{k=1}^{i+1} I(X_k = Y_{i+1}) = j\Big] = \cdots = \int_B h_{i+1}(x)\,dF(x).$$
F ( x ) , with finite second moment, is a log-concavedistribution (in the sense that log F ( x ) is a concave function in J , where J = { x ∈ R : 0 < F ( x ) < } ) , and has not an atom at its right end-point, ω ( F ) = inf { x ∈ R : F ( x ) = 1 } . Then, F ( x ) belongs to the class NCP , i.e., (4.3) holds for all n > . Proof: For arbitrary r.v.’s X > x > −∞ and Y y < + ∞ , with respective d.f.’s F X , F Y , we haveIE [ X ] = x + Z ∞ x (1 − F X ( t )) dt and IE [ Y ] = y − Z y −∞ F Y ( t ) dt (4.6)(cf. Papadatos (2001), Jones and Balakrishnan (2002)). Assume that i < j . ByLemma 4.1(i) and (4.6) applied to F X = F Y j +1 | Y j , it follows thatIE [ Y j +1 | Y i +1 , Y j ] = IE [ Y j +1 | Y j ] = Y j + Z ∞ Y j (1 − F ( t )) dt, w.p. 1 . Similarly, by Lemma 4.1(ii) and (4.6) applied to F Y = F Y i | Y i +1 , we conclude thatIE [ Y i | Y i +1 , Y j ] = IE[ Y i | Y i +1 ] = Y i +1 − h i +1 ( Y i +1 ) Z Y i +1 −∞ F i ( t ) dt, w.p. 1 , where h i +1 is given by (4.4). Note that F is continuous on J , since it is log-concavethere, and thus, h i +1 ( x ) = ( i + 1) F i ( x ) for x ∈ J . If ω ( F ) is finite, F ( x ) is also11ontinuous at x = ω ( F ), by assumption. On the other hand, if α ( F ) = inf { x : F ( x ) > } is finite, F can be discontinuous at x = α ( F ), but in this case, h i +1 ( α ( F )) = F i ( α ( F )) > 0; see (4.4). Thus, in all cases, h i +1 ( Y i +1 ) > Y i and Y j +1 (Lemma 4.1(iii)), we haveCov ( Z i , Z j | Y i +1 , Y j ) = Cov ( Y i +1 − Y i , Y j +1 − Y j | Y i +1 , Y j )= − Cov ( Y i , Y j +1 | Y i +1 , Y j ) = 0 , w.p. 
1, so that $\mathrm{IE}[\mathrm{Cov}(Z_i, Z_j \mid Y_{i+1}, Y_j)] = 0$, and thus,
$$\begin{aligned} \mathrm{Cov}[Z_i, Z_j] &= \mathrm{Cov}[\mathrm{IE}(Z_i \mid Y_{i+1}, Y_j),\ \mathrm{IE}(Z_j \mid Y_{i+1}, Y_j)] + \mathrm{IE}[\mathrm{Cov}(Z_i, Z_j \mid Y_{i+1}, Y_j)] \\ &= \mathrm{Cov}[\mathrm{IE}(Y_{i+1} - Y_i \mid Y_{i+1}, Y_j),\ \mathrm{IE}(Y_{j+1} - Y_j \mid Y_{i+1}, Y_j)] \\ &= \mathrm{Cov}[Y_{i+1} - \mathrm{IE}(Y_i \mid Y_{i+1}, Y_j),\ \mathrm{IE}(Y_{j+1} \mid Y_{i+1}, Y_j) - Y_j] \\ &= \mathrm{Cov}[g(Y_{i+1}), h(Y_j)], \end{aligned} \qquad (4.7)$$
where
$$g(x) = \begin{cases} \dfrac{1}{(i+1)F^i(x)}\displaystyle\int_{-\infty}^{x} F^i(t)\,dt, & x > \alpha(F), \\ 0, & \text{otherwise}, \end{cases} \qquad h(x) = \int_x^{\infty} (1 - F(t))\,dt.$$
Obviously, $h(x)$ is non-increasing. On the other hand, $g(x)$ is non-decreasing in $\mathbb{R}$. This can be shown as follows. First observe that $g(\alpha(F)) = 0$ if $\alpha(F)$ is finite, while $g(x) > 0$ for $x > \alpha(F)$. Next observe that $g$ is finite and continuous at $x = \omega(F)$ if $\omega(F)$ is finite, as follows from the assumed continuity of $F$ at $x = \omega(F)$ and the fact that $F$ has finite variance. Finally, observe that $F^i(x)$, a product of log-concave functions, is also log-concave in $J$. Therefore, for arbitrary $y \in J$, the function $d(x) = F^i(x)/\int_{-\infty}^{y} F^i(t)\,dt$, $x \in (-\infty, y) \cap J$, is a probability density, and thus, it is a log-concave density with support $(-\infty, y) \cap J$. By Prekopa (1973) or Dasgupta and Sarkar (1982) it follows that the corresponding distribution function, $D(x) = \int_{-\infty}^{x} d(t)\,dt = \int_{-\infty}^{x} F^i(t)\,dt / \int_{-\infty}^{y} F^i(t)\,dt$, $x \in (-\infty, y) \cap J$, is a log-concave distribution, and since $y$ is arbitrary, $H(x) = \int_{-\infty}^{x} F^i(t)\,dt$ is a log-concave function for $x \in J$. Since $F$ is continuous in $J$, this is equivalent to the fact that the function
$$\frac{H'(x)}{H(x)} = \frac{F^i(x)}{\int_{-\infty}^{x} F^i(t)\,dt}, \quad x \in J,$$
is non-increasing, so that $g(x) = H(x)/((i+1)H'(x))$ is non-decreasing in $J$. The desired result follows from (4.7), because the r.v.'s $Y_{i+1}$ and $Y_j$ are positively quadrant dependent (PQD -- Lehmann (1966)), since it is readily verified that $F_{Y_{i+1}, Y_j}(x, y) \ge F_{Y_{i+1}}(x)F_{Y_j}(y)$ for all $x$ and $y$ (Lemma 2.2(i)).
This completes the proof. □

The restriction that $F(x) \to 1$ as $x \to \omega(F)-$ cannot be removed from the theorem. Indeed, the function
$$F(x) = \begin{cases} 0, & x < 0, \\ x/2, & 0 \le x < 1, \\ 1, & x \ge 1, \end{cases}$$
is a log-concave distribution in $J = (\alpha(F), \omega(F)) = (0, 1)$, but it has an atom at $\omega(F) = 1$ and a direct computation (with $n = 4$) gives $\mathrm{Cov}[Z_2, Z_3] = 11/46080 > 0$, so that (4.3) fails. The function $g$, used in the proof, is given by
$$g(x) = \begin{cases} \max\left\{0, \dfrac{x}{(i+1)^2}\right\}, & x < 1, \\ \dfrac{x-1}{i+1} + \dfrac{1}{(i+1)^2\,2^i}, & x \ge 1, \end{cases}$$
and it is not monotonic.

Since the family of densities with log-concave distributions contains both the families of log-concave and of non-increasing densities (see, e.g., Prekopa (1973), Dasgupta and Sarkar (1982), Sengupta and Nanda (1999), Bagnoli and Bergstrom (2005)), the following corollary is an immediate consequence of Theorems 4.1 and 4.2.

Corollary 4.1 Assume that $F(x)$ has finite second moment.
(i) If $F(x)$ is a log-concave d.f. (in particular, if $F(x)$ has either a log-concave or a non-increasing (in its interval support) density $f(x)$), then the partial maxima BLUE and the partial maxima BLIE of $\theta_2$ are non-negative.
(ii) If $F(x)$ has either a log-concave or a non-increasing (in its interval support) density $f(x)$ then it belongs to the NCP class.

Sometimes it is asserted that "the distribution of a log-convex density is log-concave" (see, e.g., Sengupta and Nanda (1999), Proposition 1(e)), but this is not correct in its full generality, even if the corresponding r.v. $X$ is non-negative. For example, let $Y \sim$ Weibull with shape parameter $1/2$, and set $X \stackrel{d}{=} Y \mid Y < 1$. Then $X$ has density $f$ and d.f. $F$ given by
$$f(x) = \frac{\exp(-\sqrt{x})}{2(1 - e^{-1})\sqrt{x}}, \qquad F(x) = \frac{1 - \exp(-\sqrt{x})}{1 - e^{-1}}, \qquad 0 < x < 1,$$
and it is easily checked that $\log f$ is convex in $J = (0, 1)$, while $F$ is not log-concave in $J$. However, we point out that if $\sup J = +\infty$ then any log-convex density supported on $J$ has to be non-increasing in $J$ and, therefore, its distribution is log-concave in $J$. Examples of log-concave distributions having a log-convex density are given by Bagnoli and Bergstrom (2005).
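Both Theorem 4.2 and the role of the continuity assumption at $\omega(F)$ can be probed by simulation. The sketch below (not part of the paper; sample sizes and seed are arbitrary) estimates the spacing covariances for the standard exponential, whose log-concave density places it in NCP, and for a truncated-uniform d.f. with an atom at its right end-point, in line with the counterexample discussed above.

```python
import numpy as np

rng = np.random.default_rng(7)

def spacings(samples):
    """Partial maxima spacings Z_i = Y_{i+1} - Y_i, computed row-wise."""
    return np.diff(np.maximum.accumulate(samples, axis=1), axis=1)

# Standard exponential: log-concave density, hence NCP (Corollary 4.1(ii)).
reps, n = 400_000, 6
Z = spacings(rng.exponential(size=(reps, n)))
C = np.cov(Z, rowvar=False)
off = C[~np.eye(n - 1, dtype=bool)]
print(off.max())                           # non-positive, up to Monte Carlo noise

# F(x) = x/2 on [0, 1) with an atom of mass 1/2 at 1 (no NCP):
U = rng.random(size=(reps, 4))
X = np.where(U < 0.5, 2.0 * U, 1.0)
Zc = spacings(X)
print(np.cov(Zc[:, 1], Zc[:, 2])[0, 1])    # positive: (4.3) fails for (Z_2, Z_3)
```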
Throughout this section we always assume that $F(x)$, the d.f. that generates the location-scale family, is non-degenerate and has finite second moment. The main purpose is to verify consistency for $L_2$, applying the results of section 4. To this end, we first state and prove a simple lemma along the lines of Lemma 2.1. Due to the obvious fact that $\mathrm{MSE}[T_2] \le \mathrm{Var}[L_2]$, all the results of the present section also apply to the BLIE of $\theta_2$.

Lemma 5.1 If $F(x)$ belongs to the NCP class then
(i)
$$\mathrm{Var}[L_2] \le \frac{\theta_2^2}{\sum_{k=1}^{n-1} m_k^2/s_k}, \qquad (5.1)$$
where $m_k = \mathrm{IE}[Z_k]$ is the $k$-th component of the vector $m$ and $s_k = s_{kk} = \mathrm{Var}[Z_k]$ is the $k$-th diagonal entry of the matrix $S$.
(ii) The partial maxima BLUE, $L_2$, is consistent if the series
$$\sum_{k=1}^{\infty} \frac{m_k^2}{s_k} = +\infty. \qquad (5.2)$$

Proof: Observe that part (ii) is an immediate consequence of part (i), due to the fact that, in contrast to the order statistics setup, $m_k$ and $s_k$ do not depend on the sample size $n$. Regarding (i), consider the linear unbiased estimator
$$U = \frac{1}{c_n}\sum_{k=1}^{n-1}\frac{m_k}{s_k}Z^*_k \stackrel{d}{=} \frac{\theta_2}{c_n}\sum_{k=1}^{n-1}\frac{m_k}{s_k}Z_k,$$
where $c_n = \sum_{k=1}^{n-1} m_k^2/s_k$. Since $F(x)$ belongs to NCP and the weights of $U$ are positive, it follows that the variance of $U$, which is greater than or equal to the variance of $L_2$, is bounded by the RHS of (5.1); this completes the proof. □

The proof of the following theorem is now immediate.

Theorem 5.1 If $F(x)$ belongs to the NCP class and if there exist a finite constant $C$ and a positive integer $k_0$ such that
$$\frac{\mathrm{IE}[Z_k^2]}{k\,\mathrm{IE}^2[Z_k]} \le C, \quad \text{for all } k \ge k_0, \qquad (5.3)$$
then
$$\mathrm{Var}[L_2] = O\left(\frac{1}{\log n}\right), \quad \text{as } n \to \infty. \qquad (5.4)$$

Proof: Since for $k \ge k_0$,
$$\frac{m_k^2}{s_k} = \frac{m_k^2}{\mathrm{IE}[Z_k^2] - m_k^2} = \frac{1}{\mathrm{IE}[Z_k^2]/\mathrm{IE}^2[Z_k] - 1} \ge \frac{1}{Ck - 1},$$
the result follows from (5.1). □

Thus, for proving consistency of order $1/\log n$ within the NCP class it is sufficient to verify (5.3) and, therefore, we shall investigate the quantities $m_k = \mathrm{IE}[Z_k]$ and $\mathrm{IE}[Z_k^2] = s_k + m_k^2$.
A simple application of Lemma 2.2, observing that $m_k = \mu_{k+1} - \mu_k$ and $s_k = \sigma_{k+1,k+1} - 2\sigma_{k,k+1} + \sigma_{kk}$, shows that
$$\mathrm{IE}[Z_k] = \int_{-\infty}^{\infty} F^k(x)(1 - F(x))\,dx, \qquad (5.5)$$
and
$$\mathrm{IE}[Z_k^2] = 2\iint_{-\infty < x \le y < \infty} F^k(x)(1 - F(y))\,dy\,dx. \qquad (5.6)$$
The asymptotic behavior of these quantities is controlled by the following integrals.

Lemma 5.2 (i) For any $t > -1$,
$$\lim_{k\to\infty} k^{1+t}\int_0^1 u^k(1-u)^t\,du = \Gamma(1+t) > 0. \qquad (5.8)$$
(ii) For any $t$ with $0 \le t < 1$ and any $a > 0$, there exist positive constants $C_1$, $C_2$, and a positive integer $k_0$ such that
$$0 < C_1 < k^{1+t}(\log k)^a \int_0^1 \frac{u^k(1-u)^t}{L^a(u)}\,du < C_2 < \infty, \quad \text{for all } k \ge k_0, \qquad (5.9)$$
where $L(u) = -\log(1-u)$.

Proof: Part (i) follows from Stirling's formula. For part (ii), with the substitution $u = 1 - e^{-x}$, we write the integral in (5.9) as
$$\frac{1}{k+1}\int_0^{\infty} (k+1)(1 - e^{-x})^k e^{-x}\,\frac{\exp(-tx)}{x^a}\,dx = \frac{1}{k+1}\,\mathrm{IE}\left[\frac{\exp(-tT)}{T^a}\right],$$
where $T$ has the same distribution as the maximum of $k+1$ i.i.d. standard exponential r.v.'s. It is well-known that $\mathrm{IE}[T] = 1^{-1} + 2^{-1} + \cdots + (k+1)^{-1}$. Since the second derivative of the function $x \mapsto e^{-tx}/x^a$ is $x^{-a-2}e^{-tx}(a + (a+tx)^2)$, which is positive for $x > 0$, this function is convex, so by Jensen's inequality we conclude that
$$k^{1+t}(\log k)^a \int_0^1 \frac{u^k(1-u)^t}{L^a(u)}\,du \ge \frac{k}{k+1}\left(\frac{\log k}{1/1 + 1/2 + \cdots + 1/(k+1)}\right)^a \exp\left[-t\left(\frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{k+1} - \log k\right)\right],$$
and the RHS remains positive as $k \to \infty$, since it converges to $e^{-\gamma t}$, where $\gamma = 0.5772\ldots$ is Euler's constant. This proves the lower bound in (5.9).
Regarding the upper bound, observe that the function $g(u)=(1-u)^t/L^a(u)$, $u\in(0,1)$, has second derivative
$$g''(u)=-\frac{(1-u)^{t-2}}{L^{a+2}(u)}\left[t(1-t)L^2(u)+a(1-2t)L(u)-a(a+1)\right],\qquad 0<u<1,$$
and since $0\leq t<1$, $a>0$ and $L(u)\to+\infty$ as $u\to1^-$, it follows that there exists a constant $b\in(0,1)$ such that $g(u)$ is concave in $(b,1)$. Write
$$I_k=I_k^{(1)}(b)+I_k^{(2)}(b)=\int_0^{b}\frac{u^k(1-u)^t}{L^a(u)}\,du+\int_b^{1}\frac{u^k(1-u)^t}{L^a(u)}\,du,$$
and observe that, for any fixed $s>0$ and $b\in(0,1)$,
$$k^s I_k^{(1)}(b)\leq k^s b^{k-a}\int_0^b\frac{u^a(1-u)^t}{L^a(u)}\,du\leq k^s b^{k-a}\int_0^1\frac{u^a(1-u)^t}{L^a(u)}\,du\to 0,\qquad k\to\infty,$$
because the last integral is finite and independent of $k$. Therefore, $k^{1+t}(\log k)^a I_k$ is bounded above if $k^{1+t}(\log k)^a I_k^{(2)}(b)$ is bounded above for some $b<1$. Choose $b$ close enough to $1$ so that $g(u)$ is concave in $(b,1)$. Then
$$I_k^{(2)}(b)=\frac{1-b^{k+1}}{k+1}\int_b^1 f_k(u)g(u)\,du=\frac{1-b^{k+1}}{k+1}\,\mathbb{E}[g(V)]\leq\frac{1}{k+1}\,g[\mathbb{E}(V)],$$
where $V$ is an r.v. with density $f_k(u)=(k+1)u^k/(1-b^{k+1})$, for $u\in(b,1)$. Indeed, $\mathbb{E}(V)=((k+1)/(k+2))(1-b^{k+2})/(1-b^{k+1})>(k+1)/(k+2)$, the inequality $\mathbb{E}[g(V)]\leq g[\mathbb{E}(V)]$ follows from Jensen's inequality (by concavity), and $g$ is positive and decreasing (its first derivative is $g'(u)=-(1-u)^{t-1}L^{-a-1}(u)(tL(u)+a)<0$ for $0<u<1$). Hence,
$$k^{1+t}(\log k)^a I_k^{(2)}(b)\leq\frac{k^{1+t}(\log k)^a}{k+1}\,g[\mathbb{E}(V)]\leq\frac{k^{1+t}(\log k)^a}{k+1}\,g\!\left(\frac{k+1}{k+2}\right)=\frac{k}{k+1}\left(\frac{\log k}{\log(k+2)}\right)^a\left(\frac{k}{k+2}\right)^t.$$
This shows that $k^{1+t}(\log k)^a I_k^{(2)}(b)$ is bounded above, and thus $k^{1+t}(\log k)^a I_k$ is bounded above, as was to be shown. The proof is complete. □

Corollary 5.1. (i) Under the assumptions of Lemma 5.2(i), for any $b\in[0,1)$ there exist positive constants $A_1$, $A_2$ and a positive integer $k_0$ such that
$$0<A_1<k^{1+t}\int_b^1u^k(1-u)^t\,du<A_2<+\infty,\qquad\text{for all } k>k_0.$$
(ii) Under the assumptions of Lemma 5.2(ii), for any $b\in[0,1)$ there exist positive constants $A_1$, $A_2$ and a positive integer $k_0$ such that
$$0<A_1<k^{1+t}(\log k)^a\int_b^1\frac{u^k(1-u)^t}{L^a(u)}\,du<A_2<\infty,\qquad\text{for all } k>k_0.$$

Proof: The proof follows from Lemma 5.2 in a trivial way, since the corresponding integrals over $[0,b]$ are bounded above by a multiple of $b^{k-a}$, of the form $Ab^{k-a}$, with $A<+\infty$ independent of $k$. □

We can now state and prove our main result.

Theorem 5.2. Assume that $F(x)$ lies in the NCP class, and let $\omega=\omega(F)$ be the upper end-point of the support of $F$, i.e., $\omega=\inf\{x\in\mathbb{R}:F(x)=1\}$, where $\omega=+\infty$ if $F(x)<1$ for all $x$. Suppose that $\lim_{x\to\omega^-}F(x)=1$, and that $F(x)$ is differentiable in a left neighborhood $(M,\omega)$ of $\omega$, with derivative $f(x)=F'(x)$ for $x\in(M,\omega)$. For $\delta\in\mathbb{R}$ and $\gamma\in\mathbb{R}$, define the (generalized hazard rate) function
$$L(x)=L(x;\delta,\gamma;F)=\frac{f(x)}{(1-F(x))^{\gamma}(-\log(1-F(x)))^{\delta}},\qquad x\in(M,\omega),\qquad(5.10)$$
and set
$$L_*=L_*(\delta,\gamma;F)=\liminf_{x\to\omega^-}L(x;\delta,\gamma;F),\qquad L^*=L^*(\delta,\gamma;F)=\limsup_{x\to\omega^-}L(x;\delta,\gamma;F).$$
If, either

(i) for some $\gamma<3/2$ and $\delta=0$, or

(ii) for some $\delta>0$ and some $\gamma$ with $1/2<\gamma\leq1$,
$$0<L_*(\delta,\gamma;F)\leq L^*(\delta,\gamma;F)<+\infty,\qquad(5.11)$$
then the partial maxima BLUE $L_n$ (given by (2.2) or (4.1)) of the scale parameter $\theta_2$ is consistent and, moreover, $\mathrm{Var}[L_n]\leq O(1/\log n)$.

Proof: First observe that, for large enough $x<\omega$, (5.11) implies that $f(x)>\frac{L_*}{2}(1-F(x))^{\gamma}(-\log(1-F(x)))^{\delta}>0$, so that $F(x)$ is eventually strictly increasing and continuous. Moreover, the derivative $f(x)$ is necessarily finite, since $f(x)<2L^*(1-F(x))^{\gamma}(-\log(1-F(x)))^{\delta}$. The assumption $\lim_{x\to\omega^-}F(x)=1$ now shows that $F^{-1}(u)$ is uniquely defined in a left neighborhood of $1$, that $F(F^{-1}(u))=u$ for $u$ close to $1$, and that $\lim_{u\to1^-}F^{-1}(u)=\omega$.
This, in turn, implies that $F^{-1}(u)$ is differentiable for $u$ close to $1$, with (finite) derivative $(F^{-1}(u))'=1/f(F^{-1}(u))>0$. In view of Theorem 5.1, we search for an upper bound on $\mathbb{E}[Z_k^2]$ and for a lower bound on $\mathbb{E}[Z_k]$. Clearly, (5.3) will be deduced if we verify that, under (i), there exist finite constants $C_1>0$ and $C_2>0$ such that
$$k^{3-2\gamma}\,\mathbb{E}[Z_k^2]\leq C_1\qquad\text{and}\qquad k^{2-\gamma}\,\mathbb{E}[Z_k]>C_2,\qquad(5.12)$$
for all large enough $k$. Similarly, (5.3) will be verified if we show that, under (ii), there exist finite constants $C_1>0$ and $C_2>0$ such that
$$k^{3-2\gamma}(\log k)^{2\delta}\,\mathbb{E}[Z_k^2]\leq C_1\qquad\text{and}\qquad k^{2-\gamma}(\log k)^{\delta}\,\mathbb{E}[Z_k]>C_2,\qquad(5.13)$$
for all large enough $k$. Since the integrands in the integral expressions (5.5)-(5.7) vanish if $x$ or $y$ lies outside the set $\{x\in\mathbb{R}:0<F(x)<1\}$, we have the equivalent expressions
$$\mathbb{E}[Z_k]=\int_{\alpha}^{\omega}F^k(x)(1-F(x))\,dx,\qquad(5.14)$$
$$\mathbb{E}[Z_k^2]=2\int_{\alpha}^{\omega}\int_x^{\omega}F^k(x)(1-F(y))\,dy\,dx,\qquad(5.15)$$
where $\alpha=\alpha(F)$ denotes the lower end-point of the support of $F$. Since $F(M)<1$, we have, for any $s>0$, as in the proof of Lemma 5.2(ii),
$$\lim_{k\to\infty}k^s\int_{\alpha}^{M}F^k(x)(1-F(x))\,dx=0,\qquad\lim_{k\to\infty}k^s\int_{\alpha}^{M}\int_x^{\omega}F^k(x)(1-F(y))\,dy\,dx=0,$$
because $F(M)<1$ and the corresponding integrals are finite for $k=1$, by the assumption that the variance is finite ((5.15) with $k=1$ just equals the variance of $F$; see also (2.9) with $i=1$). Therefore, in order to verify (5.12) and (5.13) for large enough $k$, it is sufficient to replace $\mathbb{E}[Z_k]$ and $\mathbb{E}[Z_k^2]$, in both formulae (5.12), (5.13), by the integrals $\int_M^{\omega}F^k(x)(1-F(x))\,dx$ and $\int_M^{\omega}\int_x^{\omega}F^k(x)(1-F(y))\,dy\,dx$, respectively, for an arbitrary (fixed) $M\in(\alpha,\omega)$. Fix now $M\in(\alpha,\omega)$ so large that $f(x)=F'(x)$ exists and is finite and strictly positive for all $x\in(M,\omega)$, and make the transformation $F(x)=u$ in the first integral, and the transformation $(F(x),F(y))=(u,v)$ in the second one.
Both transformations are now one-to-one and continuous, because both $F$ and $F^{-1}$ are differentiable in their respective intervals $(M,\omega)$ and $(F(M),1)$. Since $F^{-1}(u)\to\omega$ as $u\to1^-$, it is easily seen that (5.12) will follow if it can be shown that, for some fixed $b<1$,
$$k^{3-2\gamma}\int_b^1\frac{u^k}{f(F^{-1}(u))}\left(\int_u^1\frac{1-v}{f(F^{-1}(v))}\,dv\right)du\leq C_1\qquad(5.16)$$
and
$$k^{2-\gamma}\int_b^1\frac{u^k(1-u)}{f(F^{-1}(u))}\,du>C_2\qquad(5.17)$$
hold for all large enough $k$. Similarly, (5.13) will be deduced if it is proved that, for some fixed $b<1$,
$$k^{3-2\gamma}(\log k)^{2\delta}\int_b^1\frac{u^k}{f(F^{-1}(u))}\left(\int_u^1\frac{1-v}{f(F^{-1}(v))}\,dv\right)du\leq C_1\qquad(5.18)$$
and
$$k^{2-\gamma}(\log k)^{\delta}\int_b^1\frac{u^k(1-u)}{f(F^{-1}(u))}\,du>C_2\qquad(5.19)$$
hold for all large enough $k$. The rest of the proof is thus concentrated on showing (5.16) and (5.17) (resp., (5.18) and (5.19)) under assumption (i) (resp., under assumption (ii)).

Assume first that (5.11) holds under (i). Fix $b<1$ so large that $\frac{L_*}{2}(1-F(x))^{\gamma}<f(x)<2L^*(1-F(x))^{\gamma}$ for all $x\in(F^{-1}(b),\omega)$; equivalently,
$$\frac{L_*}{2}<\frac{f(F^{-1}(u))}{(1-u)^{\gamma}}<2L^*,\qquad\text{for all } u\in(b,1).\qquad(5.20)$$
Due to (5.20), the inner integral in (5.16) is
$$\int_u^1\frac{1-v}{f(F^{-1}(v))}\,dv=\int_u^1(1-v)^{1-\gamma}\,\frac{(1-v)^{\gamma}}{f(F^{-1}(v))}\,dv\leq\frac{2(1-u)^{2-\gamma}}{(2-\gamma)L_*}.$$
By Corollary 5.1(i) applied for $t=2-2\gamma>-1$, the LHS of (5.16) is less than or equal to
$$\frac{2k^{3-2\gamma}}{(2-\gamma)L_*}\int_b^1u^k(1-u)^{2-2\gamma}\,\frac{(1-u)^{\gamma}}{f(F^{-1}(u))}\,du\leq\frac{4k^{3-2\gamma}}{(2-\gamma)L_*^2}\int_b^1u^k(1-u)^{2-2\gamma}\,du\leq C_1,$$
for all $k>k_0$, with $C_1=4A_2L_*^{-2}(2-\gamma)^{-1}<\infty$, showing (5.16). Similarly, using the lower bound in (5.20), the integral in (5.17) is
$$\int_b^1\frac{u^k(1-u)}{f(F^{-1}(u))}\,du=\int_b^1u^k(1-u)^{1-\gamma}\,\frac{(1-u)^{\gamma}}{f(F^{-1}(u))}\,du>\frac{1}{2L^*}\int_b^1u^k(1-u)^{1-\gamma}\,du,$$
so that, by Corollary 5.1(i) applied for $t=1-\gamma>-1$, the LHS of (5.17) is greater than or equal to
$$\frac{k^{2-\gamma}}{2L^*}\int_b^1u^k(1-u)^{1-\gamma}\,du>\frac{A_1}{2L^*}>0,\qquad\text{for all } k>k_0,$$
showing (5.17). Assume now that (5.11) is satisfied under (ii).
As in part (i), choose $b<1$ large enough so that
$$\frac{L_*}{2}<\frac{f(F^{-1}(u))}{(1-u)^{\gamma}L^{\delta}(u)}<2L^*,\qquad\text{for all } u\in(b,1),\qquad(5.21)$$
where $L(u)=-\log(1-u)$. Due to (5.21), the inner integral in (5.18) is
$$\int_u^1\frac{(1-v)^{1-\gamma}}{L^{\delta}(v)}\,\frac{(1-v)^{\gamma}L^{\delta}(v)}{f(F^{-1}(v))}\,dv\leq\frac{2}{L_*}\int_u^1\frac{(1-v)^{1-\gamma}}{L^{\delta}(v)}\,dv\leq\frac{2(1-u)^{2-\gamma}}{L_*\,L^{\delta}(u)},$$
because $(1-u)^{1-\gamma}/L^{\delta}(u)$ is decreasing (see the proof of Lemma 5.2(ii)). By Corollary 5.1(ii) applied for $t=2-2\gamma\in[0,1)$ and $a=2\delta>0$, the double integral in (5.18) is less than or equal to
$$\frac{2}{L_*}\int_b^1u^k\,\frac{(1-u)^{2-2\gamma}}{L^{2\delta}(u)}\,\frac{(1-u)^{\gamma}L^{\delta}(u)}{f(F^{-1}(u))}\,du\leq\frac{4}{L_*^2}\int_b^1\frac{u^k(1-u)^{2-2\gamma}}{L^{2\delta}(u)}\,du\leq\frac{C_1}{k^{3-2\gamma}(\log k)^{2\delta}},\qquad k>k_0,$$
with $C_1=4A_2L_*^{-2}<\infty$, showing (5.18). Similarly, using the lower bound in (5.21), the integral in (5.19) is
$$\int_b^1u^k\,\frac{(1-u)^{1-\gamma}}{L^{\delta}(u)}\,\frac{(1-u)^{\gamma}L^{\delta}(u)}{f(F^{-1}(u))}\,du>\frac{1}{2L^*}\int_b^1\frac{u^k(1-u)^{1-\gamma}}{L^{\delta}(u)}\,du,$$
and thus, by Corollary 5.1(ii) applied for $t=1-\gamma\in[0,1)$ and $a=\delta>0$, the LHS of (5.19) is greater than or equal to
$$\frac{k^{2-\gamma}(\log k)^{\delta}}{2L^*}\int_b^1\frac{u^k(1-u)^{1-\gamma}}{L^{\delta}(u)}\,du>\frac{A_1}{2L^*}>0,\qquad\text{for all } k>k_0,$$
showing (5.19). This completes the proof. □

Remark 5.1. Writing $L(u)=-\log(1-u)$, the limits $L_*$ and $L^*$ in (5.11) can be rewritten as
$$L_*(\delta,\gamma;F)=\liminf_{u\to1^-}\frac{f(F^{-1}(u))}{(1-u)^{\gamma}L^{\delta}(u)}=\left(\limsup_{u\to1^-}(F^{-1}(u))'(1-u)^{\gamma}L^{\delta}(u)\right)^{-1},$$
$$L^*(\delta,\gamma;F)=\limsup_{u\to1^-}\frac{f(F^{-1}(u))}{(1-u)^{\gamma}L^{\delta}(u)}=\left(\liminf_{u\to1^-}(F^{-1}(u))'(1-u)^{\gamma}L^{\delta}(u)\right)^{-1}.$$
In the particular case where $F$ is absolutely continuous with a continuous density $f$ and interval support, the function $f(F^{-1}(u))=1/(F^{-1}(u))'$ is known as the density-quantile function (Parzen (1979)), and plays a fundamental role in the theory of order statistics. Theorem 5.2 shows, in some sense, that the behavior of the density-quantile function at the upper end-point, $u=1$, specifies the variance behavior of the partial maxima BLUE for the scale parameter $\theta_2$.
In fact, (5.11) (and (6.1), below) is a von Mises-type condition (cf. Galambos (1978)).

Remark 5.2. It is obvious that the condition $\lim_{x\to\omega^-}F(x)=1$ is necessary for the consistency of the BLUE (and the BLIE). Indeed, if $F$ has a point mass at $x=\omega(F)$, i.e., if $p=F(\omega)-F(\omega^-)>0$, then the event that all partial maxima are equal to the upper end-point has probability at least $p$ (which is independent of $n$). Thus, a point mass at $x=\omega(F)$ implies that, for all $n$, $\mathbb{P}(L_n=0)\geq p>0$, and consistency fails. This situation is trivial. Non-trivial cases also exist, and we provide one at the end of the next section.

In most commonly used location-scale families, the following corollary suffices for concluding consistency of the BLUE (and the BLIE) of the scale parameter. Its proof follows by a straightforward combination of Corollary 4.1(ii) and Theorem 5.2.

Corollary 6.1. Suppose that $F$ is absolutely continuous with finite variance, and that its density $f$ is either log-concave or non-increasing in its interval support $J(F)=(\alpha(F),\omega(F))=\{x\in\mathbb{R}:0<F(x)<1\}$. If, either for some $\gamma<3/2$ and $\delta=0$, or for some $\delta>0$ and some $\gamma$ with $1/2<\gamma\leq1$,
$$\lim_{x\to\omega(F)^-}\frac{f(x)}{(1-F(x))^{\gamma}(-\log(1-F(x)))^{\delta}}=L\in(0,+\infty),\qquad(6.1)$$
then the partial maxima BLUE of the scale parameter is consistent and, moreover, its variance is at most of order $O(1/\log n)$.

Corollary 6.1 has immediate applications to several location-scale families. The following are some of them, where (6.1) can be verified easily. In all the families generated by the distributions mentioned below, the variance of the partial maxima BLUE $L_n$ (see (2.3) or (4.1)) and the mean squared error of the partial maxima BLIE $T_n$ (see (2.5) or (4.2)) of the scale parameter are at most of order $O(1/\log n)$, as the sample size $n\to\infty$.

1. Power distribution (Uniform). $F(x)=x^{\lambda}$, $f(x)=\lambda x^{\lambda-1}$, $0<x<1$, $\lambda>0$, and $\omega(F)=1$. The density is non-increasing for $\lambda\leq1$ and log-concave for $\lambda\geq1$. It is easily seen that (6.1) is satisfied for $\delta=\gamma=0$ (with $L=\lambda$); for $\lambda=1$ (Uniform) see Section 3.

2. Logistic distribution.
$F(x)=(1+e^{-x})^{-1}$, $f(x)=e^{-x}(1+e^{-x})^{-2}$, $x\in\mathbb{R}$, and $\omega(F)=+\infty$. The density is log-concave, and it is easily seen that (6.1) is satisfied for $\delta=0$, $\gamma=1$.

3. Pareto distribution. $F(x)=1-x^{-a}$, $f(x)=ax^{-a-1}$, $x>1$ ($a>2$, so that the second moment is finite), and $\omega(F)=+\infty$. The density is decreasing, and it is easily seen that (6.1) is satisfied for $\delta=0$, $\gamma=1+1/a$. The Pareto case provides an example which lies in the NCP class and not in the NCS class; see Bai, Sarkar & Wang (1997).

4. Negative Exponential distribution. $F(x)=f(x)=e^{x}$, $x<0$, and $\omega(F)=0$. The density is log-concave, and it is easily seen that (6.1) is satisfied for $\delta=\gamma=0$. This model is particularly important, because it corresponds to the partial minima model from the standard exponential distribution; see Samaniego and Whitaker (1986).

5. Weibull distribution (Exponential). $F(x)=1-e^{-x^{c}}$, $f(x)=cx^{c-1}\exp(-x^{c})$, $x>0$, $c>0$, and $\omega(F)=+\infty$. The density is non-increasing for $c\leq1$ and log-concave for $c\geq1$, and it is easily seen that (6.1) is satisfied for $\delta=1-1/c$, $\gamma=1$. It should be noted that Theorem 5.2 does not apply for $c<1$, since then $\delta<0$.

6. Gumbel (Extreme Value) distribution. $F(x)=\exp(-e^{-x})=e^{x}f(x)$, $x\in\mathbb{R}$, and $\omega(F)=+\infty$. The distribution is log-concave and (6.1) holds with $\gamma=1$, $\delta=0$ ($L=1$). This model is particularly important for its applications in forecasting records, especially in athletic events; see Tryfos and Blackmore (1985).

7. Normal distribution. $f(x)=\varphi(x)=(2\pi)^{-1/2}e^{-x^2/2}$, $F=\Phi$, $x\in\mathbb{R}$, and $\omega(F)=+\infty$. The density is log-concave and Corollary 6.1 applies with $\delta=1/2$, $\gamma=1$. Indeed,
$$\lim_{x\to+\infty}\frac{\varphi(x)}{(1-\Phi(x))(-\log(1-\Phi(x)))^{1/2}}=\lim_{x\to+\infty}\frac{\varphi(x)}{x(1-\Phi(x))}\cdot\frac{x}{(-\log(1-\Phi(x)))^{1/2}},$$
and since
$$\lim_{x\to+\infty}\frac{\varphi(x)}{x(1-\Phi(x))}=1,\qquad\lim_{x\to+\infty}\frac{x^2}{-\log(1-\Phi(x))}=2,$$
it follows that $L=\sqrt{2}$.
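The Normal-distribution limit $L=\sqrt{2}$ can be confirmed numerically. In the sketch below (my own illustration, not code from the paper), the tail $1-\Phi(x)$ is computed through the complementary error function; the convergence is logarithmically slow, as one would expect from the $O(1/\log n)$ rate.

```python
import math

def normal_ratio(x):
    """phi(x) / [(1 - Phi(x)) * (-log(1 - Phi(x)))^(1/2)] for the standard normal."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(x / math.sqrt(2.0))       # 1 - Phi(x)
    return phi / (tail * math.sqrt(-math.log(tail)))

for x in (10.0, 20.0, 30.0):
    print(x, normal_ratio(x))     # increases slowly toward sqrt(2) = 1.4142...
```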
All the previous results have obvious partial minima analogues: it suffices to consider the i.i.d. sample $-X_1,\ldots,-X_n$, arising from the d.f. $F_{-X}(x)=1-F_X(-x^-)$, and to observe that $\min\{X_1,\ldots,X_i\}=-\max\{-X_1,\ldots,-X_i\}$, $i=1,\ldots,n$. Thus, we just have to replace $F(x)$ by $1-F(-x^-)$ in the corresponding formulae.

There are some related problems and questions that, at least from our point of view, seem to be quite interesting. One problem is to verify consistency for the partial maxima BLUE of the location parameter. Another problem concerns the complete characterization of the NCP and NCS classes (see Definition 4.1), since we only know S/BSW-type sufficient conditions. Also, one could try to prove or disprove the non-negativity of the partial maxima BLUE for the scale parameter outside the NCP class (as well as for the order statistics BLUE of the scale parameter outside the NCS class).

Some questions concern lower variance bounds for the partial maxima BLUEs. For example, we showed in Section 3 that the rate $O(1/\log n)$ (which, by Theorem 5.2, is just an upper bound for the variance of $L_n$) is the correct order for the variance of both estimators in the Uniform location-scale family. Is this the usual case? If it is so, then we could properly standardize the estimators, centering them and multiplying them by $(\log n)^{1/2}$. This would lead to limit theorems analogous to the corresponding ones for order statistics, e.g., Chernoff, Gastwirth & Johns (1967) and Stigler (1974), or analogous to the corresponding ones of Pyke (1965), (1980), for partial maxima spacings instead of ordinary spacings. However, note that the Fisher Information approach, in the particular case of the one-parameter (scale) family generated by the standard Exponential distribution, suggests a variance of about $3\theta^2/\log n$ for the minimum variance unbiased estimator (based on partial maxima); see Hofmann and Nagaraja (2003, eq. (15) on p. 186).

A final question concerns the construction of approximate BLUEs (for both location and scale) based on partial maxima, analogous to Gupta's (1952) simple linear estimators based on order statistics.
Such approximations and/or limit theorems would be especially useful for practical purposes, since the computation of the BLUE via its closed formula requires inverting an $n\times n$ matrix. This problem has been partially solved here: for the NCP class, the estimator $U_n$, given in the proof of Lemma 5.1, is consistent for $\theta_2$ (under the assumptions of Theorem 5.2) and can be computed by a simple formula if we merely know the means and variances of the partial maxima spacings.

Except for the trivial case given in Remark 5.2, above, there exist non-trivial examples where no consistent sequence of unbiased estimators exists for the scale parameter. To see this, we make use of the following result.

Theorem 6.1 (Hofmann and Nagaraja (2003), p. 183). Let $X_1^*,X_2^*,\ldots,X_n^*$ be an i.i.d. sample from the scale family with distribution function $F(x;\theta)=F(x/\theta)$ ($\theta>0$) and density $f(x;\theta)=f(x/\theta)/\theta$, where $f(x)$ is known, has a continuous derivative $f'$, and its support, $J(F)=\{x:f(x)>0\}$, is one of the intervals $(-\infty,\infty)$, $(-\infty,0)$ or $(0,+\infty)$.

(i) The Fisher Information contained in the partial maxima data $X_{1:1}^*\leq X_{2:2}^*\leq\cdots\leq X_{n:n}^*$ is given by
$$I_n^{\max}=\frac{1}{\theta^2}\sum_{k=1}^{n}\int_{J(F)}f(x)F^{k-1}(x)\left(1+\frac{xf'(x)}{f(x)}+(k-1)\frac{xf(x)}{F(x)}\right)^{2}dx.$$

(ii) The Fisher Information contained in the partial minima data $X_{1:1}^*\geq X_{1:2}^*\geq\cdots\geq X_{1:n}^*$ is given by
$$I_n^{\min}=\frac{1}{\theta^2}\sum_{k=1}^{n}\int_{J(F)}f(x)(1-F(x))^{k-1}\left(1+\frac{xf'(x)}{f(x)}-(k-1)\frac{xf(x)}{1-F(x)}\right)^{2}dx.$$

It is clear that, for fixed $\theta>0$, $I_n^{\max}$ and $I_n^{\min}$ both increase with the sample size $n$. In particular, if $J(F)=(0,\infty)$ then, by the Beppo Levi theorem, $I_n^{\min}$ converges (as $n\to\infty$) to its limit
$$I^{\min}=\frac{1}{\theta^2}\int_0^{\infty}\left\{\mu(x)\left(1+\frac{xf'(x)}{f(x)}-x\mu(x)\right)^{2}+x^{2}\mu^{2}(x)\left(\lambda(x)+\mu(x)\right)\right\}dx,\qquad(6.2)$$
where $\lambda(x)=f(x)/(1-F(x))$ and $\mu(x)=f(x)/F(x)$ are the failure rate and the reversed failure rate of $f$, respectively.
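The passage from $I_n^{\min}$ to the limit (6.2) amounts to summing the geometric-type series $\sum_{k\geq1}(1-F(x))^{k-1}(\cdot)^2$ pointwise in $x$. The following sketch (my own numerical check, not code from the paper) verifies this pointwise identity for the standard exponential density, for which $f'/f\equiv-1$ and $\lambda(x)\equiv1$.

```python
import math

def partial_sum_integrand(x, terms=500):
    """Sum_k f (1-F)^(k-1) (1 + x f'/f - (k-1) x lambda)^2 for f(x) = e^{-x}."""
    f = math.exp(-x)              # for the standard exponential, 1 - F(x) = f(x)
    total = 0.0
    for k in range(1, terms + 1):
        score = 1.0 - x - (k - 1) * x        # f'/f = -1 and lambda(x) = 1 here
        total += f * f ** (k - 1) * score ** 2
    return total

def limit_integrand(x):
    """mu (1 + x f'/f - x mu)^2 + x^2 mu^2 (lambda + mu), the integrand of (6.2)."""
    f = math.exp(-x)
    F = 1.0 - f
    lam, mu = f / (1.0 - F), f / F
    return mu * (1.0 - x - x * mu) ** 2 + x * x * mu * mu * (lam + mu)

for x in (0.5, 1.0, 2.0):
    print(x, abs(partial_sum_integrand(x) - limit_integrand(x)) < 1e-9)   # True
```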
Obviously, if $I^{\min}<+\infty$, then the Cramér-Rao inequality shows that no consistent sequence of unbiased estimators exists. This, of course, implies that in the corresponding scale family, any sequence of linear (in partial minima) unbiased estimators is inconsistent. The same is clearly true for the location-scale family, because any linear unbiased estimator for $\theta_2$ in the location-scale family is also a linear unbiased estimator for $\theta$ in the corresponding scale family.

In the following we show that there exist distributions with finite variance such that $I^{\min}$ in (6.2) is finite. Define $s=e^{-2}$ and
$$F(x)=\begin{cases}0, & x\leq 0,\\[4pt] \dfrac{1}{1-\log x}, & 0<x\leq s,\\[4pt] 1-(ax^2+bx+c)e^{-x}, & x>s,\end{cases}$$
where
$$a=\tfrac{1}{54}\exp(e^{-2})(18-6e^{2}+e^{4})\simeq 0.599,\qquad b=-\tfrac{2}{27}\exp(e^{-2})(9e^{-2}-12+2e^{2})\simeq -0.339,$$
$$c=\tfrac{1}{54}\exp(e^{-2})(18e^{-4}-42e^{-2}+43)\simeq 0.798.$$
Noting that $F(s)=1/3$, $F'(s)=e^{2}/9$ and $F''(s)=-e^{4}/27$, it can be easily verified that these constants make $F$, $F'$ and $F''$ continuous at $s$, and that the corresponding density,
$$f(x)=\begin{cases}\dfrac{1}{x(1-\log x)^2}, & 0<x\leq s,\\[4pt] \left(ax^2+(b-2a)x+c-b\right)e^{-x}, & x>s,\end{cases}$$
is strictly positive for $x\in(0,\infty)$, possesses finite moments of any order, and has continuous derivative
$$f'(x)=\begin{cases}\dfrac{1+\log x}{x^2(1-\log x)^3}, & 0<x\leq s,\\[4pt] -\left(ax^2+(b-4a)x+2a-2b+c\right)e^{-x}, & x>s.\end{cases}$$
Now the integrand in (6.2), say $S(x)$, can be written as
$$S(x)=\begin{cases}\dfrac{1-2\log x}{x(-\log x)(1-\log x)^3}, & 0<x\leq s,\\[4pt] A(x)+B(x), & x>s,\end{cases}$$
where $A(x)$ and $B(x)$ denote the two summands of the integrand in (6.2), and, as $x\to+\infty$, $A(x)\sim Ax^{4}e^{-x}$ and $B(x)\sim Bx^{6}e^{-2x}$, with $A$, $B$ positive constants. Therefore, with the substitution $y=-\log x$,
$$\int_0^{s}S(x)\,dx=\int_2^{\infty}\frac{1+2y}{y(1+y)^3}\,dy=\log(3/2)-\frac{5}{18}\simeq 0.128.$$
Since $S(x)$ is continuous in $[s,+\infty)$ and $S(x)\sim Ax^{4}e^{-x}$ as $x\to+\infty$, it follows that $\int_s^{\infty}S(x)\,dx<+\infty$, and $I^{\min}$ is finite. Numerical integration shows that $\int_s^{\infty}S(x)\,dx\simeq 2.77$, and thus $I^{\min}\simeq 2.9/\theta^2<3/\theta^2$. In view of the Cramér-Rao bound, this means that, even if a huge sample of partial minima has been recorded, it is impossible to construct an unbiased scale estimator with variance less than $\theta^2/3$.
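The closed-form value $\int_0^s S(x)\,dx=\log(3/2)-5/18$ can be double-checked by direct quadrature of the substituted integral (again my own check, not part of the paper).

```python
import numpy as np

# Trapezoidal check that Int_2^inf (1+2y)/(y (1+y)^3) dy = log(3/2) - 5/18,
# the value of Int_0^s S(x) dx after substituting y = -log x (the tail beyond
# y = 2000 is below 3e-7, so the truncation is harmless).
y = np.linspace(2.0, 2000.0, 2_000_001)
g = (1.0 + 2.0 * y) / (y * (1.0 + y) ** 3)
h = y[1] - y[0]
approx = h * (g.sum() - 0.5 * (g[0] + g[-1]))
exact = np.log(1.5) - 5.0 / 18.0
print(round(approx, 4), round(exact, 4))    # 0.1277 0.1277
```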
Also, it should be noted that a similar example can be constructed such that $f''(x)$ exists (and is continuous) at every interior point of the support; moreover, the mirrored distribution function
$$F(x)=\begin{cases}(ax^2-bx+c)e^{x}, & x\leq -s,\\[4pt] \dfrac{-\log(-x)}{1-\log(-x)}, & -s\leq x<0,\\[4pt] 1, & x\geq 0,\end{cases}$$
with $s$, $a$, $b$ and $c$ as before, provides the corresponding non-trivial example for the partial maxima setup.

Acknowledgements. Research partially supported by the University of Athens' Research Fund, under Grant 70/4/5637. Thanks are due to an anonymous referee for the very careful reading of the manuscript and for correcting some mistakes. I would also like to express my thanks to Professors Fred Andrews, Mohammad Raqab, Narayanaswamy Balakrishnan, Barry Arnold and Michael Akritas for their helpful discussions and comments, and to Professors Erhard Cramer and Claude Lefèvre for bringing to my attention many interesting references related to log-concave densities. Special thanks are due to Professor Fredos Papangelou, for his helpful remarks that led to a more general version of Theorem 4.2, and to Dr. Antonis Economou, whose question about the Normal distribution proved to be crucial for the final form of the main result.

References

[1] Arnold, B.C., Balakrishnan, N. and Nagaraja, H.N. (1992). A First Course in Order Statistics. Wiley, N.Y.
[2] Arnold, B.C., Balakrishnan, N. and Nagaraja, H.N. (1998). Records. Wiley, N.Y.
[3] Bai, Z., Sarkar, S.K. and Wang, W. (1997). Positivity of the best unbiased L-estimator of the scale parameter with complete or selected order statistics from location-scale distribution. Statist. Probab. Lett. 32, 181–188.
[4] Bagnoli, M. and Bergstrom, T. (2005). Log-concave probability and its applications. Econom. Theory 26, 445–469.
[5] … Statist. Probab. Lett.
[6] … J. Statist. Comput. Simul.
[7] … J. Multivariate Anal.
[8] Chernoff, H., Gastwirth, J.L. and Johns, M.V., Jr. (1967). Asymptotic distribution of linear combinations of functions of order statistics with applications to estimation. Ann. Math. Statist. 38, 52–72.
[9] … and log-concavity. In: Inequalities in Statistics and Probability, IMS Lecture Notes–Monograph Series, vol. 5, 54–58.
[10] David, H.A. (1981). Order Statistics, 2nd ed. Wiley, N.Y.
[11] David, H.A. and Nagaraja, H.N. (2003). Order Statistics, 3rd ed. Wiley, Hoboken.
[12] Galambos, J. (1978). The Asymptotic Theory of Extreme Order Statistics.
Wiley, N.Y.
[13] Graybill, F.A. (1969). Introduction to Matrices with Applications in Statistics. Wadsworth, Belmont, California.
[14] Gupta, A.K. (1952). Estimation of the mean and standard deviation of a normal population from a censored sample. Biometrika 39, 260–273.
[15] Hoeffding, W. (1940). Masstabinvariante Korrelationstheorie. Schrift. Math. Inst. Univ. Berlin 5, 181–233.
[16] Hofmann, G. and Nagaraja, H.N. (2003). Fisher information in record data. Metrika 57, 177–193.
[17] … J. Statist. Plann. Inference (C.R. Rao 80th birthday felicitation volume, Part I).
[18] … Ann. Math. Statist.
[19] Lloyd, E.H. (1952). Least-squares estimation of location and scale parameters using order statistics. Biometrika 39, 88–95.
[20] … Ann. Math. Statist.
[21] … Statist. Probab. Lett.
[22] Parzen, E. (1979). Nonparametric statistical data modeling. J. Amer. Statist. Assoc. 74, 105–121.
[23] … Acta Sci. Math. (Szeged), 335–343.
[24] Pyke, R. (1965). Spacings (with discussion). J. Roy. Statist. Soc., Ser. B 27, 395–449.
[25] Pyke, R. (1980). … Ann. Probab.
[26] Resnick, S.I. (1987). Extreme Values, Regular Variation, and Point Processes. Springer-Verlag, N.Y.
[27] Samaniego, F.J. and Whitaker, L.R. (1986). On estimating population characteristics from record-breaking observations. I. Parametric results. Naval Res. Logist. Quart. 33, 531–543.
[28] Samaniego, F.J. and Whitaker, L.R. (1988). On estimating population characteristics from record-breaking observations. II. Nonparametric results. Naval Res. Logist. 35, 221–236.
[29] … Statist. Decisions, Suppl.
[30] … Naval Res. Logist.
[31] Shao, J. (2005). Mathematical Statistics: Exercises and Solutions. Springer, N.Y.
[32] Smith, R.L. (1988). Forecasting records by maximum likelihood. J. Amer. Statist. Assoc. 83, 331–338.
[33] Stigler, S.M. (1974). Linear functions of order statistics with smooth weight functions. Ann. Statist. 2, 676–693. (Correction: 7, 466.)
[34] Tryfos, P. and Blackmore, R. (1985). Forecasting records. J. Amer. Statist. Assoc. 80, 46–50.