Extremal properties of the multivariate extended skew-normal distribution
Boris Beranger, Simone A. Padoan, Yangfan Xu, Scott A. Sisson
Abstract
The skew-normal and related families are flexible and asymmetric parametric models suitable for modelling a diverse range of systems. We show that the multivariate maximum of a high-dimensional extended skew-normal random sample has asymptotically independent components and derive the speed of convergence of the joint tail. To describe the possible dependence among the components of the multivariate maximum, we show that under appropriate conditions an approximate multivariate extreme-value distribution that leads to a rich dependence structure can be derived.

Keywords: Asymptotic independence; Coefficient of upper-tail dependence; Pickands dependence function; Multivariate extreme-value distribution; Stable-tail dependence function.
1 Introduction

The skew-normal and related families, such as the more flexible extended skew-normal and extended skew-t distributions (Arellano-Valle and Genton, 2010, Azzalini and Capitanio, 2014, Ch. 5), are suitable for data that exhibit an asymmetric distribution, while still providing relatively simple probabilistic models. For risk analysis in the fields of insurance (credit risk management, loss ratios), climatology (floods, heat waves, storms) and health (influenza mortality), it is of particular interest to study the tail behavior of the skew-normal and its related families (e.g. Peng et al., 2016, Fung and Seneta, 2014, Liao et al., 2014, Azzalini and Capitanio, 2014, Ch. 4). As a consequence, a number of results on the limiting extreme-value distribution for the extremes of skew-normal and skew-t samples have been obtained (e.g. Chang and Genton, 2007, Lysenko et al., 2009, Padoan, 2011, Beranger et al., 2017). However, while the extremal properties of skew-normal and skew-t distributions have been extensively studied, those of the more flexible extended skew-normal distribution have not yet been investigated.

In this contribution we derive the extremal properties of the multivariate extended skew-normal distribution. Recall that a d-dimensional random vector X follows an extended skew-normal distribution (Arellano-Valle and Genton, 2010), denoted X ~ ESN_d(μ, Ω, α, τ), if its probability density function (pdf) is given by

φ_d(x; μ, Ω, α, τ) = φ_d(x; μ, Ω) Φ(α^⊤ z + τ) / Φ{τ/(1 + Q_Ω̄(α))^{1/2}},   x ∈ ℝ^d,   (1)

where φ_d(x; μ, Ω) is a d-dimensional normal pdf with mean μ ∈ ℝ^d and d × d covariance matrix Ω, z = ω^{−1}(x − μ), ω = diag(Ω)^{1/2}, Ω̄ = ω^{−1} Ω ω^{−1}, Q_Ω̄(α) = α^⊤ Ω̄ α, Φ(·) is the standard univariate normal cumulative distribution function (cdf), and α ∈ ℝ^d and τ ∈ ℝ are the slant and extension parameters, respectively, which control the nature of density deviations away from normality. When τ = 0, or τ = 0 and α = 0, the extended skew-normal distribution reduces to the skew-normal SN_d(μ, Ω, α) or the normal N_d(μ, Ω) distribution, respectively. Without loss of generality, we work with location and scale standardised distributions throughout, so that ESN_d(Ω̄, α, τ) and Φ_d(x; Ω̄, α, τ) refer to the d-dimensional extended skew-normal distribution and cdf with location μ = 0 and correlation matrix Ω̄. Finally, in the univariate setting, for brevity, we write the distributional parameters in the subscript of the pdf and cdf, so that φ(x; α, τ) = φ_{α,τ}(x) and Φ(x; α, τ) = Φ_{α,τ}(x).
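The density in (1) is simple to evaluate numerically. The sketch below is purely illustrative (it is not part of the paper) and assumes the standardised form ESN_d(Ω̄, α, τ) with μ = 0; the function name desn and the parameter values are ours.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def desn(x, corr, alpha, tau):
    """Density of the standardised extended skew-normal ESN_d(corr, alpha, tau) at x,
    following (1) with mu = 0 and correlation matrix Omega-bar = corr."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    q = alpha @ corr @ alpha                        # Q_{Omega-bar}(alpha)
    norm_const = norm.cdf(tau / np.sqrt(1.0 + q))   # normalising term in (1)
    base = multivariate_normal(mean=np.zeros(len(alpha)), cov=corr).pdf(x)
    return base * norm.cdf(alpha @ x + tau) / norm_const

# illustrative example: bivariate case with correlation 0.5
corr = np.array([[1.0, 0.5], [0.5, 1.0]])
print(desn([0.3, -0.2], corr, alpha=[2.0, -1.0], tau=0.5))
```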
In this paper we establish that the multivariate maximum of a high-dimensional extended skew-normal random sample has asymptotically independent components. In particular, in the bivariate case we derive the speed of convergence of the joint upper tail. To describe the possible dependence between the components of the multivariate maximum, we consider an approach similar to that introduced in Hüsler and Reiss (1989). We compute a multivariate maximum over a triangular array of extended skew-normal random vectors and, under suitable conditions, derive an approximate multivariate extreme-value distribution for large sample sizes. This leads to a model with a rich extremal dependence structure, of which we illustrate several features.

The paper is organized as follows. In Section 2 we briefly review basic notions of multivariate extreme-value theory. In Section 3 we show that the multivariate sample maximum has asymptotically independent components and, for the bivariate case, deduce the convergence speed of the joint tail. We complete the section by deriving an approximate multivariate extreme-value distribution and discuss some features of its extremal dependence structure. All proofs are provided in the Appendix.

2 Multivariate extreme-value theory

Let I = {1, ..., d} be an index set denoting the variables of interest. Let X_1, ..., X_n be a sequence of iid d-dimensional random vectors, where X_i = (X_{i,1}, ..., X_{i,d})^⊤ for i = 1, ..., n, with a continuous joint distribution function F defined on ℝ^d and marginal distributions F_j, j ∈ I. The vector of (n-partial) sample maxima is defined componentwise as M_n = (M_{n,1}, ..., M_{n,d})^⊤ with M_{n,j} = max_{i=1,...,n} X_{i,j}, j ∈ I. As in the univariate setting, if there are sequences of normalising constants a_n = (a_{n,1}, ..., a_{n,d})^⊤ > 0 = (0, ..., 0)^⊤ and b_n = (b_{n,1}, ..., b_{n,d})^⊤ ∈ ℝ^d such that

lim_{n→∞} Pr{(M_n − b_n)/a_n ≤ x} = lim_{n→∞} F^n(a_n x + b_n) = G(x),   (2)

for all continuity points x = (x_1, ..., x_d)^⊤ ∈ ℝ^d of G, where a_n x denotes componentwise multiplication, and if G is a distribution function with nondegenerate margins, then G is called a multivariate extreme-value distribution (e.g. Beirlant et al., 2004, Ch. 6). Specifically, G takes the form G(x) = C{G_1(x_1), ..., G_d(x_d)}, x ∈ ℝ^d, where its univariate margins G_j, j ∈ I, are members of the GEV family (e.g. Beirlant et al., 2004, p. 47) and C is an extreme-value copula with expression C(u) = exp{−L(−ln u_1, ..., −ln u_d)}, u ∈ (0, 1]^d, where u = (u_1, ..., u_d)^⊤ and L : [0, ∞)^d → [0, ∞) is the stable-tail dependence function (e.g. Beirlant et al., 2004, Section 8.2.2). Specifically,

L(z) = d ∫_{S_d} max(z_1 w_1, ..., z_d w_d) H(dw),   z ∈ [0, ∞)^d,   (3)

where w = (w_1, ..., w_d)^⊤, z = (z_1, ..., z_d)^⊤ and the angular measure H is a probability measure defined on the d-dimensional unit simplex S_d := {(w_1, ..., w_d) ∈ [0, 1]^d : w_1 + ··· + w_d = 1}, satisfying the mean constraint ∫_{S_d} w_j H(dw) = 1/d for all j ∈ I. By the homogeneity property of L it follows that L(z) = (z_1 + ··· + z_d) A(t), z ∈ [0, ∞)^d, where t = (t_1, ..., t_d)^⊤ with t_j = z_j/(z_1 + ··· + z_d) for j = 1, ..., d − 1 and t_d = 1 − t_1 − ··· − t_{d−1}, and where A is the Pickands dependence function (e.g. Beirlant et al., 2004, Section 8.2.5), the restriction of L to S_d. It quantifies the level of dependence between the extremes and satisfies 1/d ≤ max(t_1, ..., t_d) ≤ A(t) ≤ 1, t ∈ S_d, with the lower and upper bounds representing complete dependence and independence, respectively.

An important and useful summary of extremal dependence is the coefficient of upper-tail dependence, denoted by χ (Li, 2009, Joe, 1997, Ch. 2). In the bivariate case, it is constructed as the limiting probability that X_i and X_j, i ≠ j ∈ I, are jointly extreme. Explicitly, χ := lim_{u→0^+} χ(u), where

χ(u) = Pr(F_i(X_i) ≥ 1 − u, F_j(X_j) ≥ 1 − u)/u,   u ∈ (0, 1),   (4)

and 0 ≤ χ ≤ 1. The variables (X_i, X_j) are said to be asymptotically independent in the upper tail when χ = 0 and asymptotically dependent when χ > 0. The case χ = 1 represents complete dependence between X_i and X_j. On the basis of the speed of convergence of χ(u) to zero as u → 0^+, Ledford and Tawn (1996) proposed an approach to describe the sub-asymptotic, upper-tail dependence in the case of asymptotic independence. Specifically, they assumed that the upper-tail dependence function χ(u) in (4) behaves as χ(u) = u^{1/η − 1} L(1/u) as u → 0^+, where η ∈ (0, 1] is the coefficient of tail dependence and L(1/u) is a slowly varying function, such that L(a/u)/L(1/u) → 1 as u → 0^+, for fixed a > 0. Treating L as a constant, at extreme levels the margins are negatively associated when η < 1/2, independent when η = 1/2, and positively associated when 1/2 < η < 1. When η = 1 and L(1/u) does not converge to zero, the variables are asymptotically dependent.
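The quantities above can also be estimated empirically. The following sketch (illustrative only, not from the paper) computes an empirical version of χ(u) in (4) for a bivariate sample by replacing F_i and F_j with their empirical distribution functions; the function name chi_hat and the Gaussian example are ours.

```python
import numpy as np

def chi_hat(x, y, u):
    """Empirical chi(u) in (4): joint exceedance of the (1-u) marginal levels,
    measured on the scale of empirical ranks, divided by u."""
    n = len(x)
    fx = np.argsort(np.argsort(x)) / (n + 1.0)  # empirical F_i(X_i)
    fy = np.argsort(np.argsort(y)) / (n + 1.0)  # empirical F_j(X_j)
    return np.mean((fx >= 1 - u) & (fy >= 1 - u)) / u

# illustrative example: Gaussian sample with correlation 0.5 (asymptotically independent)
rng = np.random.default_rng(0)
z = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=100_000)
for u in (0.05, 0.01, 0.005):
    print(u, chi_hat(z[:, 0], z[:, 1], u))
```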
3 Extremes of extended skew-normal random samples

It is well known that the components of both normal and skew-normal random vectors are asymptotically independent. That is, the limit distribution of the normalised vector of componentwise maxima given by (2) is equal to the product of its marginal distributions (e.g. Lysenko et al., 2009, Beirlant et al., 2004, pp. 285–287). However, Beranger et al. (2017) showed that for the skew-normal case, the rate of convergence to zero of the upper-tail dependence function χ(u) in (4) depends on the slant parameters α, and, depending on the sign of the elements of α, this can occur at a faster or slower rate than in the normal case. Accordingly, from both theoretical and applied perspectives, it is important to understand whether these results also hold for the tail behaviour of the extended skew-normal distribution, in which the extension parameter τ also plays a part in the speed of convergence. We first consider the question of asymptotic dependence or asymptotic independence.

Proposition 3.1 (Asymptotic Independence). Let X ~ ESN_d(Ω̄, α, τ). Let χ(u) with u ∈ (0, 1) be the joint probability in (4). Then, for every bivariate pair (X_i, X_j) with 1 ≤ i < j ≤ d we have that χ = 0.

It follows from Proposition 3.1 that, regardless of the degree of sub-asymptotic dependence, the components of the multivariate extended skew-normal distribution are asymptotically independent, and so the asymptotic distribution is a product of univariate standard Gumbel distributions. We now examine the rate of convergence of χ(u) to zero, in which the extension parameter τ also plays a role.

Proposition 3.2 (Bivariate Tail Convergence). Let (X_1, X_2) ~ ESN_2(Ω̄, α, τ), where the off-diagonal term of Ω̄ is ω ∈ [0, 1), α ∈ ℝ² and τ ∈ ℝ. Set K = Φ(τ/√(α_1² + α_2² + 2ωα_1α_2)), ᾱ_j = (1 + α*_j²)^{1/2} and α*_j = (α_j + ωα_{−j})/{α_{−j}²(1 − ω²)}^{1/2} for j = 1, 2. Then χ(u) ≈ u^{1/η − 1} L(1/u) as u → 0^+, where

(i) when either α_1, α_2 ≥ 0, or ω > 0 and α_j ≤ 0 and ωα_{−j} + α_j ≥ 0 for j = 1, 2, then

η = (1 + ω)/2,   L(1/u) = (1 + ω)^{3/2} (1 − ω)^{−1/2} K^{1/η − 1} (4π ln(1/u))^{−ω/(1+ω)};

(ii) when ω > 0, α_j ≤ 0 and −α_jω ≤ α_{−j} < −α_j/ω for j = 1, 2, then

(ii a) if α_{−j} > −α_j/ᾱ_j, then

η = (1 − ω²) ᾱ_j²/{1 − ω² + (ᾱ_j − ω)²},
L(1/u) = ᾱ_j³ (1 − ω²)^{1/2} K^{1/η − 2/{ᾱ_j²(1+ω)}} (4π ln(1/u))^{1/(2η) − 1} / {(ᾱ_j − ω)(1 − ωᾱ_j)};

(ii b) if α_{−j} < −α_j/ᾱ_j, then

η = [ {1 − ω² + (ᾱ_j − ω)²}/{(1 − ω²)ᾱ_j²} + (α_{−j} + α_j/ᾱ_j)² ]^{−1},
L(1/u) = e^{−τ²/2} ᾱ_j³ (1 − ω²)^{1/2} K^{1/η − 1} (4π ln(1/u))^{1/(2η) − 3/2} / [ {−(α_{−j} + α_j/ᾱ_j)} (ᾱ_j − ω) {1 − ωᾱ_j + α_{−j}α_jᾱ_j(1 − ω²)} ];

(iii) when either α_1, α_2 < 0, or ω > 0, α_j < 0 and 0 < α_{−j} < −ωα_j for j = 1, 2, then

η = (1 − ω²) [ {α_{−j}²(1 − ω²) + 1}/ᾱ_{−j}² + {α_j²(1 − ω²) + 1}/ᾱ_j² + 2{α_1α_2(1 − ω²) − ω}/(ᾱ_1ᾱ_2) ]^{−1},
L(1/u) = ᾱ_j³ ᾱ_{−j} (1 − ω²)^{1/2} e^{−τ²/2} K^{1/η − 1} (4π ln(1/u))^{1/(2η) − 3/2} / [ {(1 − ωᾱ_{−j}/ᾱ_j)/(1 − ω²) + α_j(α_j + α_{−j}ᾱ_{−j}/ᾱ_j)} (α_{−j}ᾱ_j + α_jᾱ_{−j}) ].
Figure 1: The behaviour of χ(1 − v) versus v for different values of the parameters ω, α_1, α_2 and τ, for the bivariate extended skew-normal distribution. From left to right, the panels illustrate the effect of negative, zero and positive values of τ, respectively.

From Proposition 3.2 we see that the contribution of the extension parameter τ to the rate of tail convergence is contained in the K^ψ term, where the power ψ is independent of τ and changes depending on the value of α. For the bivariate extended skew-normal distribution, Figure 1 illustrates the behaviour of χ(1 − v) against v, where χ(u) is the upper-tail dependence function (4), for different values of the model parameters ω, α_1, α_2 and τ. In each panel, for fixed ω and τ, the speed of convergence of χ(1 − v) to 0 as v → 1^− is fastest when both slant parameters (α_1, α_2) are negative. It is slower in any other case, with the slowest convergence rate depending on both the sign and magnitude of the slant parameters. The effect of τ on the rate of convergence is more straightforward: fixing all other parameters, for lower values of τ (left panel) the rate of convergence is faster than for higher values (right panel).

Proposition 3.1 states that the marginal (componentwise) maxima M_{n,1}, ..., M_{n,d} are asymptotically independent, thereby determining an extremal framework that only permits independence among observed sample maxima. However, for data following Gaussian-type distributions, Hüsler and Reiss (1989) developed an approach by which, under suitable conditions, an alternative non-independence asymptotic distribution for componentwise maxima may be formulated. This allows an extremal dependence structure possessing a rich class of asymptotic behaviour, ranging from independence to complete dependence, to be derived. We now develop this alternative asymptotic distribution for the extended skew-normal class.

Precisely, for n = 1, 2, ... let X_{n,i}, i = 1, ..., n, be a triangular array of random vectors, where X_{n,i} = (X_{n,i;1}, ..., X_{n,i;d})^⊤. Following Hüsler and Reiss (1989), for each n, assume that X_{n,1}, ..., X_{n,n} are independent random vectors, where X_{n,i} ~ ESN_d(Ω̄_n, α_n, τ). Here, the dependence structure and asymmetry of the extended skew-normal distribution, as measured through Ω̄_n and α_n, change as the sample size n increases. In particular, it is assumed that the strength of dependence and asymmetry increase with n at an appropriate rate. We formalise this as follows.
Condition 1. For all j ∈ I, the elements of α_n = (α_{n;1}, ..., α_{n;d})^⊤ satisfy α_{n;j} → ±∞ as n → ∞ and α°_j = lim_{n→∞} α_{n;j} (ln n)^{−1/2} ∈ ℝ, with α°_1 + ··· + α°_d = 0. For every i, j ∈ I, the correlations ω_{n;i,j} of the d-dimensional matrix Ω̄_n satisfy λ_{i,j} = lim_{n→∞} (1 − ω_{n;i,j}) ln n ∈ (0, ∞].

Under the assumptions in Condition 1, we are now able to establish Hüsler and Reiss (1989)'s alternative extremal limit in the case of the extended skew-normal distribution.
Theorem 3.1.
Consider a triangular array of extended skew-normal random vectors X_{n,1}, ..., X_{n,n}, n = 1, 2, .... Let M_{n,n} = (M_{n,n;1}, ..., M_{n,n;d})^⊤ where M_{n,n;j} = max(X_{n,1;j}, X_{n,2;j}, ..., X_{n,n;j}), j ∈ I. Under the assumptions in Condition 1 there are sequences of normalising constants a_n > 0 and b_n ∈ ℝ^d such that Φ_d^n(a_n x + b_n; Ω̄_n, α_n, τ) → G(x) as n → ∞, where the univariate margins of G are standard Gumbel distributions, i.e. G_j(x) = e^{−e^{−x}} with x ∈ ℝ, and

L(z) = Σ_{j=1}^d z_j Φ_{d−1}( ( √λ_{i,j} + (1/(2√λ_{i,j})) ln(z̃_j/z̃_i), i ∈ I_j )^⊤ ; Λ̄_j, α̃_j, τ̃_j ),   z ∈ [0, ∞)^d,   (5)

where Λ̄_j is a (d − 1) × (d − 1) correlation matrix with off-diagonal entries (λ_{i,j} + λ_{k,j} − λ_{i,k})/(2√(λ_{i,j}λ_{k,j})), j ∈ I, i, k ∈ I_j = I \ {j}, α̃_j = (√(2λ_{i,j}) α°_i, i ∈ I_j)^⊤, τ̃_j = τ − Σ_{i∈I_j} √(2λ_{i,j}) α°_i and

z̃_i = z_i Φ( (τ − Σ_{k∈I_j} √(2λ_{k,j}) α°_k) / √(Σ_{k,m∈I_j} α°_k α°_m (λ_{k,j} + λ_{m,j} − λ_{k,m})) ) / Φ( τ / √(Σ_{k,m∈I_j} α°_k α°_m (λ_{k,j} + λ_{m,j} − λ_{k,m})) ),

and z̃_j is defined as z̃_i but with the index i replaced by j and vice versa.

For the resulting multivariate extreme-value distribution in Theorem 3.1 we may derive representations of the extremal dependence. In particular, from (5) we may construct the Pickands dependence function as

A(t) = Σ_{j=1}^d t_j Φ_{d−1}( ( √λ_{i,j} + (1/(2√λ_{i,j})) ln(t̃_j/t̃_i), i ∈ I_j )^⊤ ; Λ̄_j, α̃_j, τ̃_j ),   (6)

for t = (t_1, ..., t_d)^⊤ ∈ S_d, where t̃_j and t̃_i are defined as z̃_j and z̃_i. By exploiting the method described in Beirlant et al. (2004, pp. 263–264, 292–293), the angular measure H (defined through (3)) relative to (5) may be derived. Specifically, H places mass only in the interior of the simplex, and so the angular density on S_d may be expressed as

h(w) = φ_{d−1}( ( √λ_{i,1} + (1/(2√λ_{i,1})) ln(w̃_i/w̃_1), i ∈ I_1 )^⊤ ; Λ̄_1, α̃_1, τ̃_1 ) / ( w_1² ∏_{i=2}^d 2 w_i √λ_{i,1} ),   w ∈ S_d,   (7)

where w̃_1 and w̃_i are defined as z̃_j and z̃_i. Finally, for a bivariate random vector (Z_1, Z_2) with distribution given in Theorem 3.1, the coefficient of upper-tail dependence in (4) is

χ = 1 − Φ_{−√(2λ_{1,2}) α°_2, τ + √(2λ_{1,2}) α°_2}( √λ_{1,2} + (1/(2√λ_{1,2})) ln[ Φ( (τ − √(2λ_{1,2}) α°_2)/(√(2λ_{1,2}) |α°_2|) ) / Φ( (τ + √(2λ_{1,2}) α°_2)/(√(2λ_{1,2}) |α°_2|) ) ] )
  = 1 − Φ_{√(2λ_{1,2}) α°_2, τ − √(2λ_{1,2}) α°_2}( √λ_{1,2} + (1/(2√λ_{1,2})) ln[ Φ( (τ + √(2λ_{1,2}) α°_2)/(√(2λ_{1,2}) |α°_2|) ) / Φ( (τ − √(2λ_{1,2}) α°_2)/(√(2λ_{1,2}) |α°_2|) ) ] ).   (8)
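The building block of (5)–(8) is the univariate extended skew-normal cdf Φ_{α,τ}. A minimal way to evaluate it numerically, shown below for illustration only, is through the Gaussian conditioning representation recalled in the Discussion (Azzalini and Capitanio, 2014, Ch. 5); the forms of δ and τ₀ used here are an assumption based on that representation, not a statement taken from the paper.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def pesn1(x, alpha, tau):
    """cdf of the univariate standardised extended skew-normal ESN_1(alpha, tau),
    via the conditioning representation X =d (Z1 | Z0 > -tau0) with corr(Z0, Z1) = delta.
    The representation is an assumption here (cf. Azzalini and Capitanio, 2014, Ch. 5)."""
    delta = alpha / np.sqrt(1.0 + alpha**2)
    tau0 = tau / np.sqrt(1.0 + alpha**2)
    joint = multivariate_normal(mean=[0.0, 0.0],
                                cov=[[1.0, delta], [delta, 1.0]])
    # P(Z1 <= x) minus P(Z1 <= x, Z0 <= -tau0), renormalised by P(Z0 > -tau0)
    return (norm.cdf(x) - joint.cdf([x, -tau0])) / norm.cdf(tau0)

print(pesn1(1.5, alpha=2.0, tau=0.5))   # illustrative value
print(pesn1(1.5, alpha=0.0, tau=0.0))   # reduces to the standard normal cdf
```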
Figure 2: Extremal dependence for the extended skew-normal distribution: (top panels) Pickands dependence function A(t), (second row) the coefficient of upper-tail dependence χ, and (bottom two rows) the angular density h(w), for different values of α°, τ and λ (see main text for details).

Figure 2 graphically illustrates a range of extremal dependence structures in terms of the Pickands dependence function A(t) in (6), the angular density h(w) in (7) and the coefficient of upper-tail dependence χ in (8). Each bivariate Pickands dependence function (top row) is constructed with λ taking eight equally spaced values, for left-skewed (α° = −20, with negative τ), symmetric (α° = τ = 0) and right-skewed (α° = 20, τ = 6) dependence functions. The bivariate coefficient of upper-tail dependence (second row) is illustrated over ranges of values of α° and τ, for fixed values of λ. It is apparent that for fixed values of α° and τ, χ decreases for increasing values of the dependence parameter λ. Similarly, for fixed values of λ and α°, χ increases for decreasing values of τ. Finally, for fixed values of λ and τ, χ increases for increasing values of |α°|.

The bottom two rows illustrate trivariate angular densities for fixed λ = (λ_{1,2}, λ_{1,3}, λ_{2,3})^⊤ and varying α° = (α°_1, α°_2, α°_3)^⊤ and τ. From left to right and top to bottom, the plots are produced with slant parameters α° combining positive, negative and zero components, and with extension parameters 0, 0, 0, 3, 5 and −5, respectively. The mass in the left panel of the third row concentrates around the centre of the simplex, meaning that there is strong dependence among all variables. In the middle (and right) panels of the third row, the mass is concentrated in the bottom left (right) corner and on the left (right) edge. This means that two variables are themselves mildly dependent, and weakly dependent on the third. In the bottom row, the mass in the left panel is concentrated on one corner and two edges, meaning that one variable is mildly dependent on the other two, and these are weakly dependent on each other. In the centre panel of the bottom row, the mass concentrates in the centre and on two edges, meaning that one variable is strongly dependent on the others, and these are themselves weakly dependent. Finally, in the right panel the mass concentrates on one edge, so that two variables are strongly dependent but each is weakly dependent on the third.
4 Discussion

The success of the multivariate skew-normal family is also due to its stochastic representations, which motivate its use as a stochastic model for data. For instance, sampling from the multivariate extended skew-normal distribution can be achieved through the distribution of the first d components of a (d + 1)-dimensional Gaussian random vector, conditional on the (d + 1)-th component satisfying a certain condition (Arellano-Valle and Genton, 2010, Azzalini and Capitanio, 2014, Ch. 5.1.3, 5.3.3). We have studied the extremal behaviour of extended skew-normal random vectors. Although their multivariate sample maximum has asymptotically independent components, we have shown that the slant and extension parameters affect the speed of convergence of the joint upper tail for each of its bivariate components. Furthermore, we have derived the asymptotic distribution for the sample maximum of a triangular array of independent extended skew-normal random vectors, under appropriate conditions on the correlations and the slant parameters. This produces a skewed version of the well-known Hüsler-Reiss model (Hüsler and Reiss, 1989), where the skewness of the limiting distribution is affected by the extension parameter. Hashorva and Ling (2016) have investigated the asymptotic distribution for the sample maximum of a triangular array of independent bivariate skew-elliptical random vectors. They also found that a modified version of the Hüsler-Reiss model emerges as a possible asymptotic distribution, under an appropriate condition on the random radius of the elliptical random vectors. Their result differs from ours. In future work it would be interesting to investigate the extremal properties of multivariate skew-elliptical distributions (Azzalini and Capitanio, 2014, Ch. 6) and to study the relation with the triangular-array approach investigated in Hashorva and Ling (2016).
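As an illustration of the stochastic representation just mentioned, the sketch below draws from the standardised ESN_d(Ω̄, α, τ) by simple rejection, keeping the first d components of a (d + 1)-dimensional Gaussian vector whenever a latent component exceeds a threshold. The specific forms of δ and τ₀ are assumed from the conditioning representation cited above; the code is illustrative only, favouring clarity over efficiency, and is not part of the paper.

```python
import numpy as np

def resn(n, corr, alpha, tau, rng=None):
    """Draw n samples from the standardised ESN_d(corr, alpha, tau) by rejection:
    keep the first d components of a (d+1)-dimensional Gaussian vector whenever
    its last component exceeds -tau0 (conditioning representation, assumed here
    as in Azzalini and Capitanio, 2014, Ch. 5)."""
    rng = np.random.default_rng(rng)
    alpha = np.asarray(alpha, dtype=float)
    d = len(alpha)
    q = alpha @ corr @ alpha
    delta = corr @ alpha / np.sqrt(1.0 + q)   # covariance between Z and the latent Z0
    tau0 = tau / np.sqrt(1.0 + q)             # truncation threshold for Z0
    cov = np.block([[corr, delta[:, None]], [delta[None, :], np.ones((1, 1))]])
    out = []
    while len(out) < n:
        z = rng.multivariate_normal(np.zeros(d + 1), cov, size=n)
        out.extend(z[z[:, d] > -tau0, :d])    # accept draws whose latent part clears -tau0
    return np.asarray(out[:n])

# illustrative example: 5000 bivariate draws
corr = np.array([[1.0, 0.5], [0.5, 1.0]])
x = resn(5000, corr, alpha=[2.0, -1.0], tau=0.5, rng=1)
print(x.mean(axis=0))
```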
Acknowledgments

We thank two anonymous referees for having carefully read the first version of this manuscript and for their helpful comments that have contributed to improving the presentation of this article. BB and SAS are supported by the Australian Centre of Excellence in Mathematical and Statistical Frontiers (ACEMS; CE140100049) and by the Australian Research Council Discovery Projects Scheme (FT170100079). SAP is supported by the Bocconi Institute for Data Science and Analytics (BIDSA).
A Proofs
Auxiliary proofs and details of the results in Lemmas 1 –3 are provided in the Supplementary Material.
A.1 Proof of Proposition 3.1
For 1 ≤ i < j ≤ d we define α ∗ i = ( α i + ω i,j α j ) / { α j (1 − ω i,j ) } / , α ∗ j = ( α j + ω i,j α i ) / { α i (1 − ω i,j ) } / . (9)We analyse the following four possible scenarios: (a) 0 < α ∗ i < α ∗ j , (b) α i , α j < α ∗ i < α ∗ j with α ∗ i <
0, (c) α ∗ i , α ∗ j < α i < α j ≥ α ∗ i < α ∗ j ≥
0. Interchanging α ∗ i with α ∗ j produces the same results. For brevity we set ¯ α i = (1 + α ∗ i ) / and ¯ α j = (1 + α ∗ j ) / . We firstneed the following result. Lemma 1.
Let X ∼ ESN d ( ¯ Ω , α , τ ) . for every pair ( X i , X j ) with ≤ i < j ≤ d we have that under thescenarios (a) and (b) lim x →∞ Pr( X j ≥ x, X i ≥ x )Pr( X i ≥ x ) = 0 . While under the scenario (c) and (d) we respectively have lim x →∞ Pr ( X j ≥ x, X i ≥ x ¯ α j / ¯ α i )Pr ( X i ≥ x ¯ α j / ¯ α i ) = 0 and lim x →∞ Pr ( X j ≥ x, X i ≥ x/ ¯ α i )Pr ( X i ≥ x/ ¯ α i ) = 0 . Consider the case (a) 0 < α ∗ i < α ∗ j . Using definition (9) this assumption implies the inequality { α i (1 − ω i,j ) } ( α j + ω i,j α j ) < ( α j + ω i,j α i ) { α j (1 − ω i,j ) } and from this with elementary computations we obtain α i < α j and τ ∗ j = τ / { α i (1 − ω i,j ) } / > τ / { α j (1 − ω i,j ) } / = τ ∗ i . x → ∞ , φ ( x )Φ( α ∗ i x + τ ∗ i )Φ ( τ ∗ i / ¯ α i ) < φ ( x )Φ( α ∗ j x + τ ∗ j )Φ (cid:0) τ ∗ j / ¯ α j (cid:1) , which implies that 1 − Φ α ∗ i ,τ ∗ i ( x ) < − Φ α ∗ j ,τ ∗ j ( x ) and Φ α ∗ i ,τ ∗ i ( x ) > Φ α ∗ j ,τ ∗ j ( x ) as x → ∞ . Then χ = lim u → + Pr(Φ α ∗ j ,τ ∗ j ( X j ) ≥ − u | Φ α ∗ i ,τ ∗ i ( X i ) ≥ − u )= lim x →∞ Pr(Φ α ∗ j ,τ ∗ j ( X j ) ≥ Φ α ∗ i ,τ ∗ i ( x ) | X i ≥ x ) ≤ lim x →∞ Pr(Φ α ∗ j ,τ ∗ j ( X j ) ≥ Φ α ∗ j ,τ ∗ j ( x ) | X i ≥ x ) = lim x →∞ Pr( X j ≥ x, X i ≥ x )Pr( X i ≥ x ) . By Lemma 1 the last limit is equal to zero and therefore χ = 0.Consider case (b) α i , α j < α ∗ i < α ∗ j with α ∗ i <
0. Using similar arguments we obtain χ < lim x →∞ Pr( X j ≥ x, X i ≥ x )Pr( X i ≥ x ) . Then, by applying Lemma 1 we obtain χ = 0.Consider case (c) α ∗ i , α ∗ j < α i < α j ≥
0, which implies that α ∗ i < α ∗ j . ApplyingProposition 2.2 from Beranger et al. (2018) to 1 − Φ α ∗ i ,τ ∗ i ( x ) and Mill’s ratio (Mills, 1926) to Φ( α ∗ i x + τ ∗ i )we obtain 1 − Φ α ∗ i ,τ ∗ i ( x ) ≈ φ ( x )Φ( α ∗ i x + τ ∗ i )Φ ( τ ∗ i / ¯ α i ) { ¯ α i x + α ∗ i τ ∗ i } as x → ∞≈ φ ( x ) φ ( α ∗ i x + τ ∗ i )Φ ( τ ∗ i / ¯ α i ) { ¯ α i x + α ∗ i τ ∗ i }{− ( α ∗ i x + τ ∗ i ) } as x → ∞≈ φ ( x ¯ α i )Φ ( τ ∗ i / ¯ α i ) ¯ α i ( − α ∗ i ) √ πx as x → ∞ . Now note that φ ( x (¯ α j / ¯ α i )¯ α i ) = φ ( x ¯ α j ) φ ( x (¯ α j / ¯ α i )¯ α i )Φ (cid:0) τ ∗ j / ¯ α j (cid:1) (¯ α j / ¯ α i )¯ α i ( − α ∗ j ) √ πx = φ ( x ¯ α j )Φ (cid:0) τ ∗ j / ¯ α j (cid:1) ¯ α j ( − α ∗ j ) √ πx . Since α ∗ i < α ∗ j then − /α ∗ i < − /α ∗ j and it follows that φ ( x (¯ α j / ¯ α i )¯ α i )Φ (cid:0) τ ∗ j / ¯ α j (cid:1) (¯ α j / ¯ α i )¯ α i ( − α ∗ i ) √ πx < φ ( x ¯ α j )Φ (cid:0) τ ∗ j / ¯ α j (cid:1) ¯ α j ( − α ∗ j ) √ πx . Therefore, 1 − Φ α ∗ i ,τ ∗ i ( x ¯ α j / ¯ α i ) < − Φ α ∗ j ,τ ∗ j ( x ) and Φ α ∗ j ,τ ∗ j ( x ) < Φ α ∗ i ,τ ∗ i ( x ¯ α j / ¯ α i ) and Φ α ∗ i ,τ ∗ i ( x ¯ α j / ¯ α i ) < Φ α ∗ i ,τ ∗ i ( x ). From this, with some manipulation we may obtain χ ≤ lim x →∞ Pr ( X j ≥ x, X i ≥ x ¯ α j / ¯ α i )Pr ( X i ≥ x ¯ α j / ¯ α i ) . Now, applying Lemma 1 we obtain χ = 0.Finally, consider case (d) α ∗ i < α ∗ j ≥
0. Note that as x → ∞ we haveΦ (cid:0) α ∗ j ¯ α i x + τ ∗ (cid:1) > √ π ( − α ∗ i )¯ α j x − Φ xα ∗ j ,τ ∗ j (¯ α i ) > − Φ α ∗ i ,τ ∗ i ( x ) and Φ α ∗ j ,τ ∗ j ( x ¯ α i ) < Φ α ∗ i ,τ ∗ i ( x ) as x → ∞ . These resultsimply that χ ≤ lim x →∞ Pr ( X j ≥ x, X i ≥ x/ ¯ α i )Pr ( X i ≥ x/ ¯ α i ) . Then, by applying Lemma 1 we obtain χ = 0. Since χ = 0 for all 1 ≤ i < j ≤ d then by Resnick (1987,Proposition 5.27) we have that X ∼ ESN d ( ¯ Ω , α , τ ) has asymptotically independent components. A.2 Proof of Proposition 3.2
From Arellano-Valle and Genton (2010), recall that if X ∼ ESN ( ¯ Ω , α , τ ) then for j = 1 , X j ∼ ESN ( α ∗ j , τ ∗ j ) , α ∗ j = α j + ωα − j q α − j (1 − ω ) , τ ∗ j = τ q α − j (1 − ω ) ,X j | X − j ∼ ESN (cid:16) ωx − j , p − ω , α j · − j , τ j · − j (cid:17) , α j · − j = α j p − ω , τ j · − j = (1 − ω ) α − j x − j + τ. Define x j ( u ) = Φ ← (1 − u ; α ∗ j , τ ∗ j ), for any u ∈ [0 , ← ( · ; α ∗ j , τ ∗ j ) is the inverse of the marginaldistribution function Φ( · ; α ∗ j , τ ∗ j ), for j = 1 ,
2. The asymptotic behaviour of x j ( u ) as u → x j ( u ) = x ( u ), if α ∗ j ≥ x ( u )¯ α j − α ∗ j τ ∗ j ¯ α j − ln(2 √ π )+ln( | α ∗ j | )+1 / /u )+ α ∗ j / ‘ /u,α ∗ j , if α ∗ j < j = 1 ,
2, where ¯ α j = { α ∗ j } / , x ( u ) ≈ ‘ /u, − { ln(2 √ π ) + 1 / /u ) + ln Φ( τ ∗ j / ¯ α j ) } /‘ /u, and ‘ /u,a = p /u )(1 + a ) for any a ∈ . We denote the asymptotic joint survivor function of the bivariateextended skew-normal distribution by p ( u ) = P { X > x ( u ) , X > x ( u ) } for u → α , α > x ( u ) = x ( u ) = x ( u ). Set K = Φ( τ / p α + α + 2 ωα α ). Then,the joint upper tail p ( u ) behaves as u → p ( u ) = Z ∞ x ( u ) (cid:26) − Φ (cid:18) y ( u ) − ωv √ − ω ; α · , τ · (cid:19)(cid:27) φ ( v ; α ∗ , τ ∗ )d v ≈ √ − ω x ( u ) Z ∞ φ ( x ( u ) , x ( u ) + t/x ( u ); ¯ Ω , α , τ ) x ( u )(1 − ω ) − ωt/x ( u ) d t ≈ K − e − x u )1+ ω π (1 − ω ) x ( u ) Z ∞ e − t ω d t − e − x u )( α α − x ( u )( α + α ) τ √ π ( α + α ) x ( u ) Z ∞ e − t { ω + α ( α + α ) } d t = e − x ( u ) / (1+ ω ) (1 + ω )2 πK (1 − ω ) x ( u ) − e − x ( u )( α + α ) / − x ( u )( α + α ) τ √ π ( α + α ) { α ( α + α )(1 + ω ) } x ( u ) ! . (11)The first approximation is obtained using Proposition 2.2 from Beranger et al. (2018). The second ap-proximation uses Mills’ ratio approximation. Substituting x ( u ) into (11) we obtain the approximation p ( u ) ≈ u /η L (1 /u ) as u → + , where η = (1 + ω ) / L ( x ) = (1 + ω ) K − ω ω (1 − ω )(4 π ln 1 /u ) ω ω − (4 π ln 1 /u ) ( α α − u ( α + α ) K ( α + α ) e − τ (1 + ω ) − (1 − ω )( α + α ) { α ( α + α )(1 + ω ) } . (12)12s the second term in the parentheses in (12) is o ( u ( α + α ) ) for u → + , then the quantity inside theparentheses → u → + , and so L (1 /u ) is well approximated by the first term in (12). When α < α ≥ − α /ω , then α ∗ , α ∗ > α < − ω, α ≤ α < − ω − α , then α ∗ ≥ α ∗ < x ( u ) = x ( u )and x ( u ) is given as in the second line of (10). For the case (iia), i.e. when α > − ¯ α α , then following asimilar derivation to that of (11), we obtain that p ( u ) ≈ ¯ α (1 − ω )(1 − ω ¯ α ) − π K (¯ α − ω ) x ( u ) exp (cid:20) − x ( u )2 (cid:26) − ω + (¯ α − ω ) (1 − ω )¯ α (cid:27)(cid:21) , u → . Similarly, for the case (iib), i.e. when α < − ¯ α α , by applying Mills’ ratio we obtain p ( u ) ≈ − ¯ α { − ω ¯ α + α ( α + α ¯ α )(1 − ω ) } − π / K (¯ α − ω )(1 − ω ) − ( α + α / ¯ α ) x ( u ) e − x u )2 (cid:26) − ω α − ω )2(1 − ω α + (cid:16) α + α α (cid:17) − τ (cid:27) , u → . For case (iii), when α < < α < − ωα , then α ∗ , α ∗ < x ( u ) and x ( u ) are given as inthe second line of (10). Then, by Proposition 2.2 from Beranger et al. (2018) we obtain p ( u ) ≈ − ¯ α / ¯ α (1 − ω )(¯ α − ω ¯ α ) − ( α ¯ α + α ¯ α ) − (2 π ) / K { − ω ¯ α + α ( α + α ¯ α / ¯ α )(1 − ω ) } x ( u ) × exp (cid:20) − x ( u )2(1 − ω ) (cid:18) α (1 − ω ) + 1¯ α + α (1 − ω ) + 1¯ α + 2( α α (1 − ω ) − ω )¯ α ¯ α (cid:19) − τ (cid:21) u → . When α , α < ω − α ≤ α < α with α producesthe same results but where α j and ¯ α j are substituted in the above with α − j and ¯ α − j respectively, for j = 1 , . A.3 Proof of Theorem 3.1
Let X n,m ∼ ESN d ( ¯ Ω n , α n , τ ), n ∈ N and m = 1 , . . . , n , where ¯ Ω n and α n are defined in Condition 1 and τ ∈ . We want to derive norming constants a n > and b n ∈ d such that we can derive a non-trivial limitdistribution for Φ nd ( a n x + b n ; ¯ Ω n , α n , τ ). Recall that from Arellano-Valle and Genton (2010) we have thatfor all j ∈ I , X n,m ; j ∼ ESN ( α ∗ n ; j , τ ∗ n ; j ), where α ∗ n ; j and τ ∗ n ; j are appropriate slant and extension marginalparameters. Then, we may state the following result. Lemma 2.
For all j ∈ I define the normalising constants a n ; j = ‘ − n , b n ; j = ‘ n − ln(2 √ π ) + (1 /
2) ln ln n + ln Φ (cid:0) τ ∗ n ; j / ¯ α n ; j (cid:1) − ln Φ (cid:0) α ∗ n ; j ‘ n + τ ∗ n ; j (cid:1) ‘ n , if α ∗ n ; j ≥ ,b n ; j = ‘ n − ln √ π + ln Φ (cid:0) τ ∗ n ; j / ¯ α n ; j (cid:1) − ln Φ (cid:0) α ∗ n ; j ‘ n + τ ∗ n ; j (cid:1) ‘ n − ln Φ (cid:0) ¯ α n ; j ‘ n + α ∗ n ; j τ ∗ n ; j (cid:1) ‘ n , if α ∗ n ; j < , where ¯ α n ; j = { α ∗ n ; j } / , ‘ n = √ n . Then, for all j ∈ I , lim n →∞ Φ nα ∗ n ; j ,τ ∗ n ; j ( a n ; j x j + b n ; j ) = e − e − xj , x j ∈ . j ∈ I , e − e − xj is continuous then the weak convergence of ESN d ( ¯ Ω n , α n , τ ) is equivalentto weak convergence of the marginal distributions functions and the copula function (e.g. Beirlant et al.,2004, Section 8.3.2). It remains to derive the limiting form of the copula function of ESN d ( ¯ Ω n , α n , τ ). Wecomplete the proof deriving the stable-tail dependence function L , since an extreme-value copula is of theform C ( u ) = exp {− L ( − ln u , . . . , − ln u d ) } (see Section 2). Lemma 3.
The stable-tail dependence function associated with the limit distribution of Φ nd ( a n x + b n ; ¯ Ω n , α n , τ ) is L ( z ) = lim n →∞ n n − Pr (cid:16) Φ α ∗ n ; j ,τ ∗ n ; j ( X j ) ≤ − z j n , j = 1 , . . . , d (cid:17)o , z ∈ [0 , ∞ ) d = d X j =1 z j Φ d − ((cid:18) λ ij + 12 λ ij log ˜ z j ˜ z i , i ∈ I j (cid:19) > ; ¯ Λ j , ˜ α j , ˜ τ j ) , where for all j ∈ I , ¯ Λ j , ˜ α j and ˜ τ j are given in statement of the theorem. References
Arellano-Valle, R. B. and M. G. Genton (2010). Multivariate extended skew-t distributions and related families. Metron 68(3), 201–234.

Azzalini, A. and A. Capitanio (2014). The Skew-Normal and Related Families. Cambridge University Press, Cambridge.

Beirlant, J., Y. Goegebeur, J. Teugels, and J. Segers (2004). Statistics of Extremes: Theory and Applications. John Wiley & Sons, Ltd., Chichester.

Beranger, B., S. A. Padoan, and S. A. Sisson (2017). Models for extremal dependence derived from skew-symmetric families. Scandinavian Journal of Statistics 44(1), 21–45.

Beranger, B., S. A. Padoan, Y. Xu, and S. A. Sisson (2018). Extremal properties of the univariate extended skew-normal distribution. Submitted.

Chang, S.-M. and M. G. Genton (2007). Extreme value distributions for the skew-symmetric family of distributions. Communications in Statistics - Theory and Methods 36(9), 1705–1717.

Fung, T. and E. Seneta (2014). Convergence rate to a lower tail dependence coefficient of a skew-t distribution. Journal of Multivariate Analysis 128, 62–72.

Hashorva, E. and C. Ling (2016). Maxima of skew elliptical triangular arrays. Communications in Statistics - Theory and Methods 45, 3692–3705.

Hüsler, J. and R.-D. Reiss (1989). Maxima of normal random vectors: between independence and complete dependence. Statistics & Probability Letters 7(4), 283–286.

Joe, H. (1997). Multivariate Models and Dependence Concepts. Chapman & Hall.

Ledford, A. W. and J. A. Tawn (1996). Statistics for near independence in multivariate extreme values. Biometrika 83(1), 169–187.

Li, H. (2009). Orthant tail dependence of multivariate extreme value distributions. Journal of Multivariate Analysis 100, 243–256.

Liao, X., Z. Peng, S. Nadarajah, and X. Wang (2014). Rates of convergence of extremes from skew-normal samples. Statistics & Probability Letters 84, 40–47.

Lysenko, N., P. Roy, and R. Waeber (2009). Multivariate extremes of generalized skew-normal distributions. Statistics & Probability Letters 79(4), 525–533.

Mills, J. P. (1926). Table of the ratio: Area to bounding ordinate, for any portion of normal curve. Biometrika 18(3/4), 395–400.

Padoan, S. A. (2011). Multivariate extreme models based on underlying skew-t and skew-normal distributions. Journal of Multivariate Analysis 102(5), 977–991.

Peng, Z., C. Li, and S. Nadarajah (2016). Extremal properties of the skew-t distribution. Statistics & Probability Letters 112, 10–19.

Resnick, S. I. (1987). Extreme Values, Regular Variation, and Point Processes. Springer-Verlag.

Supplementary material for 'Extremal properties of the multivariate extended skew-normal distribution'
B. Beranger, S. A. Padoan, Y. Xu and S. A. Sisson

Abstract
This document contains the proof of Lemma 1 used in the proof of Proposition 3.1, and of Lemmas 2 and 3 used in the proof of Theorem 3.1 in the main paper.
Note that all references below of the form (·∗) refer to equation (·) in the main paper. When we refer to a proposition or theorem we implicitly refer to a result in the main paper unless otherwise specified. We first recall properties of the extended skew-normal distribution that will be useful in some of the following proofs (e.g. Arellano-Valle and Genton, 2010).
Property 1.
Let X ∼ ESN d ( µ , Ω , α , τ ) . Let I ⊂ { , . . . , d } and ¯ I = { , . . . , d }\ I identify the d I and d ¯ I -dimensional subvector partition of X such that X = (cid:16) X > I , X > ¯ I (cid:17) > , with corresponding partition of theparameters ( µ , Ω , α ) . Then(i) X I ∼ ESN d I ( µ I , Ω II , α ∗ I , τ ∗ I ) where α ∗ I = α I + ¯ Ω − II Ω I ¯ I α ¯ I q Q ˜ Ω ¯ I ¯ I · I ( α ¯ I ) and τ ∗ I = τ q Q ˜ Ω ¯ I ¯ I · I ( α ¯ I ) , given ˜ Ω ¯ I ¯ I · I =¯ Ω ¯ I ¯ I − ¯ Ω ¯ II ¯ Ω − II ¯ Ω I ¯ I and Q ˜ Ω ¯ I ¯ I · I ( α ¯ I ) = α > ¯ I ˜ Ω − I ¯ I · I α ¯ I .(ii) ( X ¯ I | X I = x I ) ∼ ESN d ¯ I ( µ ¯ I · I , Ω ¯ I · I , α ¯ I · I , τ ¯ I · I ) where µ ¯ I · I = µ ¯ I + ¯ Ω ¯ II ¯ Ω − II ( x I − µ I ) , Ω ¯ I · I = Ω ¯ I ¯ I − Ω ¯ II Ω − II Ω I ¯ I , α ¯ I · I = ω ¯ I · I ω − I α ¯ I , ω ¯ I · I = diag ( Ω ¯ I · I ) / , ω ¯ I = diag ( Ω ¯ I ¯ I ) / and τ ¯ I · I = (cid:16) α > ¯ I ¯ Ω ¯ II ¯ Ω − II + α > I (cid:17) x I + τ . ∗ School of Mathematics and Statistics, University of New South Wales, Sydney, Australia. † Communicating Author:
[email protected] ‡ Department of Decision Sciences, Bocconi University of Milan, Italy. a r X i v : . [ s t a t . M E ] S e p .1 Proof of Lemma 1 Recall that if X ∼ ESN d ( ¯ Ω , α , τ ), then from Property 1(i) and 1(ii) we have that for any pair ( X i , X j ) with1 ≤ i < j ≤ d , X i ∼ ESN ( α ∗ i , τ ∗ i ) , α ∗ i = α i + ω i,j α j √ α j (1 − ω i,j ) , τ √ α j (1 − ω i,j ) X j ∼ ESN ( α ∗ j , τ ∗ j ) , α ∗ j = α j + ω i,j α i √ α i (1 − ω i,j ) , τ √ α i (1 − ω i,j ) . We consider the scenarios: (a) 0 < α ∗ i < α ∗ j , (b) α i , α j < α ∗ i < α ∗ j with α ∗ i <
0, (c) α ∗ i , α ∗ j < α i < α j ≥ α ∗ i < α ∗ j ≥
0. For cases (a) and (b) we need to show thatlim x →∞ Pr( X j ≥ x, X i ≥ x )Pr( X i ≥ x ) = 0 . (1)With case (a), we know that Pr( X j ≥ x, X i ≥ x ) ≤ Pr(min( Z i , Z j ) ≥ x ) /K , where( Z i , Z j ) ∼ N ( ¯ Ω ) , K = Φ( τ / q α i + α j + 2 ω i,j α i α j ) . (2)Furthermore, min( Z i , Z j ) ∼ ESN ( α ∗ , , α ∗ = − q (1 − ω i,j ) / (1 + ω i,j ) , see e.g. Azzalini and Capitanio (2014, p. 29). Therefore by applying these results to (1) and applyingProposition 2.2 from Beranger et al. (2018) to 1 − Φ α ∗ i ,τ ∗ i ( x ) and 1 − Φ α ∗ , ( x ) we obtainPr( X j ≥ x, X i ≥ x )Pr( X i ≥ x ) ≤ lim x →∞ − Φ α ∗ , ( x )1 − Φ α ∗ i ,τ ∗ i ( x ) K − ≤ lim x →∞ (1 − ω i,j ) / φ (cid:16) x p / (1 + ω i,j ) (cid:17)p π (1 − ω i,j ) x Φ ( τ ∗ i / ¯ α i ) φ ( x )Φ( α ∗ i x + τ ∗ i ) K − = 0and therefore the limit in (1) is satisfied.With case (b), using similar arguments than above we havePr( X j ≥ x, X i ≥ x ) ≤ Pr( Z j ≥ x, Z i ≥ x )Φ( x ( α i + α j ) + τ ) K − = (1 − Φ α ∗ , ( x ))Φ( x ( α i + α j ) + τ ) K − . Since x ( α i + α j ) + τ < x → ∞ , then by Proposition 2.2 from Beranger et al. (2018) we haveΦ( x ( α i + α j ) + τ ) ≈ − φ ( x ( α i + α j ) + τ ) / ( x ( α i + α j ) + τ ) , x → ∞ . Furthermore, since α ∗ i < α ∗ i x + τ ∗ i < x → ∞ and by Proposition 2.2 from Beranger et al. (2018)we have Φ( α ∗ i x + τ ∗ i ) = 1 − Φ( − α ∗ i x − τ ∗ i ) ≈ − φ ( α ∗ i x + τ ∗ i ) / ( α ∗ i x + τ ∗ i ) , x → ∞ . X j ≥ x, X i ≥ x )Pr( X i ≥ x ) ≤ lim x →∞ (1 − ω i,j ) / φ (cid:16) x p / (1 + ω i,j ) (cid:17)p π (1 − ω i,j ) x φ ( x ( α i + α j ) + τ ) φ ( x ) φ ( α ∗ i x + τ ∗ i ) α ∗ i x + τ ∗ i x ( α i + α j ) + τ ≈ lim x →∞ − (1 − ω i,j ) / α ∗ i p π (1 − ω i,j ) x φ (cid:16) x p / (1 + ω i,j ) (cid:17) φ ( x ( α i + α j ) + τ ) φ ( x ) φ ( α ∗ i x + τ ∗ i )= 0 , where in the last step used the fact that φ (cid:16) x p / (1 + ω i,j ) (cid:17) φ ( x ( α i + α j ) + τ ) φ ( x ) φ ( α ∗ i x + τ ∗ i ) = exp (cid:18) − (cid:18) x ω i,j + ( x ( α i + α j ) + τ ) − x − ( α ∗ i x + τ ∗ i ) (cid:19)(cid:19) ≈ exp (cid:20) − (cid:26) x (cid:18)
21 + ω i,j + ( α i + α j ) − − α ∗ i (cid:19)(cid:27)(cid:21) , x → ∞ , and where 2 / (1 + ω i,j ) + ( α + α ) − − α ∗ i >
0. Therefore, also in this case the limit in (1) is satisfied.With case (c) we need to show thatlim x →∞ Pr ( X j ≥ x, X i ≥ x ¯ α j / ¯ α i )Pr ( X i ≥ x ¯ α j / ¯ α i ) = 0 . (3)Applying Proposition 2.2 from Beranger et al. (2018) to Φ( α ∗ i ¯ α j / ¯ α i x + τ ∗ i ) we obtain for the denominatorthat as x → ∞ D ( x ) = ∂∂x Pr ( X i ≥ x ¯ α j / ¯ α i ) = − ¯ α j / ¯ α i φ α ∗ i ,τ ∗ i ( x ¯ α j / ¯ α i ) ≈ ¯ α j / ¯ α i φ ( x ¯ α j / ¯ α i ) φ ( α ∗ i ¯ α j / ¯ α i x + τ ∗ i )Φ( τ ∗ i / ¯ α i ) ( α ∗ i ¯ α j / ¯ α i x + τ ∗ i ) ≈ ¯ α j / ¯ α i ¯ α i exp (cid:8) − x ¯ α j / (cid:9) πxα ∗ i ¯ α j Φ( τ ∗ i / ¯ α i ) . For the numerator we have ∂∂x
Pr ( X j ≥ x, X i ≥ x ¯ α j / ¯ α i ) = − ¯ α j / ¯ α i Z ∞ x φ (cid:16) ( x ¯ α j / ¯ α i , v ) > ; ¯ Ω , α , τ (cid:17) d v − Z ∞ a φ (cid:0) ( x, v ) > ; ¯ Ω , α , τ (cid:1) d v = D ( x ) + D ( x ) , where a = x ¯ α j / ¯ α i . For D ( x ) we use integration by parts where r = − Φ ( α i x ¯ α j / ¯ α i + α j v + τ ) v + ω i,j x ¯ α j / ¯ α i s = φ (cid:16) ( x ¯ α j / ¯ α i , v ) > ; ¯ Ω (cid:17) , and then applying Proposition 2.2 from Beranger et al. (2018) to Φ ( α i x ¯ α j / ¯ α i + α j v + τ ) we obtain D ( x ) ≈ ¯ α j ¯ α i φ (cid:16) ( x ¯ α j / ¯ α i , x ) > ; ¯ Ω (cid:17) φ ( α i x ¯ α j / ¯ α i + α j x + τ )( x + ω i,j x ¯ α j / ¯ α i ) ( α i x ¯ α j / ¯ α i + α j x + τ ) K − as x → ∞ . For D ( x ) we also use integration by parts where r = − Φ( α v + α x + τ ) v + ω i,j x , s = φ (cid:16) ( x, v ) > ; ¯ Ω (cid:17) , α i x ¯ α j / ¯ α i + α j v + τ ) we obtain D ( x ) ≈ φ (cid:16) ( x ¯ α j / ¯ α i , x ) > ; ¯ Ω (cid:17) φ ( α i x ¯ α j / ¯ α i + α j x + τ )( ω i,j x + x ¯ α j / ¯ α i ) ( α i x ¯ α j / ¯ α i + α j x + τ ) K − , as x → ∞ . Then the ratio in (3) behaves asymptotically as ( D ( x ) + D ( x )) /D ( x ) as x → ∞ . Furthermore, as x → ∞ D ( x ) D ( x ) ≈ φ (cid:16) ( x ¯ α j / ¯ α i , x ) > ; ¯ Ω (cid:17) φ ( α i x ¯ α j / ¯ α i + α j x + τ ) x (1 + ω i,j ¯ α j / ¯ α i ) ( α i ¯ α j / ¯ α i + α j + τ /x ) exp (cid:0) − x ¯ α j / (cid:1) → , x → ∞ , and as x → ∞ D ( x ) D ( x ) ≈ φ (cid:16) ( x ¯ α j / ¯ α i , x ) > ; ¯ Ω (cid:17) φ ( α i x ¯ α j / ¯ α i + α j x + τ ) x ( ω i,j + ¯ α j / ¯ α i ) ( α i ¯ α j / ¯ α i + α j + τ /x ) exp (cid:0) − x ¯ α j / (cid:1) → . Therefore, the limit in (3) is proven.With case (d) we need to show thatlim x →∞ Pr ( X j ≥ x, X i ≥ x/ ¯ α i )Pr ( X i ≥ x/ ¯ α i ) = 0 . (4)Note that in this case as x → ∞ we haveΦ (cid:0) α ∗ j ¯ α i x + τ ∗ (cid:1) > √ π ( − α ∗ i )¯ α j x , which implies that as x → ∞ ,1 − Φ xα ∗ j ,τ ∗ j (¯ α i ) > − Φ α ∗ i ,τ ∗ i ( x ) , Φ α ∗ j ,τ ∗ j ( x ¯ α i ) < Φ α ∗ i ,τ ∗ i ( x ) . These results imply that Pr ( X j ≥ x, X i ≥ x/ ¯ α i )Pr ( X i ≥ x/ ¯ α i ) ≤ Pr ( Z i ≥ x, Z j ≥ x/ ¯ α i )Pr ( X i ≥ x/ ¯ α i ) K − where ( Z i , Z j ) and K are given in (2). By Savage’s approximation and applying Proposition 2.2 fromBeranger et al. (2018) we obtain as x → ∞ Pr ( Z i ≥ x, Z j ≥ x/ ¯ α i )Pr ( X i ≥ x/ ¯ α i ) K − ≤ φ (( x, x/ ¯ α i ) > ; ¯ Ω ) x (1 / ¯ α i − ω i,j ) (1 − ω i,j / ¯ α i ) Pr ( X i ≥ x/ ¯ α i ) K − ≈ √ π ( − α ∗ i )(1 / ¯ α i − ω i,j ) (1 − ω i,j / ¯ α i ) φ (( x, x/ ¯ α i ) > ; ¯ Ω ) φ ( x ) ≈ (1 + α ∗ i )( − α ∗ i )(1 / ¯ α i − ω i,j ) (1 − ω i,j / ¯ α i ) (1 − ω i,j ) exp ( − x ω i,j ¯ α i ) (1 − ω i,j )(1 + α ∗ i ) ) → , x → ∞ , and therefore also the last limit in (4) is proven. 4 .2 Proof of Lemma 2 Let X n,m ∼ ESN d ( ¯ Ω n , α n , τ ), n ∈ N with m = 1 , . . . , n , where ¯ Ω n and α n are defined in Condition 1 and τ ∈ . From Property 1(i) we have X n,m ; j ∼ ESN ( α ∗ n ; j , τ ∗ n ; j ) , j ∈ I, (5)where I = { , . . . , d } , α ∗ n ; j = α n ; j + P i ∈ I j α n ; i ω n ; i,j C ( ¯ Ω n , α n ) , τ ∗ n,j = τC ( ¯ Ω n , α n ) , (6)with C ( ¯ Ω n , α n ) = X i,k ∈ I j α n ; i α n ; k ( ω n ; i,k − ω n ; i,j ω n ; j,k ) / and ( X n,m ; i , i ∈ I j ) > | X n,m ; j = x j ∼ ESN d − ( ˜ µ n ; j , ˜ Ω n ; j , ˜ α n ; j , ˜ τ n ; j ) , j ∈ I, (7)where ˜ µ n ; j = ( x j ω n ; i,j , i ∈ I j ) > , I j = I \{ j } , ˜ Ω n ; j is a ( d − × ( d −
1) correlation matrix with diagonalentries 1 − ω n ; i,j for i ∈ I j and upper diagonal entries ω n ; i,k − ω n ; i,j ω n ; j,k for i, k ∈ I j and˜ α n ; j = (cid:16)(cid:0) − ω n ; i,j (cid:1) / α n ; i , i ∈ I j (cid:17) > , ˜ τ n ; j = X i ∈ I j ( α n ; i ω n ; i,j ) + α n ; j x j + τ. (8)Using similar steps to those in the proof of Proposition 2.4 from Beranger et al. (2018) we may derivefor all j ∈ I the normalising constants a n ; j = ‘ − n , b n ; j = ‘ n − ln(2 √ π ) + (1 /
2) ln ln n + ln Φ (cid:0) τ ∗ n ; j / ¯ α n ; j (cid:1) − ln Φ (cid:0) α ∗ n ; j ‘ n + τ ∗ n ; j (cid:1) ‘ n , if α ∗ n ; j ≥ ,b n ; j = ‘ n − ln √ π + ln Φ (cid:0) τ ∗ n ; j / ¯ α n ; j (cid:1) − ln Φ (cid:0) α ∗ n ; j ‘ n + τ ∗ n ; j (cid:1) ‘ n − ln Φ (cid:0) ¯ α n ; j ‘ n + α ∗ n ; j τ ∗ n ; j (cid:1) ‘ n , if α ∗ n ; j < , where ¯ α n ; j = { α ∗ n ; j } / , ‘ n = √ n . Once again with similar arguments to those in the proof ofProposition 2.4 from Beranger et al. (2018) we obtain for all j ∈ I ,lim n →∞ Φ nα ∗ n ; j ,τ ∗ n ; j ( a n ; j x j + b n ; j ) = e − e − xj , x j ∈ . Since the second-order partial derivatives of the copula function of
ESN d ( ¯ Ω n , α n , τ ) are continuous, thenfrom the theory of multivariate tail dependence functions (e.g. Nikoloulopoulos et al., 2009, Li, 2009) thestable-tail dependence function can be derived as L ( z ) = lim n →∞ n n − Pr (cid:16) Φ α ∗ n ; j ,τ ∗ n ; j ( X j ) ≤ − z j n , j = 1 , . . . , d (cid:17)o , z ∈ [0 , ∞ ) d = d X j =1 z j lim n →∞ Pr (cid:16) Φ α ∗ n ; i ,τ ∗ n ; i ( X i ) ≤ − z i n , i ∈ I j | Φ α ∗ n ; j ,τ ∗ n ; j ( X j ) = 1 − z j n (cid:17) = d X j =1 z j lim n →∞ Pr X i ≤ a n ; i U i (cid:16) nz i (cid:17) − b n ; i a n ; i + b n ; i , i ∈ I j | X j = a n ; j U j (cid:16) nz j (cid:17) − b n ; j a n ; j + b n ; j U i ( n/z i ) = Φ ← α ∗ n ; i ,τ ∗ n ; i (1 − z i /n ) , U j ( n/z j ) = Φ ← α ∗ n ; j ,τ ∗ n ; j (1 − z j /n )and Φ ← α ∗ n ; i ,τ ∗ n ; i ,Φ ← α ∗ n ; j ,τ ∗ n ; j denote the left-continuous inverse of Φ α ∗ n ; i ,τ ∗ n ; i and Φ α ∗ n ; j ,τ ∗ n ; j , for all j ∈ I and i ∈ I j .From Condition 1 we obtain the following results. For all j ∈ I and i, k ∈ I j , as n → ∞ we have C ( ¯ Ω n , α n ) → X i,k ∈ I j α ◦ i α ◦ k ( λ i,j + λ j,k − λ i,k ) / =: C ( ¯ Λ , α ◦ ) , where α ◦ i α k ∈ , λ i,j , λ j,k , λ i,k ∈ (0 , ∞ ]. Furthermore, as n → ∞ , ω n ; i,k − ω n ; i,j ω n ; j,k q (1 − ω n ; i,j )(1 − ω n ; k,j ) ≈ − λ i,k ln n − (cid:16) − λ i,j ln n (cid:17) (cid:16) − λ j,k ln n (cid:17)s(cid:26) − (cid:16) − λ i,j ln n (cid:17) (cid:27) (cid:26) − (cid:16) − λ j,k ln n (cid:17) (cid:27) (9) → − λ i,k + λ i,j + λ j,k λ i,j λ j,k , q − ω n ; i,j α n ; i = p (1 + ω n ; i,j )(1 − ω n ; i,j ) ln nα n ; i ln n (10) → √ λ i,j α ◦ i ,α ∗ n ; j = P j ∈ I α n ; j − P i ∈ I j α n ; i (1 − ω n ; i,j ) C ( ¯ Ω n , α n ) → , (11) τ ∗ n ; j = τ r P i,k ∈ I j α n ; i α n ; k q (1 − ω n ; i,j )(1 − ω n ; j,k ) ( ω n ; i,k − ω n ; i,j ω n ; j,k ) √ (1 − ω n ; i,j )(1 − ω n ; k,j ) (12) → τC ( ¯ Λ , α ◦ ) ≡ τ ∗ j and α ∗ n ; j √ ln n = − P i ∈ I j α n ; i (1 − ω n ; i,j ) √ ln nC ( ¯ Ω n , α n ) (13) → − P i ∈ I j λ i,j α ◦ i C ( ¯ Λ , α ◦ ) . From (11) and (12) we have that as n → ∞ U i ( n/z i ) ≈ Φ ← ,τ ∗ i (1 − z i /n ) , U j ( n/z j ) ≈ Φ ← ,τ ∗ j (1 − z j /n ) , and from Resnick (1987, Proposition 0.10) it follows that as n → ∞ ( U i ( n/z i ) − b n ; i ) /a n ; i ≈ ln z − i , ( U j ( n/z j ) − b n ; j ) /a n ; j ≈ ln z − i . L ( z ) = d X j =1 z j lim n →∞ Φ (cid:26) X i ≤ U i (cid:18) nz i (cid:19) , i ∈ I j | X j = U j (cid:18) nz j (cid:19) ; ˜ Ω n ; j , ˜ α n ; j , ˜ τ n ; j (cid:27) = d X j =1 z j lim n →∞ Φ a n ; i U i (cid:16) nz i (cid:17) − b n ; i a n ; i + b n ; j − ω n ; i,j a n ; i U i (cid:16) nz i (cid:17) − b n ; j a n ; i + b n ; j , i ∈ I j ; ˜ Ω n ; j , ˜ α n ; j , ˜ τ n ; j ≈ d X j =1 z j lim n →∞ Φ a n ; i ln z − i + b n ; j − ω n ; i,j (cid:0) a n ; j ln z − j + b n ; j (cid:1) { − ω n ; i,j } / , i ∈ I j ; ¯˜ Ω n ; j , ˜ α n ; j , ˜ τ n ; j ! . From (9) and (10) we have that ¯˜ Ω n ; j converges elementwise to ¯ Λ j and ˜ α n ; j converges componentwise to ˜ α j as n → ∞ . Now, if we assume that α ∗ n ; i ≥ α ∗ n ; j <
0, then by (11), (12) and (13) we have for all j ∈ I and i ∈ I j that a n ; i ln z − i + b n ; j − ω n ; i,j (cid:0) a n ; j ln z − j + b n ; j (cid:1) { − ω n ; i,j } / = ln z − i − ln z − j + (1 − ω n ; i,j ) ln z − j ‘ n { − ω n ; i,j } / + { − ω n ; i,j } / ‘ n { ω n ; i,j } / − ln(2 √ π ) + (1 /
2) ln ln n + ln Φ (cid:0) τ ∗ n ; i / ¯ α n ; i (cid:1) − ln Φ (cid:0) α ∗ n ; i ‘ n + τ ∗ n ; i (cid:1) ‘ n { − ω n ; i,j } / + ω n ; i,j ln √ π + ln Φ (cid:0) τ ∗ n ; j / ¯ α n ; j (cid:1) − ln Φ (cid:0) α ∗ n ; j ‘ n + τ ∗ n ; j (cid:1) ln Φ (cid:0) ¯ α n ; j ‘ n + α ∗ n ; j τ ∗ n ; j (cid:1) ‘ n { − ω n ; i,j } / → ln z − i − ln z − j λ ij + λ ij + ln Φ (cid:18) τ − P k ∈ Ii √ λ k,i α ◦ k C ( ¯ Λ , α ◦ ) (cid:19) − ln Φ (cid:18) τ − P k ∈ Ij √ λ k,j α ◦ k C ( ¯ Λ , α ◦ ) (cid:19) λ i,j − ln Φ (cid:16) τC ( ¯ Λ , α ◦ ) (cid:17) − ln Φ (cid:16) τC ( ¯ Λ , α ◦ ) (cid:17) λ i,j as n → ∞ = ln ˜ z j / ˜ z i λ i,j + λ i,j . Furthermore we also have˜ τ n ; j = τ − X i ∈ I j (1 − ω n ; i,j ) α n ; i ( a n ; j ln z − j + b n ; j )= τ − X i ∈ I j (1 − ω n ; i,j ) α n ; i ln z − j ‘ n − X i ∈ I j (1 − ω n ; i,j ) α n ; i ‘ n + P i ∈ I j (1 − ω n ; i,j ) ‘ n (cid:16) ln √ π + ln Φ (cid:0) τ ∗ n ; j / ¯ α n ; i (cid:1) − ln Φ (cid:0) α ∗ n ; j ‘ n + τ ∗ n ; j (cid:1) ln Φ (cid:0) ¯ α n ; i ‘ n + α ∗ n ; j τ ∗ n ; j (cid:1)(cid:17) → τ + 0 − X i ∈ I j √ α ◦ i λ i,j + 0 ≡ ˜ τ as n → ∞ . Combining all these results together produces the final expression in (5 ∗ ). The same results are also obtainedwith similar steps for the cases ( α ∗ n ; j ≥ α ∗ n ; i < α ∗ n ; i ≥ α ∗ n ; j ≥
0) and (α*_{n;i} < 0, α*_{n;j} < 0).

References

Arellano-Valle, R. B. and M. G. Genton (2010). Multivariate extended skew-t distributions and related families. Metron 68(3), 201–234.

Azzalini, A. and A. Capitanio (2014). The Skew-Normal and Related Families. Cambridge University Press, Cambridge.

Beranger, B., S. A. Padoan, Y. Xu, and S. A. Sisson (2018). Extremal properties of the univariate extended skew-normal distribution. Submitted.

Li, H. (2009). Orthant tail dependence of multivariate extreme value distributions. Journal of Multivariate Analysis 100, 243–256.

Nikoloulopoulos, A. K., H. Joe, and H. Li (2009). Extreme value properties of multivariate t copulas. Extremes 12, 129–148.

Resnick, S. I. (1987). Extreme Values, Regular Variation, and Point Processes. Springer-Verlag.