Regression-type analysis for block maxima on block maxima
Miguel de Carvalho^a, Gonçalo dos Reis^{a,b}, Alina Kumukova^c

^a School of Mathematics, University of Edinburgh, The King's Buildings, Edinburgh, EH9 3FD, UK
^b Centro de Matemática e Aplicações (CMA), FCT, UNL, PT
^c Maxwell Institute for Mathematical Sciences, School of Mathematics, University of Edinburgh, Edinburgh, UK
Abstract
This paper devises a regression-type model for the situation where both the response and covariates are extreme. The proposed approach is designed for the setting where both the response and covariates are themselves block maxima, and thus, contrarily to standard regression methods, it takes into account the key fact that the limiting distribution of suitably standardized componentwise maxima is an extreme value copula. An important target in the proposed framework is the regression manifold, which consists of a family of regression lines obeying the latter asymptotic result. To learn about the proposed model from data, we employ a Bernstein polynomial prior on the space of angular densities which leads to an induced prior on the space of regression manifolds. Numerical studies suggest a good performance of the proposed methods, and a finance real-data illustration reveals interesting aspects of the comovements of extreme losses between two leading stock markets.
Key words and phrases:
Bernstein polynomials, Block maxima, Extreme value copula, Joint extremes, Multivariate extreme value distribution, Quantile regression, Statistics of extremes.
1. Introduction
Block maxima data—such as annual maxima—are a mainstay of statistics of extremes. Whereas classical statistical modeling is mostly concerned with inferences surrounding the bulk of a distribution, the field of statistics of extremes deals with the rather challenging situation of conducting inferences about the tail of a distribution. The behavior of extreme values in large samples is often mathematically tractable, and this tractability is often used to build sound statistical methods for modeling risk and extreme values. As an example of this asymptotic tractability, it is well known that if Y_1, ..., Y_n is a random sample with sample maximum M_n = max(Y_1, ..., Y_n) and if there exist sequences {a_n > 0} and {b_n} such that (M_n − b_n)/a_n →_d Z, then Z follows a GEV (Generalized Extreme Value) distribution with location, scale, and shape parameters µ ∈ R, σ > 0, and ξ ∈ R, respectively; see, for instance, (Coles, 2001, Theorem 3.1.1). Details on the paradigm of statistics of extremes can be found, for instance, in the books of Coles (2001), Beirlant et al. (2004), and de Haan and Ferreira (2006); for applications in finance, insurance, and economics see, for instance, Embrechts et al. (1997), (McNeil et al., 2015, Ch. 7), and Longin (2017).

In this paper, we devise a regression-type method for the situation where both the response and the covariates are themselves block maxima. An important result in the field of statistics of extremes—that will be fundamental for our developments—is that the properly standardized vector of block maxima converges in distribution to a so-called extreme value copula (Gudendorf and Segers, 2010). Thus, a key target in the proposed framework is what we will refer to below as the regression manifold, that is, a family of regression lines that obeys the latter large sample result. Our methods thus take on board information on the dependence structure between the extreme values so as to assess what effects block maxima covariates can have on a block maxima response. To learn about the proposed model from data, we develop a prior on the space of regression manifolds by resorting to a flexible Bernstein polynomial prior on the space of angular densities as recently proposed by Hanson et al. (2017).

Our approach contributes to the literature on conditional modeling given large observed values (e.g. Wang and Stoev, 2011; Cooley et al., 2012); nonetheless, our focus differs from the latter papers in a number of important ways, as we describe next.

Preprint submitted to Elsevier, February 19, 2021.
A main difference is that, as anticipated above, here the focus is on devising a regression framework for a block maxima response on block maxima covariates, whereas the latter papers focus mainly on using the conditional density as a way to make probabilistic statements about the likelihood of an extreme given the occurrence of another extreme. Since our main target of analysis is regression, our method has some links with statistical approaches for nonstationary extremes (e.g. Coles, 2001, Section 6); the most elementary version of approaches for nonstationary extremes aims to learn about how the limiting law of a suitably standardized block maxima response (S_x) changes according to a covariate x = (x_1, ..., x_p)^T, via the specification

(S | X = x) ~ GEV(µ_x, σ_x, ξ_x).   (1)

Since the approach in (1) is built from the univariate theory of extremes, it is not tailored for conditioning on another variable being extreme, as it fails to take on board information from the dependence structure between the extremes.

Additionally, the method proposed in this work has some relation to quantile regression (Koenker, 2005), whose most elementary version consists of modeling the conditional quantile of a response Y given a covariate X = (X_1, ..., X_p)^T in a linear fashion, that is,

F^{−1}(q | x) = x^T β_q,  0 < q < 1,   (2)

where F^{−1}(q | x) = inf{y : F(y | x) ≥ q} and F(y | x) is the distribution function of Y | X. Versions of quantile regression that aim to equip (2) with the ability to extrapolate into the tail of Y are often known as extremal quantile regression methods (e.g. Chernozhukov, 2005). While flexible and sturdy, such quantile regression-based approaches do not take into account information on the fact that the limiting joint distribution of suitably standardized componentwise maxima is an extreme value copula, and thus fail to be equipped with the ability to extrapolate into the joint tail.

The remainder of the paper unfolds as follows.
In Section 2 we introduce the proposed model and devise an approach for learning about it from data. Section 3 reports the main findings of a Monte Carlo simulation study. We showcase the proposed methodology in a real data application to stock market data in Section 4. Finally, in Section 5 we present closing remarks. Proofs and derivations can be found in the appendix, and further numerical experiments and other technical details are presented in the supplementary material (Appendix E and Appendix F).

2. Modelling limiting block maxima conditioned on block maxima

Prior to introducing a regression of block maxima on block maxima we need to lay groundwork on multivariate extremes. Let {(S_i, Z_i)}_{i=1}^n be a sequence of independent random vectors with unit Fréchet marginal distributions, i.e. exp(−1/z), for z >
0. In our setup, S_i should be understood as a response, whereas Z_i = (Z_{1,i}, ..., Z_{p,i}) should be understood as a p-dimensional covariate. Let the componentwise block maxima be M_n = (M_{n,s}, M_{n,z_1}, ..., M_{n,z_p}) with M_{n,s} = max{S_1, ..., S_n} and M_{n,z_j} = max(Z_{j,1}, ..., Z_{j,n}), for j = 1, ..., p. Under this setup, it is well known that the vector of normalized componentwise maxima M_n/n converges in distribution to a random vector (Y, X) which follows a multivariate extreme value distribution with joint distribution function

G(y, x) = exp{−V(y, x)},  y, x_1, ..., x_p > 0.   (3)

Here,

V(y, x) = d ∫_{Δ_d} max(w_1/y, w_2/x_1, ..., w_d/x_p) H(dw),

is the so-called exponent measure; see, for instance, de Haan and Resnick (1977), Pickands (1981), and (Coles, 2001, Theorem 8.1). In addition, H is a parameter of the multivariate extreme value distribution G known as the angular measure, which controls the dependence between the extreme values; specifically, H is a probability measure on the unit simplex Δ_d = {(w_1, ..., w_d) ∈ [0, 1]^d : Σ_{i=1}^d w_i = 1} ⊂ R^d, with d = p + 1, obeying the mean constraint

∫_{Δ_d} w H(dw) = (1/d) 1_d,   (4)

where 1_d is a vector of ones in R^d. If H is absolutely continuous with respect to the Lebesgue measure then its density is given by the Radon–Nikodym derivative h = dH/dw, for w ∈ Δ_d.

2.1. Regression manifold for block maxima on block maxima
We are now ready to introduce our regression method for block maxima on block maxima. We define the regression manifold as the family of regression lines, L = {L_q : 0 < q < 1}, with

L_q = {y_{q|x} : x ∈ (0, ∞)^p},   (5)

where

y_{q|x} = inf{y > 0 : G_{Y|X}(y | x) ≥ q},   (6)

is a conditional quantile of a multivariate extreme value distribution, with q ∈ (0, 1) and x ∈ (0, ∞)^p, and G_{Y|X}(y | x) = P{Y ≤ y | X = x} is a conditional multivariate extreme value distribution function.

In higher dimensions G_{Y|X} can be expressed with the help of a joint multivariate extreme value density g_{Y,X}, whose expression has been derived by Stephenson and Tawn (2005). By applying Bayes' theorem, we deduce G_{Y|X}(y | x) = ∫_0^y g_{Y|X}(z | x) dz from g_{Y,X}, with g_{Y|X} given as follows:

g_{Y|X}(y | x) = [exp{−V(y, x)} Σ_{i=1}^d Σ_{j=1}^{n_{i,d}} (−1)^i Π_{Λ ∈ r_{ij,d}} V_Λ(y, x)] / [Σ_{i=1}^d Σ_{j=1}^{n_{i,d}} (−1)^i ∫_0^∞ exp{−V(y, x)} Π_{Λ ∈ r_{ij,d}} V_Λ(y, x) dy],  y, x > 0,   (7)

where V_Λ(y, x) corresponds to the mixed partial derivative of the exponent measure V(y, x) with respect to the ith components of (y, x) such that i ∈ Λ, n_{l,d} is the number of partitions of {1, ..., d} of size l = 1, ..., d, and r_{lm,d} is the mth partition of {1, ..., d} of size l, with 1 ≤ m ≤ n_{l,d}.

In the particular case where we have a single covariate (p = 1), the regression manifold L in (5) can be derived using properties of bivariate copulas; see Appendix A. Accordingly, for an absolutely continuous angular measure H (with density h), it follows that

G_{Y|X}(y | x) = 2 exp{−2 ∫_0^1 max(w/x, (1 − w)/y) h(w) dw + 1/x} ∫_{ω(x,y)}^1 w h(w) dw,  x, y > 0,   (8)

where ω(x, y) = x/(x + y), and y_{q|x} is then calculated via (6). For general p ≥
1, in the two special cases of independent and perfectly dependent extremes the regression manifold L in (5) can be obtained in a more direct way. For the former, we derive the regression lines L_q explicitly by first calculating g_{Y|X} and then using that G_{Y|X}(y | x) = ∫_0^y g_{Y|X}(z | x) dz, while for the latter, we provide approximations.

When extremes are independent, H assigns equal mass to the boundaries of the simplex, which also corresponds to asymptotic independence of X and Y (Hüsler and Li, 2009), resulting in

L_q = {−1/log q : x ∈ (0, ∞)^p},   (9)

with

G_{Y|X}(y | x) = ∫_0^y [exp(−z^{−1} − x_1^{−1} − ... − x_p^{−1}) z^{−2} Π_{j=1}^p x_j^{−2}] / [exp(−x_1^{−1} − ... − x_p^{−1}) Π_{j=1}^p x_j^{−2}] dz = e^{−y^{−1}},  y > 0.

When extremes are perfectly dependent, the angular measure H assigns all its mass to the barycenter of the simplex, d^{−1} 1_d, leading to G(y, x) = exp{−max(y^{−1}, x_1^{−1}, ..., x_p^{−1})}. Taking derivatives of G(y, x) in this case is non-trivial, and we replace the maximum function with a soft maximum (Cook, 2011) so as to obtain an approximation for the shape of the regression lines for perfectly dependent extremes. Thus, the soft maximum approximation for the regression lines for perfectly dependent extremes is

L̃_q = {min(x_1, ..., x_p) : x ∈ (0, ∞)^p},   (10)

with G_{Y|X}(y | x) = I{y ≥ min(x_1, ..., x_p)}, where I is the indicator function; see Appendix B for the derivation.

Trivially, quantiles of a conditional extreme value distribution obey the standard properties of quantile functions (see van der Vaart, 1998, Chap. 21). Less trivial, however, is the fact that monotone regression dependence of bivariate extremes (Guillem, 2000, Theorem 1) implies that the regression lines L_q in (5) are non-decreasing in x, for p = 1, under some mild assumptions.

Proposition 2.1 (Monotonicity of regression manifold).
Let G_{Y|X}(y | x) = P{Y ≤ y | X = x} be a conditional bivariate extreme value distribution function, which we assume to be jointly continuously differentiable and strictly increasing in y for any fixed x ∈ (0, ∞). Then, the regression lines for bivariate extremes (0, ∞) ∋ x ↦ y_{q|x} are non-decreasing for all q ∈ (0, 1).

Proof. See Appendix C.

An example of a bivariate extreme value distribution satisfying the assumptions of Proposition 2.1 is the logistic model, whose regression manifold is discussed in Example 2.2 below.
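The definition (6) and the monotonicity in Proposition 2.1 are easy to illustrate numerically. The sketch below (Python; function names are ours, for illustration only) inverts a conditional distribution function by bisection and uses the independence case, where Y | X = x is unit Fréchet, so that y_{q|x} = −1/log q for every x, matching (9); the lines are flat in x and hence trivially non-decreasing.

```python
import math

def conditional_quantile(G, q, x, lo=1e-9, hi=1e9):
    # Smallest y with G(y | x) >= q, cf. (6), found by bisection; assumes G is
    # continuous and non-decreasing in y with G(lo | x) < q <= G(hi | x).
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if G(mid, x) >= q:
            hi = mid
        else:
            lo = mid
    return hi

# Independence case: Y | X = x is unit Frechet, G(y | x) = exp(-1/y),
# so y_{q|x} = -1/log(q) for every x, matching (9).
G_indep = lambda y, x: math.exp(-1.0 / y)
y_med = conditional_quantile(G_indep, 0.5, x=3.0)
```

The same solver applies verbatim to any of the parametric conditional distributions introduced next, since all of them are continuous and increasing in y.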
Parametric examples of regression manifolds for block maxima on block maxima
We now consider some parametric instances of regression manifolds as defined in (5). Charts of regression manifolds for these parametric examples are depicted in Figure 1. In Appendix D, we show that for sufficiently large x, the following approximation holds for the regression manifold of the logistic model, L̃_q = {ỹ_{q|x} : x ∈ (0, ∞)^p}, with

ỹ_{q|x} = γ_q + β_q x + o(x).   (11)

Here γ_q and β_q are functions of α and q (see (D.4) and (D.3)), and o(x) is little-o of x in Bachmann–Landau notation.

Example 2.2 (Logistic). An instance of the logistic regression manifold can be found in Figure 1 (top). It stems from the logistic bivariate extreme value distribution function given by

G(x, y) = exp{−(x^{−1/α} + y^{−1/α})^α},  x, y > 0,

where α ∈ (0,
1] characterizes the dependence between extremes: the closer α is to 0, the stronger the dependence, with the limit α → 0 corresponding to perfect dependence and α = 1 to independence. The conditional distribution of Y given X is

G_{Y|X}(y | x) = G(x, y)(x^{−1/α} + y^{−1/α})^{α−1} x^{1−1/α} exp(1/x),  x, y > 0,

thus leading to the following family of regression lines L_q in (5), where

y_{q|x} = [{((1 − α)/α) W((α/(1 − α)) x^{−1} e^{α/((1−α)x)} q^{α/(α−1)})}^{1/α} − x^{−1/α}]^{−α},   (12)

and x >
0. Here, W is the so-called Lambert W function, that is, the multivalued analytic inverse of f(z) = z exp(z), with z denoting a real or complex number (Borwein and Lindstrom, 2016); see the supplementary material, Appendix F, for further details. As can be seen from Figure 1 (top), the regression lines obey what is claimed in Proposition 2.1 in the sense that (0, ∞) ∋ x ↦ y_{q|x} is non-decreasing for all q ∈ (0, 1).

Figure 1: Regression manifold L, as defined in (5), for bivariate logistic, Hüsler–Reiss, and Coles–Tawn models (top to bottom) with strong, intermediate, and weak extremal dependence (left to right).

Example 2.3 (Hüsler–Reiss). An instance of the Hüsler–Reiss regression manifold is depicted in Figure 1 (middle). It follows from the Hüsler–Reiss bivariate extreme value distribution function, which has the following form:

G(x, y) = exp{−x^{−1} Φ(λ + (1/(2λ)) log(y/x)) − y^{−1} Φ(λ + (1/(2λ)) log(x/y))},  x, y > 0,

where λ ∈ (0, ∞] is the parameter regulating the dependence between extremes: λ → 0 corresponds to perfect dependence and λ → ∞ corresponds to complete independence. The family of regression lines L_q in (5) for this model does not have explicit representations and is obtained using (6) with

G_{Y|X}(y | x) = [Φ(λ + (1/(2λ)) log(y/x)) + (1/(2λ)) φ(λ + (1/(2λ)) log(y/x)) − (x/(2λy)) φ(λ + (1/(2λ)) log(x/y))] × G(x, y) exp{1/x},  x, y > 0,

where φ is the standard normal density function.

Example 2.4 (Coles–Tawn). An instance of the Coles–Tawn regression manifold is depicted in Figure 1 (bottom).
It follows from the Coles–Tawn bivariate extreme value distribution function, which has the following form:

G(x, y) = exp[−x^{−1}{1 − Be(q; α + 1, β)} − y^{−1} Be(q; α, β + 1)],  x, y > 0,

where Be(q; a, b) is the distribution function of a beta distribution with parameters a, b > 0, evaluated at q = αy^{−1}/(αy^{−1} + βx^{−1}), and α, β > 0 regulate the dependence between extremes: α = β = 0 corresponds to complete independence and α = β → ∞ corresponds to perfect dependence; for fixed α (β), the strength of dependence increases with β (α). The family of regression lines L_q in (5) for this model does not have an explicit representation and is calculated using (6), for x, y > 0, with

G_{Y|X}(y | x) = [1 − Be(q; α + 1, β) + (α + 1)βγ be(q; α + 2, β + 1) − (x/y) α(β + 1)γ be(q; α + 1, β + 2)] G(x, y) exp{1/x},

where be(q; a, b) is the density function of the beta distribution with parameters a, b > 0, and γ = {(α + β + 1)(α + β + 2)}^{−1}.

This section introduced our key parameter of interest—regression manifolds for block maxima on block maxima, i.e. L as in (5)—, commented on some of its properties, and gave examples of parametric instances. Next, we discuss Bayesian inference for L.

In this section we discuss how to learn about regression manifolds from data. To achieve this, we will define an induced prior on the space of regression manifolds by resorting to a flexible prior on the space of all angular measures that was recently proposed by Hanson et al. (2017). To lay the groundwork, we start by defining the setup of interest. Given the observations {(S_i, Z_i)}_{i=1}^n, define the scalar R_i = S_i + Σ_{j=1}^p Z_{j,i} and the vector W_i = (S_i, Z_i)/R_i, known as the pseudo-angular decomposition of the observations. In de Haan and Resnick (1977) the convergence of normalized componentwise maxima to G is shown to be equivalent to the following weak convergence of measures:

P{W ∈ · | R > u} →_d H(·),  as u → ∞.
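The pseudo-angular decomposition and the thresholding on the pseudo-radius just described can be sketched as follows (Python; function and variable names are ours, purely illustrative):

```python
def pseudo_angular(points, keep_frac=0.05):
    # Pseudo-angular decomposition: R_i = sum of coordinates, W_i = point / R_i.
    # Returns the pseudo-angles whose pseudo-radius R_i exceeds the empirical
    # (1 - keep_frac) quantile of the radii.
    radii = [sum(p) for p in points]
    angles = [tuple(c / r for c in p) for p, r in zip(points, radii)]
    idx = min(len(radii) - 1, int((1 - keep_frac) * len(radii)))
    u = sorted(radii)[idx]  # high threshold on R
    return [w for w, r in zip(angles, radii) if r > u]

# Toy bivariate sample (unit Frechet-like scale, for illustration only)
pts = [(1.0, 2.0), (10.0, 30.0), (0.5, 0.5), (100.0, 50.0), (3.0, 1.0)]
kept = pseudo_angular(pts, keep_frac=0.4)
```

Each retained pseudo-angle lies on the unit simplex (its coordinates sum to one), which is exactly the support of the angular measure H.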
When R is sufficiently large, the pseudo-angles W are nearly independent of R and approximately follow a distribution associated with the angular measure H. Thus, to learn about L_q in (5), we first learn about H based on {W_i : R_i > u}_{i=1}^n, for a large u, with a methodology we describe next.

Following Hanson et al. (2017), we model the angular density h via a Bernstein polynomial defined on the unit simplex Δ_d, and hence basis polynomials are Dirichlet densities. Specifically, our specification for the angular density is as follows:

h_v(w) = Σ_{|α|=J} v_α [Γ(|α|) / Π_{i=1}^d Γ(α_i)] Π_{i=1}^d w_i^{α_i − 1},  w ∈ Δ_d,   (13)

where v_α > 0, α ∈ N^d, |α| = Σ_{j=1}^d α_j, and J ∈ N (with N := {1, 2, 3, ...}). To ensure that the resulting h_v(w) is a valid angular density (i.e. an actual density satisfying the moment constraint (4)), the weights must obey

Σ_{|α|=J} v_α = 1,  Σ_{i=1}^{J−d+1} i Σ_{|α|=J, α_j=i} v_α = J/d,   (14)

for j = 1, ..., d.

To induce a prior on the space of regression manifolds we plug the angular density in (13) into (7); subsequent integration with respect to y and inversion of G_{Y|X}(y | x) leads to an induced prior on the space of regression lines L_q. In detail, to define a prior on the space of regression manifolds we proceed as follows; to streamline the presentation of ideas, we first concentrate on the case p = 1. The Bernstein polynomial prior in (13) induces a prior on the space of regression lines L_q = {y_{q|x} : x ∈ (0, ∞)}, where y_{q|x} is a solution to the equation G_{Y|X}(y | x) = q, for q ∈ (0, 1), with

G_{Y|X}(y | x) = (2/J) exp{−(2/J) Σ_{|α|=J} v_α (α_1 x^{−1} [1 − Be(x/(x + y); α_1 + 1, α_2)] + α_2 y^{−1} Be(x/(x + y); α_1, α_2 + 1))} × Σ_{|α|=J} v_α α_1 [1 − Be(x/(x + y); α_1 + 1, α_2)] exp{1/x},  x, y > 0.   (15)
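For d = 2 the Dirichlet basis densities in (13) reduce to beta densities, and the validity conditions (14) are easy to check numerically. Below is a small sketch (Python; the weight choice is ours, purely illustrative, not an estimate) that builds h_v for J = 4 and verifies by quadrature that it integrates to one and satisfies the mean constraint (4), i.e. ∫ w h_v(w) dw = 1/2.

```python
import math

J = 4
# Illustrative weights v_alpha for alpha = (a, J - a), a = 1, 2, 3; they satisfy
# the constraints (14): sum_a v_a = 1 and sum_a a * v_a = J / 2.
v = {1: 0.25, 2: 0.50, 3: 0.25}

def beta_pdf(w, a, b):
    # Beta(a, b) density: the d = 2 Dirichlet basis polynomial in (13)
    logc = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(logc) * w ** (a - 1) * (1.0 - w) ** (b - 1)

def h_v(w):
    # Bernstein-polynomial angular density on (0, 1), cf. (13)
    return sum(va * beta_pdf(w, a, J - a) for a, va in v.items())

# Midpoint-rule checks: h_v should integrate to one and satisfy the mean
# constraint (4), int w h_v(w) dw = 1/2.
n = 50000
vals = [(w, h_v(w)) for w in ((i + 0.5) / n for i in range(n))]
total = sum(hv for _, hv in vals) / n
mean = sum(w * hv for w, hv in vals) / n
```

Any weight vector violating (14) would fail one of these two quadrature checks, which is why the MCMC scheme discussed later operates on a constrained transformation of the weights.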
When p > 1, the regression manifold is obtained by numerically solving G_{Y|X}(y | x) = q, with h_v(w) as in (13). The expression for the conditional multivariate extreme value distribution G_{Y|X}(y | x) for p > 1 is discussed below, in the p-covariate setting.

Induced prior on marginal effects
We now switch gears and focus on marginal effects. Generally speaking, one can quantify the effect of a covariate on the quantiles of the response via so-called marginal effects, expressed as the derivative of the regression lines with respect to that covariate. For a given conditional quantile of the response, the marginal effect of the ith covariate is interpreted as the change in this quantile of the response occurring due to a change in the ith covariate, with the other covariates being fixed.

The marginal effect of x_i on y_{q|x} can be formally defined by taking the derivative of the regression line L_q with respect to x_i, that is,

β_{q,x_i} = ∂y_{q|x}/∂x_i,  i = 1, ..., p.

To ease notation we focus on the case p = 1. Note first that continuity of H implies strict monotonicity of G_{Y|X} in y, and thus the marginal effects β_{q,x} are non-negative for all x ∈ (0, ∞) and q ∈ (0, 1). If h(w) >
0, for w ∈ (0, 1), then

β_{q,x} = [A(x){1 − 2A(x)} x^{−2} + B(x) y_{q|x}] / [2 A(x) C(x) y_{q|x}^{−2} + B(x) x],   (16)

with q ∈ (0, 1) and x ∈ (0, ∞), and where

A(x) = J^{−1} Σ_{|α|=J} v_α α_1 {1 − Be(x/(x + y_{q|x}); α_1 + 1, α_2)},
B(x) = x (x + y_{q|x})^{−3} Σ_{|α|=J} v_α be(x/(x + y_{q|x}); α_1, α_2),
C(x) = J^{−1} Σ_{|α|=J} v_α α_2 Be(x/(x + y_{q|x}); α_1, α_2 + 1).

Some comments on the prior in (16) are in order. First, (16) is derived by applying the implicit function theorem and calculating the derivative of y_{q|x} with respect to x, in a similar fashion as in (C.1). Second, β_{q,x} in (16) is a valid prior in the sense that β_{q,x} ≥
0; the latter claim follows again from Proposition 2.1.

The p-covariate setting

The procedure discussed in Sections 2.1 and 2.2 becomes analytically intractable and computationally expensive even for a relatively low-dimensional covariate space. We thus propose an approach for learning about the regression manifold L, as defined in (5), via an approximation to the conditional multivariate GEV density. Let u = (y, x) ∈ (0, ∞)^d and u(t) = (t, x) ∈ (0, ∞)^d with ||u|| = y + Σ_{i=1}^p x_i > u for a large threshold u; then, following Cooley et al. (2012, Proposition 1), the conditional density of a multivariate extreme value distribution can be approximated, via a point process representation for extremes, as follows:

g_{Y|X}(y | x) = ||u||^{−d−1} h(u/||u||) / ∫_0^∞ ||u(t)||^{−d−1} h(u(t)/||u(t)||) dt.   (17)

An induced prior for g can be devised by plugging the approximation in (17) into the specification from Section 2.1, which leads to the following induced prior for the conditional multivariate extreme value distribution function:

G_{Y|X}(y | x) = [Σ_{|α|=J} C_α Π_{i=1}^p x_i^{α_i − 1} ∫_0^y t^{α_d − 1} ||u(t)||^{−J−1} dt] / [Σ_{|α|=J} C_α Π_{i=1}^p x_i^{α_i − 1} ∫_0^∞ t^{α_d − 1} ||u(t)||^{−J−1} dt],   (18)

where C_α = v_α Γ(|α|) / Π_{i=1}^d Γ(α_i).

Figure 2: True regression manifold L, as defined in (5), and its estimate L̂ obtained using the methods from Section 2.2 for Hüsler–Reiss, logistic, and Coles–Tawn bivariate extreme value models (top to bottom) on a single-run experiment.
Hence, we can learn about the regression manifold L by estimating v_α as described in Section 2.2, that is, by plugging the Bernstein polynomial estimates (13) into the approximation (17) and numerically inverting (18) with respect to y. The merit of this strategy is illustrated numerically in the supplementary material (Appendix E).
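As a sanity check on (18), the ratio of integrals can be evaluated by quadrature. The sketch below (Python; the coefficient choice is ours, purely illustrative, not an estimate) does this for p = 2 covariates, with ||u(t)|| = t + x_1 + x_2 as in the paper, and lets one verify that the resulting G_{Y|X}(· | x) behaves like a distribution function (monotone, tending to one).

```python
import math

J, d = 4, 3
# alpha = (alpha_1, alpha_2, alpha_3) with |alpha| = J; illustrative weights.
terms = {(1, 1, 2): 0.4, (1, 2, 1): 0.3, (2, 1, 1): 0.3}  # alpha -> v_alpha

def C(alpha):
    # Gamma(|alpha|) / prod Gamma(alpha_i), the constant multiplying v_alpha
    return math.gamma(J) / math.prod(math.gamma(a) for a in alpha)

def integral(x, alpha, lo, hi, n=5000):
    # midpoint rule for int t^(alpha_d - 1) ||u(t)||^(-J-1) dt over (lo, hi)
    h, s = (hi - lo) / n, sum(x)
    return h * sum((lo + (i + 0.5) * h) ** (alpha[-1] - 1)
                   * (lo + (i + 0.5) * h + s) ** (-J - 1) for i in range(n))

def piece(x, alpha, upper):
    # split the range so the quadrature resolves the peak near the origin
    if upper <= 50.0:
        return integral(x, alpha, 0.0, upper)
    return integral(x, alpha, 0.0, 50.0) + integral(x, alpha, 50.0, upper)

def G_cond(y, x, far=1e5):
    # ratio in (18); 'far' truncates the integral over (0, infinity)
    num = sum(v * C(a) * x[0] ** (a[0] - 1) * x[1] ** (a[1] - 1) * piece(x, a, y)
              for a, v in terms.items())
    den = sum(v * C(a) * x[0] ** (a[0] - 1) * x[1] ** (a[1] - 1) * piece(x, a, far)
              for a, v in terms.items())
    return num / den
```

Inverting this function in y with any one-dimensional root finder then yields the approximate regression lines for p = 2.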
3. Simulation study
We study the finite-sample performance of the proposed methods under three data generating scenarios based on the models introduced in Section 2.1; see Examples 2.2–2.4. Specifically, we simulate data as follows:

• Scenario 1—strongly dependent extremes: Hüsler–Reiss model with λ = 0.
• Scenario 2—weakly dependent extremes: logistic model with α = 0.
• Scenario 3—asymmetric intermediate dependence: Coles–Tawn model with α = 0. and β = 100.

For now we focus on illustrating the methods in a single-run experiment; a Monte Carlo simulation study will be reported in Section 3.2. To illustrate how the resulting estimates compare with the true regression lines in a one-shot experiment, in each scenario we generate n = 5000 samples {(Y_i, X_i)}_{i=1}^n. For the analysis we use observations for which y_i + x_i > u, where u is the 95% quantile of Y + X, providing k = 250 exceedances to fit the model. We take the number of basis functions equal to the number of exceedances k. To learn about regression lines from data, we exploit a single-component adaptive Markov chain Monte Carlo (MCMC) scheme with a wide Dirichlet prior, Dirichlet(0. × 1_k), defined on a generalized logit transformation of the weights v_α. The length of the MCMC chain is 6000 with a burn-in period of 2000.

In Figure 2 we plot true and estimated regression manifolds under the three scenarios above over a bounded range of (x, y) values. Overall, the model recovers L well in all three cases. However, for q close to 0 it overestimates the dependence for weakly dependent extremes and underestimates the dependence between strongly dependent extremes. This is even more evident in Figure 3, where we depict cross sections of the regression manifolds for the Hüsler–Reiss (top), logistic (central), and Coles–Tawn (bottom) models. The plots on the left of Figure 3 represent regression lines for bivariate extremes L_q (for fixed q), and show that the stronger the dependence between extremes, the closer the lines get to the identity y = x.
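These qualitative features can be reproduced directly from the closed-form conditional distribution of Example 2.2; the sketch below (Python, our own numerical check, not the paper's code) inverts it by bisection and compares a strongly dependent configuration with a nearly independent one.

```python
import math

def G_logistic(y, x, alpha):
    # Conditional cdf of Y | X = x for the bivariate logistic model (Example 2.2):
    # G(x, y) (x^{-1/a} + y^{-1/a})^{a-1} x^{1-1/a} e^{1/x}
    s = x ** (-1.0 / alpha) + y ** (-1.0 / alpha)
    return (math.exp(-s ** alpha) * s ** (alpha - 1.0)
            * x ** (1.0 - 1.0 / alpha) * math.exp(1.0 / x))

def quantile(q, x, alpha, lo=1e-12, hi=1e12):
    # geometric bisection for the smallest y with G_logistic(y | x) >= q, cf. (6)
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # y spans many orders of magnitude
        if G_logistic(mid, x, alpha) >= q:
            hi = mid
        else:
            lo = mid
    return hi

y_strong = quantile(0.5, 5.0, 0.2)  # strong dependence: median near x = 5
y_weak = quantile(0.5, 5.0, 0.9)    # near independence: median near -1/log 0.5
```

For small α the conditional median hugs the conditioning value x, while for α near one it reverts toward the unconditional Fréchet median, which is the behavior visible in the left panels of Figure 3.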
Moreover, one can notice that for strongly dependent extremes the regression lines are almost linear, while for weakly dependent extremes they are curved for small x and linear for large x. Indeed, for the logistic model with α < 1 the regression lines become approximately linear as x increases; see Appendix D.

Figure 3: Posterior mean regression lines L_q for q = { . , . , . } and x ∈ (0,
20] (left) and conditional quantile curves {y_{q|x} : q ∈ (0, 1)} along with credible bands, for x = { , , } (right), for Hüsler–Reiss, logistic, and Coles–Tawn bivariate extreme value models (top to bottom).

The fits deteriorate for q close to 0 and for q close to 1. The former drawback can be caused by less precise estimates of the angular density close to the simplex boundaries, and the latter is probably due to the large bias that estimators of angular densities often have when applied to U-shaped angular densities. Nevertheless, since we work in the extreme value setup, non-extremal quantiles are also useful when considering extremal events.

Figure 4: Posterior mean regression lines L_q for q = {0.1, 0.45, 0.55, 0.9} and x ∈ (0,
20] estimated from 500 Monte Carlo simulations (gray lines), for sample size k = 500, and true conditional quantiles (black line) for Hüsler–Reiss, logistic, and Coles–Tawn bivariate extreme value models (top to bottom).

3.2. Monte Carlo simulations

To conduct a simulation study we generate 500 Monte Carlo samples of sizes n = 5000 and n = 10000, resulting in k = 250 and k = 500 exceedances, for the three scenarios described in Section 3.1. We use the MCMC algorithm as described in Section 3.1 with the same prior on the v_α. The performance of our methods will be visualized via a comparison of posterior mean estimates of the regression lines with the true regression lines L_q for a few fixed q ∈ (0, 1). We focus on x ∈ (0,
20], as the bivariate extreme value distribution concentrates most of its mass (at least 90%) in the set (0, 20] × (0, 20]. Results for k = 500 are shown in Figure 4; a similar chart for k = 250 is available from the supplementary material (Figure E.9). Figure 4 outlines that the model fits the data from Scenario 1 reasonably well and, for weakly dependent extremes (Scenario 2) and asymmetrically dependent extremes (Scenario 3), it provides relatively precise estimates for middle values of q, but visible bias, as expected, for q close to 0 and 1. Comparing different sample sizes (i.e. comparing Figure 4 with Figure E.9 in the supplementary material), we can observe that increasing the sample size reduces the variation of the estimates for all q ∈ {0.1, 0.45, 0.55, 0.9} and for all scenarios. Where the model performs relatively well, the estimates concentrate more around the true quantity, indicating improvement in the quality of fit; where the fitted values differ substantially from the truth, raising the sample size leads to larger bias.
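The paper does not spell out its simulation mechanism; one generic way to draw from the bivariate logistic model of Scenario 2 (a sketch under our own design choices, not the authors' code) is to sample X from its unit Fréchet margin by inversion and then Y from the conditional distribution of Example 2.2, again by inversion.

```python
import math, random

def G_cond(y, x, a):
    # conditional cdf of Y | X = x for the bivariate logistic model (Example 2.2)
    s = x ** (-1.0 / a) + y ** (-1.0 / a)
    return (math.exp(-s ** a) * s ** (a - 1.0)
            * x ** (1.0 - 1.0 / a) * math.exp(1.0 / x))

def sample_logistic(n, a, rng):
    out = []
    for _ in range(n):
        x = -1.0 / math.log(rng.random())   # unit Frechet margin by inversion
        u, lo, hi = rng.random(), 1e-12, 1e12
        for _ in range(80):                 # invert G_cond by geometric bisection
            mid = math.sqrt(lo * hi)
            if G_cond(mid, x, a) >= u:
                hi = mid
            else:
                lo = mid
        out.append((x, hi))
    return out

rng = random.Random(42)
sample = sample_logistic(5000, 0.5, rng)
# empirical joint cdf at (1, 1) versus the logistic G(1, 1) = exp(-2^a)
emp = sum(1 for x, y in sample if x <= 1.0 and y <= 1.0) / len(sample)
```

The empirical joint distribution function at a fixed point can then be compared with the closed-form G(x, y) as a correctness check on the generator.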
4. Application to stock markets
We now apply the proposed method to two of the world's biggest stock markets—the NASDAQ and the NYSE (New York Stock Exchange). According to the Statistics Portal of the World Federation of Exchanges (https://statistics.world-exchanges.org), the total equity market capitalizations of the NASDAQ and NYSE are respectively 16 and 22 trillion US$, as of October 2020, thus illustrating well the scale of these players in the worldwide stock-exchange industry. The data were gathered from Yahoo Finance (finance.yahoo.com), and consist of daily closing prices of the NASDAQ and NYSE composite indices over the period from February 5, 1971 to November 23, 2020.

A key goal of the analysis will be to learn about the degree of association of extreme losses in both of these markets through the lenses of our model, and thus we focus on modeling negative log returns, which can be regarded as a proxy for losses, and which consist of first differences of prices on a log scale; the resulting data for NASDAQ and NYSE are denoted below as {(X_i, Y_i)}_{i=1}^n. The sample period under analysis is sufficiently broad to cover a variety of major downturns and selloffs including, for example, those related to the 2007–2010 subprime mortgage crisis, the ongoing China–US trade war, and the 2020 COVID-19 pandemic. We take weekly maxima of negative log returns adjusted for heteroskedasticity by fitting a GARCH(1,
1) filter with normally distributed errors to both series and convert them to unit Fréchet margins via the transformation (X̂_i, Ŷ_i) = (−1/log{F̂_X(X_i)}, −1/log{F̂_Y(Y_i)}), where F̂_X and F̂_Y respectively denote the empirical distribution functions of the negative log returns for NASDAQ (X) and NYSE (Y). The raw data and resulting preprocessed data are depicted in Figure 5. As can be seen from the latter figure, the composite indices exhibit similar dynamics, reacting alike to different economic shocks (9/11 attacks, 2001; 2008 financial crisis; China–US trade war started in 2018). The shape of the scatterplot of the negative log returns brought to unit Fréchet margins in log-log scale above the boundary threshold evidences the extremal dependence between log returns, as they are quite concentrated around the identity line.

Figure 5: Left: NASDAQ (red) and NYSE (blue) composite indices. Middle: Scatterplot of negative log returns of NASDAQ and NYSE composite indices converted to unit Fréchet margins; the solid line corresponds to the boundary threshold in the log-log scale, with both axes being logarithmic. Right: Angular density estimate with 95% credible band along with a rug of pseudo-angles.
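The transformation to unit Fréchet margins described above is a rank transform based on the empirical distribution function; a minimal sketch (Python; the helper name and the toy data are ours) is:

```python
import math

def to_unit_frechet(xs):
    # Map data to approximate unit Frechet margins via the empirical cdf:
    # x_i -> -1 / log(F_hat(x_i)), with F_hat(x_i) = rank_i / (n + 1) to keep
    # the argument of the logarithm strictly inside (0, 1).
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return [-1.0 / math.log(r / (n + 1)) for r in ranks]

losses = [0.3, -1.2, 2.5, 0.7, 1.9]  # toy negative log returns, for illustration
frechet = to_unit_frechet(losses)
```

The transform is strictly increasing in the ranks, so the ordering of the losses is preserved while the margins become approximately unit Fréchet.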
We now apply our model to learn about how the extreme losses on the two exchanges relate. To employ our model we start by fitting the angular density via the Bernstein polynomial-based approach from Section 2, using the pseudo-angles obtained by thresholding the pseudo-radius at its 95% quantile. Some comments on prior specification and posterior inference are in order. For our calculations we used a single-component adaptive MCMC method with a wide Dirichlet prior defined on a generalized logit transformation of the weights. We ran an MCMC chain of length 10000 with a burn-in period of 2000, and set the number of basis functions equal to the number of exceedances.

Figure 6: Left: Posterior mean regression manifold L for NYSE given NASDAQ. Middle: Posterior mean regression lines $L_q$ for three quantile levels $q$ for NYSE given NASDAQ, along with 95% credible bands, plotted against filtered negative log returns on the indices converted to unit Fréchet margins. Right: Posterior mean conditional quantile curves $\{y_{q|x} : q \in (0,1)\}$ of filtered negative log returns on NYSE for three values of $x$, along with 95% credible bands, corresponding to filtered negative log returns on NASDAQ converted to unit Fréchet margins.

The obtained fit for the angular density is reported in Figure 5 (right). As illustrated by this plot, most of the observed pseudo-angles lie close to the middle of the interval (0, 1), and the estimate resembles a bell-shaped density, which suggests there is dependence between extremal losses on the NASDAQ and NYSE composite indices. One can also observe that the estimated density peaks at a point lying in the first half of the interval, providing an indication of asymmetry in the dependence. Next we learn about the regression manifold. Figure 6 presents the resulting estimates of the regression manifold along with cross-sections in $q$ and $x$ for negative log returns on the NASDAQ and NYSE composite indices. The shape of the regression manifold suggests there is an intermediate dependence between extremal losses on the specified indices, and the regression lines in the middle panel differ substantially from those corresponding to independence, tending to be closer to the identity line. Moreover, the cross-sections for different values of $x$ reveal considerable variation in the quantiles of $y$, supporting the conclusion that the negative log returns are dependent.

Table 1: Marginal effects on the 10%, 50% and 90% quantiles of filtered log-returns on NYSE, $Y$, evaluated at three values of filtered log-returns on NASDAQ, $x$, in unit Fréchet margins; 95% credible intervals in brackets.
quantiles of Y
10%   0.165 (0.··, ·.··)   0.041 (·.··, ·.··)   0.0·· (·.··, ·.··)
50%   1.133 (1.··, ·.··)   0.581 (·.··, ·.··)   0.··  (·.··, ·.··)
90%   3.677 (3.··, ·.··)   2.387 (·.··, ·.··)   2.··  (·.··, ·.··)

Having assessed the degree of association between extreme values, we now quantify the effect of extremal losses on the NASDAQ index on extremal changes of log-returns on the NYSE index by calculating marginal effects. The marginal effects on the 10%, 50% and 90% quantiles of filtered log-returns on NYSE, calculated at three values of filtered log-returns on NASDAQ (both brought to unit Fréchet margins), are presented in Table 1. As can be seen from the table, the effect of NASDAQ on NYSE decreases as losses on NASDAQ increase, with a more substantial fall in the lower quantiles of losses on NYSE. Moreover, for fixed losses on NASDAQ, the effect on NYSE losses becomes larger for higher quantiles.
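For reference, the Bernstein-polynomial angular densities underpinning the fit above can be sketched as follows. This is a minimal illustration with hypothetical weights (not the ones inferred in the paper): a mixture of Beta(k, m - k + 1) densities whose weights sum to one and which, by symmetry, satisfies the moment constraint $\int_0^1 w\, h(w)\, \mathrm{d}w = 1/2$ required of an angular density:

```python
import numpy as np
from scipy.stats import beta

def bernstein_density(w, v):
    """Angular density h(w) = sum_k v_k * Beta(w; k, m - k + 1), 0 < w < 1."""
    m = len(v)
    basis = np.stack([beta.pdf(w, k, m - k + 1) for k in range(1, m + 1)])
    return v @ basis

m = 10
# symmetric (hypothetical) weights: they sum to 1, and symmetry gives the
# moment constraint sum_k v_k * k/(m+1) = 1/2, since the k-th basis
# density Beta(k, m-k+1) has mean k/(m+1)
v = np.minimum(np.arange(1, m + 1), np.arange(m, 0, -1)).astype(float)
v /= v.sum()

grid = np.linspace(1e-6, 1 - 1e-6, 20001)
h = bernstein_density(grid, v)

def trapz(y, x):
    """Simple trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

total = trapz(h, grid)        # ~1: h integrates to one
mean = trapz(grid * h, grid)  # ~1/2: valid angular density
print(round(total, 3), round(mean, 3))
```

Triangular weights give a bell-shaped density of the kind estimated in Figure 5 (right); in the paper these weights carry a Dirichlet-type prior and are learned by MCMC.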
5. Closing remarks
We propose a regression-type model for the setup where both the response and the covariate are block maxima. The modeling starting point is the result that the limiting behavior of the vector of properly standardized componentwise maxima is given by a multivariate extreme value distribution. Conceptually, the model is then constructed in a similar fashion as in quantile regression, that is, by assessing how the conditional quantile of the response reacts to changes in the covariate, while taking into account the latter asymptotic result. An important target in the proposed framework is the regression manifold, which consists of a family of regression lines obeying the proviso of multivariate extreme value theory. A Bernstein polynomial prior on the space of angular densities is used to learn about the model from data, with numerical studies showcasing its flexibility. One could wonder why not resort to statistical models for nonstationary extremes (e.g., Coles, 2001, Section 6) as an alternative to the methods proposed herein, as these can be used for assessing the effect of covariates on an extreme-valued response by indexing the parameters of the GEV distribution with a covariate. Yet, since the latter models are built from the univariate theory of extremes, they are not tailored for conditioning on another variable being extreme, as they fail to take on board information from the dependence structure between the extremes. Other related approaches include extremal quantile regression methods (Chernozhukov, 2005), which, similarly to the statistical models for nonstationary extremes, have not been designed for conditioning on another variable being extreme, as they do not take into account the dependence structure between the extremes. We close the manuscript with comments on future research.
For regressions with many predictors, it is likely that most covariates will have little effect on the response, and thus one could wonder how to devise a version of the proposed method that shrinks towards zero the effect of such irrelevant covariates; the development of a Lasso (Tibshirani, 1996) version of the proposed model would thus seem natural for such a situation, and is left as an open problem for future research. Another natural avenue for future research would be to devise regression-type methods for exceedances on exceedances by resorting to the so-called multivariate generalized Pareto distribution (Kiriliouk et al., 2019), rather than the multivariate extreme value distribution as herein. Finally, the development of a version of the model that could take into account asymptotic independence by resorting to the hidden angular measure (Ramos and Ledford, 2009), rather than the standard angular measure as herein, would seem natural as well.
Acknowledgement
We thank Johan Segers (Université catholique de Louvain) for insightful discussions.

M. de Carvalho acknowledges support from the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) through the projects PTDC/MAT-STA/28649/2017 and UID/MAT/00006/2020.

G. dos Reis acknowledges support from the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) through the project UIDB/00297/2020 (Centro de Matemática e Aplicações CMA/FCT/UNL).

A. Kumukova was supported by The Maxwell Institute Graduate School in Analysis and its Applications, a Centre for Doctoral Training funded by the UK Engineering and Physical Sciences Research Council (grant EP/L016508/01), the Scottish Funding Council, Heriot-Watt University and the University of Edinburgh.

Appendix A. Conditional bivariate extreme value distribution
Here we derive the expression for the conditional bivariate extreme value distribution function in (8). Recall that, according to Sklar's theorem (Nelsen, 2007, Theorem 2.3.3), a joint bivariate distribution function $G : \mathbb{R}^2 \to [0,1]$ with margins $G_X : \mathbb{R} \to [0,1]$ and $G_Y : \mathbb{R} \to [0,1]$ can be uniquely represented through a copula function $C$ as, for $(x, y) \in \mathbb{R}^2$,
$$G(x, y) = C(G_X(x), G_Y(y)),$$
or, equivalently, $C(u_1, u_2) = G(G_X^{-1}(u_1), G_Y^{-1}(u_2))$ for $(u_1, u_2) \in [0,1]^2$, where $G_X^{-1}(q) = \inf\{x : G_X(x) \ge q\}$. Using the following well-known property of copulas,
$$C_{U_2|U_1}(u_2 \mid u_1) := \mathbb{P}(U_2 \le u_2 \mid U_1 = u_1) = \frac{\partial C(u_1, u_2)}{\partial u_1}, \qquad (u_1, u_2) \in [0,1]^2,$$
we calculate the conditional distribution of $(Y \mid X)$ as
$$G_{Y|X}(y \mid x) = C_{U_2|U_1}\left(e^{-y^{-1}} \mid e^{-x^{-1}}\right).$$
In our setting,
$$C(u_1, u_2) = \exp\left\{-2\int_0^1 \max\left(-w \log u_1, \, -(1-w)\log u_2\right) \mathrm{d}H(w)\right\}.$$
Assuming $H$ is absolutely continuous with density $h$, we have
$$\frac{\partial}{\partial x}\, 2\int_0^1 \max\left(\frac{w}{x}, \frac{1-w}{y}\right) \mathrm{d}H(w) = 2\frac{\partial}{\partial x}\left(x^{-1}\int_{\omega(x,y)}^1 w\, \mathrm{d}H(w) + y^{-1}\int_0^{\omega(x,y)} (1-w)\, \mathrm{d}H(w)\right)$$
$$= 2\left[-x^{-2}\int_{\omega(x,y)}^1 w\, \mathrm{d}H(w) + \left\{y^{-1}(1-\omega(x,y)) - x^{-1}\omega(x,y)\right\}\frac{y}{(x+y)^2}\, h\{\omega(x,y)\}\right] = -2x^{-2}\int_{\omega(x,y)}^1 w\, h(w)\, \mathrm{d}w,$$
where $\omega(x,y) = x/(x+y)$, and where the boundary term vanishes since $y^{-1}(1-\omega(x,y)) = x^{-1}\omega(x,y) = 1/(x+y)$. Then the conditional copula has the form
$$C_{U_2|U_1}(u_2 \mid u_1) = 2\, u_1^{-1}\, C(u_1, u_2)\int_{\log u_2/\log(u_1 u_2)}^1 w\, h(w)\, \mathrm{d}w,$$
which in turn yields
$$G_{Y|X}(y \mid x) = 2\exp\left\{-2\int_0^1 \max\left(\frac{w}{x}, \frac{1-w}{y}\right) h(w)\, \mathrm{d}w + x^{-1}\right\}\int_{\omega(x,y)}^1 w\, h(w)\, \mathrm{d}w, \qquad x, y > 0.$$

Appendix B. Soft-maximum approximation for regression manifold of perfectly dependent extremes
Here we give details on the soft-maximum approximation to the regression lines for perfectly dependent extremes claimed in (10). We use a smooth approximation of the maximum function, called the soft-maximum,
$$f(z_1, \ldots, z_d; N) = \frac{1}{N}\log\left(e^{N z_1} + \cdots + e^{N z_d}\right),$$
which is infinitely differentiable everywhere and converges to the maximum function as $N \to \infty$ (Cook, 2011). Then the approximation of the multivariate GEV distribution function for the case of perfectly dependent extremes, $G(y, \mathbf{x}) = \exp[-\max\{y^{-1}, x_1^{-1}, \ldots, x_p^{-1}\}]$, is
$$\tilde{G}(y, \mathbf{x}; N) = \exp\left\{-\frac{1}{N}\log\left(e^{N y^{-1}} + e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)\right\} = \left(e^{N y^{-1}} + e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{-1/N},$$
and its partial derivative of order $d = p + 1$ is
$$\tilde{g}(y, \mathbf{x}; N) = \frac{\partial^d}{\partial y\, \partial x_1 \cdots \partial x_p}\tilde{G}(y, \mathbf{x}; N) = y^{-2}\prod_{i=1}^p (1 + iN)\, x_i^{-2}\, \exp\left(N y^{-1} + N\sum_{i=1}^p x_i^{-1}\right)\left(e^{N y^{-1}} + e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{-1/N - d}.$$
This yields the following approximation of the conditional multivariate GEV density,
$$\tilde{g}_{Y|X}(y \mid \mathbf{x}; N) = \frac{y^{-2}\prod_{i=1}^p (1 + iN)\, x_i^{-2}\exp\left(N y^{-1} + N\sum_{i=1}^p x_i^{-1}\right)\left(e^{N y^{-1}} + e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{-1/N - d}}{\prod_{i=1}^{p-1}(1 + iN)\prod_{i=1}^p x_i^{-2}\exp\left(N\sum_{i=1}^p x_i^{-1}\right)\left(e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{-1/N - p}}$$
$$= (1 + pN)\, y^{-2} e^{N y^{-1}}\left(e^{N y^{-1}} + e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{-1/N - d}\left(e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{1/N + p},$$
and the conditional cumulative distribution function
$$\tilde{G}_{Y|X}(y \mid \mathbf{x}; N) = \int_0^y \tilde{g}_{Y|X}(z \mid \mathbf{x})\, \mathrm{d}z = (1 + pN)\left(e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{1/N + p}\int_0^y z^{-2} e^{N z^{-1}}\left(e^{N z^{-1}} + e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{-1/N - d}\mathrm{d}z$$
$$= \frac{1 + pN}{N(1/N + d - 1)}\left(e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{1/N + p}\left(e^{N y^{-1}} + e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}\right)^{-1/N - d + 1}$$
$$= \left(\frac{e^{N y^{-1}} + e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}}{e^{N x_1^{-1}} + \cdots + e^{N x_p^{-1}}}\right)^{-1/N - p},$$
since $N(1/N + d - 1) = 1 + pN$. Passing the last expression to the limit as $N \to \infty$ provides an ansatz for the true conditional distribution function,
$$\tilde{G}_{Y|X}(y \mid \mathbf{x}) = \begin{cases} 1, & y \ge \min(x_1, \ldots, x_p), \\ 0, & y < \min(x_1, \ldots, x_p), \end{cases}$$
with $y, x_1, \ldots, x_p > 0$, from where (10) follows.
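A quick numerical check of this approximation, computed in log-space to avoid overflowing $e^{N/x}$ (a sketch; the values of $N$, $y$ and $\mathbf{x}$ are arbitrary):

```python
import numpy as np
from scipy.special import logsumexp

def soft_max(z, N):
    """Soft-maximum f(z_1,...,z_d; N) = (1/N) log(sum_i exp(N z_i))."""
    return logsumexp(N * np.asarray(z)) / N

def G_cond(y, x, N):
    """Soft-maximum approximation of the conditional GEV cdf,
    ((e^{N/y} + sum_i e^{N/x_i}) / sum_i e^{N/x_i})^{-1/N - p},
    evaluated in log-space for numerical stability."""
    x = np.asarray(x, dtype=float)
    p = len(x)
    num = logsumexp(np.concatenate(([N / y], N / x)))
    den = logsumexp(N / x)
    return np.exp(-(1.0 / N + p) * (num - den))

z = np.array([0.3, 1.2, 0.7])
print(soft_max(z, 200))        # close to max(z) = 1.2
x = [2.0, 3.0]                 # min(x_1, x_2) = 2
print(G_cond(2.5, x, 200))     # y > min(x): close to 1
print(G_cond(1.5, x, 200))     # y < min(x): close to 0
```

As the output illustrates, for large $N$ the approximate conditional cdf is already close to the degenerate limit that jumps at $\min(x_1, \ldots, x_p)$.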
Appendix C. Proof of monotonicity of regression manifold
Proof of Proposition 2.1.
Since $y \mapsto G_{Y|X}(y \mid x)$ is continuous and strictly increasing for all $x \in (0, \infty)$, $y_{q|x}$ given by (6) is the solution to $G_{Y|X}(y \mid x) = q$ for fixed $q \in (0,1)$. Hence $y$ satisfying $G_{Y|X}(y \mid x) = q$ is an implicit function of $x$ parametrized by $q$. Under our assumptions we apply the implicit function theorem and calculate the derivative of $y_{q|x}$ with respect to $x$ via
$$\frac{\partial}{\partial x} y_{q|x} = -\frac{\partial G_{Y|X}(y \mid x)/\partial x}{\partial G_{Y|X}(y \mid x)/\partial y}. \qquad \text{(C.1)}$$
Equation (C.1), combined with the monotone regression dependence property (i.e., $x \mapsto G_{Y|X}(y \mid x)$ is non-increasing for all $y \in (0, \infty)$) and the strict monotonicity of $y \mapsto G_{Y|X}(y \mid x)$ (increasing) for all $x \in (0, \infty)$, gives $\partial y_{q|x}/\partial x \ge 0$. This completes the proof.
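The proposition can also be checked numerically on a concrete example. The sketch below uses the bivariate logistic model of Appendix D: it inverts $y \mapsto G_{Y|X}(y \mid x)$ by root-finding to obtain $y_{q|x}$, verifies that $x \mapsto y_{q|x}$ is nondecreasing, and cross-checks the result against the Lambert-$W$ closed form (D.2):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import lambertw

def G_cond(y, x, a):
    """Conditional cdf G_{Y|X}(y|x) for the logistic model, alpha = a."""
    s = x ** (-1 / a) + y ** (-1 / a)
    return np.exp(-s ** a + 1 / x) * s ** (a - 1) * x ** (1 - 1 / a)

def quantile_num(q, x, a):
    """Invert y -> G_{Y|X}(y|x) numerically (Brent's method)."""
    return brentq(lambda y: G_cond(y, x, a) - q, 1e-8, 1e8)

def quantile_w(q, x, a):
    """Closed form (D.2) via the principal branch of Lambert W."""
    c = a / (1 - a)
    arg = c / x * np.exp(c / x) * q ** (a / (a - 1))
    W = lambertw(arg).real
    return ((x * W / c) ** (1 / a) - 1) ** (-a) * x

a, q = 0.5, 0.5
ys = [quantile_num(q, x, a) for x in (0.5, 1.0, 2.0, 5.0, 20.0)]
print(np.all(np.diff(ys) >= 0))                               # monotone in x
print(abs(quantile_num(q, 3.0, a) - quantile_w(q, 3.0, a)))   # agreement, ~0
```

At $\alpha = 1$ the same `G_cond` reduces to $e^{-1/y}$, recovering independence, while for $\alpha \in (0,1)$ the median curve climbs with $x$, as the proposition guarantees.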
Appendix D. Exact and limiting regression manifolds for logistic model
Here we give details on how the exact (12) and approximated (11) regression manifolds for the logistic model can be derived.
D.1. Exact regression manifold. Here we compute the conditional quantiles for the bivariate extreme value distribution and their linear approximation for large $x$. Using (8), we calculate the conditional distribution function for the logistic model: for $(x, y) \in (0, \infty)^2$,
$$G_{Y|X}(y \mid x) = \frac{G(x, y)\,(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha - 1}\, x^{-1/\alpha - 1}}{x^{-2}\exp(-x^{-1})} = \exp\{-(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha} + x^{-1}\}\,(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha - 1}\, x^{1 - 1/\alpha}.$$
The conditional quantiles behave differently depending on the strength of dependence between extremes. A special case of the logistic model is $\alpha = 1$, corresponding to independence between extremes, for which the family of regression lines is known to be given by (9). We now derive conditional quantiles for bivariate dependent extremes, i.e., $\alpha \in (0, 1)$. Since $y \mapsto G_{Y|X}(y \mid x)$ is continuous, a conditional quantile is a solution to
$$\exp\{-(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha} + x^{-1}\}\,(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha - 1}\, x^{1 - 1/\alpha} = q, \qquad \text{(D.1)}$$
which can be written in terms of the Lambert $W$ function. Raising (D.1) to the power $\alpha/(\alpha - 1)$ and rearranging,
$$\exp\left\{\frac{\alpha}{1 - \alpha}(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha}\right\}(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha}\, x = q^{\alpha/(\alpha - 1)}\exp\left\{\frac{\alpha}{1 - \alpha}\, x^{-1}\right\}$$
$$\Longleftrightarrow \frac{\alpha}{1 - \alpha}(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha}\exp\left\{\frac{\alpha}{1 - \alpha}(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha}\right\} = \frac{\alpha}{1 - \alpha}\, x^{-1}\exp\left\{\frac{\alpha}{1 - \alpha}\, x^{-1}\right\} q^{\alpha/(\alpha - 1)}$$
$$\Longleftrightarrow \frac{\alpha}{1 - \alpha}(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha} = W\left(\frac{\alpha}{1 - \alpha}\, x^{-1} e^{\alpha/(1 - \alpha)\, x^{-1}}\, q^{\alpha/(\alpha - 1)}\right),$$
which gives
$$y_{q|x} = \left[\left\{\frac{1 - \alpha}{\alpha}\, x\, W\left(\frac{\alpha}{1 - \alpha}\, x^{-1} e^{\alpha/(1 - \alpha)\, x^{-1}}\, q^{\alpha/(\alpha - 1)}\right)\right\}^{1/\alpha} - 1\right]^{-\alpha} x, \qquad x > 0. \qquad \text{(D.2)}$$
From properties of the Lambert $W$ function (see supplementary material for details, Appendix F) it follows that
$$\lim_{x \to \infty} x\, W\left(\frac{\alpha}{1 - \alpha}\, x^{-1} e^{\alpha/(1 - \alpha)\, x^{-1}}\, q^{\alpha/(\alpha - 1)}\right) = \frac{\alpha}{1 - \alpha}\, q^{\alpha/(\alpha - 1)},$$
and that the conditional quantiles tend to infinity as $x \to \infty$.

D.2. Limiting regression manifold. Below we show that (11) holds. To find $\gamma_q$ and $\beta_q$ we use the expansion of the principal branch of the Lambert $W$ function, the solution to $z = w e^w$ for $z > 0$, around $0$, that is,
$$W(z) = \sum_{n=1}^{\infty} \frac{(-n)^{n-1}}{n!}\, z^n = z + O(z^2),$$
where $O(z^2)$ is big-$O$ of $z^2$. Substituting the expansion of $W$ in (D.2) leads to
$$y_{q|x} = \left[\left\{e^{\alpha/(1 - \alpha)\, x^{-1}}\, q^{\alpha/(\alpha - 1)} + O\left(x^{-1} e^{2\alpha/(1 - \alpha)\, x^{-1}}\right)\right\}^{1/\alpha} - 1\right]^{-\alpha} x,$$
and taking $x \to \infty$ we find the linear asymptote of $x \mapsto y_{q|x}$,
$$\beta_q = \lim_{x \to \infty} \frac{y_{q|x}}{x} = \left\{q^{-1/(1 - \alpha)} - 1\right\}^{-\alpha}. \qquad \text{(D.3)}$$
The calculation of $\gamma_q$ is more involved: the derivative of the asymptotic expansion of a function does not necessarily correspond to the asymptotic expansion of the derivative of the function; hence we use L'Hôpital's rule with (D.2) and only then inject the asymptotic expansions. After some tedious derivations we have that
$$\gamma_q = \lim_{x \to \infty}\left(y_{q|x} - \beta_q x\right) = \frac{\alpha}{1 - \alpha}\left\{q^{1/(\alpha - 1)} - 1\right\}^{-\alpha - 1}\left\{q^{\alpha/(\alpha - 1)} - 1\right\} q^{1/(\alpha - 1)}. \qquad \text{(D.4)}$$
The resulting linear approximation is
$$\tilde{y}_{q|x} = \frac{\alpha}{1 - \alpha}\left\{q^{1/(\alpha - 1)} - 1\right\}^{-\alpha - 1}\left\{q^{\alpha/(\alpha - 1)} - 1\right\} q^{1/(\alpha - 1)} + \left\{q^{-1/(1 - \alpha)} - 1\right\}^{-\alpha} x, \qquad q \in (0, 1), \quad x \gg 0.$$

Supplementary materials

Appendix E. Additional numerical evidence

Appendix E.1. Induced prior for p-covariate setting

We report on two one-shot numerical experiments aimed at illustrating the approach in Section 2.3 of the paper, which induces a prior on the space of all regression manifolds by resorting to Bernstein polynomials and an approximation of a multivariate GEV density due to Cooley et al. (2012). For the numerical experiments in this supplementary material, we test our model by taking a trivariate logistic extreme value distribution with dependence parameter $\alpha = 0.$ and $p = 2$, i.e., with the trivariate GEV distribution
$$G(y, x_1, x_2) = \exp\{-(y^{-1/\alpha} + x_1^{-1/\alpha} + x_2^{-1/\alpha})^{\alpha}\}, \qquad y, x_1, x_2 > 0.$$
We generate two samples of sizes $n = 10000$ and $n = 20000$ which, after thresholding at the 95% empirical quantiles, yield $k = 500$ and $k = 1000$ data points to fit the model.
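Samples from this multivariate logistic distribution can be generated with a standard positive-stable mixture construction (a sketch; the sample size and dimension below are arbitrary, and this is not necessarily the sampler used in the paper): if $S$ is positive $\alpha$-stable with Laplace transform $\exp(-t^\alpha)$, drawn via the Chambers–Mallows–Stuck representation, and $E_1, \ldots, E_d$ are i.i.d. unit exponentials, then $X_i = (S/E_i)^\alpha$ has joint distribution $\exp\{-(\sum_i x_i^{-1/\alpha})^\alpha\}$ with unit Fréchet margins:

```python
import numpy as np

def rpos_stable(alpha, n, rng):
    """Positive alpha-stable S with E[exp(-t S)] = exp(-t^alpha),
    via the Chambers-Mallows-Stuck representation (0 < alpha < 1)."""
    U = rng.uniform(0.0, np.pi, n)
    W = rng.exponential(1.0, n)
    return (np.sin(alpha * U) / np.sin(U) ** (1 / alpha)
            * (np.sin((1 - alpha) * U) / W) ** ((1 - alpha) / alpha))

def rlogistic_mev(alpha, n, d, rng):
    """n draws from the d-variate logistic MEV distribution
    exp{-(sum_i x_i^(-1/alpha))^alpha}, with unit Frechet margins."""
    S = rpos_stable(alpha, n, rng)[:, None]
    E = rng.exponential(1.0, (n, d))
    return (S / E) ** alpha

rng = np.random.default_rng(1)
X = rlogistic_mev(0.3, 20000, 3, rng)   # arbitrary alpha = 0.3

U = np.exp(-1.0 / X)                # probability integral transform
print(U.mean(axis=0))               # each close to 1/2: unit Frechet margins
print((U[:, 0] * U[:, 1]).mean())   # > 1/4: positive extremal dependence
```

Thresholding the pseudo-radii of such samples at their 95% empirical quantile then yields the $k$ pseudo-angles used to fit the model, as described above.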
Here, we use a similar prior specification and MCMC setup as in Section 3.1 of the paper. Figure E.7 indicates that the proposed estimator of the angular density captures reasonably well the dependence between extremes by concentrating around the barycenter of the simplex, though in a less pronounced form than the true density. As can be seen from Figure E.8, the resulting regression line estimates resemble the true regression lines $L_q$, and increasing the sample size improves the fit, as the lateral surfaces of the estimates become more slanting, for the three quantile levels $q$ considered.
Figure E.7: Level plots of the true angular density (left) along with the posterior mean estimate resulting from the methods from Section 2.2 (right) on $n = 10000$ observations for the trivariate logistic extreme value distribution, on a single-run experiment, with dependence parameter $\alpha = 0.$

Figure E.8: The true $L_q$ (left) along with the posterior mean estimates resulting from the methods from Section 2.3 on $n = 10000$ (middle) and $n = 20000$ (right) observations, for three quantile levels $q$ (top to bottom), for the trivariate logistic extreme value distribution, on a single-run experiment, with dependence parameter $\alpha = 0.$ and $\mathbf{x} = (x_1, x_2)$.

Figure E.9: Regression lines $L_q$ for $q = \{0.1, 0.45, 0.55, 0.9\}$ and $x \in (0, 20]$ estimated from 500 Monte Carlo simulations (gray lines) for sample size $k = 250$, and true conditional quantiles (black lines), for the Hüsler–Reiss (Scenario 1, strongly dependent extremes), logistic (Scenario 2, weakly dependent extremes) and Coles–Tawn (Scenario 3, asymmetric intermediate dependence) bivariate extreme value models (top to bottom).

Appendix E.2. Supporting Monte Carlo experiment

Figure E.9 complements Figure 4; the number of exceedances of the figure reported here is $k = 250$, while that in Figure 4 is $k = 500$.

Appendix F. Details on the Lambert W function

The Lambert $W$ function is used in the paper for deriving a linear approximation to the logistic regression manifold (cf. Appendix D), and thus we offer here some details on it. Formally, the Lambert $W$ function is a set of functions representing the inverse relation of the function $f(z) = z e^z$ for complex $z$. Since we deal only with positive real-valued $z$, the equation $f(z) = z e^z$ has only one solution $w = W(z)$, with $W$ being the principal branch of the Lambert $W$ function. A useful property of this function is that for any constant $a \in \mathbb{R}$ one has
$$\lim_{z \to \infty} z\, W(a/z) = \lim_{z \to \infty} a\, e^{-W(a/z)} = a,$$
which is derived from
$$\lim_{z \to \infty} \frac{a}{z} = \lim_{z \to \infty} e^{W(a/z)}\, W(a/z) \;\Longrightarrow\; \lim_{z \to \infty} W(a/z) = 0.$$
See Borwein and Lindstrom (2016) for further details.

References
Beirlant, J., Goegebeur, Y., Segers, J., and Teugels, J. (2004), Statistics of Extremes: Theory and Applications, Hoboken, NJ: Wiley.
Borwein, J. M. and Lindstrom, S. B. (2016), "Meetings with Lambert W and other special functions in optimization and analysis," Pure Appl. Funct. Anal., 1, 361–396.
Chernozhukov, V. (2005), "Extremal quantile regression," Annals of Statistics, 33, 806–839.
Coles, S. (2001), An Introduction to Statistical Modeling of Extreme Values, Springer Series in Statistics, London: Springer.
Cook, J. D. (2011), "Basic properties of the soft maximum," bepress, Working Paper Series 70.
Cooley, D., Davis, R. A., and Naveau, P. (2012), "Approximating the conditional density given large observed values via a multivariate extremes framework, with application to environmental data," Ann. Appl. Stat., 6, 1406–1429.
de Haan, L. and Ferreira, A. (2006), Extreme Value Theory: An Introduction, New York: Springer.
de Haan, L. and Resnick, S. I. (1977), "Limit theory for multivariate sample extremes," Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 40, 317–337.
Embrechts, P., Klüppelberg, C., and Mikosch, T. (1997), Modelling Extremal Events for Insurance and Finance, New York: Springer.
Gudendorf, G. and Segers, J. (2010), "Extreme-value copulas," in Copula Theory and Its Applications, Berlin, Heidelberg: Springer, pp. 127–145.
Guillem, A. I. G. (2000), "Structure de dépendance des lois de valeurs extrêmes bivariées," Comptes Rendus de l'Académie des Sciences, Series I, Mathematics, 330, 593–596.
Hanson, T. E., de Carvalho, M., and Chen, Y. (2017), "Bernstein polynomial angular densities of multivariate extreme value distributions," Statistics and Probability Letters, 128, 60–66.
Hüsler, J. and Li, D. (2009), "Testing asymptotic independence in bivariate extremes," Journal of Statistical Planning and Inference, 139, 990–998.
Kiriliouk, A., Rootzén, H., Segers, J., and Wadsworth, J. L. (2019), "Peaks over thresholds modeling with multivariate generalized Pareto distributions," Technometrics, 61, 123–135.
Koenker, R. (2005), Quantile Regression, Cambridge: Cambridge University Press.
Longin, F. (ed.) (2017), Extreme Events in Finance: A Handbook of Extreme Value Theory and Its Applications, New York: Wiley.
McNeil, A. J., Frey, R., and Embrechts, P. (2015), Quantitative Risk Management: Concepts, Techniques and Tools, Revised Edition, Princeton: Princeton University Press.
Nelsen, R. B. (2007), An Introduction to Copulas, New York: Springer.
Pickands, J. (1981), "Multivariate extreme value distributions," in Proc. 43rd Sess. Int. Statist. Inst., pp. 859–878.
Ramos, A. and Ledford, A. (2009), "A new class of models for bivariate joint tails," Journal of the Royal Statistical Society, Series B, 71, 219–241.
Stephenson, A. and Tawn, J. (2005), "Exploiting occurrence times in likelihood inference for componentwise maxima," Biometrika, 92, 213–227.
Tibshirani, R. (1996), "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B, 58, 267–288.
van der Vaart, A. W. (1998), Asymptotic Statistics, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge: Cambridge University Press.
Wang, Y. and Stoev, S. A. (2011), "Conditional sampling for spectrally discrete max-stable random fields," Advances in Applied Probability, 43, 461–483.