[PDF] Modeling short-ranged dependence in block extrema with application to polar temperature data

Abstract

The block maxima approach is an important method in univariate extreme value analysis. While assuming that block maxima are independent results in straightforward analysis, the resulting inferences maybe invalid when a series of block maxima exhibits dependence. We propose a model, based on a first-order Markov assumption, that incorporates dependence between successive block maxima through the use of a bivariate logistic dependence structure while maintaining generalized extreme value (GEV) marginal distributions. Modeling dependence in this manner allows us to better estimate extreme quantiles when block maxima exhibit short-ranged dependence. We demonstrate via a simulation study that our first-order Markov GEV model performs well when successive block maxima are dependent, while still being reasonably robust when maxima are independent. We apply our method to two polar annual minimum air temperature data sets that exhibit short-ranged dependence structures, and find that the proposed model yields modified estimates of high quantiles.

Full PDF

MModeling short-ranged dependence in block extremawith application to polar temperature data

Brook T. Russell ∗ Whitney K. HuangSeptember 24, 2020

Abstract

The block maxima approach is an important method in univariate extremevalue analysis. While assuming that block maxima are independent results instraightforward analysis, the resulting inferences maybe invalid when a seriesof block maxima exhibits dependence. We propose a model, based on a ﬁrst-order Markov assumption, that incorporates dependence between successiveblock maxima through the use of a bivariate logistic dependence structure whilemaintaining generalized extreme value (GEV) marginal distributions. Modelingdependence in this manner allows us to better estimate extreme quantiles whenblock maxima exhibit short-ranged dependence. We demonstrate via a simu-lation study that our ﬁrst-order Markov GEV model performs well when suc-cessive block maxima are dependent, while still being reasonably robust whenmaxima are independent. We apply our method to two polar annual minimumair temperature data sets that exhibit short-ranged dependence structures, andﬁnd that the proposed model yields modiﬁed estimates of high quantiles. ∗ Clemson University School of Mathematical and Statistical Sciences, Clemson, SC 29634 (email:[email protected]) a r X i v : . [ s t a t . M E ] S e p eywords : Annual minimum air temperature, Generalized extreme value distribu-tion, Bivariate logistic dependence structure, Bayesian inference In recent years, there has been an increased focus on modeling extremes of environ-mental variables.

Extreme value theory (EVT) provides a probabilistic framework forperforming statistical inference on the far upper tail of distributions, and is thereforeuseful in a wide variety of environmental applications. Examples include modelingextreme temperatures (Huang et al., 2016; Stein, 2020b,a; O’Sullivan et al., 2020),precipitation extremes (Huang et al., 2019; Russell et al., 2020; Hazra et al., 2020;Fix et al., 2020), and extremes in hydrology (Towe et al., 2019; Beck et al., 2020).In the analysis of univariate extremes, the block maxima approach (Coles, 2001;Gumbel, 1958) is among the most commonly employed methods. Under this frame-work of analysis, brieﬂy outlined in Section 2, renormalized block maxima can beshown to converge to the generalized extreme value (GEV) distribution under certainconditions. Inference is typically performed by assuming that a series of block maximaare independent GEV realizations. In many applications, this assumption of indepen-dence among block maxima is quite reasonable (e.g., Huang et al., 2016), given thatthe block size is suﬃciently large and the (within block) serial dependence is relativelyweak; however, there are cases where dependence between consecutive block maximais exhibited (e.g., Zhu et al., 2019). In these instances, traditional block maximaanalysis will ignore this dependence, potentially disregarding important informationand leading to invalid inference.In this work, we are motivated by series of block minima that appear to exhibitshort-ranged asymptotic (tail) dependence . Informally, two variables are asymptoti-2ally dependent if the probability of the event that both are at their most extremelevels simultaneously is non-zero. Our motivating data sets consist of annual mini-mum temperatures at Arctic and Antarctic research stations, and both exhibit theaforementioned short-ranged asymptotic dependence structure. That is, consecutiveblock minima appear to show dependence, but this (estimated) dependence dimin-ishes beyond a lag of one (e.g., more than one year apart. See Fig. 2). Althoughwe are interested in modeling block minima , we note that we begin by developingmethodology for block maxima in this manuscript. This approach is justiﬁed by thefact that methods for analysis of block maxima are easily applied to block minimaafter the response variable is negated.Zhu et al. (2019) analyze the same Antarctic data set that we consider here andconclude that the series of annual minima exhibits short-ranged dependence, pri-marily at lag one. However, their approach is based on using a Gaussian copula(Joe, 1997; Nelsen, 2006) to model dependence. Although copula-based methods areﬂexible in terms of modeling dependence in general, Gaussian copula based modelsare incapable of modeling asymptotic dependence (Sibuya, 1959). In extreme valueanalysis, asymptotic dependence is often the type of dependence that is of primaryinterest. We give a brief overview of asymptotic dependence in Section 2; Coles (2001)and Resnick (2007) provide additional details for the interested reader.In order to model such asymptotic dependence among block maxima, one potentialapproach is to employ multivariate extreme value methods (see, for example, Ch. 8of Coles, 2001). Unfortunately, characterizing dependence for extremes is nontrivialeven for moderate dimension (e.g., dimension d ≥ d and thereforethe resulting model is challenging to ﬁt (Castruccio et al., 2016). This complicationwith the likelihood function makes both maximum likelihood and Bayesian inferenceapproaches impractical.In this work we propose a modeling procedure speciﬁcally for block maxima witha short-ranged asymptotic dependence structure. Our approach is based on a ﬁrst-order Markov assumption, and oﬀers several attractive properties: GEV marginaldistributions, a likelihood function with an easily expressed closed form that makesboth frequentist and Bayesian inference relatively straightforward, and short-ranged asymptotic dependence of block maxima.This manuscript is organized in the following manner. In Section 2 we give a briefbackground in univariate and bivariate extremes used in this work. We describe ourmethod for modeling dependent block maxima in Section 3 and give the results of asimulation study in Section 4. Our analysis of temperature data is given in Section5. We conclude with a discussion in Section 6. In this section, we provide background for univariate and bivariate extreme valueanalysis. We introduce the block maxima approach in Section 2.1, as our method ofanalysis is built within this framework. An asymptotic dependence measure and thebivariate logistic dependence model is described in Section 2.2.

Fisher and Tippett (1928) and Gnedenko (1943) provide the theoretical basis formodeling block maxima, the maxima taken from sequences of n independent and4dentically distributed random variables ( X , . . . , X n ), with “block length” n suf-ﬁciently large, through the use of an extreme value distribution. Speciﬁcally, let M n = max { X , . . . , X n } denote the block maxima. If there exist sequences { a n > } and { b n } such that P (cid:18) M n − b n a n ≤ z (cid:19) n →∞ −→ G ( z ) (1)for a non-degenerate distribution G , then G must belong to the (reversed) Weibullfamily, the Gumbel family, or the Fr´echet family. The GEV distribution is a threeparameter distribution that includes these three families as special cases. We let Z denote the (non-degenarate) asymptotic distribution of the block maximum (e.g.,annual maximum) of a variable of interest (e.g., daily temperature with block size n = 365). For Z ∼ GEV( µ, σ, ξ ), its distribution function is deﬁned such that P ( Z < z ) = exp (cid:32) − (cid:18) ξ (cid:18) z − µσ (cid:19)(cid:19) − /ξ + (cid:33) , (2)where c + := max { c, } . The GEV parameters are referred to as the location param-eter µ ∈ R , the scale parameter σ >

0, and the shape parameter ξ ∈ R . All threeparameters are involved in modeling extremes, but the shape is especially importantas it determines the nature of the tail. If ξ <

0, the tail will be bounded and theGEV becomes the Weibull. If ξ >

0, the tail will be heavy and the GEV becomes theFr´echet. If ξ →

0, the resulting tail will be light and the GEV becomes the Gumbel.Coles (2001) oﬀers an introductory resource for the analysis of univariate extremes.We note that Leadbetter et al. (1983) show that the independence assumption on { X , . . . , X n } can be relaxed for weakly dependent stationary time series. Einmahlet al. (2016) extend the theory to non-identically distributed observations when dis-5ributions of { X , . . . , X n } share a common absolute maximum.In practice, block maxima are extracted where the blocks are produced by dividingthe data record (e.g., a time series of certain climate variable) into non-overlappingperiods. If the blocks are thought to be large enough, the series of block maximamay be considered GEV realizations and can be used to estimate the correspondingGEV parameters. This can be done using a likelihood (Prescott and Walden, 1980)or a moment based method (Hosking et al., 1985), or alternatively via a Bayesianapproach (Coles and Tawn, 1996). For the case where analysis of minima is of interest,researchers can use this type of approach after negating the series of block maxima.In application, researchers are often interested in estimating the (1 − p ) th quantileof Z , the distribution of block maxima, with a “small” p , say 0.05 or 0.01. For Z ∼ GEV( µ, σ, ξ ), where µ , σ , and ξ are known, this quantile is given by Z p ( µ, σ, ξ ) =  µ − σξ (1 − {− log(1 − p ) } − ξ ) for ξ (cid:54) = 0 µ − σ log {− log(1 − p ) } for ξ = 0 . (3)After estimating the three GEV parameters based on sample data, the (1 − p ) th quantile can be estimated via the plug-in estimate (cid:98) Z p = Z p (ˆ µ, ˆ σ, ˆ ξ ). Importantly,when the data are composed of annual maxima, one can estimate the r -year returnlevel that is associated with exceedance probability p , and is denoted (cid:99) RL r = (cid:98) Z p .Here, the return period is r = 1 /p years. For example, the exceedance probability of.02 corresponds to a return period of 1 /.

02 = 50 years.Quantifying uncertainty in return level estimates can be performed in several ways.A delta method based approach can be used, but the resulting conﬁdence intervals areknown to perform poorly when p is small (Coles, 2001). For likelihood based inference,a proﬁle likelihood method is sometimes recommended. When Bayesian inference is6mployed, credible intervals for return levels can be produced from the Markov chainMonte Carlo (MCMC) output. Coles and Tawn (1996) provide a discussion of thisissue in the extreme value analysis context. When performing an extreme value analysis of bivariate random vectors, describ-ing asymptotic dependence is often of primary interest. In this section, we brieﬂyintroduce asymptotic dependence and discuss a parametric model for asymptotic de-pendence.

For random variables X and Y with their corresponding cumulative distribution func-tion F X and F Y , deﬁne the tail dependence coeﬃcient χ (Coles et al., 1999) χ = lim u → − P ( F Y ( Y ) > u | F X ( X ) > u ) . (4)If χ >

0, the two variables are termed asymptotically dependent ; the asymptotic inde-pendence case is implied when χ = 0. Determining the presence and the strength ofasymptotic dependence is important in many applications. For example, experiencingextreme levels of storm surge and precipitation simultaneously may result in muchgreater damage compared to one of these variables reaching extreme levels by its self.Typical association metrics, such as Pearson’s correlation coeﬃcient, may be use-ful for describing association in the bulk of the data; however, they often performpoorly in terms of describing asymptotic dependence. It is also important to notethat bivariate Gaussian random variables with a correlation coeﬃcient less than oneare asymptotically independent (Sibuya, 1959). Therefore, one should carefully con-7ider whether to use Gaussian copulas to model random variables that may exhibitasymptotic dependence. Assume that { ( X i , Y i ) } i ∈ N is a sequence of independent bivariate random vectors withjoint distribution function F X,Y ( x, y ). Deﬁne the vector of componentwise maxima, M n = ( M x,n , M y,n ), where M x,n = max i ∈{ ,...,n } { X i } and M y,n = max j ∈{ ,...,n } { Y j } . Impor-tantly, we note that the index for which the maximum of the X i s occurs is not neces-sarily the same index for which the maximum of the Y j s occurs. That is, argmax i ∈{ ,...,n } X i is not necessarily the same as argmax j ∈{ ,...,n } Y j , and therefore M n may not appear in theoriginal data set. As is standard in extreme value analysis, we transform the marginaldistributions of M x,n and M y,n such that both have the unit Fr`echet distribution (aspecial case of GEV( µ = 1 , σ = 1 , ξ = 1)), with distribution function given by P ( Z ≤ z ) = exp( − z − ) for z > . (5)The use of such marginal transformations is theoretically justiﬁed (Resnick, 2007)and allows for describing asymptotic dependence in a more straightforward manner.If P ( M x,n ≤ x, M y,n ≤ y ) d → G ( x, y ) for non-degenerate G , then G will have theform G ( x, y ) = exp ( − V ( x, y )) (6)for x, y >

0. The function V can be expressed in the form V ( x, y ) = 2 (cid:90) max (cid:18) wx , − wy (cid:19) dH ( w ) , (7)8here H is a non-negative measure that determines the dependence and satisﬁes (cid:90) w dH ( w ) = (cid:90) (1 − w ) dH ( w ) = 1 . (8)(Tawn, 1988; Coles and Tawn, 1991, 1994; Ledford and Tawn, 1997).One approach is to model H using H θ , a parametric family with parameters θ ∈ Θ .In this work, we utilize the logistic (as known as Gumbel) family (Tawn, 1988) wherethe dependence structure is determined by a single parameter 0 ≤ α ≤

1. The jointcumulative distribution function is given by G α ( x, y ) = exp {− ( x − /α + y − /α ) α } . (9)Under this parametric modeling assumption, the parameter α determines the na-ture of the asymptotic dependence between the corresponding maxima, and a smallerparameter value implies a higher degree of tail dependence. As α →

1, the two max-ima become asymptotically independent; as α →

0, they exhibit perfect dependence.We also note that under the logistic modeling assumption, there is a deterministicrelationship between α and the parameter χ from (4), given by χ = 2 − α (see Em-brechts et al., 2001, p. 16, Example 3.3). We note that χ = lim u → u ∗ P ( Y > u | X >u ) = 2 − = 0 for the asymptotically independent case, and χ = 2 − = 1 for theperfect dependence case. In this section, we outline a ﬁrst-order Markov based model to account for short-ranged dependence among block maxima, and discuss how it is possible to use such amodel to estimate conditional tail quantiles. Smith et al. (1997) and Smith (1992) also9uggest a Markov dependence structure, but in the context of threshold exceedances.

Let { Z t } t =1 ,...,n be a series of block maxima based on blocks with a large number ofobservations. At this point, we assume that each Z t has the unit Fr`echet distribution,and each marginal distribution function is therefore given by F Z t ( z t ) = exp {− z − t } , (10)leading to the marginal density function f Z t ( z t ) = z − t exp {− z − t } . (11)We again emphasize that the unit Fr`echet distribution is a special case of theGEV, deﬁned in (2). Under our proposed modeling procedure, asymptotic dependencebetween consecutive block maxima is modeled by assuming that Z t and Z t +1 have abivariate extreme value distribution with logistic dependence structure. As we areonly interested in modeling data with short-ranged dependence, we make a ﬁrst-order Markov assumption; that is, we assume that Z t and Z t + k are conditionallyindependent given Z t + k − for k > t = 1 , . . . , n − k .Under the bivariate logistic dependence structure described in Section 2.2, thejoint distribution function between consecutive observations is given by F Z t ,Z t +1 ( z t , z t +1 | α ) = exp {− ( z − /αt + z − /αt +1 ) α } , (12)10nd their joint density is given by f Z t ,Z t +1 ( z t , z t +1 | α ) = F Z t ,Z t +1 ( z t , z t +1 )( z t z t +1 ) − /α ( z − /αt + z − /αt +1 ) − α × ( α − − z − /αt + z − /αt +1 ) α ) . (13)We can then obtain the likelihood function L ( α | Z ) = f Z ( z | α )= f Z ( z ) n − (cid:89) t =1 f Z t +1 ( Z t +1 | Z t , α )= f Z ( z ) n − (cid:89) t =1 f Z t ,Z t +1 ( z t , z t +1 | α ) f Z t ( z t | α ) , (14)relying on the ﬁrst-order Markov assumption. This implies that the log likelihood forthe series is given bylog( L ( α | Z )) = log( f Z ( z | α )) + n − (cid:88) t =1 log (cid:18) f Z t ,Z t +1 ( z t , z t +1 | α ) f Z t ( z t | α ) (cid:19) = n − (cid:88) t =1 log( f Z t ,Z t +1 ( z t , z t +1 | α )) − n − (cid:88) t =2 log( f Z t ( z t | α )) . As the likelihood function has a relatively simple closed form, inference can be per-formed in a straightforward manner via maximum likelihood, and Bayesian inferenceis also possible. Importantly, inference on the dependence parameter α may yieldvaluable information regarding the degree to which consecutive block maxima exhibitdependence.To this point, we have assumed unit Fr´echet marginal distributions. To allow forarbitrary GEV marginal distributions in analysis, we rely on the following. Assumethat Z is a unit Fr´echet random variable, and consider Y = µ + σ ( Z ξ + − /ξ for11 ∈ R , σ >

0, and ξ ∈ R . It is straightforward to show that Y ∼ GEV( µ, σ, ξ ) . (15)In extremes, it is common practice to use this relation to perform transformations ofthe marginals in order to conduct analysis with unit Fr´echet marginal distributions. When block maxima are assumed to be independent, estimating an upper tail quantile(or the corresponding return level) of interest is straightforward, and is simply afunction of the three GEV parameter estimates; this relation is given in Equation(3). In the case where block maxima exhibit short-ranged dependence, it may bepossible to improve estimates of upper tail quantiles at time t + 1 by incorporatinginformation regarding the block maximum at time t .Assume that the ﬁrst-order Markov based model from Section 3.1 holds. Giventhat Z t = z t , the distribution of Z t +1 is given by F Z t +1 | Z t = z t ( z t +1 | α ) = P ( Z t +1 ≤ z t +1 | Z t = z t , α ) = 1 f Z t ( z t ) (cid:90) z t +1 f Z t ,Z t +1 ( z t , v | α ) dv. (16)In order to obtain the desired upper tail quantile, one simply needs to ﬁnd the valueinf { z t +1 > | F Z t +1 | Z t = z t ( z t +1 | α ) ≥ − p } . (17)We are not able to ﬁnd a closed form for the conditional distribution function in (16);therefore, we employ numerical methods to obtain tail quantile estimates in practice.Quantifying uncertainty in tail quantile estimates presents additional challenges12ompared to the independent block maxima case. A delta method based approachpresents similar downsides as in the independent block maxima case, and proﬁlelikelihood based methods are diﬃcult to implement. However, Bayesian inferencemethods yield straightforward means for producing the desired credible intervals.For this reason, we employ a Bayesian approach in this work. In order to assess the ﬁrst-order Markov GEV model’s ability to estimate extremequantiles for both dependent and independent block maxima data, we undertake asimulation study. In this simulation study, we consider three data generating pro-cesses: a stationary independent GEV process, a stationary ﬁrst-order Markov GEVprocess, and a stationary moving average process of order two, abbreviated MA(2).For all three processes, the marginal GEV location and scale parameters are taken tobe 0 and 1 (respectively); in order to approximate the shape parameter in our polartemperature data application, we choose the shape parameter for both processes tobe -0.1, reﬂecting a typical shape parameter value for near-surface air temperatureextremes. For the ﬁrst-order Markov GEV process, we set the dependence parameter α to be 0.7, corresponding to a moderate level of asymptotic dependence betweenconsecutive block maxima, and similar to what we observe in our data application.Realizations for the ﬁrst-order Markov GEV process are obtained via the use of in-verse transform sampling using the conditional distribution function in Equation (16).The MA(2) process that we consider is deﬁned by X t = W t + 0 . W t − + 0 . W t − , (18)13or t ∈ N and where W is iid Gaussian white noise with unit variance. The resultingseries is transformed to have GEV(0,1,-0.1) marginal distributions via probabilityintegral transformations, similar to Zhu et al. (2019). We note that this MA(2)process has a true correlation of approximately 0.42 at lag one, and a true correlationof approximately 0.06 at lag two.We randomly generate 400 simulated series for each of the three processes, wherethe length of each series is taken to be 100. Table 1 presents the average (over the400 simulated data sets) empirical estimator of χ , from Equation (4), for each datagenerating process at lags k = 1 , . . . ,

5. We use the empirical estimator mentioned inColes (2001), ˆ χ k ( u ) = (cid:80) ni = k +1 I { X i > ˆ F − X ( u ) } I { X i − k > ˆ F − X ( u ) } (cid:80) ni = k +1 I { X i > ˆ F − X ( u ) } , (19)where I {·} represents the indicator function and ˆ F is the marginal empirical cumula-tive distribution function. The threshold is set at the empirical marginal 0.95 quantile(we consider other thresholds in the Supplementary Materials). We note that the in-Table 1: The average estimate of χ (using the estimator in Equation (19) with athreshold set at their empirical 0.95 quantile) for each assumed dependent structureat lag 1 to 5. Lag 1 Lag 2 Lag 3 Lag 4 Lag 5Independent 0.05 0.04 0.03 0.04 0.04GEV ProcessFirst-order 0.30 0.15 0.08 0.05 0.04Markov ProcessMA(2) Process 0.15 0.05 0.05 0.04 0.04dependent GEV process does not allow for asymptotic dependence, and therefore weare not surprised to see the relatively small average empirical estimators at all lagsfor this process. The ﬁrst-order Markov GEV process does allow for asymptotic de-14endence, and therefore the spike at lag one is not unexpected. Although the MA(2)process does not produce true asymptotic dependence, it does induce dependence ;therefore we are not surprised to see the small spike at lag one.For each simulated data set, we ﬁt two models: the stationary independent GEVmodel and the ﬁrst-order Markov GEV model. A Bayesian inference approach isemployed for each model ﬁt, based on obtaining 2,000 posterior draws via MCMC(after burn-in). For each simulated data set, we estimate the true 0.95 quantile forthe next realization in the series of maxima. In the case of the stationary independentprocess, this value would correspond with the 20 year return level. For the ﬁrst-orderMarkov GEV process and MA(2) process, the interpretation is less clear; therefore,we term this parameter the conditional 0.95 quantile (given the series of maxima).Vague Gaussian priors are employed on the location and natural logarithm of thescale parameters. Weakly informative priors are used for the shape and dependenceparameters (truncated Gaussian and Beta, respectively). For each simulated data setand for each model, the resulting posterior draws are used to generate 90% credibleintervals for the conditional 0.95 quantile. In the case of the stationary independentGEV process, we compare each 90% credible interval with the corresponding truequantile. This true quantile of interest is based exclusively oﬀ of the known marginalGEV parameters, calculated using the GEV quantile function in Equation (3). For thesimulations from the ﬁrst-order Markov GEV process, we also compare each credibleinterval with the corresponding true value, determined by the relation in Equation(17). In the case of the MA(2) process, (18) yields a simple closed form that can beused to calculate the desired true conditional 0.95 quantiles for each simulated dataset.The results of the simulation study are summarized in Table 2. We observe thatwhen there is no dependence present in the generating process (the stationary in-15ependent GEV process), the ﬁrst-order Markov based model does not perform aswell as the stationary independent GEV based model in terms of capturing the truetail quantile value. However, the dropoﬀ in the empirical coverage rate is not large(86.5% versus 90.8%). When there is short-ranged asymptotic dependence presentin the process used to generate the data (the ﬁrst-order Markov GEV process), theindependent GEV based model seems to perform much worse in terms of the empiri-cal coverage rate. When sampling from the ﬁrst-order Markov process, the empiricalcoverage rates for the 90% credible regions are 83.7% versus 31.4%. Similarly, whensampling from the MA(2) process deﬁned in Equation (18), the empirical coveragerates are 83.4% versus 42.1%. Importantly, the ﬁrst-order Markov GEV model looksto perform reasonably well despite the fact that the MA(2) process does not trulyproduce asymptotic dependence. For illustrative purposes, for the ﬁst 20 simulateddata sets from each of the three processes we present graphs of the resulting 90%credible intervals based on both models in Figure 1. Each credible interval has beencentered such that its corresponding true conditional 0.95 quantile is represented byzero.Table 2: The empirical coverage rate of the 90% credible interval of the conditional0.95 quantile for each combination of data generating process and modeling procedure.First-order IndependentMarkov Model GEV ModelIndependent 0.8651 0.9084GEV ProcessFirst-order 0.8377 0.3144Markov ProcessMA(2) Process 0.8337 0.420516 Simulating from Indep GEV Process

Deviation from True Quantile S i m u l a t i on N u m be r l ll l ll l ll ll ll l l l ll lll ll l ll l ll llll l l ll l ll Indep GEVMarkov GEV −1 0 1 2 3

Simulating from Markov GEV Process

Deviation from True Conditional Quantile S i m u l a t i on N u m be r l ll lll l lll ll l ll l ll llll l llll l l lll l ll l ll l l Indep GEVMarkov GEV −1.0 −0.5 0.0 0.5 1.0 1.5 2.0

Simulating from MA(2) Process

Deviation from True Conditional Quantile S i m u l a t i on N u m be r l l ll lll l lll ll l ll ll l ll llll ll lll lll l ll lll l Indep GEVMarkov GEV

Figure 1: For the ﬁrst 20 simulations from each of the three process, we plot theresulting centered 90% credible intervals for the 0.95 quantile of the next observationin the series based on both the independent GEV model and the ﬁrst-order MarkovGEV model. All intervals are shifted such that zero corresponds with the true 0.95quantile value.Taken together, this suggests the following set of conclusions. If a data analystknows with certainty that the block maxima are independent, then he or she maybe wise to use the independent GEV model. Similarly, if a data analyst knows withcertainty that consecutive block maxima are dependent, then he or she may do better17y using the ﬁrst-order Markov GEV model. However, in cases when an analyst isunsure whether or not this type of dependence is present, the penalty for using theﬁrst-order Markov GEV model appears to be less onerous.

Meteorological data from the Arctic and Antarctic regions of the Earth is often diﬃ-cult to obtain, and may also have issues with data quality. However, these data couldbe interesting to analyze, as the phenomena governing these remote regions may bediﬀerent compared to more temperate regions. In this section, we present analysis oftwo polar annual minimum temperature time series: one in Antarctica and the otherin the former Soviet Arctic region, both of which appear to exhibit short-rangedasymptotic dependence in their annual minimum near-surface temperatures.

We ﬁrst present an analysis of an annual minimum temperature data set at theFaraday/Vernadsky station in Antarctica (65 . ◦ S, 64 . ◦ W). This same time seriesis analyzed in Zhu et al. (2019), and includes data from 1947-1993 (Jones and Reid,2001); we plot the corresponding values in the left panel of Figure 2. Based onthese data, in the right panel of Figure 2 we plot the estimated value of χ , usingthe estimator in Equation (19) for k = 1 , . . . ,

5. This plot could loosely be thoughtof as an asymptotic dependence analog to a sample autocorrelation plot. Here, thethreshold is set at the empirical 0.95 quantile, but other thresholds are considered inthe Supplementary Materials. We note that the estimated value of χ is moderately18igh at lag one, but drops oﬀ to zero quickly beyond this point. This exploratoryanalysis is not conclusive, but suggests that the ﬁrst-order Markov model may beuseful for these data.Zhu et al. (2019) performed analysis of this Antarctic annual minimum tempera-ture data set, and make two overall conclusions in their work. First, they ﬁnd thatannual minimum temperatures seem to be increasing over this time period. Second,they conclude that dependence between annual minima exists at a lag of one year.In order to make these conclusions, they develop a time series model for (negated)annual minimum temperatures using a Gaussian dependence structure and marginalGEV distributions. To ensure that the marginal distributions are GEV, they employprobability integral transformations. The ﬁndings of Zhu et al. (2019) are intriguing,and suggest an interesting climatological phenomenon; however, they are not able todetermine the presence of short-ranged asymptotic dependence. This is due to thefact that methods that incorporate Gaussian based dependence are not capable ofmodeling asymptotic dependence when correlation is less than one (Sibuya, 1959).For this reason, we consider the ﬁrst-order Markov GEV model developed in Section3. In this analysis, we use the approach developed in Section 3. As in our simula-tion study, to allow for arbitrary GEV marginals we rely on the relation outlined in(15). In our analysis, we consider four models. In order to ensure that ˆ σ >

0, weperform inference for the scale parameter on the log scale. To allow for temporal non-stationarity, we consider models that include a linear temporal trend in the locationparameter, i.e., Y t ∼ GEV( µ + µ t, σ, ξ ). This model is obtained by transforming Z t , the stationary ﬁrst-order Markov GEV model with unit Fr´echet marginals, via Y t = µ + µ t + σ (cid:16) ( Z t ) ξ + − (cid:17) /ξ for µ , µ ∈ R , σ >

0, and ξ ∈ R . To this end, weconsider the four models, M1 - M4, deﬁned such that19

950 1960 1970 1980 1990 − − − − Year A nnua l M i n i m u m T e m pe r a t u r e . . . . . . Lag c ^ Figure 2:

Left : Annual minimum temperatures at the Faraday/Verdnansky station.

Right : ˆ χ at lags one through 5 for the same near-surface air temperature data, where χ is estimated using its traditional empirical estimator (Coles, 2001) with a thresholdat the empirical 0.95 quantile. • M1: Y t ∼ GEV( µ, σ, ξ ) - stationary independent GEV, • M2: Y t ∼ GEV( µ + µ t, σ, ξ ) - independent GEV with a linear temporal trendin the location parameter, • M3: Y t ∼ GEV( µ, σ, ξ ) with dependence parameter α - stationary ﬁrst-orderMarkov GEV, and • M4: Y t ∼ GEV( µ + µ t, σ, ξ ) with dependence parameter α - ﬁrst-order MarkovGEV with a linear temporal trend in location parameter.For each of the above models, a Bayesian modeling approach is taken. We use vagueGaussian priors for µ and µ in M2 and M4, and µ in M1 and M3. The scaleparameter is modeled on the log scale in order to ensure positivity, and we assume avague Gaussian prior for log( σ ). As the GEV scale parameter is known to be diﬃcult20o estimate in practice, we employ a mildly informative truncated Gaussian prior for ξ in all models. In M3 and M4, we use a mildly informative beta prior for α . Detailsregarding prior distributions for all four models are available in the Appendix.The posterior distributions do not have closed forms, thus MCMC methods areemployed for inference (Gelman et al., 2013). MCMC is performed in R (R Core Team,2016) via the package rstan (Stan Development Team, 2016), which utilizes the Stan programming language (Carpenter et al., 2017). In order to assess convergence, foreach model, two independent chains with randomly selected initial values are runin parallel until 110,000 draws from each chain are obtained. The ﬁrst 10,000 fromeach chain are discarded as burn-in and every 20 th observation from the remaining100,000 draws is retained, yielding a total of 5,000 posterior draws from each chain.Convergence is assessed via traceplots (presented in Supplementary Materials) andˆ R (Gelman et al., 2013), which is approximately 1 for all parameters. Althoughboth chains appear to have converged to the same distribution, inference hereafter isarbitrarily based oﬀ of draws from the ﬁrst chain exclusively. Additional discussionof our MCMC procedure is presented in the Supplementary Materials.As outlined in Spiegelhalter et al. (2002), model comparison is performed bycalculating the corresponding deviance information criterion (DIC) values for M1- M4. These results indicate that M4 is the best of these four models for the Fara-day/Vernadsky station data. Noting that this model includes short-ranged asymp-totic dependence and a linear temporal trend in the location parameter, the posteriormean and key posterior quantiles for M4 are reported in Table 3. The posterior distri-bution of α suggests that there is at least a moderate degree of asymptotic dependencebetween annual minima at lag one. Relying the fact that χ = 2 − α , we transformthe asymptotic dependence parameter estimate to the χ scale for the purpose of com-parison. The posterior mean value for α is 0.657, which yields an estimated value of21 of approximately 0.423, which is quite similar the the empirical estimate at lag onepresented in the right panel of Figure 2.We also use model output to make inference on q . , the annual minimum tem-perature associated with an exceedance probability of 0.05 for 1994. The moderatedegree of dependence at lag one indicates that q . may be dependent upon the an-nual minima in 1993, the last year of the series. Based on the MCMC output, theposterior mean estimate of the conditional tail quantile q . is -18.884. For the sakeof comparison, we contrast this estimate with the estimate from M2, the best ﬁttingmodel that does not incorporate asymptotic dependence (based on the DIC criterion).MCMC output from M2 yields a posterior mean estimate of -21.967 for q . (as seenin Table 6), which is nearly 3 degrees cooler than the estimate produced by the modelthat does account for short-ranged asymptotic dependence.Based on analysis of the posterior distribution of µ , there is also moderate evi-dence of a linear temporal trend in the location parameter. This provides additionalevidence of warming annual minimum temperatures at this location over the studyperiod. This conclusion seems to be consistent with the ﬁndings of Zhu et al. (2019)22able 3: A numerical summary of our MCMC posterior draws in M4 for the Faradayseries. We report the posterior mean and several quantiles for the location interceptparameter, location temporal trend parameter, scale parameter, shape parameter,dependence parameter, and 0.95 conditional tail quantile for the next year in theseries (top row to bottom row, respectively). Recall that the model is ﬁt based on theseries of negated minima, and therefore the negative posterior mean for µ actuallycorresponds with increasing annual minimum temperatures.Mean 2.5% 5% 50% 95% 97.5% µ µ -0.120 -0.280 -0.245 -0.123 0.010 0.051 σ ξ -0.035 -0.224 -0.193 -0.040 0.143 0.182 α q . -18.884 -22.798 -22.091 -18.728 -16.204 -15.792Table 4: As in Table. 3 but for M2.Mean 2.5% 5% 50% 95% 97.5% µ µ -0.135 -0.226 -0.212 -0.136 -0.059 -0.043 σ ξ -0.096 -0.265 -0.237 -0.099 0.055 0.089 q . -21.967 -26.759 -25.787 -21.736 -18.918 -18.47423able 5: As in Table. 3 but for M4 for the Soviet series.Mean 2.5% 5% 50% 95% 97.5% µ µ -0.022 -0.065 -0.058 -0.021 0.013 0.021 σ ξ -0.141 -0.303 -0.282 -0.147 0.016 0.045 α q . -45.650 -47.792 -47.382 -45.573 -44.209 -43.978Table 6: As in Table. 5 but for M2.Mean 2.5% 5% 50% 95% 97.5% µ µ -0.019 -0.052 -0.047 -0.019 0.008 0.014 σ ξ -0.170 -0.319 -0.295 -0.172 -0.035 -0.007 q . -46.129 -48.278 -47.875 -46.048 -44.669 -44.433 The analysis in Section 5.1 indicates that there may be asymptotic dependence at lagone in annual minimum temperatures at the Faraday/Vernadsky station in Antarc-tica. In this section, we investigate whether this type of dependence exists at anarbitrarily selected station in the Arctic region. To this end, we perform analysisof annual minimum air temperature data using the northernmost station (73 . ◦ N,80 . ◦ E) in a Soviet research station database (Razuvaev et al., 2008). We plot thistime series in the left panel of Figure 3 and note that it includes data from winters(DJF) beginning in the years 1936-2000. In the right panel of Figure 3, we plot theestimated value of χ at lags one through 5 for these data. Here, we estimate χ us-24ng the estimator described in Equation 19 with the threshold at the empirical 0.95quantile. Other thresholds are considered in the Supplementary Materials. As in theAntarctic data, the estimated value of χ is moderately high at lag one, and drops oﬀto zero quickly beyond this point, suggesting that the ﬁrst-order Markov model maybe appropriate for these data as well.In a similar fashion, we consider the Bayesian models M1 - M4 for the Arcticstation data. Similar to the Antarctic data, M4 is also the best ﬁtting model accordingto DIC; we ﬁnd this result interesting, as it indicates short-ranged dependence may bepresent in these data as well. Table 5 presents the posterior mean and key posteriorquantiles for the parameters in M4. Although the evidence appears to be less strong,these results suggest that a model that includes asymptotic dependence and a lineartemporal trend in the location parameter may be favored.Although the diﬀerence in tail quantile estimates is not as large as what we ob-served in the Antarctic data, we note that the estimates of q . diﬀer by nearly 1degree for the Soviet data compared to the estimate based on M2. We note that theposterior mean and key posterior quantiles for the parameters in M2 are presentedin Table 6. Transforming the posterior mean asymptotic dependence parameter esti-mate to the χ scale results in an estimated value of χ of approximately 0.243, whichis slightly lower than seen in the empirical estimate in the right panel of Figure 3.25

940 1960 1980 2000 − − − − Year A nnua l M i n i m u m T e m pe r a t u r e . . . . . . Lag c ^ Figure 3: As in Fig. 2 but for the Soviet station.

In cases when a series of block maxima exhibits short-ranged temporal dependenceanalysis becomes more complicated. In these situations, modeling block maxima us-ing a multivariate extreme value distribution or a max-stable process approach arevalid strategies; unfortunately, these methods can be diﬃcult to implement in prac-tice. Max-stable must typically be ﬁt using composite (pairwise) likelihood methodsdue to the fact that the joint likelihood is not tractable for even a relatively smallnumber of observations (Cooley et al., 2012; Davison et al., 2012). This compli-cates Bayesian inference approaches, though a few works have managed to imple-ment Bayesian models in special cases (Reich and Shaby, 2012; Ribatet et al., 2012;Thibaud et al., 2016). Others have modeled dependence among block maxima usingGaussian copula based dependence structures (Zhu et al., 2019). While straightfor-26ard to implement, a major drawback is that the Gaussian copula based approachdoes not allow for asymptotic dependence.In our modeling approach, we make a ﬁrst-order Markov assumption, and thereforethe joint likelihood only utilizes pairwise likelihoods between consecutive observations,which is the full likelihood of the proposed model. In situations where the dependenceamong block maxima is short-ranged, as we observe in our motivating data sets, thisﬁrst-order Markov assumption makes a great deal of sense and results in a tractablelikelihood function and utilizing a logistic structure to model dependence. Our modelprovides a simple alternative to account for short-ranged dependent block maxima.Our motivating data sets exhibits dependence between annual minima at lag one;therefore our modeling assumptions appear to be reasonable. However, if a data setshowed dependence at lags one and two, it would be possible to extend the method toassume that X t depends on X t − and X t − . Under this scenario, the joint likelihoodwould then include trivariate likelihood functions. Although trivariate dependence isconsiderably more diﬃcult to characterize, there are multivariate dependence struc-tures that may make sense in these situations. We leave this extension for futurework. Conﬂict of Interest

The authors declare that they have no conﬂicts of interest.

Acknowledgements

Clemson University is acknowledged for its generous allotment of computing time onthe Palmetto Cluster. We thank the authors of Zhu et al. (2019) for sharing the27araday/Vernadsky data set.

Data Availability Statement

The Faraday/Verdnadsky data are described in Jones and Reid (2001) and havebeen provided to us by Zhu et al. (2019). The data used in our analysis are postedat https://github.com/brooktrussell/DependentGEV/Faraday.csv . The Sovietstation data are described by Razuvaev et al. (2008) and available at https://cdiac.ess-dive.lbl.gov/ftp/ndp040/ . The data used in our analysis are postedat https://github.com/brooktrussell/DependentGEV/Soviet.csv .28

Additional Details of Bayesian Inference Proce-dure

In this section, we present additional details of our Bayesian inference procedure thatwere not included in the manuscript. In our Bayesian modeling procedure, we utilizethe following prior distributions for model parameters in M1-M4. • M1: µ ∼ N (0 , ), log( σ ) ∼ N (0 , ), and ξ ∼ T N (0 , . , − . , . • M2: µ ∼ N (0 , ), µ ∼ N (0 , ), log( σ ) ∼ N (0 , ), and ξ ∼ T N (0 , . , − . , . • M3: µ ∼ N (0 , ), log( σ ) ∼ N (0 , ), ξ ∼ T N (0 , . , − . , . α ∼ Beta (1 . , • M4: µ ∼ N (0 , ), µ ∼ N (0 , ), log( σ ) ∼ N (0 , ), ξ ∼ T N (0 , . , − . , . α ∼ Beta (1 . , T N denotes the truncated Gaussian distribution where the param-eters are the mean, variance, lower truncation value, and upper truncation value(respectively). The prior distribution for the shape parameter ξ and the dependenceparameter α are plotted in the left and right (respectively) panels Figure 4.29 . . . . . . Shape Parameter D en s i t y . . . . Dependence Parameter D en s i t y Figure 4: We plot the prior distribution used for the shape parameter (L) and thedependence parameter (R).In order to minimize the eﬀect of temporal dependence in MCMC realizations,we thin by retaining every 20 th observation. This strategy appears to be reasonablyeﬀective, as the number of eﬀective observations for each parameter (as calculated bythe rstan package) are reasonably large, as seen in Table 7.We also believe that it is reasonable to think that both chains have converged. Inthe manuscript, we mention that the value of ˆ R is approximately one for all parame-ters. This is further evidenced by the traceplots in Figures 5 and 6, which appear tobe consistent with convergence.Table 7: We report the number of eﬀective MCMC realizations, recalling that a totalof 10,000 draws were retained after thinning and burn-in (including both chains). ξ α σ µ µ Soviet Station 10,000 9,643 9,360 10,000 10,000Faraday Station 9764 9169 8526 8701 942730 i alpha lsigmu0 mu1 −0.4−0.20.00.20.4 0.000.250.500.751.00 246−150−100−50050 −1011000 2000 3000 4000 5000 1000 2000 3000 4000 5000 1000 2000 3000 4000 50001000 2000 3000 4000 5000 1000 2000 3000 4000 5000 chain Figure 5: In order to aid in assessing convergence, we present traceplots for theparameters in M4 based on the Faraday station data.31 i alpha lsigmu0 mu1 −0.4−0.20.00.2 0.40.60.81.0 1.01.52.039424548 −0.10−0.050.000.051000 2000 3000 4000 5000 1000 2000 3000 4000 5000 1000 2000 3000 4000 50001000 2000 3000 4000 5000 1000 2000 3000 4000 5000 chain Figure 6: In order to aid in assessing convergence, we present traceplots for theparameters in M4 based on the Soviet station data.As described in the manuscript’s Appendix, we employ vague Gaussian priors forthe location and log scale parameters. The GEV shape parameter is known to bediﬃcult to estimate; therefore, we use a slightly informative truncated Normal priordistribution for this parameter. We believe that a prior that places no mass outsideof ( − . , .

5) is reasonable for this application. Since we believe that the dependenceparameter α is moderate, we employ a Beta prior that has much higher density over(0 . , .

0) compared to the interval (0 , . α and ξ are plotted in Figure 4.32 A Comparison of Diﬀerent Thresholds

To supplement the results in the the manuscript, we consider diﬀerent thresholdvalues. Table 8 is analogous to Table 1 in the manuscript, except that it is based onthresholding at the empirical 0.925 quantile. Table 9 is analogous to Table 1 in themanuscript, except that it is based on thresholding at the empirical 0.90 quantile.Both tables convey similar results compared to Table 1 in the manuscript.Table 8: The average estimate of χ (using the estimator in Equation (20) with athreshold set at their empirical 0.925 quantile) for each assumed dependent structureat lag 1 to 5. Lag 1 Lag 2 Lag 3 Lag 4 Lag 5Independent 0.07 0.07 0.07 0.08 0.08GEV ProcessFirst-order 0.36 0.19 0.12 0.10 0.08Markov ProcessMA(2) Process 0.37 0.14 0.08 0.07 0.07Table 9: The average estimate of χ (using the estimator in Equation (20) with athreshold set at their empirical 0.90 quantile) for each assumed dependent structureat lag 1 to 5. Lag 1 Lag 2 Lag 3 Lag 4 Lag 5Independent 0.09 0.09 0.09 0.09 0.10GEV ProcessFirst-order 0.38 0.22 0.15 0.12 0.11Markov ProcessMA(2) Process 0.39 0.16 0.10 0.10 0.09In order to investigate the degree to which the graphs in the right panels of Figures2 and 3 in the manuscript are threshold dependent, we create additional ﬁgures basedon alternative thresholds. The graph in the left panel of Figure 7 is analogous to theright panel of Figure 2 in the manuscript, but thresholds at the empirical 0.90 quantile.The graph in the right panel of Figure 7 is analogous to the right panel of Figure 3 in33he manuscript, but thresholds at the empirical 0.90 quantile. The graph in the leftpanel of Figure 8is analogous to the right panel of Figure 2 in the manuscript, butthresholds at the empirical 0.925 quantile. The graph in the right panel of Figure 8is analogous to the right panel of Figure 3 in the manuscript, but thresholds at theempirical 0.925 quantile. All graphs presented here seem to yield similar conclusionsas presented in the manuscript. . . . . . . Lag c ^ . . . . . . Lag c ^ Figure 7: We plot the empirical estimates of χ at lags one through ﬁve, based on theempirical 0.90 quantile, for the Faraday data (L) and the Soviet data (R).34 . . . . . . Lag c ^ . . . . . . Lag c ^ Figure 8: We plot the empirical estimates of χ at lags one through ﬁve, based on theempirical 0.925 quantile, for the Faraday data (L) and the Soviet data (R). References

Beck, N., Genest, C., Jalbert, J., and Mailhot, M. (2020). Predicting extreme surgesfrom sparse data using a copula-based hierarchical bayesian spatial model.

Envi-ronmetrics , page e2616.Carpenter, B., Gelman, A., Hoﬀman, M., Lee, D., Goodrich, B., Betancourt, M.,Brubaker, M. A., Guo, J., Li, P., and Riddell, A. (2017). Stan: A probabilisticprogramming language.

Journal of Statistical Software , 76(1).Castruccio, S., Huser, R., and Genton, M. G. (2016). High-order composite likelihoodinference for max-stable distributions and processes.

Journal of Computational andGraphical Statistics , 25(4):1212–1229. 35oles, S., Heﬀernan, J., and Tawn, J. (1999). Dependence measures for extreme valueanalyses.

Extremes , 2(4):339–365.Coles, S. G. (2001).

An Introduction to Statistical Modeling of Extreme Values .Springer Series in Statistics. Springer-Verlag London Ltd., London.Coles, S. G. and Tawn, J. A. (1991). Modelling extreme multivariate events.

Journalof the Royal Statistical Society: Series B (Methodological) , 53(2):377–392.Coles, S. G. and Tawn, J. A. (1994). Statistical methods for multivariate extremes:an application to structural design.

Journal of the Royal Statistical Society: SeriesC (Applied Statistics) , 43(1):1–31.Coles, S. G. and Tawn, J. A. (1996). A bayesian analysis of extreme rainfall data.

Journal of the Royal Statistical Society: Series C (Applied Statistics) , 45(4):463–478.Cooley, D., Cisewski, J., Erhardt, R. J., Jeon, S., Mannshardt, E., Omolo, B. O., andSun, Y. (2012). A survey of spatial extremes: measuring spatial dependence andmodeling spatial eﬀects.

Revstat , 10(1):135–165.Davison, A. C., Padoan, S. A., and Ribatet, M. (2012). Statistical Modeling of SpatialExtremes.

Statistical Science , 27(2):161–186.Einmahl, J. H., Haan, L., and Zhou, C. (2016). Statistics of heteroscedastic extremes.

Journal of the Royal Statistical Society: Series B , 78(1):31–51.Embrechts, P., Lindskog, F., and McNeil, A. (2001). Modelling dependence withcopulas.

Rapport technique, D´epartement de math´ematiques, Institut F´ed´eral deTechnologie de Zurich, Zurich , 14. 36isher, R. A. and Tippett, L. H. C. (1928). Limiting forms of the frequency distri-bution of the largest or smallest member of a sample.

Mathematical Proceedings ofthe Cambridge Philosophical Society , 24(2):180–190.Fix, M. J., Cooley, D. S., and Thibaud, E. (2020). Simultaneous autoregressive modelsfor spatial extremes.

Environmetrics .Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B.(2013).

Bayesian data analysis . CRC press.Gnedenko, B. (1943). Sur la distribution limite du terme maximum d’une s´erieal´eatoire.

Annals of Mathematics , 44(3):423–453.Gumbel, E. J. (1958).

Statistics of extremes.

Columbia University Press, New York.Hazra, A., Reich, B. J., and Staicu, A.-M. (2020). A multivariate spatial skew-t process for joint modeling of extreme precipitation indexes.

Environmetrics ,31(3):e2602.Hosking, J., Wallis, J. R., and Wood, E. F. (1985). Estimation of the generalizedextreme-value distribution by the method of probability-weighted moments.

Tech-nometrics , 27(3):251–261.Huang, W. K., Nychka, D. W., and Zhang, H. (2019). Estimating precipitationextremes using the log-histospline.

Environmetrics , 30(4):e2543.Huang, W. K., Stein, M. L., McInerney, D. J., and Moyer, E. J. (2016). Estimatingchanges in temperature extremes from millennial-scale climate simulations usinggeneralized extreme value (GEV) distributions.

Advances in Statistical Climatology,Meteorology and Oceanography , 2(1):79.37oe, H. (1997).

Multivariate models and multivariate dependence concepts . Chapmanand Hall/CRC.Jones, P. and Reid, P. (2001). A databank of Antarctic surface temperature andpressure data. ORNL/CDIAC-27 NDP-032. Carbon Dioxide Information AnalysisCenter, Oak Ridge National Laboratory, US Department of Energy, Oak Ridge,Tennessee. doi: 10.3334/CDIAC/cli.ndp032.Leadbetter, M. R., Lindgren, G., and Rootz´en, H. (1983).

Extremes and RelatedProperties of Random Sequences and Processes . Springer-Verlag, New York.Ledford, A. W. and Tawn, J. A. (1997). Modelling dependence within joint tailregions.

Journal of the Royal Statistical Society: Series B (Statistical Methodology) ,59(2):475–499.Nelsen, R. (2006).

An Introduction to Copulas, 2nd Edition . Lecture Notes in Statis-tics No. 139. Springer, New York.O’Sullivan, J., Sweeney, C., and Parnell, A. C. (2020). Bayesian spatial extremevalue analysis of maximum temperatures in county dublin, ireland.

Environmetrics ,31(5):e2621.Prescott, P. and Walden, A. (1980). Maximum likelihood estimation of the parametersof the generalized extreme-value distribution.

Biometrika , 67(3):723–724.R Core Team (2016).

R: A Language and Environment for Statistical Computing . RFoundation for Statistical Computing, Vienna, Austria.Razuvaev, V., Apasova, E. G., and Martuganov, R. (2008). Daily temperature andprecipitation data for 223 former-USSR stations. ORNL/CDIAC-56 NDP-040.38arbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, USDepartment of Energy, Oak Ridge, Tennessee. doi: 10.3334/CDIAC/cli.ndp040.Reich, B. J. and Shaby, B. A. (2012). A hierarchical max-stable spatial model forextreme precipitation.

Annals of Applied Statistics , 6(4):1430–1451.Resnick, S. (2007).

Heavy-Tail Phenomena: Probabilistic and Statistical Modeling .Springer Series in Operations Research and Financial Engineering. Springer, NewYork.Ribatet, M., Cooley, D., and Davison, A. C. (2012). Bayesian inference from com-posite likelihoods, with an application to spatial extremes.

Statistica Sinica , pages813–845.Russell, B. T., Risser, M. D., Smith, R. L., and Kunkel, K. E. (2020). Investigatingthe association between late spring gulf of mexico sea surface temperatures and usgulf coast precipitation extremes with focus on hurricane harvey.

Environmetrics ,31(2):e2595.Sibuya, M. (1959). Bivariate extreme statistics, I.

Annals of the Institute of StatisticalMathematics , 11(2):195–210.Smith, R. L. (1990). Max–stable processes and spatial extremes.

Unpublishedmanuscript , 205.Smith, R. L. (1992). The extremal index for a Markov chain.

Journal of appliedprobability , 29(1):37–45.Smith, R. L., Tawn, J. A., and Coles, S. G. (1997). Markov chain models for thresholdexceedances.

Biometrika , 84(2):249–268.39piegelhalter, D. J., Best, N. G., Carlin, B. P., and Van Der Linde, A. (2002). Bayesianmeasures of model complexity and ﬁt.

Journal of the Royal Statistical Society:Series B (Statistical Methodology) , 64(4):583–639.Stan Development Team (2016). RStan: the R interface to Stan. R package version2.14.1.Stein, M. L. (2020a). A parametric model for distributions with ﬂexible behavior inboth tails.

Environmetrics .Stein, M. L. (2020b). Parametric models for distributions when interest is in extremeswith an application to daily temperature.

Extremes .Tawn, J. A. (1988). Bivariate extreme value theory: models and estimation.

Biometrika , 75(3):397–415.Thibaud, E., Aalto, J., Cooley, D. S., Davison, A. C., Heikkinen, J., et al. (2016).Bayesian inference for the Brown–Resnick process, with an application to extremelow temperatures.

The Annals of Applied Statistics , 10(4):2303–2324.Towe, R. P., Tawn, J. A., Lamb, R., and Sherlock, C. G. (2019). Model-basedinference of conditional extreme value distributions with hydrological applications.

Environmetrics , 30(8):e2575.Zhang, Z. and Smith, R. L. (2004). The behavior of multivariate maxima of movingmaxima processes.

Journal of Applied Probability , 41(4):1113–1123.Zhu, L., Liu, X., and Lund, R. (2019). A likelihood for correlated extreme series.