On a Bivariate Copula for Modeling Negative Dependence
Shyamal Ghosh, Prajamitra Bhuyan, and Maxim Finkelstein
University of the Free State, Bloemfontein, South Africa; Imperial College London, United Kingdom; The Alan Turing Institute, London, United Kingdom; ITMO University, St. Petersburg, Russia
Abstract
A new bivariate copula is proposed for modeling negative dependence between two random variables. We show that it complies with most of the popular notions of negative dependence reported in the literature and study some of its basic properties. Specifically, the Spearman's rho and the Kendall's tau for the proposed copula have a simple one-parameter form with negative values in the full range. Some important ordering properties comparing the strength of negative dependence with respect to the parameter involved are considered. Simple examples of the corresponding bivariate distributions with popular marginals are presented. Application of the proposed copula is illustrated using a real data set.
Keywords: Likelihood ratio dependent; Negatively ordered; Negatively Quadrant Dependent; Spearman's rho; Sub-harmonic.

Introduction

Copulas provide an effective tool for modeling dependence in various multivariate phenomena in the fields of reliability engineering, life sciences, environmental science, economics and finance, etc. (Fontaine et al., 2020; Cooray, 2019; Joe, 2015, Ch-7). Specifically, in recent decades, bivariate copulas were used to generate bivariate distributions with suitable dependence properties (Lai and Xie, 2000; Bairamov and Kotz, 2003; Finkelstein, 2003; Mohtashami-Borzadaran et al., 2019). A detailed discussion of historical developments, obtained results and perspectives, along with the up-to-date theory, can be found in Nelsen (2006). It should be noted that most copulas available in the literature possess some limitations in modeling negatively dependent data, which is a certain disadvantage, as negative dependence between vital variables is often encountered in real life.

Lehmann (1966) introduced several concepts of negative dependence for bivariate distributions. Later, Esary and Lehmann (1972) and Yanagimoto (1972) extended the corresponding definitions and developed stronger notions of bivariate negative dependence. See Balakrishnan and Lai (2009) for a detailed discussion of popular dependence notions and their applications in the context of continuous bivariate distributions. Scarsini and Shaked (1996) provided a detailed overview of the corresponding ordering properties for multivariate distributions. These results provide useful tools for describing the dependence properties of copulas with respect to a dependence parameter. However, only a few bivariate copulas that allow for a simple and meaningful analysis of this kind have been developed and studied in the literature so far. The Farlie-Gumbel-Morgenstern (FGM) family of distributions exhibits negative dependence, but the Spearman's rho for this family lies within the interval $[-1/3, 1/3]$ (Schucany et al., 1978). Bairamov and Kotz (2000) and Bekrizadeh et al. (2012) have considered the four-parameter and the three-parameter extensions of the FGM family proposed by Sarmanov (1996), with Spearman's rho lying within the interval […]. […]

Bhuyan et al. (2020) proposed a negatively dependent bivariate life distribution that possesses nice, closed-form expressions for the joint distributions and exhibits various strong notions of negative dependence reported in the literature. Most importantly, its correlation coefficient may take any value in the interval $(-1, 0)$. […]

\[
H(x,y) =
\begin{cases}
y^{\lambda} - e^{-\lambda x} + \dfrac{\lambda}{(\lambda+\mu)\,y^{\mu}}\left[e^{-(\lambda+\mu)x} - y^{\lambda+\mu}\right], & 0 < y \le 1,\ x > -\log y,\\[2ex]
1 - e^{-\lambda x} - \dfrac{\lambda}{(\lambda+\mu)\,y^{\mu}}\left[1 - e^{-(\lambda+\mu)x}\right], & x > 0,\ y > 1,
\end{cases}
\tag{1}
\]

and $F(x) = 1 - e^{-\lambda x}$ for $x > 0$, and

\[
G(y) = \frac{\mu}{\lambda+\mu}\, y^{\lambda}\, \mathbb{1}(0 < y \le 1) + \left[1 - \frac{\lambda}{(\lambda+\mu)\,y^{\mu}}\right] \mathbb{1}(y > 1),
\]

with $\lambda, \mu > 0$. Note that $F(\cdot)$ and $G(\cdot)$ are continuous. We first find the quasi-inverse functions of $F(\cdot)$ and $G(\cdot)$ and insert those into the arguments of the joint distribution function $H(\cdot,\cdot)$ given by (1). Then, by Corollary 2.3.7 of Nelsen (2006, p-22), we obtain the following copula

\[
C_{\lambda,\mu}(u,v) =
\begin{cases}
v - (1-u) + \dfrac{\lambda\,\mu^{\mu/\lambda}}{(\lambda+\mu)^{1+\mu/\lambda}}\,(1-u)^{1+\mu/\lambda}\, v^{-\mu/\lambda}, & 0 < v \le \dfrac{\mu}{\mu+\lambda},\ 1 - \dfrac{(\lambda+\mu)v}{\mu} < u < 1,\\[2ex]
u - (1-v)\left[1 - (1-u)^{1+\mu/\lambda}\right], & 0 < u < 1,\ \dfrac{\mu}{\mu+\lambda} < v < 1.
\end{cases}
\tag{2}
\]

Now, using the reparameterization $\mu = \theta\lambda$ in (2), we rewrite $C_{\lambda,\mu}$ as

\[
C_{\theta}(u,v) =
\begin{cases}
v - (1-u) + \dfrac{\theta^{\theta}}{(1+\theta)^{1+\theta}}\,(1-u)^{1+\theta}\, v^{-\theta}, & 0 < v \le \dfrac{\theta}{1+\theta},\ 1 - \dfrac{(1+\theta)v}{\theta} < u < 1,\\[2ex]
u - (1-v)\left[1 - (1-u)^{1+\theta}\right], & 0 < u < 1,\ \dfrac{\theta}{1+\theta} < v < 1,
\end{cases}
\tag{3}
\]

for $\theta > 0$; on the remainder of the unit square, where $u \le 1 - (1+\theta)v/\theta$, the first expression vanishes at the boundary and $C_\theta(u,v) = 0$. It is easy to verify that $C_\theta(u,v)$, given by (3), satisfies the following conditions: (i) $C_\theta(u,0) = 0 = C_\theta(0,v)$, (ii) $C_\theta(u,1) = u$, $C_\theta(1,v) = v$, for any $u, v$ in $I = [0,1]$, and (iii) $C_\theta(u_2,v_2) - C_\theta(u_2,v_1) - C_\theta(u_1,v_2) + C_\theta(u_1,v_1) \ge 0$, for any $u_1, u_2, v_1, v_2$ in $I$ with $u_1 \le u_2$ and $v_1 \le v_2$. In Figure 1, we provide a graphical presentation of the proposed copula for different values of the dependence parameter $\theta$.

Figure 1: Graphical plots of $C_\theta$ for different choices of $\theta$ on the unit square (panels (a) to (d), with $\theta = 1$, $5$ and $10$ in panels (b), (c) and (d)).

The survival copula $\bar{C}_\theta(u,v) = u + v - 1 + C_\theta(1-u, 1-v)$ and the density function of the proposed copula $C_\theta(u,v)$ are given by

\[
\bar{C}_{\theta}(u,v) =
\begin{cases}
\dfrac{\theta^{\theta}}{(1+\theta)^{1+\theta}}\, u^{1+\theta}\,(1-v)^{-\theta}, & \dfrac{1}{1+\theta} \le v < 1,\ 0 < u < \dfrac{(1+\theta)(1-v)}{\theta},\\[2ex]
v\,u^{1+\theta}, & 0 < u < 1,\ 0 < v < \dfrac{1}{1+\theta},
\end{cases}
\]

and

\[
c_{\theta}(u,v) =
\begin{cases}
\dfrac{\theta^{1+\theta}}{(1+\theta)^{\theta}}\,(1-u)^{\theta}\, v^{-(1+\theta)}, & 0 < v \le \dfrac{\theta}{1+\theta},\ 1 - \dfrac{(1+\theta)v}{\theta} < u < 1,\\[2ex]
(1+\theta)(1-u)^{\theta}, & 0 < u < 1,\ \dfrac{\theta}{1+\theta} < v < 1,
\end{cases}
\tag{4}
\]

respectively. The conditional copula of $U$ given $V = v$ is as follows. For $0 < v \le \frac{\theta}{1+\theta}$,

\[
C_\theta(u \mid v) = 1 - \frac{\theta^{1+\theta}}{(1+\theta)^{1+\theta}}\,(1-u)^{1+\theta}\, v^{-(1+\theta)}, \qquad 1 - \frac{(1+\theta)v}{\theta} < u < 1,
\tag{5}
\]

whereas for $\frac{\theta}{1+\theta} < v < 1$,

\[
C_\theta(u \mid v) = 1 - (1-u)^{1+\theta}, \qquad 0 < u < 1.
\tag{6}
\]

The conditional mean and variance of $U \mid V = v$ are given by

\[
E[U \mid V = v] =
\begin{cases}
1 - \dfrac{(1+\theta)^{2}\, v}{\theta(\theta+2)}, & 0 < v \le \dfrac{\theta}{1+\theta},\\[2ex]
\dfrac{1}{\theta+2}, & \dfrac{\theta}{1+\theta} < v < 1,
\end{cases}
\]

and

\[
\mathrm{Var}[U \mid V = v] =
\begin{cases}
\dfrac{(1+\theta)^{3}\, v^{2}}{\theta^{2}(\theta+2)^{2}(\theta+3)}, & 0 < v \le \dfrac{\theta}{1+\theta},\\[2ex]
\dfrac{\theta+1}{(\theta+2)^{2}(\theta+3)}, & \dfrac{\theta}{1+\theta} < v < 1,
\end{cases}
\]

respectively.

Remark 2.1. The regression of $U$ on $V = v$ is linearly decreasing in $v$ for $0 < v \le \frac{\theta}{\theta+1}$, and independent of $v$ for $\frac{\theta}{\theta+1} < v < 1$. Also, it is interesting to note that the conditional variance of $U \mid V = v$ is a nondecreasing function of $v$ and bounded from above by $\frac{\theta+1}{(\theta+2)^{2}(\theta+3)}$.

The conditional copula of $V$ given $U = u$ is given by

\[
C_\theta(v \mid u) =
\begin{cases}
1 - \dfrac{\theta^{\theta}}{(1+\theta)^{\theta}}\,(1-u)^{\theta}\, v^{-\theta}, & \dfrac{(1-u)\theta}{1+\theta} < v \le \dfrac{\theta}{1+\theta},\\[2ex]
1 - (1+\theta)(1-v)(1-u)^{\theta}, & \dfrac{\theta}{1+\theta} < v < 1.
\end{cases}
\tag{7}
\]

The conditional mean and variance of $V \mid U = u$ are given by

\[
E[V \mid U = u] = \frac{(1-u)^{\theta}}{2(1-\theta)} - \frac{\theta^{2}(1-u)}{1-\theta^{2}}, \qquad \text{for } \theta \ne 1,
\]

and, writing the variance as $E[V^2 \mid U = u] - (E[V \mid U = u])^2$,

\[
\mathrm{Var}[V \mid U = u] = \frac{(2+3\theta)(1-u)^{\theta}}{3(1+\theta)(2-\theta)} - \frac{\theta^{3}(1-u)^{2}}{(1+\theta)^{2}(2-\theta)} - \left[\frac{(1-u)^{\theta}}{2(1-\theta)} - \frac{\theta^{2}(1-u)}{1-\theta^{2}}\right]^{2}, \qquad \text{for } \theta \ne 1, 2,
\]

respectively.
Remark 2.2. The regression of $V$ on $U = u$ is strictly decreasing in $u$.

One can use the conditional copula of $U$ given $V = v$, provided in (5) and (6), to simulate from the proposed copula $C_\theta$, given by (3), using the following steps.

Step I. Simulate $v_i$ and $u_i^{*}$ independently from the standard uniform distribution.

Step II. If $v_i \le \frac{\theta}{\theta+1}$, then solving $C_\theta(u \mid v_i) = u_i^{*}$ from (5), we get $u_i = 1 - \left(\frac{\theta+1}{\theta}\right) v_i\,(1 - u_i^{*})^{1/(1+\theta)}$; else, solving $C_\theta(u \mid v_i) = u_i^{*}$ from (6), we get $u_i = 1 - (1 - u_i^{*})^{1/(1+\theta)}$.

Step III. Repeat Step I and Step II $n$ times to obtain independent and identically distributed realizations $(u_i, v_i)$, for $i = 1, 2, \ldots, n$, from $C_\theta$.

A similar algorithm can be elaborated to simulate from $C_\theta$ based on the conditional copula of $V$ given $U$, provided in (7). Scatter plots based on 500 simulated observations using the aforementioned algorithm for four different values of $\theta$ are given in Figure 2. As expected, the data points get closer to the diagonal $v = 1 - u$ for higher values of $\theta$.
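Steps I to III above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `simulate_copula`, the use of NumPy and the `seed` argument are assumptions introduced here, and Step III is realised by vectorising over the $n$ draws.

```python
import numpy as np

def simulate_copula(theta, n, seed=None):
    """Draw n pairs (u, v) by conditional inversion of U given V (Steps I-III).

    Hypothetical helper written for illustration; theta > 0 is assumed.
    """
    rng = np.random.default_rng(seed)
    # Step I: independent standard uniforms v_i and u*_i.
    v = rng.uniform(size=n)
    u_star = rng.uniform(size=n)
    # Step II: invert (5) when v <= theta/(1+theta), otherwise invert (6).
    root = (1.0 - u_star) ** (1.0 / (1.0 + theta))
    scale = np.where(v <= theta / (1.0 + theta), (1.0 + theta) * v / theta, 1.0)
    u = 1.0 - scale * root
    # Step III: the n repetitions are handled by the array operations above.
    return u, v

u, v = simulate_copula(theta=5.0, n=500, seed=1)
print(np.corrcoef(u, v)[0, 1])  # sample correlation of the simulated pairs
```

Because both margins are uniform, the sample correlation of the simulated pairs is a rough estimate of Spearman's rho of the copula.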
Figure 2: Scatter plots based on 500 simulated observations from $C_\theta$ for different choices of $\theta$ (panels (a) to (d), with $\theta = 1$, $5$ and $10$ in panels (b), (c) and (d)).

2.2 Basic Properties

Proposition 2.1.
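The copula $C_\theta$, defined in (3), is decreasing with respect to its dependence parameter $\theta$, i.e., if $\theta_1 \le \theta_2$ then $C_{\theta_2}(u,v) \le C_{\theta_1}(u,v)$, for all $(u,v) \in I^2 = [0,1] \times [0,1]$.

Proof. Case I. For $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, we have

\[
\begin{aligned}
\frac{\partial C_\theta}{\partial \theta}
&= \frac{\theta^{\theta}}{(1+\theta)^{1+\theta}}\,(1-u)^{1+\theta}\, v^{-\theta}\left[\log\left(\frac{\theta}{1+\theta}\right) + \log(1-u) - \log(v)\right]\\
&\le \frac{\theta^{\theta}}{(1+\theta)^{1+\theta}}\,(1-u)^{1+\theta}\, v^{-\theta}\left[\log\left(\frac{\theta}{1+\theta}\right) + \log\left[\frac{(1+\theta)v}{\theta}\right] - \log(v)\right], \quad \text{since } (1-u) \le \frac{(1+\theta)v}{\theta},\\
&= 0.
\end{aligned}
\]

Case II. For $0 < u < 1$ and $\frac{\theta}{1+\theta} < v < 1$, we have

\[
\frac{\partial C_\theta}{\partial \theta} = (1-u)^{1+\theta}(1-v)\log(1-u) \le 0.
\]

Now, combining Case I and II, we have $\frac{\partial C_\theta}{\partial \theta} \le 0$ for all $(u,v) \in I^2$, which implies that $C_\theta$ is decreasing in $\theta$.

Proposition 2.2. The copula $C_\theta$, defined in (3), is sub-harmonic, i.e., $\nabla^2 C_\theta(u,v) \ge 0$.

Proof. Case I. For $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, we have

\[
\nabla^2 C_\theta(u,v) = \frac{\partial^2 C_\theta(u,v)}{\partial u^2} + \frac{\partial^2 C_\theta(u,v)}{\partial v^2}
= \frac{\theta^{1+\theta}}{(1+\theta)^{\theta}}\left[(1-u)^{\theta-1}\, v^{-\theta} + (1-u)^{1+\theta}\, v^{-(2+\theta)}\right] \ge 0.
\]

Case II. For $0 < u < 1$ and $\frac{\theta}{1+\theta} < v < 1$, we have

\[
\nabla^2 C_\theta(u,v) = \frac{\partial^2 C_\theta(u,v)}{\partial u^2} + \frac{\partial^2 C_\theta(u,v)}{\partial v^2}
= \theta(1+\theta)(1-u)^{\theta-1}(1-v) \ge 0.
\]

Therefore $\nabla^2 C_\theta(u,v) \ge 0$ for all $(u,v) \in I^2$, and hence the result follows.

Proposition 2.3. The copula $C_\theta$, defined in (3), is absolutely continuous.

Proof. To establish the absolute continuity of the proposed copula $C_\theta$, it is required to show that

\[
\int_0^u \int_0^v \frac{\partial^2}{\partial s\, \partial t} C_\theta(s,t)\, dt\, ds = C_\theta(u,v), \qquad \text{for every } (u,v) \in I^2.
\]

Case I. For $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, we have

\[
\begin{aligned}
\int_0^u \int_0^v \frac{\partial^2}{\partial s\, \partial t} C_\theta(s,t)\, dt\, ds
&= \int_{1 - \frac{(1+\theta)v}{\theta}}^{u} \int_{\frac{\theta(1-s)}{1+\theta}}^{v} \frac{\theta^{1+\theta}}{(1+\theta)^{\theta}}\,(1-s)^{\theta}\, t^{-(1+\theta)}\, dt\, ds\\
&= \int_{1 - \frac{(1+\theta)v}{\theta}}^{u} \left[1 - \left(\frac{\theta}{1+\theta}\right)^{\theta} (1-s)^{\theta}\, v^{-\theta}\right] ds\\
&= \int_{1-u}^{\frac{(1+\theta)v}{\theta}} \left[1 - \left(\frac{\theta}{1+\theta}\right)^{\theta} z^{\theta}\, v^{-\theta}\right] dz \qquad (\text{where } z = 1 - s)\\
&= v - (1-u) + \frac{\theta^{\theta}}{(1+\theta)^{1+\theta}}\,(1-u)^{1+\theta}\, v^{-\theta}
= C_\theta(u,v).
\end{aligned}
\]

Case II. For $0 < u < 1$ and $\frac{\theta}{1+\theta} < v < 1$, we have

\[
\begin{aligned}
\int_0^u \int_0^v \frac{\partial^2}{\partial s\, \partial t} C_\theta(s,t)\, dt\, ds
&= \int_0^u \int_{\frac{\theta(1-s)}{1+\theta}}^{\frac{\theta}{1+\theta}} \frac{\theta^{1+\theta}}{(1+\theta)^{\theta}}\,(1-s)^{\theta}\, t^{-(1+\theta)}\, dt\, ds + \int_0^u \int_{\frac{\theta}{1+\theta}}^{v} (1+\theta)(1-s)^{\theta}\, dt\, ds\\
&= \int_0^u \left[1 - (1-s)^{\theta}\right] ds + \int_0^u \left[v - \theta(1-v)\right](1-s)^{\theta}\, ds\\
&= u - \frac{1}{1+\theta} + \frac{(1-u)^{\theta+1}}{\theta+1} + \frac{\left[v - \theta(1-v)\right]\left[1 - (1-u)^{\theta+1}\right]}{1+\theta}\\
&= u - (1-v)\left[1 - (1-u)^{\theta+1}\right]
= C_\theta(u,v).
\end{aligned}
\]

Therefore, the result follows by combining Case I and II.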
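The identity in Proposition 2.3 can also be spot-checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: the helper names `copula_cdf`, `copula_density` and `integrate_density` are introduced here, the density is integrated with a plain midpoint rule on an `n` by `n` grid, and the agreement with (3) is therefore only approximate.

```python
import numpy as np

def copula_cdf(u, v, theta):
    """C_theta(u, v) from (3); equals 0 where u <= 1 - (1 + theta) v / theta."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    b = theta / (1.0 + theta)
    k = theta**theta / (1.0 + theta) ** (1.0 + theta)
    s = 1.0 - u
    v_safe = np.where(v > 0, v, 1.0)  # placeholder where v == 0 (masked below)
    lower = v - s + k * s ** (1.0 + theta) * v_safe ** (-theta)
    upper = u - (1.0 - v) * (1.0 - s ** (1.0 + theta))
    in_support = s < v / b
    return np.where(v > b, upper, np.where(in_support, lower, 0.0))

def copula_density(u, v, theta):
    """c_theta(u, v) from (4); equals 0 outside the support."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    b = theta / (1.0 + theta)
    s = 1.0 - u
    v_safe = np.where(v > 0, v, 1.0)
    lower = theta ** (1.0 + theta) / (1.0 + theta) ** theta * s**theta * v_safe ** (-(1.0 + theta))
    upper = (1.0 + theta) * s**theta
    in_support = s < v / b
    return np.where(v > b, upper, np.where(in_support, lower, 0.0))

def integrate_density(u, v, theta, n=800):
    """Midpoint-rule approximation of the double integral of c_theta over [0,u] x [0,v]."""
    s = (np.arange(n) + 0.5) * u / n
    t = (np.arange(n) + 0.5) * v / n
    S, T = np.meshgrid(s, t, indexing="ij")
    return copula_density(S, T, theta).sum() * (u / n) * (v / n)

# One point from each case of the proof, with theta = 2.
for point in [(0.6, 0.5), (0.5, 0.9)]:
    print(point, integrate_density(*point, 2.0), float(copula_cdf(*point, 2.0)))
```

The density is discontinuous along the edge of its support, so the midpoint rule converges slowly there; a finer grid tightens the agreement.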
Measures of dependence are commonly used to summarize the complicated dependence structure of bivariate distributions. See Joe (1997) and Nelsen (2006) for a detailed review of measures of dependence and associated properties. In this section, we derive the expressions for the Kendall's tau and the Spearman's rho based on the proposed copula $C_\theta$. Essentially, these coefficients measure the correlation between the ranks rather than the actual values of $X$ and $Y$. Therefore, these coefficients are unaffected by any monotonically increasing transformation of $X$ and $Y$.

Definition 2.1. Let $X$ and $Y$ be continuous random variables with the dependence structure described by the copula $C$. Then the population version of the Spearman's rho for $X$ and $Y$ is given by

\[
\rho := 12 \iint_{I^2} u v \, dC(u,v) - 3 = 12 \iint_{I^2} C(u,v)\, du\, dv - 3.
\]

Proposition 2.4. Let $(X, Y)$ be a random pair with copula $C_\theta$. The Spearman's rho is given by

\[
\rho = \frac{2(3 + 3\theta + \theta^{2})}{2 + 3\theta + \theta^{2}} - 3,
\]

which is a decreasing function of $\theta$ and takes any value between $0$ and $-1$.

Definition 2.2. Let $X$ and $Y$ be continuous random variables with copula $C$. Then the population version of the Kendall's tau for $X$ and $Y$ is given by

\[
\tau := 4 \iint_{I^2} C(u,v)\, dC(u,v) - 1.
\]

Proposition 2.5. Let $(X, Y)$ be a random pair with copula $C_\theta$. Then the Kendall's tau is given by

\[
\tau = -\frac{\theta}{1+\theta},
\]

which is a decreasing function of $\theta$ and takes any value between $0$ and $-1$.

In Figure 3, we have plotted the Spearman's rho and the Kendall's tau against the dependence parameter $\theta$. It is easy to see that the Spearman's rho is less than the Kendall's tau for all $\theta > 0$.

Figure 3: Plot of Spearman's rho and Kendall's tau against the dependence parameter $\theta$.

As discussed in Subsection 2.3, the Spearman's rho and the Kendall's tau measure the correlation between two random variables. However, it is possible, in principle, that two random variables are strongly correlated yet only weakly associated with respect to other notions of dependence, or vice versa. Therefore, in this section, we discuss several relevant notions of negative dependence, namely Quadrant Dependence, Regression Dependence and Likelihood Ratio Dependence, etc., and explore relevant connections for the proposed copula. First, we provide the definitions of the aforementioned dependence notions, as discussed in Nelsen (2006) and Balakrishnan and Lai (2009).
Definition 3.1. Let $X$ and $Y$ be continuous random variables with copula $C$. Then

1. $X$ and $Y$ are Negatively Quadrant Dependent (NQD) if $P(X \le x, Y \le y) \le P(X \le x) P(Y \le y)$ for all $(x, y) \in R$, where $R$ is the domain of the joint distribution of $X$ and $Y$, or equivalently a copula $C$ is said to be NQD if $C(u,v) \le uv$ for all $(u,v) \in I^2$.
2. $Y$ is left tail increasing in $X$ (LTI($Y|X$)) if $P[Y \le y \mid X \le x]$ is a nondecreasing function of $x$ for all $y$.
3. $X$ is left tail increasing in $Y$ (LTI($X|Y$)) if $P[X \le x \mid Y \le y]$ is a nondecreasing function of $y$ for all $x$.
4. $Y$ is right tail decreasing in $X$ (RTD($Y|X$)) if $P[Y > y \mid X > x]$ is a nonincreasing function of $x$ for all $y$.
5. $X$ is right tail decreasing in $Y$ (RTD($X|Y$)) if $P[X > x \mid Y > y]$ is a nonincreasing function of $y$ for all $x$.
6. $Y$ is stochastically decreasing in $X$, denoted as SD($Y|X$) (also known as negatively regression dependent ($Y|X$)), if $P[Y > y \mid X = x]$ is a nonincreasing function of $x$ for all $y$.
7. $X$ is stochastically decreasing in $Y$, denoted as SD($X|Y$) (also known as negatively regression dependent ($X|Y$)), if $P[X > x \mid Y = y]$ is a nonincreasing function of $y$ for all $x$.
8. Let $X$ and $Y$ be continuous random variables with joint density function $h(x,y)$. Then $X$ and $Y$ are negatively likelihood ratio dependent, denoted by NLR($X,Y$), if $h(x_1,y_1)\, h(x_2,y_2) \le h(x_1,y_2)\, h(x_2,y_1)$ for all $x_1, x_2, y_1, y_2 \in I$ such that $x_1 \le x_2$ and $y_1 \le y_2$.

Theorem 3.1. Let $X$ and $Y$ be two random variables with copula $C_\theta$. Then (i) $X$ and $Y$ are LTI($Y|X$), (ii) $X$ and $Y$ are LTI($X|Y$), (iii) $X$ and $Y$ are RTD($Y|X$), and (iv) $X$ and $Y$ are RTD($X|Y$).

Proof. (i) To establish LTI($Y|X$), it is sufficient to show that, for any $v$ in $I$, $\frac{C(u,v)}{u}$ is nondecreasing in $u$ (Nelsen, 2006, Theorem 5.2.5, p-192). For $0 < u < 1$ and $\frac{\theta}{1+\theta} < v < 1$, we have

\[
\frac{\partial}{\partial u}\left[\frac{C(u,v)}{u}\right] = \frac{(1-v)\left[1 - (1-u)^{\theta}(1+\theta u)\right]}{u^{2}}.
\]

Now we need to prove that $1 - (1-u)^{\theta}(1+\theta u) > 0$. Define $h(u) := (1-u)^{\theta}(1+\theta u)$. Observe that $h(0) = 1$, $h(1) = 0$, and $h(u)$ is a decreasing function of $u$, since $h'(u) = -\theta(1+\theta)\, u\, (1-u)^{\theta-1} < 0$ for $u \in (0,1)$. Therefore $\frac{\partial}{\partial u}\left[\frac{C(u,v)}{u}\right] > 0$. Again, for $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, it can be shown that

\[
\frac{\partial}{\partial u}\left[\frac{C(u,v)}{u}\right] = \frac{v^{\theta} - v^{1+\theta} - \frac{\theta^{\theta}}{(1+\theta)^{1+\theta}}\,(1+\theta u)(1-u)^{\theta}}{u^{2}\, v^{\theta}} > 0.
\]

Hence, the result follows.

(ii) In view of Theorem 5.2.5 in Nelsen (2006, p-192), the necessary and sufficient condition for LTI($X|Y$) is that $\frac{C(u,v)}{v}$ is nondecreasing in $v$, for any $u$ in $I$. For $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, we have

\[
\frac{\partial}{\partial v}\left[\frac{C(u,v)}{v}\right] = \frac{(u-1)\left[\frac{\theta^{\theta}}{(1+\theta)^{\theta}}\,(1-u)^{\theta} - v^{\theta}\right]}{v^{\theta+2}} \ge 0,
\]

since $(u-1) < 0$ and $\left[\frac{\theta^{\theta}}{(1+\theta)^{\theta}}\,(1-u)^{\theta} - v^{\theta}\right] < 0$. For $0 < u < 1$ and $\frac{\theta}{1+\theta} < v < 1$, we have

\[
\frac{\partial}{\partial v}\left[\frac{C(u,v)}{v}\right] = \frac{(u-1)\left[(1-u)^{\theta} - 1\right]}{v^{2}} \ge 0.
\]

Hence, the result follows.

(iii) To establish RTD($Y|X$), it is sufficient to show that $\frac{v - C(u,v)}{1-u}$ is a nondecreasing function of $u$ for any $v \in I$ (Nelsen, 2006, Theorem 5.2.5, p-192). For $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, we have

\[
\frac{\partial}{\partial u}\left[\frac{v - C(u,v)}{1-u}\right] = \left(\frac{\theta}{1+\theta}\right)^{1+\theta} (1-u)^{\theta-1}\, v^{-\theta} > 0.
\]

Similarly, for $0 < u < 1$ and $\frac{\theta}{1+\theta} < v < 1$, we have

\[
\frac{\partial}{\partial u}\left[\frac{v - C(u,v)}{1-u}\right] = \theta(1-v)(1-u)^{\theta-1} > 0.
\]

Hence, the conclusion follows.

(iv) By Theorem 5.2.5 in Nelsen (2006, p-192), RTD($X|Y$) holds if $\frac{u - C(u,v)}{1-v}$ is a nondecreasing function of $v$ for any $u \in I$. For $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, we have

\[
\frac{\partial}{\partial v}\left[\frac{u - C(u,v)}{1-v}\right] = \frac{\theta^{\theta}}{(1+\theta)^{1+\theta}}\,\frac{(1-u)^{1+\theta}\, v^{-(1+\theta)}\left[\theta(1-v) - v\right]}{(1-v)^{2}},
\]

which is non-negative, since $v \le \frac{\theta}{1+\theta}$. Similarly, for any fixed $u \in I$ and $\frac{\theta}{1+\theta} < v < 1$, $\frac{u - C(u,v)}{1-v} = 1 - (1-u)^{1+\theta}$ is a constant function of $v$. Hence the result follows.

Theorem 3.2. Let $X$ and $Y$ be two random variables with copula $C_\theta$. Then (i) $X$ and $Y$ are SD($Y|X$), and (ii) $X$ and $Y$ are SD($X|Y$).

Proof. To establish the SD($Y|X$) property of the proposed copula $C_\theta$, we utilise the geometric interpretation of stochastic monotonicity given in Corollary 5.2.11 of Nelsen (2006, p-197). Therefore, it is sufficient to show that $C_\theta(u,v)$ is a convex function of $u$. Similarly, SD($X|Y$) can be established by showing that $C_\theta(u,v)$ is a convex function of $v$.

(i) For $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, we have

\[
\frac{\partial^2}{\partial u^2} C_\theta(u,v) = \frac{\theta^{1+\theta}}{(1+\theta)^{\theta}}\,(1-u)^{\theta-1}\, v^{-\theta} > 0.
\]

For $0 < u < 1$ and $\frac{\theta}{1+\theta} < v < 1$, we have

\[
\frac{\partial^2}{\partial u^2} C_\theta(u,v) = \theta(1+\theta)(1-v)(1-u)^{\theta-1} > 0.
\]

Hence $C_\theta(u,v)$ is a convex function of $u$.

(ii) For $0 < v \le \frac{\theta}{1+\theta}$ and $1 - \frac{(1+\theta)v}{\theta} < u < 1$, we have

\[
\frac{\partial^2}{\partial v^2} C_\theta(u,v) = \frac{\theta^{1+\theta}}{(1+\theta)^{\theta}}\,(1-u)^{1+\theta}\, v^{-(2+\theta)} > 0.
\]

Note that, for any fixed $u \in I$ and $\frac{\theta}{1+\theta} < v < 1$, $\frac{\partial}{\partial v} C_\theta(u,v)$ is a constant function of $v$. Hence, the result follows.

Theorem 3.3. Let $X$ and $Y$ be two random variables with copula $C_\theta$. Then $X$ and $Y$ are NLR.

Proof. To establish the NLR property between $X$ and $Y$ with copula $C_\theta$, we need to show that $c_\theta(u_1,v_1)\, c_\theta(u_2,v_2) \le c_\theta(u_1,v_2)\, c_\theta(u_2,v_1)$ holds for all $u_1 \le u_2$ and $v_1 \le v_2$, where $c_\theta(u,v)$ is the copula density given in (4). Note that for the proposed copula $C_\theta$, the aforementioned condition holds with equality for all $u_1 \le u_2$ and $v_1 \le v_2$ in $I$.

Remark 3.1. Two random variables $X$ and $Y$ with the copula $C_\theta$ are NQD. This directly follows from Theorem 3.3. See the interrelationships between different concepts of negative dependence summarised in Balakrishnan and Lai (2009, p-130) for details.

In Section 3, several negative dependence properties of the proposed copula $C_\theta$ have been investigated for fixed $\theta > 0$. In this section, we discuss the ordering properties of the copula $C_\theta$, which provide a precise (and also intuitively expected) notion for one bivariate distribution to be more positively or negatively associated than another. For this purpose, we first recall the definitions of the dependence orderings for bivariate distributions. These definitions describe the strength of dependence of a copula with respect to its dependence parameter $\theta$. Lehmann (1966) was the first to introduce the NQD and NRD notions. The corresponding orderings were then defined by Yanagimoto and Okamoto (1969) as follows.

Definition 4.1. Let $F$ and $G$ be two bivariate distributions with the same marginals. Then $F$ is said to be smaller than $G$ in the NQD sense, denoted as $F \prec_{NQD} G$, if $F(x,y) \ge G(x,y)$ for all $x$ and $y$.

Definition 4.2. Let $F$ and $G$ be two bivariate distributions with the same marginals, and let $(U,V)$ and $(X,Y)$ be two random vectors having the distributions $F$ and $G$, respectively. Then $F$ is said to be smaller than $G$ in the NRD sense, denoted by $F \prec_{NRD} G$ or $(U,V) \prec_{NRD} (X,Y)$, if, for any $x \le x'$,

\[
F^{-1}_{V|U}(u \mid x) \ge F^{-1}_{V|U}(v \mid x') \implies G^{-1}_{Y|X}(u \mid x) \ge G^{-1}_{Y|X}(v \mid x')
\]

for any $u, v \in I$, where $F_{V|U}$ denotes the conditional distribution of $V$ given $U = u$ and $F^{-1}_{V|U}$ denotes its right-continuous inverse. Equivalently, $F \prec_{NRD} G$ if and only if $G^{-1}_{Y|X}\left[F_{V|U}(y \mid x) \mid x\right]$ is decreasing in $x$ for all $y$ (Fang and Joe, 1992).

Later, Kimeldorf and Sampson (1987) introduced and studied in detail the notion of the Negatively Likelihood Ratio (NLR) dependence ordering that is described in the following definition. Let the random variables $X$ and $Y$ have the joint distribution $G(x,y)$. For any two intervals $I_1$ and $I_2$ of the real line, let us denote $I_1 \le I_2$ if $x_1 \in I_1$ and $x_2 \in I_2$ imply that $x_1 \le x_2$. For any two intervals $I$ and $J$ of the real line, let $G(I,J)$ represent the probability assigned by $G$ to the rectangle $I \times J$.

Definition 4.3. Let $F$ and $G$ be two bivariate distributions with the same marginals, and let $(U,V)$ and $(X,Y)$ be two random vectors having the distributions $F$ and $G$, respectively. Then $F$ is said to be smaller than $G$ in the NLR dependence sense, denoted by $F \prec_{NLR} G$ or $(U,V) \prec_{NLR} (X,Y)$, if

\[
F(I_1,J_1)\, F(I_2,J_2)\, G(I_1,J_2)\, G(I_2,J_1) \ge F(I_1,J_2)\, F(I_2,J_1)\, G(I_1,J_1)\, G(I_2,J_2)
\]

whenever $I_1 \le I_2$ and $J_1 \le J_2$. When the densities of $F$ and $G$ exist and are denoted by $f$ and $g$, respectively, the aforementioned condition is equivalently written as

\[
f(x_1,y_1)\, f(x_2,y_2)\, g(x_1,y_2)\, g(x_2,y_1) \ge f(x_1,y_2)\, f(x_2,y_1)\, g(x_1,y_1)\, g(x_2,y_2)
\]

whenever $x_1 \le x_2$ and $y_1 \le y_2$.

Theorem 4.1. If $\theta_1 \le \theta_2$, then $C_{\theta_1}(u,v) \prec_{NQD} C_{\theta_2}(u,v)$.

Proof. The result follows directly from Proposition 2.1.
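A visible consequence of the pointwise ordering in Theorem 4.1 is that both rank correlations decrease as $\theta$ grows, since Spearman's rho and Kendall's tau are monotone with respect to the pointwise order of copulas. The short sketch below simply tabulates the closed forms of Propositions 2.4 and 2.5; the function names `spearman_rho` and `kendall_tau` and the grid of $\theta$ values are assumptions made for illustration.

```python
def spearman_rho(theta):
    """Spearman's rho of C_theta (Proposition 2.4)."""
    return 2.0 * (3.0 + 3.0 * theta + theta**2) / (2.0 + 3.0 * theta + theta**2) - 3.0

def kendall_tau(theta):
    """Kendall's tau of C_theta (Proposition 2.5)."""
    return -theta / (1.0 + theta)

# Illustrative grid of dependence parameters (chosen here, not taken from the paper).
thetas = [0.1, 0.5, 1.0, 5.0, 10.0, 100.0]
rhos = [spearman_rho(t) for t in thetas]
taus = [kendall_tau(t) for t in thetas]

for t, r, k in zip(thetas, rhos, taus):
    print(f"theta={t:6.1f}  rho={r:.4f}  tau={k:.4f}")
```

Both columns move from values near $0$ towards $-1$, with rho below tau at every $\theta$ on the grid.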
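Theorem 4.2. If $\theta_1 \le \theta_2$, then $C_{\theta_1}(u,v) \prec_{NRD} C_{\theta_2}(u,v)$.

Proof. Let $\theta_1 \le \theta_2$. The conditional copula of $V$ given $U = u$ is given by

\[
C_\theta(v \mid u) =
\begin{cases}
1 - \dfrac{\theta^{\theta}}{(1+\theta)^{\theta}}\,(1-u)^{\theta}\, v^{-\theta}, & \dfrac{(1-u)\theta}{1+\theta} < v \le \dfrac{\theta}{1+\theta},\\[2ex]
1 - (1+\theta)(1-v)(1-u)^{\theta}, & \dfrac{\theta}{1+\theta} < v < 1.
\end{cases}
\]

Inverting $C_{\theta_2}(\cdot \mid u)$ at the level $C_{\theta_1}(v \mid u)$, and noting that this level stays below $1 - (1-u)^{\theta_2}$ exactly when $v \le 1 - \frac{(1-u)^{\theta_2-\theta_1}}{1+\theta_1}$, we obtain

\[
C^{-1}_{\theta_2}\left(C_{\theta_1}(v \mid u) \mid u\right) =
\begin{cases}
\dfrac{\theta_2\,(1+\theta_1)^{\theta_1/\theta_2}}{(1+\theta_2)\,\theta_1^{\theta_1/\theta_2}}\,(1-u)^{1-\theta_1/\theta_2}\, v^{\theta_1/\theta_2}, & \dfrac{(1-u)\theta_1}{1+\theta_1} < v \le \dfrac{\theta_1}{1+\theta_1},\\[2ex]
\dfrac{\theta_2}{1+\theta_2}\left[(1+\theta_1)(1-v)\right]^{-1/\theta_2}(1-u)^{1-\theta_1/\theta_2}, & \dfrac{\theta_1}{1+\theta_1} < v \le 1 - \dfrac{(1-u)^{\theta_2-\theta_1}}{1+\theta_1},\\[2ex]
1 - \dfrac{1+\theta_1}{1+\theta_2}\,(1-v)(1-u)^{\theta_1-\theta_2}, & 1 - \dfrac{(1-u)^{\theta_2-\theta_1}}{1+\theta_1} < v < 1.
\end{cases}
\]

Note that each branch of $C^{-1}_{\theta_2}\left(C_{\theta_1}(v \mid u) \mid u\right)$ is a decreasing function of $u$, as $\theta_1 \le \theta_2$. Now, using Definition 4.2, the result follows.

Theorem 4.3. If $\theta_1 \le \theta_2$, then $C_{\theta_1}(u,v) \prec_{NLR} C_{\theta_2}(u,v)$.

Proof. Let $\theta_1 \le \theta_2$. Now, it is easy to verify that the condition provided in Definition 4.3 holds for any choice of $u_1, u_2, v_1, v_2$, where $u_1 \le u_2$ and $v_1 \le v_2$.

Traditionally, bivariate life distributions available in the literature are positively correlated (Balakrishnan and Lai, 2009). However, in many real life scenarios, paired observations of non-negative variables are negatively correlated (Bhuyan et al., 2020). For example, the rainfall intensity and the duration are jointly modeled incorporating their negative dependence for the study of the corresponding flood frequency distribution (Kurothe et al., 1997). Gumbel (1960) and Freund (1961) have proposed bivariate Exponential distributions with the lower bound of the correlation coefficient as $-0.4$. In this section, several specific families of bivariate distributions are generated using the proposed copula (3). For the baseline distributions, we consider the Weibull and Gamma distributions. It should be noted that the resulting bivariate Weibull and bivariate Gamma distributions satisfy all the notions of negative dependence discussed in Sections 3 and 4.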
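The construction described in this paragraph, plugging chosen marginals into the copula (3), can be illustrated by sampling. The sketch below is a hedged illustration rather than the authors' procedure: it draws from the copula by conditional inversion of $U$ given $V$ and then applies the Weibull quantile function $F^{-1}(p) = [-\log(1-p)]^{1/\delta}/\lambda$ to each coordinate. The function name `sample_bivariate_weibull` and all parameter values are assumptions introduced here.

```python
import numpy as np

def sample_bivariate_weibull(theta, lam1, delta1, lam2, delta2, n, seed=None):
    """Draw n pairs (x, y) with Weibull margins coupled by C_theta.

    Hypothetical helper: margins are F(x) = 1 - exp(-(lam1 x)^delta1) and
    G(y) = 1 - exp(-(lam2 y)^delta2).
    """
    rng = np.random.default_rng(seed)
    v = rng.uniform(size=n)
    u_star = rng.uniform(size=n)
    b = theta / (1.0 + theta)
    # 1 - u obtained by inverting the conditional copula of U given V = v.
    one_minus_u = np.where(v <= b, v / b, 1.0) * (1.0 - u_star) ** (1.0 / (1.0 + theta))
    # Weibull quantile transform applied to u and v.
    x = (-np.log(one_minus_u)) ** (1.0 / delta1) / lam1
    y = (-np.log1p(-v)) ** (1.0 / delta2) / lam2
    return x, y

# Exponential margins (delta1 = delta2 = 1) as a simple special case.
x, y = sample_bivariate_weibull(2.0, 1.0, 1.0, 2.0, 1.0, n=100000, seed=0)
print(x.mean(), y.mean(), np.corrcoef(x, y)[0, 1])
```

Any other pair of continuous marginals could be substituted by replacing the two quantile transforms.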
Example 5.1. Bivariate Weibull distribution: A family of bivariate Weibull distributions based on the proposed copula $C_\theta$, with marginals $F(x) = \left[1 - e^{-(\lambda_1 x)^{\delta_1}}\right] \mathbb{1}(x > 0)$ and $G(y) = \left[1 - e^{-(\lambda_2 y)^{\delta_2}}\right] \mathbb{1}(y > 0)$, has the joint density

\[
h(x,y) =
\begin{cases}
\dfrac{\delta_1 \delta_2 \lambda_1^{\delta_1} \lambda_2^{\delta_2}\, \theta^{\theta+1}}{(1+\theta)^{\theta}}\, x^{\delta_1-1} y^{\delta_2-1}\, e^{-(\lambda_2 y)^{\delta_2}} \left(\dfrac{e^{-(\lambda_1 x)^{\delta_1}}}{1 - e^{-(\lambda_2 y)^{\delta_2}}}\right)^{1+\theta}, & 0 < y \le \phi_1,\ x > \phi_2(y),\\[2ex]
\delta_1 \delta_2 \lambda_1^{\delta_1} \lambda_2^{\delta_2}\,(1+\theta)\, x^{\delta_1-1} y^{\delta_2-1}\, e^{-(\lambda_2 y)^{\delta_2}} \left(e^{-(\lambda_1 x)^{\delta_1}}\right)^{1+\theta}, & x > 0,\ y > \phi_1,
\end{cases}
\]

where

\[
\phi_1 = \frac{1}{\lambda_2}\left[\log(1+\theta)\right]^{1/\delta_2}, \qquad
\phi_2(y) = \frac{1}{\lambda_1}\left[\log\left(\frac{\theta}{(1+\theta)\left(1 - e^{-(\lambda_2 y)^{\delta_2}}\right)}\right)\right]^{1/\delta_1},
\]

$\lambda_i > 0$ and $\delta_i > 0$ for $i = 1, 2$.

Example 5.2. Bivariate Gamma distribution: A family of bivariate Gamma distributions based on the proposed copula $C_\theta$, with marginals $F(x) = \left[\int_0^x \frac{\beta_1^{\alpha_1}}{\Gamma(\alpha_1)}\, t^{\alpha_1-1} e^{-\beta_1 t}\, dt\right] \mathbb{1}(x > 0)$ and $G(y) = \left[\int_0^y \frac{\beta_2^{\alpha_2}}{\Gamma(\alpha_2)}\, t^{\alpha_2-1} e^{-\beta_2 t}\, dt\right] \mathbb{1}(y > 0)$, has the joint density

\[
h(x,y) =
\begin{cases}
\dfrac{\beta_1^{\alpha_1} \beta_2^{\alpha_2}\, \theta^{1+\theta}\, x^{\alpha_1-1} y^{\alpha_2-1} e^{-(\beta_1 x + \beta_2 y)}}{\Gamma(\alpha_1)\Gamma(\alpha_2)(1+\theta)^{\theta}} \left[1 - \dfrac{\gamma(\alpha_1, \beta_1 x)}{\Gamma(\alpha_1)}\right]^{\theta} \left[\dfrac{\gamma(\alpha_2, \beta_2 y)}{\Gamma(\alpha_2)}\right]^{-(1+\theta)}, & 0 < y \le \xi_1,\ \xi_2(y) < x < \eta,\\[2ex]
\dfrac{\beta_1^{\alpha_1} \beta_2^{\alpha_2}\,(1+\theta)}{\Gamma(\alpha_1)\Gamma(\alpha_2)}\, x^{\alpha_1-1} y^{\alpha_2-1} e^{-(\beta_1 x + \beta_2 y)} \left[1 - \dfrac{\gamma(\alpha_1, \beta_1 x)}{\Gamma(\alpha_1)}\right]^{\theta}, & 0 < x < \eta,\ \zeta_1 < y < \zeta_2,
\end{cases}
\]

where $\zeta_1 = \xi_1 = \gamma_2^{-1}\left(\Gamma(\alpha_2)\frac{\theta}{1+\theta}\right)$, $\zeta_2 = \gamma_2^{-1}(\Gamma(\alpha_2))$, $\eta = \gamma_1^{-1}(\Gamma(\alpha_1))$,

\[
\xi_2(y) = \gamma_1^{-1}\left[\Gamma(\alpha_1)\left(1 - \frac{(1+\theta)\,\gamma(\alpha_2, \beta_2 y)}{\theta\,\Gamma(\alpha_2)}\right)\right],
\]

$\gamma(\alpha_i, z) = \int_0^{z} t^{\alpha_i-1} e^{-t}\, dt$ is the lower incomplete gamma function, $\gamma_i^{-1}$ is understood here as the inverse of $t \mapsto \gamma(\alpha_i, \beta_i t)$, and $\alpha_i > 0$, $\beta_i > 0$ for $i = 1, 2$.

Remark 5.1. The bivariate Weibull (in Example 5.1) and the bivariate Gamma (in Example 5.2) reduce to a bivariate Exponential distribution for $\delta_1 = \delta_2 = 1$ and $\alpha_1 = \alpha_2 = 1$, respectively.

For an illustrative data analysis based on the proposed copula, we consider a dataset on the daily air quality measurements for the New York Metropolitan Area from May 1, 1973, to September 30, 1973. Information on average wind speed (in miles per hour) and mean ozone level (in parts per billion) was obtained from the New York State Department of Conservation and the National Weather Service. See Chambers et al. (1983, Ch 2-5) for a detailed description of the data. Ozone in the upper atmosphere protects the earth from the sun's harmful rays. In the lower atmosphere, by contrast, exposure to ozone can be hazardous to both humans and some plants. Variations in weather conditions play an important role in determining the ozone levels (Khiem et al., 2010; Topcu et al., 2003). In general, the concentration of ozone is affected by wind speed. High winds tend to disperse pollutants, which in turn dilutes the concentration of ozone. However, stagnant conditions or light winds allow pollution levels to build up and, thereby, the ozone level also becomes larger. Environmental scientists and meteorologists are interested in the study of the effect of wind speed on the distribution patterns of ozone levels (Gorai et al., 2015). Based on the observed data, we find that Spearman's rho and Kendall's tau between the wind speed and the ozone levels are -0.59 and -0.43, respectively, which indicates strong negative dependence. To analyse this phenomenon, we apply a method that is close in spirit to the inference functions for margins (IFM) method. Joe (2015, Ch-5) has provided a detailed description and theoretical properties of the IFM method.
This estimation method is based on two separate maximum likelihood estimations of the univariate marginal distributions, followed by an optimization of the bivariate likelihood as a function of the dependence parameter. However, we do not maximize the bivariate likelihood. Instead, we determine the dependence parameter by equating the empirical rank correlation with the theoretical Spearman's rho. This allows the copula to adequately approximate the dependence structure of the bivariate data (Joe, 2015, p-255).

First, we consider three different models, Lognormal, Weibull, and Gamma, for estimation of the parameters associated with the marginal distributions of the wind speed and the mean ozone level. Based on the Akaike information criterion, the Gamma distribution fits both marginals better than the other choices. The maximum likelihood estimates of the shape and the scale parameters are obtained as 7.171 and 1.375, respectively, for the wind speed, and the same for the mean ozone levels are 1.7 and 24.775, respectively. The estimate of the dependence parameter is given by $\hat{\theta} = 0.$[…]

Figure 4: Estimated joint distribution function of wind speed and mean ozone.

Figure 5: Effect of wind speed on the distribution of mean ozone (curves for average wind speeds of 14.9 mph, 9.7 mph and 5.7 mph).

We construct a new flexible bivariate copula for modeling negative dependence between two random variables. Its correlation coefficient takes any value in the interval $(-1, 0)$. It complies with most of the popular notions of negative dependence available in the literature, namely quadrant dependence, regression dependence and likelihood ratio dependence, etc.

We believe that the main contribution of our paper (apart from proposing the new flexible copula that can be used in numerous applications) is the explicit study of its properties. Specifically, we show that the Spearman's rho and the Kendall's tau are decreasing functions of the dependence parameter and establish relevant ordering properties and connections with the main notions of negative dependence.

For the illustrative data analysis based on the proposed copula, we consider a dataset on daily air quality measurements for the New York Metropolitan Area. Based on the observed data, we show that the Spearman's rho and the Kendall's tau between wind speed and ozone levels are -0.59 and -0.43, respectively, which indicate strong negative dependence. It is shown that the Gamma distribution fits better for both marginals and that the distribution of the mean ozone level decreases stochastically (in the sense of the usual stochastic order) as the wind speed increases.

It would be interesting to consider in future research a semiparametric generalisation of the proposed copula and to investigate its associated properties. Another possible direction of further studies could be a multivariate extension of the proposed copula using the approaches considered by Liebscher (2008), Fischer and Köck (2012) and Mazo et al. (2015).
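The rank-based estimation step used in the data analysis, equating the empirical Spearman's rho with its theoretical value, has a closed-form solution for this copula. The expression in Proposition 2.4 simplifies to $\rho = -\theta(\theta+3)/[(\theta+1)(\theta+2)]$, so $\theta$ is the positive root of $(1+\rho)\theta^{2} + 3(1+\rho)\theta + 2\rho = 0$. The sketch below works under that algebra; the function names are assumptions introduced here, and the input $-0.59$ is the sample Spearman's rho quoted in the text.

```python
import math

def theta_from_spearman(rho):
    """Positive root of (1 + rho) t^2 + 3 (1 + rho) t + 2 rho = 0, for -1 < rho < 0."""
    if not -1.0 < rho < 0.0:
        raise ValueError("rho must lie strictly between -1 and 0")
    a = 1.0 + rho
    # Discriminant 9 a^2 - 8 a rho factorises as a (9 + rho).
    return (-3.0 * a + math.sqrt(a * (9.0 + rho))) / (2.0 * a)

def kendall_tau(theta):
    """Kendall's tau of C_theta (Proposition 2.5)."""
    return -theta / (1.0 + theta)

theta_hat = theta_from_spearman(-0.59)
print(theta_hat, kendall_tau(theta_hat))
```

Comparing the Kendall's tau implied by the fitted $\theta$ with the sample value $-0.43$ gives a quick consistency check on the one-parameter family.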
Acknowledgement
The first author sincerely acknowledges the financial support from the University of the Free State, South Africa. The second author was supported in part by the Lloyd's Register Foundation programme on data-centric engineering at the Alan Turing Institute, UK.
References
Ahn, J. Y. (2015). Negative dependence concept in copulas and the marginal free herd behavior index. Journal of Computational and Applied Mathematics, 288:304–322.
Amblard, C. and Girard, S. (2009). A new extension of bivariate FGM copulas. Metrika, 70:1–17.
Bairamov, I. and Kotz, S. (2000). Dependence structure and symmetry of Huang-Kotz FGM distributions and their extensions. Metrika, 56:55–72.
Bairamov, I. and Kotz, S. (2003). On a new family of positive quadrant dependent bivariate distributions. International Mathematical Journal, 3(11):1247–1254.
Balakrishnan, N. and Lai, C. (2009). Continuous Bivariate Distributions. Springer, New York.
Balakrishnan, N. and Ristic, M. M. (2016). Multivariate families of gamma-generated distributions with finite or infinite support above or below the diagonal. Journal of Multivariate Analysis, 143:194–207.
Bekrizadeh, H. and Jamshidi, B. (2017). A new class of bivariate copulas: dependence measures and properties. Metron, 75:31–50.
Bekrizadeh, H., Parham, G. A., and Zadkarmi, M. R. (2012). The new generalization of Farlie-Gumbel-Morgenstern copulas. Metrika, 6:3527–3533.
Bhuyan, P., Ghosh, S., Majumder, P., and Mitra, M. (2020). A bivariate life distribution and notions of negative dependence. Stat, 9(1):1–11.
Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P. A. (1983). Graphical Methods for Data Analysis. Wadsworth & Brooks.
Cooray, K. (2019). A new extension of the FGM copula for negative association. Communications in Statistics - Theory and Methods, 48(8):1902–1919.
Esary, J. D. and Lehmann, E. L. (1972). Relationship among some concepts of bivariate dependence. The Annals of Mathematical Statistics, 43:651–655.
Fang, Z. and Joe, H. (1992). Further developments on some dependence orderings for continuous bivariate distributions. Annals of the Institute of Statistical Mathematics, 44:501–517.
Finkelstein, M. (2003). On one class of bivariate distributions. Statistics & Probability Letters, 65:1–6.
Fischer, M. and Köck, C. (2012). Constructing and generalizing given multivariate copulas: a unifying approach. Statistics, 46:1–12.
Fontaine, C., Frostig, R., and Ombao, H. (2020). Modeling dependence via copula of functionals of Fourier coefficients. TEST, https://doi.org/10.1007/s11749-020-00703-5.
Freund, J. E. (1961). A bivariate extension of the exponential distribution. Journal of the American Statistical Association, 56(296):971–977.
Gorai, A. K., Tuluri, F., Huang, H., Hayami, H., Yoshikado, H., and Kawamoto, Y. (2015). Influence of local meteorology and N O conditions on ground-level ozone concentrations in the eastern part of Texas, USA. Air Quality, Atmosphere & Health, 8(1):81–96.
Gumbel, E. J. (1960). Bivariate exponential distributions. Journal of the American Statistical Association, 55(292):698–707.
Hürlimann, W. (2015). A comprehensive extension of the FGM copula. Statistical Papers, 58:373–392.
Joe, H. (1997). Multivariate Models and Dependence Concepts. Chapman & Hall, London.
Joe, H. (2015). Dependence Modeling with Copulas. CRC Press, Taylor & Francis Group, LLC.
Khiem, M., Ooka, R., Huang, H., Hayami, H., Yoshikado, H., and Kawamoto, Y. (2010). Analysis of the relationship between changes in meteorological conditions and the variation in summer ozone levels over the central Kanto area. Advances in Meteorology.
Kimeldorf, G. and Sampson, A. R. (1987). Positive dependence orderings. Annals of the Institute of Statistical Mathematics, 39:113–128.
Kurothe, R. S., Goel, N. K., and Mathur, B. S. (1997). Derived flood frequency distribution for negatively correlated rainfall intensity and duration. Water Resources Research, 33:2103–2107.
Lai, C. D. and Xie, M. (2000). A new family of positive quadrant dependent bivariate distributions. Statistics & Probability Letters, 46:359–364.
Lehmann, E. L. (1966). Some concepts of dependence. The Annals of Mathematical Statistics, 37(5):1137–1153.
Liebscher, E. (2008). Construction of asymmetric multivariate copulas. Journal of Multivariate Analysis, 99:2234–2250.
Mazo, G., Girard, S., and Forbes, F. (2015). A class of multivariate copulas based on products of bivariate copulas. Journal of Multivariate Analysis, 140:363–376.
Mohtashami-Borzadaran, V., Amini, M., and Ahmadi, J. (2019). On the properties of a reliability dependent model. Proceeding of the 5th Seminar on Reliability Theory and its Applications, Yazd, Iran, pages 256–265.
Nelsen, R. B. (2006). An Introduction to Copula. Springer Science+Business Media, Inc.
Sarmanov, O. V. (1996). Generalized normal correlation and two-dimensional Fréchet classes. Doklady Akademii Nauk SSSR, 168:596–599.
Scarsini, M. and Shaked, M. (1996). Positive dependence orders: a survey. In Athens Conference on Applied Probability and Time Series Analysis, pages 70–91.
Schucany, W. R., Parr, W. C., and Boyer, J. E. (1978). Correlation structure in Farlie-Gumbel-Morgenstern distributions. Biometrika, 65(3):650–653.
Topcu, S., Anteplioglu, U., and Incecik, S. (2003). Surface ozone concentrations and its relation to wind field in Istanbul. Water, Air, & Soil Pollution: Focus, 3:53–60.
Yanagimoto, T. (1972). Families of positively dependent random variables. The Annals of Mathematical Statistics, 24:559–573.
Yanagimoto, Y. and Okamoto, M. (1969). Partial orderings of permutations and monotonicity of a rank correlation statistic.