Moments of the doubly truncated selection elliptical distributions with emphasis on the unified multivariate skew-$t$ distribution
Christian E. Galarza
Departamento de Estadística, Escuela Superior Politécnica del Litoral, Guayaquil, Ecuador
[email protected]

Larissa A. Matos
Departamento de Estatística, Universidade Estadual de Campinas, Campinas, Brazil
[email protected]

Victor H. Lachos
Department of Statistics, University of Connecticut, Storrs CT 06269, U.S.A.
[email protected]
July 30, 2020

ABSTRACT
In this paper, we compute doubly truncated moments for the selection elliptical (SE) class of distributions, which includes some multivariate asymmetric versions of well-known elliptical distributions such as the normal, Student's $t$, and slash, among others. We address the moments for doubly truncated members of this family, establishing neat formulations for high-order moments as well as for the first two moments. We establish sufficient and necessary conditions for the existence of these truncated moments. Further, we propose optimized methods able to deal with extreme settings of the parameters, partitions with almost zero volume, or no truncation, which are validated with a brief numerical study. Finally, we present some results useful in interval censoring models. All results have been particularized to the unified skew-$t$ (SUT) distribution, a complex multivariate asymmetric heavy-tailed distribution which includes the extended skew-$t$ (EST), extended skew-normal (ESN), skew-$t$ (ST) and skew-normal (SN) distributions as particular and limiting cases.

Keywords: Censored regression models · Elliptical distributions · Selection distributions · Truncated distributions · Truncated moments
Truncated moments have been a topic of high interest in the statistical literature, with wide-ranging applications, from simple to complex statistical models such as survival analysis and censored data models, and in the most varied areas of application such as agronomy, insurance, finance, and biology, among others. Data in these areas have inherent characteristics that lead to methods involving truncated moments, such as responses restricted to a certain interval, or partial information such as censoring (which may be left, right or interval) and missingness. The need for more flexible models that incorporate features such as asymmetry and robustness has driven the exploration of this area in recent years. From the first two one-sided truncated moments of the normal distribution, useful in Tobin's model ([1]), the literature evolved to the multivariate case ([2]), double truncation ([3]), heavy tails through the bivariate Student's $t$ case in [4], and finally the first two moments of the multivariate Student's $t$ case in [5]. Besides the interval-type truncation in the cases above, [6] considers an interesting non-centered ellipsoidal elliptical truncation of the form $a \le (\mathbf{x} - \boldsymbol{\mu}_A)^\top \mathbf{A}(\mathbf{x} - \boldsymbol{\mu}_A)$ for well-known distributions such as the multivariate normal, Student's $t$, and generalized hyperbolic distributions. On the other hand, [7] recently proposed a recursive approach that allows calculating arbitrary product moments for the multivariate normal case. Based on the latter, [8] proposes the calculation of doubly truncated moments for the normal mean-variance mixture distributions ([9]), which include several well-known complex asymmetric multivariate distributions such as the generalized hyperbolic distribution ([10]). Unlike [8], in this paper we focus our efforts on the general class of asymmetric distributions called the multivariate selection elliptical family.
This large family of distributions includes complex multivariate asymmetric versions of well-known elliptical distributions such as the normal, Student's $t$, exponential power, hyperbolic, slash, Pearson type II, and contaminated normal, among others. We go further into detail for the unified skew-$t$ (SUT) distribution, a complex multivariate asymmetric heavy-tailed distribution which includes the extended skew-$t$ (EST) distribution ([11]), the skew-$t$ (ST) distribution ([12]) and, naturally, as limiting cases, its analogous normal and skew-normal (SN) distributions when $\nu \to \infty$.

The rest of the paper is organized as follows. In Section 2 we present some preliminary results, most of them definitions of the class of distributions and of its special cases of interest throughout the manuscript. Section 3 addresses the moments of the doubly truncated selection elliptical distributions. Further, we establish formulas for high-order moments as well as for the first two moments. We present a methodology to deal with some limiting cases and a discussion of the situation where a non-truncated partition exists. In addition, we establish sufficient and necessary conditions for the existence of these truncated moments. Section 4 specializes the results of Section 3 to the SUT case. In Section 5, a brief numerical study is presented in order to validate the methodology. In Section 6, we present some lemmas and corollaries related to conditional expectations which are useful in censored modeling. An application of selection elliptical truncated moments to the tail conditional expectation is presented in Section 7. Finally, the paper closes with some conclusions and directions for future research.

First, we start our exposition by defining a selection distribution as in [13].
Definition 1 (selection distribution). Let $\mathbf{X}_1 \in \mathbb{R}^q$ and $\mathbf{X}_2 \in \mathbb{R}^p$ be two random vectors, and denote by $C$ a measurable subset of $\mathbb{R}^q$. We define a selection distribution as the conditional distribution of $\mathbf{X}_2$ given $\mathbf{X}_1 \in C$, that is, as the distribution of $(\mathbf{X}_2 \mid \mathbf{X}_1 \in C)$. We say that a random vector $\mathbf{Y} \in \mathbb{R}^p$ has a selection distribution if $\mathbf{Y} \stackrel{d}{=} (\mathbf{X}_2 \mid \mathbf{X}_1 \in C)$. We use the notation $\mathbf{Y} \sim SLCT_{p,q}$, with parameters depending on the characteristics of $\mathbf{X}_1$, $\mathbf{X}_2$, and $C$.

Furthermore, for $\mathbf{X}_2$ having a probability density function (pdf) $f_{\mathbf{X}_2}$ say, $\mathbf{Y}$ has pdf $f_{\mathbf{Y}}$ given by
$$ f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}_2}(\mathbf{y}) \, \frac{P(\mathbf{X}_1 \in C \mid \mathbf{X}_2 = \mathbf{y})}{P(\mathbf{X}_1 \in C)}. \qquad (1) $$
Since a selection distribution depends on the subset $C \subseteq \mathbb{R}^q$, particular cases are obtained. One of the most important cases arises when the selection subset has the form
$$ C(\mathbf{c}) = \{ \mathbf{x} \in \mathbb{R}^q \mid \mathbf{x} > \mathbf{c} \}. \qquad (2) $$
In particular, when $\mathbf{c} = \mathbf{0}$, the distribution of $\mathbf{Y}$ is called a simple selection distribution.

In this work, we are mainly interested in the case where $(\mathbf{X}_1, \mathbf{X}_2)$ has a joint density following an arbitrary symmetric multivariate distribution $f_{\mathbf{X}_1,\mathbf{X}_2}$. For $\mathbf{Y} \stackrel{d}{=} (\mathbf{X}_2 \mid \mathbf{X}_1 \in C)$, this setting leads to a $p$-variate random vector $\mathbf{Y}$ following a skewed version of $f$, whose pdf can be computed in a simpler manner as
$$ f_{\mathbf{Y}}(\mathbf{y}) = \frac{\int_C f_{\mathbf{X}_1,\mathbf{X}_2}(\mathbf{x},\mathbf{y})\,\mathrm{d}\mathbf{x}}{\int_C f_{\mathbf{X}_1}(\mathbf{x})\,\mathrm{d}\mathbf{x}}. \qquad (3) $$
A quite popular family of selection distributions arises when $\mathbf{X}_1$ and $\mathbf{X}_2$ have a joint multivariate elliptically contoured ($EC$) distribution, as follows:
$$ \mathbf{X} = \begin{pmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{pmatrix} \sim EC_{q+p}\!\left( \boldsymbol{\xi} = \begin{pmatrix} \boldsymbol{\xi}_1 \\ \boldsymbol{\xi}_2 \end{pmatrix},\ \boldsymbol{\Omega} = \begin{pmatrix} \boldsymbol{\Omega}_{11} & \boldsymbol{\Omega}_{12} \\ \boldsymbol{\Omega}_{21} & \boldsymbol{\Omega}_{22} \end{pmatrix},\ h^{(q+p)} \right), \qquad (4) $$
where $\boldsymbol{\xi}_1 \in \mathbb{R}^q$ and $\boldsymbol{\xi}_2 \in \mathbb{R}^p$ are location vectors, $\boldsymbol{\Omega}_{11} \in \mathbb{R}^{q\times q}$, $\boldsymbol{\Omega}_{22} \in \mathbb{R}^{p\times p}$, and $\boldsymbol{\Omega}_{21} = \boldsymbol{\Omega}_{12}^\top \in \mathbb{R}^{p\times q}$ are dispersion matrices and, in addition to these parameters, $h^{(q+p)}$ is a density generator function. We denote the selection distribution resulting from (4) by $SLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C)$.
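As a quick numerical illustration of Definition 1, consider the following Python sketch (Python and all parameter values here are our own illustrative choices; the paper itself works with R software). For a standard bivariate normal pair with correlation $\rho$, the selection variable $\mathbf{Y} \stackrel{d}{=} (X_2 \mid X_1 > 0)$ is skew-normal with mean $\rho\sqrt{2/\pi}$, which a Monte Carlo estimate recovers:

```python
# Monte Carlo sketch of a simple selection distribution (C = {x > 0}).
# For (X1, X2) standard bivariate normal with correlation rho, the
# selection variable Y = (X2 | X1 > 0) is skew-normal with mean rho*sqrt(2/pi).
import numpy as np

rng = np.random.default_rng(0)
rho = 0.7

# sample the joint symmetric distribution f_{X1,X2}
cov = np.array([[1.0, rho], [rho, 1.0]])
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=200_000)

# keep X2 on the selection event {X1 in C} = {X1 > 0}
y = x[x[:, 0] > 0, 1]

print(y.mean())   # close to rho*sqrt(2/pi) ≈ 0.5585
```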
Such selection distributions typically result in skew-elliptical distributions, except for two cases: $\boldsymbol{\Omega}_{21} = \mathbf{0}_{p\times q}$ and $C = C(\boldsymbol{\xi}_1)$ (for more details, see [13]). Given that the elliptical family of distributions is closed under marginalization and conditioning, the distributions of $\mathbf{X}_2$ and $(\mathbf{X}_1 \mid \mathbf{X}_2 = \mathbf{x})$ are also elliptical, with respective pdfs given by
$$ \mathbf{X}_2 \sim EC_p(\boldsymbol{\xi}_2, \boldsymbol{\Omega}_{22}, h^{(p)}), \qquad (5) $$
$$ \mathbf{X}_1 \mid \mathbf{X}_2 = \mathbf{x} \sim EC_q\big(\boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\mathbf{x} - \boldsymbol{\xi}_2),\ \boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21},\ h^{(q)}_{\mathbf{x}}\big), \qquad (6) $$
with induced conditional generator $h^{(q)}_{\mathbf{x}}(u) = h^{(q+p)}(u + \delta(\mathbf{x}))/h^{(p)}(\delta(\mathbf{x}))$, where $\delta(\mathbf{x}) \triangleq (\mathbf{x} - \boldsymbol{\xi}_2)^\top \boldsymbol{\Omega}_{22}^{-1}(\mathbf{x} - \boldsymbol{\xi}_2)$. These last equations imply that the selection elliptical (SE) distributions are also closed under marginalization and conditioning. Furthermore, it is well known that the SE family is closed under linear transformations. For $\mathbf{A} \in \mathbb{R}^{r\times p}$ a matrix of rank $r \le p$ and $\mathbf{b} \in \mathbb{R}^r$ a vector, it holds that $\mathbf{A}\mathbf{Y} + \mathbf{b} \stackrel{d}{=} (\mathbf{A}\mathbf{X}_2 + \mathbf{b}) \mid (\mathbf{X}_1 \in C)$, where $\stackrel{d}{=}$ denotes equality in distribution, and then
$$ \mathbf{A}\mathbf{Y} + \mathbf{b} \sim SLCT\text{-}EC_{r,q}\!\left( \boldsymbol{\xi} = \begin{pmatrix} \boldsymbol{\xi}_1 \\ \mathbf{A}\boldsymbol{\xi}_2 + \mathbf{b} \end{pmatrix},\ \boldsymbol{\Omega} = \begin{pmatrix} \boldsymbol{\Omega}_{11} & \boldsymbol{\Omega}_{12}\mathbf{A}^\top \\ \mathbf{A}\boldsymbol{\Omega}_{21} & \mathbf{A}\boldsymbol{\Omega}_{22}\mathbf{A}^\top \end{pmatrix},\ h^{(q+r)} \right). \qquad (7) $$
Notice from Equation (3) that we can alternatively write
$$ f_{\mathbf{Y}}(\mathbf{y}) = \frac{\int_C f_{q+p}(\mathbf{x}, \mathbf{y}; \boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)})\,\mathrm{d}\mathbf{x}}{\int_C f_q(\mathbf{x}; \boldsymbol{\xi}_1, \boldsymbol{\Omega}_{11}, h^{(q)})\,\mathrm{d}\mathbf{x}}. \qquad (8) $$
Some particular cases, useful for our purposes, are detailed next. For further details, we refer to [13].
Unified skew-elliptical (SUE) distribution
Let $\mathbf{Y} \sim SLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C)$. Then $\mathbf{Y}$ is said to follow the unified skew-elliptical distribution introduced by [14] when the truncation subset is $C = C(\mathbf{0})$. From (8), it follows that
$$ f_{\mathbf{Y}}(\mathbf{y}) = f_p(\mathbf{y}; \boldsymbol{\xi}_2, \boldsymbol{\Omega}_{22}, h^{(p)})\, \frac{F_q\big(\boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\mathbf{y}-\boldsymbol{\xi}_2); \mathbf{0}, \boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21}, h^{(q)}_{\mathbf{y}}\big)}{F_q(\boldsymbol{\xi}_1; \mathbf{0}, \boldsymbol{\Omega}_{11}, h^{(q)})}, \qquad (9) $$
where $f_p(\mathbf{y}; \boldsymbol{\xi}_2, \boldsymbol{\Omega}_{22}, h^{(p)}) = |\boldsymbol{\Omega}_{22}|^{-1/2} h^{(p)}(\delta(\mathbf{y}))$ and $F_q(\mathbf{z}; \mathbf{0}, \boldsymbol{\Theta}, g^{(q)})$ denotes the cumulative distribution function (cdf) of the $EC_q(\mathbf{0}, \boldsymbol{\Theta}, g^{(q)})$ distribution. Note that the density in (9) extends the family of skew-elliptical distributions proposed by [15] (see also [12]), which considers $q = 1$ and $\boldsymbol{\xi}_1 = 0$.

Scale mixture of unified skew-normal (SMSUN) distribution
Let $W$ be a nonnegative random variable with cdf $G$. For a generator function $h^{(p+q)}(u) = \int_0^\infty (2\pi\kappa(w))^{-(p+q)/2}\, \mathrm{e}^{-u/(2\kappa(w))}\,\mathrm{d}G(w)$, several skewed and thick-tailed distributions can be obtained from different specifications of the weight function $\kappa(\cdot)$ and $G$. We say that $\mathbf{Y}$ follows a SMSUN distribution if its pdf takes the general form
$$ f_{\mathbf{Y}}(\mathbf{y}) = \int_0^\infty \phi_p(\mathbf{y}; \boldsymbol{\xi}_2, \kappa(w)\boldsymbol{\Omega}_{22})\, \frac{\Phi_q\big(\boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\mathbf{y}-\boldsymbol{\xi}_2); \kappa(w)\{\boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21}\}\big)}{\Phi_q(\boldsymbol{\xi}_1; \kappa(w)\boldsymbol{\Omega}_{11})}\,\mathrm{d}G(w), \qquad (10) $$
where $\Phi_r(\cdot\,; \boldsymbol{\Sigma})$ represents the cdf of an $r$-variate normal distribution with mean vector $\mathbf{0}$ and variance-covariance matrix $\boldsymbol{\Sigma}$. Here $\mathbf{Y} \mid (W = w)$ follows a unified skew-normal (SUN) distribution, and we write $\mathbf{Y} \mid (W = w) \sim SUN_{p,q}(\boldsymbol{\xi}, \kappa(w)\boldsymbol{\Omega})$.
• Unified skew-normal (SUN) distribution
Setting $W$ degenerate at 1 ($P(W = 1) = 1$) and $\kappa(w) = w$, we get $h^{(p+q)}(u) = (2\pi)^{-(p+q)/2}\mathrm{e}^{-u/2}$, $u \ge 0$, for which $h^{(p)}(u) = (2\pi)^{-p/2}\mathrm{e}^{-u/2}$. Then $\mathbf{Y}$ follows a SUN distribution, that is, $\mathbf{Y} \sim SUN_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega})$, with pdf
$$ f_{\mathbf{Y}}(\mathbf{y}) = \phi_p(\mathbf{y}; \boldsymbol{\xi}_2, \boldsymbol{\Omega}_{22})\, \frac{\Phi_q\big(\boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\mathbf{y}-\boldsymbol{\xi}_2); \boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21}\big)}{\Phi_q(\boldsymbol{\xi}_1; \boldsymbol{\Omega}_{11})}. \qquad (11) $$

• Unified skew-$t$ (SUT) distribution

For $W \sim \mathrm{Gamma}(\nu/2, \nu/2)$ and weight function $\kappa(w) = 1/w$, we obtain
$$ h^{(p+q)}(u) = \frac{\Gamma((p+q+\nu)/2)\,\nu^{\nu/2}}{\Gamma(\nu/2)\,\pi^{(p+q)/2}}\,(\nu + u)^{-(p+q+\nu)/2}, $$
and hence (10) becomes
$$ f_{\mathbf{Y}}(\mathbf{y}) = t_p(\mathbf{y}; \boldsymbol{\xi}_2, \boldsymbol{\Omega}_{22}, \nu)\, \frac{T_q\big(\boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\mathbf{y}-\boldsymbol{\xi}_2);\ \tfrac{\nu + \delta(\mathbf{y})}{\nu + p}\{\boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21}\},\ \nu + p\big)}{T_q(\boldsymbol{\xi}_1; \boldsymbol{\Omega}_{11}, \nu)}, \qquad (12) $$
where $T_r(\cdot\,; \boldsymbol{\Sigma}, \nu)$ represents the cdf of an $r$-variate Student's $t$ distribution with location vector $\mathbf{0}$, scale matrix $\boldsymbol{\Sigma}$ and $\nu$ degrees of freedom. A random vector $\mathbf{Y}$ with pdf as in (12) is said to follow a SUT distribution, which is denoted by $\mathbf{Y} \sim SUT_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, \nu)$ and was introduced by [14]. It is well known that (12) reduces to the SUN pdf (11) as $\nu \to \infty$, and to a unified skew-Cauchy (SUC) distribution when $\nu = 1$.

Furthermore, using the parametrization
$$ \boldsymbol{\xi} = \begin{pmatrix} \boldsymbol{\tau} \\ \boldsymbol{\mu} \end{pmatrix} \quad \text{and} \quad \boldsymbol{\Omega} = \begin{pmatrix} \boldsymbol{\Psi} + \boldsymbol{\Lambda}^\top\boldsymbol{\Lambda} & \boldsymbol{\Omega}_{12} \\ \boldsymbol{\Omega}_{21} & \boldsymbol{\Sigma} \end{pmatrix}, \qquad (13) $$
where $\boldsymbol{\Omega}_{21} = \boldsymbol{\Omega}_{12}^\top = \boldsymbol{\Sigma}^{1/2}\boldsymbol{\Lambda}$, with $\boldsymbol{\Sigma}^{1/2}$ being the square root matrix of $\boldsymbol{\Sigma}$ such that $\boldsymbol{\Sigma} = \boldsymbol{\Sigma}^{1/2}\boldsymbol{\Sigma}^{1/2}$, we use the notation $\mathbf{Y} \sim SUT_{p,q}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}, \boldsymbol{\tau}, \nu, \boldsymbol{\Psi})$ to stand for a $p$-variate SUT distribution with location parameter $\boldsymbol{\mu} \in \mathbb{R}^p$, positive-definite scale matrix $\boldsymbol{\Sigma} \in \mathbb{R}^{p\times p}$, shape matrix parameter $\boldsymbol{\Lambda} \in \mathbb{R}^{p\times q}$, extension vector parameter $\boldsymbol{\tau} \in \mathbb{R}^q$ and positive-definite correlation matrix $\boldsymbol{\Psi} \in \mathbb{R}^{q\times q}$. The pdf of $\mathbf{Y}$ now simplifies to
$$ SUT_{p,q}(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}, \boldsymbol{\tau}, \nu, \boldsymbol{\Psi}) = t_p(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)\, \frac{T_q\big( (\boldsymbol{\tau} + \boldsymbol{\Lambda}^\top\boldsymbol{\Sigma}^{-1/2}(\mathbf{y}-\boldsymbol{\mu}))\sqrt{\nu(\mathbf{y})};\ \boldsymbol{\Psi},\ \nu + p \big)}{T_q(\boldsymbol{\tau}; \boldsymbol{\Psi} + \boldsymbol{\Lambda}^\top\boldsymbol{\Lambda}, \nu)}, \qquad (14) $$
with $\nu(\mathbf{x}) \equiv \nu_{\mathbf{X}}(\mathbf{x}) \triangleq (\nu + \dim(\mathbf{x}))/(\nu + \delta(\mathbf{x}))$ and $\delta(\mathbf{x}) = (\mathbf{x} - \boldsymbol{\mu}_{\mathbf{X}})^\top \boldsymbol{\Sigma}_{\mathbf{X}}^{-1}(\mathbf{x} - \boldsymbol{\mu}_{\mathbf{X}})$ being the Mahalanobis distance.
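As a sanity check on the SUN density (11), the following Python sketch evaluates it for $p = q = 1$ and verifies numerically that it integrates to one (the parameter values, and the use of Python rather than R, are our own illustrative choices):

```python
# The SUN density (11) for p = q = 1: phi * Phi(conditional) / Phi(marginal).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

xi1, xi2 = 0.5, 1.0            # latent and observed locations
w11, w22, w12 = 1.0, 2.0, 0.8  # dispersion blocks Omega_11, Omega_22, Omega_12

def sun_pdf(y):
    m = xi1 + (w12 / w22) * (y - xi2)   # conditional location of X1 | X2 = y
    s2 = w11 - w12**2 / w22             # conditional dispersion
    return (norm.pdf(y, loc=xi2, scale=np.sqrt(w22))
            * norm.cdf(m / np.sqrt(s2))
            / norm.cdf(xi1 / np.sqrt(w11)))

total, _ = quad(sun_pdf, -np.inf, np.inf)
print(total)   # ≈ 1.0
```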
The pdf in (14) is equivalent to the one found in [11], under a different parametrization. Although the unified skew-$t$ distribution above is appealing from a theoretical point of view, the particular case $q = 1$ leads to a simpler but flexible enough distribution of interest for practical purposes.

• Extended skew-$t$ (EST) distribution

For $q = 1$, we have $\boldsymbol{\Psi} = 1$, $\boldsymbol{\Lambda} = \boldsymbol{\lambda}$ and $T_1(x; \psi, \nu) = T(x/\sqrt{\psi}; \nu)$; hence (14) reduces to the pdf of an EST distribution, denoted by $EST_p(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \tau, \nu)$, that is,
$$ EST_p(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \tau, \nu) = t_p(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)\, \frac{T\big( (\tau + \boldsymbol{\lambda}^\top\boldsymbol{\Sigma}^{-1/2}(\mathbf{y}-\boldsymbol{\mu}))\sqrt{\nu(\mathbf{y})};\ \nu + p \big)}{T(\tilde{\tau}; \nu)}, \qquad (15) $$
with $\tilde{\tau} = \tau/\sqrt{1 + \boldsymbol{\lambda}^\top\boldsymbol{\lambda}}$. Here, $\boldsymbol{\lambda} \in \mathbb{R}^p$ is a shape parameter which regulates the skewness of $\mathbf{Y}$, and $\tau \in \mathbb{R}$ is a scalar. The location and scale parameters $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ remain as before. Here, we write $\mathbf{Y} \sim EST_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \tau, \nu)$. Notice that
$SUT_{p,1} \equiv EST_p$. Besides, it is straightforward to see that $EST_p(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \tau, \nu) \longrightarrow t_p(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)$ as $\tau \to \infty$, where $t_p(\cdot\,; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)$ corresponds to the pdf of a multivariate Student's $t$ distribution with location parameter $\boldsymbol{\mu}$, scale matrix $\boldsymbol{\Sigma}$ and $\nu$ degrees of freedom. On the other hand, when $\tau = 0$, we retrieve the skew-$t$ distribution, $ST_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \nu)$ say, whose density function is given by
$$ ST_p(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \nu) = 2\, t_p(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)\, T\big( \boldsymbol{\lambda}^\top\boldsymbol{\Sigma}^{-1/2}(\mathbf{y}-\boldsymbol{\mu})\sqrt{\nu(\mathbf{y})};\ \nu + p \big), \qquad (16) $$
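The identities around (15) and (16) are easy to verify numerically. The Python sketch below (our own illustrative parameter values, not the paper's code) implements the univariate EST density and checks that setting $\tau = 0$ reproduces the ST density, since $T(0; \nu) = 1/2$:

```python
# Univariate EST density (15) and ST density (16); at tau = 0 they coincide.
import numpy as np
from scipy.integrate import quad
from scipy.stats import t as student_t

def est_pdf(y, mu, sigma, lam, tau, nu):
    z = (y - mu) / sigma
    nu_y = (nu + 1.0) / (nu + z**2)           # nu(y), with delta(y) = z^2
    tau_tilde = tau / np.sqrt(1.0 + lam**2)   # standardized extension parameter
    return (student_t.pdf(z, df=nu) / sigma
            * student_t.cdf((tau + lam * z) * np.sqrt(nu_y), df=nu + 1)
            / student_t.cdf(tau_tilde, df=nu))

def st_pdf(y, mu, sigma, lam, nu):
    z = (y - mu) / sigma
    nu_y = (nu + 1.0) / (nu + z**2)
    return (2.0 * student_t.pdf(z, df=nu) / sigma
            * student_t.cdf(lam * z * np.sqrt(nu_y), df=nu + 1))

ys = np.linspace(-4.0, 4.0, 9)
max_diff = np.max(np.abs(est_pdf(ys, 0, 1, 2, 0.0, 4) - st_pdf(ys, 0, 1, 2, 4)))
total, _ = quad(lambda v: est_pdf(v, 0.0, 1.0, 2.0, -1.0, 4.0), -np.inf, np.inf)
print(max_diff, total)   # max_diff ≈ 0 (machine precision), total ≈ 1
```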
that is, $EST_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, 0, \nu) = ST_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \nu)$. Further properties were studied in [11], although with a slightly different parametrization.

Figure 1: Densities for particular cases of a truncated SUT distribution. Normal cases in the left column (normal, SN and ESN from top to bottom) and Student's $t$ cases at right (Student's $t$, ST and EST from top to bottom).

Six different densities for special cases of the truncated SUT distribution are shown in Figure 1. The symmetric cases, normal and Student's $t$, are shown in the first row ($\boldsymbol{\lambda} = \mathbf{0}$), the skew cases, skew-normal (SN) and ST, in the second row ($\tau = 0$), and the extended skew cases, extended skew-normal (ESN) and EST, in the third row. The location vector $\boldsymbol{\mu}$ and scale matrix $\boldsymbol{\Sigma}$ remain fixed for all cases.

• Other unified skewed distributions
Other unified members are obtained from different combinations of the weight function $\kappa(\cdot)$ and the mixture cdf $G$. For instance, we obtain a unified skew-slash distribution when $\kappa(w) = 1/w$ and $W \sim \mathrm{Beta}(\nu, 1)$, and a unified skew-contaminated-normal distribution when $\kappa(w) = 1/w$ and $W$ is a discrete random variable with probability mass function (pmf) $g(w; \nu, \gamma) = \nu\,\mathbb{1}_{\{w = \gamma\}} + (1 - \nu)\,\mathbb{1}_{\{w = 1\}}$, with $\mathbb{1}$ the indicator function. Besides, [15] mentions some other distributions such as the skew-logistic, skew-stable, skew-exponential power, skew-Pearson type II and finite mixtures of skew-normal distributions. It is worth mentioning that even though [15] works with a subclass of the SMSUN, with $q = 1$ and $\boldsymbol{\xi}_1 = 0$, unified versions of these are readily obtained by considering the same respective weight function $\kappa(\cdot)$ and mixture distribution $G$.

Let $\mathbf{Y} \sim SLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C)$ with pdf as defined in (8), and let $A$ be a Borel set in $\mathbb{R}^p$. We say that a random vector $\mathbf{W}$ has a truncated selection elliptical (TSE) distribution on $A$ when $\mathbf{W} \stackrel{d}{=} \mathbf{Y} \mid (\mathbf{Y} \in A)$. In this case,
the pdf of $\mathbf{W}$ is given by
$$ f_{\mathbf{W}}(\mathbf{w}) = \frac{f_{\mathbf{Y}}(\mathbf{w})}{P(\mathbf{Y} \in A)}\,\mathbb{1}_A(\mathbf{w}), $$
where $\mathbb{1}_A$ is the indicator function of the set $A$. We use the notation $\mathbf{W} \sim TSLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C; A)$. If $A$ has the form
$$ A = \{ (y_1, \ldots, y_p)^\top \in \mathbb{R}^p : a_1 \le y_1 \le b_1, \ldots, a_p \le y_p \le b_p \} = \{ \mathbf{y} \in \mathbb{R}^p : \mathbf{a} \le \mathbf{y} \le \mathbf{b} \}, \qquad (17) $$
we say that the distribution of $\mathbf{W}$ is doubly truncated, and we use the notation $\{\mathbf{Y} \in A\} = \{\mathbf{a} \le \mathbf{Y} \le \mathbf{b}\}$, where $\mathbf{a} = (a_1, \ldots, a_p)^\top$ and $\mathbf{b} = (b_1, \ldots, b_p)^\top$, whose elements $a_i$ and $b_i$ may be infinite, by convention. Analogously we define $\{\mathbf{Y} \ge \mathbf{a}\}$ and $\{\mathbf{Y} \le \mathbf{b}\}$, in which cases we say that the distribution of $\mathbf{W}$ is truncated from below and truncated from above, respectively. For convenience, we also use the notation $\mathbf{W} \sim TSLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C; (\mathbf{a}, \mathbf{b}))$, with the last parameter indicating the truncation interval. Analogously, we denote by $TEC_p(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(p)}; (\mathbf{a}, \mathbf{b}))$ a $p$-variate (doubly) truncated elliptical (TE) distribution on $(\mathbf{a}, \mathbf{b}) \in \mathbb{R}^p$. Some characterizations of the doubly TE distributions have been recently discussed in [16].

For two $p$-dimensional vectors $\mathbf{y} = (y_1, \ldots, y_p)^\top$ and $\mathbf{k} = (k_1, \ldots, k_p)^\top$, let $\mathbf{y}^{\mathbf{k}}$ stand for $(y_1^{k_1}, \ldots, y_p^{k_p})$; that is, we use a pointwise notation. Next, we present a formulation to compute arbitrary product moments of a TSLCT-EC distribution.

Theorem 1 (moments of a TSE). Let $\mathbf{X} \sim EC_{q+p}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)})$ as defined in (4). Let $C$ be a truncation subset of the form $C(\mathbf{c}, \mathbf{d}) = \{\mathbf{x} \in \mathbb{R}^q \mid \mathbf{c} \le \mathbf{x} \le \mathbf{d}\}$. For $\mathbf{Y} \sim SLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C(\mathbf{c}, \mathbf{d}))$, the moment $\mathrm{E}[\mathbf{Y}^{\mathbf{k}}] = \mathrm{E}[Y_1^{k_1} Y_2^{k_2} \cdots Y_p^{k_p}]$ can be computed as
$$ \mathrm{E}[\mathbf{Y}^{\mathbf{k}} \mid \mathbf{a} \le \mathbf{Y} \le \mathbf{b}] = \mathrm{E}[\mathbf{X}^{\boldsymbol{\kappa}} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}], \qquad (18) $$
with $\boldsymbol{\kappa} = (\mathbf{0}_q^\top, \mathbf{k}^\top)^\top$, $\boldsymbol{\alpha} = (\mathbf{c}^\top, \mathbf{a}^\top)^\top$ and $\boldsymbol{\beta} = (\mathbf{d}^\top, \mathbf{b}^\top)^\top$, where $\mathbf{k} = (k_1, k_2, \ldots, k_p)^\top$, with $k_i \in \mathbb{N}$, for $i = 1, \ldots, p$.

Proof.
Since $\mathbf{Y} \stackrel{d}{=} \mathbf{X}_2 \mid (\mathbf{c} \le \mathbf{X}_1 \le \mathbf{d})$, the proof is direct by noting that
$$ \mathbf{Y} \mid (\mathbf{a} \le \mathbf{Y} \le \mathbf{b}) \stackrel{d}{=} \mathbf{X}_2 \mid (\mathbf{c} \le \mathbf{X}_1 \le \mathbf{d} \cap \mathbf{a} \le \mathbf{X}_2 \le \mathbf{b}) \stackrel{d}{=} \mathbf{X}_2 \mid (\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}). $$

Corollary 1 (first two moments of a TSE). Under the conditions of Theorem 1, let $\mathbf{m} = \mathrm{E}[\mathbf{X} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}]$ and $\mathbf{M} = \mathrm{E}[\mathbf{X}\mathbf{X}^\top \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}]$, both partitioned as
$$ \mathbf{m} = \begin{pmatrix} \mathbf{m}_1 \\ \mathbf{m}_2 \end{pmatrix} \quad \text{and} \quad \mathbf{M} = \begin{pmatrix} \mathbf{M}_{11} & \mathbf{M}_{12} \\ \mathbf{M}_{21} & \mathbf{M}_{22} \end{pmatrix}, $$
respectively. Then the first two moments of $\mathbf{Y} \mid (\mathbf{a} \le \mathbf{Y} \le \mathbf{b})$ are given by
$$ \mathrm{E}[\mathbf{Y} \mid \mathbf{a} \le \mathbf{Y} \le \mathbf{b}] = \mathbf{m}_2, \qquad (19) $$
$$ \mathrm{E}[\mathbf{Y}\mathbf{Y}^\top \mid \mathbf{a} \le \mathbf{Y} \le \mathbf{b}] = \mathbf{M}_{22}, \qquad (20) $$
where $\mathbf{m}_2 \in \mathbb{R}^p$ and $\mathbf{M}_{22} \in \mathbb{R}^{p\times p}$.

For the particular truncation subset $C(\mathbf{c})$ as in (2), Theorem 1 and Corollary 1 hold with $\boldsymbol{\alpha} = (\mathbf{c}^\top, \mathbf{a}^\top)^\top$ and $\boldsymbol{\beta} = (\boldsymbol{\infty}_q^\top, \mathbf{b}^\top)^\top$. Notice that Theorem 1 and Corollary 1 state that we are able to compute any arbitrary moment of $\mathbf{Y} \mid (\mathbf{a} \le \mathbf{Y} \le \mathbf{b})$, that is, of a TSE distribution, by using just the single corresponding moment of a doubly TE distribution $\mathbf{X} \mid (\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta})$. This is highly convenient, since doubly truncated moments for some members of the elliptical family of distributions are already available in the literature and in statistical software. In particular, for the truncated multivariate normal and Student's $t$ distributions we have the R packages TTmoment, tmvtnorm and MomTrunc.
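The content of Theorem 1 is essentially definitional, which makes it easy to see by simulation. In the Python sketch below (an illustrative normal case with $p = q = 1$; the paper's software is in R, and all numeric values are our own choices), conditioning in two stages or conditioning the augmented vector once selects exactly the same sample, and hence the same truncated moment:

```python
# Theorem 1 by Monte Carlo: E[Y | a <= Y <= b] for Y = (X2 | X1 > 0)
# equals the corresponding moment of X | (alpha <= X <= beta), with
# alpha = (0, a) and beta = (inf, b).
import numpy as np

rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
x = rng.multivariate_normal([0.0, 0.0], cov, size=500_000)
a, b = -0.5, 1.5

# route 1: selection first (X1 > 0), then truncation of Y
y = x[x[:, 0] > 0, 1]
m1 = y[(y >= a) & (y <= b)].mean()

# route 2: joint truncation of the augmented vector
keep = (x[:, 0] > 0) & (x[:, 1] >= a) & (x[:, 1] <= b)
m2 = x[keep, 1].mean()

print(m1, m2)   # identical: both routes condition on the same event
```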
Consider $\mathbf{X} \sim EC_{q+p}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)})$ and $\mathbf{Y} \sim SLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C)$ as in Theorem 1, with truncation subset $C = C(\mathbf{0})$. As $\boldsymbol{\xi}_1 \to \boldsymbol{\infty}$, we have that $P(\mathbf{X}_1 \ge \mathbf{0}) \to 1$. Besides, as $\boldsymbol{\xi}_1 \to -\boldsymbol{\infty}$, we have that $P(\mathbf{X}_1 \ge \mathbf{0}) \to 0$, and consequently the ratio $P(\mathbf{a} \le \mathbf{Y} \le \mathbf{b}) = P(\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta})/P(\mathbf{X}_1 \ge \mathbf{0})$ becomes numerically indeterminate. Thus, for $\boldsymbol{\xi}_1$ containing negative values of large enough magnitude, we are sometimes unable to compute $\mathrm{E}[\mathbf{Y}^{\mathbf{k}}]$ due to limited computational precision, mainly when working with lighter-tailed densities. For instance, in the univariate normal case, $\Phi_1(\xi_1)$ evaluates to exactly zero in the R software once $\xi_1$ is sufficiently negative. The next proposition helps us to circumvent this problem.

Proposition 1 (limiting case of a SE). As $\boldsymbol{\xi}_1 \to -\boldsymbol{\infty}$, i.e., $\xi_{1i} \to -\infty$, $i = 1, \ldots, q$, it holds that
$$ SLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C(\mathbf{0})) \longrightarrow EC_p\big(\boldsymbol{\xi}_2 - \boldsymbol{\Omega}_{21}\boldsymbol{\Omega}_{11}^{-1}\boldsymbol{\xi}_1,\ \boldsymbol{\Omega}_{22} - \boldsymbol{\Omega}_{21}\boldsymbol{\Omega}_{11}^{-1}\boldsymbol{\Omega}_{12},\ h^{(p)}_{\mathbf{0}}\big). \qquad (21) $$

Proof.
Let $\mathbf{X} = (\mathbf{X}_1^\top, \mathbf{X}_2^\top)^\top \sim EC_{q+p}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)})$ and $\mathbf{Y} \sim TSLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C(\mathbf{0}); (\mathbf{a}, \mathbf{b}))$. As $\boldsymbol{\xi}_1 \to -\boldsymbol{\infty}$, we have that $P(\mathbf{X}_1 \ge \mathbf{0}) \to 0$, $\mathrm{E}[\mathbf{X}_1 \mid \mathbf{X}_1 \ge \mathbf{0}] \to \mathbf{0}$ and $\mathrm{var}[\mathbf{X}_1 \mid \mathbf{X}_1 \ge \mathbf{0}] \to \mathbf{0}$; hence $\mathbf{X}_1 \mid \mathbf{X}_1 \ge \mathbf{0}$ becomes degenerate at $\mathbf{0}$. From Definition 1, $\mathbf{Y} \stackrel{d}{\longrightarrow} (\mathbf{X}_2 \mid \mathbf{X}_1 = \mathbf{0})$, and by the conditional distribution in Equation (6) it is straightforward to show that $\mathbf{X}_2 \mid \mathbf{X}_1 \sim EC_p(\boldsymbol{\xi}_2 + \boldsymbol{\Omega}_{21}\boldsymbol{\Omega}_{11}^{-1}(\mathbf{X}_1 - \boldsymbol{\xi}_1), \boldsymbol{\Omega}_{22} - \boldsymbol{\Omega}_{21}\boldsymbol{\Omega}_{11}^{-1}\boldsymbol{\Omega}_{12}, h^{(p)}_{\mathbf{X}_1})$. Evaluating at $\mathbf{X}_1 = \mathbf{0}$ we obtain (21), concluding the proof.

While using relations (19) and (20), we may face numerical problems when trying to compute $\mathbf{m} = \mathrm{E}[\mathbf{X} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}]$ and $\mathbf{M} = \mathrm{E}[\mathbf{X}\mathbf{X}^\top \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}]$ for extreme settings of $\boldsymbol{\xi}$ and $\boldsymbol{\Omega}$. Usually, this occurs when $P(\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}) \approx 0$ because the probability mass lies far from the integration region $(\boldsymbol{\alpha}, \boldsymbol{\beta})$. It is worth mentioning that, in these cases, it is not even possible to estimate the moments by generating Monte Carlo (MC) samples via rejection sampling, due to the high rejection rate when restricting to a small integration region. Other methods, such as Gibbs sampling, are preferable in this situation. Hence, we present a correction method that approximates the mean and the variance-covariance matrix of a multivariate TE distribution even when the numerical precision of the software is a limitation.

Consider $\mathbf{X} \sim EC_r(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(r)})$ partitioned as $\mathbf{X} = (\mathbf{X}_1^\top, \mathbf{X}_2^\top)^\top$ such that $\dim(\mathbf{X}_1) = r_1$ and $\dim(\mathbf{X}_2) = r_2$, with $r_1 + r_2 = r$. Also, consider $\boldsymbol{\xi}$, $\boldsymbol{\Omega}$, $\boldsymbol{\alpha} = (\boldsymbol{\alpha}_1^\top, \boldsymbol{\alpha}_2^\top)^\top$ and $\boldsymbol{\beta} = (\boldsymbol{\beta}_1^\top, \boldsymbol{\beta}_2^\top)^\top$ partitioned accordingly. Suppose that we are not able to compute $\mathrm{E}[\mathbf{X}^{\boldsymbol{\kappa}} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}]$ because there exists a partition $\mathbf{X}_2$ of $\mathbf{X}$, of dimension $r_2$, that is out-of-bounds, that is, $P(\boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2) \approx 0$. Notice that this causes the problem because $P(\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}) \le P(\boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2) \approx 0$. Besides, we suppose that $P(\boldsymbol{\alpha}_1 \le \mathbf{X}_1 \le \boldsymbol{\beta}_1) > 0$. Since the limits of $\mathbf{X}_2$ are out-of-bounds (and $\boldsymbol{\alpha}_2 < \boldsymbol{\beta}_2$), we have two possible cases: $\boldsymbol{\beta}_2 \to -\boldsymbol{\infty}$ or $\boldsymbol{\alpha}_2 \to \boldsymbol{\infty}$. For convenience, let $\boldsymbol{\mu}_2 = \mathrm{E}[\mathbf{X}_2 \mid \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2]$ and $\boldsymbol{\Sigma}_2 = \mathrm{cov}[\mathbf{X}_2 \mid \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2]$.
For the first case, as $\boldsymbol{\beta}_2 \to -\boldsymbol{\infty}$, we have that $\boldsymbol{\mu}_2 \to \boldsymbol{\beta}_2$ and $\boldsymbol{\Sigma}_2 \to \mathbf{0}_{r_2\times r_2}$. Analogously, we have that $\boldsymbol{\mu}_2 \to \boldsymbol{\alpha}_2$ and $\boldsymbol{\Sigma}_2 \to \mathbf{0}_{r_2\times r_2}$ as $\boldsymbol{\alpha}_2 \to \boldsymbol{\infty}$. Hence, $\mathbf{X}_2 \mid (\boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2)$ becomes degenerate at $\boldsymbol{\mu}_2$, and then
$$ \mathbf{X}_{1.2} \stackrel{d}{=} \mathbf{X}_1 \mid (\mathbf{X}_2 = \boldsymbol{\mu}_2) \sim EC_{r_1}\big(\boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\boldsymbol{\mu}_2 - \boldsymbol{\xi}_2),\ \boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21},\ h^{(r_1)}_{\boldsymbol{\mu}_2}\big). $$
Given that $\mathbf{X}_2$ is degenerate, so that $\mathrm{cov}[\mathbf{X}_2] = \mathbf{0}$ and $\mathrm{cov}[\mathbf{X}_1, \mathbf{X}_2] = \mathbf{0}$, it follows that
$$ \mathrm{E}[\mathbf{X} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}] = \begin{pmatrix} \boldsymbol{\mu}_{1.2} \\ \boldsymbol{\mu}_2 \end{pmatrix} \quad \text{and} \quad \mathrm{cov}[\mathbf{X} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}] = \begin{pmatrix} \boldsymbol{\Sigma}_{1.2} & \mathbf{0}_{r_1\times r_2} \\ \mathbf{0}_{r_2\times r_1} & \mathbf{0}_{r_2\times r_2} \end{pmatrix}, \qquad (22) $$
with $\boldsymbol{\mu}_{1.2} = \mathrm{E}[\mathbf{X}_{1.2} \mid \boldsymbol{\alpha}_1 \le \mathbf{X}_{1.2} \le \boldsymbol{\beta}_1]$ and $\boldsymbol{\Sigma}_{1.2} = \mathrm{cov}[\mathbf{X}_{1.2} \mid \boldsymbol{\alpha}_1 \le \mathbf{X}_{1.2} \le \boldsymbol{\beta}_1]$ being the mean and variance-covariance matrix of an $r_1$-variate TE distribution.

In the event that there are double infinite limits, we can partition the vector as well, in order to avoid the unnecessary calculation of these integrals. Now, consider $\mathbf{X} = (\mathbf{X}_1^\top, \mathbf{X}_2^\top)^\top$ partitioned such that the upper and lower truncation limits associated with $\mathbf{X}_1$ are both infinite, but at least one of the truncation limits associated with $\mathbf{X}_2$ is finite. Here, $r_1$ is the number of pairs in $(\boldsymbol{\alpha}, \boldsymbol{\beta})$ that are both infinite, that is, $\dim(\mathbf{X}_1) = r_1$ and $\dim(\mathbf{X}_2) = r_2$, by complement. Since $\boldsymbol{\alpha}_1 = -\boldsymbol{\infty}$ and $\boldsymbol{\beta}_1 = \boldsymbol{\infty}$, it follows that $\mathbf{X}_2 \mid (\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}) \sim TEC_{r_2}(\boldsymbol{\xi}_2, \boldsymbol{\Omega}_{22}, h^{(r_2)}; [\boldsymbol{\alpha}_2, \boldsymbol{\beta}_2])$ and $\mathbf{X}_1 \mid \mathbf{X}_2 \sim EC_{r_1}\big(\boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\mathbf{X}_2 - \boldsymbol{\xi}_2),\ \boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21},\ h^{(r_1)}_{\mathbf{X}_2}\big)$. Let $\boldsymbol{\mu}_2 = \mathrm{E}[\mathbf{X}_2 \mid \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2]$ and $\boldsymbol{\Sigma}_2 = \mathrm{cov}[\mathbf{X}_2 \mid \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2]$. Hence, it follows that $\mathrm{E}[\mathbf{X} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}] = \mathrm{E}[\mathrm{E}[\mathbf{X} \mid \mathbf{X}_2] \mid \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2]$, that is,
$$ \mathrm{E}[\mathbf{X} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}] = \mathrm{E}\!\left[ \begin{pmatrix} \boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\mathbf{X}_2 - \boldsymbol{\xi}_2) \\ \mathbf{X}_2 \end{pmatrix} \,\middle|\, \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2 \right] = \begin{pmatrix} \boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\boldsymbol{\mu}_2 - \boldsymbol{\xi}_2) \\ \boldsymbol{\mu}_2 \end{pmatrix}. \qquad (23) $$
On the other hand, we have that $\mathrm{cov}[\mathbf{X}_2, \mathrm{E}[\mathbf{X}_1 \mid \mathbf{X}_2]] = \boldsymbol{\Sigma}_2\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21}$, $\mathrm{cov}[\mathrm{E}[\mathbf{X}_1 \mid \mathbf{X}_2]] = \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Sigma}_2\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21}$ and $\mathrm{E}[\mathrm{cov}[\mathbf{X}_1 \mid \mathbf{X}_2]] = \omega_{1.2}\,(\boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21})$, with $\omega_{1.2}$ being a constant depending on the conditional generating function $h^{(r_1)}_{\mathbf{X}_2}$. Finally,
$$ \mathrm{cov}[\mathbf{X} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}] = \begin{pmatrix} \omega_{1.2}\boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\big(\omega_{1.2}\mathbf{I}_{r_2} - \boldsymbol{\Sigma}_2\boldsymbol{\Omega}_{22}^{-1}\big)\boldsymbol{\Omega}_{21} & \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Sigma}_2 \\ \boldsymbol{\Sigma}_2\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21} & \boldsymbol{\Sigma}_2 \end{pmatrix}, \qquad (24) $$
where $\boldsymbol{\mu}_2$ and $\boldsymbol{\Sigma}_2$ are the mean vector and variance-covariance matrix of a TE distribution, so we can use (19) and (20) as well.

Remark 1.
Note that $\mathbf{X}_1 \mid (\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta})$ does not follow a non-truncated elliptical distribution, that is, $\mathbf{X}_1 \mid (\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}) \nsim EC_{r_1}(\boldsymbol{\xi}_1, \boldsymbol{\Omega}_{11}, h^{(r_1)})$, even though $-\boldsymbol{\infty} \le \mathbf{X}_1 \le \boldsymbol{\infty}$. This occurs because $\mathbf{X}_1 \mid (\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}) \stackrel{d}{=} \mathbf{X}_1 \mid (\boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2)$. In general, the marginal distributions of a TE distribution are not TE; however, this does hold for $\mathbf{X}_2$, due to the particular case $\boldsymbol{\alpha}_1 = -\boldsymbol{\infty}$ and $\boldsymbol{\beta}_1 = \boldsymbol{\infty}$, since then $\mathbf{X}_2 \mid (\boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}) \stackrel{d}{=} \mathbf{X}_2 \mid (\boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2)$.

Particular cases
Notice that the constant $\omega_{1.2}$ varies depending on the elliptical distribution in use. For instance, if $\mathbf{X} \sim t_{r_1+r_2}(\boldsymbol{\xi}, \boldsymbol{\Omega}, \nu)$, then $\mathbf{X}_2 \sim t_{r_2}(\boldsymbol{\xi}_2, \boldsymbol{\Omega}_{22}, \nu)$ and $\mathbf{X}_1 \mid \mathbf{X}_2 \sim t_{r_1}\big(\boldsymbol{\xi}_1 + \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}(\mathbf{X}_2 - \boldsymbol{\xi}_2),\ (\boldsymbol{\Omega}_{11} - \boldsymbol{\Omega}_{12}\boldsymbol{\Omega}_{22}^{-1}\boldsymbol{\Omega}_{21})/\nu(\mathbf{X}_2),\ \nu + r_2\big)$. In this case, the constant takes the form
$$ \omega_{1.2} = \mathrm{E}\!\left[ \frac{\nu + \delta(\mathbf{X}_2)}{\nu + r_2 - 2} \,\middle|\, \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2 \right] = \left( \frac{\nu}{\nu - 2} \right) \frac{L_{r_2}(\boldsymbol{\alpha}_2, \boldsymbol{\beta}_2; \boldsymbol{\xi}_2, \nu\boldsymbol{\Omega}_{22}/(\nu - 2), \nu - 2)}{L_{r_2}(\boldsymbol{\alpha}_2, \boldsymbol{\beta}_2; \boldsymbol{\xi}_2, \boldsymbol{\Omega}_{22}, \nu)}, \qquad (25) $$
where $L_r(\boldsymbol{\alpha}, \boldsymbol{\beta}; \boldsymbol{\xi}, \boldsymbol{\Omega}, \nu)$ denotes the integral
$$ L_r(\boldsymbol{\alpha}, \boldsymbol{\beta}; \boldsymbol{\xi}, \boldsymbol{\Omega}, \nu) = \int_{\boldsymbol{\alpha}}^{\boldsymbol{\beta}} t_r(\mathbf{y}; \boldsymbol{\xi}, \boldsymbol{\Omega}, \nu)\,\mathrm{d}\mathbf{y}, \qquad (26) $$
that is, $L_r(\boldsymbol{\alpha}, \boldsymbol{\beta}; \boldsymbol{\xi}, \boldsymbol{\Omega}, \nu) = P(\boldsymbol{\alpha} \le \mathbf{Y} \le \boldsymbol{\beta})$ for $\mathbf{Y} \sim t_r(\boldsymbol{\xi}, \boldsymbol{\Omega}, \nu)$. The probabilities in (25) are involved in the calculation of $\boldsymbol{\mu}_2$ and $\boldsymbol{\Sigma}_2$, so they can be recycled. For the normal case, it is straightforward to see that $\omega_{1.2} = 1$, by taking $\nu \to \infty$.

As can be seen, we can use equations (23) and (24) to deal with double infinite limits, where the truncated moments are computed only over an $r_2$-variate partition, avoiding some unnecessary integrals and saving significant computational effort. On the other hand, expression (22) lets us approximate the mean and the variance-covariance matrix in cases where the computational precision is a limitation.

It is well known that for some members of the EC family of distributions the moments do not exist; however, this can change depending on the truncation limits. Let $\mathbf{X} \sim EC_r(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(r)})$ be partitioned as in Subsection 3.3.2, with $r_2$ now being the number of pairs in $(\boldsymbol{\alpha}, \boldsymbol{\beta})$ that are both finite and $r_1 = r - r_2$. Similarly, $\boldsymbol{\kappa} = (\boldsymbol{\kappa}_1^\top, \boldsymbol{\kappa}_2^\top)^\top$ is partitioned as well. If $r_2 = r$, then the truncation limits $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ contain only finite elements, and hence $\mathrm{E}[\mathbf{X}^{\boldsymbol{\kappa}} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}]$ exists for all $\boldsymbol{\kappa} \in \mathbb{N}^r$, because the distribution is bounded. When $r_1 \ge 1$, there exists at least one pair in $(\boldsymbol{\alpha}, \boldsymbol{\beta})$ containing infinite values, and the expectation may not exist.
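For a univariate bounded block ($r_2 = 1$), formula (25) reduces to a ratio of univariate Student's $t$ probabilities. The following is a Python sketch with illustrative values (the function name and its arguments are ours, not the paper's):

```python
# omega_{1.2} of (25) for r2 = 1: a ratio of univariate Student-t probabilities.
import numpy as np
from scipy.stats import t as student_t

def omega12(alpha2, beta2, xi2, om22, nu):
    s = np.sqrt(om22)                        # scale of the t_nu marginal
    s_adj = np.sqrt(nu * om22 / (nu - 2.0))  # inflated scale, df nu - 2
    num = (student_t.cdf((beta2 - xi2) / s_adj, df=nu - 2)
           - student_t.cdf((alpha2 - xi2) / s_adj, df=nu - 2))
    den = (student_t.cdf((beta2 - xi2) / s, df=nu)
           - student_t.cdf((alpha2 - xi2) / s, df=nu))
    return (nu / (nu - 2.0)) * num / den

print(omega12(-np.inf, np.inf, 0.0, 1.0, 5.0))  # no truncation: nu/(nu-2) = 5/3
print(omega12(-1.0, 1.0, 0.0, 1.0, 5.0))        # central truncation shrinks delta, so smaller
```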
Given that $\mathrm{E}[\mathbf{X}^{\boldsymbol{\kappa}} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}] = \mathrm{E}\big[\mathbf{X}_2^{\boldsymbol{\kappa}_2}\, \mathrm{E}[\mathbf{X}_1^{\boldsymbol{\kappa}_1} \mid \mathbf{X}_2, \boldsymbol{\alpha}_1 \le \mathbf{X}_1 \le \boldsymbol{\beta}_1] \mid \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2\big]$, that $\mathrm{E}[g(\mathbf{X}_2) \mid \boldsymbol{\alpha}_2 \le \mathbf{X}_2 \le \boldsymbol{\beta}_2]$ always exists for any measurable function $g$ (the region is bounded), and that $(\boldsymbol{\alpha}_1, \boldsymbol{\beta}_1)$ is not bounded, it is straightforward to see that $\mathrm{E}[\mathbf{X}^{\boldsymbol{\kappa}} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}]$ exists if and only if (iff) the inner expectation $\mathrm{E}[\mathbf{X}_1^{\boldsymbol{\kappa}_1} \mid \mathbf{X}_2]$ exists.
As seen, the existence depends only on the order $\boldsymbol{\kappa}_1$ of the moment and on the distribution of $\mathbf{X}_1 \mid \mathbf{X}_2$, the latter depending on the conditional generating function $h^{(r_1)}_{\mathbf{X}_2}$. If $\mathbf{Y} \sim SLCT\text{-}EC_{p,q}(\boldsymbol{\xi}, \boldsymbol{\Omega}, h^{(q+p)}, C)$, with truncation subset of the form $C(\mathbf{c}, \mathbf{d})$ and $r = p + q$ say, it follows from Theorem 1 that $\mathrm{E}[\mathbf{Y}^{\mathbf{k}} \mid \mathbf{a} \le \mathbf{Y} \le \mathbf{b}] = \mathrm{E}[\mathbf{X}^{\boldsymbol{\kappa}} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}]$. Hence, the same condition holds, taking into account that $\boldsymbol{\kappa} = (\mathbf{0}_q^\top, \mathbf{k}^\top)^\top$, $\boldsymbol{\alpha} = (\mathbf{c}^\top, \mathbf{a}^\top)^\top$ and $\boldsymbol{\beta} = (\mathbf{d}^\top, \mathbf{b}^\top)^\top$. Next, we present a result for a particular case.

For the rest of the paper, we shall focus our attention on the computation of the moments of the doubly truncated unified skew-$t$ (TSUT) distribution, denoted by $\mathbf{W} \sim TSUT_{p,q}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}, \boldsymbol{\tau}, \nu, \boldsymbol{\Psi}; (\mathbf{a}, \mathbf{b}))$. Besides, we shall study some of its properties, as well as those of its particular case (when $q = 1$), the doubly truncated extended skew-$t$ distribution, say $\mathbf{W} \sim TEST_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \tau, \nu; (\mathbf{a}, \mathbf{b}))$. For the limiting symmetric case, we shall use the notation $\mathbf{W} \sim Tt_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu; (\mathbf{a}, \mathbf{b}))$ to refer to a $p$-variate truncated Student's $t$ (TT) distribution on $(\mathbf{a}, \mathbf{b}) \in \mathbb{R}^p$. Finally, $\mathbf{W} \sim TN_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}; (\mathbf{a}, \mathbf{b}))$ will stand for a $p$-variate truncated normal distribution on the interval $(\mathbf{a}, \mathbf{b})$. Hereinafter, we shall omit the expression "doubly," since we only work with interval truncation.

Corollary 2 (moments of a TSUT). If $\mathbf{Y} \sim SUT_{p,q}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}, \boldsymbol{\tau}, \nu, \boldsymbol{\Psi})$, it follows from Theorem 1 that
$$ \mathrm{E}[\mathbf{Y}^{\mathbf{k}} \mid \mathbf{a} \le \mathbf{Y} \le \mathbf{b}] = \mathrm{E}[\mathbf{X}^{\boldsymbol{\kappa}} \mid \boldsymbol{\alpha} \le \mathbf{X} \le \boldsymbol{\beta}], $$
where $\mathbf{X} \sim t_{q+p}(\boldsymbol{\xi}, \boldsymbol{\Omega}, \nu)$ with $\boldsymbol{\xi}$ and $\boldsymbol{\Omega}$ as defined in Equation (13), and $\boldsymbol{\kappa} = (\mathbf{0}_q^\top, \mathbf{k}^\top)^\top$, $\boldsymbol{\alpha} = (\mathbf{0}_q^\top, \mathbf{a}^\top)^\top$ and $\boldsymbol{\beta} = (\boldsymbol{\infty}_q^\top, \mathbf{b}^\top)^\top$.

Let $\mathbf{Y} \sim TSUT_{p,q}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}, \boldsymbol{\tau}, \nu, \boldsymbol{\Psi}; (\mathbf{a}, \mathbf{b}))$ and $\mathbf{X} \sim Tt_{q+p}(\boldsymbol{\xi}, \boldsymbol{\Omega}, \nu; (\boldsymbol{\alpha}, \boldsymbol{\beta}))$. From Corollary 2, we have that the first two moments of $\mathbf{Y}$ can be computed as
$$ \mathrm{E}[\mathbf{Y}] = \mathbf{m}_2, \qquad (27) $$
$$ \mathrm{E}[\mathbf{Y}\mathbf{Y}^\top] = \mathbf{M}_{22}, \qquad (28) $$
where $\mathbf{m} = \mathrm{E}[\mathbf{X}]$ and $\mathbf{M} = \mathrm{E}[\mathbf{X}\mathbf{X}^\top]$ are partitioned as in Corollary 1.
Notice that $\mathrm{cov}[\mathbf{Y}] = \mathrm{E}[\mathbf{Y}\mathbf{Y}^\top] - \mathrm{E}[\mathbf{Y}]\,\mathrm{E}[\mathbf{Y}^\top]$. Equations (27) and (28) are convenient for computing $\mathrm{E}[\mathbf{Y}]$ and $\mathrm{cov}[\mathbf{Y}]$, since everything boils down to computing the mean and the variance-covariance matrix of a $(q+p)$-variate TT distribution, which can be calculated using our MomTrunc R package available on CRAN.
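MomTrunc computes these moments in closed form in R; as a language-neutral cross-check, the first moment in (27) can also be approximated by brute-force rejection sampling on the augmented truncated $t$ vector. The Python sketch below uses illustrative parameter values throughout, and a Cholesky factor as a stand-in for the square root matrix $\boldsymbol{\Sigma}^{1/2}$:

```python
# Monte Carlo sketch of (27): E[Y] for a TSUT-type vector is the lower
# (p-variate) block of the mean of a (q+p)-variate truncated Student-t.
import numpy as np
from scipy.stats import multivariate_t

q, nu = 1, 5.0
tau = np.array([0.5])
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
Lam = np.array([[1.0], [-0.5]])        # p x q shape matrix
Psi = np.array([[1.0]])

# augmented parameters as in (13); Cholesky factor as a stand-in square root
Sroot = np.linalg.cholesky(Sigma)
O21 = Sroot @ Lam
xi = np.concatenate([tau, mu])
Omega = np.block([[Psi + Lam.T @ Lam, O21.T], [O21, Sigma]])

a, b = np.array([-1.0, -1.0]), np.array([1.0, 1.5])
alpha = np.concatenate([np.zeros(q), a])        # latent block kept on (0, inf)
beta = np.concatenate([np.full(q, np.inf), b])

x = multivariate_t(loc=xi, shape=Omega, df=nu).rvs(size=400_000, random_state=2)
keep = np.all((x >= alpha) & (x <= beta), axis=1)
mean_Y = x[keep, q:].mean(axis=0)               # the m2 block of (19)
print(mean_Y)                                   # E[Y] estimate, inside (a, b)
```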
Existence of the moments of a TSUT
Let $p_1$ be the number of pairs in $(\mathbf{a}, \mathbf{b})$ that are both finite. Without loss of generality, we assume $\mathbf{Y} = (\mathbf{Y}_1^\top, \mathbf{Y}_2^\top)^\top$, where the upper and lower truncation limits associated with $\mathbf{Y}_1$ are both finite, but at least one of the truncation limits associated with $\mathbf{Y}_2$ is not finite, say $\dim(\mathbf{Y}_1) = p_1$ and $\dim(\mathbf{Y}_2) = p_2$, with $p_1 + p_2 = p$. Consider the partitions $\mathbf{a} = (\mathbf{a}_1^\top, \mathbf{a}_2^\top)^\top$, $\mathbf{b} = (\mathbf{b}_1^\top, \mathbf{b}_2^\top)^\top$ and $\mathbf{k} = (\mathbf{k}_1^\top, \mathbf{k}_2^\top)^\top$ as well. The next proposition gives a condition for the existence of the moments of a TSUT distribution.

Proposition 2 (existence of the moments of a TSUT). Under the conditions above, $\mathrm{E}[\mathbf{Y}^{\mathbf{k}} \mid \mathbf{a} \le \mathbf{Y} \le \mathbf{b}]$ exists iff $\mathrm{sum}(\mathbf{k}_2) < \nu + p_1$.

Proof. From Subsection 3.4, it suffices to show that $\mathrm{E}[\mathbf{X}_1^{\boldsymbol{\kappa}_1} \mid \mathbf{X}_2]$ exists. Since $\boldsymbol{\alpha} = (\mathbf{0}_q^\top, \mathbf{a}_1^\top, \mathbf{a}_2^\top)^\top$ and $\boldsymbol{\beta} = (\boldsymbol{\infty}_q^\top, \mathbf{b}_1^\top, \mathbf{b}_2^\top)^\top$, it follows that $r_2 = p_1$, $r_1 = q + p_2$, $\boldsymbol{\kappa}_2 = \mathbf{k}_1$ and $\boldsymbol{\kappa}_1 = (\mathbf{0}_q^\top, \mathbf{k}_2^\top)^\top$. It is easy to show that the distribution of $\mathbf{X}_1 \mid \mathbf{X}_2$ is a $(q + p_2)$-variate Student's $t$ distribution with $\nu + p_1$ degrees of freedom. Hence, the above expectation exists iff $\mathrm{sum}(\mathbf{k}_2) < \nu + p_1$.

From Proposition 2, we see that $\mathrm{E}[\mathbf{Y}]$ and $\mathrm{E}[\mathbf{Y}\mathbf{Y}^\top]$ exist iff $\nu + p_1 > 1$ and $\nu + p_1 > 2$, respectively.

Remark 2 (sufficient condition for the existence of the first two moments of a TSUT). Since $\nu > 0$, this is equivalent to saying that the first moment exists if at least one dimension with both limits finite exists, and the second moment exists if at least two dimensions with both limits finite exist.
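Proposition 2 translates into a one-line check. The helper below is a hypothetical Python function of our own (its name and signature are not from the paper or any package) that applies the condition $\mathrm{sum}(\mathbf{k}_2) < \nu + p_1$:

```python
# Existence check from Proposition 2: sum(k2) < nu + p1, where p1 counts the
# dimensions whose truncation limits are both finite and k2 collects the
# moment orders on the remaining (unbounded) dimensions.
import numpy as np

def tsut_moment_exists(k, a, b, nu):
    finite = np.isfinite(a) & np.isfinite(b)      # doubly finite dimensions
    p1 = int(finite.sum())
    k2_sum = float(np.asarray(k)[~finite].sum())  # orders on unbounded dimensions
    return k2_sum < nu + p1

inf = np.inf
a, b = np.array([-1.0, -inf]), np.array([1.0, inf])
print(tsut_moment_exists([0, 1], a, b, nu=0.5))   # True: 1 < 0.5 + 1
print(tsut_moment_exists([0, 3], a, b, nu=1.0))   # False: 3 >= 1.0 + 1
```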
Figure 2: Simulation study. Contour plot for the TSUT density (upper left corner) and trace plots of the evolution of the MC estimates for the mean and variance-covariance elements of $\mathbf{Y}$. The solid line represents the true value estimated by our proposal.

Remark 3.
The sufficient conditions aforementioned hold for the truncated Student's $t$ ($q = 0$) and for the truncated EST distribution ($q = 1$), since the condition does not depend on $q$. Next, in light of Proposition 1, we propose a corollary for the limiting case of a SUT pdf when $\boldsymbol{\tau} \to -\boldsymbol{\infty}$.

Corollary 3.
Under the conditions of Proposition 1, as $\boldsymbol{\tau} \to -\boldsymbol{\infty}$, i.e., $\tau_i \to -\infty$, $i = 1, \ldots, q$, it holds that
$$ SUT_{p,q}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}, \boldsymbol{\tau}, \nu, \boldsymbol{\Psi}) \longrightarrow t_p(\boldsymbol{\gamma}, \omega_{\boldsymbol{\tau}}\boldsymbol{\Gamma}, \nu + q), \qquad (29) $$
with $\boldsymbol{\gamma} = \boldsymbol{\mu} - \boldsymbol{\Omega}_{21}\boldsymbol{\Omega}_{11}^{-1}\boldsymbol{\tau}$, $\boldsymbol{\Gamma} = \boldsymbol{\Sigma} - \boldsymbol{\Omega}_{21}\boldsymbol{\Omega}_{11}^{-1}\boldsymbol{\Omega}_{12}$ and $\omega_{\boldsymbol{\tau}} = (\nu + \boldsymbol{\tau}^\top\boldsymbol{\Omega}_{11}^{-1}\boldsymbol{\tau})/(\nu + q)$, where $\boldsymbol{\Omega}_{11} = \boldsymbol{\Psi} + \boldsymbol{\Lambda}^\top\boldsymbol{\Lambda}$. In particular, for $q = 1$,
$$ EST_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\lambda}, \tau, \nu) \longrightarrow t_p\big(\boldsymbol{\gamma},\ \tfrac{\nu + \tilde{\tau}^2}{\nu + 1}\boldsymbol{\Gamma},\ \nu + 1\big), \qquad (30) $$
with $\boldsymbol{\gamma} = \boldsymbol{\mu} - \tilde{\tau}\boldsymbol{\Delta}$, $\boldsymbol{\Gamma} = \boldsymbol{\Sigma} - \boldsymbol{\Delta}\boldsymbol{\Delta}^\top$, and $\boldsymbol{\Delta} = \boldsymbol{\Sigma}^{1/2}\boldsymbol{\lambda}/\sqrt{1 + \boldsymbol{\lambda}^\top\boldsymbol{\lambda}}$.

It is worth stressing that the parameters $\boldsymbol{\Delta}$ and $\boldsymbol{\Gamma}$ are well known in the context of SN and ST modeling, since they are used in the stochastic representations of these variates. Furthermore, the resulting symmetric distribution is highly involved in the framework of censored modeling, as shown in Section 6.

In order to illustrate our method, we performed a simple Monte Carlo (MC) simulation study to show how MC estimators for the mean vector and variance-covariance matrix elements converge to the true values computed by our method. We consider a bivariate TSUT distribution $\mathbf{Y} \sim TSUT_{2,2}(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}, \boldsymbol{\tau}, \nu, \boldsymbol{\Psi}; (\mathbf{a}, \mathbf{b}))$ with finite lower and upper truncation limits $\mathbf{a}$ and $\mathbf{b}$, null location vector $\boldsymbol{\mu} = \mathbf{0}$, $\nu = 4$ degrees of freedom, and fixed values of the parameters $\boldsymbol{\tau}$, $\boldsymbol{\Sigma}$, $\boldsymbol{\Lambda}$ and $\boldsymbol{\Psi}$.
Figure 2 shows the contour plot of the TSUT density (upper left corner) as well as the evolution trace of the MC estimates for the mean (first row) and variance-covariance (last row) elements µ₁, µ₂, σ₁₁, σ₁₂ and σ₂₂. The true values for the mean vector E[Y] and the variance-covariance matrix cov[Y] were computed using equations (27) and (28), and are depicted as blue solid lines in Figure 2. Note that even with 1000 MC simulations there exists a significant variation in the chains.

Under an interval censoring mechanism, the implementation of inference depends on the computation of certain marginal and conditional expectations ([17]). For instance, for X = (X₁, X₂⊤)⊤ ∼ N_{1+p}(ξ, Ω), as in (13), with Ψ = 1, Λ = λ and τ = 0, it holds that f_{X₁}(0 | X₂ = Y) = φ(λ⊤Σ^{-1/2}(Y − µ)). Then,

E[ g(Y) f_{X₁}(0 | X₂ = Y) / P(X₁ > 0 | X₂ = Y) ] = E[ g(Y) φ(λ⊤Σ^{-1/2}(Y − µ)) / Φ(λ⊤Σ^{-1/2}(Y − µ)) ],   (31)

where g(·) is a measurable function. The expectation on the right-hand side of (31) is heavily used to perform inference under SN censored models from a likelihood-based perspective, for instance in the E-step of the EM algorithm ([18]).

Next, we derive general expressions that are involved in interval censored modeling, specifically in the E-step of the EM algorithm. These expressions arise when we consider the responses Yᵢ, i = 1, . . . , n, to be i.i.d. realizations from a selection elliptical distribution or any of its particular cases, for instance a SUT, EST or ST distribution, or any normal limiting case such as the SUN, ESN or SN distribution, as in the example in (31).

Lemma 1.
Let X = (X₁⊤, X₂⊤)⊤ ∼ EC_{q+p}(ξ, Ω, h^{(q+p)}) and Y ∼ TSLCT-EC_{p,q}(ξ, Ω, h^{(q+p)}, C; (a, b)) with truncation subset C = C(0). For any measurable function g(y) : ℝᵖ → ℝ, we have that

E[ g(Y) f_{X₁}(0 | X₂ = Y) / P(X₁ > 0 | X₂ = Y) ] = [ P(a ≤ W̃ ≤ b) / P(a ≤ Y ≤ b) ] E[g(W̃)] f_{X₁}(0) / P(X₁ ≥ 0),   (32)

where X₁ ∼ EC_q(ξ₁, Ω₁₁, h^{(q)}), Y ∼ SLCT-EC_{p,q}(ξ, Ω, h^{(q+p)}, C(0)), W ∼ EC_p(ξ₂ − Ω₂₁Ω₁₁⁻¹ξ₁, Ω₂₂ − Ω₂₁Ω₁₁⁻¹Ω₁₂, h^{(p)}) and W̃ d= W | (a ≤ W ≤ b).

Proof. Using basic probability theory, we have

E[ g(Y) f_{X₁}(0 | X₂ = Y) / P(X₁ > 0 | X₂ = Y) ]
= 1/P(a ≤ Y ≤ b) ∫ₐᵇ g(y) [ f_{X₁}(0 | X₂ = y) / P(X₁ > 0 | X₂ = y) ] f_Y(y) dy
= 1/P(a ≤ Y ≤ b) ∫ₐᵇ g(y) [ f_{X₁}(0 | X₂ = y) / P(X₁ > 0 | X₂ = y) ] [ P(X₁ > 0 | X₂ = y) f_{X₂}(y) / P(X₁ > 0) ] dy
= 1/P(a ≤ Y ≤ b) ∫ₐᵇ g(y) f_{X₁}(0 | X₂ = y) f_{X₂}(y) / P(X₁ > 0) dy
= [ 1/P(a ≤ Y ≤ b) ] [ f_{X₁}(0) / P(X₁ > 0) ] ∫ₐᵇ g(y) f_{X₂}(y | X₁ = 0) dy
= [ P(a ≤ W ≤ b) / P(a ≤ Y ≤ b) ] E[g(W̃)] f_{X₁}(0) / P(X₁ > 0),

where W d= X₂ | (X₁ = 0) and W̃ d= W | (a ≤ W ≤ b).
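The identity in Lemma 1 can be checked numerically; the sketch below does so by Monte Carlo for the simplest normal case p = q = 1 with g(y) = y² (all numerical choices are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Normal case p = q = 1: X = (X1, X2) bivariate normal with correlation rho,
# Y =d X2 | (X1 > 0) (a skew-normal), W =d X2 | (X1 = 0) ~ N(0, 1 - rho^2).
rho, a, b = 0.6, -0.5, 1.0
s = np.sqrt(1.0 - rho**2)
g = lambda t: t**2                       # any measurable g; here g(y) = y^2

x = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=2_000_000)
y = x[x[:, 0] > 0, 1]                    # draws from Y
inside = (y > a) & (y < b)
yt = y[inside]                           # draws from Y | (a <= Y <= b)

# Left-hand side of (32): E[ g(Yt) f_{X1|X2}(0 | Yt) / P(X1 > 0 | X2 = Yt) ]
lhs = np.mean(g(yt) * stats.norm.pdf(0.0, loc=rho * yt, scale=s)
              / stats.norm.cdf(rho * yt / s))

# Right-hand side: (P(a<=W<=b)/P(a<=Y<=b)) * E[g(Wt)] * f_{X1}(0)/P(X1 >= 0)
p_w = stats.norm.cdf(b / s) - stats.norm.cdf(a / s)
p_y = inside.mean()
wt = stats.truncnorm(a / s, b / s, loc=0.0, scale=s)   # W | (a <= W <= b)
rhs = (p_w / p_y) * wt.moment(2) * stats.norm.pdf(0.0) / 0.5

print(lhs, rhs)  # the two sides agree up to Monte Carlo error
```

Here f_{X₁}(0) = φ(0) and P(X₁ ≥ 0) = 1/2 since X₁ is standard normal, matching the factor f_{X₁}(0)/P(X₁ ≥ 0) in (32).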
Lemma 2.
Consider X, Y and g as in Lemma 1. Now, consider Y to be partitioned as Y = (Y₁⊤, Y₂⊤)⊤ of dimensions p₁ and p₂ (p₁ + p₂ = p), with a and b partitioned conformably. For a given random variable U, let U* stand for U* d= U | Y₁. It follows that

E[ g(Y₂) f_{X₁}(0 | X₂ = Y) / P(X₁ > 0 | X₂ = Y) | Y₁ ] = [ P(a₂ ≤ W̃ ≤ b₂) / P(a₂ ≤ Y₂* ≤ b₂) ] E[g(W̃)] f_{X₁*}(0) / P(X₁* > 0)   (33)

with X, Y, and W as defined in Lemma 1, and W̃ d= W* | (a₂ ≤ W* ≤ b₂).

Proof. Consider X₂ partitioned as X₂ = (X₂₁⊤, X₂₂⊤)⊤ such that dim(X₂₁) = dim(Y₁) and dim(X₂₂) = dim(Y₂). Since f_{Y₂}(y₂ | Y₁ = y₁) = f_Y(y)/f_{Y₁}(y₁), it follows (in a similar manner to the proof of Lemma 1) that

E[ g(Y₂) f_{X₁}(0 | X₂ = Y) / P(X₁ > 0 | X₂ = Y) | Y₁ ]
= 1/P(a₂ ≤ Y₂* ≤ b₂) ∫_{a₂}^{b₂} g(y₂) [ f_{X₁}(0 | X₂ = y) / P(X₁ > 0 | X₂ = y) ] [ P(X₁ > 0 | X₂ = y) f_{X₂}(y) ] / [ P(X₁ > 0 | X₂₁ = y₁) f_{X₂₁}(y₁) ] dy₂
= 1/P(a₂ ≤ Y₂* ≤ b₂) ∫_{a₂}^{b₂} g(y₂) f_{X₁}(0 | X₂ = y) f_{X₂}(y) / [ P(X₁ > 0 | X₂₁ = y₁) f_{X₂₁}(y₁) ] dy₂
= [ f_{X₁}(0 | X₂₁ = y₁) / P(X₁ > 0 | X₂₁ = y₁) ] × 1/P(a₂ ≤ Y₂* ≤ b₂) ∫_{a₂}^{b₂} g(y₂) f_{X₂₂}(y₂ | X₂₁ = y₁, X₁ = 0) dy₂
= [ P(a₂ ≤ W* ≤ b₂) / P(a₂ ≤ Y₂* ≤ b₂) ] E[g(W̃)] f_{X₁*}(0) / P(X₁* > 0),

where the third equality uses f_{X₁}(0 | X₂ = y) f_{X₂}(y) = f_{X₁}(0 | X₂₁ = y₁) f_{X₂₁}(y₁) f_{X₂₂}(y₂ | X₂₁ = y₁, X₁ = 0), and where W* d= X₂₂ | (X₂₁ = y₁, X₁ = 0) and W̃ d= W* | (a₂ ≤ W* ≤ b₂).

In the next corollaries, we particularize the aforementioned lemmas to the truncated SUT, EST, SUN and ESN distributions.

Corollary 4.
Under the conditions of Lemma 1, let Y ∼ TSUT_{p,q}(µ, Σ, Λ, τ, ν, Ψ; (a, b)). For any measurable function g(y) : ℝᵖ → ℝ, we have that

E[ g(Y) t_q((τ + Λ⊤Σ^{-1/2}(Y − µ)) ν(Y); Ψ, ν + p) / T_q((τ + Λ⊤Σ^{-1/2}(Y − µ)) ν(Y); Ψ, ν + p) ] = [ P(a ≤ W̃ ≤ b) / P(a ≤ Y ≤ b) ] E[g(W̃)] η,   (34)

where η = t_q(τ; Ψ + Λ⊤Λ, ν)/T_q(τ; Ψ + Λ⊤Λ, ν), Y ∼ SUT_{p,q}(µ, Σ, Λ, τ, ν, Ψ), W ∼ t_p(γ, ω_τ Γ, ν + q) and W̃ d= W | (a ≤ W ≤ b). When τ = 0, we have that η = 2 t_q(0; Ψ + Λ⊤Λ, ν) and W ∼ t_p(µ, νΓ/(ν + q), ν + q).

In particular, for q = 1, Y ∼ TEST_p(µ, Σ, λ, τ, ν; (a, b)), and

E[ g(Y) t((τ + λ⊤Σ^{-1/2}(Y − µ)) ν(Y); ν + p) / T((τ + λ⊤Σ^{-1/2}(Y − µ)) ν(Y); ν + p) ] = [ P(a ≤ W̃ ≤ b) / P(a ≤ Y ≤ b) ] η E[g(W̃)],   (35)

with η = t(τ; 1 + λ⊤λ, ν)/T(τ̃; ν), Y ∼ EST_p(µ, Σ, λ, τ, ν), W ∼ t_p(γ, (ν + τ̃²)Γ/(ν + 1), ν + 1), and W̃ d= W | (a ≤ W ≤ b). Similarly, when τ = 0, we have that η = 2 t(0; 1 + λ⊤λ, ν) and W ∼ t_p(µ, νΓ/(ν + 1), ν + 1).

Corollary 5.
Under the conditions of Lemma 1, as ν → ∞, let Y ∼ TSUN_{p,q}(µ, Σ, Λ, τ, Ψ; (a, b)). It follows that

E[ g(Y) φ_q(τ + Λ⊤Σ^{-1/2}(Y − µ); Ψ) / Φ_q(τ + Λ⊤Σ^{-1/2}(Y − µ); Ψ) ] = [ P(a ≤ W̃ ≤ b) / P(a ≤ Y ≤ b) ] E[g(W̃)] η,   (36)
where η = φ_q(τ; Ψ + Λ⊤Λ)/Φ_q(τ; Ψ + Λ⊤Λ), Y ∼ SUN_{p,q}(µ, Σ, Λ, τ, Ψ), W ∼ N_p(γ, Γ), and W̃ d= W | (a ≤ W ≤ b). When τ = 0, we have that η = 2φ_q(0; Ψ + Λ⊤Λ) and W ∼ N_p(µ, Γ).

In particular, for q = 1, Y ∼ TESN_p(µ, Σ, λ, τ; (a, b)), and

E[ g(Y) φ(τ + λ⊤Σ^{-1/2}(Y − µ)) / Φ(τ + λ⊤Σ^{-1/2}(Y − µ)) ] = [ P(a ≤ W̃ ≤ b) / P(a ≤ Y ≤ b) ] η E[g(W̃)],   (37)

with η = φ(τ; 1 + λ⊤λ)/Φ(τ̃), Y ∼ ESN_p(µ, Σ, λ, τ), W ∼ N_p(γ, Γ), and W̃ d= W | (a ≤ W ≤ b). Similarly, when τ = 0, we have that η = √(2/(π(1 + λ⊤λ))) and W ∼ N_p(µ, Γ).

Let Y be a random variable representing, in this context, the total loss in a portfolio investment, a credit score, etc. Let y_α be the (1 − α)th quantile of Y, that is, P(Y > y_α) = α. The tail conditional expectation (TCE) (see, e.g., [19]) is then defined as

TCE_Y(y_α) = E[Y | Y > y_α].   (38)

This can be interpreted as the expected value of the α% worst losses. The quantile y_α is usually chosen to be high in order to be pessimistic, for instance α = 0.05. Notice that, if we instead consider a variable Y that we are interested in maximizing, for example the pay-off of a portfolio, we simply compute TCE_{−Y}(−y_α) = −E[Y | Y ≤ −y_α], which is a measure of the worst expected income.

The main applications of the TCE are in actuarial science and financial economics: market risk, credit risk of a portfolio, insurance, capital requirements for financial institutions, among others. The TCE (also known as the tail value-at-risk, TVaR) represents an alternative to the traditional value-at-risk (VaR) that is more sensitive to the shape of the tail of the loss distribution. Furthermore, if Y is a continuous r.v., the TCE coincides with the well-known risk measure expected shortfall ([20]).
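For intuition, (38) can be approximated directly from a simulated sample of losses; a minimal sketch (the standard-normal benchmark φ(z_{0.95})/0.05 ≈ 2.063 follows from the usual normal tail formula):

```python
import numpy as np

rng = np.random.default_rng(0)

def tce(sample, alpha):
    """Empirical tail conditional expectation E[Y | Y > y_alpha],
    with y_alpha the (1 - alpha)-quantile of the sample."""
    y_alpha = np.quantile(sample, 1.0 - alpha)
    tail = sample[sample > y_alpha]
    return tail.mean()

# Standard normal losses: the exact TCE at alpha = 0.05 is
# phi(1.6449)/0.05, about 2.063, so the estimate should land close to it.
losses = rng.standard_normal(1_000_000)
print(tce(losses, 0.05))
```

Such empirical estimates are what the closed-form results of this section replace for SE distributions, where the tail expectation is available analytically through truncated moments.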
In contrast with VaR, the TCE is said to be a coherent risk measure, holding desirable mathematical properties in the context of risk measurement, and it is a convex function of the selection weights ([21, 22]). A good reference for several risk measures and their properties is [23].

Multivariate framework
Consider a set of p assets, business lines, or credit scores, Y = (Y₁, . . . , Y_p)⊤. In the multivariate case, the sum of risks arises as a natural and simple measure of total risk. Hence, the sum S = Y₁ + Y₂ + · · · + Y_p follows a univariate distribution and, from (38), the TCE for S is given by

TCE_S(s_α) = E[S | S > s_α].   (39)

Even though we may know the marginal distribution of S, it is preferable to compute the total risk TCE_S as a decomposed sum, that is,

E[S | S > s_α] = Σ_{i=1}^{p} E[Yᵢ | S > s_α],   (40)

where each term E[Yᵢ | S > s_α] represents the average amount of risk due to Yᵢ. This decomposed sum offers a way to study the individual impact of the elements of the set, being an improvement over (39).

In order to model combinations of correlated risks, [24] extended the TCE to the multivariate framework. The multivariate TCE (MTCE) is given by

MTCE_Y(y_α) = E[Y | Y > y_α] = E[Y | Y₁ > y_{α₁}, . . . , Y_p > y_{α_p}],   (41)

with α = (α₁, . . . , α_p)⊤ a vector of quantile levels of interest. Notice that the quantile level for the MTCE is fixed for each risk i = 1, . . . , p, in contrast with the TCE of the sum, which is fixed over the whole sum of risks S.

Consider Y ∼ SLCT-EC_{p,q}(ξ, Ω, h^{(q+p)}, C). Without loss of generality, we consider the selection subset C = C(0). It follows from Theorem 1 that

MTCE_Y(y_α) = E[X₂ | X > x_α],   (42)
with x_α = (0_q⊤, y_α⊤)⊤ and where X = (X₁⊤, X₂⊤)⊤ ∼ EC_{q+p}(ξ, Ω, h^{(q+p)}). It is noteworthy that the computation of the MTCE for Y following a SE distribution relies on the calculation of truncated moments for its symmetric elliptical multivariate counterpart.

On the other hand, by noticing that S = 1⊤Y, it follows from (7) that S follows a univariate SE distribution given by S ∼ SLCT-EC_{1,q}(ξ_S, Ω_S, h^{(q+1)}, C), with

ξ_S = (ξ₁⊤, 1⊤ξ₂)⊤ and Ω_S = ( Ω₁₁  Ω₁₂1 ; 1⊤Ω₂₁  1⊤Ω₂₂1 ).

Hence, its TCE in (39) can be easily computed as E[S | S > s_α] = E[W₂ | W₁ > 0, W₂ > s_α], with W = (W₁⊤, W₂)⊤ ∼ EC_{q+1}(ξ_S, Ω_S, h^{(1+q)}), due to S d= W₂ | (W₁ > 0). Next, we establish a general proposition for computing E[S | S > s_α] in matrix form as a decomposed sum.

Proposition 3.
Let Y ∼ SLCT-EC_{p,q}(ξ, Ω, h^{(q+p)}, C), with ξ and Ω as in (44), and W = (W₁⊤, W₂)⊤ ∼ EC_{q+1}(ξ_S, Ω_S, h^{(1+q)}) as before. It follows that

E[S | S > s_α] = 1⊤s,   (43)

with s = ξ₂ + Ω_{2S}Ω_S⁻¹(E_S − ξ_S), where Ω_{2S} = (Ω₂₁, Ω₂₂1) and E_S = E[W | W₁ > 0, W₂ > s_α].

Proof. Let A = (1, I_p)⊤ be a real matrix of dimensions (p + 1) × p. For V = AY, it follows that

V = (V₁, V₂⊤)⊤ ∼ SLCT-EC_{p+1,q}( ξ_V = (ξ_S⊤, ξ₂⊤)⊤, Ω_V = ( Ω_S  Ω_{2S}⊤ ; Ω_{2S}  Ω₂₂ ), h^{(q+1+p)}, C ),   (44)

where V = (S, Y⊤)⊤. It comes from the definition of a selection distribution that V d= (X_S, X₂⊤)⊤ | (X₁ > 0), where X = (X₁⊤, X_S, X₂⊤)⊤ is a partitioned random vector with elements of dimensions q, 1 and p, respectively, with X ∼ EC_{p+q+1}(ξ_V, Ω_V; h^{(q+1+p)}). Hence, it is straightforward to see that

s = E[Y | S > s_α] = E[X₂ | X₁ > 0, X_S > s_α, −∞ ≤ X₂ ≤ ∞].

Since there exists a non-truncated partition, the result in (43) then immediately follows from equation (23), with W = (X₁⊤, X_S)⊤.

Remark 4.
It is noteworthy that the i-th element of the vector s, say sᵢ = eᵢ⊤s, is equal to E[Yᵢ | S > s_α], representing the contribution of the i-th risk to the total risk.

Remark 5.
Since S d= W₂ | (W₁ > 0), it follows that the last element of the vector E_S is equivalent to E[S | S > s_α] = E[W₂ | W₁ > 0, W₂ > s_α].

Suppose that a set of risks Y is distributed as Y ∼ ST_p(µ, Σ, λ, ν). Let y represent a realization of Y. Based on y, the set of parameters θ = (µ, Σ, λ, ν)⊤ can be estimated through maximum likelihood estimation. It follows that

MTCE_Y(y_α) = E[X₂ | X₁ > 0, X₂ > y_α],   (45)

where X = (X₁, X₂⊤)⊤ ∼ t_{p+1}(ξ, Ω, ν) with

ξ = (0, µ⊤)⊤ and Ω = ( 1  ∆⊤ ; ∆  Σ ).   (46)

Additionally, using simple algebraic manipulation, it follows from (7) that

S ∼ ST₁( µ_S = Σ_{i=1}^{p} µᵢ, σ²_S = Σ_{i=1}^{p} Σ_{j=1}^{p} σᵢⱼ, λ_S = ∆_S/√(σ²_S − ∆²_S), ν ),   (47)
with ∆_S = Σ_{i=1}^{p} ∆ᵢ. Besides, the TCE of the sum is given by TCE_S(s_α) = E[W₂ | W₁ > 0, W₂ > s_α], with W = (W₁, W₂)⊤ ∼ t₂(ξ_S, Ω_S, ν), where

ξ_S = (0, µ_S)⊤ and Ω_S = ( 1  ∆_S ; ∆_S  σ²_S ).

Finally, we have from Proposition 3 that

E[Yᵢ | S > s_α] = eᵢ⊤[ µ + (∆, Σ1)Ω_S⁻¹(E_S − ξ_S) ] = µᵢ + [ E_{S1}(∆ᵢσ²_S − σ_{iS}∆_S) + (TCE_S(s_α) − µ_S)(σ_{iS} − ∆ᵢ∆_S) ] / (σ²_S − ∆²_S),   (48)

with E_{S1} = E[W₁ | W₁ > 0, W₂ > s_α] and σ_{iS} = Σ_{j=1}^{p} σᵢⱼ. Besides, summing (48) over i = 1, . . . , p gives

E[S | S > s_α] = µ_S + [ E_{S1} Σ_{i=1}^{p} (∆ᵢσ²_S − σ_{iS}∆_S) + (TCE_S(s_α) − µ_S) Σ_{i=1}^{p} (σ_{iS} − ∆ᵢ∆_S) ] / (σ²_S − ∆²_S),   (49)

which reduces to TCE_S(s_α), as expected.

In this paper, we proposed expressions to compute product moments of truncated multivariate distributions belonging to the selection elliptical family, showing in a clever way that their moments can be computed using a unique moment for their respective symmetric elliptical case. In contrast with other recent works, we avoid cumbersome expressions, obtaining neat formulas for high-order truncated moments. To the best of our knowledge, this is the first proposal discussing the conditions of existence of the truncated moments for members of the selection elliptical family. Also, we propose optimized methods able to deal with extreme settings of the parameters, partitions with almost zero volume, or no truncation.

We expect in the near future to use the expressions in Section 6 to propose a robust likelihood-based censored regression model considering EST errors, able to fit multivariate censored responses with high skewness/kurtosis, presence of atypical observations and missing data. As more truncated moments for other symmetric elliptical distributions appear in the literature, we expect to implement the truncated moments for their respective asymmetric extended versions, as well as censored models based on them.
Additionally, the theoretical results can be extended to compute the moments of the class of extended generalized skew-elliptical distributions (see [25]), where the jointly distributed condition in (44) is no longer required. Finally, theoretical and MC moments (among other functions of interest) for several multivariate asymmetric distributions are already available in our
MomTrunc R package, which will be constantly updated as other tractable distributions become available.
Acknowledgment
Christian Galarza acknowledges support from FAPESP-Brazil (Grant 2015/17110-9 and Grant 2018/11580-1).
References

[1] James Tobin. Estimation of relationships for limited dependent variables. Econometrica: Journal of the Econometric Society, pages 24–36, 1958.
[2] G. M. Tallis. The moment generating function of the truncated multi-normal distribution. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 23(1):223–229, 1961.
[3] B. G. Manjunath and Stefan Wilhelm. Moments calculation for the double truncated multivariate normal density. Available at SSRN 1472153, 2009.
[4] Saralees Nadarajah. A truncated bivariate t distribution. Economic Quality Control, 22(2):303–313, 2007.
[5] H. J. Ho, T. I. Lin, H. Y. Chen, and W. L. Wang. Some results on the truncated multivariate t distribution. Journal of Statistical Planning and Inference, 142:25–40, 2012.
[6] Juan C. Arismendi and Simon Broda. Multivariate elliptical truncated moments. Journal of Multivariate Analysis, 157:29–44, 2017.
[7] Raymond Kan and Cesare Robotti. On moments of folded and truncated multivariate normal distributions. Journal of Computational and Graphical Statistics, 25(1):930–934, 2017.
[8] Roohollah Roozegar, Narayanaswamy Balakrishnan, and Ahad Jamalizadeh. On moments of doubly truncated multivariate normal mean–variance mixture distributions with application to multivariate tail conditional expectation. Journal of Multivariate Analysis, 177:104586, 2020.
[9] Ole Barndorff-Nielsen, John Kent, and Michael Sørensen. Normal variance-mean mixtures and z distributions. International Statistical Review/Revue Internationale de Statistique, pages 145–159, 1982.
[10] Wolfgang Breymann and David Lüthi. ghyp: A package on generalized hyperbolic distributions. Manual for R package ghyp, 2013.
[11] Reinaldo B. Arellano-Valle and Marc G. Genton. Multivariate extended skew-t distributions and related families. Metron, 68(3):201–234, 2010.
[12] A. Azzalini and A. Capitanio. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. Journal of the Royal Statistical Society, Series B, 65:367–389, 2003.
[13] Reinaldo B. Arellano-Valle, Márcia D. Branco, and Marc G. Genton. A unified view on skewed distributions arising from selections. Canadian Journal of Statistics, 34(4):581–601, 2006.
[14] Reinaldo B. Arellano-Valle and Adelchi Azzalini. On the unification of families of skew-normal distributions. Scandinavian Journal of Statistics, 33(3):561–574, 2006.
[15] M. D. Branco and D. K. Dey. A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis, 79:99–113, 2001.
[16] Raúl Alejandro Morán-Vásquez and Silvia L. P. Ferrari. New results on truncated elliptical distributions. Communications in Mathematics and Statistics, 2019.
[17] L. A. Matos, M. O. Prates, M. H. Chen, and V. H. Lachos. Likelihood-based inference for mixed-effects models with censored response using the multivariate-t distribution. Statistica Sinica, 23:1323–1342, 2013.
[18] A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39:1–38, 1977.
[19] Michel Denuit, Jan Dhaene, Marc Goovaerts, and Rob Kaas. Actuarial Theory for Dependent Risks: Measures, Orders and Models. John Wiley & Sons, 2006.
[20] Carlo Acerbi and Dirk Tasche. Expected shortfall: a natural coherent alternative to value at risk. Economic Notes, 31(2):379–388, 2002.
[21] Philippe Artzner, Freddy Delbaen, Jean-Marc Eber, and David Heath. Coherent measures of risk. Mathematical Finance, 9(3):203–228, 1999.
[22] Georg Ch. Pflug. Some remarks on the value-at-risk and the conditional value-at-risk. In Probabilistic Constrained Optimization, pages 272–281. Springer, 2000.
[23] Ekaterina N. Sereda, Efim M. Bronshtein, Svetozar T. Rachev, Frank J. Fabozzi, Wei Sun, and Stoyan V. Stoyanov. Distortion risk measures in portfolio optimization. In Handbook of Portfolio Construction, pages 649–673. Springer, 2010.
[24] Zinoviy Landsman and Emiliano A. Valdez. Tail conditional expectations for elliptical distributions. North American Actuarial Journal, 7(4):55–71, 2003.
[25] Zinoviy Landsman, Udi Makov, and Tomer Shushi. Extended generalized skew-elliptical distributions and their moments. Sankhya A, 79(1):76–100, 2017.