Backward Simulation of Multivariate Mixed Poisson Processes
Michael Chiu, Kenneth R. Jackson, and Alexander Kreinin
Department of Computer Science, University of Toronto
FE, Research and Validation, SS&C Technologies

September 17, 2020
Abstract
The Backward Simulation (BS) approach was developed to generate, simply and efficiently, sample paths of correlated multivariate Poisson processes with negative correlation coefficients between their components. In this paper, we extend the BS approach to model multivariate Mixed Poisson processes, which have many important applications in Insurance, Finance, Geophysics and many other areas of Applied Probability. We also extend the Forward Continuation approach, introduced in our earlier work, to multivariate Mixed Poisson processes.
1 Introduction

The simulation of dependent Poisson processes is an important problem having many applications in Insurance, Finance, Geophysics and many other areas of applied probability; see [1, 2, 4, 5, 6, 7, 14, 27, 28, 32] and references therein. For example, in Operational Risk modelling, the correlation matrices of operational events must be calibrated for simulation purposes; see [13]. The Poisson and Negative Binomial processes are among the most popular underlying models used by practitioners for describing the operational losses of the business units of a financial organization and the moments of claim arrivals in the insurance industry [21, 25].

Dependence between Poisson processes can be achieved by various operations applied to independent processes. One of the most popular approaches, often considered in actuarial modeling, is the Common Shock Model (CSM) [21, 29, 32], where a third Poisson process is used to couple two independent processes. For example, let $(\nu^{(1)}_t, \nu^{(2)}_t, \nu^{(3)}_t)$ be three independent Poisson processes with intensities $(\lambda_1, \lambda_2, \lambda_3)$, each defined as
\[
  \nu^{(j)}_t = \sum_{i=0}^{\infty} \mathbf{1}(T^{(j)}_i \le t),
\]
where the $T^{(j)}_i$ are the arrival moments of the $j$th process. Through superposition, we can obtain two correlated Poisson processes $N^{(1)}_t = \nu^{(1)}_t + \nu^{(2)}_t$ and $N^{(2)}_t = \nu^{(2)}_t + \nu^{(3)}_t$, where
\[
  N^{(1)}_t = \nu^{(1)}_t + \nu^{(2)}_t := \sum_{j=1}^{2} \sum_{i=0}^{\infty} \mathbf{1}(T^{(j)}_i \le t)
\]
and
\[
  N^{(2)}_t = \nu^{(2)}_t + \nu^{(3)}_t := \sum_{j=2}^{3} \sum_{i=0}^{\infty} \mathbf{1}(T^{(j)}_i \le t).
\]
The correlation coefficient between the Poisson processes $N^{(1)}_t$ and $N^{(2)}_t$, having intensities $\lambda = \lambda_1 + \lambda_2$ and $\mu = \lambda_2 + \lambda_3$, in the CSM satisfies
\[
  0 \le \rho \le \sqrt{\frac{\min(\lambda, \mu)}{\max(\lambda, \mu)}}.
\]
It is clear that, in such a model, negative correlations cannot be obtained and that correlations are constant in time.

The extreme correlation problem was considered in [18] and [24], where an optimization problem for the joint distribution was solved numerically. The problem was reduced to that of random vectors having specified marginal distributions in [20], where the Extreme Joint Distribution (EJD) method, a purely probabilistic, efficient, and rather simple algorithm for finding the joint distributions with extreme correlations, applicable to any discrete marginal probability distribution, was proposed. Connections with some classical results obtained in [16], [19], and [33] were also discussed. The Backward Simulation (BS) method, in conjunction with the EJD method, was developed in order to address the restrictions in the correlation structure of multivariate Poisson processes constructed using classical approaches such as the CSM. The Backward Simulation approach, considered in [13], allows a wider range of correlations, both positive and negative, to be attained than the CSM, and allows for a dynamic correlation structure that is a linear function in time of the correlation, $\rho(T)$, at the end of the simulation interval $[0, T]$ [20].

There are two general approaches to the simulation of Poisson processes: Forward and Backward simulation. The Forward approach consists of generating exponentially distributed inter-arrival times until the simulation time is at or past the end of the simulation interval $[0, T]$. Our Backward approach is based on exploiting the conditional uniformity of Poisson processes: we first obtain the joint distribution at some terminal time $T$ and then generate the corresponding number of arrival moments using the conditional uniformity of the arrival times. This is one of the major advantages of the Backward approach: only the ability to sample from a suitable joint distribution at the terminal time is required; the arrival moments can then be generated uniformly in a coordinate-wise manner. This is also known as the order statistic property [11].
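To make the two approaches concrete, here is a minimal Python sketch (our own illustration; none of the function or variable names come from the paper) that simulates a single Poisson process on $[0, T]$ both forwards, by accumulating exponential inter-arrival times, and backwards, by drawing the terminal count and sorting uniform arrival times.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def forward_poisson(lam, T):
    """Forward simulation: accumulate Exp(lam) inter-arrival times until T."""
    arrivals = []
    t = rng.exponential(1.0 / lam)
    while t <= T:
        arrivals.append(t)
        t += rng.exponential(1.0 / lam)
    return np.array(arrivals)

def backward_poisson(lam, T):
    """Backward simulation: draw the terminal count N ~ Poisson(lam*T),
    then place N arrival moments uniformly on [0, T] and sort them."""
    n = rng.poisson(lam * T)
    return np.sort(rng.uniform(0.0, T, size=n))

# Both methods produce paths with the same law; compare mean counts.
lam, T, n_paths = 2.5, 4.0, 20000
fwd = np.mean([len(forward_poisson(lam, T)) for _ in range(n_paths)])
bwd = np.mean([len(backward_poisson(lam, T)) for _ in range(n_paths)])
print(fwd, bwd, lam * T)  # all approximately 10
```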
The BS approach was extended in [20] to the class of bivariate processes containing both Poisson and Wiener components. It also led to the introduction of the Forward Continuation (FC) of Backward Simulation, a method for extending the process simulated by BS to subsequent intervals $[nT, (n+1)T]$, where $n$ is some integer, that preserves the joint distribution at the grid points $nT$ [9].

The EJD method enables the construction of joint distributions that exhibit extreme dependence between the components; in other words, the EJD method constructs extreme joint distributions that extremize $\rho(T)$. Extreme joint distributions are used to generate the extreme admissible correlations, from which all correlations within the admissible range can be obtained. The method was extended to the multivariate setting in [9].

1.1 Our Contributions

In this paper, we extend the Backward Simulation approach for Poisson processes to the class of Mixed Poisson processes (MPPs), which are a natural generalization of the class of Poisson processes and can be represented as Poisson processes with a random intensity [17]. We also extend the FC of the BS to MPPs.
The plan of this paper is as follows. In Section 2, we briefly discuss the basics of MPPs. Section 3 extends the BS method to multivariate MPPs, allowing us to fill in the process back to time 0; the extension to MPPs is first discussed in the bivariate setting. Section 4 briefly reviews the EJD method, which is necessary for the construction of the joint distributions needed at the terminal time. We also discuss in Section 4 how to sample from Extreme Joint Distributions. In Section 5, we extend the FC approach to MPPs. This allows us to propagate the process forward in time to some possibly infinite horizon. Finally, we make some concluding remarks in Section 6.
2 Mixed Poisson Processes

We begin by reviewing some properties of MPPs. The main results of the theory of MPPs can be found in [17]. Recent results on the characterization of the multivariate MPP are in [34]. We consider a counting process
\[
  X_t = \sum_{i=1}^{\infty} \mathbf{1}(T_i \le t) \tag{1}
\]
with arrival moments $0 < T_1 < \cdots < T_i < \cdots$, where $\mathbf{1}(\cdot)$ is the indicator function. Also, for convenience, we let $T_0 = 0$. The classical Poisson process is defined as a process with independent increments such that the inter-arrival times between the events, $\Delta T_i := T_i - T_{i-1}$, form a sequence of independent identically distributed random variables having an exponential distribution with parameter $\lambda$. It is well known that the number of events of $X_t$ in the interval $[0, t]$ has the Poisson distribution with parameter $\lambda t$:
\[
  P(X_t = k) = e^{-\lambda t} \frac{(\lambda t)^k}{k!}, \qquad k = 0, 1, 2, \ldots, \; t > 0. \tag{2}
\]
A natural generalization is obtained by randomizing the parameter $\lambda$, leading to the Mixed Poisson Distribution (MPD).

Definition 2.1 (Mixed Poisson Distribution [17]). A discrete random variable $X$ is said to be mixed Poisson distributed, MP($U$), with structure distribution $U$, if
\[
  p_k := P(X = k) = E\bigg[\frac{\Lambda^k}{k!} e^{-\Lambda}\bigg] = \int_0^{\infty} \frac{\lambda^k}{k!} e^{-\lambda} \, dU(\lambda), \qquad k = 0, 1, 2, \ldots, \tag{3}
\]
where $\Lambda$ is a random variable distributed according to $U$.

Remark. The structure distribution $U$ can be viewed as a prior distribution, which allows us to view (2) as a conditional distribution, given a realization of the intensity parameter $\Lambda = \lambda$, and (3) as an unconditional distribution.

Remark. Another interpretation of (3) is that it is a mixture of Poisson distributions.
Definition 2.2 (Mixed Poisson Process). $X_t$ is a MPP if it is MP($U$)-distributed for all $t \ge 0$ and the process is uniquely defined.

In what follows, we denote by MPP($U$) the class of MPPs with structure distribution $U$. It is not difficult to see that if $X_t \in$ MPP($U$), then the moment generating function takes the form
\[
  G(t; z) := E[z^{X_t}] = \int_0^{\infty} e^{xt(z-1)} \, dU(x) \tag{4}
\]
and
\[
  E[X_t] = \bar{\lambda} t, \qquad \sigma^2(X_t) = \bar{\lambda} t + \sigma^2(\lambda) t^2,
\]
where
\[
  \bar{\lambda} = E[\lambda] = \int_0^{\infty} x \, dU(x), \qquad \sigma^2(\lambda) = \int_0^{\infty} (x - \bar{\lambda})^2 \, dU(x).
\]
It is well known that the intervals $\Delta T_i = T_i - T_{i-1}$ of a Poisson process with intensity $\lambda$ form a sequence of independent, exponentially distributed random variables:
\[
  P(\Delta T_i \le t) = 1 - e^{-\lambda t}, \qquad t \ge 0, \; i = 1, 2, \ldots
\]
This forms the basis of the forward approach to the simulation of Poisson processes. Given $n$ events to be generated and a positive intensity $\lambda$, one can sequentially generate exponentially distributed intervals, $\Delta T_i$, and determine the arrival moments,
\[
  T_n = \sum_{i=1}^{n} \Delta T_i.
\]
However, there is an alternative approach based on the fundamental property of the conditional distribution of the arrival moments [10]. Let $\mathcal{T} = \{T_1, T_2, \ldots, T_n\}$ be a sequence of $n$ independent random variables having a uniform distribution in the interval $[0, T]$:
\[
  P(T_i \le t) = \frac{t}{T}, \qquad 0 \le t \le T.
\]
Denote by $\tau_k$ the $k$th order statistic of $\mathcal{T}$ ($k = 1, 2, \ldots, n$):
\[
  \tau_1 = \min_{1 \le k \le n} T_k, \quad \tau_2 = \min_{1 \le k \le n} \{T_k : T_k > \tau_1\}, \quad \ldots, \quad \tau_n = \max_{1 \le k \le n} T_k. \tag{5}
\]
Theorem 2.3. The distribution of the arrival moments of a Poisson process, $X_t$, with finite intensity in the interval $[0, T]$, conditional on the number of arrivals, $X_T = n$, coincides with the distribution of the order statistics:
\[
  P(T_k \le t \mid X_T = n) = P(\tau_k \le t), \qquad 0 \le t \le T, \; k = 1, 2, \ldots, n. \tag{6}
\]
The converse statement was proved in [20]:

Proposition 2.4. If a process, $X_t$, is represented as a random sum
\[
  X_t = \sum_{k=1}^{N} \mathbf{1}(T_k < t),
\]
where $\{T_k\}_{k=1}^{N}$ are independent, identically distributed random variables having a uniform conditional distribution,
\[
  P(T_k \le t \mid N) = \frac{t}{T}, \qquad k = 1, 2, \ldots, N,
\]
in the interval $[0, T]$, and the random variable $N \sim \mathrm{Pois}(\lambda T)$, then $X_t$ is a Poisson process with intensity $\lambda$ in the interval $[0, T]$.

This result leads to the BS algorithm for the multivariate Poisson processes considered in [13] and [20]. In Section 3, we generalize Proposition 2.4 to the class of MPPs, which can be obtained by a similar construction using the random variable $N \sim$ MP($U$).

Let us now consider the Negative Binomial (NB) process, which is a MPP with the structure distribution $U$ being the gamma distribution. The Negative Binomial process is widely used for count data that exhibit overdispersion because, unlike the Poisson process, it does not have the restriction that its mean must equal its variance. For this reason, we use the Negative Binomial distribution in the numerical experiments in Section 3.1 and Section 5.1 to compare the processes generated by BS in the mixed Poisson versus the Poisson case.

The generating function of the Negative Binomial process is given in the following Lemma, the proof of which can be found in standard texts [15].
Lemma 2.5. The generating function, $G(t, z) = E[z^{X_t}]$, of the Negative Binomial process is
\[
  G(t, z) = \bigg(\frac{b}{b + t(1-z)}\bigg)^r, \qquad |z| \le 1, \; t \ge 0, \; r > 0, \; b > 0.
\]
Remark. Notice that our process is not a Lévy process: the inter-arrival times are not independent but only conditionally independent.
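As a quick numerical check of Lemma 2.5 (a sketch of our own, assuming the structure distribution is Gamma with shape $r$ and rate $b$), sampling $\Lambda$ from the gamma distribution and then a Poisson count with mean $\Lambda t$ should reproduce the Negative Binomial pmf implied by $G(t, z)$:

```python
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(seed=2)

r, b, t = 3.0, 2.0, 5.0          # gamma shape r, rate b; time horizon t
n_samples = 200_000

# Mixed Poisson draw: Lambda ~ Gamma(r, rate=b), then X_t ~ Poisson(Lambda * t).
lam = rng.gamma(shape=r, scale=1.0 / b, size=n_samples)
counts = rng.poisson(lam * t)

# Negative Binomial pmf implied by G(t, z) = (b / (b + t(1 - z)))^r.
p = b / (b + t)
for k in range(5):
    empirical = np.mean(counts == k)
    print(k, round(empirical, 4), round(nbinom.pmf(k, r, p), 4))
```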
3 Backward Simulation of Mixed Poisson Processes

Backward Simulation of Poisson processes, studied in [13] and [20], relies on the conditional uniformity of the arrival moments. BS requires sampling the corresponding joint distribution at the terminal time to obtain a vector of the number of events for each coordinate in the simulation interval $[0, T]$. The dependency structure manifests itself in the joint distribution (Section 4 discusses how correlated joint distributions can be obtained), i.e., in the sampled vector of terminal events. Each coordinate is simulated independently by drawing the corresponding number of uniform variates, which are then ordered to give the arrival moments of events.

Let us now show that the distribution of the arrival moments, conditional on the number of events, is also uniform for MPPs. Moreover, we show that the process generated by BS remains a MPP. First, we introduce the following two lemmas, the proofs of which can be found in [20]. To this end, we introduce some useful notation. Let $X_t$ be a MPP and consider the points $\{T_1, T_2, \ldots, T_d\}$, where $0 \le T_1 < T_2 < \cdots < T_d \le T$. Denote by $\Delta X_i := X_{T_i} - X_{T_{i-1}}$, $i = 1, 2, \ldots, d$, the increments over the non-overlapping intervals. For a $d$-dimensional vector $k = (k_1, k_2, \ldots, k_d) \in \mathbb{Z}^d_+$ with non-negative integer coordinates, $k_j \ge 0$,
we denote the norm of the vector by
\[
  \|k\| = \sum_{j=1}^{d} k_j.
\]
For any $d$-dimensional vector, $x = (x_1, x_2, \ldots, x_d)$, with non-negative coordinates, and $k \in \mathbb{Z}^d_+$, we denote
\[
  x^k := \prod_{j=1}^{d} x_j^{k_j}.
\]
The conditional probability of the number of events in each non-overlapping interval, given the number of terminal events, takes the form [17]
\[
  P\bigg(\Delta X_1 = k_1, \ldots, \Delta X_d = k_d \;\bigg|\; X_T = l + \sum_{j=1}^{d} k_j\bigg) = \binom{k+l}{k} \cdot p^k \cdot q^l, \tag{8}
\]
with the multinomial coefficient
\[
  \binom{k+l}{k} := \frac{\big(l + \sum_{i=1}^{d} k_i\big)!}{l! \cdot \prod_{i=1}^{d} k_i!},
\]
where $p_j = \Delta T_j / T \in [0, 1]$ for $j = 1, 2, \ldots, d$, $p = (p_1, \ldots, p_d)$ and $q = 1 - \sum_{j=1}^{d} p_j$.
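Relation (8) says that, conditional on the terminal count, the vector of increments over a partition of $[0, T]$ is multinomial with cell probabilities $p_j = \Delta T_j / T$. The following sketch (our own naming throughout, with a gamma structure distribution assumed for concreteness) checks this numerically for a backward-simulated mixed Poisson process.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

T = 4.0
edges = np.array([0.0, 1.0, 2.5, 4.0])   # partition of [0, T] into d = 3 intervals
probs = np.diff(edges) / T               # cell probabilities p_j

r, b = 3.0, 2.0                          # gamma structure distribution
n_paths = 20000

# Backward simulation: terminal count N ~ MP(U), arrivals uniform on [0, T].
lam = rng.gamma(r, 1.0 / b, size=n_paths)
N = rng.poisson(lam * T)

# Conditional on N, the per-interval counts should be Multinomial(N, probs),
# exactly as in (8): the mean fraction of events per interval equals probs.
mean_fracs, m = np.zeros(3), 0
for n in N:
    if n == 0:
        continue
    counts, _ = np.histogram(rng.uniform(0.0, T, size=n), bins=edges)
    mean_fracs += counts / n
    m += 1
print(mean_fracs / m)   # approximately [0.25, 0.375, 0.375]
print(probs)
```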
Lemma 3.1. Consider a discrete random variable, $\xi$, taking non-negative integer values with probabilities $p_k = P(\xi = k)$, $k = 0, 1, 2, \ldots$, and denote its moment generating function by $\hat{p}(z) = \sum_{k=0}^{\infty} p_k z^k$, $|z| \le 1$. Consider the sequence
\[
  q_k(x) = \sum_{m=0}^{\infty} p_{k+m} \binom{k+m}{k} x^k (1-x)^m, \qquad 0 \le x \le 1, \; k = 0, 1, 2, \ldots \tag{9}
\]
Then, for any fixed $x \in [0, 1]$, the sequence $\{q_k(x)\}$ is a probability distribution and its moment generating function, $\hat{q}(z; x)$, is $\hat{q}(z; x) = \hat{p}(1 - x + xz)$.
Lemma 3.2. Consider a discrete random variable, $\xi$, taking non-negative integer values with probabilities $p_k = P(\xi = k)$, $k = 0, 1, 2, \ldots$, and denote its moment generating function by $\hat{p}(z) = \sum_{k=0}^{\infty} p_k z^k$, $|z| \le 1$. Let $k \in \mathbb{Z}^d_+$ and consider the function $\pi : \mathbb{Z}^d_+ \to \mathbb{R}$ defined by
\[
  \pi(k; x) = \sum_{l=0}^{\infty} p_{\|k\|+l} \binom{k+l}{k} \cdot x^k \cdot y^l, \tag{10}
\]
where $x = (x_1, \ldots, x_d)$, $x_j \ge 0$, $\sum_{j=1}^{d} x_j < 1$ and $y = 1 - \sum_{j=1}^{d} x_j$. Denote by $\hat{\pi}(z; x)$ the generating function
\[
  \hat{\pi}(z; x) := \sum_{k \in \mathbb{Z}^d_+} \pi(k; x) z^k,
\]
where $z = (z_1, z_2, \ldots, z_d)$ and $\max\{|z_1|, \ldots, |z_d|\} \le 1$. Then
\[
  \hat{\pi}(z; x) = \hat{p}\bigg(1 - \sum_{j=1}^{d} x_j (1 - z_j)\bigg). \tag{11}
\]
(The $d$ that we use here for the dimension of a generic vector should not be confused with the dimension of a multivariate mixed Poisson process in Section 4.)

With Lemma 3.2, the vector analogue of Lemma 3.1, we can prove the main theorem of this section.
Theorem 3.3. Let the process $X_t$ be represented as a random sum
\[
  X_t = \sum_{k=1}^{N} \mathbf{1}(T_k < t),
\]
where the number of random events $N \sim$ MP($U$) and $\{T_k\}_{k=1}^{N}$ are independent, identically distributed random variables having a uniform conditional distribution in the interval $[0, T]$. Then $X_t$ is MPP($U$) in the interval $[0, T]$.

Proof. We prove the following two statements.

1. At any time $t \in [0, T]$, the moment generating function of $X_t$ is $E[z^{X_t}] = \int_0^{\infty} e^{xt(z-1)} \, dU(x)$.
2. The increments of the process $X_t$ over disjoint intervals are conditionally independent random variables.

The theorem follows immediately from the two results above. Let us prove the first statement. As noted in (4), the moment generating function of $X_T$ is
\[
  E[z^{X_T}] = \int_0^{\infty} e^{xT(z-1)} \, dU(x).
\]
The probabilities $p_k(t) := P(X_t = k)$, $k \in \mathbb{Z}_+$, $0 \le t \le T$, satisfy
\[
  p_k(t) = \sum_{l=0}^{\infty} p_{k+l}(T) \binom{k+l}{k} \bigg(\frac{t}{T}\bigg)^k \cdot \bigg(1 - \frac{t}{T}\bigg)^l, \qquad k = 0, 1, 2, \ldots \tag{12}
\]
In our case, $\hat{p}(z) = \int_0^{\infty} e^{xT(z-1)} \, dU(x)$. Thus, the probabilities $q_k = p_k(t) := P(X_t = k \mid X_T)$. Taking $x = t/T$ in (9), we obtain from (12) and Lemma 3.1 that
\[
  \hat{q}(z) = \int_0^{\infty} e^{xt(z-1)} \, dU(x),
\]
as was to be proved.

The second statement is proved using Lemma 3.2. Let $x = (x_1, x_2, \ldots, x_d)$ satisfy the conditions listed in the statement of the lemma. Then from (8), we have
\[
  P(\Delta X_1 = k_1, \ldots, \Delta X_d = k_d) = \sum_{l=0}^{\infty} \binom{k+l}{k} \cdot p^k \cdot q^l \cdot P\bigg(X_T = l + \sum_{j=1}^{d} k_j\bigg). \tag{13}
\]
Now consider the generating function
\[
  \pi(z) := E\bigg[\prod_{j=1}^{d} z_j^{\Delta X_j}\bigg], \qquad |z_j| \le 1, \; j = 1, 2, \ldots, d.
\]
Applying Lemma 3.2 with $\hat{p}(z) := E[z^{X_T}] = \int_0^{\infty} e^{\lambda T(z-1)} \, dU(\lambda)$, $x_j = p_j$ and $y = q$, we obtain
\[
  \pi(z) = \int_0^{\infty} \prod_{j=1}^{d} e^{\lambda \tau_j (z_j - 1)} \, dU(\lambda).
\]
The latter relation implies that the increments of $X_t$ are conditionally independent, as was to be proved.

Theorem 3.3 is intuitively appealing. Indeed, $X_t$ is a Poisson process with a random intensity, $\lambda$, which is determined at time $t = 0$ and, therefore, measurable with respect to the filtration $\{\mathcal{F}_t\}_{t \ge 0}$ generated by the process $X_t$. The conditional distribution of the arrival moments is uniform in the interval $[0, T]$ and does not depend on the parameters of the process.

Algorithm 1 describes the Backward Simulation of multivariate MPPs (MMPPs) in detail. A correlated multivariate mixed Poisson process, $N_t$, has MPPs as its marginals. Since the marginals are correlated and the joint distribution does not factorize, a joint distribution that has the desired correlation structure, with the marginalized distributions satisfying the constraints of the given marginals, is needed (Step 1 of Algorithm 1). This is discussed in Section 4. Given a vector of the counts of the number of events from the joint distribution, Theorem 3.3 applies to each marginal distribution independently.
Algorithm 1: Backward Simulation of Correlated MMPPs

  Requires: Multivariate mixed Poisson distribution at terminal time,
            MP(U) = (MP(U^(1)), ..., MP(U^(d)))
  Output:   Scenarios of the multivariate mixed Poisson process

  // Get the number of events at terminal time T
  Generate N = (N^(1), ..., N^(d)), where N ~ MP(U)
  for each j do                       // this can be done in parallel
      Generate N^(j) uniform random variables T^(j) = (T^(j)_1, ..., T^(j)_{N^(j)})
      Sort T^(j) in ascending order
  end
  return T = (T^(1), ..., T^(d))
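A minimal Python sketch of Algorithm 1 follows. The joint terminal sampler is left abstract: in the paper it comes from the EJD method of Section 4, but here we pass in any callable returning a vector of correlated counts. All names, and the common-intensity example sampler, are our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=4)

def backward_simulate_mmpp(sample_joint_counts, T):
    """Algorithm 1: given a sampler of the joint terminal counts
    N = (N1, ..., Nd) ~ MP(U), return sorted uniform arrival times
    for each coordinate on [0, T]."""
    N = sample_joint_counts()
    return [np.sort(rng.uniform(0.0, T, size=n)) for n in N]

# Example joint sampler (illustration only): a common gamma intensity shared
# by both coordinates, which induces positive correlation between the counts.
def common_intensity_sampler(T=2.0, r=3.0, b=2.0):
    lam = rng.gamma(r, 1.0 / b)
    return rng.poisson(lam * T), rng.poisson(lam * T)

paths = backward_simulate_mmpp(lambda: common_intensity_sampler(T=2.0), T=2.0)
for j, arrivals in enumerate(paths, start=1):
    print(f"coordinate {j}: {len(arrivals)} events, first few:", arrivals[:3])
```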
Let us now analyze the time structure of the correlations of multivariate MPPs generated by BS. Since correlations are inherently pairwise in nature, the analysis carried out in the bivariate setting corresponds to pairs of variables in the multivariate setting.

Theorem 3.4 (Time Structure of the Correlation Coefficient). Consider a bivariate process $(X_t, Y_t)$ such that $X_t$ and $Y_t$ possess the conditional uniformity property. The sample paths of the processes $X_t$ and $Y_t$ are generated by BS in the interval $[0, T]$. Let the correlation coefficient at time $T$, $\rho(T) := \mathrm{Corr}(X_T, Y_T)$, be known. Then $\rho(t) = \mathrm{Corr}(X_t, Y_t)$ takes the form
\[
  \rho(t) = \rho(T) \cdot \frac{Z(T)}{Z(t)}, \qquad 0 < t \le T, \tag{14}
\]
where
\[
  Z(t) = \frac{\sigma(X_t)\,\sigma(Y_t)}{t^2}, \qquad t > 0,
\]
and $\sigma^2(X_t)$ denotes the variance of $X_t$; similarly for $Y_t$.

Proof. First, we show that the generating function of the process,
\[
  \hat{g}(t, z, w) := E[z^{X_t} w^{Y_t}], \qquad |z| \le 1, \; |w| \le 1,
\]
satisfies
\[
  \hat{g}(t, z, w) = \hat{g}\big(T,\, 1 - tT^{-1} + ztT^{-1},\, 1 - tT^{-1} + wtT^{-1}\big). \tag{15}
\]
To this end, note that for $0 \le m \le k$ and $0 \le n \le l$,
\[
  P(X_t = m, Y_t = n \mid X_T = k, Y_T = l) = \binom{k}{m} \bigg(\frac{t}{T}\bigg)^m \bigg(1 - \frac{t}{T}\bigg)^{k-m} \binom{l}{n} \bigg(\frac{t}{T}\bigg)^n \bigg(1 - \frac{t}{T}\bigg)^{l-n}, \tag{16}
\]
since, given that at the end of the simulation interval there are $k$ events in total for $X_T$ and $l$ events in total for $Y_T$, the number of events $m$ and $n$ by a certain time $t$ can be viewed as the outcome of Bernoulli trials with probability of success $t/T$. Taking expectations, we obtain
\[
  E[z^{X_t} w^{Y_t} \mid (X_T = k, Y_T = l)] = \sum_{m=0}^{k} \sum_{n=0}^{l} z^m w^n P(X_t = m, Y_t = n \mid X_T = k, Y_T = l) = \bigg(1 - \frac{t}{T} + z\frac{t}{T}\bigg)^k \bigg(1 - \frac{t}{T} + w\frac{t}{T}\bigg)^l.
\]
Denote $P_{kl} = P(X_T = k, Y_T = l)$. Then we find
\[
  \hat{g}(t, z, w) = E\big[E[z^{X_t} w^{Y_t} \mid (X_T, Y_T)]\big] = \sum_{k \ge 0} \sum_{l \ge 0} P_{kl} \bigg(1 - \frac{t}{T} + z\frac{t}{T}\bigg)^k \bigg(1 - \frac{t}{T} + w\frac{t}{T}\bigg)^l = \hat{g}\big(T,\, 1 - tT^{-1} + ztT^{-1},\, 1 - tT^{-1} + wtT^{-1}\big).
\]
Equation (15) is derived. Differentiating $\hat{g}(t, z, w)$ twice, we find
\[
  \mathrm{Cov}(X_t, Y_t) = \frac{t^2}{T^2} \,\mathrm{Cov}(X_T, Y_T), \tag{17}
\]
from which we obtain
\[
  \rho(t) = \frac{\mathrm{Cov}(X_t, Y_t)}{\sigma(X_t)\sigma(Y_t)} = \frac{t^2}{T^2} \cdot \frac{\mathrm{Cov}(X_T, Y_T)}{\sigma(X_t)\sigma(Y_t)} = \frac{t^2}{T^2} \cdot \frac{\mathrm{Cov}(X_T, Y_T)}{\sigma(X_T)\sigma(Y_T)} \cdot \frac{\sigma(X_T)\sigma(Y_T)}{\sigma(X_t)\sigma(Y_t)} = \rho(T) \cdot \frac{Z(T)}{Z(t)}.
\]

For Poisson processes, the ratio $Z(T)/Z(t)$ in (14) reduces to $t/T$. Thus, the correlation structure is linear in time in the simulation interval $[0, T]$. This is not true in general for MPPs. For example, for Negative Binomial processes, the auxiliary functions take the form
\[
  \rho(t) = \rho(T) \cdot \frac{Z(T)}{Z(t)} = \rho(T) \cdot \frac{t}{T} \cdot \sqrt{\frac{(\bar{\lambda}_X + \sigma^2(\lambda_X) T)(\bar{\lambda}_Y + \sigma^2(\lambda_Y) T)}{(\bar{\lambda}_X + \sigma^2(\lambda_X) t)(\bar{\lambda}_Y + \sigma^2(\lambda_Y) t)}}.
\]

3.1 Numerical Experiments

The graph of the correlation function is presented in Figure 1, where the good agreement of the theoretical and empirical results can be seen.

[Figure 1: Depicted by the red line and the blue circles are the dynamic correlation structures of two bivariate Negative Binomial (NB) processes. The red line represents the theoretical correlation structure as described in Theorem 3.4. The blue circles represent the correlation structure recovered by Monte Carlo simulation of bivariate NB processes by BS. In the left figure, the first process has mean 3 and variance 1, while the second process has mean 5 and variance 30. In the right figure, the first process has mean 3 and variance 1, while the second process has mean 30 and variance 5. In both figures, the bivariate NB processes are calibrated to a positive correlation coefficient of $\rho(T) = 0.7$ and a negative correlation coefficient of $\rho(T) = -0.7$. The dotted black line represents a bivariate Poisson process with mean parameters 3 and 30 simulated via BS. The bivariate Poisson processes are also calibrated to a positive correlation coefficient of $\rho(T) = 0.7$ and a negative correlation coefficient of $\rho(T) = -0.7$.]
Given a simulation interval $[0, T]$, stochastic processes are usually simulated forwards in time. This is due to the fact that it is conceptually natural and technically simpler to do so. However, it is not always the most suitable choice. This can be seen in the Forward Simulation (FS) of a correlated bivariate Poisson process $(N^{(1)}_t, N^{(2)}_t)$, where $N^{(i)}_t \sim \mathrm{Pois}(\mu_i)$ and $\{\Delta T^{(i)}_k\}_{k \ge 1}$ denotes the sequence of inter-arrival times for process $i \in \{1, 2\}$. Forward Simulation of counting processes like Poisson processes consists of repeatedly simulating the inter-arrival times $\{\Delta T^{(i)}_k\}_{k \ge 1}$ while $\sum_k \Delta T^{(i)}_k \le T$. The sequence of inter-arrival times represents a sample path of the counting process. In the Poisson case, the inter-arrival times are exponentially distributed, $P(\Delta T^{(i)}_k \le t) = 1 - e^{-\mu_i t}$. We must rely on the Fréchet–Hoeffding Theorem in [16, 19] to induce dependence between the marginal distributions of the inter-arrival times in the case of FS, which gives us the relations
\[
  \mu_1 \Delta T^{(1)}_k = \mu_2 \Delta T^{(2)}_k, \qquad k = 1, 2, \ldots \tag{18}
\]
to obtain extremal positive dependence between the distributions of the inter-arrival times, and
\[
  \exp(-\mu_1 \Delta T^{(1)}_k) + \exp(-\mu_2 \Delta T^{(2)}_k) = 1 \tag{19}
\]
to obtain extremal negative dependence. (For discrete distributions, Fréchet–Hoeffding is equivalent to the EJD theorem in 2 dimensions [20].)

We claim that the relations (18) and (19) lead to extreme correlations of the process under FS. In the case of extremal positive dependence, (18) implies that
\[
  \mu_1 T^{(1)}_k = \mu_2 T^{(2)}_k, \qquad k = 1, 2, \ldots \tag{20}
\]
Define $\kappa = \mu_1 / \mu_2$. Obviously, $0 < \kappa < \infty$. We show that, for all $t > 0$,
\[
  N^{(1)}_t = N^{(2)}_{\kappa t}. \tag{21}
\]
Suppose that $N^{(1)}_t = m$ for some $t > 0$, where $m$ is an integer. The arrival moments for $N^{(1)}_t$ satisfy the inequality
\[
  T^{(1)}_m \le t < T^{(1)}_{m+1}.
\]
It follows immediately from (20) that the arrival moments for $N^{(2)}_t$ must satisfy $T^{(2)}_m = \kappa T^{(1)}_m$ for all $m = 1, 2, \ldots$ This implies that
\[
  T^{(2)}_m \le \kappa t < T^{(2)}_{m+1},
\]
which in turn implies that $N^{(2)}_{\kappa t} = m$. Thus we have shown (21), since $m$ is arbitrary.

Now let us compute the correlation coefficient of such a process in the case $\kappa > 1$. $N^{(i)}_{\kappa t}$ can then be written as $N^{(i)}_{\kappa t} = N^{(i)}_t + \Delta N^{(i)}_{\kappa t}$, where, for $i = 1, 2$, $\Delta N^{(i)}_{\kappa t}$ represents the increment of the $i$th process in the interval $[t, \kappa t]$ and is independent of $N^{(i)}_t$. Then we obtain
\[
  E[N^{(1)}_t N^{(2)}_t] = E[N^{(2)}_{\kappa t} \cdot N^{(2)}_t] = E[(N^{(2)}_t)^2] + E[N^{(2)}_t \Delta N^{(2)}_{\kappa t}]
\]
and $\mathrm{Cov}(N^{(1)}_t, N^{(2)}_t) = \sigma^2(N^{(2)}_t)$. The latter relation implies that
\[
  \rho(N^{(1)}_t, N^{(2)}_t) = \frac{1}{\sqrt{\kappa}}, \qquad \kappa \ge 1,
\]
and, for $0 < \kappa < 1$,
\[
  \rho(N^{(1)}_t, N^{(2)}_t) = \sqrt{\kappa}.
\]
This allows us to compare the correlations obtained in the FS case to the correlations obtained in the BS case. From the last two equations above, we can see that the notion of extreme dependence obtained via the Fréchet–Hoeffding theorem results in a correlation coefficient that is a function of the intensities. This is very restrictive and precludes the possibility of calibrating to data. In contrast, the BS approach allows for the construction of processes with any desired correlation within the range of admissible correlations. Further comparisons of the Forward and Backward approaches can be found in [8, 20].
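The following sketch (ours, for illustration) implements the comonotone coupling (18) for the forward simulation of two Poisson processes and confirms that the resulting correlation is pinned at $1/\sqrt{\kappa}$, independent of any target one might wish to calibrate to.

```python
import numpy as np

rng = np.random.default_rng(seed=5)

mu1, mu2, T = 6.0, 1.5, 10.0
kappa = mu1 / mu2  # = 4, so the theory predicts rho = 1/sqrt(4) = 0.5

def forward_coupled_counts(mu1, mu2, T):
    """Forward simulation with the comonotone coupling (18):
    mu1 * dT1_k = mu2 * dT2_k, i.e. dT2_k = (mu1/mu2) * dT1_k."""
    n1 = n2 = 0
    t1 = t2 = 0.0
    while True:
        d1 = rng.exponential(1.0 / mu1)
        d2 = (mu1 / mu2) * d1          # relation (18)
        t1, t2 = t1 + d1, t2 + d2
        if t1 <= T:
            n1 += 1
        if t2 <= T:
            n2 += 1
        if t1 > T and t2 > T:
            return n1, n2

pairs = np.array([forward_coupled_counts(mu1, mu2, T) for _ in range(20000)])
print(np.corrcoef(pairs[:, 0], pairs[:, 1])[0, 1], 1.0 / np.sqrt(kappa))
```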
4 The Extreme Joint Distribution Method

We showed in Section 3 that the conditional uniformity property holds for the class of Mixed Poisson processes and that the process resulting from Backward Simulation, with the number of events at terminal simulation time $T$ generated by a MPD, is indeed a MPP. Backward Simulation for the class of Mixed Poisson processes relies on the knowledge of the joint MPD at terminal time $T$, but how do we construct a multivariate MPD with some desired dependency structure in the first place? In this section, we briefly review the work in [9] and [20] that addresses the general problem of constructing multivariate joint distributions from given marginal distributions such that the linear correlation coefficients between the marginals are equal to some desired correlations. We also discuss how to sample from such multivariate joint distributions.

4.1 The Bivariate Case

We begin by describing the main ideas in 2 dimensions to build some intuition before presenting the general $d$-dimensional case. To that end, suppose we have a discrete bivariate distribution $P$ with marginals $Q^{(1)}$ and $Q^{(2)}$. Clearly, the admissible linear correlation coefficient $C$ between the marginals is bounded by some maximum attainable correlation $\hat{C}^{(1)}$ and some minimum attainable correlation $\hat{C}^{(2)}$. Moreover, every admissible correlation $C$ can be represented as a convex combination of the extreme correlations,
\[
  C = w \hat{C}^{(1)} + (1 - w) \hat{C}^{(2)}, \tag{22}
\]
for some $w \in [0, 1]$.

Definition 4.1 (Extreme Measures in 2 dimensions). Extreme Measures are solutions to the following infinite-dimensional Linear Program (LP):
\[
  \text{extremize } h(P) \tag{23}
\]
subject to
\[
  \sum_{j=0}^{\infty} P_{ij} = Q^{(1)}_i, \quad i = 0, 1, \ldots, \qquad
  \sum_{i=0}^{\infty} P_{ij} = Q^{(2)}_j, \quad j = 0, 1, \ldots, \qquad
  P_{ij} \ge 0, \quad i, j = 0, 1, \ldots,
\]
where $\sum_{i=0}^{\infty} Q^{(1)}_i = \sum_{j=0}^{\infty} Q^{(2)}_j = 1$. "Extremize" denotes either max or min, and the objective function is
\[
  h(P) := E[X_1 X_2] = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} i j P_{ij}, \tag{24}
\]
where $P_{ij} = P(X_1 = i, X_2 = j)$.

For completeness, we mention that the infinite-dimensional LP (23) is a Monge–Kantorovich Problem (MKP). This aspect of the problem is not immediately relevant to us; we refer to standard references such as [31] for more details.

The solution to (23) is an Extreme Joint Distribution that determines the Extreme Measures $\hat{P}^{(1)}$ and $\hat{P}^{(2)}$, which have a one-to-one relationship to the extreme correlations $\hat{C}^{(1)}$ and $\hat{C}^{(2)}$ [8]. The Extreme Measures (23) lead to extreme correlations since extremizing the bivariate expectation extremizes the linear correlation coefficient, as can be seen in (24). Moreover, let
\[
  P = w \hat{P}^{(1)} + (1 - w) \hat{P}^{(2)}, \tag{25}
\]
where $w$ is the solution of (22) and $\hat{P}^{(1)}$ and $\hat{P}^{(2)}$ are the Extreme Measures having extreme correlations $\hat{C}^{(1)}$ and $\hat{C}^{(2)}$, respectively. Then, it is not hard to show that $P$ is a discrete bivariate probability distribution with marginals $Q^{(1)}$ and $Q^{(2)}$ and correlation coefficient $C$. This insight allows us to reduce the problem of calibration to the simpler problem of solving a linear equation. Thus, if extreme joint distributions can be computed, they can be used to generate extreme correlations (extreme points) to calibrate to the given correlation. If the calibration fails, i.e., there is no solution to the linear equation (22) with $w \in [0, 1]$, then no bivariate process with the marginal distributions $Q^{(1)}$ and $Q^{(2)}$ and correlation $C$ exists. In such an event, the assumptions of the parameter values (including the inference procedures used to obtain them) and any raw data should be checked.

(In practice, probability distributions are truncated to some desired accuracy; we are really dealing with finite linear programs.)
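A tiny sketch of the calibration step (22), with made-up numbers: given extreme correlations $\hat{C}^{(1)}$, $\hat{C}^{(2)}$ produced by the EJD method, the weight $w$ is recovered by solving one linear equation and checking that it lies in $[0, 1]$.

```python
def calibrate_weight(c_target, c_max, c_min):
    """Solve C = w*C_max + (1 - w)*C_min for w and check admissibility."""
    w = (c_target - c_min) / (c_max - c_min)
    if not 0.0 <= w <= 1.0:
        raise ValueError(f"target correlation {c_target} is not admissible")
    return w

# Illustrative values only; in practice c_max and c_min come from the EJD method.
w = calibrate_weight(c_target=0.3, c_max=0.82, c_min=-0.65)
print(w)  # mixing weight for P = w*P_max + (1 - w)*P_min
```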
4.2 The General $d$-dimensional Case

The 2-dimensional case described in the previous subsection generalizes to the $d$-dimensional case ($d > 2$) described below. However, instead of dealing with a single correlation coefficient, in the general case we consider a $d \times d$ correlation matrix $C$, where $C_{ij}$ represents the linear correlation coefficient between the marginal distributions $Q^{(i)}$ and $Q^{(j)}$. Clearly, we now have to consider more general notions of extremal dependency. One concept of extremal dependency consistent with observations of correlation matrices is to consider only pairwise extremal dependence. That is, we consider pairwise monotonicity, which represents the strongest type of association between random variables and implies maximally positive (comonotonicity) and maximally negative (antimonotonicity) values for the linear correlation coefficient; see [30] for more details on extremal dependence concepts in multivariate settings and [20] for details on monotonicity as it relates to distributions. In contrast to the bivariate case, there are $n = 2^{d-1}$ extreme correlation matrices $\hat{C}^{(j)}$, which are also extreme points [9], each described by a monotonicity structure.

Definition 4.2 (Monotonicity Structure). A monotonicity structure $e^{(j)}$, where $j \in \{1, \ldots, n\}$, is a binary vector describing the pairwise extremal dependency structure between the marginal distributions,
\[
  e^{(j)} = (e^{(j)}_1, \ldots, e^{(j)}_d), \tag{26}
\]
\[
  e^{(j)}_i = \begin{cases} 1, & \text{if } X_1 \text{ and } X_i \text{ are antimonotone}, \\ 0, & \text{if } X_1 \text{ and } X_i \text{ are comonotone}, \end{cases}
\]
assuming that $e^{(j)}_1 = 0$. If $e^{(j)}_i = e^{(j)}_k$, then the marginal distributions $Q^{(i)}$ and $Q^{(k)}$ have a comonotone dependency relationship, and an antimonotone dependency relationship otherwise.

Remark. Note that whether $e^{(j)}_1$ is initially set to 0 or 1 does not matter and is an arbitrary choice.

Similar to the bivariate case, for each $j = 1, 2, \ldots, n$, each extreme correlation matrix $\hat{C}^{(j)}$ is associated with an Extreme Measure $\hat{P}^{(j)}$, described by a monotonicity structure $e^{(j)}$, as described below.

Definition 4.3 (Extreme Measures in $d$ dimensions). Extreme Measures are solutions to the following multi-objective infinite-dimensional LPs:
\[
  \text{extremize } h^{(j)}_{k,l}(P), \qquad 1 \le k < l \le d, \tag{27a}
\]
subject to
\[
  \sum_{j \in I_k} \sum_{i_j = 0}^{\infty} P_{i_1, \ldots, i_{k-1}, i_k, i_{k+1}, \ldots, i_d} = Q^{(k)}_{i_k}, \qquad k = 1, 2, \ldots, d, \; i_k = 0, 1, \ldots, \tag{27b}
\]
\[
  P_{i_1, \ldots, i_d} \ge 0,
\]
where
\[
  \text{extremize } h^{(j)}_{k,l}(P) = \begin{cases} \max h^{(j)}_{k,l}(P), & \text{if } e^{(j)}_k = e^{(j)}_l, \\ \min h^{(j)}_{k,l}(P), & \text{if } e^{(j)}_k \ne e^{(j)}_l, \end{cases}
\]
$I_k = \{j : 1 \le j \le d, \; j \ne k\}$, $Q^{(k)}$ represents the $k$th given marginal distribution, and each objective function takes the form
\[
  h_{k,l}(P) = \sum_{i_k=0}^{\infty} \sum_{i_l=0}^{\infty} i_k i_l P^{(k,l)}_{i_k, i_l}, \qquad 1 \le k < l \le d, \tag{28}
\]
where, similarly,
\[
  P^{(k,l)}_{i_k, i_l} = \sum_{j \in I_{k,l}} \sum_{i_j = 0}^{\infty} P_{i_1, \ldots, i_{k-1}, i_k, i_{k+1}, \ldots, i_{l-1}, i_l, i_{l+1}, \ldots, i_d},
\]
with $I_{k,l} = \{j : 1 \le j \le d, \; j \ne k, \; j \ne l\}$.

Remark. There are $m = d(d-1)/2$ objective functions; each $h_{k,l}(P)$ extremizes the dependency between a pair of coordinates.

The multi-objective program (27) is, in fact, a multi-objective multi-marginal MKP, the solutions of which determine the Extreme Measures. Potential solutions of (27) are multivariate probability measures $P$, which are tensors. Thus, the multi-objective problem is not only tedious to program but practically prohibitively expensive to compute (in terms of both time and storage) for moderate $d$ [8].
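A monotonicity structure is just a binary vector with a fixed first entry, so the $n = 2^{d-1}$ structures can be enumerated directly; a small sketch (our own) follows.

```python
from itertools import product

def monotonicity_structures(d):
    """Enumerate all n = 2**(d-1) monotonicity structures e = (0, e2, ..., ed),
    where e_i = 0 means coordinate i is comonotone with coordinate 1
    and e_i = 1 means it is antimonotone with coordinate 1."""
    return [(0,) + tail for tail in product((0, 1), repeat=d - 1)]

for e in monotonicity_structures(3):
    print(e)
# (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1): n = 4 = 2**(3-1) structures
```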
One approach that leads to a computable solution to the infinite-dimensional problems (23) and (27) is given by the Extreme Joint Distribution (EJD) Theorem, which gives a semi-analytic form describing the extreme joint distribution completely.

Theorem 4.4 (EJD Theorem in $d$ dimensions). Given marginal cumulative distribution functions $F^{(1)}, F^{(2)}, \ldots, F^{(d)}$ on $\mathbb{Z}_+$, corresponding to the marginal distributions $Q^{(1)}, Q^{(2)}, \ldots, Q^{(d)}$ in the constraints (27b), and a monotonicity structure $e^{(j)}$, where $j \in \{1, \ldots, n\}$, the corresponding Extreme Measure is defined by the probabilities
\[
  \hat{P}^{(j)}_{i_1, \ldots, i_d} = \Big[\min\big(\bar{F}_1(i_1 - e^{(j)}_1; e^{(j)}_1), \ldots, \bar{F}_d(i_d - e^{(j)}_d; e^{(j)}_d)\big) - \max\big(\bar{F}_1(i_1 + (e^{(j)}_1 - 1); e^{(j)}_1), \ldots, \bar{F}_d(i_d + (e^{(j)}_d - 1); e^{(j)}_d)\big)\Big]^+, \tag{29}
\]
where $[\,\cdot\,]^+ = \max(0, \cdot)$ and $\bar{F}_k$ is defined as
\[
  \bar{F}_k(i_k; e^{(j)}_k) = \begin{cases} F^{(k)}(i_k), & \text{if } e^{(j)}_k = 0, \\ 1 - F^{(k)}(i_k), & \text{if } e^{(j)}_k = 1. \end{cases} \tag{30}
\]
An accompanying algorithm, the EJD algorithm, provides an efficient numerical method to solve (27) by computing the extreme joint distributions in (29) and their corresponding supports; see [20] and [9] for more details. Note that $\hat{P}^{(j)}$ is very sparse, in that most $\hat{P}^{(j)}_{i_1, \ldots, i_d} = 0$. A complete exposition of the details in the general case can be found in [8].

4.3 Sampling from Extreme Measures

There are two attractive features of the EJD approach which make sampling from multivariate Extreme Measures simple. The first is that the Extreme Measures $\hat{P}^{(k)}$ are monotone distributions [20]. Consequently, their support remains a graph in higher dimensions. This is very convenient for sampling, as it means that Extreme Measures can be sampled from via the inverse CDF method. Second, any discrete multivariate probability measure with specified marginals and some desired dependency structure can be represented as a convex combination of Extreme Measures. That is, the one-to-one relationship between (22) and (25) extends to the multidimensional case as follows. We first find coefficients $(w_1, \ldots, w_n)$ that satisfy $w_j \ge 0$, $j = 1, 2, \ldots, n$, $\sum_{j=1}^{n} w_j = 1$ and
\[
  C = w_1 \hat{C}^{(1)} + \cdots + w_n \hat{C}^{(n)} \tag{31}
\]
and then set
\[
  P = w_1 \hat{P}^{(1)} + \cdots + w_n \hat{P}^{(n)}, \tag{32}
\]
where $\hat{P}^{(j)}$ is the extreme measure satisfying the LP (27) and having the extreme correlation matrix $\hat{C}^{(j)}$. Since $w_i \ge 0$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} w_i = 1$, it follows immediately that $P$ is a probability measure. Moreover, it follows from the linearity of sums that $P$ has the correlation matrix $C$ given on the left side of (31). In addition, since each $\hat{P}^{(j)}$ has marginal distributions $Q^{(1)}, \ldots, Q^{(d)}$, it follows that $P$ also has marginal distributions $Q^{(1)}, \ldots, Q^{(d)}$.
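To make (29) and (30) concrete before moving on, here is a small sketch (ours) that computes a bivariate Extreme Measure from two marginal CDFs on a truncated support, for either the comonotone structure $e = (0, 0)$ or the antimonotone structure $e = (0, 1)$.

```python
import numpy as np
from scipy.stats import poisson

def extreme_measure_2d(q1, q2, antimonotone=False):
    """Compute the bivariate Extreme Measure (29)-(30) on a truncated
    support, given marginal pmfs q1, q2 (arrays summing to ~1)."""
    F1, F2 = np.cumsum(q1), np.cumsum(q2)

    def Fbar(F, i, e):
        # (30): Fbar(i; e) = F(i) if e == 0 else 1 - F(i); with F(-1) = 0.
        v = 0.0 if i < 0 else F[min(i, len(F) - 1)]
        return v if e == 0 else 1.0 - v

    e1, e2 = 0, int(antimonotone)
    P = np.zeros((len(q1), len(q2)))
    for i in range(len(q1)):
        for j in range(len(q2)):
            hi = min(Fbar(F1, i - e1, e1), Fbar(F2, j - e2, e2))
            lo = max(Fbar(F1, i + e1 - 1, e1), Fbar(F2, j + e2 - 1, e2))
            P[i, j] = max(0.0, hi - lo)   # [.]^+ in (29)
    return P

# Poisson(1) and Poisson(2) marginals truncated at 14.
q1, q2 = poisson.pmf(np.arange(15), 1.0), poisson.pmf(np.arange(15), 2.0)
P = extreme_measure_2d(q1, q2, antimonotone=False)
print(P.sum(), np.abs(P.sum(axis=1) - q1).max())  # ~1 and ~0: marginals match
```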
Note that (31) can be converted to a constrained system of linear equations by flattening each extreme correlation matrix $\hat{C}^{(j)}$ into a column vector $A_j \in \mathbb{R}^m$, where $m = d(d-1)/2$. Since $C$ and all the $\hat{C}^{(j)}$ are symmetric with 1s on their diagonal, this can be done by taking each row in the strictly upper triangular part of each $\hat{C}^{(j)}$, appending them into a row vector and taking the transpose to be $A_j$, to obtain $A = [A_1, \ldots, A_n] \in \mathbb{R}^{m \times n}$, representing the extreme points of our problem in correlation space. Similarly, we can flatten the correlation matrix $C$ on the left side of (31) to a vector $b \in \mathbb{R}^m$. Then (31) and the constraints $w_j \ge 0$, $j = 1, 2, \ldots, n$, and $\sum_{j=1}^{n} w_j = 1$ are equivalent to the constrained system of linear equations
\[
  Aw = b, \tag{33a}
\]
\[
  \mathbf{1}^T w = 1, \tag{33b}
\]
\[
  w_j \ge 0, \qquad j = 1, 2, \ldots, n. \tag{33c}
\]
There are many possible solutions to the constrained system of equations (33). One approach is to choose a suitable objective function and then use (33) as the constraints for an optimization problem with that objective function. However, if the goal is just to find any solution to (33), then a simpler approach is to reformulate (33) as
\[
  \hat{A}w = \hat{b}, \tag{34a}
\]
\[
  w_j \ge 0, \qquad j = 1, 2, \ldots, n, \tag{34b}
\]
where $\hat{A}$ is $A$ with the row $\mathbf{1}^T$ appended to the bottom of it and $\hat{b}$ is $b$ with a 1 appended to the bottom of it. Then note that (34) has the form of the standard constraints for a Linear Programming Problem (LPP). Moreover, the first stage of many LPP codes finds a solution to (34). As explained in Section 13.5 of [26], one standard approach to finding a solution to (34) is to solve the LPP
\[
  \min \mathbf{1}^T z \tag{35a}
\]
subject to
\[
  \hat{A}w + Ez = \hat{b}, \tag{35b}
\]
\[
  (w, z) \ge 0, \tag{35c}
\]
where $z \in \mathbb{R}^{m+1}$ and $E$ is a $(m+1) \times (m+1)$ diagonal matrix such that $E_{ii} = +1$ if $\hat{b}_i \ge 0$ and $E_{ii} = -1$ if $\hat{b}_i < 0$.
Clearly, $w = 0$ and $z = |\hat{b}|$ satisfy the constraints (35b) and (35c). So, we can use $w = 0$ and $z = |\hat{b}|$ as a starting point for the simplex method to solve (35). It is clear from the constraint $z \ge 0$ that $\mathbf{1}^T z \ge 0$.
Moreover, if $\mathbf{1}^T z = 0$, then $z = 0$. Hence, (35) has a solution with $\mathbf{1}^T z = 0$ if and only if $\hat{A}w = \hat{b}$, $w \ge 0$, has a solution.

Having found the weights $\{w_j\}_{j=1}^{n}$ through calibration, as described above, the decomposition of the desired measure as a convex combination of Extreme Measures (32) provides an easy method for simulation. Since $w_j \ge 0$, $j = 1, 2, \ldots, n$, and $\sum_{j=1}^{n} w_j = 1$, we can view $w_j$ as the probability that a draw of $P$ comes from the Extreme Measure $\hat{P}^{(j)}$. Moreover, as noted at the beginning of this subsection, we can easily sample from $\hat{P}^{(j)}$. Therefore, we can utilize methods of discrete random variate simulation (see [12] for a comprehensive exposition) to generate a random variable that has the distribution $P$.

Remark. Despite the fact that the problem size grows exponentially in $d$, due to the structure of the problem (33) and the fact that the simplex method needs to explicitly access only $m + 1$ columns of $\hat{A}$ at a time (assuming one has some clever way to decide which new vector to bring into the active set at each step of the simplex method without explicitly accessing all the columns of $\hat{A}$ that are in the inactive set), the LP (35) can be solved for a surprisingly large $d$, e.g., $d = 51$, which corresponds to $n = 2^{50} \approx 10^{15}$; see [8] and [23]. This is also the subject of future work.
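As a small illustration of this calibration step, the sketch below (our own; it uses scipy.optimize.linprog rather than a custom simplex with the column-generation idea described above) finds a feasible $w$ for a toy 3-dimensional problem, i.e., $n = 4$ extreme correlation matrices and $m = 3$ free correlation entries. The matrix entries are made up for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: each column of A is a flattened strictly-upper-triangular part of
# an extreme correlation matrix C_hat(j) for d = 3, so m = 3 and n = 2**2 = 4.
A = np.array([[ 0.9, -0.9,  0.9, -0.9],
              [ 0.8,  0.8, -0.8, -0.8],
              [ 0.7, -0.7, -0.7,  0.7]])
b = np.array([0.2, 0.1, 0.05])      # target correlations, flattened

# Append the convexity constraint 1^T w = 1, giving A_hat w = b_hat, w >= 0.
A_hat = np.vstack([A, np.ones(A.shape[1])])
b_hat = np.append(b, 1.0)

# Phase-one style feasibility problem: find w >= 0 with A_hat w = b_hat.
res = linprog(c=np.zeros(A.shape[1]), A_eq=A_hat, b_eq=b_hat, bounds=(0, None))
print(res.status == 0, res.x)        # status 0 means a feasible w was found
```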
5 Forward Continuation of the Backward Simulation for Mixed Poisson Processes

Up to this point, we have discussed the construction of MPPs within some interval $[0, T]$. A natural question to ask is: what if we want to simulate the process forwards in time, past the original simulation interval? One solution to this was introduced in the Poisson setting in [9], known as the Forward Continuation (FC) of Backward Simulation (BS), where the joint distribution is preserved at various future time points, with the intervals in between filled in by BS. The main idea of the FC method is to retain the independent increments property. Note that since the notion of the linear correlation coefficient and our choice of multivariate extremal dependency concept are pairwise in nature, the discussion in the bivariate setting generalizes immediately to the multivariate case.

In the more general setting of MPPs, the conditional independence of the increments poses a challenge. We show that, with the right construction, the arguments in the Poisson setting extend naturally to the MPP setting. Consider a sequence of time intervals $[0, T), [T, 2T), \ldots, [mT, (m+1)T)$. Suppose that a bivariate MPP $(X_t, Y_t)$ has been simulated in $[0, T)$ using BS and that we wish to continue the process $(X_{T+\tau}, Y_{T+\tau}) = (X_T + \Delta X_\tau, Y_T + \Delta Y_\tau)$ forward in $[T, 2T)$. At the end of the interval $[T, 2T)$, draw a new independent version of $(X_T, Y_T)$, i.e., take $(\Delta X_T, \Delta Y_T) \stackrel{d}{=} (X_T, Y_T)$, and add it to the original $(X_T, Y_T)$ to obtain $(X_{2T}, Y_{2T})$ at time $2T$. For $0 \le \tau < T$, the process $(X_{T+\tau}, Y_{T+\tau})$ can then be constructed by conditional uniformity given the number of events in $[T, 2T)$, i.e., by Backward Simulation. This retains the independence of the marginal increments, since $X_T$ is independent of $\Delta X_\tau$ (and similarly for $Y_T$), yet preserves the joint distribution of the bivariate process $(X_t, Y_t)$, even though, in the MPP setting, the marginal increments are only conditionally independent. Note that $X_{T+\tau}$ remains a MPP due to our construction and the superposition property of MPPs [17].
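The following sketch (ours; a gamma structure distribution is assumed for concreteness, and all names are our own) extends a backward-simulated path period by period: each period draws a fresh independent terminal increment and fills in the arrival times by BS.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

def forward_continuation(sample_terminal_count, T, n_periods):
    """FC of BS for one coordinate: on each period [kT, (k+1)T), draw an
    independent copy of the terminal count and fill in arrivals by BS."""
    all_arrivals = []
    for k in range(n_periods):
        n_k = sample_terminal_count()                # independent copy of X_T
        all_arrivals.append(k * T + np.sort(rng.uniform(0.0, T, size=n_k)))
    return np.concatenate(all_arrivals)

# Gamma structure distribution (an illustrative assumption): a fresh intensity
# is drawn each period, so increments across periods are independent.
r, b, T = 3.0, 2.0, 1.0
sampler = lambda: rng.poisson(rng.gamma(r, 1.0 / b) * T)
path = forward_continuation(sampler, T=T, n_periods=7)
print(len(path), path[:5])
```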
Now that we have a method to extend a bivariate MPP defined initially within the interval $[0, T)$ to an interval $[mT, (m+1)T)$, it is natural to analyze the behaviour of the correlation coefficient on each interval. It turns out that we can prove asymptotic stationarity of the correlation coefficient under Forward Continuation. Indeed, the covariance of the processes $X_t$ and $Y_t$ at time $T + \tau$ can be written as
\[
  \mathrm{Cov}(X_{T+\tau}, Y_{T+\tau}) = \mathrm{Cov}(X_T, Y_T) + \mathrm{Cov}(X_\tau, Y_\tau),
\]
since $X_T$ is independent of $\Delta Y_\tau$ and vice versa. From (17), we have
\[
  \mathrm{Cov}(X_{T+\tau}, Y_{T+\tau}) = \bigg(1 + \frac{\tau^2}{T^2}\bigg) \mathrm{Cov}(X_T, Y_T). \tag{36}
\]
Dividing each side by $\sigma(X_{T+\tau}) \sigma(Y_{T+\tau})$, we obtain the correlation coefficient
\[
  \rho(T + \tau) = \rho(T) \bigg(1 + \frac{\tau^2}{T^2}\bigg) \cdot \frac{\sigma(X_T) \sigma(Y_T)}{\sigma(X_{T+\tau}) \sigma(Y_{T+\tau})}.
\]
Using a similar argument, we can extend (17) to
\[
  \mathrm{Cov}(X_{mT+\tau}, Y_{mT+\tau}) = \bigg(m + \frac{\tau^2}{T^2}\bigg) \mathrm{Cov}(X_T, Y_T) \tag{37}
\]
for $m = 0, 1, 2, \ldots$ Thus,
\[
  \rho(mT + \tau) = \rho(T) \bigg(m + \frac{\tau^2}{T^2}\bigg) \cdot \frac{\sigma(X_T) \sigma(Y_T)}{\sqrt{\sigma^2(X_{mT} + \Delta X_\tau)} \sqrt{\sigma^2(Y_{mT} + \Delta Y_\tau)}}
  = \rho(T) \bigg(m + \frac{\tau^2}{T^2}\bigg) \cdot \frac{\sigma(X_T) \sigma(Y_T)}{\sqrt{\sigma^2(X_{mT}) + \sigma^2(\Delta X_\tau)} \sqrt{\sigma^2(Y_{mT}) + \sigma^2(\Delta Y_\tau)}}
  = \rho(T) \bigg(m + \frac{\tau^2}{T^2}\bigg) \cdot \frac{\sigma(X_T) \sigma(Y_T)}{\sqrt{m \sigma^2(X_T) + \sigma^2(X_\tau)} \sqrt{m \sigma^2(Y_T) + \sigma^2(Y_\tau)}}. \tag{38}
\]
In going from the first line in (38) to the second line, we use the property that the variance term $\sigma^2(X_{mT} + \Delta X_\tau)$ can be decomposed as $\sigma^2(X_{mT}) + \sigma^2(\Delta X_\tau)$, since $\Delta X_\tau \stackrel{d}{=} X_\tau$ and $X_{mT}$ is independent of $\Delta X_\tau$. Similarly, $\sigma^2(Y_{mT} + \Delta Y_\tau) = \sigma^2(Y_{mT}) + \sigma^2(\Delta Y_\tau)$. In going from the second line in (38) to the third line, we use the property that $\sigma^2(X_{mT})$ can be written as $m \sigma^2(X_T)$, which can be seen as follows. Consider that $\sigma^2(X_{2T}) = \sigma^2(X_{T+T}) = \sigma^2(X_T + \hat{X}_T) = \sigma^2(X_T) + \sigma^2(\hat{X}_T) = 2\sigma^2(X_T)$, where $X_T$ and $\hat{X}_T$ are iid. So, by induction on $m$, we get that $\sigma^2(X_{mT}) = m \sigma^2(X_T)$. Similarly, $\sigma^2(Y_{mT}) = m \sigma^2(Y_T)$.

Theorem 5.1 (Asymptotic Stationarity of the Forward Continuation). The correlation $\rho(mT + \tau)$ achieves asymptotic stationarity as $m \to \infty$:
\[
  \lim_{m \to \infty} \rho(mT + \tau) = \rho(T), \qquad \tau \in [0, T]. \tag{39}
\]
Proof. The R.H.S. of (38) can be rewritten as follows:
\[
  \rho(T) \bigg(m + \frac{\tau^2}{T^2}\bigg) \cdot \frac{\sigma(X_T) \sigma(Y_T)}{\sqrt{m \sigma^2(X_T) + \sigma^2(X_\tau)} \sqrt{m \sigma^2(Y_T) + \sigma^2(Y_\tau)}}
  = \rho(T) \frac{m + \frac{\tau^2}{T^2}}{m} \cdot \frac{\sigma(X_T) \sigma(Y_T)}{\sqrt{\sigma^2(X_T) + (1/m) \sigma^2(X_\tau)} \sqrt{\sigma^2(Y_T) + (1/m) \sigma^2(Y_\tau)}}.
\]
Passing to the limit as $m \to \infty$ in the standard manner, we obtain that
\[
  \lim_{m \to \infty} \rho(mT + \tau) = \rho(T),
\]
as was to be proved.

5.1 Numerical Experiments

A graphical illustration of Theorem 5.1 can be found in Figure 2, which shows the good agreement between the analytic and numerical results. Note that while the correlations at the calibrated integer grid points $nT$ are exact, the correlation structure in between grid points, generated via the FC method, requires a few time periods in order to settle to the asymptotic value of $\rho(T)$. The first few time periods can be used as "burn in" periods in order to achieve a more stable and accurate desired value for the correlation between processes for the in-between time periods as filled in by BS. That is, depending on the choice of simulation, the process obtained by simulating purely through BS on $[0, 2T)$ will have a different correlation structure than a process simulated using BS on $[0, T)$ and FC on $[T, 2T)$. This is a user-defined choice that is dependent on the needs of the application.

[Figure 2: Correlation structure for a bivariate NB process where the first NB process has a mean of 5 and a variance of 5 and the second NB process has a mean of 5 and a variance of 30. The bivariate process was calibrated to a positive correlation $\rho(1) > 0$ and to a negative correlation $\rho(1) < 0$ at $T = 1$. Then, the bivariate process was extended forward via Forward Continuation to $T = 7$. The blue circles represent the correlations from Monte Carlo simulations of the bivariate NB process constructed through FC of the BS. The red line represents the correlation obtained analytically. The black dotted line represents the correlation structure of a bivariate Poisson process with the same mean parameters and calibrated to the same positive and negative correlations, derived analytically.]

6 Concluding Remarks

In this paper, we extended the Backward Simulation (BS) method and the Forward Continuation of the BS method from the class of Poisson processes to the more general class of multivariate Mixed Poisson Processes. The advantages of the Backward approach over the Forward approach for generating sample paths of multivariate Mixed Poisson processes in some simulation interval $[0, T]$ are numerous: simple and efficient simulation in $d$ dimensions; specification of a dependency structure in the form of a given correlation matrix $C$ at terminal simulation time $T$; and a wider range of possible correlations between the marginal distributions.

The Backward Simulation approach is applicable to any process that exhibits the order statistic property. For example, Backward Simulation is applicable to the class of Negative Binomial Lévy Processes [3]. In fact, it is applicable to a more general class of processes known as Compound Poisson Processes, which also possess a linear correlation structure under BS. It is also applicable to inhomogeneous Poisson Processes. Extending the BS approach to Compound Poisson Processes and inhomogeneous Poisson Processes will be the subject of our forthcoming work.

Acknowledgements

This research was supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada.
References

[1] F. Aue and M. Kalkbrener. "LDA at Work: Deutsche Bank's Approach to Quantifying Operational Risk". In: The Journal of Operational Risk.
[2] In: Journal of Statistical Computation and Simulation. url: https://doi.org/10.1080/00949655.2016.1277428.
[3] T. Bae and M. Mazjini. "Backward Simulation of Correlated Negative Binomial Lévy Processes". In: Mathematics and Statistics.
[4] O. Barndorff-Nielsen and N. Shephard. "Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics". In: Journal of the Royal Statistical Society: Series B (Statistical Methodology).
[5] In: Journal of Applied Probability.
[6] In: Quantitative Finance.
[7] In: Journal of Banking & Finance.
[8] In: Model Assisted Statistics and Applications.
[9]
[10] R. Cont and P. Tankov. Financial Modelling with Jump Processes. CRC Press, 2004.
[11] K. S. Crump. "On point processes having an order statistic structure". In: Sankhyā: The Indian Journal of Statistics, Series A (1975), pp. 396–404.
[12] L. Devroye. Non-Uniform Random Variate Generation. Springer, 1986.
[13] K. Duch, A. Kreinin, and Y. Jiang. "New Approaches to Operational Risk Modeling". In: IBM Journal of Research and Development.
[14] In: The Geneva Risk and Insurance Review.
[15] W. Feller. An Introduction to Probability Theory and Its Applications. Vol. 1. John Wiley & Sons, 2008.
[16] M. Fréchet. "Sur les tableaux de corrélation dont les marges sont données". In: Revue de l'Institut International de Statistique 14 (1951), pp. 53–77.
[17] J. Grandell. Mixed Poisson Processes. CRC Press, 1997.
[18] R. C. Griffiths, R. K. Milne, and R. Wood. "Aspects of correlation in bivariate Poisson distributions and processes". In: Australian & New Zealand Journal of Statistics.
[19] W. Hoeffding. In: Schriften Math. Inst. Univ. Berlin.
[20] Financial Signal Processing and Machine Learning. Ed. by A. N. Akansu, S. R. Kulkarni, and D. M. Malioutov. John Wiley & Sons, 2016. Chap. 9, pp. 191–230.
[21] F. Lindskog and A. J. McNeil. "Common Poisson shock models: applications to insurance and credit risk modelling". In: ASTIN Bulletin: The Journal of the IAA.
[22]
[23] A Modified Simplex Method for Solving Ax = b, x ≥ 0, for Very Large A Arising from a Calibration Problem. A soon-to-be-published research paper, Computer Science Department, University of Toronto, 2020.
[24] R. Nelsen. "Discrete bivariate distributions with given marginals and correlation". In: Communications in Statistics - Simulation and Computation.
[25] In: Journal of Operational Risk.
[26] J. Nocedal and S. J. Wright. Numerical Optimization. Springer Science & Business Media, 2006.
[27] H. H. Panjer. Operational Risk: Modeling Analytics. John Wiley & Sons, 2006.
[28] G. Peters, P. Shevchenko, and M. Wuthrich. "Dynamic operational risk: modeling dependence and combining different sources of information". In: Journal of Operational Risk.
[29] In: Algo Research Quarterly.
[30] G. Puccetti and R. Wang. "Extremal Dependence Concepts". In: Statistical Science. url: https://doi.org/10.1214/15-STS525.
[31] S. T. Rachev and L. Rüschendorf. Mass Transportation Problems: Volume I: Theory. Springer Science & Business Media, 1998.
[32] P. Shevchenko. Modelling Operational Risk Using Bayesian Inference. Springer Science & Business Media, 2011.
[33] W. Whitt. "Bivariate distributions with given marginals". In: The Annals of Statistics.