Parameter estimation of default portfolios using the Merton model and phase transition

arXiv [q-fin.RM], May 2020

Masato Hisakado
Nomura Holdings, Inc., Otemachi 2-2-2, Chiyoda-ku, Tokyo 100-8130, Japan

Shintaro Mori
Department of Mathematics and Physics, Graduate School of Science and Technology, Hirosaki University, Bunkyo-cho 3, Hirosaki, Aomori 036-8561, Japan

(Dated: May 19, 2020)

Abstract

We discuss the parameter estimation of the probability of default (PD), the correlation between the obligors, and a phase transition. In our previous work, we studied the problem using the beta-binomial distribution, for which a non-equilibrium phase transition with an order parameter occurs when the temporal correlation decays by power law. In this article, we adopt the Merton model, which uses an asset correlation as the default correlation, and find that a phase transition occurs when the temporal correlation decays by power law. When the power index is less than one, the PD estimator converges slowly. Thus, it is difficult to estimate PD with limited historical data. Conversely, when the power index is greater than one, the variance of the estimator decreases in inverse proportion to the number of samples. We investigate the empirical default data history of several rating agencies. The estimated power index is in the slow convergence range when we use long history data. This suggests that PD could have a long memory and that it is difficult to estimate parameters due to slow convergence.

1. INTRODUCTION

Anomalous diffusion is one of the most interesting topics in sociophysics and econophysics [1–3]. The models describing such phenomena have a long memory [4–10] and show several types of phase transitions. In our previous work, we investigated voting models for an information cascade [11–17]. This model has two types of phase transitions. One is the information cascade transition, which is similar to the phase transition of the Ising model [13] and determines whether a distribution converges. The other is the convergence transition of the super-normal diffusion, which corresponds to an anomalous diffusion [12, 18].

In financial engineering, several products have been invented to hedge risks. The credit default swap (CDS) is one tool used to hedge credit risks; it is a single-name credit derivative that targets the default of a single obligor. Synthetic collateralized debt obligations (CDOs) are financial innovations that securitize portfolios of assets and that, in the 2000s, became the trigger of the Great Recession in 2008. These products provide protection against a subset of the total loss on a credit portfolio in exchange for payments. They provide valuable insights into market implications on default dependencies and the clustering of defaults. This final aspect is important because the difficulties in managing credit events depend on correlations.

Estimations of the PD and of the correlation between the obligors have been obtained from empirical studies on historical data. These two parameters are important for pricing financial products such as synthetic CDOs [19–21]. Moreover, they are important to financial institutions for portfolio management, and the PDs are called "long-run PDs" in the regulations. When defaults are rare and correlated, it is not easy to estimate these parameters [22, 23].

In this work, we study a Bayesian estimation method using the Merton model. Under normal circumstances, the Merton model incorporates default correlation through the correlation of asset price movements (asset correlation), which is used to estimate the PD and the correlation. A Monte Carlo simulation is an appropriate tool to estimate the parameters, except in the limit of large homogeneous portfolios [21]. In this limit, the distribution becomes a Vasicek distribution, which can be calculated analytically [24].

In our previous paper, we discussed parameter estimation using the beta-binomial distribution with default correlation and considered a multi-year case with a temporal correlation [17]. A non-equilibrium phase transition, like that of the Ising model, occurs when the temporal correlation decays by power law. In this study, we discuss a phase transition when we use the Merton model. When the power index is less than one, the estimator distribution of the PD converges slowly to the delta function. Alternatively, when the power index is greater than one, the convergence is the same as that of the normal case. When the distribution converges slowly, it takes time to estimate the PD with limited data.

To confirm the decay form of the temporal correlation, we investigate empirical default data. We confirm that the estimated power index lies in the slow convergence range. This demonstrates that, even with a long data history, it takes time to correctly estimate the PD, the asset correlation, and the temporal correlation.

The remainder of this paper is organized as follows. In Section 2, we introduce the stochastic process of the Merton model and consider the convergence of the PD estimator. In Section 3, we apply a Bayesian estimation approach to the empirical data of default history using the Merton model and confirm its parameters. The estimated parameter is in the slow convergence phase. Finally, the conclusions are presented in Section 4.

2. ASSET CORRELATION AND DEFAULT CORRELATION

In this section we consider whether the time series of a stochastic process using the Merton model converges [25]. We show that the convergence is intimately related to the phase transition. Using this result, we discuss whether the parameters can be estimated.

The normal random variables $S_t$ are hidden variables that describe the state of the economy, and $S_t$ affects all obligors in the $t$-th year. To introduce the temporal correlation of defaults in different years, let $\{S_t, 1 \le t \le T\}$ be a time series of correlated normal random variables with the following correlation matrix $\Sigma$:

$$\Sigma \equiv \begin{pmatrix} 1 & d_1 & d_2 & \cdots & d_{T-1} \\ d_1 & 1 & d_1 & \ddots & \vdots \\ d_2 & d_1 & \ddots & \ddots & d_2 \\ \vdots & \ddots & \ddots & 1 & d_1 \\ d_{T-1} & \cdots & d_2 & d_1 & 1 \end{pmatrix}, \tag{1}$$

where $(S_1, \cdots, S_T)^{\mathrm{T}} \sim \mathrm{N}_T(0, \Sigma)$. In this work, we consider two cases of temporal correlation: exponential decay, $d_i = \theta^i$, $0 \le \theta \le 1$, and power decay, $d_i = 1/(i+1)^{\gamma}$, $\gamma \ge 0$. We assume that the number of obligors in the $t$-th year is constant and denote it as $n$.

The asset correlation, $\rho_A$, is the parameter that describes the correlation between the asset values of the obligors in the same year. We consider the $i$-th asset value at time $t$, $\hat{U}_{it}$, to be

$$\hat{U}_{it} = \sqrt{\rho_A}\, S_t + \sqrt{1-\rho_A}\, \epsilon_{it}, \tag{2}$$

where $\epsilon_{it} \sim \mathrm{N}(0,1)$ are i.i.d. With this formulation, the equal-time correlation of $\hat{U}_{it}$ is $\rho_A$. The discrete dynamics of the process is described by

$$X_{it} = 1_{\hat{U}_{it} \le Y}, \tag{3}$$

where $Y$ is the threshold and $1 \le i \le n$. When $X_{it} = 1$ (0), the $i$-th obligor in the $t$-th year is in default (non-default). Eq. (3) gives the conditional default probability for $S_t = S$ as

$$G(S) \equiv P(X_{it} = 1 \mid S_t = S) = \Phi\left(\frac{Y - \sqrt{\rho_A}\, S}{\sqrt{1-\rho_A}}\right), \tag{4}$$

where $\Phi(x)$ is the standard normal cumulative distribution function, $G(S_t)$ is the default probability during the $t$-th year in the portfolio, and the average PD is $p = \Phi(Y)$, which corresponds to the "long-run PD".

The default correlation, $\rho_D$, is

$$\rho_D = f(\rho_A) \equiv \frac{P(X_{it} = 1 \cap X_{jt} = 1) - p^2}{p(1-p)} = \frac{\Phi_2\left((\Phi^{-1}(p), \Phi^{-1}(p)), \rho_A\right) - p^2}{p(1-p)},$$

where $\Phi_2$ denotes the bivariate normal distribution with standardized marginals. This defines the mapping function, $\rho_D = f(\rho_A)$, between the default correlation $\rho_D$ and the asset correlation $\rho_A$. Note that the mapping function depends on $p$. Through the temporal correlation of $S_t$, the correlation of the asset values at different times is

$$\mathrm{Cor}(\hat{U}_{it}, \hat{U}_{jt'}) = \rho_A \Sigma_{t,t'}, \tag{5}$$

where $\rho_A$ is the asset correlation and $\Sigma$ is the correlation matrix for $S_t$, $t = 1, \cdots, T$. In Appendix A we explain how to calculate Eq. (5). The default correlation between $X_{it}$ and $X_{jt'}$ is given by

$$\frac{P(X_{it} = 1 \cap X_{jt'} = 1) - p^2}{p(1-p)} = \frac{\Phi_2\left((\Phi^{-1}(p), \Phi^{-1}(p)), \rho_A d_{|t-t'|}\right) - p^2}{p(1-p)} = f(\rho_A d_{|t-t'|}).$$
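The mapping from asset correlation to default correlation can be checked numerically. The following Python sketch is not taken from the paper. It assumes only the standard library and evaluates the joint default probability through the one-factor identity $P(X_{it} = 1 \cap X_{jt'} = 1) = E[G(S)^2]$, with $\rho_A$ in $G$ replaced by the relevant asset correlation $r$, using a trapezoidal rule. The function names, the grid size, and the integration range are illustrative assumptions. The helper `slope_a` returns the linear coefficient of $f$ at small correlation (the quantity the text calls $A$ in Eq. (6)), using the fact that $\partial \Phi_2 / \partial \rho$ at $\rho = 0$ equals the squared normal density at $\Phi^{-1}(p)$.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def conditional_pd(s, p, rho_a):
    """G(S) of Eq. (4) with Y = Phi^{-1}(p); requires 0 <= rho_a < 1."""
    y = STD.inv_cdf(p)
    return STD.cdf((y - sqrt(rho_a) * s) / sqrt(1.0 - rho_a))


def joint_default_prob(p, r, grid=4001, s_max=8.0):
    """P(two obligors default) for asset correlation 0 <= r < 1, as E[G(S)^2]."""
    if r <= 0.0:
        return p * p  # independent assets
    h = 2.0 * s_max / (grid - 1)
    total = 0.0
    for k in range(grid):
        s = -s_max + k * h
        weight = 0.5 if k in (0, grid - 1) else 1.0  # trapezoidal end points
        total += weight * STD.pdf(s) * conditional_pd(s, p, r) ** 2
    return total * h


def default_correlation(p, r):
    """Mapping function f(r) between asset and default correlation."""
    return (joint_default_prob(p, r) - p * p) / (p * (1.0 - p))


def slope_a(p):
    """Slope of f at zero asset correlation (illustrative closed form)."""
    y = STD.inv_cdf(p)
    return exp(-y * y) / (2.0 * pi * p * (1.0 - p))


if __name__ == "__main__":
    for p in (0.5, 0.1, 0.01):
        print(p, default_correlation(p, 0.2), slope_a(p) * 0.2)
```

For $p = 0.5$ the output can be compared with the known closed form $f(\rho) = (2/\pi) \arcsin \rho$, which gives a quick sanity check of the quadrature.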

We are interested in the unbiased estimator of PD, $Z(T) \equiv \sum_{t,i} X_{it}/(nT)$, and the limit $\lim_{T\to\infty} Z(T)$. As the covariance of $X_{it}$ and $X_{jt'}$ is $p(1-p) f(\rho_A d_{|t-t'|})$, the variance of $Z(T)$ is

$$V(Z(T)) = \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1)}{nT} f(\rho_A) + \frac{2p(1-p)}{T^2} \sum_{i=1}^{T-1} f(\rho_A d_i)(T-i).$$

The first term is from the binomial distribution, the second term is from the default correlation in the same year, and the third term is from the temporal correlation. In the limit $T \to \infty$, the first two terms disappear and the convergence of the estimator $Z(T)$ is governed by the third term.

We study the asymptotic behavior of $f(\rho_A d_i)$ for large $i$. $f(\rho_A)$ is explicitly given as

$$f(\rho_A) = \frac{\Phi_2\left((\Phi^{-1}(p), \Phi^{-1}(p)), \rho_A\right) - p^2}{p(1-p)} = \frac{1}{p(1-p)} \left( \frac{1}{2\pi\sqrt{1-\rho_A^2}} \int_{-\infty}^{\Phi^{-1}(p)} dx \int_{-\infty}^{\Phi^{-1}(p)} dy\, \exp\left( -\frac{x^2 + y^2 - 2\rho_A xy}{2(1-\rho_A^2)} \right) - p^2 \right).$$

As we assume that $d_i$ decays to zero for large $i$, we expand $f(\rho_A)$ at $\rho_A = 0$ as

$$f(\rho_A) = \frac{\rho_A}{2\pi p(1-p)} \left( \int_{-\infty}^{\Phi^{-1}(p)} x \exp\left(-\frac{x^2}{2}\right) dx \right)^2 + O(\rho_A^2).$$

We denote the coefficient as $A$, so that $f(\rho_A) \simeq A\rho_A$. $A$ is defined as

$$A \equiv \frac{1}{2\pi p(1-p)} \left( \int_{-\infty}^{\Phi^{-1}(p)} x \exp\left(-\frac{x^2}{2}\right) dx \right)^2 = \frac{\exp\left(-[\Phi^{-1}(p)]^2\right)}{2\pi p(1-p)} > 0. \tag{6}$$

Hence, we can confirm $A > 0$. For large $i$ and $d_i \to 0$, we have the asymptotic behavior of the default correlation as

$$C(t) \equiv \mathrm{Cor}(X_{is}, X_{j(s+t)}) = f(\rho_A d_t) \simeq A \rho_A d_t.$$

We note that the default correlation $f(\rho_A d_i)$ obeys the same decay law as that of $d_i$ for large $i$ and $d_i \to 0$.

We plot $\rho_D = f(\rho_A)$ in the $(\rho_A, \rho_D)$-plane in Fig. 1 under the conditions $f(0) = 0$ and $f(1) = 1$. The straight lines $\rho_D = A\rho_A$ with $A$ from Eq. (6) are plotted in Fig. 1 (b). We confirm that $A$ in Eq. (6) is the same as the slope of the tangent line at the point $(\rho_A, \rho_D) = (0, 0)$. The relation between $p$ and $A$ is shown in Fig. 1 (c). From the convexity of $f(\rho_A)$ in Fig. 1 (a), we find the following inequality:

$$A \rho_A d_i < f(\rho_A d_i) < f(\rho_A) d_i = \rho_D d_i, \quad \text{for } \rho_A, \rho_D > 0,\ 0 < d_i < 1.$$

FIG. 1. (a) Plot of the mapping function $f(\rho_A)$ vs. $\rho_A$, which shows the relation between the asset correlation $\rho_A$ and the default correlation $\rho_D$. $f(\rho_A)$ is downward convex and $f(x\rho_A) \le x f(\rho_A)$ for $0 \le x \le 1$. (b) The straight lines $\rho_D = A\rho_A$ with $A$ from Eq. (6) for $p$ = 50%, 10%, and 1%. (c) Plot of $A$ vs. $p$.

Using this inequality, we form the upper bound for $V(Z(T))$ as

$$V(Z(T)) = \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1)}{nT} f(\rho_A) + \frac{2p(1-p)}{T^2} \sum_{i=1}^{T-1} f(\rho_A d_i)(T-i) \le \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1)}{nT} f(\rho_A) + \frac{2p(1-p)\rho_D}{T^2} \sum_{i=1}^{T-1} d_i (T-i).$$

The lower bound is then

$$V(Z(T)) \ge \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1)}{nT} f(\rho_A) + \frac{2p(1-p) A \rho_A}{T^2} \sum_{i=1}^{T-1} d_i (T-i).$$

In both the upper and lower bounds, the third term is proportional to $\frac{1}{T^2}\sum_{i=1}^{T-1} d_i (T-i)$. Thus, we can estimate the asymptotic behavior of $V(Z(T))$ by the following expression:

$$V(Z(T)) = \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1)}{nT} \rho_D + \frac{2p(1-p)c}{T^2} \sum_{i=1}^{T-1} d_i (T-i), \tag{7}$$

where $c$ is a positive constant and $\rho_D > c > A\rho_A$.

2.1 Exponential temporal correlation

In this subsection, we study the convergence of $Z(T)$ for the exponential decay model $d_i = \theta^i$, $\theta \le 1$. The variance is

$$V(Z(T)) \simeq \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1)}{nT} f(\rho_A) + \frac{2p(1-p)c}{T^2} \sum_{i=1}^{T-1} \theta^i (T-i). \tag{8}$$

The first two terms on the right-hand side (RHS) behave as $\propto 1/T$ and, thus, converge to 0 in the limit $T \to \infty$. In the case that $\theta \ne 1$, the third term is

$$\frac{2p(1-p)c}{T^2} \left[ \frac{T(\theta - \theta^{T})}{1-\theta} + \frac{(T-1)\theta^{T}(1-\theta) - (1-\theta^{T-1})\theta}{(1-\theta)^2} \right] \propto 1/T$$

and it converges to 0 in the limit $T \to \infty$. In addition, $C(t) \simeq A\rho_A \theta^t$ for large $t$. We conclude that as the number of data samples increases, the distribution of $Z(T)$ converges to a delta function and, therefore, PD can be estimated empirically.

To confirm this, we calculate $C(t) = f(\rho_A \theta^t)$ and $V(Z(t))$ numerically at fixed $\rho_A$ and $n$, for $p = 0.5, 0.1, 0.01$ and $\theta = 0.999, 0.99, 0.9, 0.8$. Figure 2 (a)-(c) shows the plot of $C(t)$ vs. $t$. Here, it is clearly seen that $C(t)$ decays exponentially. Figure 2 (d)-(f) shows the plot of $V(Z(t))$ vs. $t$. For all $\theta < 1$, $\theta \in \{0.999, 0.99, 0.9, 0.8\}$, $V(Z(t))$ decays as $1/t$. When $\theta = 1$, the temporal correlation does not decay and all obligors in all years are correlated with the asset correlation $\rho_A$. Hence, there is no phase transition for $\theta < 1$.

FIG. 2. Plots of (a), (b), (c) $C(t)$ and (d), (e), (f) $V(Z(t))$ vs. $t$, for $\theta \in \{0.999, 0.99, 0.9, 0.8\}$, in the exponential decay case. The PDs are 1% for (a) and (d), 10% for (b) and (e), and 50% for (c) and (f). The reference lines $0.01/t$, $0.1/t$, and $0.5/t$ are shown in (d), (e), and (f), respectively.

2.2 Power temporal correlation

In this subsection, we consider the power decay case $d_i = 1/(i+1)^{\gamma}$, $i = 1, 2, \cdots$, where $\gamma \ge 0$ is the power index. The behavior differs between $\gamma \le 1$ and $\gamma > 1$. The asymptotic behavior of $V(Z(T))$ is given as:

$$V(Z(T)) \simeq \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1)}{nT} f(\rho_A) + \frac{2p(1-p)c}{T^2} \sum_{i=1}^{T-1} (i+1)^{-\gamma} (T-i).$$

2.2.1) $\gamma > 1$ case

We can obtain

$$V(Z(T)) \simeq \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1) f(\rho_A)}{nT} + \frac{2p(1-p)c}{T^2} \sum_{i=1}^{T-1} \frac{T-i}{(i+1)^{\gamma}} \simeq \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1)\rho_D}{nT} + \frac{2p(1-p)c}{(\gamma-1)T}. \tag{9}$$

The first two terms decrease as $1/T$. For $\gamma > 1$ the sum $\sum_i (i+1)^{-\gamma}$ converges, so the third term also decreases as $1/T$. Hence, $V(Z(T))$ behaves as $\sim 1/T$. The convergence speed is the same as that of the independent binomial case.

2.2.2) $\gamma = 1$ case

$V(Z(T))$ behaves as

$$V(Z(T)) \simeq \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1) f(\rho_A)}{nT} + \frac{2p(1-p)c}{T^2} \sum_{i=1}^{T-1} \frac{T-i}{i+1}. \tag{10}$$

The RHS of Eq. (10) is evaluated as

$$\mathrm{RHS} \simeq \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1) f(\rho_A)}{nT} + \frac{2p(1-p)c\,[(T+1)\log T - T + 2]}{T^2} \sim \log T / T. \tag{11}$$

In conclusion, $V(Z(T))$ behaves asymptotically as

$$V(Z(T)) \sim \log T / T \tag{12}$$

and the estimator $Z(T)$ converges to $p$ more slowly than in the normal case.

2.2.3) $\gamma < 1$ case

$V(Z(T))$ is calculated as:

$$V(Z(T)) \simeq \frac{p(1-p)}{nT} + \frac{p(1-p)(n-1) f(\rho_A)}{nT} + 2p(1-p)c \left[ \frac{1}{(1-\gamma)(2-\gamma) T^{\gamma}} \right] \sim T^{-\gamma}. \tag{13}$$

Then, we can conclude that $V(Z(T))$ behaves as

$$V(Z(T)) \sim T^{-\gamma}. \tag{14}$$

FIG. 3. Plots of (a), (b), (c) $C(t)$ and (d), (e), (f) $V(Z(t))$ vs. $t$ on double logarithmic axes, for $\gamma$ = 0.1, 0.5, 1.0, 1.5, and 2.0, in the power decay case. The PDs are 1% for (a) and (d), 10% for (b) and (e), and 50% for (c) and (f). The reference lines $0.005/t$, $0.05/t$, and $0.1/t$ are shown in (d), (e), and (f), respectively.

In conclusion, a phase transition occurs when the temporal correlation decays by power law. When the power index, $\gamma$, is less than one, the PD estimator $Z(T)$ slowly converges to $p$. Conversely, when the power index $\gamma$ is greater than one, the convergence behavior is the same as that of the binomial distribution. This phase transition is called a "super-normal transition" [12, 18], which is the transition between long memory and intermediate memory. This transition is different from the phase transition found when we used the beta-binomial distribution in our previous work. In that article, when the power index was less than one, the PD estimator $Z(T)$ did not converge to $p$ when the beta-binomial model was used [17].

FIG. 4. Plots of $\delta = \log_2 (V(Z(T))/V(Z(2T)))$ at fixed $T$ vs. $\gamma$. The PDs are 1% (dotted), 10% (broken), and 50% (solid).
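The asymptotic forms in Eqs. (9)-(14) can be explored numerically from the approximate variance in Eq. (7). The sketch below is an illustration under stated assumptions rather than the authors' calculation: the default values of `n`, `p`, `rho_d`, and `c` are arbitrary placeholders (with `c` simply chosen below `rho_d`), and the function names are hypothetical. It evaluates the weighted sum $\sum_{i=1}^{T-1} d_i (T-i)$ directly, compares it with the closed form for exponential decay, and reports the local exponent $\log_2 (V(Z(T))/V(Z(2T)))$, which should be close to 1 when the variance decays as $1/T$ and close to $\gamma$ when it decays as $T^{-\gamma}$.

```python
from math import log2


def exponential_decay(theta):
    """d_i = theta**i."""
    return lambda i: theta ** i


def power_decay(gamma):
    """d_i = (i + 1)**(-gamma)."""
    return lambda i: (i + 1.0) ** (-gamma)


def weighted_decay_sum(decay, T):
    """Sum_{i=1}^{T-1} d_i (T - i), the factor in the third term of Eq. (7)."""
    return sum(decay(i) * (T - i) for i in range(1, T))


def exponential_sum_closed_form(theta, T):
    """Closed form of the same sum for d_i = theta**i with theta != 1."""
    first = T * (theta - theta ** T) / (1.0 - theta)
    second = ((T - 1) * theta ** T * (1.0 - theta)
              - (1.0 - theta ** (T - 1)) * theta) / (1.0 - theta) ** 2
    return first + second


def variance_z(T, decay, n=1000, p=0.01, rho_d=0.05, c=0.03):
    """Approximate V(Z(T)) of Eq. (7); n, p, rho_d, c are illustrative values."""
    pq = p * (1.0 - p)
    return (pq / (n * T)
            + pq * (n - 1) * rho_d / (n * T)
            + 2.0 * pq * c * weighted_decay_sum(decay, T) / T ** 2)


def local_exponent(T, decay, **kwargs):
    """log2(V(Z(T)) / V(Z(2T))), a finite-T estimate of the decay exponent."""
    return log2(variance_z(T, decay, **kwargs) / variance_z(2 * T, decay, **kwargs))


if __name__ == "__main__":
    for gamma in (0.1, 0.5, 1.0, 1.5, 2.0):
        print("gamma", gamma, "exponent", round(local_exponent(2000, power_decay(gamma)), 3))
    print("theta 0.9 exponent", round(local_exponent(2000, exponential_decay(0.9)), 3))
```

Because the first two terms of Eq. (7) always decay as $1/T$, the exponent measured at finite $T$ sits slightly above $\gamma$ for small $\gamma$ and approaches the asymptotic value only slowly near $\gamma = 1$.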
To confirm the phase transition, we calculate $C(t) = f(\rho_A (t+1)^{-\gamma})$ and $V(Z(T))$. Fig. 3 (a)-(c) shows the double logarithmic plot of $C(t)$ vs. $t$. $C(t)$ decays by power law for all values of $\gamma$ considered. For small $\gamma$, such as $\gamma = 0.1$, the slope is extremely small. Fig. 3 (d)-(f) shows the double logarithmic plot of $V(Z(t))$ vs. $t$. For $\gamma > 1$, $V(Z(t))$ decays as $1/t$. At $\gamma \le 1$, the slope of the decay becomes less than one. In this case, the convergence becomes slower than in the normal case.

Next, we confirm the phase transition using finite size scaling. We estimate the exponent of the convergence of $Z(T)$. If we assume that $V(Z(T)) \propto T^{-\delta}$, the exponent $\delta$ is estimated as

$$\delta = \log_2 \frac{V(Z(T))}{V(Z(2T))}.$$

In the case $V(Z(T)) \sim \ln T / T$, we have

$$\log_2 \frac{V(Z(T))}{V(Z(2T))} = 1 - \log_2 \left( 1 + \frac{\ln 2}{\ln T} \right) < 1.$$

We estimate $\delta$ numerically at fixed $T$ and plot the results in Fig. 4. We see that $\delta = 1$ for $\gamma > 1$ and $\delta = \gamma$ for $\gamma < 1$. When $\gamma \simeq 1$, the relation becomes obscured by the finite size effect.

In summary, when $\gamma > 1$, $Z(T)$ converges to $p$ as in the normal case. On the other hand, when $\gamma \le 1$, the convergence is slower than that of the normal case. Hence, there is a phase transition at $\gamma = 1$.

3. ESTIMATION OF PARAMETERS

As discussed in the previous section, whether the temporal correlation obeys an exponential decay or a power decay is an important issue because there exists a super-normal transition in the latter case. Further, the appearance of a transition affects whether we can estimate the PD.

First, the S&P default data from 1981 to 2018 [27] are used. The average PD is 1.51% for all ratings, 3.90% for speculative grade (SG) ratings, and 0.09% for investment grade (IG) ratings. SG denotes ratings below BBB- (Baa3), and IG denotes ratings of BBB- (Baa3) or above. In Fig. 5 (a) we show the historical default rate of the S&P. The solid and dotted lines correspond to the speculative grade and investment grade samples, respectively.

We also use Moody's default data for the 99 years from 1920 to 2018 [28]. It includes the Great Depression in 1929 and the Great Recession in 2008. The average default rate is 1.50% for all of the ratings, 3.70% for speculative grade ratings, and 0.14% for investment grade ratings. In Fig. 5 (b), we show the historical default rate of Moody's.

FIG. 5. (a) S&P default rate in 1981-2018. (b) Moody's default rate in 1920-2018. The solid and dotted lines correspond to the speculative grade (SG) and investment grade (IG) samples, respectively.

We estimate the parameters $p$, $\rho_A$, $\theta$, and $\gamma$ of the Merton model using the Bayesian method with Stan 2.19.2 in R 3.6.2. We explain the method and how to estimate the parameters in Appendix B [29] and summarize the results in Table I. We show $\rho_D$ instead of $\rho_A$, as we need to compare it with that of the beta-binomial distribution model. The parameter estimates are maximum a posteriori (MAP) estimates. A detailed explanation of the estimation procedure and the rmd file are provided on GitHub [30]. We notice that the power index $\gamma$ is smaller than the phase transition point, $\gamma = 1$, in all cases.

TABLE I. MAP estimation of the parameters for the exponential and power decay models by the Merton model. Columns: No., Model, $p$, $\rho_D$, $\theta$ (exponential decay) and $p$, $\rho_D$, $\gamma$ (power decay).

We compare these results to the MAP estimation using the beta-binomial distribution with the same data [17]. The results are shown in Table II for the exponential and power decay models. We confirmed small $\theta$ and large $\gamma$ values, which represent small temporal correlation. The parameter $\gamma$ for the power decay is larger than the phase transition point, $\gamma = 1$. The PD and default correlation are almost the same for the exponential and power decay models. The reason is that the power exponent $\gamma$ is adequately large, so there is only a small difference between the exponential and power decay models.

TABLE II. Maximum likelihood estimates of the parameters for the exponential and power decay models by the beta-binomial distribution. Columns: No., Model, $p$, $\rho_D$, $\theta$ (exponential decay) and $p$, $\rho_D$, $\gamma$ (power decay).

We can confirm that $\theta$ and $\gamma$ both differ greatly between the values estimated by the beta-binomial distribution and those estimated by the Merton model. The reason for this is shown in Fig. 1 (a). Let $d_A$ and $d_D$ denote the temporal decay factors of the asset correlation $\rho_A$ and the default correlation $\rho_D$, respectively. From the convexity of $f$, we obtain the inequality

$$d_A = \frac{d_A \rho_A}{\rho_A} \gg \frac{f(d_A \rho_A)}{f(\rho_A)} = \frac{f(d_A \rho_A)}{\rho_D} = d_D.$$

Hence, the temporal correlation of the defaults is much weaker than that of the assets: $\theta$ is smaller and $\gamma$ is larger for the default correlation than for the asset correlation.

Next, we discuss whether the correlation has a long memory. In Table III, we show the WAIC and WBIC for each model based on the Merton model. Using Moody's data from 1920, the power decay model is found to be superior to the exponential decay model. Therefore, it seems that the default rate has a long memory. As $\gamma$ is less than 1 for long history data, the system is in the slow convergence phase. In other words, parameter estimation becomes difficult because the convergence is slow when the temporal correlation obeys a power decay.

In Table IV, we show the AIC and BIC for each model using the beta-binomial distribution and compare them to the estimation using the Merton model. We obtain the same conclusion using Moody's data from 1920: the power decay model is superior to the exponential decay model. The parameter $\gamma$ is not less than 1 for the power decay case when we use the beta-binomial distribution.
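The inequality between the decay factors $d_A$ and $d_D$ discussed above can be illustrated with a small calculation. The following Python sketch is an assumption-laden illustration, not part of the paper's estimation code: it recomputes the mapping function $f$ by a trapezoidal rule over the common factor, and the names `default_decay` and `implied_power_index`, together with the parameter values in the usage lines, are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def default_correlation(p, r, grid=2001, s_max=8.0):
    """f(r): default correlation for PD p and asset correlation 0 < r < 1."""
    y = STD.inv_cdf(p)
    h = 2.0 * s_max / (grid - 1)
    joint = 0.0
    for k in range(grid):
        s = -s_max + k * h
        weight = 0.5 if k in (0, grid - 1) else 1.0  # trapezoidal end points
        joint += weight * STD.pdf(s) * STD.cdf((y - sqrt(r) * s) / sqrt(1.0 - r)) ** 2
    return (joint * h - p * p) / (p * (1.0 - p))


def default_decay(d_a, p, rho_a):
    """d_D = f(d_A * rho_A) / f(rho_A), the decay factor seen in defaults."""
    return default_correlation(p, d_a * rho_a) / default_correlation(p, rho_a)


def implied_power_index(d, lag):
    """gamma such that d = (lag + 1)**(-gamma)."""
    return -log(d) / log(lag + 1.0)


if __name__ == "__main__":
    lag, gamma_a = 4, 0.5                  # illustrative lag and asset power index
    d_a = (lag + 1.0) ** (-gamma_a)
    d_d = default_decay(d_a, p=0.01, rho_a=0.2)
    print(d_a, d_d, implied_power_index(d_d, lag))
```

With a small PD the default-level decay factor comes out well below the asset-level one, so the power index read off from default correlations is larger than the asset-level index, in line with the comparison between the two models in the text.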

4. CONCLUDING REMARKS

In this study, we introduced the Merton model with temporal asset correlation and discussed the convergence of the estimator of the probability of default. We adopted a Bayesian estimation method to estimate the model's parameters and discussed its implications for the estimation of PD. We found a phase transition when the temporal correlation decayed by a power law, which meant that the correlation had a long memory. When the power index $\gamma$ was larger than one, the estimator distribution of the PD converged normally. When the power index was less than or equal to one, the distribution converged slowly. This phase transition is called the "super-normal transition". For the case of an exponential decay, there was no phase transition.

In our previous work, we studied a beta-binomial distribution model with temporal default correlation. The estimator of PD also showed a phase transition in the power decay case. The transition depended on whether the distribution converged or not. It was different from the phase transition found in the present study.

The main difference between the Merton model and the beta-binomial distribution model is the incorporation of default correlation. In the latter model, the default correlation is defined by binary variables. In the former, the default correlation is incorporated through the asset correlation, which is defined by continuous variables. The implication of the present study concerns the difficulty of estimating PD when we adopt the former model. The estimated power index is in the slow convergence region of PD. Even with empirical data over a long period of time, PD is difficult to estimate when we adopt the proposed model.

TABLE III. WAIC and WBIC for the exponential and power decay models using the Merton model.

                                 Exponential decay     Power decay
No.  Model                       WAIC      WBIC        WAIC      WBIC
1    Moody's 1920-2018           572.9     746.7       568.6     745.9
2    S&P 1981-2018               271.5     332.9       272.1     334.8
3    Moody's 1981-2018           277.6     339.0       277.5     341.4
4    S&P 1990-2018               214.3     256.4       214.6     258.3
5    Moody's 1990-2018           219.5     262.1       219.6     264.7
6    Moody's 1920-2018 SG        564.3     731.9       560.2     733.8
7    S&P 1981-2018 SG            268.7     328.1       268.9     330.5
8    Moody's 1981-2018 SG        274.1     333.6       274.4     337.1
9    S&P 1990-2018 SG            212.0     253.9       212.8     255.2
10   Moody's 1990-2018 SG        217.3     260.4       218.3     262.8
11   Moody's 1920-2018 IG        247.6     351.2       244.9     351.2
12   S&P 1981-2018 IG            116.0     153.3       115.0     160.3
13   Moody's 1981-2018 IG        110.3     156.3       108.8     156.7
14   S&P 1990-2018 IG            87.6      114.3       86.8      117.7
15   Moody's 1990-2018 IG        81.7      111.4       82.5      114.5

TABLE IV. AIC and BIC for the exponential and power decay models using the beta-binomial distribution.

                                 Exponential decay     Power decay
No.  Model                       AIC       BIC         AIC       BIC
1    Moody's 1920-2018           746.8     780.8       746.8     780.9
2    S&P 1981-2018               352.0     382.9       353.6     384.5
3    Moody's 1981-2018           362.2     395.0       363.4     396.2
4    S&P 1990-2018               285.0     315.5       285.8     316.3
5    Moody's 1990-2018           293.8     326.3       298.6     331.1
6    Moody's 1920-2018 SG        730.8     762.0       730.6     761.8
7    S&P 1981-2018 SG            346.8     366.7       348.8     368.7
8    Moody's 1981-2018 SG        356.0     385.9       360.4     390.3
9    S&P 1990-2018 SG            283.0     302.6       283.8     303.4
10   Moody's 1990-2018 SG        291.0     320.7       291.6     321.3
11   Moody's 1920-2018 IG        302.2     334.9       300.4     333.1
12   S&P 1981-2018 IG            134.9     164.5       136.1     165.7
13   Moody's 1981-2018 IG        142.3     173.7       140.2     171.7
14   S&P 1990-2018 IG            106.0     135.3       107.2     136.5
15   Moody's 1990-2018 IG        111.8     142.8       112.4     143.4

Appendix A. Temporal correlation for the Merton Model

We consider the assets of two obligors, $\hat{U}_{1t}$ and $\hat{U}_{2t}$, in year $t$ to confirm the temporal correlation. The assets of the two obligors have the correlation $\rho_A$. $S_t$ is the global economic factor that affects the two obligors at $t$:

$$\hat{U}_{1t} = \sqrt{\rho_A}\, S_t + \sqrt{1-\rho_A}\, \epsilon_{1t}, \qquad \hat{U}_{2t} = \sqrt{\rho_A}\, S_t + \sqrt{1-\rho_A}\, \epsilon_{2t}, \tag{15}$$

where $\epsilon_{1t}$ and $\epsilon_{2t}$ are the individual factors for the obligors. Here, there is no correlation among $\epsilon_{1t}$, $\epsilon_{2t}$, and $S_t$ because they are independent of each other. In the following year, $t+1$, the assets of the two obligors are $\hat{U}_{1,t+1}$ and $\hat{U}_{2,t+1}$. The assets have the same correlation, $\rho_A$, through the global factor $S_{t+1}$. We can write this as:

$$\hat{U}_{1,t+1} = \sqrt{\rho_A}\, S_{t+1} + \sqrt{1-\rho_A}\, \epsilon_{1,t+1}, \qquad \hat{U}_{2,t+1} = \sqrt{\rho_A}\, S_{t+1} + \sqrt{1-\rho_A}\, \epsilon_{2,t+1}. \tag{16}$$

The temporal correlation between $S_t$ and $S_{t+1}$ is $d_1$. Because the individual factors are independent, only the global factors contribute, and the correlation between $\hat{U}_{1t}$ and $\hat{U}_{2,t+1}$ is $d_1 \rho_A$. In the same way, we obtain the temporal correlation matrix, Eq. (5). It is the same as that used in the Bayesian estimation introduced in [17], which did not differentiate between the asset and default correlations.

Appendix B. Bayesian estimation using the Merton model

In this Appendix we explain the estimation of the parameters using the Merton model [29]. There is a prior belief about the possible value of the PD. The prior belief is updated by observations, with the prior distribution acting as a weighting function. Here, we use a uniform prior distribution.

To calculate the unconditional probability $P(X_1 = k_1, \cdots, X_T = k_T)$, we approximate the solution by Monte Carlo simulations and numerical integration. Here, the number of obligors and the number of defaults in the $t$-th year are $n_t$ and $k_t$, respectively, and they are observable variables. The likelihood is

$$P(X_1 = k_1, \cdots, X_T = k_T) \sim \sum_{i=1}^{n} \prod_{t=1}^{T} \frac{n_t!}{k_t!\,(n_t - k_t)!}\, G(S_{it})^{k_t} (1 - G(S_{it}))^{n_t - k_t}, \tag{17}$$

where the sum runs over the simulated paths of the global factors and $G(S_{it})$ is the probability that an obligor will default in year $t$, conditional on the $i$-th path realization of all global factors:

$$G(S_{it}) = \Phi\left( \frac{\Phi^{-1}(p) - S_{it}\sqrt{\rho_A}}{\sqrt{1-\rho_A}} \right), \tag{18}$$

where $\rho_A$ is the asset correlation among obligors within a one-year window and $\Phi$ is the cumulative normal distribution. $S_{it}$ is drawn from the correlated multi-dimensional normal distribution, and we use the MAP estimation to estimate the parameters.

We have also estimated the parameters using a beta-binomial distribution, as presented in Section 3 and [17]. One of the differences between the Merton model and the beta-binomial distribution is the distinction between the default correlation and the asset correlation. The default correlation is defined by binary variables. On the other hand, the asset correlation is defined by continuous variables. The other difference is that one can calculate the parameters analytically when using the beta-binomial distribution. Hence, it is easier to estimate parameters when using the beta-binomial distribution than when using the Merton model. In fact, the estimates obtained with the beta-binomial distribution in Section 3 are stable, especially for IG samples, which have a small PD, whereas the estimation for IG samples using the Merton model is difficult.
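The Monte Carlo approximation of the likelihood in Eq. (17) can be sketched in a few lines. The code below is a minimal illustration under stated assumptions and is not the Stan implementation referenced in the paper: it uses only the Python standard library, averages over `n_paths` simulated paths (the average differs from the sum in Eq. (17) only by a constant factor), draws the correlated global factors with a hand-written Cholesky factorization, and combines the path likelihoods on the log scale for numerical stability. All function names, the clamping constants, and the example data are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def toeplitz_correlation(d, T):
    """Correlation matrix of Eq. (1): entry (s, t) is d(|s - t|), diagonal 1."""
    return [[1.0 if s == t else d(abs(s - t)) for t in range(T)] for s in range(T)]


def cholesky(a):
    """Lower-triangular factor L with L L^T = a (a must be positive definite)."""
    size = len(a)
    low = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            rest = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(rest) if i == j else rest / low[j][j]
    return low


def log_binomial_pmf(k, n, g):
    """Log of the binomial probability of k defaults among n obligors."""
    g = min(max(g, 1e-300), 1.0 - 1e-16)  # keep the logarithms finite
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(g) + (n - k) * math.log1p(-g))


def mc_log_likelihood(defaults, obligors, p, rho_a, d, n_paths=2000, seed=0):
    """Log of the path-averaged likelihood of Eq. (17), with G from Eq. (18)."""
    rng = random.Random(seed)
    T = len(defaults)
    low = cholesky(toeplitz_correlation(d, T))
    y = STD.inv_cdf(p)
    path_logs = []
    for _ in range(n_paths):
        z = [rng.gauss(0.0, 1.0) for _ in range(T)]
        s = [sum(low[t][k] * z[k] for k in range(t + 1)) for t in range(T)]
        path_logs.append(sum(
            log_binomial_pmf(k_t, n_t,
                             STD.cdf((y - math.sqrt(rho_a) * s_t) / math.sqrt(1.0 - rho_a)))
            for k_t, n_t, s_t in zip(defaults, obligors, s)))
    top = max(path_logs)  # log-sum-exp trick
    return top + math.log(sum(math.exp(v - top) for v in path_logs) / n_paths)


if __name__ == "__main__":
    defaults, obligors = [1, 3, 2, 0, 4], [100] * 5   # made-up example data
    print(mc_log_likelihood(defaults, obligors, 0.02, 0.1, lambda i: (i + 1.0) ** -0.5))
```

A MAP or maximum likelihood search would then maximize this quantity over $p$, $\rho_A$, and the decay parameter; that optimization step is deliberately left out of the sketch.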
[1] S. Galam, Stat. Phys., 943 (1990).
[2] S. Galam, Inter. J. Mod. Phys. C, 409 (2008).
[3] R. N. Mantegna and H. E. Stanley, Introduction to Econophysics: Correlations and Complexity in Finance (Cambridge University Press, 2000).
[4] D. Brockmann, L. Hufnagel, and T. Geisel, Nature, 462 (2006).
[5] I. T. Wong, M. L. Gardel, D. R. Reichman, E. R. Weeks, M. T. Valentine, A. R. Bausch, and D. A. Weitz, Phys. Rev. Lett., 178101 (2004).
[6] Y. Gefen, A. Aharony, and S. Alexander, Phys. Rev. Lett., 77 (1983).
[7] R. Metzler and J. Klafter, Phys. Rep., 1 (2000).
[8] S. Hod and U. Keshet, Phys. Rev. E, 034001 (2010).
[12] M. Hisakado and S. Mori, J. Phys. A, 31527 (2010).
[13] M. Hisakado and S. Mori, J. Phys. A, 275204 (2011).
[14] M. Hisakado and S. Mori, J. Phys. A, 345002 (2012).
[15] M. Hisakado and S. Mori, Physica A, 63 (2015).
[16] M. Hisakado and S. Mori, Physica A, 570 (2016).
[17] M. Hisakado and S. Mori, Physica A, 123480 (2019).
[18] S. Hod and U. Keshet, Phys. Rev. E, 11006 (2004).
[19] S. Mori, K. Kitsukawa, and M. Hisakado, Quant. Fin., 1469 (2010).
[20] S. Mori, K. Kitsukawa, and M. Hisakado, J. Phys. Soc. Jpn., 114802 (2008).
[21] P. J. Schönbucher, Credit Derivatives Pricing Models: Models, Pricing, and Implementation (John Wiley & Sons, Ltd., 2003).
[22] K. Pluto and D. Tasche, Estimating Probabilities of Default for Low Default Portfolios, in: B. Engelmann and R. Rauhmeier (eds.), The Basel II Risk Parameters (Springer, Berlin, Heidelberg, 2011).
[23] N. Benjamin, A. Cathcart, and K. Ryan, Low Default Portfolios: A Proposal for Conservative Estimation of Default Probabilities (Financial Services Authority, 2006).
[24] O. Vasicek, Risk, 160 (2002).
[25] R. C. Merton, J. Fin., 449 (1974).
[26] I. Florescu, M. C. Mariani, H. E. Stanley, and F. G. Viens (Eds.), Handbook of High-Frequency Trading and Modeling in Finance (John Wiley & Sons, 2016).
[27] (S&P Global Ratings, 2019).
[28] Moody's Annual Default Study: Corporate default and recovery rates, 1920-2018 (Moody's, 2019).
[29] D. Tasche, J. Risk Management in Financial Institutions, 302-326 (2013).
[30] https://github.com/shintaromori/DefaultCorrelation.