Using Network Interbank Contagion in Bank Default Prediction
Riccardo Doyle
Royal Holloway, University of London
23 May, 2020
Abstract
Interbank contagion can theoretically exacerbate losses in a financial system and lead to additional cascade defaults during a downturn. In this paper we produce default analysis using both regression and neural network models to verify whether interbank contagion offers any predictive explanatory power on default events. We predict defaults of U.S. domiciled commercial banks in the first quarter of 2010 using data from the preceding four quarters. A number of established predictors (such as Tier 1 Capital Ratio and Return on Equity) are included alongside contagion to gauge whether the latter adds significance. Based on this methodology, we conclude that interbank contagion is extremely explanatory in default prediction, often outperforming more established metrics, in both regression and neural network models. These findings have sizeable implications for the future use of interbank contagion as a variable of interest for stress testing, bank issued bond valuation and wider bank default prediction.
Index Terms
Computational Finance, Machine Learning, Quantitative Finance, Empirical Finance, Financial Contagion, Interbank Contagion, Default Analysis, Bank Default, Systemic Risk, Financial Stability, Finance.
I. INTRODUCTION
In the aftermath of the Great Recession, the stability of established economic models came under scrutiny, leading to greater exploration of dynamic and macro-prudential methods that prioritize relationships between financial agents over their standalone characteristics.
Among these is the study of interbank contagion, a subset of financial contagion seeking to quantify how the structure of linkages between banks’ liabilities exacerbates loss propagation across the wider financial system.
Over the past decade, interbank contagion has been extensively covered in academic literature and widely adopted by regulators, with a large number of official bodies, such as the IMF (Ozkan & Unsal, 2012), the Bank of England (Bardoscia et al., 2017), the Federal Reserve (Morrison et al., 2017) and the European Central Bank (Hałaj & Kok, 2013), including interbank contagion models in working papers and stress testing frameworks.
Despite widespread implementation and a rich history of academic discourse on the topic, there is little to no literature examining whether interbank contagion models are accurate predictors of systemic risk and whether they are worth deriving conclusions from. Instead, studies thus far can be sub-grouped into either theoretical work concerning the network structure of the interbank market, or empirical work that only draws applied results from theoretical models, without ever verifying model accuracy in the first instance.
We believe this paper addresses the aforementioned literature gap and makes important contributions in determining whether interbank contagion models are explanatory and whether their implementation in regulatory contexts or scope for further research is justified.
To support our earlier claims, let us offer an overview of existing literature with reference to the methods and frameworks incorporated in this paper.
On a theoretical level, much effort has been placed in understanding how the structure of interbank networks influences the extent of contagion. Initial studies focused on foundational concepts, such as work from Allen and Gale (2000) showing that a complete network of interregional banks (where each is linked to every other) is more resilient to liquidity shocks than an incomplete one. Over time more complex analysis began to emerge. Nier et al. (2007) used non-empirical networks with ranges of key structure, probability and node size parameters to test contagion effects, finding among other things that the number of connections in a network can have dual effects on contagion. Initially, as connections increase, their ability to channel contagion increases alongside them; however, as connections increase past a certain large threshold, the newfound size of the network allows for greater risk sharing among nodes and decreases contagion considerably. Gai and Kapadia (2010) followed the same probabilistic and endogenous approach as Nier et al. (2007), but also introduced contagious effects originating from the sale of illiquid assets during crises. More importantly, unlike previous literature, they differentiated between the probability and the extent of contagion. In doing so, they found that while the probability of contagion increases and subsequently decreases hyperbolically with network size, the extent of contagion spread increases until it reaches an asymptote, implying that while the probability of contagion is low for large networks, contagion would produce extremely large repercussions on the system if it did occur.
In our paper, among other questions, we will seek to understand, in the context of the pre-financial crisis U.S.
banking system, whether contagion is positively or inversely related to wider financial damage once a shock occurs.
Beyond theoretical studies on interbank market structure, several seminal studies have formalized the broad concepts of network structure in model form. The most widely adopted interbank contagion model is that of Eisenberg and Noe (2001), which proposes an iterative approach to network contagion, relating the default in payments of a borrowing bank to the devaluation in the assets of a lending bank, and producing a clearing vector of payments that brings the banking system to equilibrium following successive periods of shock transmission. Another foundational model is that of Furfine (2003), which largely employs a similar contagion framework to Eisenberg and Noe (2001), but requires an exogenous trainable parameter for the recovery rate of payment defaults.
Model based papers on interbank contagion are much less abundant than theoretical studies of market structure and empirical studies. Since Eisenberg and Noe (2001) and Furfine (2003), the most meaningful new interbank contagion model advanced has been that of Battiston et al. (2012), which proposes an innovative model (named DebtRank) based on the proportional propagation of contagion, in which banks transmit shocks not through defaulted payments, as in previous models, but through gradual devaluation. They propose that a given interbank loan will lose value proportionally to the borrowing bank’s loss in equity, transmitting contagion even before defaults occur.
In our paper, we have elected to adopt the DebtRank model framework, believing it to be more encompassing and flexible than alternative methods.
On an empirical level, many papers from the last decade have sought to apply the aforementioned theoretical studies.
Upper and Worms (2004) produced an empirical network of interbank linkages simulating the German banking system, finding that in the worst-case scenario, and omitting safety mechanisms, the failure of a single bank could cause the banking system to lose up to 15% of its total assets. Degryse and Nguyen (2007) took a more aggregate approach and, rather than focusing on a single event or time period, used time series data for the Belgian banking system between 1993 and 2002. They found that contagion risk in the system had changed considerably over time, increasing until 1997 and decreasing to a plateau thereafter. They also drew conclusions on network structure, finding that Belgium’s shift from a complete network akin to that of Allen and Gale (2000) to a more imperfectly connected system of select large players and numerous small players reduced contagion risk, arguing that the small banks could no longer originate contagion. This assumes the large banks are fairly leveraged among each other, but as seen in the 2008 financial crisis, this is not always the case. Following the same national focus, Mistrulli (2011) tested for the extent of contagion in the Italian banking sector using unique data for the individual bilateral interbank assets and liabilities between each pair of banks, which in most literature, including this study, are reconstructed using a maximum entropy framework. The paper adopted its own methodology for contagion transmission and applied a range of simulated shocks to draw conclusions on how capital ratios correlate with contagion, finding among other things that an increase in capital requirements does not stymie contagion.
A large number of successive empirical papers, such as (Upper, 2011), (Georgescu, 2015), (Liu, 2020), (Gabrieli & Salakhova, 2019) and (Leventides et
al., 2018), all apply real-world financial inputs from sample countries or banking systems to draw conclusions either on the extent of contagion potential in those systems or on the relationship between financial indicators, network structure and contagion extent.
No empirical study has tested the validity of interbank contagion as a field of study; instead, the models are accepted as accurate ex ante and policy conclusions are drawn from empirical inputs.
As a consequence, no study has attempted to compare the explanatory power of interbank contagion with that of other systemic risk tools and predictors either.
This paper aims primarily to address these fundamental unanswered questions about interbank contagion and to assess, firstly, whether it is at all effective as a systemic risk indicator and, secondly, whether it is more or less effective than other leading systemic risk indicators.
II. METHODOLOGY
A. DebtRank Contagion Transmission
Many algorithms that model interbank contagion transmission are present in the literature. In this study we have elected to use DebtRank (Battiston et al., 2012). Below, we briefly explain the structure of the DebtRank algorithm. Let us define the assets and liabilities of every bank i in the system as the sum of its external and interbank components:
$$A_i = A_i^{ext} + \sum_{j=0}^{n} W_{ij} \qquad (1)$$
$$L_i = L_i^{ext} + \sum_{j=0}^{n} W_{ji} \qquad (2)$$
where $W_{ij}$ is a matrix describing the loans (assets) extended by every bank i to every other bank j. Thus the sum of its rows, $\sum_{j=0}^{n} W_{ij}$, represents the total loans extended by bank i to all other banks in the system and constitutes the bank’s interbank assets, while the sum of its columns, $\sum_{j=0}^{n} W_{ji}$, represents the loans bank i has received from all other banks and constitutes its interbank liabilities. $A_i^{ext}$ is a vector describing the total amount of external assets, defined as the total amount of assets excluding interbank assets ($A_i^{ext} = A_i - \sum_{j=0}^{n} W_{ij}$).
Consequently, let us define a bank’s equity as the surplus of assets it holds over its liabilities:
$$E_i = A_i - L_i \qquad (3)$$
We now assume a mechanism for shock propagation that relates every bank to every other. We define bank i’s interbank assets in a given period as the interbank assets from the previous period devalued by the relative loss in equity of every owing bank j:
$$W_{ij}^{t+1} = W_{ij}^{t} \frac{E_j^{t}}{E_j^{t-1}} \qquad (4)$$
The logic behind the assumption is that as the equity of an owing bank j decreases, it will directly devalue the loan it received from bank i by that same proportional amount, as its ability to repay has been scaled by the factor $E_j^{t} / E_j^{t-1}$.
We can re-write equation (4) in summative terms as:
$$\sum_{j=0}^{n} W_{ij}^{t+1} = \sum_{j=0}^{n} W_{ij}^{t} \frac{E_j^{t}}{E_j^{t-1}} \qquad (5)$$
Using this new form, we can then define the change in interbank asset value between consecutive periods as:
$$\sum_{j=0}^{n} W_{ij}^{t+1} - \sum_{j=0}^{n} W_{ij}^{t} = \sum_{j=0}^{n} W_{ij}^{t} \frac{E_j^{t}}{E_j^{t-1}} - \sum_{j=0}^{n} W_{ij}^{t} \qquad (6)$$
Or:
$$\sum_{j=0}^{n} W_{ij}^{t+1} - \sum_{j=0}^{n} W_{ij}^{t} = \sum_{j=0}^{n} \left( W_{ij}^{t} \frac{E_j^{t}}{E_j^{t-1}} - W_{ij}^{t} \right) \qquad (7)$$
Which can be rewritten as:
$$\sum_{j=0}^{n} W_{ij}^{t+1} - \sum_{j=0}^{n} W_{ij}^{t} = \sum_{j=0}^{n} W_{ij}^{t} \left( \frac{E_j^{t}}{E_j^{t-1}} - 1 \right) \qquad (8)$$
And further:
$$\sum_{j=0}^{n} W_{ij}^{t+1} - \sum_{j=0}^{n} W_{ij}^{t} = \sum_{j=0}^{n} W_{ij}^{t} \frac{E_j^{t} - E_j^{t-1}}{E_j^{t-1}} \qquad (9)$$
Let us now assume that external assets and liabilities remain unchanged, which according to the original equity identity (3) only allows interbank assets to change. We can hence draw an equivalency between interbank asset changes and equity changes, as all else remains constant:
$$\sum_{j=0}^{n} W_{ij}^{t+1} - \sum_{j=0}^{n} W_{ij}^{t} = E_i^{t+1} - E_i^{t} \qquad (10)$$
This allows us to re-write equation (9) as:
$$E_i^{t+1} - E_i^{t} = \sum_{j=0}^{n} W_{ij}^{t} \frac{E_j^{t} - E_j^{t-1}}{E_j^{t-1}} \qquad (11)$$
Or:
$$E_i^{t+1} - E_i^{t} = \sum_{j=0}^{n} \frac{W_{ij}^{t}}{E_j^{t-1}} \left( E_j^{t} - E_j^{t-1} \right) \qquad (12)$$
The time index of the term $W_{ij}^{t} / E_j^{t-1}$ is set to period zero, such that both the interbank assets of bank i and the change in bank j’s equity between consecutive periods are expressed relative to their original values. At every period t the entire term is set to zero if a bank is insolvent, or to itself otherwise:
$$\varphi_{ij}^{t} = \begin{cases} \dfrac{W_{ij}^{0}}{E_j^{0}} & \text{if } E_i^{t} \geq 0 \\ 0 & \text{if } E_i^{t} < 0 \end{cases}$$
We can then conclusively write the DebtRank model as:
$$E_i^{t+1} = \max\left( 0,\; E_i^{t} + \sum_{j=0}^{n} \varphi_{ij}^{t} \left( E_j^{t} - E_j^{t-1} \right) \right) \qquad (13)$$
At every iteration period, the model defines the equity of a bank, $E_i^{t+1}$, by removing the devaluation in its interbank assets, $\sum_{j=0}^{n} \varphi_{ij}^{t} ( E_j^{t} - E_j^{t-1} )$, from the previous period’s equity, $E_i^{t}$.
If the devaluation is such that the bank’s equity turns negative and the bank defaults, the model stores a value of zero instead.
The model iterates through consecutive periods until the relative difference in equity between consecutive periods is smaller than a certain threshold α:
$$\frac{E_i^{t+1} - E_i^{t}}{E_i^{t}} < \alpha \qquad (14)$$
B. DebtRank Linearity Coefficient
In the previous section, DebtRank’s shocks are propagated linearly according to the assumed propagation mechanism in equation (4). In this mechanism a percentage change in the equity of an owing bank j results in the same percentage change in the interbank assets of a receiving bank i, implying the relationship between the two is linear and one to one. The model can be modified by inserting a coefficient β into equation (13):
$$E_i^{t+1} = \max\left( 0,\; E_i^{t} + \sum_{j=0}^{n} \varphi_{ij}^{t} \, \beta \left( E_j^{t} - E_j^{t-1} \right) \right) \qquad (15)$$
such that a one percent change in the equity of an owing bank j now results in a β percent change in the interbank assets of a receiving bank i, since:
$$\varphi_{ij}^{t} \, \beta \left( E_j^{t} - E_j^{t-1} \right) = \frac{W_{ij}^{t}}{E_j^{t-1}} \, \beta \left( E_j^{t} - E_j^{t-1} \right) = W_{ij}^{t} \, \beta \, \frac{E_j^{t} - E_j^{t-1}}{E_j^{t-1}} \qquad (16)$$
This maintains the linearity of the relationship between one bank’s equity and another’s interbank assets but varies its gradient.
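As an illustrative sketch only (not the implementation used in this study), the iteration in equations (13) to (15) can be written in a few lines of Python. The function name `debtrank`, its arguments and the three-bank example are hypothetical. We also assume that the pre-shock equity plays the role of period t-1 in the first iteration, that a bank whose equity has reached zero is treated as insolvent and stays at zero, and that the stopping rule in equation (14) is applied to the magnitude of the relative change.

```python
def debtrank(W, E0, E_shocked, beta=1.0, alpha=1e-6, max_iter=1000):
    """Iterate equation (15) until relative equity changes fall below alpha.

    W[i][j]   : loan extended by bank i to bank j (an interbank asset of i)
    E0        : equity of each bank before the initial shock
    E_shocked : equity of each bank right after the initial shock
    beta      : linearity coefficient; beta = 1 recovers equation (13)
    """
    n = len(E0)
    # phi_ij = W_ij / E_j, held at its original (period zero) values.
    phi = [[W[i][j] / E0[j] if E0[j] > 0 else 0.0 for j in range(n)]
           for i in range(n)]
    prev, curr = list(E0), list(E_shocked)
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            if curr[i] <= 0:
                # Assumption: insolvent banks keep zero equity.
                nxt.append(0.0)
                continue
            change = sum(phi[i][j] * beta * (curr[j] - prev[j]) for j in range(n))
            nxt.append(max(0.0, curr[i] + change))
        # Assumption: equation (14) is checked on the absolute relative change.
        converged = all(abs(nxt[i] - curr[i]) <= alpha * curr[i]
                        for i in range(n) if curr[i] > 0)
        prev, curr = curr, nxt
        if converged:
            break
    return curr


# Hypothetical chain: bank 0 lends 10 to bank 1, bank 1 lends 10 to bank 2.
W_example = [[0, 10, 0],
             [0, 0, 10],
             [0, 0, 0]]
E0_example = [20, 20, 20]
# Bank 2 loses half of its equity in the initial shock.
final_equity = debtrank(W_example, E0_example, [20, 20, 10])
```

In this toy chain the shock to bank 2 devalues bank 1’s loan and, one period later, bank 0’s loan, so losses reach a bank with no direct exposure to the shocked one. Raising `beta` above 1 steepens each step of that transmission.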
C. Contagion Proxy
The conventional output of the DebtRank algorithm is a solvency vector $\sigma_i$ describing whether each bank i is solvent or not. This constitutes a binary output, which limits contagion to a categorical variable. To produce a continuous variable with a richer information set, we define a proxy value for contagion as the percentage difference between the final equity state $E_i^{final}$ and the initial post-shock equity state $E_i^{post\text{-}shock}$:
$$\text{Contagion Proxy} = \frac{E_i^{final} - E_i^{post\text{-}shock}}{E_i^{post\text{-}shock}} \qquad (17)$$
This metric produces an isolated measure of the equity lost by each bank i that is solely attributable to contagion, excluding the initial shock.
D. Variable Setup
The general testing framework consists of a set of established explanatory X attributes in bank default prediction, namely:
• Return on Assets (ROA).
• Return on Equity (ROE).
• Past Due Short-Term Loan Book Value on Total Assets (Short Term Bad Loans as a percentage of Assets).
• Tier 1 Capital Ratio.
• Tier 1 Leverage Capital Ratio.
To these, we append an additional explanatory X attribute representing the output of the previously outlined DebtRank algorithm, hereafter referred to as the:
• Contagion Proxy.
All X attributes are tied to a binary dependent variable Y (taking 0 and 1 values) signifying whether each analysed bank has failed (1) or not (0). The high-level aim of the study is to test for additional significance and explanatory power contributed by the Contagion Proxy relative to established variables. To increase dimensionality, each attribute has 4 quarterly readings, resulting in 24 total X attributes.
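To make the variable setup concrete, the following minimal sketch computes the Contagion Proxy of equation (17) and enumerates the 24 attribute columns. All names (`contagion_proxy`, `ATTRIBUTES`, `QUARTERS`, the column labels) are hypothetical and chosen for illustration. Returning 0.0 for a bank whose post-shock equity is already zero is our assumption, since equation (17) is undefined in that case.

```python
def contagion_proxy(e_final, e_post_shock):
    """Equation (17): relative equity change attributable to contagion alone."""
    proxies = []
    for final, post in zip(e_final, e_post_shock):
        if post <= 0:
            # Assumption: nothing is left to lose through contagion.
            proxies.append(0.0)
        else:
            proxies.append((final - post) / post)
    return proxies


# Hypothetical labels for the six attributes and the four 2009 quarters.
ATTRIBUTES = ["ROA", "ROE", "ShortTermPastDue",
              "Tier1CapitalRatio", "Tier1LeverageRatio", "ContagionProxy"]
QUARTERS = ["2009Q1", "2009Q2", "2009Q3", "2009Q4"]
feature_names = [f"{a}_{q}" for a in ATTRIBUTES for q in QUARTERS]

# Made-up equities: post-shock values and final values after contagion.
example_proxy = contagion_proxy([17.5, 15.0, 10.0], [20.0, 20.0, 10.0])
```

The proxy is zero or negative by construction, so a more negative reading indicates a larger share of equity lost to contagion.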
E. Method of Default Classification and Prediction
Two machine learning classifiers will be used to predict the default Y outcome, given the previously outlined X attributes. An overview of how each is applied is provided below.
1. Neural Networks
Neural networks have been widely applied in economics and finance with considerable success in a variety of use cases. Their advantage lies in their ability to map more complex, non-linear functions of X attributes, so in this study they are employed as a measure of hypothesized best classification performance. Their shortcoming, however, is a lack of transparency in variable significance. In this study we have elected to apply a simple sensitivity analysis of the inputs, based on the gradients of the network’s output with respect to each input, to gauge explanatory power.
Several elements of the network have been kept constant. The network consists of three hidden layers, all of which have ReLU activations, and a single sigmoid activation node (for binary classification) as its output layer. Two dropout layers are present, after the first and second hidden layers, each with a 10 percent dropout probability. Weights are initialized according to a random truncated normal distribution with mean 0 and standard deviation 0.2, while biases are initialized to 0.
Data was split into training, validation and testing sets in equal thirds. All data was normalized using a Robust Scaler. The model trains on the training data, then selects the best combination of hyperparameters based on accuracy in the validation set. The hyperparameters were combinations of:
Hidden Layer Structure ∈ ([8, 16, 8], [4, 8, 16], [16, 8, 4]).
Solver ∈ (Stochastic Gradient Descent, Adam, RMSProp).
Learning Rate ∈ (0.001, 0.05, 0.1).
Once an ideal combination is identified, the model is re-trained using the tuned hyperparameters and evaluated Out-Of-Sample (OOS) on the testing dataset to obtain final performance.
The model obtained in tuning represents our final, optimized neural network. It contains the fully backpropagated weights and biases that gave it the lowest binary cross entropy loss. We are now interested in the explanatory power of each input in the network. We can obtain an indication of this through the gradient of the output with respect to each input, computed with the chain rule as in backpropagation. Let us provide a brief summary of the approach and state key terms. A representation of our model, using a sample [16, 8, 4] node hidden layer architecture, is presented in Fig. 1.
Fig. 1: Sample neural network model.
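The tuning procedure described above amounts to an exhaustive search over 3 × 3 × 3 = 27 hyperparameter combinations, scored on the validation third of the data. The sketch below enumerates that grid and performs the equal-thirds split using only the Python standard library. The names `grid`, `split_thirds` and `select_best`, the solver strings and the `validation_accuracy` callable are hypothetical placeholders; training the network itself is left out.

```python
import random
from itertools import product

HIDDEN_LAYERS = ([8, 16, 8], [4, 8, 16], [16, 8, 4])
SOLVERS = ("sgd", "adam", "rmsprop")
LEARNING_RATES = (0.001, 0.05, 0.1)


def grid():
    """All combinations of hidden layer structure, solver and learning rate."""
    return [{"hidden_layers": h, "solver": s, "learning_rate": lr}
            for h, s, lr in product(HIDDEN_LAYERS, SOLVERS, LEARNING_RATES)]


def split_thirds(items, seed=0):
    """Shuffle once, then cut into training, validation and testing thirds."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    third = len(shuffled) // 3
    return shuffled[:third], shuffled[third:2 * third], shuffled[2 * third:]


def select_best(configs, validation_accuracy):
    """Pick the configuration with the highest validation accuracy.

    validation_accuracy is a hypothetical callable: config -> float.
    """
    return max(configs, key=validation_accuracy)


configs = grid()
train, validation, test = split_thirds(range(9))
```

The selected configuration would then be re-trained and scored once on the held-out testing third, which plays no part in the selection.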
The model has an input layer consisting of 24 nodes, one for each explanatory X attribute we included in our dataset. Let us denote the input layer, the subsequent three hidden layers and the output layer by the index j = 1:5. Let us further denote the nodes in each layer, starting from the top, by the index i = 1:n, where n is the last node. Let us also define some key network components:
• $W^{j}_{i,k}$: The weight in layer j, connecting the i-th node in layer j to the k-th node in layer j+1.
• $Z_{j,i}$: The input value at layer j that is fed into node i. This term is the sum of the products between each node of the preceding layer and the weight that joins it to i.
• $A_{j,i}$: The activation with $Z_{j,i}$ as input, such that $A_{j,i} = f(Z_{j,i})$, for some function f.
For a node k in layer j, we can now more fully define $Z_{j,k}$ as:
$$Z_{j,k} = \sum_{i=1}^{n} W^{j-1}_{i,k} \cdot A_{j-1,i} \qquad (18)$$
Fig. 2 re-creates the network in Fig. 1 for just one input node whose effect on the model we want to assess.
Fig. 2: Sample backpropagation path (highlighted in blue) for a given forward pass from a single neuron to the next layer’s single neuron through to output (Y).
The marginal effect of this input node on the final output Y is determined by the marginal effect on each subsequent weight and node it is connected to. To understand that general effect, let us begin with the local marginal effect on just one specific path in the network. We take the path highlighted in light blue in Fig. 2, going from the input node to the first node of each subsequent layer. Thus, we move from the input, $A_{1,1}$, to $Z_{2,1}$, to $A_{2,1}$, and onward through each hidden layer to $Z_{5,1}$, the final sigmoid activation $A_{5,1}$ and the output Y. The final output Y depends on the sigmoid activation $A_{5,1}$ with one-to-one transmission. Thus a unit change in $A_{5,1}$ will result in a unit change in Y, or in symbolic terms:
$$\frac{dY}{dA_{5,1}} = 1 \qquad (19)$$
Working backwards, the activation in the fifth layer depends on its input in the same layer, formalized as:
$$\frac{dA_{5,1}}{dZ_{5,1}} \qquad (20)$$
Then by the chain rule, we can say that the output depends on the aforementioned input according to:
$$\frac{dY}{dA_{5,1}} \cdot \frac{dA_{5,1}}{dZ_{5,1}} \qquad (21)$$
We can keep extending the chain rule backwards along the path until we arrive at the first input $A_{1,1}$:
$$\frac{dY}{dA_{5,1}} \cdot \frac{dA_{5,1}}{dZ_{5,1}} \cdot \frac{dZ_{5,1}}{dA_{4,1}} \cdot \frac{dA_{4,1}}{dZ_{4,1}} \cdot \frac{dZ_{4,1}}{dA_{3,1}} \cdot \frac{dA_{3,1}}{dZ_{3,1}} \cdot \frac{dZ_{3,1}}{dA_{2,1}} \cdot \frac{dA_{2,1}}{dZ_{2,1}} \cdot \frac{dZ_{2,1}}{dA_{1,1}} \qquad (22)$$
Or:
$$\frac{dY}{dA_{5,1}} \cdot \prod_{j=2}^{5} \frac{dA_{j,1}}{dZ_{j,1}} \cdot \frac{dZ_{j,1}}{dA_{j-1,1}} \qquad (23)$$
This tells us how a unit change in the input will affect the output Y along this particular path. To understand how the same unit change will affect the output Y across all possible paths generated by the input, we simply sum the gradients of each path:
for a given input node I:
    $\frac{dY}{dA_{1,I}} = 0$
    for a given second layer node p ∈ (1, 2, 3 ... N):
        for a given third layer node k ∈ (1, 2, 3 ... M):
            for a given fourth layer node m ∈ (1, 2, 3 ... O):
                $\frac{dY}{dA_{1,I}} = \frac{dY}{dA_{1,I}} + \frac{dY}{dA_{5,1}} \cdot \frac{dA_{5,1}}{dZ_{5,1}} \cdot \frac{dZ_{5,1}}{dA_{4,m}} \cdot \frac{dA_{4,m}}{dZ_{4,m}} \cdot \frac{dZ_{4,m}}{dA_{3,k}} \cdot \frac{dA_{3,k}}{dZ_{3,k}} \cdot \frac{dZ_{3,k}}{dA_{2,p}} \cdot \frac{dA_{2,p}}{dZ_{2,p}} \cdot \frac{dZ_{2,p}}{dA_{1,I}}$
where N, M and O represent the number of nodes in the second, third and fourth layers respectively.
We will later report our findings according to this methodology, resulting in a single gradient for each input variable/node, quantifying how a unit change in the input $A_{1,I}$ affects the output Y.
To elaborate on the components above, the derivative of the input to a given node k in a given layer j with respect to the activation of node i in the previous layer j-1, $\frac{dZ_{j,k}}{dA_{j-1,i}}$, is given by the previous layer’s weight from node i to node k, since:
$$\frac{d\left( \sum_{i=1}^{n} W^{j-1}_{i,k} \cdot A_{j-1,i} \right)}{dA_{j-1,i}} = W^{j-1}_{i,k} \qquad (24)$$
The activation function derivatives with respect to the input, $\frac{dA_{j,i}}{dZ_{j,i}}$, are simply the gradients of the functions themselves. In our case, all hidden layers contain ReLU activations, which are defined as:
$$ReLU(Z_{j,i}) = \begin{cases} 0 & \text{for } Z_{j,i} < 0 \\ Z_{j,i} & \text{for } Z_{j,i} \geq 0 \end{cases}$$
With a gradient of:
$$\frac{dA_{j,i}}{dZ_{j,i}} = \frac{d\left( ReLU(Z_{j,i}) \right)}{dZ_{j,i}} = \begin{cases} 1 & \text{for } Z_{j,i} > 0 \\ 0 & \text{for } Z_{j,i} \leq 0 \end{cases}$$
The final layer’s activation is a sigmoid function, which smooths outputs for binary classification. The sigmoid function is defined as:
$$Sigmoid(Z_{j,i}) = \frac{1}{1 + e^{-Z_{j,i}}} \qquad (25)$$
With a gradient of:
$$\frac{dA_{j,i}}{dZ_{j,i}} = \frac{d\left( Sigmoid(Z_{j,i}) \right)}{dZ_{j,i}} = \frac{e^{Z_{j,i}}}{\left( 1 + e^{Z_{j,i}} \right)^{2}} \qquad (26)$$
Results obtained through these gradients will reveal the sensitivity of the output to each of the inputs in the study.
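The sum over all paths described above need not be enumerated path by path: propagating the gradient backwards one layer at a time yields the same total. The sketch below illustrates this on a deliberately tiny network with two inputs, three hidden ReLU layers of two nodes each and one sigmoid output. The weights, the input point and all function names are made up for illustration, and biases are omitted, as in equation (18). A central finite difference is included as an independent check on the chain-rule result.

```python
import math


def relu(z):
    return z if z > 0 else 0.0


def relu_grad(z):
    return 1.0 if z > 0 else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, weights):
    """Return the pre-activations Z of every layer after the input, and Y."""
    a, zs = list(x), []
    for layer, W in enumerate(weights):
        # Equation (18): Z_k = sum_i W[i][k] * A_i
        z = [sum(W[i][k] * a[i] for i in range(len(a))) for k in range(len(W[0]))]
        zs.append(z)
        last = layer == len(weights) - 1
        a = [sigmoid(v) if last else relu(v) for v in z]
    return zs, a[0]


def input_gradient(x, weights):
    """dY/dA_{1,I} for every input node I, via the chain rule."""
    zs, y = forward(x, weights)
    g = [y * (1.0 - y)]  # sigmoid gradient, equal to equation (26)
    for layer in range(len(weights) - 1, -1, -1):
        W = weights[layer]
        # Equation (24): dZ_k/dA_i = W[i][k]; sum over next-layer nodes k.
        g_a = [sum(W[i][k] * g[k] for k in range(len(g))) for i in range(len(W))]
        if layer == 0:
            return g_a
        g = [g_a[i] * relu_grad(zs[layer - 1][i]) for i in range(len(g_a))]


def numeric_gradient(x, weights, eps=1e-5):
    """Central finite differences, used only to check input_gradient."""
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grads.append((forward(up, weights)[1] - forward(down, weights)[1]) / (2 * eps))
    return grads


# Made-up weights, indexed as W[i][k]: node i in one layer to node k in the next.
WEIGHTS = [
    [[0.5, -1.0], [0.25, 0.5]],
    [[1.0, -0.5], [0.3, 0.8]],
    [[0.4, 0.6], [0.2, -0.7]],
    [[1.0], [-2.0]],
]
X = [1.0, 1.0]
analytic = input_gradient(X, WEIGHTS)
numeric = numeric_gradient(X, WEIGHTS)
```

Note that these are local sensitivities at a single input point. Because ReLU units switch on and off, the gradient can differ from one observation to another.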
2. Logistic Regression
Logistic regression models are used extensively in financial literature concerning default prediction and can be viewed as a reliable baseline for this type of analysis. We employ them due to their greater transparency in both coefficient and significance analysis.
Logistic models are non-linear and follow a log of the odds specification:
$$\log \frac{P(X)}{1 - P(X)} = \sum_{j=0}^{K} b_j x_j \qquad (27)$$
where P(X) is the probability of default, $b_j$ is the regression’s j-th coefficient and $x_j$ is the regression’s j-th variable. Exponentiating both sides to the base e results in a more interpretable specification of:
$$\frac{P(X)}{1 - P(X)} = e^{\sum_{j=0}^{K} b_j x_j} \qquad (28)$$
Or:
$$\frac{P(X)}{1 - P(X)} = \prod_{j=0}^{K} e^{b_j x_j} \qquad (29)$$
where the left-hand side of the equation represents the odds of default (the probability of default occurring over the probability of it not occurring), while the right-hand side is the product of the exponentiation of each regressor. The interpretation of the model is thus that a one unit increase in the value of some variable $x_j$ multiplies the odds of default by $e^{b_j}$.
Our use of logistic regressions will be two-layered in this study. Our particular use of logistic regression will also involve the application of a Lasso penalty to observe (a) whether contagion remains among the significant variables and (b) the magnitude and direction of its coefficients.
F. Testing Framework
We begin the study by selecting a period of observation during which we want to record attributes. This period is the crash from the financial crisis and the first stages of its recovery, i.e. all four quarters of 2009. We do this with a view to predicting defaults occurring right after the aforementioned period. It would be reasonable to predict defaults immediately after or during the crash; however, the entire notion of interbank contagion is to identify the defaults of banks that don’t immediately become insolvent during a crash, but might become insolvent later as a result of prior immediate defaults. Hence we apply a lag that leads us to also include some recovery months as predictor periods.
In the stated 2009 window, we have attribute data on some 7,000 commercial banks in the U.S. financial system (only banks which remained solvent and as standalone entities for the entire period were included in this study). We then seek to identify which of these attributes caused certain banks to fail and others to remain solvent.
We have established that we have a number of X attributes for these observations/banks over the specified period, but we require a Y label to discern whether they failed or not and whether the X attributes can predict such an event. As we record X attribute data right up until the end of 2009, we aim to predict bank defaults over the first quarter of 2010 (the successive data period). We used a separate dataset to retrospectively label all 7,000 banks in the 2009 yearly window as either still solvent in the first quarter of 2010 (0) or insolvent (1). The general framework of the study is to train a model on the X attributes in the 2009 window to predict the binary 2010 default state. The problem is framed cross-sectionally. The model is trained In-Sample (IS) on a subset of the banks (as opposed to using all banks but a subset of the overall time period) to optimize parameters and tested Out-Of-Sample (OOS) on a separate leftover subset of the banks.
Overall predictive performance is observed on this leftover sample. The study will present findings on whether financial attributes, including contagion, up to a 1 year lag had any, and which, influence on default.
Note on Class Balance
For each failed bank in our system there are roughly 50 non-failed banks, meaning our raw dataset is imbalanced and a classifier that labels all test samples as non-failed will be highly accurate without producing insight. To remedy this, we adopt naive random oversampling of the minority class (failed banks) with replacement until it is comparable in size to the majority one. To reduce computation times and to avoid excessive simplification of attributes in the oversampled class, we have reduced the overall size of the data to 1000 observations, evenly split between failed and non-failed institutions.
III. DATA
Data has been chiefly sourced from the Federal Financial Institutions Examination Council (FFIEC, 2019). The FFIEC provides datasets with balance sheet information on all United States commercial banks at quarterly intervals. Tailored code has been developed to compile and clean the semi-structured quarterly entries into a new, never before used, time-series dataset for each attribute mentioned in the Variable Setup section over the period 2001-2018. (This constitutes a raw first entry of this study’s X variables.) A secondary source of data is the Federal Deposit Insurance Corporation (FDIC, 2019), which produces a list of all failed U.S. banks. The unique identifiers from the FDIC’s dataset were matched to the FFIEC’s, such that failed banks in the FFIEC dataset take on a value of 1 and non-failed banks take a value of 0. (This constitutes the Y dependent variable we look to predict.) A data dictionary for variable abbreviations can be found in the Federal Reserve’s Micro Data Reference Manual (Board of Governors of the Federal Reserve System, 2018). We gathered all dataset variables necessary to either simulate or define the variables mentioned in the earlier model methodology. The data gathered was sufficient, but at times required manipulation. Interbank liabilities, for instance, had to be derived using related data, requiring the assumption of a closed interbank system. Additionally, interbank data included information on the total interbank assets and liabilities of each bank, but not the interbank assets and liabilities of each bank with respect to every other, which is necessary to test contagion.
Fig. 3: Correlation matrix of each attribute against every other. Attributes are substituted with number keys in the left table, with a key dictionary table on the right. Coefficients in the correlation matrix are color coded according to a criterion that highlights high correlations in red, whether positive or negative, leaving more uncorrelated readings in a lighter hue.
We hence applied a maximum entropy reconstruction algorithm (Anand et al., 2015) capable of approximating the necessary data. At even iterations n (equation 30) and odd iterations n+1 (equation 31), the algorithm normalizes the earlier introduced interbank asset matrix $W_{ij}$ by its current total interbank assets (for iteration n) or liabilities (for n+1) and then multiplies it by the observed total interbank assets (for n) or liabilities (for n+1) from the FFIEC dataset. In other words, this process alternately rescales the rows and columns of the matrix to the dataset’s aggregate values until convergence is reached:
$$W_{ij}^{n} = \frac{W_{ij}^{n-1}}{\sum_{j} W_{ij}^{n-1}} \, IA_i^{FFIEC} \qquad (30)$$
$$W_{ij}^{n+1} = \frac{W_{ij}^{n}}{\sum_{i} W_{ij}^{n}} \, IL_j^{FFIEC} \qquad (31)$$
The FFIEC data is used to iterate this paper’s selected contagion models, and their predicted defaults are compared with observed defaults from the Federal Deposit Insurance Corporation’s Failed Bank List (FDIC, 2019) to generate conclusions.
IV. RESULTS AND DISCUSSION
A. Preliminary Analysis of Variable Correlations
Before analysing our model outputs, this section covers some initial correlation analysis of the studied attributes. Fig. 3 outlines the correlation coefficients of each attribute in our study with respect to every other. We find generally that every attribute, with the exception of Return on Equity (keys 5 to 9), is highly autocorrelated across lags. In this regard, Return on Assets, Tier 1 Capital Ratio (RCON7206) and Tier 1 Leverage Capital Ratio (RCON7204) perform worst, while Contagion and Short Term Past Due loans perform moderately better but still considerably below Return on Equity. In terms of cross attribute correlation, the picture changes. Predictably, Return on Equity is highly positively correlated with Return on Assets and moderately inversely related to Short Term Past Due loans. Short Term Past Due loans are strongly inversely correlated with Return on Assets and RCON7204. Return on Assets is highly positively correlated with Return on Equity and RCON7204. RCON7204 is highly positively correlated with Return on Assets and inversely correlated with Short Term Past Due loans. Finally, out of all attributes studied, Contagion and RCON7206 are the only two to be largely uncorrelated with every other attribute, outside of their own lags, pointing to the high value they may bring to any analysis that involves these composite metrics. In particular, contagion is exceptionally uncorrelated, with no correlation coefficient exceeding 0.05 in absolute value once lag coefficients are excluded. We conclude from this preliminary data analysis that contagion has promising qualities that may bring additional uncorrelated explanatory power to future regression analysis. We also conclude that some attributes have sizeable amounts of cross correlation with other attributes, and that all attributes have a moderate to high degree of autocorrelation in their lags; hence dimensionality reduction methods will later be applied to avoid collinearity.
B. Neural Network Results
Model Details
A neural network was tuned In-Sample and tested Out-Of-Sample for the purpose of variable analysis. We report the specifications of the final OOS tuned model in Table I.
The tuning resulted in an ideal learning rate of 0.01 combined with an RMSprop solver and a pyramid architecture of 32, 16 and 8 nodes over three hidden layers. The model performed extremely well in default detection with 97.74 percent accuracy, meaning the vast majority of observations were correctly classified as either having remained solvent or failed in the first quarter of 2010. The high performance implies weights and magnitudes within the model are reliable indicators of how each variable affects default.

Hidden Layer Structure | Solver  | Learning Rate | OOS Accuracy
(32, 16, 8)            | RMSprop | 0.01          | 97.74%

TABLE I: Classification accuracy and model specification of tuned OOS neural network model.
Sensitivity Analysis
As outlined in the methodology, we applied gradient descent to obtain derivatives of the network’s output with respect to its inputs and gauge explanatory power. Table II reports our findings on a variable-by-variable basis.
Variable/Attribute                        Output Gradient w.r.t. Input (dY/dA)
Short Term Past Due - Quarter 1, 2009      0.002
Short Term Past Due - Quarter 2, 2009     -0.008
Short Term Past Due - Quarter 3, 2009      0.002
Short Term Past Due - Quarter 4, 2009      0.027
ROE - Quarter 1, 2009                      0.041
ROE - Quarter 2, 2009                     -0.028
ROE - Quarter 3, 2009                     -0.018
ROE - Quarter 4, 2009                     -0.019
ROA - Quarter 1, 2009                      0.006
ROA - Quarter 2, 2009                      0.000
ROA - Quarter 3, 2009                      0.002
ROA - Quarter 4, 2009                     -0.064
RCON7206 - Quarter 1, 2009                 0.006
RCON7206 - Quarter 2, 2009                -0.009
RCON7206 - Quarter 3, 2009                -0.007
RCON7206 - Quarter 4, 2009                -0.019
RCON7204 - Quarter 1, 2009                 0.003
RCON7204 - Quarter 2, 2009                -0.209
RCON7204 - Quarter 3, 2009                -0.013
RCON7204 - Quarter 4, 2009                -0.295
Contagion Proxy - Quarter 1, 2009         -0.005
Contagion Proxy - Quarter 2, 2009         -0.009
Contagion Proxy - Quarter 3, 2009          0.002
Contagion Proxy - Quarter 4, 2009         -0.029
TABLE II: Gradients of network’s output value with respect to listed input variable.
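One way to obtain input gradients of the kind listed in Table II is to apply the chain rule back through the layers until the inputs are reached. The sketch below is a minimal illustration and not the study's implementation. The weights are random and untrained, and the tanh hidden activations, sigmoid output and single evaluation point are assumptions. The helper names `init_layers`, `forward`, `input_gradient` and `numerical_gradient` are hypothetical. Only the 24 inputs and the (32, 16, 8) hidden structure come from the text.

```python
import math
import random


def init_layers(sizes, seed=0):
    """Random (untrained) weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        W = [[rng.gauss(0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((W, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Return the output Y in (0, 1) and the activations of every layer."""
    activations = [x]
    for i, (W, b) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activations[-1])) + bj
             for row, bj in zip(W, b)]
        is_output = i == len(layers) - 1
        # Assumed activations: tanh in hidden layers, sigmoid at the output.
        a = [1 / (1 + math.exp(-v)) if is_output else math.tanh(v) for v in z]
        activations.append(a)
    return activations[-1][0], activations


def input_gradient(layers, x):
    """dY/dA for every input A, using the chain rule back to the inputs."""
    y, activations = forward(layers, x)
    # Derivative of the sigmoid output with respect to its pre-activation.
    delta = [y * (1 - y)]
    for i in range(len(layers) - 1, -1, -1):
        W, _ = layers[i]
        # Gradient with respect to the inputs of layer i.
        grad_in = [sum(W[j][k] * delta[j] for j in range(len(W)))
                   for k in range(len(W[0]))]
        if i == 0:
            return grad_in
        # Multiply by the tanh derivative of the preceding hidden layer.
        delta = [g * (1 - a * a) for g, a in zip(grad_in, activations[i])]


def numerical_gradient(layers, x, k, eps=1e-6):
    """Central finite difference of Y with respect to input k, used as a check."""
    up = x[:k] + [x[k] + eps] + x[k + 1:]
    down = x[:k] + [x[k] - eps] + x[k + 1:]
    return (forward(layers, up)[0] - forward(layers, down)[0]) / (2 * eps)


layers = init_layers([24, 32, 16, 8, 1])
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(24)]  # one hypothetical standardized observation

grads = input_gradient(layers, x)
checks = [numerical_gradient(layers, x, k) for k in range(24)]
print("largest |dY/dA|:", max(abs(g) for g in grads))
```

The sign of each entry in `grads` indicates whether raising that input pushes the output towards 1 (default) or 0 (solvent), and its magnitude indicates how sensitive the output is to that input at the chosen point.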
Before analysing the impact of contagion, let us describe the model in general terms and check the structural integrity of the magnitude and direction of its gradients, since the model must be broadly consistent with theory if it is to be used as a reliable benchmark for a novel attribute.
We would generally expect Return on Equity (ROE) and Return on Assets (ROA) to be inversely correlated with default probability, meaning an increase in either results in lower chances of bankruptcy. The model broadly supports this, as three out of the four ROE entries have a negative gradient, signifying that an increase in ROE moves the final 0 to 1 Y output closer to 0, where a solvent classification is more likely. ROA is more conflicting, as two of its readings have a negative gradient and two have a positive one; however, the magnitude of its most significant reading (Quarter 4) is roughly ten times larger than any of its other entries, signifying a net effect of inverse correlation with default. Short Term Past Due loans should be positively correlated with default, as a higher proportion of bad loans raises loss provisions, limiting banks’ capital. This is confirmed by the data: three of its four entries have positive gradients, and the largest of these (Quarter 4, 0.027) is more than three times the magnitude of its one negative reading (-0.008). The two capital ratios, TIER 1 Leverage Capital Ratio (RCON7204) and TIER 1 Capital Ratio (RCON7206), should produce inverse correlations with default, as the metrics measure a bank’s liquid capital buffer (where a higher ratio indicates a larger buffer). This is once again confirmed by the data, as both have negative gradients for three out of four of their entries and insignificant magnitudes for their individual positive readings. Having ascertained the model is structurally consistent, let us analyse the magnitude and direction of contagion.
The Contagion Proxy would be expected to be inversely related to default, as it represents the equity lost due to contagion (note this value is negative). This means an increase in its reading mathematically represents a lower equity loss, thus a lower exposure to contagion and thus a lower default probability. We can confirm that this is the case, as three out of four contagion readings have a negative gradient and their magnitudes greatly overshadow that of its only positive gradient.
In addition to correct directionality, the Contagion Proxy also boasts large gradients, where a larger gradient magnitude implies higher variable importance and impact on model output. Among variables with a negative gradient (15 variables), its reading for Quarter 4 of 2009 is the fourth largest by magnitude, coming after the second and fourth quarter readings of RCON7204 and the fourth quarter reading of Return on Assets. Most notably, this contagion reading has a larger gradient magnitude than any of the TIER 1 Capital Ratio (RCON7206) entries, representing confirmation within the context of this model that contagion can play a larger role in default prediction than long established gold standard predictors.
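The ranking claims above can be checked directly against the values printed in Table II. The short sketch below transcribes those gradients and sorts the negative ones by magnitude. The ROA Quarter 2 entry is printed as 0.000, so its sign cannot be read from the rounded table. The sketch treats it as non-negative, which is an assumption and may leave the count of negative entries one short of the 15 quoted in the text.

```python
# Gradients transcribed from Table II, listed by quarter (Q1 to Q4 of 2009).
gradients = {
    "Short Term Past Due": [0.002, -0.008, 0.002, 0.027],
    "ROE": [0.041, -0.028, -0.018, -0.019],
    "ROA": [0.006, 0.000, 0.002, -0.064],
    "RCON7206": [0.006, -0.009, -0.007, -0.019],
    "RCON7204": [0.003, -0.209, -0.013, -0.295],
    "Contagion Proxy": [-0.005, -0.009, 0.002, -0.029],
}

# Flatten to (attribute, quarter, gradient) and keep strictly negative readings.
negative = [
    (name, q + 1, g)
    for name, values in gradients.items()
    for q, g in enumerate(values)
    if g < 0
]
# Sort so that the most negative (largest magnitude) reading comes first.
negative.sort(key=lambda item: item[2])

ranking = [(name, q) for name, q, _ in negative]
contagion_rank = ranking.index(("Contagion Proxy", 4)) + 1
largest_rcon7206 = max(abs(g) for g in gradients["RCON7206"])

for rank, (name, q, g) in enumerate(negative[:5], start=1):
    print(rank, name, f"Q{q}", g)
print("Contagion Proxy Q4 rank among negative gradients:", contagion_rank)
```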
C. Logistic Regression Results
A logistic model was fit to the study’s 24 attributes alongside a Lasso regularization penalty to reduce dimensionality and collinearity and to increase the significance of retained variables.
Table III outlines our findings. The model’s directionality is structurally consistent. Short Term Past Due loans are correctly directly proportional to odds of default, while Return on Equity, TIER 1 Capital Ratio (RCON7206), TIER 1 Leverage Capital Ratio (RCON7204) and the Contagion Proxy are correctly inversely proportional, in line with the discussion in the Neural Network Results section of this study. Having established this, we may further analyse the model and the role of contagion in it.
The applied Lasso regularization reduced the number of analysed attributes from 24 to 7. The substantial penalty implies that any retained variable can be assumed to be very impactful on default. Many attributes suffered heavy reductions, with all but Return on Equity (ROE) being limited to a single entry per metric. Return on Assets (ROA) has notably been discarded entirely. Contagion has been retained in the model with a single entry, offering an initial endorsement of its utility in the context of this model.
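The way a Lasso penalty discards attributes can be illustrated with a small sketch. The text does not state which solver was used, so the proximal gradient method (ISTA) below is a swapped-in technique chosen for brevity. The data are synthetic, the penalty strength `lam` is an arbitrary assumption, and the names `sigmoid`, `soft_threshold` and `fit_lasso_logistic` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v towards zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=200):
    """Minimise mean log-loss + lam * ||w||_1 by proximal gradient descent.

    The intercept b is left unpenalised.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        residuals = [sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - yi
                     for row, yi in zip(X, y)]
        grad_w = [sum(r * row[j] for r, row in zip(residuals, X)) / n
                  for j in range(p)]
        grad_b = sum(residuals) / n
        # Gradient step followed by soft-thresholding, which sets small weights to exactly 0.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


# Synthetic data (an assumption): 6 standardized features, only the first two matter.
rng = random.Random(0)
n, p = 500, 6
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1 if rng.random() < sigmoid(2 * row[0] - 2 * row[1]) else 0 for row in X]

w, b = fit_lasso_logistic(X, y, lam=0.1)
retained = [j for j, wj in enumerate(w) if wj != 0.0]
print("coefficients:", [round(wj, 3) for wj in w])
print("retained feature indices:", retained)
```

Features whose gradient never exceeds the penalty stay at exactly zero, which is the mechanism behind the "Lasso Reduced" entries in a table of retained coefficients.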
OOS Accuracy: 93.7%
Variable/Attribute (Yearly)               Log of the Odds Coefficient    Z Distribution P-Value
Short Term Past Due - Quarter 1, 2009     Lasso Reduced
Short Term Past Due - Quarter 2, 2009     Lasso Reduced
Short Term Past Due - Quarter 3, 2009     Lasso Reduced
Short Term Past Due - Quarter 4, 2009      0.0498                        0.2671
ROE - Quarter 1, 2009                     Lasso Reduced
ROE - Quarter 2, 2009                     -0.4107                        0.0002*
ROE - Quarter 3, 2009                     -0.0521                        0.3879
ROE - Quarter 4, 2009                     -0.106                         0.0290*
ROA - Quarter 1, 2009                     Lasso Reduced
ROA - Quarter 2, 2009                     Lasso Reduced
ROA - Quarter 3, 2009                     Lasso Reduced
ROA - Quarter 4, 2009                     Lasso Reduced
RCON7206 - Quarter 1, 2009                -0.0776                        0.4007
RCON7206 - Quarter 2, 2009                Lasso Reduced
RCON7206 - Quarter 3, 2009                Lasso Reduced
RCON7206 - Quarter 4, 2009                Lasso Reduced
RCON7204 - Quarter 1, 2009                Lasso Reduced
RCON7204 - Quarter 2, 2009                Lasso Reduced
RCON7204 - Quarter 3, 2009                Lasso Reduced
RCON7204 - Quarter 4, 2009                -1.8492                        0.0000*
Contagion Proxy - Quarter 1, 2009         Lasso Reduced
Contagion Proxy - Quarter 2, 2009         Lasso Reduced
Contagion Proxy - Quarter 3, 2009         Lasso Reduced
Contagion Proxy - Quarter 4, 2009         -0.1489                        0.0018*
TABLE III: Coefficients from Logistic Regression following Lasso penalty. 24 original attributes reduced to 7 post-penalty. (*) Variable is significant at 95 percent confidence level.
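The log-odds coefficients in Table III can be converted into multiplicative effects on the odds of default by exponentiating them. The sketch below does this for the seven retained coefficients as transcribed from the table and ranks them by absolute magnitude. The dictionary keys are shortened labels introduced here for illustration.

```python
import math

# Retained log-odds coefficients transcribed from Table III.
coefficients = {
    "Short Term Past Due - Q4 2009": 0.0498,
    "ROE - Q2 2009": -0.4107,
    "ROE - Q3 2009": -0.0521,
    "ROE - Q4 2009": -0.106,
    "RCON7206 - Q1 2009": -0.0776,
    "RCON7204 - Q4 2009": -1.8492,
    "Contagion Proxy - Q4 2009": -0.1489,
}

# exp(beta) is the factor applied to the odds of default for a one-unit
# increase in the variable, holding the others fixed.
odds_multipliers = {name: math.exp(beta) for name, beta in coefficients.items()}

# Rank the coefficients by absolute magnitude, largest first.
by_magnitude = sorted(coefficients, key=lambda name: abs(coefficients[name]), reverse=True)

for name in by_magnitude:
    print(f"{name}: beta={coefficients[name]:+.4f}, odds x {odds_multipliers[name]:.3f}")
```

A multiplier below 1 means that a one-unit increase in the variable lowers the odds of default, while a multiplier above 1 raises them.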
To further elaborate, the Contagion Proxy’s reading for the fourth quarter of 2009 yields a coefficient of -0.1489, which implies that a one unit (standard normal intervals) increment in the Contagion Proxy would multiply (reduce) the odds of default by a factor of $e^{-0.1489} \approx 0.86$, ceteris paribus. Recall, as mentioned in the preceding Neural Network Results section, that a mathematical one unit increase in the Contagion Proxy denotes a decrease in contagion exposure, given the proxy measures negative equity loss from contagion. The metric’s coefficient is the third largest in absolute magnitude across the regression, ranking ahead of two of the three Return on Equity readings and the individual RCON7206 reading for same sign metrics. It is also highly significant, with a Z-distributed p-value of 0.0018. Among the p-values reported in Table III, only RCON7204’s fourth-quarter reading (reported as 0.0000) and the second-quarter Return on Equity reading (0.0002) are lower.
V. CONCLUSION
This paper sought to explore the utility of interbank contagion as a default metric and, more generally, to verify its explanatory power and significance in the wider context of more established financial variables of interest. We adopted two default model frameworks, the first involving neural networks and the second involving logistic regression. In the former, we concluded contagion has high explanatory power, with gradient magnitudes exceeding those of the TIER 1 Capital Ratio, and correct correlation with default, with greater contagion exposure being associated with greater odds of default.
In the logistic framework, we also concluded contagion to be highly explanatory, with a coefficient ranking third out of seven attributes by magnitude in a Lasso reduced model, and correctly correlated with default. We hence summarize that contagion not only plays a clear role in bank default analysis, but that it is capable of even superseding the explanatory power and significance of far more established variables, justifying far greater interest in its analysis and application to stress testing, risk management and default prediction.
VI. ACKNOWLEDGEMENTS
I would like to thank Dr. Fabio Caccioli for providing helpful comments.
VII. COMPETING INTERESTS STATEMENT
I confirm there have been no involvements or interests that might question the integrity of this study or the opinions stated therein.