Bankruptcy prediction using disclosure text features
A PREPRINT
Sridhar Ravula
Department of Analytics
Harrisburg University of Science and Technology
Harrisburg, PA 17101
[email protected]
January 5, 2021

ABSTRACT
A public firm’s bankruptcy prediction is an important financial research problem because of the security price downside risks. Traditional methods rely on accounting metrics that suffer from shortcomings like window dressing and retrospective focus. While disclosure text-based metrics overcome some of these issues, current methods excessively focus on disclosure tone and sentiment. There is a requirement to relate meaningful signals in the disclosure text to financial outcomes and to quantify the disclosure text data. This work proposes a new distress dictionary based on the sentences used by managers in explaining financial status. It demonstrates the significant differences in linguistic features between bankrupt and non-bankrupt firms. Further, using a large sample of 500 bankrupt firms, it builds predictive models and compares the performance against two dictionaries used in financial text analysis. This research shows that the proposed stress dictionary captures unique information from disclosures and that the predictive models based on its features have the highest accuracy.

Keywords Bankruptcy · Distress · NLP · bag-of-words · Disclosures · Machine learning · EDGAR · Text analysis

Investors and analysts place great emphasis on security analysis and valuation because of the potential excess returns on capital and the downside risks. Research in this domain is potentially valuable because market inefficiencies can result in volatility and crashes, costing the economy billions of dollars. Analysts extensively use public firms’ disclosures as a source of information. Investors are keen on knowing the health of the firms in which they may invest in the future. A firm in financial distress loses a significant amount of its shareholders’ value. If the management cannot tide over the crisis, the firm may have to file for bankruptcy, resulting in a 50% to 80% loss of capital for shareholders and lenders.
Financial distress and bankruptcy prediction is an actively researched field. Once a company is unable to come out of distress, it will become insolvent. Insolvency is the state in which the company is not capable of honoring some commitment. Lenders and claim holders can force the insolvent company to discontinue operations. Management files for bankruptcy protection to recover from such a situation or to liquidate the firm in an orderly manner. Bankruptcy prediction has been an active research topic for accounting researchers over decades. One of the pioneering works, Altman (1968), proposed the ‘Z-score’ model.

Investors and analysts traditionally depended on quantitative information like accounting metrics for decision making. Multiple attributes of these accounting metrics drove this trend. FACC and accounting standards laid out what variables are to be measured and disclosed. Gathering, processing, and analyzing these quantitative metrics was easy. Many free and commercial data providers automated the data gathering and published these metrics. However, these metrics do not always reveal the firm’s current status and are not a good indicator of the future. They suffer from shortcomings like window dressing and retrospective focus.

Evidence exists for window dressing through commissions and omissions. Rajan, Seru, and Vig (2015) showed that banks did not report information regarding the deteriorating quality of borrowers’ disclosures in the run-up to the subprime crisis. Huizinga and Laeven (2012) said that banks overstated the value of their distressed real estate assets and regulatory capital. Window dressing, retrospective focus, and missing variables impact models based on accounting metrics. Regulators and investors who rely on such models have been impacted adversely in the past due to model failures (Rajan, Seru, and Vig (2015)).

Another approach to bankruptcy prediction is using market-based information. Classical efficient market theory and later option pricing theories assume that all available information is reflected in market prices. Under those conditions, accounting-based metrics do not have additional information over and above market prices. More specifically, a suitable market-based measure will reflect all available information about bankruptcy probability. Hillegeist et al. (2004) developed a prediction model based on market information, using implied volatility derived from option pricing theory. This model outperformed the Altman (1968) Z-score model. Subsequently, numerous attempts have been made to replicate these results. Wu, Gaunt, and Gray (2010) provide a comparison of accounting- and market-based models, along with others. They conclude that the Hillegeist et al. (2004) model performs better than the Z-score model but is inferior to models that include non-traditional metrics. Similarly, Tinoco and Wilson (2013) concluded that accounting-metrics-based models and market-based models are complementary. Hence, researchers started paying more attention to alternative approaches like textual analysis of disclosures.
Management disclosures have narrative content that contains important information. This information can explain many firm attributes and organizational outcomes, and text analysis methods can extract it. Prior works have attempted to incorporate text features into accounting-based predictive models. However, standalone text-feature-based prediction models have not been attempted. There is a need to understand how much information can be extracted from disclosure texts and how useful such information is in predicting bankruptcy. This work addresses that gap.
Numerous researchers have tried to explain various firm attributes using disclosure narratives. Some analyzed the MDA to explain future stock performance (Tao, Deokar, and Deshmukh (2018)), future returns, volatility, and firm profitability (Amel-Zadeh and Faasse (2016)), bankruptcy (Yang, Dolar, and Mo (2018)), going-concern (Mayew, Sethuraman, and Venkatachalam (2015), Enev (2017)), litigation risk (Bourveau, Lou, and Wang (2018)), and incremental information over earnings surprises, accruals, and operating cash flows (OCF) (Feldman et al. (2008), Feldman et al. (2010)). Researchers have also attempted to incorporate text features into distress and bankruptcy predictive models. Below is a brief review of the same.

Auditors express going-concern opinions based on the firm’s obligations and liquidity. Financial disclosures include these opinions. A change in such disclosures can act as a signal to identify distress. However, auditors do respond to external financial markets. Beams and Yan (2015) examined the financial crisis’s effect on auditor going-concern opinions and concluded that the financial crisis led to increased auditor conservatism. A going-concern opinion in disclosures is associated with the number of forward-looking disclosures and their ambiguity. Enev (2017) observed that while the absolute number of forward-looking disclosures is lower for companies receiving a going-concern opinion, the proportion of forward-looking disclosures in the MDA is higher in the presence of a going-concern opinion. The results suggest generally improved forward-looking disclosures in the MDA when companies receive a going-concern opinion from their auditor.

One consequence of distress is financial constraints. Firms undergo reduced cash flows during stress, which results in liquidity events like dividend omissions or increases, equity recycling, and underfunded pensions. Analysts measure the extent of financial constraints to assess the capital structure.
Bodnaruk, Loughran, and McDonald (2013) used a constraining-words-based lexicon to measure the same. These measures have a low correlation with traditional financial constraints measures and predict subsequent liquidity events better. Ball, Hoberg, and Maksimovic (2012) used text in firms’ 10-Ks to measure investment delays due to financial constraints. They found that the fundamental limitations are the financing of R&D expenditures rather than capital expenditures, and that the main challenge for firms is raising equity capital to fund growth opportunities. These text-based measures predict investment cuts following the financial crisis better than other indices of financial constraints used in the literature.

Most prior bankruptcy prediction models were developed using financial ratios. However, signs of distress may appear in nonfinancial information earlier than in changes in the financial ratios. Current distress measures tend to miss extreme events, especially in the banking sector (Gandhi, Loughran, and McDonald (2017)). In recent years, qualitative information and text analysis have become necessary because frequent changes in accounting standards have made it difficult to compare financial numbers between years (Shirata et al. (2011)). Mayew, Sethuraman, and Venkatachalam (2015) stressed the importance of linguistic tone in assessing a firm’s health. Using a sample of bankrupt firms between 1995 and 2012, they concluded that management’s opinion about going-concern and the MDA’s linguistic tone together predict whether a firm will go bankrupt.

The language used by future bankrupt companies differs from that of non-bankrupt companies. Hájek and Olej (2015) studied various word categories from corporate annual reports and showed that the language used by bankrupt companies shows stronger tenacity, accomplishment, familiarity, present concern, exclusion, and denial. Bankrupt companies also use more modal, positive, uncertain, and negative language. They built prediction models combining both financial indicators and word categorizations as input variables. This differential language usage is also observed in non-English firms’ disclosures. Shirata et al. (2011) analyzed the sentences in Japanese financial reports to predict bankruptcy. Their research revealed that the co-occurrence of the words “dividend” or “retained earnings” in a section distinguishes bankrupt companies from non-bankrupt companies.

Working on U.S. banks, Gandhi, Loughran, and McDonald (2017) used disclosure text sentiment as a proxy for bank distress. They found that a more negative sentiment in the annual report is associated with larger delisting probabilities, lower odds of paying subsequent dividends, higher subsequent loan loss provisions, and lower future return on assets. Similarly, Lopatta, Gloger, and Jaeschke (2017) concluded that firms at risk of bankruptcy use significantly more negative words in their 10-K filings than comparable vital companies. This relationship holds up until three years before the actual bankruptcy filing.
Other notable works using text analysis for bankruptcy prediction were Yang, Dolar, and Mo (2018) and Mayew, Sethuraman, and Venkatachalam (2015). Yang, Dolar, and Mo (2018) used high-frequency words from the MDA and compared the differences between bankrupt and non-bankrupt companies. Mayew, Sethuraman, and Venkatachalam (2015) also analyzed the MDA with a focus on going-concern opinions. They found that both management’s opinion about “going-concern” reported in the MDA and the MDA’s linguistic tone together provide significant explanatory power in predicting whether a firm will cease as a going concern. Also, the predictive ability of disclosure is incremental to financial ratios, market-based variables, and even the auditor’s going-concern opinion, and extends to three years before the bankruptcy.

Most of the prior works focused on disclosure sentiment as an incremental predictor for bankruptcy prediction. However, disclosure text contains significantly more information than sentiment, and there is a need to extract and test its predictive power. To this end, this quantitative correlation study evaluates the differences in linguistic features between healthy and bankrupt disclosure texts. Further, predictive models are built to assess the information content and predictive power. The next section outlines the methods.
The prior sections have reviewed the literature and identified the gaps in the text analysis of finance. As bankruptcy is a significant organizational outcome for investors, this thesis focuses on the bankruptcy prediction task. To this end, this quantitative correlation study evaluates the differences in linguistic features between healthy and bankrupt disclosure texts. Further, predictive models are built to assess the information content and predictive power. This section describes the framework, data analysis, and methodology used.

To summarize, this thesis has four key components.
Text source: Management Discussion and Analysis (MDA) from 10-K disclosures.
Task: Bankruptcy prediction based on the prior-year MDA.
Sample size: Balanced sample with 500 bankrupt and 500 non-bankrupt disclosures.
Language models: Multiple, as described in later parts of this chapter.
The methods section consists of four sub-sections covering data, language models, predictive models, and assessment criteria.
In this section, sample selection and data collection methods are described. This work aims to extract knowledge from financial disclosure text and use it for predictive tasks. It considers publicly listed companies in the U.S. as the population. From 1994 to 2019, over 16,000 individual companies filed annual disclosures with the SEC. New companies get listed on exchanges through initial public offerings or corporate spin-offs. Companies are delisted due to mergers, acquisitions, and bankruptcies. As a result, there are ~8,000 listed public companies in the U.S. in 2019.
This work focuses on bankruptcy prediction using disclosure text characteristics. So, two samples are critical: a list of bankrupt firms and a list of non-bankrupt firms.
A critical component of this study is to identify firms that went bankrupt. This work uses the list of bankrupt companies from the UCLA-LoPucki Bankruptcy Research Database (BRD) maintained by LoPucki (2006). UCLA School of Law collects, updates, and disseminates this data. The dataset contains more than one thousand large public companies that have filed bankruptcy cases since October 1, 1979. The BRD defines a public company as a firm that filed an annual report (Form 10-K or Form 10) with the SEC for a year ending not less than three years before filing the bankruptcy case. The BRD considers all firms with more than $100 million in assets in annual reports as “large.” Assets are measured in 1980 constant dollars (about $3.1 in current dollars). Both Chapter 7 and Chapter 11 cases are included in the bankruptcy list, whether filed by the debtors or creditors. From this list, bankruptcies before 1994 are excluded. Since EDGAR maintains online disclosures from 1994 onwards, it was convenient to extract those filings. The exclusion of prior bankruptcies results in a new list of ~900 bankrupt companies. Around 7,000 corresponding firm filings exist in EDGAR. Companies without at least one prior-year 10-K filing are excluded from the list. Finally, the Management Discussion and Analysis sections are extracted from these filings. A minimum threshold of 100 words is used to filter out non-informative MDAs. This filtering resulted in a sample of 500 company filings one year before bankruptcy.
The list of non-bankrupt firms is identified by starting with the S&P 1000 list and excluding companies with a bankruptcy history. The net result is 980 firms. Around 16,000 filings exist for all these firms.
Since annual bankruptcy incidence is less than 0.5%, the number of filings one year prior to bankruptcy is very low compared to non-bankrupt filings. Hence, a balanced experiment design with an equal number of bankrupt and non-bankrupt disclosures in the sample is used. Five hundred non-bankrupt filings are randomly chosen from the pool of non-bankrupt filings.
The method for downloading the annual filings has the following components.
From 1993 to December 2018, companies filed ~20 million records on EDGAR. For ease of access, the SEC releases quarterly master indices listing the filings on EDGAR. This list has ~220,000 annual (10-K) filings relevant to this thesis. Custom R and Python scripts downloaded these 10-K documents programmatically.
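The quarterly index retrieval described above can be sketched as below. The `master_index_urls` helper name and the year range are illustrative; EDGAR publishes one `master.idx` file per quarter under its full-index path.

```python
# Sketch of building EDGAR quarterly master-index URLs. Each quarter of
# each year has one master.idx file listing all filings for that quarter.
def master_index_urls(start_year, end_year):
    """Return URLs of EDGAR quarterly master indices for the given years."""
    base = "https://www.sec.gov/Archives/edgar/full-index"
    return [
        f"{base}/{year}/QTR{qtr}/master.idx"
        for year in range(start_year, end_year + 1)
        for qtr in range(1, 5)
    ]

# Two years of indices: 2 years x 4 quarters = 8 index files
urls = master_index_urls(1994, 1995)
```

Each index file can then be parsed for 10-K entries and the referenced documents downloaded in bulk.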
The text version of a filing on SEC is a collection of all files in a submission. These include HTML, exhibits, JPG files, and XBRL files. Only a fraction of the text file size contains actual text. ASCII-encoded PDFs, graphics, XLS, or other binary files can contribute most of a filing document’s size. The next processing step removed all non-text content from disclosure documents, following Loughran and McDonald (2009). These cleaned filings are stored in text format.
For bankruptcy prediction, the Management Discussion and Analysis (MDA) section is the source of text features. Management teams discuss the current firm status and expected outcomes in the MDA section. A Python script extracted all the text between “ITEM 7” and “ITEM 7A”. Regular expressions and combinations of these phrases are used to identify the maximum number of MDAs from 10-K files. In some disclosures, the MDA section is “incorporated by reference,” referring to the shareholders’ annual report. The thesis included MDA material from the body of the primary document. Also, it discarded all MDAs with fewer than 100 words. Subsequent sections explain the generation of these texts’ numeric representations using dictionary-based parsing or word embeddings.
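A minimal sketch of this extraction step, assuming a plain-text filing. Real filings use many spelling variants of the item headers, so production code needs a larger set of patterns than the single regex shown here.

```python
import re

# Grab the text between the "ITEM 7" and "ITEM 7A" markers, then discard
# sections that are too short to be informative (the 100-word threshold
# from the text). [^A] prevents the pattern from matching "ITEM 7A" itself.
MDA_RE = re.compile(r"ITEM\s*7[^A].*?(?=ITEM\s*7\s*A)",
                    re.IGNORECASE | re.DOTALL)

def extract_mda(filing_text, min_words=100):
    """Return the MDA section, or None if absent or under min_words words."""
    match = MDA_RE.search(filing_text)
    if match is None:
        return None
    mda = match.group(0)
    return mda if len(mda.split()) >= min_words else None
```

MDAs that are merely “incorporated by reference” typically leave only a short stub between the markers, so the word-count filter removes them as well.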
This subsection explains the dependent and independent variables used in the thesis.
The dependent variable for this thesis is a bankruptcy filing. The bankruptcy filing dummy equals one if the firm has filed for bankruptcy protection within one year after the 10-K filing date, and zero otherwise.
As outlined in the prior literature survey, numerous text representation methods successfully extract information from financial disclosures. However, they were often used in combination with traditional quantitative metrics and financial ratios. This thesis aims to identify the standalone information content in text and design methods for knowledge extraction. This work evaluates numeric representations of the MDA generated using three types of bag-of-words dictionary-based language models:

1. Linguistic Inquiry and Word Count (LIWC)
2. Loughran-McDonald Financial Dictionary (LM)
3. Stress Dictionary (S Dictionary)
Dictionary-based models are an extension of word frequency models. As discussed in prior sections, word frequency models suffer from large dimensionality and sparse matrix problems. One way to reduce the dimensionality is to categorize words into different groups and compute the category frequencies. These frequencies are normalized per thousand words, making comparison easier. These categorized word groups are called dictionaries. Dictionary methods act as filters in extracting relevant language features. For example, numerous words indicate negative sentiment in a discourse. Collecting them under one group and computing their frequency helps in understanding document tone very quickly. These advantages made dictionary-based methods prevalent in text analysis. The next section covers the three dictionary-based models this thesis uses.
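The normalization described above can be sketched as follows. The tiny two-category dictionary is illustrative only, not one of the actual dictionaries used in the thesis.

```python
# Dictionary-based feature generation: count word hits per category and
# normalize the counts per thousand words, as described in the text.
def dictionary_features(text, dictionary):
    """Return {category: frequency per 1,000 words} for one document."""
    tokens = text.lower().split()
    n = len(tokens)
    features = {}
    for category, words in dictionary.items():
        hits = sum(1 for tok in tokens if tok in words)
        features[category] = 1000.0 * hits / n if n else 0.0
    return features

toy_dict = {"negative": {"loss", "default"},
            "positive": {"growth", "profit"}}
# 5 tokens, one hit per category -> 200.0 per thousand words each
feats = dictionary_features("revenue growth offset the loss", toy_dict)
```

The per-thousand-words scaling makes category frequencies comparable across MDAs of very different lengths.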
LIWC
Linguistic Inquiry and Word Count (LIWC) is a text analysis program developed by Pennebaker (Pennebaker, Francis, and Booth (2001)). It allows linguistic feature analysis and content analysis. Also, the tool can review stylistic aspects of language use across different contexts. Since linguistic style reveals psychological information about a writer and their underlying thinking, it is a useful tool in MDA analysis. Researchers have used LIWC in numerous financial text analysis studies. Fisher, Garnsey, and Hughes (2016) provided a brief review.

LIWC examines written language and classifies it along up to 90 language dimensions (Pennebaker et al. (2015)), including

1. Four summary language variables (analytical thinking, clout, authenticity, and emotional tone)
2. Three linguistic descriptor categories (dictionary words, words per sentence, words of six letters and above)
3. Twenty-one standard language categories (e.g., articles, prepositions, pronouns)
4. Forty-one psychological process word categories (e.g., affect, cognition, biological processes, drives)
5. Six personal concern categories, five informal language markers, and 12 punctuation categories

The LIWC dimensions are hierarchically organized. For example, the word ‘optimistic’ falls into five categories: ‘optimism’, ‘positive emotion’, ‘overall affect’, ‘words longer than six letters’, and ‘adjective’. The program analyzes text files on a word-by-word basis, calculating the number of words that match each of the 90 LIWC dimensions, expressed as percentages of total words in the text, and records the data into one of 90 preset dictionary categories. The LIWC dictionary comprises over 6,000 words and stems. Each category is composed of a list of dictionary words. Several sources (e.g., blogs, expressive writing, novels, natural speech, the NY Times, and Twitter) were used to form the dictionary. The program classifies about 86 percent of the language used by people.
LIWC’s external validity has been tested; hence, LIWC is a useful research tool for measuring psychological processes, performing content analysis, and assessing various linguistic features. LIWC measures for all MDAs are generated using the LIWC2015 dictionary.
LM dictionary
Loughran and McDonald (2011) demonstrated that word lists developed for other disciplines misclassify common words in financial text. To overcome this, Loughran and McDonald (2011) created an alternative negative word list (Fin-Neg) and five other word lists that better reflect tone in financial disclosures. They tested the relation between these word lists and 10-K filing returns, trading volume, return volatility, fraud, material weakness, and unexpected earnings. Subsequently, these word lists have become known as the LM dictionary, and other researchers have used them in financial text analysis (Nguyen and Huynh (2020); Gandhi, Loughran, and McDonald (2019)). The five other word lists are positive (Fin-Pos), uncertainty (Fin-Unc), litigious (Fin-Lit), strong modal words (MW-Strong), and weak modal words (MW-Weak). The Fin-Neg list has 2,337 words. This list includes financial domain words that common negative word lists exclude, e.g., restated, litigation, termination, discontinued, penalties, unpaid, investigation, misstatement, misconduct, forfeiture, serious, allegedly, noncompliance, deterioration, and felony. The Fin-Pos word list consists of 353 words. The Fin-Unc list includes words indicating uncertainty and has 285 words. For capturing the propensity to litigate, 731 litigiousness words are combined into the Fin-Lit list. It contains words such as claimant, deposition, interlocutory, testimony, and tort. In the LM dictionary, words from these three groups overlap. Strong and weak modal words express levels of confidence. MW-Strong has 19 words, such as always, highest, must, and will. MW-Weak has 27 words, such as could, depending, might, and possibly.

For this work, the positive, negative, and uncertain word lists are included. This work used the quanteda library, which includes the LM dictionary (Benoit et al. (2018)), for generating numeric features.
Stress dictionary
While the LIWC and LM dictionaries extract the document’s tone and sentiment, they do not capture fundamental differences between bankrupt and non-bankrupt companies. Also, LM demonstrated a need for task- and domain-specific dictionaries.

Text features indicate differential language usage between bankrupt and non-bankrupt companies. Distressed firms communicate the nature of distress, remedial measures, and going concerns. Hence, the narrative of a distressed company’s MDA can differ from a healthy company’s MDA up to three years before the bankruptcy. For example, the following are some statements from distressed companies’ MDAs.

“Operating results are affected by indebtedness incurred to finance the acquisition and by the amortization of capitalized fees and expenses incurred in connection with such financing.”
“The company is unlikely to be able to meet its cash flow needs during..”
“The company was downgraded in november 1994 by three primary insurance rating agencies, and..”
In a healthy firm’s MDA, we will not observe these sentences. The following are some excerpts from healthy company MDAs.

“The increases in operating earnings were driven by revenue growth and . . . ”
“The company was in compliance with all debt covenants.”
Further to the difference in content, the linguistic features of the MDA content in distressed firms can differ. This difference results from obfuscation attempts: lengthy sentences describing the firm’s state, capturing the contingent conditions, narrating multiple agents’ attitudes (i.e., suppliers, lenders, economic factors), and management prognosis. The following statements highlight how a distressed firm communicates its efforts in handling the situation:

“Since the company currently does not have the means to repay the Series notes, management is unable to predict the future liquidity of the company if the restructuring is not accomplished.”
“The company may be required to refinance such amounts as they become due and payable. While the company believes that it will be able to refinance such amounts, there can be no assurance that any such refinancing would be consummated or, if consummated, would be in an amount sufficient to repay such obligations, particularly in light of the company’s high level of debt that will continue after the restructuring.”
“After giving effect to this amendment, the company was in compliance with the terms and restrictive covenants of its debt obligations for fiscal 1994.”
“The company has funded operations primarily from borrowings under its debt agreements and the sale of its stock.”
“The company was not in compliance with a net worth requirement contained in its sale-leaseback agreement.”
“As a result of the second quarter 1998 loss, the company was in default of certain covenants based on ebitda.”
“The loss incurred during the fourth quarter of the year ended june 30, 1999 resulted in not being in compliance with the debt service covenant”
“The proposed plan currently contemplates the filing of a pre-packaged chapter 11 plan of reorganization in order to . . . ”
“These factors among others indicate that there is substantial doubt about the company’s ability to continue as a going concern.”
“Considering our default of the loan agreements and our liquidity as discussed above, there is substantial doubt about our ability to continue as a going concern.”
In contrast, healthy companies do not describe these details in a lengthy manner. The following are excerpts from some healthy companies’ MDAs.

“Management considers the company to be liquid and able to meet its obligations on both a short- and long-term basis.”
“We had no amounts outstanding under our agreement.”

The above observations suggest that a distress dictionary capturing these differences would differentiate bankrupt and non-bankrupt firms.

Table 1: Language models used

Model    Name             Language model
Model 1  LIWC             LIWC
Model 2  LM               LM dictionary
Model 3  Stress           Stress dictionary
Model 4  LIWC_Stress      LIWC + Stress dictionary
Model 5  LM_Stress        LM dictionary + Stress dictionary
Model 6  LIWC_LM_Stress   LIWC + LM dictionary + Stress dictionary
Stress dictionary method.
The dictionary is constructed using all MDAs from 2018. An MDA can contain 5,000 to 10,000 words. This study focuses on the “liquidity and capital requirements” section, reported by most companies. The task is to go through the words and identify the ones that may be red flags for bankruptcy or stress. The general decision criterion in the process is high discriminatory power for identifying financial distress. The list is prepared in two steps:

1. Identification of differentiating words
2. Classification of the words into meaningful categories

Similar to content analysis, which aims to extract information from the text’s tone, this work searches for words that might indicate debt restructuring or distressed business situations. The first step identified 80 candidate words. This list is refined in the next step.
Derivation of dictionary
In the second step, we analyze the candidate list in detail. From the preliminary list of 80 words, we categorize and select 70 words that are consistent with prior literature.
Category 1: Debt: Words used in expressing high indebtedness
Companies deploy debt to take care of working capital and capital expenditure requirements. During normal operations, firms manage debt comfortably. When firms face difficulty in servicing the debt, management discloses the status in the MDA. This communication results in an increased frequency of debt-related words. The following words characterize debt-related sentences: agreement, amendment, borrow, claim, collateral, guarantees, secured. A detailed list is in Appendix A.
Category 2: Distress: Words used by companies close to insolvency
Companies in danger of bankruptcy exhibit several characteristics, and the MDA expresses the same. The expression of these characteristics increases with an approaching need for a bankruptcy filing. Debt covenant violations are necessary precursors to bankruptcy. They serve as early indicators to creditors, signaling potential problems. Most violated covenants correspond to solvency (e.g., interest coverage and leverage), liquidity, and profitability requirements. Managers try to avoid debt covenant default. Other words indicating distress are loss, chapter 11, chapter 7, downgrade, and bankruptcy. We add the following words to this list: covenant, default, breach, violate, amend, restrictive, waiver.
Category 3: Restructure: Words used in restructuring sentences
Managers try to manage distress through various mechanisms. Raising fresh capital, debt restructuring, and selling assets are some of them. All these initiatives can be viewed as balance sheet restructuring activities. The MDA contains sentences explaining the proposed restructuring activities. We add the following restructuring-related words to this list: dispose, recapitalize, restructure, liquidate, alternative.
Category 4: Health: Characteristics of statements describing a healthy state
Firms that are not at risk of bankruptcy express a healthy state of the company in the MDA. These sentences correspond to solvency, profit, retained earnings, and dividend payment. We add the following words to this list: retain, profit, cash, dividend, meet. These four categories are defined as a dictionary and further used for generating the numeric representation of MDAs.
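The four categories above can be represented as a small dictionary object and applied to an MDA. The word sets below contain only the example words named in the text; the full lists are in Appendix A.

```python
# The four stress-dictionary categories, populated with the example words
# given above (the complete word lists live in the appendix).
STRESS_DICTIONARY = {
    "debt": {"agreement", "amendment", "borrow", "claim", "collateral",
             "guarantees", "secured"},
    "distress": {"covenant", "default", "breach", "violate", "amend",
                 "restrictive", "waiver"},
    "restructure": {"dispose", "recapitalize", "restructure", "liquidate",
                    "alternative"},
    "health": {"retain", "profit", "cash", "dividend", "meet"},
}

def category_counts(text):
    """Count stress-dictionary hits per category for one MDA."""
    tokens = text.lower().split()
    return {cat: sum(tok in words for tok in tokens)
            for cat, words in STRESS_DICTIONARY.items()}
```

In the thesis the raw counts are normalized per thousand words, as with the other dictionary-based models.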
We built multiple combinations of language models from the different language models described in the previous section. The final list of language models is shown in Table 1.
Once the documents have been transformed into numeric forms using language models, they are fed into predictive models. The sample is divided into two groups: bankrupt firms and non-bankrupt firms. The outcome is a binary dependent variable. This binary outcome is modeled using logistic regression, similar to a panel logit framework (Altman and Hotchkiss (2010)).
Logistic regression is useful for modeling binary outcomes. It consists of a logistic (logit) function and a binomial distribution. While standard regression can be used to model binary outcomes, the model is not interpretable. The outcome is not bounded, and an ad hoc classification rule is required to translate the output into binary outcomes. Also, the output cannot be converted to probabilities, as in some cases the model will produce estimates outside the [0, 1] bounds. The bounded constraint can be overcome by modeling the odds, i.e., p/(1 − p). A log transform of the odds ensures that probabilities are symmetric around 0.5.

The logistic function (also known as the sigmoid function or inverse logit function) is the critical ingredient of logistic regression:

f(x) = 1 / (1 + e^(−x))

Given log-odds log(p/(1 − p)), the logistic function is the inverse of the log-odds. An equivalent form is g(x) = e^x / (e^x + 1). The logistic function gives an ‘S’-shaped curve that can take any real-valued number (−∞ to +∞) and map it to a value between 0 and 1. This transformation allows modeling a family of relationships between continuous predictors and a binary outcome variable, in this case bankruptcy. The key steps are:

1. Assume that predictors are linearly related to the log-odds
2. Transform the odds to convert to probability
3. Estimate the data likelihood

In this context, the intercept shifts the curve left or right. The slopes make the curve sharper or flatter with respect to the predictors. The logistic function starts at 0, ends at 1, and is symmetric around 0.5. Logistic regression transforms the bankruptcy outcomes so that a linear combination of predictors produces log-odds effects on the bankruptcy. A model coefficient is transformed and interpreted as an odds multiplier. These results are easily interpretable. The logistic regression model used in this study is based on the following mathematical definition.
The bankruptcy variable is coded using 1 and 0:

Y = 1 if bankrupt, 0 if non-bankrupt

The variable of interest, p(x) = P[Y = 1 | X = x], is modeled via logistic regression.

Figure 1: p versus odds(p)

log(p(x) / (1 − p(x))) = β0 + β1 x1 + ... + β(k−1) x(k−1)

This equation is similar to linear regression with k − 1 predictors for a total of k β parameters. The left-hand side is the log-odds: the probability of bankruptcy (Y = 1) divided by the probability of non-bankruptcy (Y = 0). When the odds are 1, both events are equally likely; odds greater than 1 indicate bankruptcy and vice versa.

p(x) / (1 − p(x)) = P[Y = 1 | X = x] / P[Y = 0 | X = x]

Researchers evaluate bankruptcy prediction models using multiple criteria. In this thesis, we use accuracy tables, receiver operating characteristics (ROC) curves, and information content tests. While ROC curves inform forecasting accuracy, sensitivity, and specificity, information content tests evaluate the bankruptcy-related information carried by the distress risk measures. The following sections present each method.
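The model above can be estimated by maximizing the likelihood. Below is an illustrative pure-Python gradient-ascent fit, a sketch only and not the implementation used in this study; the function names are mine:

```python
import math

def sigmoid(z):
    """Inverse of the log-odds: maps any real z to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, y, lr=0.5, epochs=5000):
    """Estimate beta in log(p/(1-p)) = b0 + b1*x1 + ... + b(k-1)*x(k-1)
    by gradient ascent on the Bernoulli log-likelihood."""
    beta = [0.0] * (len(X[0]) + 1)          # intercept plus one slope per predictor
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            err = yi - sigmoid(z)           # residual on the probability scale
            grad[0] += err
            for j, x in enumerate(xi, start=1):
                grad[j] += err * x
        beta = [b + lr * g / len(y) for b, g in zip(beta, grad)]
    return beta
```

On a toy sample where bankruptcy (y = 1) rises with a single feature, the fitted slope is positive, and exp(slope) is the odds multiplier for a one-unit increase in that feature.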
A perfect model classifies all observations accurately; real models make mistakes. One way to evaluate a model's performance is its misclassification rate. Alternatively, the model's accuracy can be used, which measures the proportion of correct classifications.
Figure 2: p versus logodds(p)

Misclassification(Ĉ, Data) = (1/n) Σ_{i=1}^{n} I(y_i ≠ Ĉ(x_i))

where the indicator I(y_i ≠ Ĉ(x_i)) equals 0 if y_i = Ĉ(x_i) and 1 if y_i ≠ Ĉ(x_i).

This measure is not useful on training data: it improves with the number of parameters and hence is biased towards large models, which encourages overfitting. It therefore needs to be computed on test data unseen by the model during training.

Accuracy tables can be further split into a confusion matrix to understand the nature of the misclassifications. The confusion matrix categorizes the classification errors into false negatives and false positives. Setting the classification threshold at 0.5,

Ĉ(x) = 0 ⟺ p̂(x) ≤ 0.5

Predictions can be used to create a confusion matrix as below. The prevalence is

Prevalence = P / Total Obs = (TP + FN) / Total Obs

A reasonable classifier has to outperform a naïve classifier that labels all observations as the majority class. In this work, a model classifying every company as non-bankrupt is the baseline. Apart from accuracy, specificity and sensitivity can be used to evaluate models. Sensitivity is the true-positive rate: higher sensitivity means the model classifies more
Figure 3: logodds(p) versus p

positives correctly, reducing the false negatives. Specificity is the true-negative rate: higher specificity means the classifier labels true negatives correctly, reducing false positives. The formulae are given below.

Sensitivity = True Positive Rate = TP / P = TP / (TP + FN)

Specificity = True Negative Rate = TN / N = TN / (TN + FP)

Both metrics can be computed directly from the confusion matrix.
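These confusion-matrix metrics can be computed in a few lines; a minimal sketch (function and variable names are mine), with 1 = bankrupt as the positive class:

```python
def confusion_metrics(actual, predicted):
    """Accuracy, sensitivity (TPR), and specificity (TNR) from binary labels,
    where 1 = bankrupt (positive) and 0 = non-bankrupt (negative)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(actual),
        "sensitivity": tp / (tp + fn),      # TP / P
        "specificity": tn / (tn + fp),      # TN / N
    }
```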
Relationship between Accuracy, Specificity, and Sensitivity
As we compute specificity and sensitivity from the confusion matrix, different classification thresholds generate multiple sensitivity/specificity values. It is conventional to use a probability of 0.5 as the cutoff. By modifying the cutoff, we can improve the sensitivity or specificity at the expense of overall accuracy; moreover, as sensitivity improves, specificity deteriorates, and vice versa.

Ĉ(x) = 1 if p(x) > c, 0 if p(x) ≤ c

The receiver operating characteristics (ROC) curve is a method to assess the accuracy of a continuous measurement for predicting a binary outcome. It is used extensively in the life sciences. Over the past two decades, it has gained acceptance as a bankruptcy prediction model validation tool (Sobehart and Keenan (2001)).
Figure 4: Confusion Matrix

For a bankruptcy prediction model with a fixed cutoff c, we can compute accuracy metrics and two types of classification errors: false negatives and false positives. In bankruptcy prediction, the model generates a measure of firm distress M, based on the independent variables. This measure is continuous. We assign a classification of 1 (test positive) when M exceeds the fixed threshold c: M > c. For bankruptcy detection with binary outcome B, a good outcome is a classification of 1 among bankrupt companies (B = 1); a bad outcome is a classification of 1 among non-bankrupt companies (B = 0). The true-positive fraction is the probability of a positive classification among the bankrupt firms: TPF(c) = P{M > c | B = 1}. This value is the sensitivity at cutoff c. Similarly, the false-positive fraction is the probability of a bankrupt classification among the non-bankrupt firms: FPF(c) = P{M > c | B = 0}. The ROC curve is the plot of TPF(c) against FPF(c) at various cutoff levels c, with FPF(c) on the x-axis and TPF(c) on the y-axis.

A perfect bankruptcy prediction model, one whose ranking on default probability at each cutoff c matches the ranking of failures, would capture all bankruptcies; it corresponds to a vertical line at FPF = 0. A random bankruptcy prediction model, one whose ranking at cutoff c is uncorrelated with the ranking of failures, would have the same percentage of failures at each cutoff level; it corresponds to a line at 45 degrees to the x-axis. Since we expect a bankruptcy prediction model to be better than a random model, its ROC curve should lie between the perfect and the random model.

To compare two models' predictive ability, we calculate the area under the ROC curve (AUC). Sobehart and Keenan (2001) used the AUC as the decisive indicator of default model accuracy. Information content tests help examine the proposed bankruptcy prediction models.
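The ROC construction described above, sweeping the cutoff c to obtain TPF(c)/FPF(c) pairs and integrating, can be sketched as follows (trapezoidal AUC; illustrative only, names are mine):

```python
def roc_points(scores, labels):
    """TPF/FPF pairs obtained by sweeping the cutoff c over the distress
    measure M (scores); labels use 1 = bankrupt, 0 = non-bankrupt."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = [(1.0, 1.0)]                      # c below every score: everything positive
    for c in sorted(set(scores)):
        tpf = sum(1 for s, l in zip(scores, labels) if s > c and l == 1) / pos
        fpf = sum(1 for s, l in zip(scores, labels) if s > c and l == 0) / neg
        pts.append((fpf, tpf))
    return sorted(pts)

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

A perfectly separating distress measure yields an AUC of 1.0; a random one hovers around 0.5.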
Information content tests evaluate whether bankruptcy prediction models carry more information than another set of variables. Their use has many precedents in bankruptcy prediction. They complement ROC curve analysis because (i) ROC curve analysis provides users with a binary option, but users may not be making such decisions: users of bankruptcy prediction models are often interested in determining credit terms or portfolio weights; and (ii) ROC curve analysis ignores the associated error costs arising from context-specific type I/type II errors.

There are two primary information criteria: the Akaike information criterion (AIC) and the Bayes information criterion (BIC). When models are built on the same data by maximum likelihood, a smaller AIC or BIC indicates a better fit.
Akaike Information Criterion
The AIC is the simpler of the two; it is defined as AIC = −2LL + 2k, in which −2LL is the deviance (described below) and k is the number of predictors in the model. The maximum log-likelihood of a regression model is

log L(β̂, σ̂²) = −(n/2) log(2π) − (n/2) log(RSS/n) − n/2

where β̂ and σ̂² are chosen to maximize the likelihood and RSS = Σ_{i=1}^{n} (y_i − ŷ_i)². From the above, the AIC is derived as the penalty minus twice the log-likelihood:

AIC = −2 log L(β̂, σ̂²) + 2k = 2k + n + n log(2π) + n log(RSS/n)

AIC combines two components of the model: the likelihood, a measure of goodness of fit, and the penalty, proportional to the model size. The likelihood portion of AIC for two models fit on the same dataset is a function of RSS; higher RSS (squared deviation) indicates a poorer fit. A good model has low RSS and low AIC. The penalty component of AIC is 2k, a function of the number of β parameters used in the model: as k increases, AIC increases. A good model with a small AIC balances goodness of fit against the number of parameters.

Bayesian Information Criterion
The BIC is similar to the AIC but adjusts the penalty by the number of cases: BIC = −2LL + k log(n), in which n is the number of cases in the model. This way, BIC picks smaller models for larger sample sizes compared to AIC. For model selection, we choose the model with the smallest BIC.

BIC = k log(n) − 2 log L(β̂, σ̂²) = k log(n) + n + n log(2π) + n log(RSS/n)

The penalty for AIC is 2k, whereas for BIC it is k log(n). For datasets with log(n) > 2 (i.e., n > e² ≈ 7.4), the BIC penalty is higher than the AIC penalty, so BIC prefers smaller models for similar log-likelihoods.

This research work focuses on building bankruptcy prediction models using financial disclosure text features. Statistical analysis has been conducted, and models are built per the methodology described in section 3. This chapter describes the results. It is structured into multiple sub-sections covering descriptive statistics of the linguistic features, their relationship with bankruptcy, model performance, and evaluation.
This section describes the statistical properties of the datasets and features used in this work.
The list of bankruptcies from LoPucki (2006) has more than 1,000 observations. This dataset covers large bankruptcies from 1980 to date. The annual bankruptcy filing trend is given in figure 5. On average, 29 firms filed for bankruptcy per year, with median annual bankruptcies at 25. A maximum of 97 bankruptcies was filed in the year 2001. Recall that this research includes bankruptcies through December 2018.
For the selected bankrupt firms and healthy firms, all available MDAs are transformed into numeric form using three dictionaries, i.e., LIWC, L.M., and the stress dictionary. These linguistic features are averaged at the group level and presented in the table below.
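Transforming an MDA into dictionary features reduces to counting category words as a share of total words. A toy sketch follows; the word lists here are illustrative stand-ins of my own, not actual LIWC, L.M., or stress dictionary entries:

```python
import re

# Hypothetical mini-dictionary; the real stress dictionary is far larger.
STRESS_DICT = {
    "debt": {"debt", "covenant", "loan", "indebtedness"},
    "distress": {"default", "insolvency", "bankruptcy"},
}

def dictionary_features(text, dictionary):
    """Percentage of document words falling into each dictionary category."""
    words = re.findall(r"[a-z]+", text.lower())
    return {
        cat: 100.0 * sum(w in vocab for w in words) / len(words)
        for cat, vocab in dictionary.items()
    }
```

For example, in a seven-word sentence containing "loan", "covenant", and "default", the debt category scores 2/7 of the words and the distress category 1/7, expressed as percentages.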
Figure 5: Number of bankruptcies filed by year

Column "All" documents the summary statistics for all sample firms. WPS and W.C. indicate that the sample firms' MDAs are in general lengthy, with ~10,000 words and 27 words per sentence, i.e., ~400 sentences per MDA. Excluding WPS and W.C., all other values are percentages. Close to 30% of words are complex (Sixltr: 27.93) or functional (function.: 30.57). The sample firms' MDAs are present-focused (focuspresent: 2.69), and their future focus is one-third of the present focus. Per the LIWC classification, on average the MDAs have three times more positive words than negative words (posemo: 2.10, negemo: 0.73). Cognitive-process-related and drives-related words are observed with similar frequency (cogproc: 7.38, drives: 6.62), while social/affect words occur at half that rate (social: 3.37, affect: 2.83). Based on the L.M. dictionary, negative and uncertain words occur at double the frequency of positive words (negative: 1.01, uncertain: 0.96, positive: 0.51). The stress dictionary features indicate that debt words are prevalent at 2.72; in a typical MDA of 10,000 words, this corresponds to about 270 words of debt-related discussion and disclosure. Distress- and restructure-related words occur less frequently, which is expected as they describe infrequent organizational outcomes.

The table's focus is column "Bankrupt," as it illustrates the summary statistics of bankrupt firms. Bankrupt firms are more past-focused. They also use fewer cognitive- and drives-related words. A striking difference is the increase in debt- and distress-related word frequency. They also show increased negative word frequency.

Since we are interested in building predictive models using prior-year filings, it is important to observe how the linguistic features trend for bankrupt companies compared to non-bankrupt companies. Figure 6 shows this.
Table 2: Linguistic features (word percentages)

Feature        All       Bankrupt   Healthy
WPS            26.91     27.35      26.72
WC             10109.29  10164.10   10085.28
Sixltr         27.93     27.87      27.96
Dic            162.26    160.67     162.96
function.      30.57     30.67      30.52
affect         2.83      2.83       2.84
social         3.37      3.29       3.41
cogproc        7.38      7.19       7.47
percept        0.30      0.32       0.29
bio            0.98      0.97       0.98
drives         6.62      6.41       6.70
relativ        10.71     10.70      10.72
AllPunc        12.35     12.77      12.17
focuspast      1.73      1.78       1.70
focuspresent   2.69      2.61       2.73
focusfuture    0.78      0.80       0.78
anger          0.04      0.03       0.04
posemo         2.10      2.12       2.09
negemo         0.73      0.71       0.74
debt           2.72      3.04       2.58
distress       0.24      0.33       0.21
restructure    0.08      0.09       0.07
healthy        0.54      0.55       0.54
negative       1.01      1.07       0.98
positive       0.51      0.48       0.52
uncertainty    0.96      0.91       0.98

Figure 6: Linguistic features evolution
This figure depicts various word-category frequencies for bankrupt companies during the year of bankruptcy and the five prior years. For comparison, the sample non-bankrupt firms' word percentages are plotted over six years, counting back from the latest filing. The values are averaged across bankrupt and non-bankrupt firms.
Notable trends in LIWC features
All LIWC linguistic features for bankrupt firms are lower than for non-bankrupt firms throughout the period. There is a gradual increase in focuspast and focusfuture.
Notable trends in L.M. features
Bankrupt companies have lower uncertain and positive word frequencies throughout the period. Negative words start increasing two years before the bankruptcy.
Notable trends in Stress Dictionary features
The stress dictionary features capture the evolution of distress and bankruptcy. Debt-related word frequency exceeds that of healthy firms four years before bankruptcy and gradually inches up further until the event. Distress-related words remain marginally higher from five years to two years before bankruptcy and increase dramatically after that. Restructure-related word frequency for bankrupt firms is indistinguishable until two years before the bankruptcy; this is expected, as firms do not take up such costly exercises unless the financial distress is unmanageable and covenant default is imminent. There is no change in "healthy" word frequency for either bankrupt or non-bankrupt firms, though bankrupt firms show a lower occurrence throughout the period. Overall, we can observe sufficient differences between bankrupt and non-bankrupt firms.
Figures 7, 8, and 9 show the correlation structure among the LIWC features, the L.M. and stress dictionary features, and selected variables from these three models. Of the LIWC features, a few are highly correlated, i.e., dictionary, functional, social, and drives. All other features have low correlations, indicating that they capture different information. In the stress dictionary, debt and distress show a 0.45 correlation, which is expected; the other variables are uncorrelated. Also, there is no correlation between the L.M. dictionary and stress dictionary features. Finally, selected variables from these three models are checked for correlation: the correlation is insignificant, indicating minimal overlap. This low correlation indicates their complementary nature, and a hybrid model combining these features might perform better than standalone models.

Figure 7: Correlations between LIWC features

Figure 8: Correlations between LM and stress dictionary

Figure 9: Correlations between all selected linguistic features
The following sections explain the results of the experiments conducted to test the hypotheses outlined in the methodology.
From descriptive statistics, we observed that there are distinct qualities that differentiate bankrupt firms from non-bankrupt firms. We set out to test this hypothesis.
Independent t-tests were conducted; the number of bankrupt firms and non-bankrupt firms is 500 each. The 500 bankrupt firms, compared to the 500 non-bankrupt firms, demonstrated significantly higher distress, t(868) = 17.38, p = .00. Bankrupt firms also had significantly higher debt (t(992) = 12.32, p = .00), more negative words (t(997) = 8.28, p = .00), and more restructure words (t(922) = 7.67, p = .00). There was no significant effect for negative emotions (negemo), t(988) = 0.69, p = .62, despite bankrupt firms (M = 0.88, SD = 0.38) attaining higher scores than non-bankrupt firms (M = 0.86, SD = 0.35). Figure 10 shows the details.
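The group comparisons above are independent two-sample t-tests; fractional degrees of freedom such as t(868) with 500 firms per group are consistent with a Welch (unequal-variance) correction. A minimal stdlib sketch, with names of my own:

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's unequal-variance two-sample t statistic and its
    Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)          # sample variances (n - 1 denominator)
    se2 = vx / nx + vy / ny                    # squared standard error of the mean difference
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df
```

When the two groups have equal variances, the Welch degrees of freedom reduce to the usual nx + ny − 2.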
Figure 10: Bankrupt vs. non-bankrupt linguistic features t-test
As part of this hypothesis, a logistic regression model with all LIWC features as independent variables has been fit. Another model with the L.M. features as predictors is built and compared.
Here we review the LIWC logit model; Table 3 presents the model details. We can observe that only a few predictors are significant. This is expected, as the LIWC model captures various aspects of language, and only a few of them can be expected to be affected by distress and potential bankruptcy conditions.
WPS, Dic, function., focuspast, and focusfuture are significant at the 0.001 level. The logistic regression coefficients give the change in the log-odds of the outcome for a one-unit increase in the predictor variable. Here, except for WPS, all predictors are percentages of category words. For every one-unit change in
WPS, the log-odds of bankruptcy (versus non-bankruptcy) increases by 0.08 with 95% CI [0.04, 0.12]. For a one percent increase in focuspast, the log-odds of being bankrupt increases by 1.10 with 95% CI [0.66, 1.55]; for focusfuture, it increases by 1.65 with 95% CI [1.00, 2.31]. Another way to interpret these coefficients is through the odds ratio. The fitted model says that, holding other predictors at fixed values, the odds of bankruptcy for a firm whose disclosure has 1% focusfuture words, relative to a firm with zero percent such words, are exp(1.65) = 5.2; that is, the odds for a firm with more focusfuture words are about 420% higher in percent-change terms. Other predictors significant at the <0.05 level are social, cogproc, and drives, with log-odds of 0.37, −0.34, and 0.30 and 95% CIs [0.10, 0.65], [−0.64, −0.06], and [0.00, 0.60], respectively.
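Converting a logit coefficient into an odds ratio and a percent change, as done above, is just exponentiation; a two-line sketch:

```python
import math

def odds_ratio(coef):
    """Odds multiplier for a one-unit increase in the predictor."""
    return math.exp(coef)

def pct_change(coef):
    """Percent change in the odds for a one-unit increase in the predictor."""
    return (math.exp(coef) - 1.0) * 100.0
```

For the focusfuture coefficient of 1.65, odds_ratio(1.65) gives roughly 5.2, i.e., an increase of about 420% in the odds.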
Table 3: LIWC model coefficients

Predictor      Coefficient  SE    p-value  Lower CI  Upper CI  Odds Ratio
Intercept      2.75         3.34  0.411    -3.92     9.16      15.68
WPS            0.08         0.02  <0.001   0.04      0.12      1.08
WC             0.00         0.00  0.103    0.00      0.00      1.00
Sixltr         -0.07        0.05  0.202    -0.17     0.04      0.93
Dic            -0.16        0.04  <0.001   -0.24     -0.08     0.85
function.      0.60         0.11  <0.001   0.39      0.81      1.82
affect         -0.60        5.21  0.909    -11.50    9.26      0.55
social         0.37         0.14  0.007    0.10      0.65      1.45
cogproc        -0.34        0.15  0.021    -0.64     -0.06     0.71
percept        0.73         0.40  0.070    -0.05     1.52      2.07
bio            -0.05        0.23  0.829    -0.49     0.39      0.95
drives         0.30         0.15  0.049    0.00      0.60      1.35
relativ        -0.06        0.12  0.604    -0.29     0.17      0.94
AllPunc        -0.08        0.04  0.062    -0.16     0.00      0.92
focuspast      1.10         0.23  <0.001   0.66      1.55      3.00
focuspresent   0.06         0.21  0.773    -0.35     0.48      1.06
focusfuture    1.65         0.33  <0.001   1.00      2.31      5.20
anger          0.68         2.18  0.754    -3.57     5.05      1.98
posemo         1.15         5.21  0.825    -8.70     12.05     3.17
negemo         1.34         5.24  0.799    -8.57     12.30     3.81

Table 4: LM model coefficients

Predictor      Coefficient  SE    p-value  Lower CI  Upper CI  Odds Ratio
Intercept      0.56         0.31  0.070    -0.04     1.17      1.75
negative       1.41         0.17  <0.001   1.08      1.76      4.10
positive       -2.88        0.40  <0.001   -3.68     -2.12     0.06
uncertainty    -0.64        0.20  0.002    -1.06     -0.26     0.53
Here we review the L.M. logit model; Table 4 presents the coefficients and confidence intervals. We can observe that all predictors are significant; negative and positive are significant at the 0.001 level. For a one percent increase in negative words, the log-odds of being bankrupt increases by 1.41 with 95% CI [1.08, 1.76]; for positive words, it changes by −2.88 with 95% CI [−3.68, −2.12]. The L.M. model says that, holding other predictors at fixed values, the odds of bankruptcy for a firm whose disclosure has 1% negative words, relative to a firm with zero percent such words, are exp(1.41) = 4.10; that is, the odds for a firm with more negative words are about 310% higher in percent-change terms.

ANOVA indicates that the models are significantly different. Accuracy, BIC, and AIC metrics are given in Table 5; the ROC comparison is shown in figure 13.

Table 5: LIWC and LM model comparison

Model  Training Accuracy  Test Accuracy  LogLik   AIC     BIC      AUC   Deviance  Parameters
LIWC   0.69               0.63           -470.61  981.22  1074.92  0.72  941.22    19
LM     0.68               0.74           -489.88  987.75  1006.49  0.78  979.75    3
Figure 11: LIWC model ROC

We can observe that for the L.M. model, while the BIC is lower than the LIWC model's, the AIC is higher. Recall that we noted in section 3.5.3 that for sample sizes >100, BIC will prefer smaller models for similar log-likelihoods. The out-of-sample forecasting performance represented in the
Test Accuracy column indicates that the L.M. model provides 10% higher accuracy. Also, the ROC is better for L.M. Overall, while the LIWC model captures more information, probably owing to its many parameters, the L.M. model's predictive performance is better.
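The AIC and BIC values in Table 5 follow mechanically from the reported log-likelihoods and parameter counts (plus an intercept). A sketch of the two criteria; the training sample size n = 800 used in the example is my inference from the reported values, not a figure stated in the text:

```python
import math

def aic(loglik, k):
    """Akaike information criterion: deviance (-2 log L) plus a 2k penalty."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayes information criterion: deviance plus a k*log(n) penalty."""
    return -2.0 * loglik + math.log(n) * k
```

For the LIWC model (log-likelihood −470.61, 19 slopes plus an intercept, so k = 20), aic(-470.61, 20) reproduces the reported 981.22, and with an assumed training size of n = 800, bic(-470.61, 20, 800) falls within rounding of the reported 1074.92.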
Figure 12: LM model ROC
Here we review the stress dictionary logit model; Table 6 presents the coefficients and confidence intervals. We observe that debt, distress, and restructure are significant at the 0.001 level. For a one percent increase in distress words, the log-odds of being bankrupt increases by 5.03 with 95% CI [3.98, 6.15]; for debt and restructure, it increases by 0.36 and 2.96 with 95% CIs [0.19, 0.54] and [1.45, 4.54], respectively. Most importantly, as per this model, holding other predictors at fixed values, the odds of bankruptcy for a firm whose disclosure has 1% distress words, compared to a firm with zero percent such words, are exp(5.03) = 153.66. This high odds ratio indicates that the distress word percentage is a highly sensitive indicator of forthcoming bankruptcy. The ROC comparison is shown in figure 15. Overall, we can observe that the stress model is better than the L.M. model on the BIC and ROC criteria; its test performance is also better.
Figure 13: LIWC and LM ROC comparison

Table 6: Stress dictionary model coefficients

Predictor    Coefficient  SE    p-value  Lower CI  Upper CI  Odds Ratio
Intercept    -3.36        0.40  <0.001   -4.16     -2.59     0.03
debt         0.36         0.09  <0.001   0.19      0.54      1.44
distress     5.03         0.55  <0.001   3.98      6.15      153.66
restructure  2.96         0.79  <0.001   1.45      4.54      19.39
healthy      0.23         0.38  0.5      -0.51     0.98      1.26

Table 7: Stress and LM model comparison

Model           Training Accuracy  Test Accuracy  LogLik   AIC     BIC      AUC   Deviance  Parameters
LM              0.68               0.74           -489.88  987.75  1006.49  0.78  979.75    3
Stress_Diction  0.72               0.79           -428.09  866.18  889.61   0.86  856.18    4
Figure 14: Stress dictionary model ROC
Considering the observation that the correlation between the LIWC, L.M., and stress dictionary features is low, we can take advantage of their complementary nature. Three combination models with combined inputs have been fitted on the dataset: LIWC + Stress, L.M. + Stress, and LIWC + L.M. + Stress. The model coefficients are presented in appendix B; the performance results are shared below.
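Building a combination model amounts to concatenating the per-dictionary feature vectors before fitting; a minimal sketch with hypothetical feature names:

```python
def combine_features(*feature_sets):
    """Merge per-dictionary feature dicts into one predictor vector,
    prefixing names to avoid collisions across dictionaries."""
    combined = {}
    for name, feats in feature_sets:
        for key, value in feats.items():
            combined[f"{name}_{key}"] = value
    return combined
```

For example, combine_features(("lm", {"negative": 1.07}), ("stress", {"debt": 3.04})) yields one merged vector with keys "lm_negative" and "stress_debt", which can then be passed to a single logistic regression.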
Figure 15: LM and stress dictionary model ROC comparison
Combination models
Table 8: Dictionary models AUC comparison

Model           AUC
LIWC            0.72
LM              0.78
Stress_Diction  0.86
LIWC_Stress     0.86
LM_Stress       0.87
LIWC_LM_Stress  0.87

Figure 16: LIWC and stress dictionary model ROC
Figure 17: LM and stress dictionary model ROC
This subsection reviews the dictionary-based models. We have evidence that there is incremental performance improvement as additional features are incorporated into the model. A comparison of model performance is given below.
This work provides the first comprehensive test of disclosure-text, dictionary-based bankruptcy prediction models. For the dictionary-based models, I apply the LIWC dictionary (Pennebaker et al. (2015)), the Loughran-McDonald dictionary (Loughran and McDonald (2011)), and a custom dictionary developed as part of this work. To test the models' performance, I use receiver operating characteristics (ROC) curves, information content tests, and accuracy metrics. The ROC curve analysis demonstrated that all dictionary-based bankruptcy prediction models have greater forecasting accuracy than a random model and that the composite models perform better than their individual language models. The information content tests provide evidence that all models carry significant bankruptcy-related information.
Figure 18: LIWC, LM and stress dictionary model ROC comparison

Table 9: Dictionary models comparison

Model           Training Accuracy  Test Accuracy  LogLik   AIC     BIC      AUC   Deviance  Parameters
LIWC_LM_Stress  0.78               0.80           -366.95  787.89  914.38   0.87  733.89    26
LIWC_Stress     0.77               0.79           -374.89  797.78  910.21   0.86  749.78    23
LM_Stress       0.74               0.80           -410.58  837.17  874.64   0.87  821.17    7
Stress_Diction  0.72               0.79           -428.09  866.18  889.61   0.86  856.18    4
LIWC            0.69               0.63           -470.61  981.22  1074.92  0.72  941.22    19
LM              0.68               0.74           -489.88  987.75  1006.49  0.78  979.75    3
Figure 19: Bag-of-words dictionary models ROC comparison
In this chapter, I summarize the contributions of the current study to text analysis in finance, present the research’sobjectives and findings in the context of previous research, and suggest appropriate future research directions.
This study constitutes an exploration of knowledge extraction from narrative corporate report sections using text analysis. The aims are:

1. To establish whether linguistic features of disclosures can explain firm attributes in a financial analysis context.
2. To determine which language models perform better at capturing information.
3. Specifically, to predict bankruptcy based on the management's discussion and analysis in annual filings.

Knowledge in a public firm's context involves information that can influence organizational outcomes and future stock performance. As managers have an information advantage over the public, their narrative disclosures have significant information content over and above the quantitative financial measures. This information helps in understanding the firm's current financial status, its ability to continue operations without hindrance, the kinds of risks it is exposed to, the strategic and tactical interventions the management is undertaking to overcome challenges and capture opportunities, and its capital allocation plans. This knowledge gives more in-depth insight into the firm's prospects.
In the context of this thesis, knowledge extraction is studied in the form of predicting adverse organizational outcomes, specifically (1) bankruptcy prediction, i.e., predicting whether a firm will file for chapter 7 or 11 within one year after the annual filing date, (2) using the management's discussion and analysis section of the annual filing (10-K).

For this purpose, I employed a new measurement technique based on content analysis and research, namely a stress score based on the number of financial stress words per thousand words. I built text-feature-based bankruptcy prediction models on the LIWC dictionary, the L.M. dictionary, and the stress dictionary. The text-feature-based bankruptcy predictors introduced in this study are used in accounting research for the first time, and they address the numbers-bias concerns inherent in traditional approaches. Scoring stress using linguistic markers is also a new approach to measuring financial distress. It is based on the linguistic characteristics management displays when explaining current liquidity challenges and its attempts to overcome them through debt extensions, new financing, and asset restructuring. Such explanations result in increased frequencies of words related to covenants, modified loan agreements, restructuring, new financing activities, uncertainty about the firm's ability to raise funds, asset sales, and capital expense reduction in a distressed corporate reporting context. It may also result in management attempting to present a rosy image of prospects to outsiders, inconsistent with management's own perception of the firm and its performance.
Bankruptcy prediction has been an active research topic for accounting researchers for decades. With improved awareness of financial ratios' shortcomings and the availability of text analysis tools, researchers have explored incorporating textual features into bankruptcy prediction models. Hájek and Olej (2015) studied various word categories from corporate annual reports and showed that the language used by bankrupt companies shows stronger tenacity, accomplishment, familiarity, present concern, exclusion, and denial. They built prediction models combining both financial indicators and word categorizations as input variables. Working on U.S. banks, Gandhi, Loughran, and McDonald (2017) used disclosure text sentiment as a proxy for bank distress. Other notable works using text analysis for bankruptcy prediction are Yang, Dolar, and Mo (2018) and Mayew, Sethuraman, and Venkatachalam (2015). Yang, Dolar, and Mo (2018) used high-frequency words from the MDA and compared the differences between bankrupt and non-bankrupt companies. Mayew, Sethuraman, and Venkatachalam (2015) also analyzed the MDA with a focus on going-concern opinions. They found that the disclosure's predictive ability is incremental to financial ratios, market-based variables, and even the auditor's going-concern opinion, and extends to three years before the bankruptcy.

As we can observe, prior work focused on the marginal information content in the text. While researchers concluded that narratives have information content and predictive power, the limits and extent of that information were not tested. This thesis tests them and demonstrates that the information content is sufficient to predict bankruptcy, independent of any financial and quantitative metrics. Prior work in disclosure text analysis focused on simple text measures like readability, sentiment, and tone. This limitation was probably motivated by the intent to use them as marginal predictors alongside financial ratios; the available language model methods were also a limitation.
Only limited organizational outcomes can be explained by shallow language models that capture marginal information from disclosure text. My work has demonstrated that interpretable and accurate predictions can be made with task-specific dictionaries.
This work demonstrates that textual disclosures, independent of financial ratios, have predictive power. Further, by way of task-independent language models, it enables multiple tasks to be solved with the same set of features, i.e., language features. With a sufficiently large dataset containing hundreds of samples, researchers can build reliable predictive models quickly.

Another implication is text-based soft metrics. Investors are interested in knowing firm performance on corporate responsibility, climate change, and ethical business practices, and it is not easy to measure these attributes using accounting metrics. Capturing and reporting new metrics would involve significant capital expenditure for firms; instead, firms can report them through narrative disclosures, and investors can extract them using the methods shown in this work. Finally, text-metric-based factors and factor investment are a possibility under this approach. Lopez Lira (2019) used text-based analysis to measure firm risk exposure and built factor models with such risk portfolios; these models explain cross-sectional returns, suggesting internal validity. Similar portfolios on other dimensions, like fraud and climate exposure, can be explored using this thesis's approach.
Like other empirical studies in finance and language processing, the results presented in this thesis have some limitations. Due to resource constraints on downloading, processing, and storing extensive text data, knowledge extraction from financial disclosures has been attempted on the single task of bankruptcy prediction, using different methodologies. I also restricted the text content to one type of corporate narrative document (the Management's Discussion and Analysis). For this reason, caution is needed in generalizing the results.

Since the text analysis is restricted to the surface structure of language, it is impossible to say whether the extracted signal is a true reflection of the management's statement. What is more, disclosure changes can result from managerial interventions and restructuring activities, e.g., raising capital, asset sell-offs, or cost reductions. These interventions can improve performance, and the distressed firm might show better market performance, hence avoiding bankruptcy.

I use the EDGAR filing date as the time stamp for a filing. If a bankruptcy filing happens within one year of such a filing date, that filing is used to compute the linguistic features subsequently used as predictors. Any actions that management takes after filing, which can alter firm stress levels, are not captured. This limitation is inherent in using disclosure data.
The majority of prior text analysis research in the finance context focuses solely on sentiment analysis and does not address direct knowledge extraction. In particular, significant effort has been deployed in linking sentiment and tone to subsequent performance and fraud. Moreover, the extracted information, i.e., the sentiment score, is used only as an additional input to existing quantitative models. I see four broad questions that need to be addressed. Given that management has an information advantage about firms:

1. What knowledge can be extracted from the management's textual disclosures?
2. Which of the firm's future states can textual analysis of disclosures explain or predict?
3. Which language and document models facilitate fast and reliable information extraction?
4. How can investors incorporate this information into their decision-making process?

These research questions have received relatively little attention compared with efforts to measure sentiment. The improving affordability of data science tools is making unstructured analysis easier, and wider adoption of unstructured analysis will allow researchers to focus more on these four questions.
My work demonstrates that textual disclosures, independent of financial ratios, have predictive power. This observation raises the question: which of the available financial and accounting metrics can be replaced with more reliable text-based metrics? Text-based metrics cannot possibly contain all the information in accounting metrics, so it is critical to understand the limits of such information as well as its validity.

References

Altman, Edward I. 1968. "Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy." The Journal of Finance 23 (4): 589–609.
Altman, Edward I., and Edith Hotchkiss. 2010. Corporate Financial Distress and Bankruptcy: Predict and Avoid Bankruptcy, Analyze and Invest in Distressed Debt. Vol. 289. John Wiley & Sons.
Amel-Zadeh, Amir, and Jonathan Faasse. 2016. "The Information Content of 10-K Narratives: Comparing MD&A and Footnotes Disclosures." https://doi.org/10.2139/ssrn.2807546.
Ball, Christopher, Gerard Hoberg, and Vojislav Maksimovic. 2012. "Redefining Financial Constraints: A Text-Based Analysis." SSRN Electronic Journal. https://doi.org/10.2139/ssrn.1923467.
Beams, Joseph, and Yun Chia Yan. 2015. "The Effect of Financial Crisis on Auditor Conservatism: US Evidence." Accounting Research Journal 28 (2): 160–71. https://doi.org/10.1108/ARJ-06-2013-0033.
Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, and Akitaka Matsuo. 2018. "Quanteda: An R Package for the Quantitative Analysis of Textual Data." Journal of Open Source Software. https://doi.org/10.21105/joss.00774.
Bodnaruk, Andriy, Tim Loughran, and Bill McDonald. 2013. "Using 10-K Text to Gauge Financial Constraints." SSRN 50 (4): 623–46. https://doi.org/10.2139/ssrn.2331544.
Bourveau, Thomas, Yun Lou, and Rencheng Wang. 2018. "Shareholder Litigation and Corporate Disclosure: Evidence from Derivative Lawsuits." Journal of Accounting Research 56 (3): 797–842. https://doi.org/10.1111/1475-679X.12191.
Enev, Maria. 2017. "Going Concern Opinions and Management's Forward Looking Disclosures: Evidence from the MD&A." https://doi.org/10.2139/ssrn.2938703.
Feldman, Ronen, Suresh Govindaraj, Joshua Livnat, and Benjamin Segal. 2008. "The Incremental Information Content of Tone Change in Management Discussion and Analysis." https://doi.org/10.2139/ssrn.1126962.
———. 2010. "Management's Tone Change, Post Earnings Announcement Drift and Accruals." Review of Accounting Studies 15 (4): 915–53. https://doi.org/10.1007/s11142-009-9111-x.
Fisher, Ingrid E., Margaret R. Garnsey, and Mark E. Hughes. 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research." Intelligent Systems in Accounting, Finance and Management 23 (3): 157–214.
Gandhi, Priyank, Tim Loughran, and Bill McDonald. 2019. "Using Annual Report Sentiment as a Proxy for Financial Distress in US Banks." Journal of Behavioral Finance 20 (4): 424–36.
———. 2017. "Using Annual Report Sentiment as a Proxy for Financial Distress in U.S. Banks." SSRN, March, 1–13. https://doi.org/10.2139/ssrn.2905225.
Hájek, Petr, and Vladimír Olej. 2015. "Word Categorization of Corporate Annual Reports for Bankruptcy Prediction by Machine Learning Methods." In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9302:122–30. https://doi.org/10.1007/978-3-319-24033-6_14.
Hillegeist, Stephen A., Elizabeth K. Keating, Donald P. Cram, and Kyle G. Lundstedt. 2004. "Assessing the Probability of Bankruptcy." Review of Accounting Studies.
Journal of Financial Economics 106 (3): 614–34.
Lopatta, Kerstin, Mario Albert Gloger, and Reemda Jaeschke. 2017. "Can Language Predict Bankruptcy? The Explanatory Power of Tone in 10-K Filings." Accounting Perspectives 16 (4): 315–43. https://doi.org/10.1111/1911-3838.12150.
Lopez Lira, Alejandro. 2019. "Risk Factors That Matter: Textual Analysis of Risk Disclosures for the Cross-Section of Returns." https://doi.org/10.2139/ssrn.3313663.
LoPucki, Lynn M. 2006. "Bankruptcy Research Database."
Loughran, Tim, and Bill McDonald. 2009. "Plain English, Readability, and 10-K Filings."
Loughran, Tim, and Bill McDonald. 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks." Journal of Finance 66 (1). https://doi.org/10.1111/j.1540-6261.2010.01625.x.
Mayew, William J., Mani Sethuraman, and Mohan Venkatachalam. 2015. "MD&A Disclosure and the Firm's Ability to Continue as a Going Concern." Accounting Review 90 (4): 1621–51. https://doi.org/10.2308/accr-50983.
Nguyen, Ba-Hung, and Van-Nam Huynh. 2020. "Textual Analysis and Corporate Bankruptcy: A Financial Dictionary-Based Sentiment Approach." Journal of the Operational Research Society, 1–20.
Pennebaker, James W., Ryan L. Boyd, Kayla Jordan, and Kate Blackburn. 2015. "The Development and Psychometric Properties of LIWC2015."
Pennebaker, James W., Martha E. Francis, and Roger J. Booth. 2001. "Linguistic Inquiry and Word Count: LIWC 2001." Mahwah: Lawrence Erlbaum Associates 71 (2001): 2001.
Rajan, Uday, Amit Seru, and Vikrant Vig. 2015. "The Failure of Models That Predict Failure: Distance, Incentives, and Defaults." Journal of Financial Economics 115 (2): 237–60.
Shirata, Cindy Yoshiko, Hironori Takeuchi, Shiho Ogino, and Hideo Watanabe. 2011. "Extracting Key Phrases as Predictors of Corporate Bankruptcy: Empirical Analysis of Annual Reports by Text Mining." Journal of Emerging Technologies in Accounting. https://doi.org/10.2308/jeta-10182.
Sobehart, Jorge, and Sean Keenan. 2001. "Measuring Default Accurately." Risk 14 (3): 31–33.
Tao, Jie, Amit V. Deokar, and Ashutosh Deshmukh. 2018. "Analysing Forward-Looking Statements in Initial Public Offering Prospectuses: A Text Analytics Approach." Journal of Business Analytics. https://doi.org/10.1080/2573234x.2018.1507604.
Tinoco, Mario Hernandez, and Nick Wilson. 2013. "Financial Distress and Bankruptcy Prediction Among Listed Companies Using Accounting, Market and Macroeconomic Variables." International Review of Financial Analysis 30: 394–419.
Wu, Yanhui, Clive Gaunt, and Stephen Gray. 2010. "A Comparison of Alternative Bankruptcy Prediction Models." Journal of Contemporary Accounting & Economics.
Journal of Emerging Technologies in Accounting 15 (1): 45–55. https://doi.org/10.2308/jeta-52085.