Bankruptcy prediction using disclosure text features
A PREPRINT
Sridhar Ravula
Department of Analytics
Harrisburg University of Science and Technology
Harrisburg, PA 17101
[email protected]
January 5, 2021

ABSTRACT
A public firm’s bankruptcy prediction is an important financial research problem because of the security price downside risks. Traditional methods rely on accounting metrics that suffer from shortcomings like window dressing and retrospective focus. While disclosure text-based metrics overcome some of these issues, current methods excessively focus on disclosure tone and sentiment. There is a requirement to relate meaningful signals in the disclosure text to financial outcomes and to quantify the disclosure text data. This work proposes a new distress dictionary based on the sentences used by managers in explaining financial status. It demonstrates the significant differences in linguistic features between bankrupt and non-bankrupt firms. Further, using a large sample of 500 bankrupt firms, it builds predictive models and compares the performance against two dictionaries used in financial text analysis. This research shows that the proposed stress dictionary captures unique information from disclosures and that the predictive models based on its features have the highest accuracy.

Keywords Bankruptcy · Distress · NLP · bag-of-words · Disclosures · Machine learning · EDGAR · Text analysis

Investors and analysts place great emphasis on security analysis and valuation because of the potential excess returns on capital and the downside risks. Research in this domain is potentially valuable because market inefficiencies can result in volatility and crashes, costing the economy billions of dollars. Analysts extensively use public firms’ disclosures as a source of information. Investors are keen on knowing the health of the firms in which they may invest in the future. A firm in financial distress loses a significant amount of its shareholders’ value. If the management cannot tide over the crisis, the firm may have to file for bankruptcy, resulting in a 50% to 80% loss of capital for shareholders and lenders.
Financial distress and bankruptcy prediction is an actively researched field. Once a company is unable to come out of distress, it will become insolvent. Insolvency is the state in which the company is not capable of honoring some commitment. Lenders and claim holders can force the insolvent company to discontinue operations. Management files for bankruptcy protection to recover from such a situation or to liquidate the firm in an orderly manner. Bankruptcy prediction has been an active research topic for accounting researchers over decades. One of the pioneering works, Altman (1968), proposed the ‘Z-score’ model.

Investors and analysts traditionally depended on quantitative information like accounting metrics for decision making. Multiple attributes of these accounting metrics drove this trend. FACC and accounting standards laid out what variables are to be measured and disclosed. Gathering, processing, and analyzing these quantitative metrics was easy. Many free and commercial data providers automated the data gathering and published these metrics. However, these metrics do not always reveal the firm’s current status and are not a good indicator of the future. They suffer from shortcomings like window dressing and retrospective focus.

Evidence exists for window dressing through commissions and omissions. Rajan, Seru, and Vig (2015) showed that banks did not report information regarding the deteriorating quality of borrowers’ disclosures in the run-up to the subprime crisis. Huizinga and Laeven (2012) said that banks overstated the value of their distressed real estate assets and regulatory capital. Window dressing, retrospective focus, and missing variables impact models based on accounting metrics. Regulators and investors who rely on such models have been impacted adversely in the past due to model failures (Rajan, Seru, and Vig (2015)).

Another approach to bankruptcy prediction is using market-based information. Classical efficient market theory and later option pricing theories assume that all available information is reflected in market prices. Under those conditions, accounting-based metrics do not have additional information over and above market prices. More specifically, a suitable market-based measure will reflect all available information about bankruptcy probability. Hillegeist et al. (2004) developed a prediction model based on market information, using implied volatility derived from option pricing theory. This model outperformed the Altman (1968) Z-score model. Subsequently, numerous attempts have been made to replicate these results. Wu, Gaunt, and Gray (2010) provide a comparison of accounting- and market-based models, along with others. They conclude that the Hillegeist et al. (2004) model performs better than the Z-score model but is inferior to models that include non-traditional metrics. Similarly, Tinoco and Wilson (2013) concluded that accounting-metrics-based models and market-based models are complementary. Hence, researchers started paying more attention to alternative approaches like textual analysis of disclosures.
Management disclosures have narrative content that contains important information. This information can explain many firm attributes and organizational outcomes, and text analysis methods can extract it. Prior works have attempted to incorporate text features into accounting-based predictive models. However, standalone text-feature-based prediction models have not been attempted. There is a need to understand how much information can be extracted from disclosure texts and how useful such information is in predicting bankruptcy. This work addresses that gap.
Numerous researchers have tried to explain various firm attributes using disclosure narratives. Some analyzed the MDA to explain future stock performance (Tao, Deokar, and Deshmukh (2018)), future returns, volatility, and firm profitability (Amel-Zadeh and Faasse (2016)), bankruptcy (Yang, Dolar, and Mo (2018)), going-concern (Mayew, Sethuraman, and Venkatachalam (2015), Enev (2017)), litigation risk (Bourveau, Lou, and Wang (2018)), and incremental information over earnings surprises, accruals, and operating cash flows (OCF) (Feldman et al. (2008), Feldman et al. (2010)). Researchers have also attempted to incorporate text features into distress and bankruptcy predictive models. Below is a brief review of the same.

Auditors express going-concern opinions based on the firm’s obligations and liquidity. Financial disclosures include these opinions. A change in such disclosures can act as a signal to identify distress. However, auditors do respond to external financial markets. Beams and Yan (2015) examined the financial crisis’s effect on auditor going-concern opinions and concluded that the financial crisis led to increased auditor conservatism. A going-concern opinion in disclosures is associated with the number of forward-looking disclosures and their ambiguity. Enev (2017) observed that while the absolute number of forward-looking disclosures is lower for companies receiving a going-concern opinion, the proportion of forward-looking disclosures in the MDA is higher in the presence of a going-concern opinion. The results suggest generally improved forward-looking disclosures in the MDA when companies receive a going-concern opinion from their auditor.

One consequence of distress is financial constraints. Firms undergo reduced cash flows during stress, which results in liquidity events like dividend omissions or increases, equity recycling, and underfunded pensions. Analysts measure the extent of financial constraints to assess the capital structure.
Bodnaruk, Loughran, and McDonald (2013) used a constraining-words-based lexicon to measure the same. These measures have a low correlation with traditional financial constraints measures and predict subsequent liquidity events better. Ball, Hoberg, and Maksimovic (2012) used text in firms’ 10-Ks to measure investment delays due to financial constraints. They found that the fundamental limitations are the financing of R&D expenditures rather than capital expenditures, and that the main challenge for firms is raising equity capital to fund growth opportunities. These text-based measures predict investment cuts following the financial crisis better than other indices of financial constraints used in the literature.

Most prior bankruptcy prediction models were developed using financial ratios. However, signs of distress may appear in nonfinancial information earlier than in changes in the financial ratios. Current distress measures tend to miss extreme events, especially in the banking sector (Gandhi, Loughran, and McDonald (2017)). In recent years, qualitative information and text analysis have become necessary because frequent changes in accounting standards have made it difficult to compare financial numbers between years (Shirata et al. (2011)). Mayew, Sethuraman, and Venkatachalam (2015) stressed the importance of linguistic tone in assessing a firm’s health. Using a sample of bankrupt firms between 1995 and 2012, they concluded that management’s opinion about going-concern and the MDA’s linguistic tone together predict whether a firm will go bankrupt.

The language used by future bankrupt companies differs from that of non-bankrupt companies. Hájek and Olej (2015) studied various word categories from corporate annual reports and showed that the language used by bankrupt companies shows stronger tenacity, accomplishment, familiarity, present concern, exclusion, and denial. Bankrupt companies also use more modal, positive, uncertain, and negative language. They built prediction models combining both financial indicators and word categorizations as input variables. This differential language usage is also observed in non-English firms’ disclosures. Shirata et al. (2011) analyzed the sentences in Japanese financial reports to predict bankruptcy. Their research revealed that the co-occurrence of the words “dividend” or “retained earnings” in a section distinguishes bankrupt companies from non-bankrupt companies.

Working on U.S. banks, Gandhi, Loughran, and McDonald (2017) used disclosure text sentiment as a proxy for bank distress. They found that a more negative sentiment in the annual report is associated with larger delisting probabilities, lower odds of paying subsequent dividends, higher subsequent loan loss provisions, and lower future return on assets. Similarly, Lopatta, Gloger, and Jaeschke (2017) concluded that firms at risk of bankruptcy use significantly more negative words in their 10-K filings than comparable vital companies. This relationship holds up until three years before the actual bankruptcy filing.
Other notable works using text analysis for bankruptcy prediction were Yang, Dolar, and Mo (2018) and Mayew, Sethuraman, and Venkatachalam (2015). Yang, Dolar, and Mo (2018) used high-frequency words from the MDA and compared the differences between bankrupt and non-bankrupt companies. Mayew, Sethuraman, and Venkatachalam (2015) also analyzed the MDA with a focus on going-concern opinions. They found that both management’s opinion about “going-concern” reported in the MDA and the MDA’s linguistic tone together provide significant explanatory power in predicting whether a firm will cease as a going concern. Also, the predictive ability of disclosure is incremental to financial ratios, market-based variables, and even the auditor’s going-concern opinion, and extends to three years before the bankruptcy.

Most of the prior works focused on disclosure sentiment as an incremental predictor for bankruptcy prediction. However, disclosure text contains significantly more information than sentiment, and there is a need to extract and test its predictive power. To this end, this quantitative correlation study evaluates the differences in linguistic features between healthy and bankrupt disclosure texts. Further, predictive models are built to assess the information content and predictive power. The next section outlines the methods.
The prior sections have reviewed the literature and identified the gaps in the text analysis of finance. As bankruptcy is a significant organizational outcome for investors, this thesis focuses on the bankruptcy prediction task. To this end, this quantitative correlation study evaluates the differences in linguistic features between healthy and bankrupt disclosure texts. Further, predictive models are built to assess the information content and predictive power. This section describes the framework, data analysis, and methodology used.

To summarize, this thesis has four key components.
Text source: Management Discussion and Analysis (MDA) from 10-K disclosures.
Task: Bankruptcy prediction based on the prior-year MDA.
Sample size: Balanced sample with 500 bankrupt and 500 non-bankrupt disclosures.
Language models: Multiple, as described in later parts of this chapter.
The methods section consists of four sub-sections covering data, language models, predictive models, and assessment criteria.
In this section, sample selection and data collection methods are described. This work aims to extract knowledge from financial disclosure text and use it for predictive tasks. It considers publicly listed companies in the U.S. as the population. From 1994 to 2019, over 16,000 individual companies filed annual disclosures with the SEC. New companies get listed on exchanges through initial public offerings or corporate spin-offs. Companies are delisted due to mergers, acquisitions, and bankruptcies. As a result, there are ~8,000 listed public companies in the U.S. in 2019.
This work focuses on bankruptcy prediction using disclosure text characteristics. So, two samples are critical: a list of bankrupt firms and a list of non-bankrupt firms.
A critical component of this study is to identify firms that went bankrupt. This work uses the list of bankrupt companies from the UCLA-LoPucki Bankruptcy Research Database (BRD) maintained by LoPucki (2006). UCLA School of Law collects, updates, and disseminates this data. The dataset contains more than one thousand large public companies that have filed bankruptcy cases since October 1, 1979. The BRD defines a public company as a firm that filed an annual report (Form 10-K or Form 10) with the SEC for a year ending not less than three years before filing the bankruptcy case. The BRD considers all firms with more than $100 million in assets in annual reports as “large.” Assets are measured in 1980 constant dollars (about $3.1 in current dollars). Both Chapter 7 and Chapter 11 cases are included in the bankruptcy list, whether filed by the debtors or creditors. From this list, bankruptcies before 1994 are excluded. Since EDGAR maintains online disclosures from 1994 onwards, it was convenient to extract those filings. The exclusion of prior bankruptcies results in a new list of ~900 bankrupt companies. Around 7,000 corresponding firm filings exist in EDGAR. Companies without at least one prior-year 10-K filing are excluded from the list. Finally, the Management Discussion and Analysis sections are extracted from these filings. A minimum threshold of 100 words is used to filter out non-informative MDAs. This filtering resulted in a sample of 500 company filings one year before bankruptcy.
The list of non-bankrupt firms is identified by starting with the S&P 1000 list and excluding companies with a bankruptcy history. The net result is 980 firms. Around 16,000 filings exist for all these firms.
Since annual bankruptcy incidence is less than 0.5%, the number of filings one year prior to bankruptcy is very low compared to non-bankrupt filings. Hence, a balanced experiment design with an equal number of bankrupt and non-bankrupt disclosures in the sample is used. Five hundred non-bankrupt filings are randomly chosen from the pool of non-bankrupt filings.
The method for downloading the annual filings has the following components.
From 1993 to December 2018, companies filed ~20 million records on EDGAR. For ease of access, the SEC releases quarterly master indices listing the filings on EDGAR. This list has ~220,000 annual (10-K) filings relevant to this thesis. Custom R and Python scripts downloaded these 10-K documents programmatically.
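The quarterly index retrieval described above can be sketched as below. The `master_index_urls` helper name and the year range are illustrative; EDGAR publishes one `master.idx` file per quarter under its full-index path.

```python
# Sketch of building EDGAR quarterly master-index URLs. Each quarter of
# each year has one master.idx file listing all filings for that quarter.
def master_index_urls(start_year, end_year):
    """Return URLs of EDGAR quarterly master indices for the given years."""
    base = "https://www.sec.gov/Archives/edgar/full-index"
    return [
        f"{base}/{year}/QTR{qtr}/master.idx"
        for year in range(start_year, end_year + 1)
        for qtr in range(1, 5)
    ]

# Two years of indices: 2 years x 4 quarters = 8 index files
urls = master_index_urls(1994, 1995)
```

Each index file can then be parsed for 10-K entries and the referenced documents downloaded in bulk.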
The text version of a filing on SEC is a collection of all files in a submission. These include HTML, exhibits, JPG files, and XBRL files. Only a fraction of the text file size contains actual text. ASCII-encoded PDFs, graphics, XLS, or other binary files can contribute most of a filing document’s size. The next processing step removed all non-text content from disclosure documents, following Loughran and McDonald (2009). These cleaned filings are stored in text format.
For bankruptcy prediction, the Management Discussion and Analysis (MDA) section is the source of text features. Management teams discuss the current firm status and expected outcomes in the MDA section. A Python script extracted all the text between “ITEM 7” and “ITEM 7A”. Regular expressions and combinations of these phrases are used to identify the maximum number of MDAs from 10-K files. In some disclosures, the MDA section is “incorporated by reference,” referring to the shareholders’ annual report. The thesis included MDA material from the body of the primary document. Also, it discarded all MDAs with fewer than 100 words. Subsequent sections explain the generation of these texts’ numeric representations using dictionary-based parsing or word embeddings.
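A minimal sketch of this extraction step, assuming a plain-text filing. Real filings use many spelling variants of the item headers, so production code needs a larger set of patterns than the single regex shown here.

```python
import re

# Grab the text between the "ITEM 7" and "ITEM 7A" markers, then discard
# sections that are too short to be informative (the 100-word threshold
# from the text). [^A] prevents the pattern from matching "ITEM 7A" itself.
MDA_RE = re.compile(r"ITEM\s*7[^A].*?(?=ITEM\s*7\s*A)",
                    re.IGNORECASE | re.DOTALL)

def extract_mda(filing_text, min_words=100):
    """Return the MDA section, or None if absent or under min_words words."""
    match = MDA_RE.search(filing_text)
    if match is None:
        return None
    mda = match.group(0)
    return mda if len(mda.split()) >= min_words else None
```

MDAs that are merely “incorporated by reference” typically leave only a short stub between the markers, so the word-count filter removes them as well.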
This subsection explains the dependent and independent variables used in the thesis.
The dependent variable for this thesis is a bankruptcy filing. The bankruptcy filing dummy equals one if the firm has filed for bankruptcy protection within one year after the 10-K filing date, and zero otherwise.
As outlined in the prior literature survey, numerous text representation methods successfully extract information from financial disclosures. However, they were often used in combination with traditional quantitative metrics and financial ratios. This thesis aims to identify the standalone information content in text and design methods for knowledge extraction. This work evaluates numeric representations of the MDA generated using three types of bag-of-words dictionary-based language models:

1. Linguistic Inquiry and Word Count (LIWC)
2. Loughran-McDonald Financial Dictionary (LM)
3. Stress Dictionary (S Dictionary)
Dictionary-based models are an extension of word frequency models. As discussed in prior sections, word frequency models suffer from large dimensionality and sparse matrix problems. One way to reduce the dimensionality is to categorize words into different groups and compute the category frequencies. These frequencies are normalized per thousand words, making comparison easier. These categorized word groups are called dictionaries. Dictionary methods act as filters in extracting relevant language features. For example, numerous words indicate negative sentiment in a discourse. Collecting them under one group and computing their frequency helps in understanding document tone very quickly. These advantages made dictionary-based methods prevalent in text analysis. The next section covers the three dictionary-based models this thesis uses.
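The normalization described above can be sketched as follows. The tiny two-category dictionary is illustrative only, not one of the actual dictionaries used in the thesis.

```python
# Dictionary-based feature generation: count word hits per category and
# normalize the counts per thousand words, as described in the text.
def dictionary_features(text, dictionary):
    """Return {category: frequency per 1,000 words} for one document."""
    tokens = text.lower().split()
    n = len(tokens)
    features = {}
    for category, words in dictionary.items():
        hits = sum(1 for tok in tokens if tok in words)
        features[category] = 1000.0 * hits / n if n else 0.0
    return features

toy_dict = {"negative": {"loss", "default"},
            "positive": {"growth", "profit"}}
# 5 tokens, one hit per category -> 200.0 per thousand words each
feats = dictionary_features("revenue growth offset the loss", toy_dict)
```

The per-thousand-words scaling makes category frequencies comparable across MDAs of very different lengths.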
LIWC
Linguistic Inquiry and Word Count (LIWC) is a text analysis program developed by Pennebaker (Pennebaker, Francis, and Booth (2001)). It allows linguistic feature analysis and content analysis. Also, the tool can review stylistic aspects of language use across different contexts. Since linguistic style reveals psychological information about a writer and their underlying thinking, it is a useful tool in MDA analysis. Researchers have used LIWC in numerous financial text analysis studies. Fisher, Garnsey, and Hughes (2016) provided a brief review.

LIWC examines written language and classifies it along up to 90 language dimensions (Pennebaker et al. (2015)), including

1. Four summary language variables (analytical thinking, clout, authenticity, and emotional tone)
2. Three linguistic descriptor categories (dictionary words, words per sentence, words of six letters and above)
3. Twenty-one standard language categories (e.g., articles, prepositions, pronouns)
4. Forty-one psychological process word categories (e.g., affect, cognition, biological processes, drives)
5. Six personal concern categories, five informal language markers, and 12 punctuation categories

The LIWC dimensions are hierarchically organized. For example, the word ‘optimistic’ falls into five categories: ‘optimism’, ‘positive emotion’, ‘overall affect’, ‘words longer than six letters’, and ‘adjective’. The program analyzes text files on a word-by-word basis, calculating the number of words that match each of the 90 LIWC dimensions, expressed as percentages of total words in the text, and records the data into one of 90 preset dictionary categories. The LIWC dictionary comprises over 6,000 words and stems. Each category is composed of a list of dictionary words. Several sources (e.g., blogs, expressive writing, novels, natural speech, the NY Times, and Twitter) were used to form the dictionary. The program classifies about 86 percent of the language used by people.
LIWC’s external validity has been tested; hence, LIWC is a useful research tool for measuring psychological processes, performing content analysis, and assessing various linguistic features. LIWC measures for all MDAs are generated using the LIWC2015 dictionary.
LM dictionary
Loughran and McDonald (2011) demonstrated that word lists developed for other disciplines misclassify common words in financial text. To overcome this, Loughran and McDonald (2011) created an alternative negative word list (Fin-Neg) and five other word lists that better reflect tone in financial disclosures. They tested the relation between these word lists and 10-K filing returns, trading volume, return volatility, fraud, material weakness, and unexpected earnings. Subsequently, these word lists have become known as the LM dictionary, and other researchers have used them in financial text analysis (Nguyen and Huynh (2020); Gandhi, Loughran, and McDonald (2019)). The five other word lists are positive (Fin-Pos), uncertainty (Fin-Unc), litigious (Fin-Lit), strong modal words (MW-Strong), and weak modal words (MW-Weak). The Fin-Neg list has 2,337 words. This list includes financial domain words that common negative word lists exclude, e.g., restated, litigation, termination, discontinued, penalties, unpaid, investigation, misstatement, misconduct, forfeiture, serious, allegedly, noncompliance, deterioration, and felony. The Fin-Pos word list consists of 353 words. The Fin-Unc list includes words indicating uncertainty and has 285 words. For capturing the propensity to litigate, 731 litigiousness words are combined into the Fin-Lit list. It contains words such as claimant, deposition, interlocutory, testimony, and tort. In the LM dictionary, words from these three groups overlap. Strong and weak modal words express levels of confidence. MW-Strong has 19 words, such as always, highest, must, and will. MW-Weak has 27 words, such as could, depending, might, and possibly.

For this work, the positive, negative, and uncertain word lists are included. This work used the quanteda library, which includes the LM dictionary (Benoit et al. (2018)), for generating numeric features.
Stress dictionary
While the LIWC and LM dictionaries extract the document’s tone and sentiment, they do not capture fundamental differences between bankrupt and non-bankrupt companies. Also, LM demonstrated a need for task- and domain-specific dictionaries.

Text features indicate differential language usage between bankrupt and non-bankrupt companies. Distressed firms communicate the nature of distress, remedial measures, and going concerns. Hence, the narrative of a distressed company’s MDA can differ from a healthy company’s MDA up to three years before the bankruptcy. For example, the following are some statements from distressed companies’ MDAs.

“Operating results are affected by indebtedness incurred to finance the acquisition and by the amortization of capitalized fees and expenses incurred in connection with such financing.”
“The company is unlikely to be able to meet its cash flow needs during..”
“The company was downgraded in november 1994 by three primary insurance rating agencies, and..”
In a healthy firm’s MDA, we will not observe these sentences. The following are some excerpts from healthy company MDAs.

“The increases in operating earnings were driven by revenue growth and . . . ”
“The company was in compliance with all debt covenants.”
Further to the difference in content, the linguistic features of the MDA content in distressed firms can differ. This difference results from obfuscation attempts: lengthy sentences describing the firm’s state, capturing the contingent conditions, narrating multiple agents’ attitudes (i.e., suppliers, lenders, economic factors), and management prognosis. The following statements highlight how a distressed firm communicates its efforts in handling the situation:

“Since the company currently does not have the means to repay the Series notes, management is unable to predict the future liquidity of the company if the restructuring is not accomplished.”
“The company may be required to refinance such amounts as they become due and payable. While the company believes that it will be able to refinance such amounts, there can be no assurance that any such refinancing would be consummated or, if consummated, would be in an amount sufficient to repay such obligations, particularly in light of the company’s high level of debt that will continue after the restructuring.”
“After giving effect to this amendment, the company was in compliance with the terms and restrictive covenants of its debt obligations for fiscal 1994.”
“The company has funded operations primarily from borrowings under its debt agreements and the sale of its stock.”
“The company was not in compliance with a net worth requirement contained in its sale-leaseback agreement.”
“As a result of the second quarter 1998 loss, the company was in default of certain covenants based on ebitda.”
“The loss incurred during the fourth quarter of the year ended june 30, 1999 resulted in not being in compliance with the debt service covenant”
“The proposed plan currently contemplates the filing of a pre-packaged chapter 11 plan of reorganization in order to . . . ”
“These factors among others indicate that there is substantial doubt about the company’s ability to continue as a going concern.”
“Considering our default of the loan agreements and our liquidity as discussed above, there is substantial doubt about our ability to continue as a going concern.”
In contrast, healthy companies do not describe these details in a lengthy manner. The following are excerpts from some healthy companies’ MDAs.

“Management considers the company to be liquid and able to meet its obligations on both a short- and long-term basis.”
“We had no amounts outstanding under our agreement.”

The above observations suggest that a distress dictionary capturing these differences would differentiate bankrupt and non-bankrupt firms.

Table 1: Language models used

Model    Name             Language model
Model 1  LIWC             LIWC
Model 2  LM               LM dictionary
Model 3  Stress           Stress dictionary
Model 4  LIWC_Stress      LIWC + Stress dictionary
Model 5  LM_Stress        LM dictionary + Stress dictionary
Model 6  LIWC_LM_Stress   LIWC + LM dictionary + Stress dictionary
Stress dictionary method.
The dictionary is constructed using all MDAs from 2018. An MDA can contain 5,000 to 10,000 words. This study focuses on the “liquidity and capital requirements” section, reported by most companies. The task is to go through the words and identify the ones that may be red flags for bankruptcy or stress. The general decision criterion in the process is high discriminatory power for identifying financial distress. The list is prepared in two steps:

1. Identification of differentiating words
2. Classification of the words into meaningful categories

Similar to content analysis, which aims to extract information from the text’s tone, this work searches for words that might indicate debt restructuring or distressed business situations. The first step identified 80 candidate words. This list is refined in the next step.
Derivation of dictionary
In the second step, we analyze the candidate list in detail. From the preliminary list of 80 words, we categorize and select 70 words that are consistent with prior literature.
Category 1: Debt: Words used in expressing high indebtedness
Companies deploy debt to take care of working capital and capital expenditure requirements. During normal operations, firms manage debt comfortably. When firms face difficulty in servicing the debt, management discloses the status in the MDA. This communication results in an increased frequency of debt-related words. The following words characterize debt-related sentences: agreement, amendment, borrow, claim, collateral, guarantees, secured. A detailed list is in Appendix A.
Category 2: Distress: Words used by companies close to insolvency
Companies in danger of bankruptcy exhibit several characteristics, and the MDA expresses the same. The expression of these characteristics increases with an approaching need for a bankruptcy filing. Debt covenant violations are necessary precursors to bankruptcy. They serve as early indicators to creditors, signaling potential problems. Most violated covenants correspond to solvency (e.g., interest coverage and leverage), liquidity, and profitability requirements. Managers try to avoid debt covenant default. Other words indicating distress are loss, chapter 11, chapter 7, downgrade, and bankruptcy. We add the following words to this list: covenant, default, breach, violate, amend, restrictive, waiver.
Category 3: Restructure: Words used in restructuring sentences
Managers try to manage distress through various mechanisms. Raising fresh capital, debt restructuring, and selling assets are some of them. All these initiatives can be viewed as balance sheet restructuring activities. The MDA contains sentences explaining the proposed restructuring activities. We add the following restructuring-related words to this list: dispose, recapitalize, restructure, liquidate, alternative.
Category 4: Health: Characteristics of statements describing a healthy state
Firms that are not at risk of bankruptcy express a healthy state of the company in the MDA. These sentences correspond to solvency, profit, retained earnings, and dividend payment. We add the following words to this list: retain, profit, cash, dividend, meet. These four categories are defined as a dictionary and further used for generating the numeric representation of MDAs.
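The four categories above can be represented as a small dictionary object and applied to an MDA. The word sets below contain only the example words named in the text; the full lists are in Appendix A.

```python
# The four stress-dictionary categories, populated with the example words
# given above (the complete word lists live in the appendix).
STRESS_DICTIONARY = {
    "debt": {"agreement", "amendment", "borrow", "claim", "collateral",
             "guarantees", "secured"},
    "distress": {"covenant", "default", "breach", "violate", "amend",
                 "restrictive", "waiver"},
    "restructure": {"dispose", "recapitalize", "restructure", "liquidate",
                    "alternative"},
    "health": {"retain", "profit", "cash", "dividend", "meet"},
}

def category_counts(text):
    """Count stress-dictionary hits per category for one MDA."""
    tokens = text.lower().split()
    return {cat: sum(tok in words for tok in tokens)
            for cat, words in STRESS_DICTIONARY.items()}
```

In the thesis the raw counts are normalized per thousand words, as with the other dictionary-based models.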
We built multiple combinations of language models from the different language models described in the previous section. The final list of language models is shown in Table 1.
Once the documents have been transformed into numeric forms using language models, they are fed into predictive models. The sample is divided into two groups: bankrupt firms and non-bankrupt firms. The outcome is a binary dependent variable. This binary outcome is modeled using logistic regression, similar to a panel logit framework (Altman and Hotchkiss (2010)).
Logistic regression is useful for modeling binary outcomes. It consists of a logistic (logit) function and a binomial distribution. While standard regression can be used to model binary outcomes, the model is not interpretable. The outcome is not bounded, and an ad hoc classification rule is required to translate the output into binary outcomes. Also, the output cannot be converted to probabilities, as in some cases the model will produce estimates outside the [0, 1] bounds. The bounded constraint can be overcome by modeling the odds, i.e., p/(1 − p). A log transform of the odds ensures that probabilities are symmetric around 0.5.

The logistic function (also known as the sigmoid function or inverse logit function) is the critical ingredient of logistic regression:

f(x) = 1 / (1 + e^(−x))

Given log-odds log(p/(1 − p)), the logistic function is the inverse of the log-odds. An equivalent form is g(x) = e^x / (e^x + 1). The logistic function gives an ‘S’-shaped curve that can take any real-valued number (−∞ to +∞) and map it to a value between 0 and 1. This transformation allows modeling a family of relationships between continuous predictors and a binary outcome variable, in this case bankruptcy. The key steps are:

1. Assume that predictors are linearly related to the log-odds
2. Transform the odds to convert to probability
3. Estimate the data likelihood

In this context, the intercept shifts the curve left or right. The slopes make the curve sharper or flatter with respect to the predictors. The logistic function starts at 0, ends at 1, and is symmetric around 0.5. Logistic regression transforms the bankruptcy outcomes so that a linear combination of predictors produces log-odds effects on the bankruptcy. A model coefficient is transformed and interpreted as an odds multiplier. These results are easily interpretable. The logistic regression model used in this study is based on the following mathematical definition.
The bankruptcy variable is coded using 1 and 0:

Y = 1 if bankrupt, 0 if non-bankrupt

The variable of interest, p(x) = P[Y = 1 | X = x], is modeled via logistic regression.

Figure 1: p versus odds(p)

log(p(x) / (1 − p(x))) = β0 + β1 x1 + ... + β(k−1) x(k−1)

This equation is similar to linear regression with k − 1 predictors for a total of k β parameters. The left-hand side is the log-odds: the probability of bankruptcy (Y = 1) divided by the probability of non-bankruptcy (Y = 0). When the odds are 1, both events are equally likely; odds greater than 1 indicate bankruptcy and vice versa.

p(x) / (1 − p(x)) = P[Y = 1 | X = x] / P[Y = 0 | X = x]

Researchers evaluate bankruptcy prediction models using multiple criteria. In this thesis, we use accuracy tables, receiver operating characteristics (ROC) curves, and information content tests. While ROC curves inform forecasting accuracy, sensitivity, and specificity, information content tests evaluate the bankruptcy-related information carried by the distress risk measures. The following sections present each method.
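The model above can be estimated by maximizing the likelihood. Below is an illustrative pure-Python gradient-ascent fit, a sketch only and not the implementation used in this study; the function names are mine:

```python
import math

def sigmoid(z):
    """Inverse of the log-odds: maps any real z to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, y, lr=0.5, epochs=5000):
    """Estimate beta in log(p/(1-p)) = b0 + b1*x1 + ... + b(k-1)*x(k-1)
    by gradient ascent on the Bernoulli log-likelihood."""
    beta = [0.0] * (len(X[0]) + 1)          # intercept plus one slope per predictor
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            err = yi - sigmoid(z)           # residual on the probability scale
            grad[0] += err
            for j, x in enumerate(xi, start=1):
                grad[j] += err * x
        beta = [b + lr * g / len(y) for b, g in zip(beta, grad)]
    return beta
```

On a toy sample where bankruptcy (y = 1) rises with a single feature, the fitted slope is positive, and exp(slope) is the odds multiplier for a one-unit increase in that feature.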
A perfect model classifies all observations accurately; real models make mistakes. One way to evaluate a model's performance is its misclassification rate. Alternatively, the model's accuracy can be used, which measures the proportion of correct classifications.
Figure 2: p versus logodds(p)

Misclassification(Ĉ, Data) = (1/n) Σ_{i=1}^{n} I(y_i ≠ Ĉ(x_i))

where the indicator I(y_i ≠ Ĉ(x_i)) equals 0 if y_i = Ĉ(x_i) and 1 if y_i ≠ Ĉ(x_i).

This measure is not useful on training data: it improves with the number of parameters and hence is biased towards large models, which encourages overfitting. It therefore needs to be computed on test data unseen by the model during training.

Accuracy tables can be further split into a confusion matrix to understand the nature of the misclassifications. The confusion matrix categorizes the classification errors into false negatives and false positives. Setting the classification threshold at 0.5,

Ĉ(x) = 0 ⟺ p̂(x) ≤ 0.5

Predictions can be used to create a confusion matrix as below. The prevalence is

Prevalence = P / Total Obs = (TP + FN) / Total Obs

A reasonable classifier has to outperform a naïve classifier that labels all observations as the majority class. In this work, a model classifying every company as non-bankrupt is the baseline. Apart from accuracy, specificity and sensitivity can be used to evaluate models. Sensitivity is the true-positive rate: higher sensitivity means the model classifies more
Figure 3: logodds(p) versus p

positives correctly, reducing the false negatives. Specificity is the true-negative rate: higher specificity means the classifier labels true negatives correctly, reducing false positives. The formulae are given below.

Sensitivity = True Positive Rate = TP / P = TP / (TP + FN)

Specificity = True Negative Rate = TN / N = TN / (TN + FP)

Both metrics can be computed directly from the confusion matrix.
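These confusion-matrix metrics can be computed in a few lines; a minimal sketch (function and variable names are mine), with 1 = bankrupt as the positive class:

```python
def confusion_metrics(actual, predicted):
    """Accuracy, sensitivity (TPR), and specificity (TNR) from binary labels,
    where 1 = bankrupt (positive) and 0 = non-bankrupt (negative)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(actual),
        "sensitivity": tp / (tp + fn),      # TP / P
        "specificity": tn / (tn + fp),      # TN / N
    }
```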
Relationship between Accuracy, Specificity, and Sensitivity
As we compute specificity and sensitivity from the confusion matrix, different classification thresholds generate multiple sensitivity/specificity values. It is conventional to use a probability of 0.5 as the cutoff. By modifying the cutoff, we can improve the sensitivity or specificity at the expense of overall accuracy; moreover, as sensitivity improves, specificity deteriorates, and vice versa.

Ĉ(x) = 1 if p(x) > c, 0 if p(x) ≤ c

The receiver operating characteristics (ROC) curve is a method to assess the accuracy of a continuous measurement for predicting a binary outcome. It is used extensively in the life sciences. Over the past two decades, it has gained acceptance as a bankruptcy prediction model validation tool (Sobehart and Keenan (2001)).
Figure 4: Confusion Matrix

For a bankruptcy prediction model with a fixed cutoff c, we can compute accuracy metrics and two types of classification errors: false negatives and false positives. In bankruptcy prediction, the model generates a measure of firm distress M, based on the independent variables. This measure is continuous. We assign a classification of 1 (test positive) when M exceeds the fixed threshold c: M > c. For bankruptcy detection with binary outcome B, a good outcome is a classification of 1 among bankrupt companies (B = 1); a bad outcome is a classification of 1 among non-bankrupt companies (B = 0). The true-positive fraction is the probability of a positive classification among the bankrupt firms: TPF(c) = P{M > c | B = 1}. This value is the sensitivity at cutoff c. Similarly, the false-positive fraction is the probability of a bankrupt classification among the non-bankrupt firms: FPF(c) = P{M > c | B = 0}. The ROC curve is the plot of TPF(c) against FPF(c) at various cutoff levels c, with FPF(c) on the x-axis and TPF(c) on the y-axis.

A perfect bankruptcy prediction model, one whose ranking on default probability at each cutoff c matches the ranking of failures, would capture all bankruptcies; it corresponds to a vertical line at FPF = 0. A random bankruptcy prediction model, one whose ranking at cutoff c is uncorrelated with the ranking of failures, would have the same percentage of failures at each cutoff level; it corresponds to a line at 45 degrees to the x-axis. Since we expect a bankruptcy prediction model to be better than a random model, its ROC curve should lie between the perfect and the random model.

To compare two models' predictive ability, we calculate the area under the ROC curve (AUC). Sobehart and Keenan (2001) used the AUC as the decisive indicator of default model accuracy. Information content tests help examine the proposed bankruptcy prediction models.
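The ROC construction described above, sweeping the cutoff c to obtain TPF(c)/FPF(c) pairs and integrating, can be sketched as follows (trapezoidal AUC; illustrative only, names are mine):

```python
def roc_points(scores, labels):
    """TPF/FPF pairs obtained by sweeping the cutoff c over the distress
    measure M (scores); labels use 1 = bankrupt, 0 = non-bankrupt."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = [(1.0, 1.0)]                      # c below every score: everything positive
    for c in sorted(set(scores)):
        tpf = sum(1 for s, l in zip(scores, labels) if s > c and l == 1) / pos
        fpf = sum(1 for s, l in zip(scores, labels) if s > c and l == 0) / neg
        pts.append((fpf, tpf))
    return sorted(pts)

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

A perfectly separating distress measure yields an AUC of 1.0; a random one hovers around 0.5.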
Information content tests evaluate whether bankruptcy prediction models carry more information than another set of variables. Their use has many precedents in bankruptcy prediction. They complement ROC curve analysis because (i) ROC curve analysis provides users with a binary option, but users may not be making such decisions: users of bankruptcy prediction models are often interested in determining credit terms or portfolio weights; and (ii) ROC curve analysis ignores the associated error costs arising from context-specific type I/type II errors.

There are two primary information criteria: the Akaike information criterion (AIC) and the Bayes information criterion (BIC). When models are built on the same data by maximum likelihood, a smaller AIC or BIC indicates a better fit.
Akaike Information Criterion
The AIC is the simpler of the two; it is defined as AIC = −2LL + 2k, in which −2LL is the deviance (described below) and k is the number of predictors in the model. The maximum log-likelihood of a regression model is

log L(β̂, σ̂²) = −(n/2) log(2π) − (n/2) log(RSS/n) − n/2

where β̂ and σ̂² are chosen to maximize the likelihood and RSS = Σ_{i=1}^{n} (y_i − ŷ_i)². From the above, the AIC is derived as the penalty minus twice the log-likelihood:

AIC = −2 log L(β̂, σ̂²) + 2k = 2k + n + n log(2π) + n log(RSS/n)

AIC combines two components of the model: the likelihood, a measure of goodness of fit, and the penalty, proportional to the model size. The likelihood portion of AIC for two models fit on the same dataset is a function of RSS; higher RSS (squared deviation) indicates a poorer fit. A good model has low RSS and low AIC. The penalty component of AIC is 2k, a function of the number of β parameters used in the model: as k increases, AIC increases. A good model with a small AIC balances goodness of fit against the number of parameters.

Bayesian Information Criterion
The BIC is similar to the AIC but adjusts the penalty by the number of cases: BIC = −2LL + k log(n), in which n is the number of cases in the model. This way, BIC picks smaller models for larger sample sizes compared to AIC. For model selection, we choose the model with the smallest BIC.

BIC = k log(n) − 2 log L(β̂, σ̂²) = k log(n) + n + n log(2π) + n log(RSS/n)

The penalty for AIC is 2k, whereas for BIC it is k log(n). For datasets with log(n) > 2 (i.e., n > e² ≈ 7.4), the BIC penalty is higher than the AIC penalty, so BIC prefers smaller models for similar log-likelihoods.

This research work focuses on building bankruptcy prediction models using financial disclosure text features. Statistical analysis has been conducted, and models are built per the methodology described in section 3. This chapter describes the results. It is structured into multiple sub-sections covering descriptive statistics of the linguistic features, their relationship with bankruptcy, model performance, and evaluation.
This section describes the statistical properties of the datasets and features used in this work.
The list of bankruptcies from LoPucki (2006) has more than 1,000 observations. This dataset covers large bankruptcies from 1980 to date. The annual bankruptcy filing trend is given in figure 5. On average, 29 firms filed for bankruptcy per year, with median annual bankruptcies at 25. A maximum of 97 bankruptcies was filed in the year 2001. Recall that this research includes bankruptcies through December 2018.
For the selected bankrupt firms and healthy firms, all available MDAs are transformed into numeric form using three dictionaries, i.e., LIWC, L.M., and the stress dictionary. These linguistic features are averaged at the group level and presented in the table below.
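Transforming an MDA into dictionary features reduces to counting category words as a share of total words. A toy sketch follows; the word lists here are illustrative stand-ins of my own, not actual LIWC, L.M., or stress dictionary entries:

```python
import re

# Hypothetical mini-dictionary; the real stress dictionary is far larger.
STRESS_DICT = {
    "debt": {"debt", "covenant", "loan", "indebtedness"},
    "distress": {"default", "insolvency", "bankruptcy"},
}

def dictionary_features(text, dictionary):
    """Percentage of document words falling into each dictionary category."""
    words = re.findall(r"[a-z]+", text.lower())
    return {
        cat: 100.0 * sum(w in vocab for w in words) / len(words)
        for cat, vocab in dictionary.items()
    }
```

For example, in a seven-word sentence containing "loan", "covenant", and "default", the debt category scores 2/7 of the words and the distress category 1/7, expressed as percentages.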
Figure 5: Number of bankruptcies filed by year

Column "All" documents the summary statistics for all sample firms. WPS and W.C. indicate that the sample firms' MDAs are in general lengthy, with ~10,000 words and 27 words per sentence, i.e., ~400 sentences per MDA. Excluding WPS and W.C., all other values are percentages. Close to 30% of words are complex (Sixltr: 27.93) or functional (function.: 30.57). The sample firms' MDAs are present-focused (focuspresent: 2.69), and their future focus is one-third of the present focus. Per the LIWC classification, on average the MDAs have three times more positive words than negative words (posemo: 2.10, negemo: 0.73). Cognitive-process-related and drives-related words are observed with similar frequency (cogproc: 7.38, drives: 6.62), while social/affect words occur at half that rate (social: 3.37, affect: 2.83). Based on the L.M. dictionary, negative and uncertain words occur at double the frequency of positive words (negative: 1.01, uncertain: 0.96, positive: 0.51). The stress dictionary features indicate that debt words are prevalent at 2.72; in a typical MDA of 10,000 words, this corresponds to about 270 words of debt-related discussion and disclosure. Distress- and restructure-related words occur less frequently, which is expected as they describe infrequent organizational outcomes.

The table's focus is column "Bankrupt," as it illustrates the summary statistics of bankrupt firms. Bankrupt firms are more past-focused. They also use fewer cognitive- and drives-related words. A striking difference is the increase in debt- and distress-related word frequency. They also show increased negative word frequency.

Since we are interested in building predictive models using prior-year filings, it is important to observe how the linguistic features trend for bankrupt companies compared to non-bankrupt companies. Figure 6 shows this.
Table 2: Linguistic features (word percentages)

Feature        All       Bankrupt   Healthy
WPS            26.91     27.35      26.72
WC             10109.29  10164.10   10085.28
Sixltr         27.93     27.87      27.96
Dic            162.26    160.67     162.96
function.      30.57     30.67      30.52
affect         2.83      2.83       2.84
social         3.37      3.29       3.41
cogproc        7.38      7.19       7.47
percept        0.30      0.32       0.29
bio            0.98      0.97       0.98
drives         6.62      6.41       6.70
relativ        10.71     10.70      10.72
AllPunc        12.35     12.77      12.17
focuspast      1.73      1.78       1.70
focuspresent   2.69      2.61       2.73
focusfuture    0.78      0.80       0.78
anger          0.04      0.03       0.04
posemo         2.10      2.12       2.09
negemo         0.73      0.71       0.74
debt           2.72      3.04       2.58
distress       0.24      0.33       0.21
restructure    0.08      0.09       0.07
healthy        0.54      0.55       0.54
negative       1.01      1.07       0.98
positive       0.51      0.48       0.52
uncertainty    0.96      0.91       0.98

Figure 6: Linguistic features evolution
This figure depicts various word-category frequencies for bankrupt companies during the year of bankruptcy and the five prior years. For comparison, the sample non-bankrupt firms' word percentages are plotted over six years, counting back from the latest filing. The values are averaged across bankrupt and non-bankrupt firms.
Notable trends in LIWC features
All LIWC linguistic features for bankrupt firms are lower than for non-bankrupt firms throughout the period. There is a gradual increase in focuspast and focusfuture.
Notable trends in L.M. features
Bankrupt companies have lower uncertain and positive word frequencies throughout the period. Negative words start increasing two years before the bankruptcy.
Notable trends in Stress Dictionary features
The stress dictionary features capture the evolution of distress and bankruptcy. Debt-related word frequency exceeds that of healthy firms four years before bankruptcy and gradually inches up further until the event. Distress-related words remain marginally higher from five years to two years before bankruptcy and increase dramatically after that. Restructure-related word frequency for bankrupt firms is indistinguishable until two years before the bankruptcy; this is expected, as firms do not take up such costly exercises unless the financial distress is unmanageable and covenant default is imminent. There is no change in "healthy" word frequency for either bankrupt or non-bankrupt firms, though bankrupt firms show a lower occurrence throughout the period. Overall, we can observe sufficient differences between bankrupt and non-bankrupt firms.
Figures 7, 8, and 9 show the correlation structure among the LIWC features, the L.M. and stress dictionary features, and selected variables from these three models. Of the LIWC features, a few are highly correlated, i.e., dictionary, functional, social, and drives. All other features have low correlations, indicating that they capture different information. In the stress dictionary, debt and distress show a 0.45 correlation, which is expected; the other variables are uncorrelated. Also, there is no correlation between the L.M. dictionary and stress dictionary features. Finally, selected variables from these three models are checked for correlation: the correlation is insignificant, indicating minimal overlap. This low correlation indicates their complementary nature, and a hybrid model combining these features might perform better than standalone models.

Figure 7: Correlations between LIWC features

Figure 8: Correlations between LM and stress dictionary

Figure 9: Correlations between all selected linguistic features
The following sections explain the results of the experiments conducted to test the hypotheses outlined in the methodology.
From descriptive statistics, we observed that there are distinct qualities that differentiate bankrupt firms from non-bankrupt firms. We set out to test this hypothesis.
Independent t-tests were conducted; the number of bankrupt firms and non-bankrupt firms is 500 each. The 500 bankrupt firms, compared to the 500 non-bankrupt firms, demonstrated significantly higher distress, t(868) = 17.38, p = .00. Bankrupt firms also had significantly higher debt (t(992) = 12.32, p = .00), more negative words (t(997) = 8.28, p = .00), and more restructure words (t(922) = 7.67, p = .00). There was no significant effect for negative emotions (negemo), t(988) = 0.69, p = .62, despite bankrupt firms (M = 0.88, SD = 0.38) attaining higher scores than non-bankrupt firms (M = 0.86, SD = 0.35). Figure 10 shows the details.
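The group comparisons above are independent two-sample t-tests; fractional degrees of freedom such as t(868) with 500 firms per group are consistent with a Welch (unequal-variance) correction. A minimal stdlib sketch, with names of my own:

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's unequal-variance two-sample t statistic and its
    Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)          # sample variances (n - 1 denominator)
    se2 = vx / nx + vy / ny                    # squared standard error of the mean difference
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df
```

When the two groups have equal variances, the Welch degrees of freedom reduce to the usual nx + ny − 2.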
Figure 10: Bankrupt vs. non-bankrupt linguistic features t-test
As part of this hypothesis, a logistic regression model with all LIWC features as independent variables has been fit. Another model with the L.M. features as predictors is built and compared.
Here we review the LIWC logit model; Table 3 presents the model details. We can observe that only a few predictors are significant. This is expected, as the LIWC model captures various aspects of language, and only a few of them can be expected to be affected by distress and potential bankruptcy conditions.
WPS, Dic, function., focuspast, and focusfuture are significant at the 0.001 level. The logistic regression coefficients give the change in the log-odds of the outcome for a one-unit increase in the predictor variable. Here, except for WPS, all predictors are percentages of category words. For every one-unit change in
WPS, the log-odds of bankruptcy (versus non-bankruptcy) increases by 0.08 with 95% CI [0.04, 0.12]. For a one percent increase in focuspast, the log-odds of being bankrupt increases by 1.10 with 95% CI [0.66, 1.55]; for focusfuture, it increases by 1.65 with 95% CI [1.00, 2.31]. Another way to interpret these coefficients is through the odds ratio. The fitted model says that, holding other predictors at fixed values, the odds of bankruptcy for a firm whose disclosure has 1% focusfuture words, relative to a firm with zero percent such words, are exp(1.65) = 5.2; that is, the odds for a firm with more focusfuture words are about 420% higher in percent-change terms. Other predictors significant at the <0.05 level are social, cogproc, and drives, with log-odds of 0.37, −0.34, and 0.30 and 95% CIs [0.10, 0.65], [−0.64, −0.06], and [0.00, 0.60], respectively.
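Converting a logit coefficient into an odds ratio and a percent change, as done above, is just exponentiation; a two-line sketch:

```python
import math

def odds_ratio(coef):
    """Odds multiplier for a one-unit increase in the predictor."""
    return math.exp(coef)

def pct_change(coef):
    """Percent change in the odds for a one-unit increase in the predictor."""
    return (math.exp(coef) - 1.0) * 100.0
```

For the focusfuture coefficient of 1.65, odds_ratio(1.65) gives roughly 5.2, i.e., an increase of about 420% in the odds.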
Table 3: LIWC model coefficients

Predictor      Coefficient  SE    p-value  Lower CI  Upper CI  Odds Ratio
Intercept      2.75         3.34  0.411    -3.92     9.16      15.68
WPS            0.08         0.02  <0.001   0.04      0.12      1.08
WC             0.00         0.00  0.103    0.00      0.00      1.00
Sixltr         -0.07        0.05  0.202    -0.17     0.04      0.93
Dic            -0.16        0.04  <0.001   -0.24     -0.08     0.85
function.      0.60         0.11  <0.001   0.39      0.81      1.82
affect         -0.60        5.21  0.909    -11.50    9.26      0.55
social         0.37         0.14  0.007    0.10      0.65      1.45
cogproc        -0.34        0.15  0.021    -0.64     -0.06     0.71
percept        0.73         0.40  0.070    -0.05     1.52      2.07
bio            -0.05        0.23  0.829    -0.49     0.39      0.95
drives         0.30         0.15  0.049    0.00      0.60      1.35
relativ        -0.06        0.12  0.604    -0.29     0.17      0.94
AllPunc        -0.08        0.04  0.062    -0.16     0.00      0.92
focuspast      1.10         0.23  <0.001   0.66      1.55      3.00
focuspresent   0.06         0.21  0.773    -0.35     0.48      1.06
focusfuture    1.65         0.33  <0.001   1.00      2.31      5.20
anger          0.68         2.18  0.754    -3.57     5.05      1.98
posemo         1.15         5.21  0.825    -8.70     12.05     3.17
negemo         1.34         5.24  0.799    -8.57     12.30     3.81

Table 4: LM model coefficients

Predictor      Coefficient  SE    p-value  Lower CI  Upper CI  Odds Ratio
Intercept      0.56         0.31  0.070    -0.04     1.17      1.75
negative       1.41         0.17  <0.001   1.08      1.76      4.10
positive       -2.88        0.40  <0.001   -3.68     -2.12     0.06
uncertainty    -0.64        0.20  0.002    -1.06     -0.26     0.53
Here we review the L.M. logit model; Table 4 presents the coefficients and confidence intervals. We can observe that all predictors are significant; negative and positive are significant at the 0.001 level. For a one percent increase in negative words, the log-odds of being bankrupt increases by 1.41 with 95% CI [1.08, 1.76]; for positive words, it changes by −2.88 with 95% CI [−3.68, −2.12]. The L.M. model says that, holding other predictors at fixed values, the odds of bankruptcy for a firm whose disclosure has 1% negative words, relative to a firm with zero percent such words, are exp(1.41) = 4.10; that is, the odds for a firm with more negative words are about 310% higher in percent-change terms.

ANOVA indicates that the models are significantly different. Accuracy, BIC, and AIC metrics are given in Table 5; the ROC comparison is shown in figure 13.

Table 5: LIWC and LM model comparison

Model  Training Accuracy  Test Accuracy  LogLik   AIC     BIC      AUC   Deviance  Parameters
LIWC   0.69               0.63           -470.61  981.22  1074.92  0.72  941.22    19
LM     0.68               0.74           -489.88  987.75  1006.49  0.78  979.75    3
Figure 11: LIWC model ROC

We can observe that for the L.M. model, while the BIC is lower than the LIWC model's, the AIC is higher. Recall that we noted in section 3.5.3 that for sample sizes >100, BIC will prefer smaller models for similar log-likelihoods. The out-of-sample forecasting performance represented in the
Test Accuracy column indicates that the L.M. model provides 10% higher accuracy. Also, the ROC is better for L.M. Overall, while the LIWC model captures more information, probably owing to its many parameters, the L.M. model's predictive performance is better.
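The AIC and BIC values in Table 5 follow mechanically from the reported log-likelihoods and parameter counts (plus an intercept). A sketch of the two criteria; the training sample size n = 800 used in the example is my inference from the reported values, not a figure stated in the text:

```python
import math

def aic(loglik, k):
    """Akaike information criterion: deviance (-2 log L) plus a 2k penalty."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayes information criterion: deviance plus a k*log(n) penalty."""
    return -2.0 * loglik + math.log(n) * k
```

For the LIWC model (log-likelihood −470.61, 19 slopes plus an intercept, so k = 20), aic(-470.61, 20) reproduces the reported 981.22, and with an assumed training size of n = 800, bic(-470.61, 20, 800) falls within rounding of the reported 1074.92.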
Figure 12: LM model ROC
Here we review the stress dictionary logit model; Table 6 presents the coefficients and confidence intervals. We observe that debt, distress, and restructure are significant at the 0.001 level. For a one percent increase in distress words, the log-odds of being bankrupt increases by 5.03 with 95% CI [3.98, 6.15]; for debt and restructure, it increases by 0.36 and 2.96 with 95% CIs [0.19, 0.54] and [1.45, 4.54], respectively. Most importantly, as per this model, holding other predictors at fixed values, the odds of bankruptcy for a firm whose disclosure has 1% distress words, compared to a firm with zero percent such words, are exp(5.03) = 153.66. This high odds ratio indicates that the distress word percentage is a highly sensitive indicator of forthcoming bankruptcy. The ROC comparison is shown in figure 15. Overall, we can observe that the stress model is better than the L.M. model on the BIC and ROC criteria; its test performance is also better.
Figure 13: LIWC and LM ROC comparison

Table 6: Stress dictionary model coefficients

Predictor    Coefficient  SE    p-value  Lower CI  Upper CI  Odds Ratio
Intercept    -3.36        0.40  <0.001   -4.16     -2.59     0.03
debt         0.36         0.09  <0.001   0.19      0.54      1.44
distress     5.03         0.55  <0.001   3.98      6.15      153.66
restructure  2.96         0.79  <0.001   1.45      4.54      19.39
healthy      0.23         0.38  0.5      -0.51     0.98      1.26

Table 7: Stress and LM model comparison

Model           Training Accuracy  Test Accuracy  LogLik   AIC     BIC      AUC   Deviance  Parameters
LM              0.68               0.74           -489.88  987.75  1006.49  0.78  979.75    3
Stress_Diction  0.72               0.79           -428.09  866.18  889.61   0.86  856.18    4
Figure 14: Stress dictionary model ROC
Considering the observation that the correlation between the LIWC, L.M., and stress dictionary features is low, we can take advantage of their complementary nature. Three combination models with combined inputs have been fitted on the dataset: LIWC + Stress, L.M. + Stress, and LIWC + L.M. + Stress. The model coefficients are presented in appendix B; the performance results are shared below.
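Building a combination model amounts to concatenating the per-dictionary feature vectors before fitting; a minimal sketch with hypothetical feature names:

```python
def combine_features(*feature_sets):
    """Merge per-dictionary feature dicts into one predictor vector,
    prefixing names to avoid collisions across dictionaries."""
    combined = {}
    for name, feats in feature_sets:
        for key, value in feats.items():
            combined[f"{name}_{key}"] = value
    return combined
```

For example, combine_features(("lm", {"negative": 1.07}), ("stress", {"debt": 3.04})) yields one merged vector with keys "lm_negative" and "stress_debt", which can then be passed to a single logistic regression.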
Figure 15: LM and stress dictionary model ROC comparison
Combination models
Table 8: Dictionary models AUC comparison

Model           AUC
LIWC            0.72
LM              0.78
Stress_Diction  0.86
LIWC_Stress     0.86
LM_Stress       0.87
LIWC_LM_Stress  0.87

Figure 16: LIWC and stress dictionary model ROC
Figure 17: LM and stress dictionary model ROC
This subsection reviews the dictionary-based models. We have evidence that there is incremental performance improvement as additional features are incorporated into the model. A comparison of model performance is given below.
This work provides the first comprehensive test of disclosure-text, dictionary-based bankruptcy prediction models. For the dictionary-based models, I apply the LIWC dictionary (Pennebaker et al. (2015)), the Loughran-McDonald dictionary (Loughran and McDonald (2011)), and a custom dictionary developed as part of this work. To test the models' performance, I use receiver operating characteristics (ROC) curves, information content tests, and accuracy metrics. The ROC curve analysis demonstrated that all dictionary-based bankruptcy prediction models have greater forecasting accuracy than a random model and that the composite models perform better than their individual language models. The information content tests provide evidence that all models carry significant bankruptcy-related information.
Figure 18: LIWC, LM and stress dictionary model ROC comparison

Table 9: Dictionary models comparison

Model           Training Accuracy  Test Accuracy  LogLik   AIC     BIC      AUC   Deviance  Parameters
LIWC_LM_Stress  0.78               0.80           -366.95  787.89  914.38   0.87  733.89    26
LIWC_Stress     0.77               0.79           -374.89  797.78  910.21   0.86  749.78    23
LM_Stress       0.74               0.80           -410.58  837.17  874.64   0.87  821.17    7
Stress_Diction  0.72               0.79           -428.09  866.18  889.61   0.86  856.18    4
LIWC            0.69               0.63           -470.61  981.22  1074.92  0.72  941.22    19
LM              0.68               0.74           -489.88  987.75  1006.49  0.78  979.75    3
Figure 19: Bag-of-words dictionary models ROC comparison
In this chapter, I summarize the contributions of the current study to text analysis in finance, present the research’sobjectives and findings in the context of previous research, and suggest appropriate future research directions.
This study constitutes an exploration of knowledge extraction from narrative corporate report sections using text analysis. The aims are:

1. To establish whether linguistic features of disclosures can explain firm attributes in a financial analysis context.
2. To determine which language models perform better at capturing information.
3. Specifically, to predict bankruptcy based on the management's discussion and analysis in annual filings.

Knowledge in a public firm's context involves information that can influence organizational outcomes and future stock performance. As managers have an information advantage over the public, their narrative disclosures have significant information content over and above the quantitative financial measures. This information helps in understanding the firm's current financial status, its ability to continue operations without hindrance, the kinds of risks it is exposed to, the strategic and tactical interventions the management is undertaking to overcome challenges and capture opportunities, and its capital allocation plans. This knowledge gives more in-depth insight into the firm's prospects.
In the context of this thesis, knowledge extraction is studied in the form of predicting adverse organizational outcomes, specifically (1) bankruptcy prediction, i.e., predicting whether a firm will file for chapter 7 or 11 within one year after the annual filing date, (2) using the management's discussion and analysis section of the annual filing (10-K).

For this purpose, I employed a new measurement technique based on content analysis and research, namely a stress score based on the number of financial stress words per thousand words. I built text-feature-based bankruptcy prediction models on the LIWC dictionary, the L.M. dictionary, and the stress dictionary. The text-feature-based bankruptcy predictors introduced in this study are used in accounting research for the first time, and they address the numbers-bias concerns inherent in traditional approaches. Scoring stress using linguistic markers is also a new approach to measuring financial distress. It is based on the linguistic characteristics management displays when explaining current liquidity challenges and its attempts to overcome them through debt extensions, new financing, and asset restructuring. Such explanations result in increased frequencies of words related to covenants, modified loan agreements, restructuring, new financing activities, uncertainty about the firm's ability to raise funds, asset sales, and capital expense reduction in a distressed corporate reporting context. It may also result in management attempting to present a rosy image of prospects to outsiders, inconsistent with management's own perception of the firm and its performance.
Bankruptcy prediction has been an active research topic for accounting researchers for decades. With improved awareness of financial ratios' shortcomings and the availability of text analysis tools, researchers have explored incorporating textual features into bankruptcy prediction models. Hájek and Olej (2015) studied various word categories from corporate annual reports and showed that the language used by bankrupt companies shows stronger tenacity, accomplishment, familiarity, present concern, exclusion, and denial. They built prediction models combining both financial indicators and word categorizations as input variables. Working on U.S. banks, Gandhi, Loughran, and McDonald (2017) used disclosure text sentiment as a proxy for bank distress. Other notable works using text analysis for bankruptcy prediction are Yang, Dolar, and Mo (2018) and Mayew, Sethuraman, and Venkatachalam (2015). Yang, Dolar, and Mo (2018) used high-frequency words from the MDA and compared the differences between bankrupt and non-bankrupt companies. Mayew, Sethuraman, and Venkatachalam (2015) also analyzed the MDA with a focus on going-concern opinions. They found that the disclosure's predictive ability is incremental to financial ratios, market-based variables, and even the auditor's going-concern opinion, and extends to three years before the bankruptcy.

As we can observe, prior work focused on the marginal information content in the text. While researchers concluded that narratives have information content and predictive power, the limits and extent of that information were not tested. This thesis tests them and demonstrates that the information content is sufficient to predict bankruptcy, independent of any financial and quantitative metrics. Prior work in disclosure text analysis focused on simple text measures like readability, sentiment, and tone. This limitation was probably motivated by the intent to use them as marginal predictors alongside financial ratios; the available language model methods were also a limitation.
Only limited organizational outcomes can be explained by shallow language models that capture marginal information from disclosure text. My work has demonstrated that interpretable and accurate predictions can be made with task-specific dictionaries.
This work demonstrates that textual disclosures, independent of financial ratios, have predictive power. Further, by way of task-independent language models, it enables multiple tasks to be solved with the same set of features, i.e., language features. With a sufficiently large dataset containing hundreds of samples, researchers can build reliable predictive models quickly.

Another implication is text-based soft metrics. Investors are interested in knowing firm performance on corporate responsibility, climate change, and ethical business practices, and it is not easy to measure these attributes using accounting metrics. Capturing and reporting new metrics would involve significant capital expenditure for firms; instead, firms can report them through narrative disclosures, and investors can extract them using the methods shown in this work. Finally, text-metric-based factors and factor investment are a possibility under this approach. Lopez Lira (2019) used text-based analysis to measure firm risk exposure and built factor models with such risk portfolios; these models explain cross-sectional returns, suggesting internal validity. Similar portfolios on other dimensions, like fraud and climate exposure, can be explored using this thesis's approach.
Like other empirical studies in finance and language processing, the results presented in this thesis have some limitations. Due to resource constraints on downloading, processing, and storing extensive text data, knowledge extraction from financial disclosures has been attempted on the single task of bankruptcy prediction, using different methodologies. I also restricted the text content to one type of corporate narrative document (the Management's Discussion and Analysis). For this reason, caution is needed in generalizing the results.

Since the text analysis is restricted to the surface structure of language, it is impossible to say whether the extracted signal is a true reflection of the management's statement. What is more, disclosure changes can result from managerial interventions and restructuring activities, e.g., raising capital, asset sell-offs, or cost reductions. These interventions can improve performance, and the distressed firm might show better market performance, hence avoiding bankruptcy.

I use the EDGAR filing date as the time stamp for a filing. If a bankruptcy filing happens within one year of such a filing date, that filing is used to compute the linguistic features subsequently used as predictors. Any actions that management takes after filing, which can alter firm stress levels, are not captured. This limitation is inherent in using disclosure data.
The majority of prior text analysis research in the finance context focuses solely on sentiment analysis and does not address direct knowledge extraction. In particular, significant effort has been deployed in linking sentiment and tone to subsequent performance and fraud. Moreover, the extracted information, i.e., the sentiment score, is used only as an additional input to existing quantitative models. I see four broad questions that need to be addressed. Given that management has an information advantage about firms:

1. What knowledge can be extracted from the management's textual disclosures?
2. Which of the firm's future states can textual analysis of disclosures explain or predict?
3. Which language and document models facilitate fast and reliable information extraction?
4. How can investors incorporate this information into their decision-making process?

These research questions have received relatively little attention compared with efforts to measure sentiment. The improving affordability of data science tools is making unstructured analysis easier, and wider adoption of unstructured analysis will allow researchers to focus more on these four questions.
My work demonstrates that textual disclosures, independent of financial ratios, have predictive power. This observation raises the question: which of the available financial and accounting metrics can be replaced with more reliable text-based metrics? Text-based metrics cannot possibly contain all the information in accounting metrics, so it is critical to understand the limits of such information as well as its validity.

References

Altman, Edward I. 1968. "Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy." The Journal of Finance 23 (4): 589–609.
Altman, Edward I., and Edith Hotchkiss. 2010. Corporate Financial Distress and Bankruptcy: Predict and Avoid Bankruptcy, Analyze and Invest in Distressed Debt. Vol. 289. John Wiley & Sons.
Amel-Zadeh, Amir, and Jonathan Faasse. 2016. "The Information Content of 10-K Narratives: Comparing MD&A and Footnotes Disclosures." https://doi.org/10.2139/ssrn.2807546.
Ball, Christopher, Gerard Hoberg, and Vojislav Maksimovic. 2012. "Redefining Financial Constraints: A Text-Based Analysis." SSRN Electronic Journal. https://doi.org/10.2139/ssrn.1923467.
Beams, Joseph, and Yun Chia Yan. 2015. "The Effect of Financial Crisis on Auditor Conservatism: US Evidence." Accounting Research Journal 28 (2): 160–71. https://doi.org/10.1108/ARJ-06-2013-0033.
Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, and Akitaka Matsuo. 2018. "Quanteda: An R Package for the Quantitative Analysis of Textual Data." Journal of Open Source Software. https://doi.org/10.21105/joss.00774.
Bodnaruk, Andriy, Tim Loughran, and Bill McDonald. 2013. "Using 10-K Text to Gauge Financial Constraints." SSRN 50 (4): 623–46. https://doi.org/10.2139/ssrn.2331544.
Bourveau, Thomas, Yun Lou, and Rencheng Wang. 2018. "Shareholder Litigation and Corporate Disclosure: Evidence from Derivative Lawsuits." Journal of Accounting Research 56 (3): 797–842. https://doi.org/10.1111/1475-679X.12191.
Enev, Maria. 2017. "Going Concern Opinions and Management's Forward Looking Disclosures: Evidence from the MD&A." https://doi.org/10.2139/ssrn.2938703.
Feldman, Ronen, Suresh Govindaraj, Joshua Livnat, and Benjamin Segal. 2008. "The Incremental Information Content of Tone Change in Management Discussion and Analysis." https://doi.org/10.2139/ssrn.1126962.
———. 2010. "Management's Tone Change, Post Earnings Announcement Drift and Accruals." Review of Accounting Studies 15 (4): 915–53. https://doi.org/10.1007/s11142-009-9111-x.
Fisher, Ingrid E., Margaret R. Garnsey, and Mark E. Hughes. 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research." Intelligent Systems in Accounting, Finance and Management 23 (3): 157–214.
Gandhi, Priyank, Tim Loughran, and Bill McDonald. 2019. "Using Annual Report Sentiment as a Proxy for Financial Distress in US Banks." Journal of Behavioral Finance 20 (4): 424–36.
———. 2017. "Using Annual Report Sentiment as a Proxy for Financial Distress in U.S. Banks." SSRN, March, 1–13. https://doi.org/10.2139/ssrn.2905225.
Hájek, Petr, and Vladimír Olej. 2015. "Word Categorization of Corporate Annual Reports for Bankruptcy Prediction by Machine Learning Methods." In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9302:122–30. https://doi.org/10.1007/978-3-319-24033-6_14.
Hillegeist, Stephen A., Elizabeth K. Keating, Donald P. Cram, and Kyle G. Lundstedt. 2004. "Assessing the Probability of Bankruptcy." Review of Accounting Studies.
Journal of Financial Economics 106 (3): 614–34.
Lopatta, Kerstin, Mario Albert Gloger, and Reemda Jaeschke. 2017. "Can Language Predict Bankruptcy? The Explanatory Power of Tone in 10-K Filings." Accounting Perspectives 16 (4): 315–43. https://doi.org/10.1111/1911-3838.12150.
Lopez Lira, Alejandro. 2019. "Risk Factors That Matter: Textual Analysis of Risk Disclosures for the Cross-Section of Returns." https://doi.org/10.2139/ssrn.3313663.
LoPucki, Lynn M. 2006. "Bankruptcy Research Database."
Loughran, Tim, and Bill McDonald. 2009. "Plain English, Readability, and 10-K Filings."
Loughran, Tim, and Bill McDonald. 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks." Journal of Finance 66 (1). https://doi.org/10.1111/j.1540-6261.2010.01625.x.
Mayew, William J., Mani Sethuraman, and Mohan Venkatachalam. 2015. "MD&A Disclosure and the Firm's Ability to Continue as a Going Concern." Accounting Review 90 (4): 1621–51. https://doi.org/10.2308/accr-50983.
Nguyen, Ba-Hung, and Van-Nam Huynh. 2020. "Textual Analysis and Corporate Bankruptcy: A Financial Dictionary-Based Sentiment Approach." Journal of the Operational Research Society, 1–20.
Pennebaker, James W., Ryan L. Boyd, Kayla Jordan, and Kate Blackburn. 2015. "The Development and Psychometric Properties of LIWC2015."
Pennebaker, James W., Martha E. Francis, and Roger J. Booth. 2001. "Linguistic Inquiry and Word Count: LIWC 2001." Mahwah: Lawrence Erlbaum Associates 71 (2001): 2001.
Rajan, Uday, Amit Seru, and Vikrant Vig. 2015. "The Failure of Models That Predict Failure: Distance, Incentives, and Defaults." Journal of Financial Economics 115 (2): 237–60.
Shirata, Cindy Yoshiko, Hironori Takeuchi, Shiho Ogino, and Hideo Watanabe. 2011. "Extracting Key Phrases as Predictors of Corporate Bankruptcy: Empirical Analysis of Annual Reports by Text Mining." Journal of Emerging Technologies in Accounting. https://doi.org/10.2308/jeta-10182.
Sobehart, Jorge, and Sean Keenan. 2001. "Measuring Default Accurately." Risk 14 (3): 31–33.
Tao, Jie, Amit V. Deokar, and Ashutosh Deshmukh. 2018. "Analysing Forward-Looking Statements in Initial Public Offering Prospectuses: A Text Analytics Approach." Journal of Business Analytics. https://doi.org/10.1080/2573234x.2018.1507604.
Tinoco, Mario Hernandez, and Nick Wilson. 2013. "Financial Distress and Bankruptcy Prediction Among Listed Companies Using Accounting, Market and Macroeconomic Variables." International Review of Financial Analysis 30: 394–419.
Wu, Yanhui, Clive Gaunt, and Stephen Gray. 2010. "A Comparison of Alternative Bankruptcy Prediction Models." Journal of Contemporary Accounting & Economics.
Journal of Emerging Technologies in Accounting 15 (1): 45–55. https://doi.org/10.2308/jeta-52085.