Investor Emotions and Earnings Announcements

Abstract

Armed with a decade of social media data, I explore the impact of investor emotions on earnings announcements. In particular, I test whether the emotional content of firm-specific messages posted on social media just prior to a firm's earnings announcement predicts its earnings and announcement returns. I find that investors are typically excited about firms that end up exceeding expectations, yet their enthusiasm results in lower announcement returns. Specifically, a one standard deviation increase in excitement is associated with a 7.8 basis point lower announcement return, which translates into an annualized return of approximately -5.8%. My findings confirm that emotions and market dynamics are closely related and highlight the importance of considering investor emotions when assessing a firm's short-term value.


Investor Emotions and Earnings Announcements ∗

Domonkos F. Vamossy †

June 29, 2020


Keywords: deep learning; investor emotions; capital markets.

JEL Codes: G41; L82.

∗ I am extremely grateful to Stefania Albanesi, Lee Dokyun, Sera Linardi, and Dan Berkowitz for their continued guidance and support throughout this project. Special thanks to StockTwits for sharing their data. I would also like to thank Graham Beattie, Andy Koch, Mark Azic, Doug Hanley, and Mallory Avery for helpful comments and suggestions. This research was supported in part by the University of Pittsburgh Center for Research Computing through the resources provided. All errors are my own. Preliminary results; I welcome feedback.

† Department of Economics, University of Pittsburgh, [email protected].

Introduction

Conventional wisdom often posits that changes to asset prices are the result of investor emotions. For instance, Galbraith (1994) describes stock market bubbles as “speculative euphoria”, while headlines such as “‘Gut Feelings’ Are Driving the Markets” or “How Emotion Hurts Stock Returns” are common (e.g., Shiller (2020) and Wolfers (2015)). Alan Greenspan, as chairman of the Federal Reserve, famously remarked that the U.S. stock market exhibited an “irrational exuberance” when it experienced a rapid run-up in 1996. This observation portrayed a belief on his part that the increase had an origin in traders’ positive emotions. In contrast, fear is cited as a force leading to sell-offs, price declines, and price variability. Market volatility indices, such as the CBOE’s VIX, are often referenced as “fear” indices.

In this paper I test whether firm-specific investor emotions predict a firm’s earnings and announcement returns. Specifically, I explore the following research questions: (1) Do investor emotions foreshadow earnings surprises? and (2) Do investor emotions predict announcement returns? Only recently has academic literature begun exploring the role emotions play in capital markets. Due to data difficulties regarding the measurement of investor emotions, studies have mainly relied on indirect proxies, or have been restricted to experimental evidence. By pairing a large, novel dataset with recent advances in text processing, I am able to overcome the data challenge inherent in studying investor emotions. I find that investors are typically excited about firms that end up exceeding analysts’ expectations, yet their enthusiasm results in lower announcement returns.

To get to my answer, I use data from StockTwits, a social networking platform for investors to share stock opinions. A critical feature of this data is that it contains firm-specific messages, so I am able to compute firm-specific emotions. I employ a broad sample of over 4 million messages that span the decade starting in 2010.
My analysis focuses on earnings announcements because they are recurring, paramount corporate events that are followed closely by capital market participants.

The primary challenge to studying my research questions is finding a way to quantify investor emotions. I overcome this by using deep learning and a large, novel dataset of investor messages. In particular, I construct emotion variables corresponding to seven [...] The first one isolates messages conveying information related to earnings, firm fundamentals or stock trading from general chat. The second separates messages conveying original information from those disseminating existing information.

Once the emotion variables are constructed, I use a fixed effects model, exploiting within-firm variation in investor emotions, to test whether emotions predict a firm’s earnings and announcement returns. I use firm fixed effects to isolate within-firm variation. For instance, if a firm tends to have positive earnings surprises, this might make investors always more excited before announcements, and by including fixed effects, I can rule out that my results are driven by this. I also control for year, month, and day-of-the-week fixed effects, to rule out that my results are only driven by factors which affect emotions and returns across all firms simultaneously. I take a number of steps to mitigate additional concerns regarding the estimation. To ensure that I am not picking up reactive emotions, I look at the impact of pre-announcement emotions on earnings announcements, so there is a clear temporal separation between my independent and dependent variables.

Footnote: Throughout the paper I use the words excited, enthusiastic and happy interchangeably.

Footnote: For other applications of deep learning in economics see Albanesi and Vamossy (2019) and Meursault (2019). For reviews of machine learning applications in economics, see Mullainathan and Spiess (2017) and Athey and Imbens (2019).
I tackle misattribution, the concern that my emotion measures are not capturing emotions correctly, in two ways. First, I train an additional emotion model and use the emotion variables obtained from this model for robustness checks. Second, I investigate the relationship between contemporaneous emotions and asset prices, and find that my algorithm classifies messages as happier when they are talking about assets that have gone up in value.

I document two main findings. First, inter-firm investor emotions can predict the company’s quarterly earnings. In particular, variation in how happy investors are is linked with marginally higher earnings surprises. Second, I find a negative relationship between the immediate stock price reaction to the quarterly earnings announcement and both within- and inter-firm variation in investor excitement. I show that this result is driven both by messages [...]

[...] Studies relying on indirect proxies have severe limitations. For instance, Jacobsen and Marquering (2008) suggest that the findings reported in Kamstra, Kramer, and Levi (2003) may be explained by a number of other factors related to the season (they illustrate with ice cream consumption and airline travel), thus questioning their conclusion that changes in investors’ moods associated with Seasonal Affective Disorder (SAD) directly influence stock market returns. On the other hand, research using direct emotion proxies has been limited to studying the relationship between investor emotions and daily stock returns, and aside from Li, Zhou, and Liu (2016), [...] These studies have investigated whether social media content can predict the overall movement of the stock market.

Footnote: The emotions in this paper correspond to the seven emotional states specified in Breaban and Noussair (2018). I provide a detailed description of my classification schemes in Appendix C.

Footnote: Similarly, Hirshleifer and Shumway (2003) find that good weather is correlated with higher stock returns, and appeal to a similar intuition to explain their results.
For instance, Mao et al. (2012) find that the daily number of tweets that mention S&P 500 stocks is significantly associated with the changes in that same index. This literature has also analyzed how Twitter and/or StockTwits activity influences investor response to earnings. Curtis, Richardson, and Schmardebeck (2014) find that high levels of activity correlate with greater sensitivity to earnings announcement returns [...]

[...] and the wisdom of crowds hypotheses. First, in line with the value of diversity hypothesis, I find that the predictive power of emotions is diminished when considering only groups of (more) homogeneous users, formed by segmenting users who share similar investment horizons, trading experience, trading approach, popularity and account types. Second, I add to the wisdom of crowds literature by showing that investor messages are better predictors when surrounded by higher user [...]

Footnote: The importance of social media is voiced in studies exploring how companies exploit this channel as a means for investor communication. For instance, Blankespoor, Miller, and White (2014) show that firms can reduce information asymmetry among investors by broadly disseminating their news, including press releases and other disclosures, to market participants using Twitter. Jung et al. (2018) find that roughly half of S&P 1500 firms have created a corporate presence on either Facebook or Twitter.

Footnote: Adding to the literature on StockTwits and Twitter is research that has examined investors’ use of Internet search engines, financial websites, forums, and other social media platforms. This research has provided mixed evidence on whether this information helps predict future earnings and stock returns. Using Google search volume as a proxy for investors’ demand for financial information, Da, Engelberg, and Gao (2011) find that increased Google searches predict higher stock prices in the near term followed by a price reversal within a year. Drake, Roulstone, and Thornock (2012) show that the returns-earnings relation is smaller when Google search volume before earnings announcements is high. Antweiler and Frank (2004) and Das and Chen (2007) both find that the volume of posts on message boards, such as Yahoo! or Raging Bull, is associated with stock return volatility, but not stock returns. Chen et al. (2014) demonstrate that information in user-generated research reports on the Seeking Alpha investing portal helps predict earnings and long-window stock returns following the report posting date.

Footnote: This hypothesis originates from Hong and Page (2004), who show that a diverse group of intelligent decision-makers reaches reliably better decisions than a less diverse group of individuals with superior skills, and conclude that under certain conditions, “diversity trumps ability”. Interestingly, traditional information intermediaries, such as financial analysts, tend to herd to the consensus viewpoint (Jegadeesh and Kim (2010)) and produce inefficient earnings forecasts (Abarbanell (1991)), perhaps because they belong to a rather small and homogeneous group (Welch (2000)). This is relevant to the research questions of this paper, because StockTwits has a diverse set of investors with widely different investment philosophies.

Footnote: The wisdom of crowds refers to the phenomenon that aggregated information provided by many often results in better predictions than those made by any single group member, even when that member is an expert. Surowiecki (2004) presents numerous case studies and anecdotes to illustrate the principle. One such example comes from the work of Sir Francis Galton: after observing a weight-judging competition at a county fair in 1906, Galton found that the crowd accurately predicted the weight of an ox when their guesses were averaged. The average guess was closer to the ox’s true weight than most of the individual predictions, including estimates coming from cattle experts, butchers, and farmers. A similar outcome was witnessed in Berg et al. (2008), which revealed the remarkable ability of the Iowa Electronic Markets to predict high-profile elections, outperforming polls conducted by experts. Recent research that builds on the wisdom of crowds concept shows that the content of tweets can be used to predict: (1) earnings announcement returns (Bartov, Faurel, and Mohanram (2018)), and (2) future returns around Federal Open Market Committee (FOMC) meetings (Azar and Lo (2016)).

The theoretical framework for this paper comes from Shu (2010). Shu (2010) modifies the Lucas model (Lucas Jr (1978)), and shows how investor mood variations affect equilibrium asset prices and expected returns. Specifically, equity prices correlate positively with investor mood, with higher asset prices associated with better mood. In contrast, expected asset returns correlate negatively with investor mood. Given this, we expect to find a positive contemporaneous relationship between investor enthusiasm and excess returns, and a negative relationship between pre-announcement investor enthusiasm and announcement returns. I provide a simple framework with a potential mechanism in Section A.

My investor emotion dataset comes from StockTwits, which was founded in 2008 as a social networking platform for investors to share stock opinions. StockTwits looks similar to Twitter: users post messages of up to 140 characters (280 characters since late 2019), and use “cashtags” with the stock ticker symbol (e.g., $AMZN) to link ideas to a particular company. Although the app does not directly integrate with other social media platforms, participants can share content to their personal Twitter, LinkedIn, and Facebook accounts.

My original dataset spans the 2010s, from 1 January 2010 until 31 December 2019. In total, there are 117,354,459 messages by 416,249 unique users mentioning 9,742 tickers. For each message, I observe sentiment indicators as tagged by the user (bullish, bearish, or unclassified), sentiment score as computed by StockTwits, “cashtags” [...] Users of the platform also provide their experience level as either novice, intermediate, or professional. Leveraging textual analysis, I also distinguish between institutional and retail investor accounts. This user-specific information about the style, experience, type and investment model employed is useful to explore heterogeneity in investor emotions.

I restrict my sample to cover stocks traded on NASDAQ/NYSE, and remove messages that appear automated. I focus on messages that can be directly linked to particular stocks, so I restrict attention to messages that only mention one ticker. Last, I require at least two users posting per stock for the duration over which I compute averages to discard noisy signals.

Footnote: An alternative theory is provided by Duxbury et al. (2020), who present an emotion-based account of buy and sell preferences in asset markets. Specifically, they leverage psychological research (e.g., Loewenstein et al. (2001)) and propose that when the price of a single asset increases (decreases) above its purchase price, anticipatory hope increases (decreases).
I summarize my sample restrictions in Table 1.

Table 1: Itemized Sample Restrictions

Restriction | Messages
StockTwits Data 2010-2019 | 117,354,459
Keep: NASDAQ/NYSE Ticker | 101,484,559
Keep: Single Ticker | 74,648,778
Keep: Not Automated | 68,305,130
Keep: IBES/CRSP Ticker | 60,963,143
Final Announcement Sample | 4,467,461

Notes: I restrict my sample to firms with posts from at least 2 users for the period between 10 trading days before the earnings announcement until 2 trading days before the announcement.
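The message-level restrictions in Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's pipeline: the dictionary keys (`user`, `text`, `tickers`, `day`) are hypothetical, "automated" is read here as the same user posting the same text more than 1,000 times, and the two-user requirement is applied per ticker rather than per firm-announcement to keep the sketch short.

```python
from collections import Counter, defaultdict

AUTOMATED_THRESHOLD = 1000  # assumed reading: same text posted over 1,000 times by one user


def filter_messages(messages, automated_threshold=AUTOMATED_THRESHOLD):
    """Keep single-ticker messages that do not look automated.

    Each message is a dict with hypothetical keys 'user', 'text', 'tickers'.
    """
    # Count how often each user posted exactly the same text.
    repeats = Counter((m["user"], m["text"]) for m in messages)
    return [
        m
        for m in messages
        if len(m["tickers"]) == 1
        and repeats[(m["user"], m["text"])] <= automated_threshold
    ]


def firms_with_enough_users(messages, min_users=2, window=(-10, -2)):
    """Tickers with posts from at least `min_users` distinct users inside the window.

    'day' is a hypothetical trading-day offset relative to the announcement.
    """
    lo, hi = window
    users = defaultdict(set)
    for m in messages:
        if lo <= m["day"] <= hi:
            users[m["tickers"][0]].add(m["user"])
    return {ticker for ticker, seen in users.items() if len(seen) >= min_users}
```

Applying `filter_messages` first and `firms_with_enough_users` second mirrors the order of the rows in Table 1; the exchange and IBES/CRSP matching steps are omitted because they depend on external identifier files.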

I plot the average word count per message over time in Figure 1, which displays a relatively stable trend with a spike in late 2019. This spike is due to the character limit extension from 140 characters to 280 characters. Given that the average post length peaks at 16 words, and since I use the first 30 words to extract the emotion from messages, this likely does not affect the estimation.

Footnote: I group technical with momentum and value with fundamental for my heterogeneity explorations. For investment horizon, I explore day traders and long term investors.

Footnote: I define automated messages as messages posted over 1,000 times by the same user over the period 2010-2019.

Figure 1: Time Series of Average Post Length (x-axis: month; y-axis: average word count)

Notes: Similarly to Twitter, StockTwits introduced longer messages in late 2019 (280 characters).

Figure 2 portrays the number of messages over time in my data, indicating substantial growth in the early years, which plateaus around 2016. I control for the growing nature of my sample and the changing nature of my posts by including time fixed effects in my analysis.

I also explore when investors post the messages. In particular, I examine whether they post messages concurrently with daily news, so that posts reflect hour-by-hour changes in beliefs, or in the evening after work, when they have more free time and posts are more of a reflective general analysis. In Panels (a) and (b) of Figure 3 I plot the distribution of messages by the day of the week and by the hour of the day, respectively. We can clearly see that most posting activity on the platform happens when the markets are open (Monday-Friday and between 9am and 4pm). This behavior is consistent with investors updating their beliefs in real time as financial events unfold.

Figure 2: StockTwits Messages Over Time (x-axis: year, 2010-2019; y-axis: total classified messages, in millions)

Figure 3: Distribution of Messages (Panel (a): by day of the week; Panel (b): by hour of the day; y-axis: total classified messages, in millions)

Notes: Panel (a) portrays the day-of-the-week, while Panel (b) depicts the hour-of-the-day distribution of messages.

Figure 4: Posts around Earnings Announcements (x-axis: days till earnings report, -14 to 14; y-axis: number of classified messages, in millions)

Price and volume-related variables are obtained from CRSP, accounting information is obtained from Standard and Poor’s COMPUSTAT, analyst and earnings announcement related information is obtained from I/B/E/S, and institutional ownership data is from Thomson Reuters Institutional Holdings (13F). I match this data with StockTwits, and compute days till earnings announcements based on Gabrovšek et al. (2017). I illustrate this in Figure 5.

Figure 5: Event Windows for Firms Announcing Before vs. After the Market Opens

Notes: Panel (a) displays the estimation for firms reporting before the market opens, while Panel (b) portrays it for firms with announcements after the market closes.
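One way to read the two panels of Figure 5 is that the first trading day on which the market can react (day 0) depends on the announcement time. The sketch below encodes that reading under explicit assumptions: a 4pm market close, a hypothetical sorted trading calendar, and announcements at or after the close rolling to the next trading day. It is not a reproduction of the procedure in Gabrovšek et al. (2017), and the function names are illustrative.

```python
from bisect import bisect_left
from datetime import date, datetime, time

MARKET_CLOSE = time(16, 0)  # assumed 4pm close


def event_day_index(trading_days, announced_at):
    """Index of trading day 0 for an announcement timestamp.

    trading_days is a sorted list of datetime.date objects (hypothetical calendar).
    Announcements at or after the close, or on non-trading days, roll forward
    to the next trading day.
    """
    idx = bisect_left(trading_days, announced_at.date())
    same_day = idx < len(trading_days) and trading_days[idx] == announced_at.date()
    if same_day and announced_at.time() >= MARKET_CLOSE:
        idx += 1
    return idx


def days_till_announcement(trading_days, message_day, announced_at):
    """Signed trading-day offset of a message relative to day 0 (negative = before).

    A message posted on a non-trading day is assigned to the next trading day.
    """
    day0 = event_day_index(trading_days, announced_at)
    return bisect_left(trading_days, message_day) - day0


# Example: an after-close announcement on Wednesday 8 January 2020.
calendar = [date(2020, 1, d) for d in (6, 7, 8, 9, 10)]
after_close = datetime(2020, 1, 8, 16, 30)
offset = days_till_announcement(calendar, date(2020, 1, 7), after_close)
```

With offsets computed this way, the pre-announcement window used in the paper corresponds to messages with offsets between -10 and -2.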

I now briefly describe the text analysis methodologies used in this paper. For an in-depth discussion, see Appendices B, C, and D.

In order for my emotion measure to be useful, it must reveal the true state of investors. Thus, before using the data, I must rule out that users are trying to manipulate the stock market by posting fake opinions. For instance, if a user believes the stock price will go down and thus wants to sell the stock, she could post positive messages that might increase the price temporarily and thus would allow her to sell at a higher price. This would invalidate my measure, as I would capture her emotion as happy, even though her current emotional state might not be. This does not, though, seem to be an important concern in my data for a number of reasons. First, there is anecdotal evidence that users post on platforms to attract followers, gain internet fame, or find employment. In all those cases, it is incentive compatible for them to provide their honest opinion about the stock. Second, I also investigate the pricing impacts concerning S&P 1500 firms, which have large market caps that make it unlikely that individual investors could move prices.

The primary challenge underlying my research design is the estimation of emotion. To overcome this, I use textual analysis to quantify the emotion expressed in investor messages. I leverage a large set of emojis and emoticons along with emotionally charged words to generate a dataset of investor messages with corresponding emotions. I then use a standard bi-directional GRU model with word embeddings (see Chung et al. (2014)) to obtain a probabilistic assessment for each message in the data.
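The labeling step described above, in which emojis and emoticons are used to attach an emotion to a message, might look roughly like the following sketch. The marker sets here are tiny, hypothetical stand-ins for the paper's dictionaries, and the rule of discarding messages that match more than one emotion is an assumption made for illustration.

```python
# Hypothetical, deliberately small marker sets; the paper's dictionaries are far larger.
EMOTION_MARKERS = {
    "happy": {":)", ":-)", "😀"},
    "sad": {":(", ":-(", "😢"},
    "anger": {">:(", "😡"},
    "fear": {"😱", "😨"},
}


def label_message(text):
    """Return the single emotion whose markers appear in the text, else None.

    Messages with no marker, or with markers from several emotions, stay
    unlabeled (an assumption of this sketch).
    """
    tokens = text.split()
    hits = {
        emotion
        for emotion, markers in EMOTION_MARKERS.items()
        if any(token in markers for token in tokens)
    }
    return hits.pop() if len(hits) == 1 else None


def build_training_set(messages):
    """Pair each confidently labeled message with its emotion label."""
    labeled = ((text, label_message(text)) for text in messages)
    return [(text, label) for text, label in labeled if label is not None]
```

The resulting (text, label) pairs are the kind of input a supervised classifier such as the bi-directional GRU would be trained on; the network itself is not sketched here.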

To further explore the channels whereby emotions operate, I also compute the emotional state of messages separately as they relate to fundamental information (“fundamental”) or whether they look like general social media chat (“chat”). I provide examples of messages and their predicted emotion probabilities for a set of “fundamental” and “chat” posts in Table D.3.

I also distinguish between messages by whether they provide original information (“original”) or they disseminate existing information (“dissemination”). A message is considered original if (1) it is not a retweet of another user’s message and (2) it does not include a hyperlink.
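The two-part rule for “original” messages translates directly into a predicate. In this minimal sketch the `is_retweet` flag is a hypothetical field assumed to be available for each message, and the hyperlink check is a simple regular expression rather than whatever detection the paper actually uses.

```python
import re

# Matches http(s) links and bare www. links; a simplification for illustration.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def is_original(text, is_retweet):
    """True if the message is not a retweet and contains no hyperlink."""
    return (not is_retweet) and URL_PATTERN.search(text) is None


def split_by_originality(messages):
    """Split (text, is_retweet) pairs into original and dissemination lists."""
    original, dissemination = [], []
    for text, is_retweet in messages:
        (original if is_original(text, is_retweet) else dissemination).append(text)
    return original, dissemination
```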

StockTwits uses an unclassified supervised learning model to generate a sentiment score for messages, and reports this score and related statistics on its platform. I found that a large fraction of messages receive a score of 0, meaning that the message is either unclassified or has no forward looking sentiment. To be consistent with prior research, I use a Naive Bayes model (Bartov, Faurel, and Mohanram (2018)).

Since early investor sentiment studies, such as De Long et al. (1990), research has revealed that investor sentiment and emotion are closely related. Examples include optimism (pessimism) or hope about (fear of) the future. As Shiller (2003) suggested, excessive price volatility in asset markets may indicate that investors’ decisions are influenced by such optimism or pessimism. Tetlock (2007) provides empirical support to De Long et al. (1990) by documenting that high media pessimism exerts downward pressure on prices through short-term spikes in trading volume. Still, there are three important distinctions between my emotion measure and sentiment.

First, the main difference is definition. Unlike emotion, investor sentiment is defined as “a belief about future cash flows and investment risks that is not justified by the facts at hand” (Baker and Wurgler (2007)). Now, whether a model not trained specifically on social media data can extract this component is not within the scope of this paper. Nonetheless, to alleviate such concerns I also train a deep learning based sentiment model on messages pre-tagged by the author of the post as “bullish” or “bearish”.

The second is dimensionality. While investor sentiment is a one dimensional object, my investor emotion is a multi-dimensional construct. This allows me to pinpoint what features of messages seem to matter more.
For instance, both fear and anger are likely classified as negative, yet an angry message is different from a fearful message (see Table D.3 for examples), and it is conceivable that firms with angry messages perform differently than firms with fearful messages.

Third, unlike my emotion model which incorporates emojis and emoticons, the sentiment model built on the Naive Bayes classifier assigns a score of 0.5 (i.e., neutral) for each of the emoticons and emojis included in my dictionaries, both in its original format (e.g., “:)”) and its changed format (e.g., “happyface”). For instance, the message “I am :)” would be classified as happy with the emotion model, and neutral with the sentiment model. Therefore, the sentiment model measures the content of only words, ignoring potentially important information [...]

Footnote: I use the Naive Bayes classifier developed by https://textblob.readthedocs.io/en/dev/_modules/nltk/classify/naivebayes.html , and classify messages with a predicted probability under 0.49 as negative, and over 0.51 as positive; hence, my neutral class contains messages with a sentiment between 0.49 and 0.51.
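The thresholding rule for the Naive Bayes sentiment score can be written out as a short function. The sketch assumes the classifier returns a probability of the positive class between 0 and 1; the function names are illustrative, and the count difference at the end is only the raw "positive minus negative" tally, without any scaling the paper may apply.

```python
def sentiment_class(p_positive, lower=0.49, upper=0.51):
    """Map a positive-class probability to a sentiment label.

    Below `lower` is negative, above `upper` is positive, and the band in
    between (inclusive) is neutral.
    """
    if p_positive < lower:
        return "negative"
    if p_positive > upper:
        return "positive"
    return "neutral"


def net_sentiment(probabilities):
    """Number of positive messages minus number of negative messages."""
    classes = [sentiment_class(p) for p in probabilities]
    return classes.count("positive") - classes.count("negative")
```

Under this rule an emoticon-only message that the classifier scores at 0.5 falls in the neutral band, which is the behavior the third distinction above describes.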

I first define the variables used in my analyses in Table 2.

Table 2: Variable Definitions

Variable | Definition | Source
Analysts | Natural logarithm of 1 plus the number of analysts in the latest I/B/E/S consensus analyst quarterly earnings per share forecast prior to the quarter-end date. | I/B/E/S
Emotion | Each message is classified by a many-to-one deep learning model into one of the seven categories (i.e., neutral, happy, sad, anger, disgust, surprise, fear), so that the corresponding probabilities sum up to 1. For each emotion separately, I then take the weighted average of these probabilities during the nine trading-day window [−10, −2] [...] | [...]
Exret | [...] over the window [t, t+n] as follows: $Exret_{i;t,t+n} = \prod_{k=t}^{t+n} (1 + R_{ik}) - \prod_{k=t}^{t+n} (1 + ER_{ik})$ | WRDS (U.S. Daily Event Study)
Inst | Number of shares held by institutional investors scaled by total shares outstanding as of the quarter-end date. | Thomson Reuters Institutional Holdings (13F)
Loss | Indicator variable equal to 1 if earnings before extraordinary items (IBQ) is strictly negative in the prior quarter, and 0 otherwise. | Compustat (Quarterly)
MB | Ratio of market value to book value of equity (CSHOQ ∗ PRCCQ / CEQQ). | Compustat (Quarterly)
Sentiment | Twits classified as positive minus twits classified as negative during the nine trading-day window [−10, −2] [...] | [...]
Size | [...] (CSHOQ ∗ PRCCQ). | Compustat (Quarterly)
SUE | Standardized unexpected earnings (suescore from I/B/E/S). | I/B/E/S
Volatility | Standard deviation of daily returns during the half-year until 10 trading days before the announcement. | CRSP

Notes: For the excess return calculation, see the event and estimation windows in Figure F.1. CSHOQ left in millions.
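The buy-and-hold excess return in Table 2 is the difference between two compounded products, which is easy to check numerically. The sketch below assumes the daily realized returns R and the benchmark expected returns ER are already available as lists of decimals; how ER is estimated (for example, from a factor model over an estimation window) is outside this illustration.

```python
import math


def buy_and_hold_excess_return(returns, expected_returns):
    """Compounded realized return minus compounded expected return over a window.

    Both inputs are daily returns in decimal form (0.01 means 1%) and must
    cover the same trading days.
    """
    if len(returns) != len(expected_returns):
        raise ValueError("returns and expected_returns must have equal length")
    realized = math.prod(1.0 + r for r in returns)
    expected = math.prod(1.0 + er for er in expected_returns)
    return realized - expected


# Three-day window with a zero benchmark: 1.01 * 0.98 * 1.03 - 1
example = buy_and_hold_excess_return([0.01, -0.02, 0.03], [0.0, 0.0, 0.0])
```

Compounding, rather than summing daily abnormal returns, is what makes this a buy-and-hold measure: it reflects the payoff to holding the stock over the whole window relative to the benchmark.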

Table 3 presents the descriptive statistics for the analysis variables. I find a high fraction of neutral messages, with a mean of 72.7% and a median of 74.9%, mainly driven by firms with few posts (weighting firm-quarter observations by number of posts yields a mean of 51.86% for neutral). Looking at my Naive Bayes based sentiment variable, I observe a positive skewness, with a mean of 1.867 and a median of 1.773. This might suggest a “good-news” bias in twits, following from investors being more likely to share their optimism on social media than pessimism. My earnings surprise variable, standardized unexpected earnings (SUE), has a mean and median of 1.052 and 0.707, respectively. This suggests that firms in my sample exceeded analysts’ expectations more often than they disappointed. My measure of abnormal returns around earnings announcements has a slightly negative mean [...]

Table 3: Descriptive Statistics (columns: Observations, Mean, σ, σ within, Median, 10%, 90%; Panel A: CRSP/IBES/Compustat/Thomson Reuters (13F); Panel B: StockTwits)

Note: σ within denotes the within-firm (demeaned) standard deviations. Continuous variables winsorized at the 1% and 99% level. Emotion classifications based on StockTwits model.

Table [...]: Correlation Matrix (variables: EXRET over the announcement window, SUE t, SUE t−1, EXRET over the pre-announcement window, Loss, Anl, Inst, Size, MB, Sent, Happy, Sad, Anger, Disgust, Surprise, Fear)

Note: Inst refers to institutional, Sent refers to sentiment, Anl refers to Analysts. Continuous variables winsorized at the 1% and 99% level. ∗, ∗∗, and ∗∗∗ denote increasing levels of statistical significance.

I start with addressing my first research question: do investor emotions predict the company’s earnings? This would be the case when investor emotions contain information relevant to the company’s future earnings. In particular, it is conceivable that positive investor emotions indicate performance exceeding prior expectations.
To test this question I estimate the following model:

$$Y_{ift} = \alpha_i + \sum_{j=1}^{6} \beta_j \, EMOTION_{iftj} + \gamma X_{ift} + \delta_t + \delta_f + \epsilon_{ift} \qquad (1)$$

Here, the dependent variable is the earnings surprise, measured using standardized unexpected earnings (SUE) for firm f during announcement t. My test variables, $EMOTION_{iftj}$, are the average firm-specific emotions extracted from individual messages written from 10 trading days before until 2 trading days before the announcement. Specifically, $EMOTION_{iftj}$ for $j \in \{happy, sad, fear, disgust, angry, surprise\}$ is a probabilistic measure of the average emotion from StockTwits, where the benchmark group is neutral. In Equation (1), the hypothesis that the average emotion from individual messages is predictive of the upcoming earnings surprise implies $|\beta_j| > 0$. The control variables ($X_{ift}$) include: the lagged earnings surprise from the previous quarter to control for the positive autocorrelation in earnings surprises ($SUE_{i,t-1}$); Carhart (1997) four-factor buy-and-hold abnormal stock returns for the firm over the window [−10, −2] to control for information outside of the realm of StockTwits that may have reached the capital market prior to the earnings release ($EXRET_{-10,-2}$); firm size (Size); market-to-book ratio (MB); number of analysts in the consensus I/B/E/S quarterly earnings forecast (ANL); institutional investor holding (INST); where applicable, an indicator variable for the fourth fiscal quarter (Q4); and an indicator variable for past quarterly loss (Loss). These last seven variables control for effects shown by prior research to explain the cross-sectional variation in earnings surprises. I include firm ($\delta_f$) and time ($\delta_t$) fixed effects (year, month, day of the week) to account for firm-specific and time patterns in earnings surprises that my controls might not account for. Along the lines of prior research (e.g., Petersen (2009)), I cluster standard errors by firm, because the errors may be correlated over time at the firm level.

I now address my second research question: Can the emotions extracted from StockTwits messages predict quarterly earnings announcement stock returns? Certainly, if emotions are irrelevant, then the answer is no. Given Shu (2010), I expect a negative association between pre-existing enthusiasm and announcement returns. To test this question empirically, I examine the relationship between abnormal stock returns (EXRET) in the three days around earnings announcements [...] $EXRET_{ift}$.

The prediction that pre-announcement emotional states are informative of earnings announcement returns implies that $|\beta_j| > 0$ for $j \in \{happy, sad, fear, disgust, angry, surprise\}$. This would be the case if, as discussed in Bartov, Faurel, and Mohanram (2018), the market uses stock recommendations and analyst earnings forecasts in forming its earnings expectations and stock prices, but does not extract information as it is released from other, less prominent sources, such as StockTwits.
Based on Shu (2010), I expect to find $\beta_{happy} < 0$. The control variables ($X_{ift}$) leverage the findings of prior research. For instance, I include excess returns from ten days before the announcement until two days before the announcement to control for momentum in stock returns. This ensures that the effects I attribute to emotional states are not driven by momentum of pre-announcement returns. I include institutional ownership as a control variable, to acknowledge that the marginal investor who sets stock prices is a sophisticated investor whose equity valuations and earnings expectations may not rely only on analyst forecasts and recommendations. The other four variables are used to control for effects shown by prior research to explain the cross-sectional variation in stock returns around earnings announcements. I also include my realized earnings surprise variable of the current quarter (SUE) to explore the nature of the StockTwits information that predicts stock returns. If the information conveyed by emotions is above and beyond earnings realizations, then the coefficient on emotions will continue to be significant even after controlling for SUE. Once again, I include firm ($\delta_f$) and time ($\delta_t$) fixed effects to account for firm-specific and time patterns in earnings surprises that my controls might not account for. I cluster standard errors by industry-quarter, using Fama-French 48-industry groupings, because the errors may be correlated in the same calendar period across firms in the same industry.

It is conceivable that firms that tend to have positive earnings surprises always make investors more excited before announcements. To rule out that my results are driven by this, I use firm fixed effects. I also control for year, month, and day-of-the-week fixed effects, to ensure that my results are not driven by factors which affect emotions and returns across all firms simultaneously. I take a number of steps to mitigate additional concerns regarding the estimation.
First, to guarantee that I am not picking up reactive emotions, I look at the impact of pre-announcement emotions on earnings announcements, so there is a clear temporal separation between my independent and dependent variables. Second, I tackle misattribution (the concern that my emotion measures are not capturing emotions correctly) in two ways: by training an additional emotion model and using the emotion variables obtained from this model for robustness checks, and by investigating the relationship between contemporaneous emotions and asset prices, where I find that my algorithm classifies messages as happier when they are talking about assets that have gone up in value. One caveat of my analysis is that I do not control for traditional media coverage, and hence, I cannot exclude the possibility that it is the emotions invoked by traditional media coverage that drive my results.
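As a rough illustration of how the pre-announcement emotion variables entering Equation (1) could be assembled, the sketch below averages message-level emotion probabilities over the window $[-10, -2]$ for a single firm-announcement pair. The message records, their layout, and the function name are hypothetical, and the equal weighting of messages is a simplifying assumption rather than the weighting scheme used in the paper.

```python
from statistics import mean

EMOTIONS = ("happy", "sad", "fear", "disgust", "angry", "surprise")

def window_emotions(messages, start=-10, end=-2):
    """Average each emotion probability over messages posted in [start, end].

    `messages` is a list of (relative_trading_day, {emotion: probability})
    pairs; this layout is an assumption made for illustration only.
    """
    in_window = [probs for day, probs in messages if start <= day <= end]
    if not in_window:
        return None  # no messages in the window for this announcement
    return {e: mean(p[e] for p in in_window) for e in EMOTIONS}

# Hypothetical messages for one firm-announcement pair. Probability mass not
# assigned to the six emotions belongs to the neutral benchmark class.
msgs = [
    (-12, {"happy": 0.9, "sad": 0.0, "fear": 0.0, "disgust": 0.0, "angry": 0.0, "surprise": 0.0}),
    (-5,  {"happy": 0.6, "sad": 0.1, "fear": 0.1, "disgust": 0.0, "angry": 0.0, "surprise": 0.0}),
    (-2,  {"happy": 0.2, "sad": 0.3, "fear": 0.1, "disgust": 0.0, "angry": 0.0, "surprise": 0.2}),
    (0,   {"happy": 0.1, "sad": 0.5, "fear": 0.2, "disgust": 0.0, "angry": 0.1, "surprise": 0.0}),
]
emotion_vars = window_emotions(msgs)
```

Only the messages on days -5 and -2 fall inside the window here, so the message posted on the announcement day cannot leak into the regressor. This is the temporal separation between independent and dependent variables described above.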

Primary Findings

I start by addressing my first research question: do investor emotions predict the company's earnings? Before I exploit within-firm variation, Columns (1-2) of Table 5 document the relationship between investor emotions and earnings surprises without firm fixed effects. As the results show, emotions alone can explain only some of the variation in earnings surprises (0.8%). I find that a standard deviation increase in happiness results in a 1.8% standard deviation increase in earnings surprises [...]

Table 5: Emotions, Earnings Surprises and Announcement Returns

                      (1)          (2)          (3)             (4)
                      SUE          SUE          EXRET_{-1,1}    EXRET_{-1,1}
Happy_{-10,-2}        [...]***     [...]***     -0.3902**       -0.4423**
                      (0.0993)     (0.0935)     (0.1833)        (0.1804)
Sad_{-10,-2}          -0.4343      -0.0468      -0.9214         -0.0858
                      (0.3087)     (0.2725)     (0.6212)        (0.6067)
Disgust_{-10,-2}      -2.6191***   -0.1508      -0.8860         1.9378
                      (0.7429)     (0.6628)     (1.6837)        (1.6766)
Anger_{-10,-2}        -3.1294***   -1.1747      -5.3028**       -1.4846
                      (1.0204)     (0.9393)     (2.3774)        (2.3273)
Fear_{-10,-2}         -0.0272      0.0816       -0.3175         -0.0400
                      (0.1935)     (0.1796)     (0.3821)        (0.3685)
Surprise_{-10,-2}     -0.6937**    [...]        [...]           [...]
[...]

Notes: Robust standard errors clustered at the firm (SUE) and industry-quarter (EXRET) level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers.
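The winsorization mentioned in the table notes can be sketched as follows. This is a minimal illustration, assuming percentiles are computed by linear interpolation over the sample (the `inclusive` method of Python's `statistics.quantiles`); the paper does not state which percentile convention it uses, and the function name and data are hypothetical.

```python
from statistics import quantiles

def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values below the lower percentile or above the upper percentile."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cuts = quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]

raw = list(range(101))      # hypothetical variable taking values 0, 1, ..., 100
clipped = winsorize(raw)    # extremes are pulled in to the 1st/99th percentile
```

Unlike trimming, winsorizing keeps every observation in the sample and only limits how far the extreme values can pull the estimates.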

I next account for unobservables by including firm, year, month, and day-of-the-week fixed effects. Table 6 presents the results. When I estimate the entire sample, I find that within-firm variation in anger can be useful in predicting earnings surprises (Column (1)). The significance disappears when I restrict the sample to S&P 1500 firms (Column (2)), or when I only include messages that do not contain information about earnings or firm fundamentals (Column (3)). Looking at messages pertaining to stock fundamentals, I find a negative relationship between sadness and earnings surprises, i.e., a within-firm standard deviation increase in sad is associated with a 0.9% within-firm standard deviation decrease in earnings surprise (Column (4)). Next, the predictive power is only present for messages containing original information (Column (5)), and not for those disseminating existing information (Column (6)). Taken together, my results provide support that investor emotions extracted from social media marginally help predict earnings surprises.
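To make the "within-firm standard deviation" interpretation concrete, the sketch below demeans a regressor and an outcome by firm, computes the single-regressor within estimator, and rescales the slope into within-firm standard deviation units. The panel, the variable names, and the use of the population standard deviation are all assumptions for illustration; the actual specification includes further controls and time fixed effects.

```python
from collections import defaultdict
from math import sqrt

def demean_by_firm(firms, values):
    """Subtract each firm's own mean, which is what firm fixed effects absorb."""
    groups = defaultdict(list)
    for f, v in zip(firms, values):
        groups[f].append(v)
    means = {f: sum(vs) / len(vs) for f, vs in groups.items()}
    return [v - means[f] for f, v in zip(firms, values)]

def std(xs):
    """Population standard deviation of already-demeaned values."""
    return sqrt(sum(x * x for x in xs) / len(xs))

# Hypothetical panel: two firms, three announcements each.
firms = ["A", "A", "A", "B", "B", "B"]
sad = [0.10, 0.20, 0.30, 0.50, 0.60, 0.70]
sue = [1.0, 0.0, -1.0, 3.0, 2.0, 1.0]

x = demean_by_firm(firms, sad)
y = demean_by_firm(firms, sue)

# Within estimator with one regressor: slope of demeaned y on demeaned x.
beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Effect of a one within-firm s.d. change in x, in within-firm s.d. units of y.
standardized_effect = beta * std(x) / std(y)
```

Firm B has higher levels of both variables in this toy panel, but demeaning removes that between-firm difference, so only the movement within each firm identifies the slope.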

I now address my second research question: Can emotions extracted from StockTwits messages predict quarterly earnings announcement stock returns? I first present the results from estimating Equation (1) without firm fixed effects in Columns (3-4) of Table 5. Column (3) suggests a negative relationship between emotions and abnormal returns around earnings announcements, as the coefficients on fear, anger, and happy are significantly negative. When controls from prior research explaining the cross-sectional variation in stock returns around earnings announcements are included (Column (4)), only the effect of happy remains statistically significant. Considering the results in Column (4), these impacts are not negligible; a standard deviation increase in excitement decreases announcement returns by 7.2 basis points per three trading days [...]

Table 6: Pre-Announcement Emotions and Earnings Surprises

[Table values not recoverable. Dependent variable: $SUE_t$. Regressors: Happy, Sad, Disgust, Anger, Fear, and Surprise over $[-10, -2]$, plus a constant. Columns (1)-(6): Column (2) restricts the sample to S&P 1500 firms; Columns (3)-(4) split messages by chat type (Chat, Fundamental) and Columns (5)-(6) by information type (Original, Dissemination). All columns include firm fixed effects; year, month, and day-of-week fixed effects; and controls. The table also reports $\sigma_{y,within}$, the number of observations, and the adjusted $R^2$.]

Notes: Robust standard errors clustered at the firm level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers. I report the within-firm (demeaned) standard deviation of the dependent variable.

[...] of happy messages trend similarly to firms that disappoint expectations while having above the median share of happy messages until the announcement is released.

[Figure 6 plot. x-axis: Days till Earnings Report (-10 to 8); y-axis: Cumulative Abnormal Return (%). Series: Happy < Median, Negative Surprise; Happy > Median, Negative Surprise; Happy < Median, Positive Surprise; Happy > Median, Positive Surprise.]

Figure 6: Emotion and Announcement Returns.

Notes: Relationship between pre-announcement happiness (i.e., $[-10, -2]$) [...]

Table 7 shows that this relationship holds even with firm fixed effects (Columns (1-6)), for larger firms (Column (2)), and that the impacts are larger when user engagement is higher (Column (3)). Columns (4-5) repeat the analysis using measures of emotions disaggregated between messages that convey earnings or trade-related information (fundamental) and messages that provide other information (chat). I find that emotions extracted from messages specifically mentioning firm fundamentals and earnings have larger point estimates. Next, contrasting Column (6) and Column (7), I find that both messages containing original information and those disseminating existing information drive my results. Since I control for realized earnings surprise, my findings suggest that the value relevance of emotions provided by StockTwits for stock returns stems not only from predicting the earnings surprise, but also from other information relevant to stock valuation not accounted for by unobservable time-invariant stock characteristics or by time patterns.

Table 7: Pre-Announcement Emotions and Announcement Returns

[Table values not recoverable. Dependent variable: $EXRET_{-1,1}$. Regressors: Happy, Sad, Disgust, Anger, Fear, and Surprise over $[-10, -2]$, plus a constant. Columns (1)-(7): Column (2) restricts the sample to S&P 1500 firms and Column (3) to observations with at least the median number of users; Columns (4)-(5) split messages by chat type (Chat, Fundamental) and Columns (6)-(7) by information type (Original, Dissemination). All columns include firm fixed effects; year, month, and day-of-week fixed effects; and control variables. The table also reports $\sigma_{y,within}$, the number of observations, and the adjusted $R^2$.]

Notes: Robust standard errors clustered at the industry and quarter level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers. I report the within-firm (demeaned) standard deviation of the dependent variable.

A comparison of the results in Tables 5-7 presents an interesting contrast. Investor enthusiasm extracted from messages matters both for predicting earnings surprises and for the market reaction to earnings news. In particular, it seems that investors are excited about firms that exceed expectations (i.e., a positive relationship between happy and SUE), but their enthusiasm may lead to short-term overpricing. Hence, when I compare firms announcing similar earnings surprises, the ones that experienced higher investor enthusiasm tend to experience lower announcement returns. While the emotion-earnings relationship only holds without firm fixed effects, both within-firm and inter-firm variation in emotions are indicative of the market reaction to earnings news.

To help corroborate the theoretical work of Shu (2010), I further analyze the link between investor emotions and excess returns. I first estimate the relationship using contemporaneous emotions and excess returns during windows $[-10, -2]$ and $[-1, 1]$ [...] $EXRET_{-10,-2}$ being my dependent variable, while excluding controls for earnings surprise. Table 8 presents the results. As expected, I find a large positive (negative) association between positive (negative) emotional states and excess returns. This relationship is smaller for larger firms (Column (2)), larger when user engagement is higher (Column (3)), and holds for messages of all types (Columns (4-7)).

To provide further support that my emotion measures are capturing investor emotions accurately, I also estimate Equation (1) with contemporaneous (i.e., same window) emotion variables. Even after controlling for the content of the report, I find a large positive (negative) [...]

Table 8: Pre-Announcement Emotions and Pre-Announcement Returns

[Table values not recoverable. Dependent variable: $EXRET_{-10,-2}$. Regressors: Happy, Sad, Disgust, Anger, Fear, and Surprise over $[-10, -2]$, plus a constant. Columns (1)-(7): Column (2) restricts the sample to S&P 1500 firms and Column (3) to observations with at least the median number of users; Columns (4)-(5) split messages by chat type (Chat, Fundamental) and Columns (6)-(7) by information type (Original, Dissemination). All columns include firm fixed effects; year, month, and day-of-week fixed effects; and control variables. The table also reports $\sigma_{y,within}$, the number of observations, and the adjusted $R^2$.]

Notes: Robust standard errors clustered at the industry-quarter level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers. I report the within-firm (demeaned) standard deviation of the dependent variable.

Table 9: Announcement Emotions and Announcement Returns

[Table values not recoverable. Dependent variable: $EXRET_{-1,1}$. Regressors: Happy, Sad, Disgust, Anger, Fear, and Surprise over $[-1, 1]$, plus a constant. Column layout, fixed effects, controls, and reported statistics as in Table 8.]

Notes: Robust standard errors clustered at the industry and quarter level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers. I report the within-firm (demeaned) standard deviation of the dependent variable.

Table 10: Announcement Emotions and Post-Announcement Returns

[Table values not recoverable. Dependent variable: EXRET over a post-announcement window whose bounds are not recoverable. Regressors: Happy, Sad, Disgust, Anger, Fear, and Surprise over $[-1, 1]$, plus a constant. Column layout, fixed effects, controls, and reported statistics as in Table 8.]

Notes: Robust standard errors clustered at the industry and quarter level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers. I report the within-firm (demeaned) standard deviation of the dependent variable.

[...]

Theory on investor sentiment posits that younger, smaller, more volatile, unprofitable, non-dividend paying, distressed stocks are most sensitive to investor sentiment. Conversely, "bond-like" stocks are less driven by sentiment (see Baker and Wurgler (2007)).
To examine whether emotions behave similarly to sentiment, I interact my happy variable with a dummy variable intended to capture high volatility stocks. In line with this, I find larger point estimates for more volatile firms (Column (2) of Table 11). I also explore the effect of emotions for firms exceeding versus disappointing expectations, and find larger impacts for firms disappointing expectations (Column (3) vs. Column (4) of Table 11).
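The interaction described above can be sketched as follows: a dummy flags observations whose pre-announcement volatility lies in the top 10% of the sample, and the dummy is multiplied by the happy variable. The dictionary keys, the function name, and the percentile convention are assumptions made for illustration, not the paper's implementation.

```python
from statistics import quantiles

def add_volatility_interaction(rows):
    """Attach a top-decile volatility dummy and its interaction with happy.

    Each row is a dict with hypothetical keys "volatility" and "happy".
    """
    # The last of the nine decile cut points is the 90th percentile.
    p90 = quantiles([r["volatility"] for r in rows], n=10, method="inclusive")[-1]
    out = []
    for r in rows:
        high = 1 if r["volatility"] > p90 else 0
        out.append({**r, "high_vol": high, "high_vol_x_happy": high * r["happy"]})
    return out

# Hypothetical firm-announcement observations with volatility 1, 2, ..., 20.
rows = [{"volatility": float(v), "happy": 0.5} for v in range(1, 21)]
augmented = add_volatility_interaction(rows)
```

In a regression that includes both `happy` and `high_vol_x_happy`, the coefficient on the interaction measures how much the happy coefficient differs for high-volatility firms relative to the rest.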

Hong and Page (2004) show that a diverse group of intelligent decision makers reaches reliably better decisions than a less diverse group of individuals with superior skills. I investigate this by segmenting messages according to traders' investment horizons (long-term, short-term), trading approaches (value, technical), trading experience (amateur, intermediate, professional), popularity levels (users with followers in the 95th percentile versus the rest), and account type (institutional vs. human).

Table 11: Pre-Announcement Emotions, Announcement Returns and Earnings

                                 (1)          (2)          (3)          (4)
                                                           Earnings Surprise
                                                           Negative     Positive
Dependent Variable: EXRET_{-1,1}
Happy_{-10,-2}                   -0.6243***   -0.5346***   -1.1025***   -0.6622***
                                 (0.1975)     (0.1917)     (0.3604)     (0.2434)
[Rows for Sad, Disgust, Anger, Fear, Surprise, and Volatility not recoverable.]
Volatility x Happy_{-10,-2}                   -1.8081*
                                              (1.0352)
Constant                         0.5219       0.3269       -1.2743**    [...]***
                                 (0.3322)     (0.3308)     (0.5399)     (0.4328)
Firm FE                          X            X            X            X
Year, Month, Day of Week FE      X            X            X            X
Control Variables                X            X            X            X
[...]

Notes: Robust standard errors clustered at the industry and quarter level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers. I report the within-firm (demeaned) standard deviation of the dependent variable. Indicator variable for experiencing volatility in the top 10% leading up to the announcement (Volatility).

I report heterogeneity across user types in Table 12 and document a few interesting observations. First, in line with the value of diversity hypothesis, I find that the emotions of homogeneous groups are less informative in predicting announcement returns. Second, the relationship between happiness and returns is negative in most specifications, and is statistically significant in over half of them. Last, it is the variation in excitement expressed by traders, and not by institutions, that predicts returns (Columns (10-11)).

Table 12: Pre-Announcement Emotions and Announcement Returns across Users

[Table values not recoverable. Regressors: Happy, Sad, Disgust, Anger, Fear, and Surprise over $[-10, -2]$, plus a constant. Columns (1)-(11): Trading Approach (Technical, Fundamental); Investment Horizon (Long-Term, Short-Term); Trading Experience (Professional, Intermediate, Novice); Popularity (Top [...]%, Rest); Account Type (Institution, Trader). All columns include firm fixed effects; year, month, and day-of-week fixed effects; and control variables. The table also reports $\sigma_{y,within}$, the number of observations, and the adjusted $R^2$.]

Notes: Robust standard errors clustered at the industry and quarter level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers. I report the within-firm (demeaned) standard deviation of the dependent variable.

Sensitivity Analysis

I report the results of the sensitivity analysis in Table 13.

Four-year sample

I compare my point estimate on sentiment in Column (1) with that of Bartov, Faurel, and Mohanram (2018), using their empirical specification. One difference between my four-year sample and theirs is that mine starts a year later. Yet, I find similar coefficients (0.0638 versus 0.0599). Controlling for the sentiment variable only marginally affects the coefficients on emotions, and does not impact the statistical significance of happy.

Alternative Dependent Variable

My main specification for excess returns is defined in Table 2, but as I show in Columns (2-3), my results are robust to alternative specifications.

Extending the Window Length

My primary analysis concerns the window just leading up to earnings announcements ($[-10, -2]$). [...]

Alternative Classification

In what is arguably the most important part of my robustness checks, I now explore my Twitter-based model in Column (5). The point estimate on happy is comparable to the one obtained by the StockTwits-based model. This finding provides strong support that it is indeed investor enthusiasm that helps predict the market response to earnings reports.

Alternative Weighting

As a validity check, I consider two alternative weighting schemes. First, I investigate abandoning the weighting scheme entirely, so that messages are weighted equally (Column (6)); second, I weight each message by the number of likes it received, using $1 + \log(1 + \#\text{ of likes})$ [...]

Table 13: Sensitivity Analysis

[Table values not recoverable. Dependent variable: $EXRET_{-1,1}$. Regressors: Happy, Sad, Disgust, Anger, Fear, and Surprise, a Sentiment variable, and a constant. Columns (1)-(7), with column groups labeled Alternative Dep. Var., Longer Window, Twitter Model, Unweighted, and Alternative Weighting. The table also reports fixed-effect and control indicators, $\sigma_y$, the number of observations, and the adjusted $R^2$.]

Notes: Robust standard errors clustered at the industry and quarter level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Continuous variables winsorized at the 1% and 99% level to mitigate the impact of outliers. Aside from Column [...], I report the within-firm (demeaned) standard deviation of the dependent variable. In Column (2), I use cumulative abnormal returns over the three-day trading window as my dependent variable, while in Column (3) I use a longer estimation window for excess returns (see Figure F.[...]). The "Long Window" in Column (4) refers to [...], while my alternative weighting in Column (7) refers to my like-based weighting scheme: 1 + log(1 + # of likes).

Conclusion

In this paper, I study the impact of firm-specific emotions on quarterly earnings announcements. I demonstrate that investor emotions can help predict the company's quarterly earnings. I find that both within- and inter-firm variation in investor enthusiasm is linked with lower announcement returns. In particular, I find that both messages that convey original information and those disseminating existing information drive my results. When comparing messages that carry information directly related to earnings, firm fundamentals, and/or stock trading with those covering other information, I find that the former have a larger impact on announcement returns.

The link between emotions and market behavior has interesting policy implications. It demonstrates that there is a concrete foundation for the idea that central banks, governments, firms, and the media should consider the effects of announcements and data releases on the emotional state of market participants and how this might, in turn, affect market prices. Such impacts would arise alongside the influence of new information on economic fundamentals that might affect asset prices accordingly. While the effects of information on fundamentals can be identified with well-established techniques in finance and economics, studying the emotional component requires new tools. In my view, the methods described herein constitute a step forward in this direction.

References

Abadi, Martín, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. “Tensorflow: a system for large-scale machine learning.” OSDI, Volume 16. 265–283.
Abarbanell, Jeffrey S. 1991. “Do analysts’ earnings forecasts incorporate information in prior stock price changes?” Journal of Accounting and Economics.
Albanesi, Stefania, and Domonkos F Vamossy. 2019. “Predicting consumer default: A deep learning approach.” Technical Report, National Bureau of Economic Research.
Andrade, Eduardo B, Terrance Odean, and Shengle Lin. 2016. “Bubbling with excitement: an experiment.” Review of Finance 20 (2): 447–466.
Antweiler, Werner, and Murray Z Frank. 2004. “Is all that talk just noise? The information content of internet stock message boards.” The Journal of Finance 59 (3): 1259–1294.
Athey, Susan, and Guido W Imbens. 2019. “Machine learning methods that economists should know about.” Annual Review of Economics, vol. 11.
Azar, Pablo D, and Andrew W Lo. 2016. “The wisdom of Twitter crowds: Predicting stock market reactions to FOMC meetings via Twitter feeds.” The Journal of Portfolio Management 42 (5): 123–134.
Baker, Malcolm, and Jeffrey Wurgler. 2007. “Investor sentiment in the stock market.” Journal of Economic Perspectives 21 (2): 129–152.
Barber, Brad M, and Terrance Odean. 2013. “The behavior of individual investors.” In Handbook of the Economics of Finance, Volume 2, 1533–1570. Elsevier.
Bartov, Eli, Lucile Faurel, and Partha S Mohanram. 2018. “Can Twitter help predict firm-level earnings and stock returns?” The Accounting Review 93 (3): 25–57.
Berg, Joyce, Robert Forsythe, Forrest Nelson, and Thomas Rietz. 2008. “Results from a dozen years of election futures markets research.” Handbook of Experimental Economics Results [...]
[...] The Accounting Review 89 (1): 79–112.
Bollen, Johan, Huina Mao, and Xiaojun Zeng. 2011. “Twitter mood predicts the stock market.” Journal of Computational Science [...]
[...] Review of Finance 22 (1): 279–309.
Carhart, Mark M. 1997. “On persistence in mutual fund performance.” The Journal of Finance 52 (1): 57–82.
Chen, Hailiang, Prabuddha De, Yu Jeffrey Hu, and Byoung-Hyoun Hwang. 2014. “Wisdom of crowds: The value of stock opinions transmitted through social media.” The Review of Financial Studies 27 (5): 1367–1403.
Chollet, François, et al. 2015. “Keras: Deep learning library for theano and tensorflow.” URL: https://keras.io/k 7, no. 8.
Chung, Junyoung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014. “Empirical evaluation of gated recurrent neural networks on sequence modeling.” arXiv preprint arXiv:1412.3555.
Cookson, J Anthony, and Marina Niessner. 2020. “Why don’t we agree? Evidence from a social network of investors.” The Journal of Finance 75 (1): 173–228.
Curtis, Asher, Vernon J Richardson, and Roy Schmardebeck. 2014. “Investor attention and the pricing of earnings news.” Handbook of Sentiment Analysis in Finance, Forthcoming.
Da, Zhi, Joseph Engelberg, and Pengjie Gao. 2011. “In search of attention.” The Journal of Finance 66 (5): 1461–1499.
Das, Sanjiv R, and Mike Y Chen. 2007. “Yahoo! for Amazon: Sentiment extraction from small talk on the web.” Management Science 53 (9): 1375–1388.
De Long, J Bradford, Andrei Shleifer, Lawrence H Summers, and Robert J Waldmann. 1990. “Noise trader risk in financial markets.” Journal of Political Economy 98 (4): 703–738.
Drake, Michael S, Darren T Roulstone, and Jacob R Thornock. 2012. “Investor information demand: Evidence from Google searches around earnings announcements.” Journal of Accounting Research 50 (4): 1001–1040.
Duxbury, Darren, Tommy Gärling, Amelie Gamble, and Vian Klass. 2020. “How emotions influence behavior in financial markets: a conceptual analysis and emotion-based account of buy-sell preferences.” The European Journal of Finance, pp. 1–22.
Epstein, Larry G, and Martin Schneider. 2008. “Ambiguity, information quality, and asset pricing.” The Journal of Finance 63 (1): 197–228.
Felbo, Bjarke, Alan Mislove, Anders Søgaard, Iyad Rahwan, and Sune Lehmann. 2017. “Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm.” arXiv preprint arXiv:1708.00524.
Gabrovšek, Peter, Darko Aleksovski, Igor Mozetič, and Miha Grčar. 2017. “Twitter sentiment around the Earnings Announcement events.” PloS one 12, no. 2.
Galbraith, John Kenneth. 1994. A short history of financial euphoria. Volume 3856. Penguin Books.
Gilbert, Eric, and Karrie Karahalios. 2010. “Widespread worry and the stock market.” Fourth International AAAI Conference on Weblogs and Social Media.
Graves, Alex, and Jürgen Schmidhuber. 2005. “Framewise phoneme classification with bidirectional LSTM and other neural network architectures.” Neural Networks 18 (5-6): 602–610.
Hirshleifer, David, and Tyler Shumway. 2003. “Good day sunshine: Stock returns and the weather.” The Journal of Finance 58 (3): 1009–1032.
Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long short-term memory.” Neural Computation [...]
[...] Proceedings of the National Academy of Sciences 101 (46): 16385–16389.
Hunter, John D. 2007. “Matplotlib: A 2D graphics environment.” Computing in Science & Engineering [...]
[...] Journal of Banking & Finance 32 (4): 526–540.
Jegadeesh, Narasimhan, and Woojin Kim. 2010. “Do analysts herd? An analysis of recommendations and market reactions.” The Review of Financial Studies 23 (2): 901–937.
Jung, Michael J, James P Naughton, Ahmed Tahoun, and Clare Wang. 2018. “Do firms strategically disseminate? Evidence from corporate use of social media.” The Accounting Review 93 (4): 225–252.
Kamstra, Mark J, Lisa A Kramer, and Maurice D Levi. 2003. “Winter blues: A SAD stock market cycle.” American Economic Review 93 (1): 324–343.
Lawrence, Alastair, James Ryans, Estelle Sun, and Nikolay Laptev. 2016. “Yahoo Finance search and earnings announcements.” Available at SSRN 2804353.
Li, Qian, Bing Zhou, and Qingzhong Liu. 2016. “Can twitter posts predict stock behavior?: A study of stock market with twitter social emotion.” [...] IEEE, 359–364.
Loewenstein, George F, Elke U Weber, Christopher K Hsee, and Ned Welch. 2001. “Risk as feelings.” Psychological Bulletin 127 (2): 267.
Loughran, Tim, and Bill McDonald. 2011. “When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks.” The Journal of Finance 66 (1): 35–65.
Lucas Jr, Robert E. 1978. “Asset prices in an exchange economy.” Econometrica: Journal of the Econometric Society, pp. 1429–1445.
Lundberg, Scott M, and Su-In Lee. 2017. “A Unified Approach to Interpreting Model Predictions.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 4765–4774. Curran Associates, Inc.
Mao, Yuexin, Wei Wei, Bing Wang, and Benyuan Liu. 2012. “Correlating S&P 500 stocks with Twitter data.” Proceedings of the first ACM international workshop on hot topics on interdisciplinary social networks research. 69–72.
McKinney, Wes, et al. 2010. “Data structures for statistical computing in python.” Proceedings of the 9th Python in Science Conference, Volume 445. Austin, TX, 51–56.
Meursault, Vitaliy. 2019. “The Language of Earnings Announcements.” Technical Report, Working paper.
Mullainathan, Sendhil, and Jann Spiess. 2017. “Machine learning: an applied econometric approach.” Journal of Economic Perspectives 31 (2): 87–106.
Pennington, Jeffrey, Richard Socher, and Christopher D Manning. 2014. “Glove: Global vectors for word representation.” Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 1532–1543.
Petersen, Mitchell A. 2009. “Estimating standard errors in finance panel data sets: Comparing approaches.”

The Review of Financial Studies

22 (1): 435–480.Ratner, Alexander, Stephen H Bach, Henry Ehrenberg, Jason Fries, Sen Wu, and Christo-pher R´e. 2020. “Snorkel: Rapid training data creation with weak supervision.”

TheVLDB Journal

29 (2): 709–730.Schmidhuber, J¨urgen. 2015. “Deep learning in neural networks: An overview.”

Neuralnetworks

IEEE transactions on Signal Processing

45 (11): 2673–2681.Shiller, Robert J. 2020. “’Gut Feelings’ Are Driving the Markets.”

The New York Times .Shiller, Robert J. 2003. “From eﬃcient markets theory to behavioral ﬁnance.”

Journal ofeconomic perspectives

17 (1): 83–104. 39hu, Hui-Chu. 2010. “Investor mood and ﬁnancial markets.”

Journal of Economic Behavior& Organization

76 (2): 267–282.Surowiecki, James. 2004. “The wisdom of crowds: Why the many are smarter than thefew and how collective wisdom shapes business.”

Economies, Societies and Nations ,vol. 296.Tetlock, Paul C. 2007. “Giving content to investor sentiment: The role of media in thestock market.”

The Journal of ﬁnance

62 (3): 1139–1168.Walt, St´efan van der, S Chris Colbert, and Gael Varoquaux. 2011. “The NumPy array:a structure for eﬃcient numerical computation.”

Computing in Science & Engineering

13 (2): 22–30.Welch, Ivo. 2000. “Herding among security analysts.”

Journal of Financial economics

The New York Times .40 ppendixA A Simple Model of Investor Emotion

The theoretical framework of this paper is motivated by Epstein and Schneider (2008). I include this simple model to illustrate how emotion can affect asset prices. There are three dates, labeled 0, 1, and 2. I focus on news about one particular asset (asset A). There are $1/n$ shares of this asset outstanding, where each share is a claim to a dividend:

$$d = m + \epsilon^a + \epsilon^i \tag{2}$$

where $m$ denotes the mean dividend, $\epsilon^a$ is an aggregate shock, and $\epsilon^i$ is an idiosyncratic shock that affects only asset A.

Assumption 1. Shocks are mutually independent and normally distributed with mean zero.

$$\epsilon^i \sim N(0, \sigma_i^2), \qquad \epsilon^a \sim N(0, \sigma_a^2)$$

I summarize the payoff on all other assets by a dividend:

$$\tilde{d} = \tilde{m} + \epsilon^a + \tilde{\epsilon}^i \tag{3}$$

There are $(n-1)/n$ shares of other assets outstanding and each pays $\tilde{d}$. The market portfolio is then a claim to $\frac{1}{n} d + \frac{n-1}{n} \tilde{d}$. When $n = 1$, asset A is the market. Aside from this special case, asset A can be interpreted as a stock in a single company (for $n$ large). In this scenario, $\tilde{d}$ can be interpreted as the sum of stock payoffs for other companies. For what follows, I assume a symmetric case of $n$ stocks that each promise a dividend of the form of Equation (2), with the aggregate shock being identical and the idiosyncratic shocks being independent across companies. I use this symmetric case for simplicity and tractability; however, the precise nature of $\tilde{d}$ is irrelevant for most of my results.

Dividends are revealed at date 2. At date 1, the representative agent receives two noisy signals $(s_1, s_2)$, informing her about the idiosyncratic and the aggregate shock. This captures the idea that the investor is able to access news updates (sector and company specific).

$$s_1 = \epsilon^i + \epsilon_1 \tag{4}$$

$$s_2 = \epsilon^a + \epsilon_2 \tag{5}$$

Assumption 2. Signals are imprecise; $\epsilon_1$ and $\epsilon_2$ are mutually independent and normally distributed with mean zero.

$$\epsilon_1 \sim N(0, \sigma_1^2), \qquad \epsilon_2 \sim N(0, \sigma_2^2)$$

The investor tries to infer $\epsilon^i + \epsilon^a$ from the two signals $(s_1, s_2)$. The set of one-step-ahead beliefs about $s_1$ and $s_2$ at date 0 consists of normals with mean zero and variance $\sigma_i^2 + \sigma_1^2$ and $\sigma_a^2 + \sigma_2^2$ respectively. The set of posteriors about $\epsilon^i + \epsilon^a$ is calculated using standard rules for updating normal random variables. For fixed $\sigma_k^2$ ($k = 1, 2$), let $\gamma_k$ denote the regression coefficient:

$$\gamma_1(\sigma_1^2) = \frac{cov(s_1, \epsilon^i)}{var(s_1)} = \frac{\sigma_i^2}{\sigma_i^2 + \sigma_1^2} \tag{6}$$

$$\gamma_2(\sigma_2^2) = \frac{cov(s_2, \epsilon^a)}{var(s_2)} = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_2^2} \tag{7}$$

For fixed $\sigma_k^2$, the coefficient $\gamma_k(\sigma_k^2)$ determines the fraction of prior variance in $\epsilon^i$ and in $\epsilon^a$ that is resolved by the signal. Given $(s_1, s_2)$, the posterior density of $\epsilon^i + \epsilon^a$ is also normal. In particular,

$$\epsilon^i + \epsilon^a \sim N\left(\gamma_1 s_1 + \gamma_2 s_2,\; (1 - \gamma_1)\sigma_i^2 + (1 - \gamma_2)\sigma_a^2\right)$$

Footnote: It is common to measure the information content of a signal relative to the volatility of the parameter (Epstein and Schneider (2008)).

Assumption 3. There is a representative agent who does not discount the future and cares only about consumption at date 2. Her utility function is represented by:

$$u(c) = -e^{-\rho c} \tag{8}$$

A.1 Bayesian Benchmark

The price of asset A equals the expected present value minus a risk premium that depends on risk aversion and covariance with the market. It is straightforward then to calculate the price of asset A at dates 0 and 1:

$$q_0^{Bayesian} = m - \rho\, cov\left(d, \tfrac{1}{n} d + \tfrac{n-1}{n} \tilde{d}\right) = m - \rho\left(\tfrac{1}{n}\sigma_i^2 + \sigma_a^2\right) \tag{9}$$

$$q_1^{Bayesian} = m + \gamma_1 s_1 + \gamma_2 s_2 - \rho\left[\tfrac{1}{n}(1 - \gamma_1)\sigma_i^2 + (1 - \gamma_2)\sigma_a^2\right] \tag{10}$$

At date 0, the expected present value is simply the prior mean dividend $m$. At date 1, it is the posterior mean dividend $m + \gamma_1 s_1 + \gamma_2 s_2$, as it now depends on the value of the signals (given that the signal is informative: $\gamma_k > 0$). The risk premium does not depend on the realized signals $(s_1, s_2)$: it is smaller at date 1, since the signal resolves some uncertainty. At either date, it is composed of two parts: one is driven by the variance of the aggregate shock ($\epsilon^a$), and the other equals the variance of the idiosyncratic shock ($\epsilon^i$) multiplied by the market share of the asset, $1/n$. As $n$ becomes large, idiosyncratic risk is diversified away and does not matter for prices.

A.2 Investor Emotion

In the absence of signals, the investor is guided by her emotions at date 0. As has been shown in the literature, the investor overprices the asset when in a good mood (Breaban and Noussair (2018)). I use $\eta$ to represent the investor's emotional state at date 0, such that η ∈ ( − , η . . In this environment, the price of asset A in period 0 is:

$$q_0^{EM} = m(1 + \eta) - \rho\, cov\left(d, \tfrac{1}{n} d + \tfrac{n-1}{n} \tilde{d}\right) = m(1 + \eta) - \rho\left(\tfrac{1}{n}\sigma_i^2 + \sigma_a^2\right) \tag{11}$$

As there is information to process at date 1, the investor loses her emotional attachment. When $\eta = 0$, I obtain the same price as in the Bayesian Benchmark.

Footnote: This can be easily extended. Say $\eta_a$ is the emotion parameter for asset A, while $\eta_m$ is the emotion parameter for all other assets in the market.

A.3 Comparative Statics
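To fix ideas before comparing the two price paths, the pricing rules in Equations (6), (7) and (9) to (11) can be evaluated numerically. The sketch below is only an illustration: the function names and all parameter values are made up, and it assumes that asset A has market share 1/n and that `var_*` arguments are variances.

```python
def signal_weight(var_shock, var_noise):
    """Regression coefficient gamma from Eqs. (6)-(7)."""
    return var_shock / (var_shock + var_noise)


def price_date0(m, rho, n, var_i, var_a, eta=0.0):
    """Date-0 price: eta = 0 gives Eq. (9), a nonzero eta gives Eq. (11)."""
    return m * (1.0 + eta) - rho * (var_i / n + var_a)


def price_date1(m, rho, n, var_i, var_a, var_1, var_2, s1, s2):
    """Date-1 Bayesian price from Eq. (10)."""
    g1 = signal_weight(var_i, var_1)  # weight on the idiosyncratic signal
    g2 = signal_weight(var_a, var_2)  # weight on the aggregate signal
    premium = rho * ((1.0 - g1) * var_i / n + (1.0 - g2) * var_a)
    return m + g1 * s1 + g2 * s2 - premium


# Illustrative (made-up) parameter values.
params = dict(m=10.0, rho=2.0, n=4, var_i=1.0, var_a=0.5)
q0_bayes = price_date0(**params)
q0_em = price_date0(eta=0.1, **params)   # investor in a good mood
q1 = price_date1(var_1=1.0, var_2=0.5, s1=0.0, s2=0.0, **params)
print(q0_bayes, q0_em, q1)
```

With these numbers the emotional date-0 price exceeds the Bayesian one by ηm, and with zero signals the date-1 price is above the Bayesian date-0 price because the risk premium shrinks once the signals arrive.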

I am interested in the price adjustment dynamics from period 0 to period 1:

$$\Delta q^{Bayesian} - \Delta q^{EM} = -\eta m \tag{12}$$

Thus, compared with the Bayesian benchmark, which also corresponds to neutral valuation, the price of asset A responds more to the signals when investors draw a negative emotion shock with respect to asset A. This is a testable implication of the model, which I investigate empirically in Section 5.2. In particular, I examine asset price movements surrounding quarterly earnings announcements (which stand in for the idiosyncratic shock).

B Text Processing

I first remove images, hyperlinks, and tags from the text. I discard user mentions (e.g., @dvamossy), cashtags (e.g., $FORD), and the retweet indicator (i.e., “RT”) where applicable. I set text to lower case, translate emojis and emoticons (e.g., “:)” substituted with “happyface”), fix contractions (e.g., “i've” changed to “i have”), and correct common misspellings. I replace numbers preceded by a $ sign with “isdollarvalue”, other numbers with “isnumbervalue”, and the % sign with “ispercentage”. This feature is important for distinguishing between general chat and stock trading related messages. I then remove any non-word tokens, such as punctuation marks. I include the 60,000 most frequent words in the model dictionary, changing all other tokens to “NONE”. The messages are then tokenized (i.e., words are changed to numbers) and split into sentences using keras.

Footnote: I provide a description of my misspell correction in the Online Appendix.

C Measuring Emotions with Deep Learning
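The network described in this appendix takes as input the cleaned tokens produced by the steps in Appendix B. A minimal sketch of those steps is given here for orientation. The emoticon and contraction maps are tiny hypothetical stand-ins, the regular expressions are assumptions rather than the paper's actual rules, and misspelling correction is omitted.

```python
import re

# Hypothetical miniature lookup tables; the paper's full tables are not shown.
EMOTICONS = {":)": "happyface", ":(": "sadface"}
CONTRACTIONS = {"i've": "i have", "don't": "do not"}


def clean_message(text):
    """Apply the cleaning steps of Appendix B and return a list of tokens."""
    text = re.sub(r"https?://\S+", " ", text)           # hyperlinks
    text = re.sub(r"\bRT\b", " ", text)                 # retweet indicator
    text = re.sub(r"@\w+", " ", text)                   # user mentions
    text = re.sub(r"\$[A-Za-z][A-Za-z.]*", " ", text)   # cashtags such as $FORD
    text = text.lower()
    for emoticon, word in EMOTICONS.items():
        text = text.replace(emoticon, f" {word} ")
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Dollar amounts first, then any remaining numbers, then percent signs.
    text = re.sub(r"\$\d[\d,]*(?:\.\d+)?", " isdollarvalue ", text)
    text = re.sub(r"\d[\d,]*(?:\.\d+)?", " isnumbervalue ", text)
    text = text.replace("%", " ispercentage ")
    text = re.sub(r"[^a-z\s]", " ", text)               # drop non-word characters
    return text.split()


print(clean_message("RT @dvamossy $FORD up 5% to $12.50 :) i've bought http://x.co/a"))
```

The order of the substitutions matters in this sketch: cashtags are removed before lower-casing so that they are not confused with dollar amounts, and dollar amounts are replaced before generic numbers.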

My deep learning model operates by sequentially learning latent representations. These reflect features such as word order, word usage, and local context. Minimization of prediction error drives feature extraction. I use a Bidirectional-GRU model, which can be defined as the composition of several functions (layers):

$$f(X_{j,T}; w) = S \circ D \circ O \circ \mathrm{BiGRU} \circ \mathrm{Emb}(X_{j,T}) \tag{13}$$

where $X_{j,T}$ is the jth message of length T, Emb is the embedding, BiGRU is the Bidirectional Gated Recurrent Unit, O is a linear layer, D is a two-layered NN with ReLU activation, and S is the final softmax layer, which ensures that the output is between 0 and 1. I next define each component of the model.

C.1 Message

I define a message as a vector $X = [x_1, \ldots, x_T]$, where $x_k$ is the index of the kth word in the model dictionary, and T is the maximum document length (30 in my case). For documents shorter than the maximum document length, I fill the extra space with special padding words.

C.2 Embedding
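The embedding layer operates on the fixed-length index vectors defined in C.1. As a rough sketch of how such vectors might be built, the code below maps tokens to dictionary indices, sends out-of-dictionary tokens to “NONE”, and pads to length 30. The index conventions (0 for padding, 1 for “NONE”) and the truncation of longer messages are assumptions made for this illustration; the paper relies on keras for this step.

```python
from collections import Counter

MAX_LEN = 30      # maximum document length stated in the text
PAD, UNK = 0, 1   # assumed index conventions, not taken from the paper


def build_vocab(token_lists, max_words=60000):
    """Keep the max_words most frequent words; everything else maps to NONE."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<pad>": PAD, "NONE": UNK}
    for word, _ in counts.most_common(max_words):
        vocab[word] = len(vocab)
    return vocab


def encode(tokens, vocab, max_len=MAX_LEN):
    """Map tokens to indices, truncate to max_len, and pad on the right."""
    ids = [vocab.get(tok, UNK) for tok in tokens[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


corpus = [["buy", "the", "dip"], ["buy", "now"]]
vocab = build_vocab(corpus, max_words=1)   # tiny dictionary for illustration
print(encode(["buy", "dip"], vocab, max_len=4))
```

Every encoded message has the same length, which is what allows a batch of messages to be stacked into a single array before the embedding lookup.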

Embedding (Emb) assigns vectors to individual words. I obtain the starting value for my word embeddings from Pennington, Socher, and Manning (2014), leveraging 2 billion tweets, 27 billion tokens, and a vocabulary of 1.2 million words. These embedding vectors are then updated during estimation via backpropagation. I denote the embedding of the word $x_i$ as $e(x_i) = e_i \in \mathbb{R}^{d(E)}$, where $d(E)$ is the embedding size (200 in my case). Thus, the document can be represented as:

$$\mathrm{Emb}(X_{j,T}) = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_T \end{bmatrix} = \begin{bmatrix} e_{1,1} & e_{1,2} & \dots & e_{1,d(E)} \\ e_{2,1} & e_{2,2} & \dots & e_{2,d(E)} \\ \vdots & \vdots & \ddots & \vdots \\ e_{T,1} & e_{T,2} & \dots & e_{T,d(E)} \end{bmatrix} \tag{14}$$

Words frequently used interchangeably are prone to cluster in the embedding space.

Footnote: I use a categorical cross-entropy loss function.

C.3 Gated Recurrent Unit (GRU)

Introduced by Chung et al. (2014), the Gated Recurrent Unit (GRU) is a slight variation on the LSTM (see Hochreiter and Schmidhuber (1997)). It addresses the high memory requirements imposed by the LSTM by combining the forget and input gates into a single “update gate”, and by merging the cell state with the hidden state. The resulting model requires less computation, and has enjoyed growing popularity. I illustrate the GRU cell in Figure C.1.

Figure C.1: The architecture of the GRU unit

The update gate decides which parts of the previous hidden state are updated (or discarded). By selecting valuable parts from the previous hidden state, the reset gate determines which parts are used to compute new content. This is then used along with the current input to compute the hidden state update. Notice that the update gate controls both what is kept from the previous hidden state, and what is taken from the hidden state update. The input is $X_t \in \mathbb{R}^{n \times d}$ (n is the sample size, d denotes the input dimension). The computations (forward-propagation) for the GRU unit can be summarized as:

$$Z_t = \sigma_g(W_{xz} X_t + W_{hz} H_{t-1} + b_z) \tag{15}$$

$$R_t = \sigma_g(W_{xr} X_t + W_{hr} H_{t-1} + b_r) \tag{16}$$

$$H_t = Z_t \odot H_{t-1} + (1 - Z_t) \odot \sigma_h(W_{xh} X_t + W_{hh}(R_t \odot H_{t-1}) + b_h) \tag{17}$$

where $\odot$ denotes elementwise multiplication, $X_t$ denotes the input, $H_t \in \mathbb{R}^{n \times h}$ the output, $Z_t \in \mathbb{R}^{n \times h}$ the update gate, $R_t \in \mathbb{R}^{n \times h}$ the reset gate, and h the number of hidden states. Here, $W_{xr}, W_{xz} \in \mathbb{R}^{d \times h}$ and $W_{hr}, W_{hz} \in \mathbb{R}^{h \times h}$ are weight matrices, while $b_r, b_z \in \mathbb{R}^{1 \times h}$ are bias parameters. Typically $\sigma_g$ is a sigmoid function that transforms input values to the interval (0,1), while $\sigma_h$ is tanh.

Footnote: This property (the clustering of interchangeable words in the embedding space) allows me to capture word similarities without imposing any additional structure.

C.4 Bidirectional GRU
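Before the backward pass is added, the single-direction update in Equations (15) to (17) can be sketched in code. The function below handles one message at a time (n = 1) and uses plain Python lists; the dictionary keys such as `W_xz` are illustrative names, and each weight matrix is stored as a list of rows with one row per hidden unit so that it multiplies the input from the left. It is a sketch of the equations, not the keras implementation used in the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x, h_prev, p):
    """One GRU update for a single message, following Eqs. (15)-(17)."""
    # Update gate, Eq. (15)
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["W_xz"], x), matvec(p["W_hz"], h_prev), p["b_z"])]
    # Reset gate, Eq. (16)
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["W_xr"], x), matvec(p["W_hr"], h_prev), p["b_r"])]
    # Candidate state built from the reset-gated previous state
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(a + b + c) for a, b, c in
            zip(matvec(p["W_xh"], x), matvec(p["W_hh"], gated), p["b_h"])]
    # New hidden state, Eq. (17)
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h_prev, cand)]


def zero_params(d, h):
    """All-zero parameters, useful only as a sanity check."""
    p = {}
    for gate in ("z", "r", "h"):
        p[f"W_x{gate}"] = [[0.0] * d for _ in range(h)]
        p[f"W_h{gate}"] = [[0.0] * h for _ in range(h)]
        p[f"b_{gate}"] = [0.0] * h
    return p


# With zero parameters both gates equal 0.5 and the candidate state is 0,
# so the update simply halves the previous hidden state.
print(gru_step([1.0, 2.0], [0.4, -0.2], zero_params(d=2, h=2)))
```

Running the same step from the last word to the first, with a second set of parameters, gives the backward states that the bidirectional architecture concatenates with the forward ones.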

Bidirectional RNNs were developed by Schuster and Paliwal (1997). The key feature of the bidirectional architecture is that dependencies and training can go forwards and backwards in time. Before I move forward, let me denote my previous operations defined in Equations (15)-(17) as $Z_t = \overrightarrow{Z_t}$, $R_t = \overrightarrow{R_t}$, $H_t = \overrightarrow{H_t}$. The key addition of the bidirectional architecture is that for a given timestep t, I also compute hidden state updates as follows:

$$\overleftarrow{Z_t} = \sigma_g(W^f_{xz} X_t + W^f_{hz} \overleftarrow{H_{t+1}} + b^f_z) \tag{18}$$

$$\overleftarrow{R_t} = \sigma_g(W^f_{xr} X_t + W^f_{hr} \overleftarrow{H_{t+1}} + b^f_r) \tag{19}$$

$$\overleftarrow{H_t} = \overleftarrow{Z_t} \odot \overleftarrow{H_{t+1}} + (1 - \overleftarrow{Z_t}) \odot \sigma_h(W^f_{xh} X_t + W^f_{hh}(\overleftarrow{R_t} \odot \overleftarrow{H_{t+1}}) + b^f_h) \tag{20}$$

I then concatenate the forward and backward hidden states $\overrightarrow{H_t}$ and $\overleftarrow{H_t}$ to obtain the hidden state $H_t \in \mathbb{R}^{n \times 2h}$.

Footnote: For instance, if I were to ingest “oil and gas” with a forward architecture, “oil” would receive signal from “gas” during backpropagation but not the reverse. The bidirectional architecture allows for both relationships. For an in-depth discussion of different bidirectional architectures see Graves and Schmidhuber (2005).

C.5 Linear Layer, Neural Network and Softmax Activation

My next step is to apply another set of weights and bias terms and pass the result to a two-layered Neural Network:

$$O_t = H_t W_{h,q} + b_q \tag{21}$$

$$D = \sigma_d(O_t W_o + b_o) \tag{22}$$

$$D' = \sigma_d(D W_d + b_d) \tag{23}$$

Here the output of the linear layer applied to the GRU state is $O_t \in \mathbb{R}^{n \times q}$, the output of the first dense layer is $D \in \mathbb{R}^{n \times q'}$, and the output of the second dense layer is $D' \in \mathbb{R}^{n \times q''}$, where q, q', q'' denote the number of hidden units for each of the layers, and $\sigma_d$ is the ReLU activation function in my case, defined as:

$$\mathrm{ReLU}(x) = \begin{cases} x & \text{if } x \geq 0 \\ 0 & \text{otherwise} \end{cases} \tag{24}$$

This is then passed to another hidden layer, followed by a softmax layer to obtain the final output:

$$\hat{y} = \mathrm{softmax}(D' W_y + b_y) \tag{25}$$

where $\hat{y}$ denotes the final output with $\hat{y} \in \mathbb{R}^{n \times y}$, and y denotes the number of outputs, 7 in my case for the emotion classification, and 3 for the chat type classification.

C.6 Training Data Sources

Since performing textual analysis using any word classification scheme is inherently imprecise (see, e.g., Loughran and McDonald (2011)), I train two different models based on different data sources. My first, and preferred, model is similar to Li, Zhou, and Liu (2016). It relies on building the training data from dictionaries. In particular, I define dictionaries for each emotional state. My dictionaries include both emojis and emoticons, and consist of 2,250 words. To map emojis and emoticons to emotions I use https://unicode.org/Public/emoji/13.0/emoji-test.txt . I translate emoticons into categories such as “happyface”, while I retain emojis in their original format, such as “face-withopenmouth”. I do this to keep the diversity of my emotional labels, which has been shown to improve predictive power (e.g., Felbo et al. (2017)). It is important to note that the Naive Bayes based sentiment methodology assigns a score of 0.5 (i.e., neutral) for each of the emoticons and emojis included in my dictionaries, both in its original format (i.e., :)) and its changed format (i.e., “happyface”). I then prepare a training dataset with messages containing such words, and augment this with messages that contain none of these words and have zero sentiment, which serve as neutral ones. I then use these dictionaries and Ratner et al. (2020) to generate my training data. The second classification scheme builds a model from pre-compiled emotion datasets based on Twitter messages. I construct this training data from https://github.com/sarnthil/unify-emotion-datasets/tree/master/datasets . I compare the performance of these two models in Appendix D, and discuss further limitations of using the Twitter based model in the Online Appendix.

For my information-based classification, my “fundamental” data comes from StockTwits data with messages containing earnings or fundamental information, while my “chat” data comes from my Twitter training data, excluding messages containing such information. This allows me to isolate general chat-like messages from those containing financial information. I rely on these models instead of using a dictionary-based method since this gives me a probabilistic assessment of whether a message belongs to a certain class. Additionally, trained word-embeddings learn words often co-occurring with entries from my dictionaries, so that words not included in the dictionary but containing financial information could be picked up by my model.
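The dictionary-based labeling step can be pictured as a function that assigns an emotion when a message matches a dictionary and abstains otherwise. The sketch below is a simplified stand-in and does not use Snorkel (Ratner et al. (2020)): the three miniature dictionaries are hypothetical, and abstaining when several emotions match is an assumption of this illustration rather than a rule taken from the paper.

```python
ABSTAIN = None

# Hypothetical miniature dictionaries; the paper's dictionaries hold 2,250 words.
EMOTION_WORDS = {
    "happy": {"happyface", "glad", "love"},
    "fear": {"worried", "nervous", "panic"},
    "anger": {"hate", "angryfacesymbolshead"},
}


def label_message(tokens):
    """Return the single emotion whose dictionary matches, else abstain."""
    hits = {emotion for emotion, words in EMOTION_WORDS.items()
            if words.intersection(tokens)}
    return hits.pop() if len(hits) == 1 else ABSTAIN


def coverage(token_lists):
    """Share of messages that receive a label."""
    labels = [label_message(tokens) for tokens in token_lists]
    return sum(label is not ABSTAIN for label in labels) / len(labels)


messages = [["love", "this", "stock"], ["love", "hate"], ["same", "patterns"]]
print([label_message(m) for m in messages], coverage(messages))
```

A rule of this kind only covers messages that contain dictionary entries, which is why a trained classifier is useful: it returns a probability for every message, including those the dictionaries miss.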

C.7 Implementation

I include 30 words for each message and train on roughly 47 million messages for my emotion classification. I use a batch-size of 4,096, a learning rate of 0.01 (0.001 for Twitter), an early-stopping parameter of 1 (20 for Twitter), an embedding dropout of 0.25, 256 hidden units for my GRU, and 256-128 hidden units respectively for my dense layers.

Footnote: I further require positive (negative) emotions to have positive (negative) sentiment, classified by https://textblob.readthedocs.io/en/dev/ . I do not impose this for surprise. This reduced the coverage of my labeling model, but increased the accuracy.

Footnote: Training data messages by class: 19M neutral, 13.9M happy, 5M fear, 4.2M surprise, 2.6M sad, 1.5M disgust, 1.2M anger.

My deep learning models are made up of millions of free parameters. Since the estimation procedure relies on computing gradients via backpropagation, which tends to be time and memory intensive, using conventional computing resources (e.g., desktop) would be impractical (if not infeasible). Acknowledging the impact of GPUs in deep learning (see Schmidhuber (2015)), I train my models on a GPU cluster (1-2 NVIDIA GeForce GTX 1080 GPUs proved to be sufficient). I conduct my analysis using Python 3.6.3 (Python Software Foundation), building on the packages numpy (Walt, Colbert, and Varoquaux (2011)), pandas (McKinney et al. (2010)) and matplotlib (Hunter (2007)). I develop my bidirectional GRU model with keras (Chollet et al. (2015)) running on top of Google TensorFlow, a powerful library for large-scale machine learning on heterogeneous systems (Abadi et al. (2016)).
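The statement that the model has millions of free parameters can be checked with a back-of-the-envelope count from the sizes given above: a 60,000-word dictionary, 200-dimensional embeddings, and 256 hidden units per GRU direction. The sketch below counts only the embedding and the bidirectional GRU and follows the layout of Equations (15) to (17), with one bias vector per gate. It ignores the padding and “NONE” tokens and the later layers, and library implementations may lay out biases slightly differently, so the result is indicative only.

```python
def count_parameters(vocab_size=60_000, emb_dim=200, hidden=256):
    """Rough parameter count for the embedding and the bidirectional GRU."""
    embedding = vocab_size * emb_dim
    # Three blocks (update gate, reset gate, candidate state), each with
    # input weights, recurrent weights, and one bias vector.
    gru_one_direction = 3 * (emb_dim * hidden + hidden * hidden + hidden)
    bigru = 2 * gru_one_direction
    return {"embedding": embedding, "bigru": bigru, "total": embedding + bigru}


print(count_parameters())
```

Under these assumptions the embedding matrix alone accounts for 12 million parameters, which dominates the count.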

D Model Comparison & Output

I contrast my model trained on Twitter with the one trained on StockTwits data. Each methodology presents strengths and weaknesses. Given that the training data for the first model was built using dictionaries, it may miss words that are emotional in nature but were not included in the dictionary. To alleviate this concern, I report the accuracy and coverage of my dictionary based training data preparation on a sample of 5,000 hand-tagged messages from StockTwits in Table D.1.

Table D.1: Generating Training Data: Dictionary Based Labeling Accuracy

Class | Correct | Incorrect | Empirical Accuracy
Neutral | 1251 | 36 | 97.2%
Happy | 837 | 110 | 88.4%
Sad | 163 | 51 | 76.2%
Anger | 84 | 12 | 87.5%
Disgust | 99 | 36 | 73.3%
Surprise | 310 | 44 | 87.6%
Fear | 330 | 73 | 81.9%

Labeling accuracy evaluated on the hand-tagged 5,000 messages. Note: my labeling modelwould cover 68.7% of these messages, so it would not classify 1,564 messages (i.e., would notlabel the message with any of our classes).
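The coverage and accuracy figures quoted for Table D.1 can be re-derived from the Correct and Incorrect counts. The short check below uses only the numbers in the table and the 5,000-message sample size; the function and variable names are illustrative.

```python
# (correct, incorrect) counts per class, copied from Table D.1
TABLE_D1 = {
    "Neutral": (1251, 36),
    "Happy": (837, 110),
    "Sad": (163, 51),
    "Anger": (84, 12),
    "Disgust": (99, 36),
    "Surprise": (310, 44),
    "Fear": (330, 73),
}
HAND_TAGGED = 5000


def summarize(table, total):
    """Coverage, overall accuracy, and per-class accuracy of the labeling."""
    labeled = sum(c + i for c, i in table.values())
    correct = sum(c for c, _ in table.values())
    return {
        "coverage": labeled / total,
        "unlabeled": total - labeled,
        "accuracy": correct / labeled,
        "per_class": {k: c / (c + i) for k, (c, i) in table.items()},
    }


summary = summarize(TABLE_D1, HAND_TAGGED)
print(round(100 * summary["coverage"], 1),
      summary["unlabeled"],
      round(100 * summary["accuracy"], 1))
```

The labeled messages sum to 3,436 of 5,000, which reproduces the 68.7% coverage and the 1,564 unlabeled messages, and 3,074 of those labels are correct, which gives an overall accuracy of 89.5%.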

This shows that approximately 2/3 of my messages can be tagged with this approach with an accuracy of 89.5%, suggesting that this issue might not be severe. The second model, however, was developed using Twitter messages, and it is unclear whether this model would be directly applicable to messages about stocks and companies (I provide mixed evidence in the Online Appendix). In addition, a large number of words are not accounted for using this technique: while in terms of word frequencies my Twitter words cover 96.2% of my StockTwits words, they only cover 5.02% of the vocabulary. Thus, trained on this data, my model discards potentially important words for classification.

D.1 Classiﬁer Performance

To directly compare these two models, I test the accuracy of both classifiers on a sample of 5,000 hand-tagged messages from StockTwits. I report the results in Table D.2. This shows that the StockTwits trained model performs significantly better in terms of accuracy (roughly 31% better), but worse in terms of loss. The worse performance in terms of loss is due to the StockTwits based model's tendency to classify non-neutral messages as neutral with almost certainty.
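The combination of higher accuracy and worse loss is possible because cross-entropy penalizes confident mistakes very heavily. The toy example below uses made-up two-class predictions, not the paper's data, to show a classifier that is right more often yet has a larger average loss because its few errors are made with near certainty.

```python
import math


def evaluate(examples):
    """examples: list of (true class index, predicted probability vector)."""
    n = len(examples)
    hits = sum(max(range(len(p)), key=p.__getitem__) == y for y, p in examples)
    loss = sum(-math.log(p[y]) for y, p in examples) / n
    return hits / n, loss


# Hypothetical predictions chosen only to illustrate the mechanism.
# Nine confident correct calls and one near-certain mistake:
confident = [(0, [0.99, 0.01])] * 9 + [(1, [0.9999, 0.0001])]
# A hedged classifier that is wrong more often but never certain:
hedged = [(0, [0.6, 0.4])] * 6 + [(1, [0.6, 0.4])] * 4

acc_c, loss_c = evaluate(confident)
acc_h, loss_h = evaluate(hedged)
print(acc_c, loss_c, acc_h, loss_h)
```

In this illustration a single near-certain error contributes more to the average loss than all of the hedged classifier's mistakes combined, which is the same mechanism the text attributes to non-neutral messages being labeled neutral with almost certainty.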

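Table D.2: Five-Fold Cross Validation with Hand-Tagged Test Sample

Fold | StockTwits In-Sample Loss | StockTwits In-Sample Accuracy | StockTwits Test Sample Loss | StockTwits Test Sample Accuracy | Twitter In-Sample Loss | Twitter In-Sample Accuracy | Twitter Test Sample Loss | Twitter Test Sample Accuracy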

Test sample refers to the hand-tagged 5,000 messages; these messages are never fed into the modelduring training. Model that gets selected based on in-sample loss in bold.

I confirm this by first plotting the classification errors for each emotion in the data for each of my models. Panel (a) of Figure D.1 plots them for my StockTwits based model, while Panel (b) does so for the Twitter based model. Each row corresponds to an actual class, so the diagonal entries give the share of each class that the classifier labels correctly. For instance, the 83.6% in the upper left corner of Panel (a) implies that my StockTwits classifier accurately classified 83.6% of all neutral messages as neutral, while the 12.3% in the second row, first column indicates that my classifier mistakenly tagged 12.3% of happy messages as neutral. The majority of the mistakes made by the StockTwits based model consist of classifying non-neutral messages as neutral. Since I take neutral as my benchmark group, these types of mistakes likely bias my coefficients towards zero, but retain the true ordinal ranking. Particularly important for my results is the lack of misclassification between positive and negative valence emotions. This is not the case for my Twitter based model. Though most of its mistakes are towards neutral, there is a large degree of misclassification from sad, anger, and disgust to both happy and fear. I therefore mainly use this model for a robustness check, and use my StockTwits model for most of my analyses.

I also plot the distributions of each emotion in the data in Figure D.2. Combining Figure D.2 and Figure D.1 shows that the mistake the StockTwits based model makes is to classify non-neutral messages as neutral with almost certainty.

Figure D.1: Confusion Matrices for Emotion Classification Models. [Panels (a) and (b): actual outcome against predicted outcome for Neutral, Happy, Sad, Anger, Disgust, Surprise, and Fear.]

Notes: (a) StockTwits based model, (b) Twitter based model. Results reported are based on performance on the hand-tagged sample for the best performing model on the validation set during five-fold CV.

Footnote: I find few errors in my chat type classification, but it was not evaluated on a hand-tagged sample, and hence the results are not surprising.

Figure D.2: Kernel Density Distributions of Emotional States. [Panels (a) StockTwits and (b) Twitter: kernel density by emotion (Neutral, Happy, Sad, Disgust, Anger, Surprise, Fear) over the interval 0 to 1.]

Notes: Panel (a) plots the Kernel Density Distribution for the StockTwits model, while Panel (b) displays it for the Twitter model.

D.2 Examples of Messages & Outputs

I provide examples of my model's predictions in Table D.3.

Table D.3: Examples of Model Outputs

Text | Emotion | Emotion (%) | Type
angryfacesymbolshead angryfacesymbolshead angryfacesymbolshead angryfacesymbolshead | anger | 100.0% | chat
i hate google chrome so buggy even with windows isnumbervalue | anger | 100.0% | chat
fu shorts f*** right the f*** off | anger | 100.0% | finance
d*mn it i do not have funds in yob it how long does it usually take | anger | 100.0% | finance
this guy is f*g made accounts to shill his own account lm*ao | disgust | 99.7% | chat
those shorter are silent now say something losers | disgust | 100.0% | chat
new high the master mind price gouging and manipulator martin ceo will push up | disgust | 68.7% | finance
now we know which a**holes are shorting crooks isnumbervalue | disgust | 97.4% | finance
dump it | fear | 100.0% | chat
big boys dumping ah | fear | 100.0% | chat
these fools bashing instead of loading all they can while still under dont want to make a lot of lol isnumbervalue | fear | 90.4% | chat
er is nov what s the problem isnumbervalue | fear | 100.0% | finance
what would you do price may drop as it breaks higher dillinger band view odds of downtrend | fear | 99.4% | finance
a gift | happy | 100.0% | chat
love the fear they try to spread it s literally a discord channel for them they try unbelievably hard i ll give them that isnumbervalue | happy | 98.0% | chat
trend reversed in complete bull momentum and will continue to rally hard going up thumbsup | happy | 100.0% | finance
way way undervalued here rocket moneywithwings rocket moneywithwings rocket moneywithwings rocket moneywithwings | happy | 100.0% | finance
same patterns | neutral | 100.0% | chat
brussels is the center of european union | neutral | 100.0% | chat
bank of marin bangor ceo russell colombo sells in isdollarvalue isnumbervalue | neutral | 100.0% | finance
just filed a earnings release and a financial exhibit | neutral | 100.0% | finance
ding dong the witch is dead for now | sad | 100.0% | chat
tough to watch having doubts will even hold might have to not watch and check back in a few months brutal isnumbervalue | sad | 100.0% | chat
this stock is brutal | sad | 100.0% | finance
stop bleeding they are reducing staff and working on getting a billion dollars tax credit isnumbervalue | sad | 100.0% | finance
this makes no sense lmao | surprise | 100.0% | chat
this thing might even push holy crap isdollarvalue | surprise | 100.0% | chat
i seriously doubt that if you are still holding or god forbid buying at these levels right before an earnings miss | surprise | 100.0% | finance
gifted some shares at the opening bell lol isdollarvalue | surprise | 86.3% | finance

E Model Explanations

I next uncover associations between the explanatory variables (words) and my model's predictions. I implement SHapley Additive exPlanations (SHAP), a unified framework for interpreting predictions, to explain the output of my GRU model. SHAP leverages a game theoretical concept to give each feature (word) a local importance value for a given prediction. Shapley values are local by design, yet they can be combined into global explanations by averaging the absolute Shapley values word-wise. I can then compare words based on their average absolute Shapley values, with higher values implying higher word importance. To do the SHAP analysis, I draw a random sample of 100,000 StockTwits messages. Table E.1 reports average absolute SHAP values for my StockTwits model, while Figure E.1 plots the distribution of the ten most important words for each emotion. For instance, the first entry in Panel (b) of Figure E.1 is “insane”, and this word is strongly associated with an increased model output for the surprise class. As another example, the ninth entry of Panel (d) is “problem”, which has a dispersed distribution, illustrating that in certain cases the word “problem” nudges the model's prediction towards fear only slightly. A quick inspection of Figure E.1 and Table E.1 confirms that my StockTwits model relates words to emotions correctly. When looking at the SHAP values of the Twitter model, however, we can see some of the roots of my misclassifications. For instance, “marketing” is a strong predictor for the happy class, while “natural” is a strong predictor for surprise.

The interpretability results for my “chat type” model are as expected (see Figure E.3 and Table E.3). For instance, words such as “transaction” and “bankruptcy” are associated with a lower (higher) predicted probability for my “chat” (“fundamental”) class.

Footnote: For a detailed description of the approach see Lundberg and Lee (2017).

Footnote: These are typically sentences where “problem” is surrounded by other words that are associated with fear.
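The step from local to global explanations described above amounts to a group-by over words. The sketch below assumes the local SHAP values for one emotion class have already been computed and are available as (word, value) pairs, one per word occurrence; the numbers are invented for illustration and the function name is hypothetical.

```python
from collections import defaultdict


def global_importance(local_values):
    """Average the absolute local SHAP values word-wise and rank the words."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for word, value in local_values:
        totals[word] += abs(value)
        counts[word] += 1
    means = {word: totals[word] / counts[word] for word in totals}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Made-up local values for a single class, one pair per word occurrence.
local = [("insane", 0.8), ("insane", 0.6),
         ("problem", 0.5), ("problem", -0.1),
         ("the", 0.01)]
ranking = global_importance(local)
print(ranking)
```

Taking absolute values before averaging keeps a word such as “problem”, whose local effects vary in sign and size, from cancelling itself out in the global ranking.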

Figure E.1: Selected Distribution of Word Importances for StockTwits Emotion Model. [Six panels, (a) to (f), each listing ten words; horizontal axis: SHAP value (impact on model output), from -1.00 to 1.00.]

Notes: SHAP values evaluated on a random sample of 100,000 StockTwits messages. (a) happy, (b) surprise, (c) sad, (d) fear, (e) anger, (f) disgust. Not shown here: neutral.

Figure E.2: Distribution of Word Importances for Twitter Emotion Model. [Six panels, (a) to (f), each listing ten words; horizontal axis: SHAP value (impact on model output), from -1.00 to 1.00.]

Notes: SHAP values evaluated on a random sample of 100,000 StockTwits messages. (a) happy, (b) surprise, (c) sad, (d) fear, (e) anger, (f) disgust. Not shown here: neutral.

Casual TalkFinance Talk A c t u a l o u t c o m e (a) SHAP value (impact on model output) boardbrokebearplayedperformancemissedshortsbearsbreaktransaction (b)

Figure E.3: Distribution of Word Importances & Confusion Matrix for Information Content Model.

Notes: Confusion Matrix results reported on the test set are based on the best performing model on the validation set. SHAP values plotted for the “fundamental” class. Given that the ranking is based on absolute average SHAP values, the “chat” class values are the negative of the “fundamental” class values.

Table E.1: Shap Values: StockTwits Model

| Neutral | Happy | Sad | Anger | Disgust | Surprise | Fear |
|---|---|---|---|---|---|---|
| large | large | ripping | large | losers | insane | worthless |
| new | new | fail | mad | idiot | curious | careful |
| events | calm | sad | su** | pathetic | crazy | safety |
| calm | funny | terrible | f*** | idiots | wtf | trapped |
| full | welcome | worst | f**ked | shorty | holy | nervous |
| impressive | right | pain | hate | trash | why | downtrend |
| excellent | won | crap | b*tch | lmfao | seriously | dumped |
| insane | best | failed | welcome | pig | ridiculous | trouble |
| welcome | full | bad | stupid | loser | unusual | problem |
| winning | glad | ouch | dumb | junk | whoa | halted |
| latest | events | ugh | full | bullshit | surprised | falls |
| right | own | dead | right | fake | wow | worry |
| glad | kind | crushed | su**s | garbage | surprise | bashing |
| careful | congrats | scalp | largest | clown | jesus | tanking |
| great | loving | max pain | f***ing | lie | nuts | rs |
| amazing | love | desperate | d*mn | turd | omg | dumping |
| awesome | impressive | silly | sh** | fool | d | correction |
| worst | fair | lone | tv | bags | new | manipulated |
| live | locked | sad face | pile of poo | screwed | eez | worries |
| beautiful | amazing | rip | trades | scam | thinking face | dropping |
| terrible | excellent | death | mostly | bashers | person shrugging | falling |
| fair | excited | face with rolling eyes | own | fools | flushed face | panic |
| nice | hopefully | regret |  | bs | sick | pumping |
| funny | storm | missed | best | pumpers | welcome | recession |
| loving | huge | losing | calm | bagholders | best | covering |
| safety | bullish | bleeding | kind | bag | lol | trap |
| easy | winning | smh | fair | fraud | calm | manipulation |
| golden | great | hurt | hilarious | pumper | interesting | contracts |
| kind | hope | decent | TRUE | lies | zany face | selloff |
| failed | cheers | person facepalming | tired | lmao | guess | dropped |
| best | interested | tired | new | stupid | full | crash |
| love | nice | events | freaking | holder | won | attack |
| good | moon | new | latest | f***ing | sure | bearish |
| worthless | golden | crude | live | peers | hopefully | bubble |
| nervous | live | base | fine | welcome | live | downside |
| nicely | awesome | poor | positive | special | tired | fear |
| trapped | program | exp | high | hate | wonder | scared |
| pain | beautiful | black | cross | boring | happened | greedy |
| excited | lol | weak | fly | firm | huge | overbought |
| bad | happy | late | quick | poor | kind | pullback |
| own | decent | tough | rich | mad | storm | pump |
| wins | rich | tight | lol | pure | right | squeeze |
| ouch | finally | sorry | dillinger | su**s | funny | risky |
| mad | bargain | loudly crying face | ignore |  | TRUE | scare |
| happy | d | common | d | sound | outstanding | divergence |
| fly | exciting | wrong | perfect | dumb | wants | warning |
| lol | rally | loving | pro | stuck out | fly | worried |
| crazy | special | worse | top | faith | weird | bear face |
| sweet | positive | active | funny | noise | begin | dump |
| perfect | easy | flushed face | outstanding | favorite | d | bomb |

Notes: Average absolute SHAP values evaluated on a random sample of 100,000 StockTwits messages. Words reported are the 50 most important words that appear at least 50 times in the SHAP sample; their corresponding average absolute SHAP values are not reproduced here. stuck out = face with stuck out tongue winking eye.

Table E.2: Shap Values: Twitter Model

| Neutral | Happy | Sad | Anger | Disgust | Surprise | Fear |
|---|---|---|---|---|---|---|
| natural | natural | picking | snap | natural | revenue | picking |
| picking | picking | jury | natural | unusual | natural | revenue |
| pressure | mess | lies | surprise | pile of poo | st | natural |
| favor | science | eth | shake | stuck | expecting | red heart |
| david | cloud | revenue | waste | scared | q | favor |
| cloud | david | favor | idiots | b*tch | beyond | eth |
| revenue | mining | natural | slap | mistake | weird | seriously |
| marketing | marketing | su** | mad | f*** | signals | cheers |
| mess | rebound | hi | private | hate | surprise | cat |
| accumulation | sounds | dead | f*** | signal | fighting | mining |
| talks | presentation | radar | completely | su** | talks | fighting |
| presentation | talks | fade |  | storm | winners | scared |
| sounds | holiday | itself | dude | net | swings | opposite |
| scared | itself | sounds | weird | statement | scared | gamble |
| b*tch | boring | stuck | accumulate | limited | fear | dead |
| itself | material | reading | face with rolling eyes | themselves | panic | david |
| impressive | betting | boring | grab | fighting | bids | sounds |
| multi | impressive | fighting | trash | sh** | idiots | prob |
| serious | long | deep | smiling heart | idiots | owner | itself |
| somebody | ideas | both | mess | competition | science | radar |
| awesome | pressure | lost | join | favor | omg | boring |
| tried | ish | cat | hype | omg | nervous | st |
| pivot | whale | cancer | sh** | f***ing | pivot | expecting |
| impact | impact | mess | rocket | picking | basically | signals |
| beyond | awesome | trump | ridiculous | mad | flag | missing |
| fighting | khand |  | mentioned | bo | awesome | beyond |
| along | serious | scared | stuck | worthless | beautiful | impressive |
| ish | storm | rolling* | mins | ridiculous | scare | multi |
| rebound | somebody | weak | dumb | su**s | wall | feeling |
| stuck | pivot | dillinger | hate | pressure | vs | material |
| truly | amazing | sad | money with wings | d*mn | yes | acting |
| shake | multi | multi | joke | accumulation | multi | omg |
| betting | lots | side | beyond | boring | stocktwits | tried |
| president | le | loss | blow | ranks | somebody | q |
| omg | finally | strike | stanley | social | acquired | plz |
| movement | smell | lot | f | fair | lots | broke |
| boring | beyond | feeling | panic | multi | monster | accumulation |
| stuff | classic | worried | multi | mess | congrats | mess |
| exactly | stuff | canada | crap | part | movement | su** |
| lots | powell | tried | favor | blocked | portfolio | deep |
| science | car | missing | beating | grab | positions | storm |
| gamble | completely | intel | clown | session | holiday | recovery |
| dead | scam | momentum | talks | beyond | favor | f*** |
| stocktwits | s | worst | momo | perfect | performance | burn |
| happy face | cool | slow | bull | surprised | tank | spread |
| ideas | congrats | smiling sun* | khand | panic | momentum | minimum |
| amazing | dog | losers | literally | vs | become | biotech |
| owner | exactly | hand | boring | somebody | pig | extremely |
| radar | funny | rally | d*mn | intraday | grinning* | funds |
| signals | beautiful | signals | vs | ha | picking | scam |

Notes: Average absolute SHAP values evaluated on a random sample of 100,000 StockTwits messages. Words reported are the 50 most important words that appear at least 50 times in the SHAP sample; their corresponding average absolute SHAP values are not reproduced here. rolling* = rolling on the floor laughing, smiling sun* = smiling face with sunglasses, smiling heart = smiling face with heart eyes, grinning* = grinning face with smiling eyes.

Table E.3: Shap Values: Chat Type

| Most Important Words | Average absolute SHAP value |
|---|---|
| transaction | 0.366 |
| break | 0.340 |
| bears | 0.328 |
| shorts | 0.325 |
| missed | 0.319 |
| performance | 0.318 |
| played | 0.317 |
| bear | 0.313 |
| broke | 0.312 |
| board | 0.311 |
| downgraded | 0.310 |
| climb | 0.308 |
| longs | 0.308 |
| bulls | 0.304 |
| news | 0.304 |
| bought | 0.300 |
| bull | 0.298 |
| reversal | 0.297 |
| buying | 0.295 |
| halt | 0.294 |
| oversold | 0.291 |
| move | 0.291 |
| pumpers | 0.291 |
| stop | 0.290 |
| close | 0.289 |
| economy | 0.287 |
| highs | 0.286 |
| charts | 0.286 |
| rally | 0.285 |
| miss | 0.285 |
| profits | 0.284 |
| sells | 0.284 |
| support | 0.284 |
| spike | 0.283 |
| breaks | 0.283 |
| long | 0.282 |
| ceo | 0.282 |
| stops | 0.282 |
| moving | 0.281 |
| bankruptcy | 0.281 |
| bullish | 0.281 |
| sold | 0.280 |
| upgraded | 0.279 |
| production | 0.278 |
| premarket | 0.277 |
| fed | 0.277 |
| watchlist | 0.277 |
| plays | 0.276 |
| vix | 0.276 |
| bearish | 0.275 |

Notes: Average absolute SHAP values evaluated on a random sample of 100,000 StockTwits messages. Words reported, followed by their corresponding average absolute SHAP values, are the 50 most important words that appear at least 50 times in the SHAP sample.

F Computing Excess Returns: Windows

Figure F.1 illustrates the event and estimation windows for the event study calculations.

Figure F.1: Excess Return Calculation Windows

Notes: Panel (a) reports estimation windows for my preferred specification, while Panel (b) presents the alternative estimation window for robustness checks.
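The role of the two windows can be illustrated with a small sketch. It assumes a simple market model for expected returns, which may differ from the specification used in the paper. The window positions (`est_window`, `event_window`), the helper names, and the simulated returns are hypothetical placeholders, not the window lengths shown in Figure F.1.

```python
import random


def fit_market_model(stock, market):
    """OLS of stock returns on market returns; returns (alpha, beta)."""
    n = len(market)
    mean_m = sum(market) / n
    mean_s = sum(stock) / n
    s_mm = sum((m - mean_m) ** 2 for m in market)
    s_ms = sum((m - mean_m) * (s - mean_s) for m, s in zip(market, stock))
    beta = s_ms / s_mm
    alpha = mean_s - beta * mean_m
    return alpha, beta


def cumulative_abnormal_return(stock, market, est_window, event_window):
    """Fit on the estimation window, then cumulate abnormal returns over the event window."""
    alpha, beta = fit_market_model(stock[est_window], market[est_window])
    abnormal = [
        s - (alpha + beta * m)
        for s, m in zip(stock[event_window], market[event_window])
    ]
    return sum(abnormal)


# Hypothetical daily returns: the stock follows the market exactly,
# except for a 2% jump on the (assumed) announcement day, index 125.
random.seed(0)
market = [random.gauss(0.0, 0.01) for _ in range(130)]
stock = [0.001 + 1.2 * m for m in market]
stock[125] += 0.02

est_window = slice(0, 120)      # assumed estimation window, ends before the event
event_window = slice(124, 127)  # assumed three-day event window around the announcement
car = cumulative_abnormal_return(stock, market, est_window, event_window)
print(f"Cumulative abnormal return: {car:.4f}")
```

Keeping the estimation window strictly before the event window means the announcement itself does not contaminate the estimated alpha and beta, which is the usual reason for separating the two.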

G StockTwits Activity & Sample Distributions